# Statistical methods for comparing two 1-year surveys

We would like to compare a percentage from the 2013 single-year dataset with the same percentage from the 2014 single-year dataset. Can we use statistical methods that require independent samples, such as a chi-square test? Although different people may have been surveyed in the two years, I don't believe one year is independent of the next, because the samples represent the same population.

Thank you in advance for your help or thoughts.

• Couple of comments:
1. Having the samples come from the same population doesn't necessarily mean the samples are not independent. The sample size is small compared to the population, so they could be entirely different people.

2. The design and methodology of the ACS are described here: www.census.gov/.../design-and-methodology.html One page of the document (chapter 4, page 4) says "One of the ACS design requirements is that no HU address be in sample more than once in any five-year period." So it looks like 2013 and 2014 are independent samples.
• In reply to Gene Shackman:

Yes, that's what is confusing. The samples are independent but the populations are not. Usually when we talk about independent samples, both are independent, like men vs. women.
Anyway, I found the formulas that ACS recommends, so I will use those instead of chi-square: www.census.gov/.../acs_accuracy_of_data_2010.pdf. Thank you for commenting.
• In reply to stat_analyst:

The formula that ACS gives is a paired-samples Z-test, so I'm going to take that as evidence that my hunch that these are dependent samples is correct.
• In reply to stat_analyst:

Could you point to the formula you mean? What page?
• In reply to Gene Shackman:

The formula is on page 23.
After you weight them, both years represent the entire US population, which doesn't change much from one year to the next. So the values from one year are very much related to the values from the other year, and I think the samples are independent but the populations are dependent. Most people say "independent samples," but they could just as easily say "independent populations." I think both need to be true for independence.
• ACS years are independent samples, but, stat_analyst, you are correct that Chi-Square is not an appropriate test of significance because Chi-Square is generally used to test differences across categories (i.e. distinct populations).

You are also correct that what you'd want to use is the z-score test described by the Census Bureau in this document: www2.census.gov/.../MultiyearACSAccuracyofData2013.pdf
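For anyone who wants to try this on their own numbers, here is a minimal Python sketch of that z-score comparison as I understand it: the difference between the two estimates divided by the square root of the sum of their squared standard errors. The percentages and margins of error below are made up for illustration, and the 1.645 divisor assumes the published margins of error are at the 90% confidence level, so check the accuracy document for the years you are using.

```python
import math

# Assumption: published ACS margins of error are at the 90% confidence level.
Z_90 = 1.645


def moe_to_se(moe, z=Z_90):
    """Convert a published margin of error to a standard error."""
    return moe / z


def z_for_difference(est1, se1, est2, se2):
    """Z statistic for the difference between two estimates,
    treating the two samples as independent (two separate SEs)."""
    return (est1 - est2) / math.sqrt(se1 ** 2 + se2 ** 2)


# Hypothetical values, NOT real ACS estimates.
pct_2013, moe_2013 = 15.0, 0.5
pct_2014, moe_2014 = 14.0, 0.5

z = z_for_difference(
    pct_2013, moe_to_se(moe_2013),
    pct_2014, moe_to_se(moe_2014),
)

# The difference is statistically significant at the 90% level if |Z| > 1.645.
significant_at_90 = abs(z) > Z_90
print(f"Z = {z:.3f}, significant at 90%: {significant_at_90}")
```

Working from the published margins of error rather than from the raw sample counts matters here, because those margins already reflect the ACS weighting and sample design.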
• In reply to Beth Jarosz:

Well, at least according to Wikipedia, the formula on page 23 is for an independent two-sample t-test: en.wikipedia.org/.../Student's_t-test So I would think that means independent samples.

It was explained on another site. Suppose you have students coming in for some kind of randomized trial. Half are assigned to treatment, half to no treatment. These are independent samples, because the students in the treatment group are independent of the students in the no-treatment group, even though they are all students at the same institution.
• In reply to Beth Jarosz:

Beth,

Would you say that any statistical test requiring independence would be inappropriate? Another approach that has been suggested in my group is to put two years of PUMS data together and use year as a variable in a regression. This model also requires that the two years be independent.
• In reply to Gene Shackman:

Gene,

You are right, that is a formula for independent samples. Dependent samples would have only one SE.
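To make the "two SEs versus one SE" distinction concrete, here are the two standard textbook forms side by side. The notation is mine, not copied from the Census document: the first form uses a separate standard error for each estimate, while the paired form uses a single standard error computed from the per-unit differences $d_i$ over $n$ matched units.

```latex
% Independent samples: each estimate carries its own standard error
Z = \frac{\hat{X}_1 - \hat{X}_2}{\sqrt{SE(\hat{X}_1)^2 + SE(\hat{X}_2)^2}}

% Paired (dependent) samples: one standard error, from the differences d_i
Z = \frac{\bar{d}}{s_d / \sqrt{n}}
```

The paired form needs the same units measured twice, which is why it has only one standard error.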

But your example is very different because it is not survey sampling. The samples are not selected to represent the entire student population. There is no weighting. There really isn't any sampling going on. Those are distinct groups.
• In reply to stat_analyst:

I was just using the example to illustrate the meaning of "independent", that is, two groups not dependent on each other. The ACS of course involves a lot more, as you indicate: weighting, and so forth. I just wanted a simplified example.