Hi all, I am working with the 2009-2013 ACS 5-year estimates detailed tables, and have a general question about comparing ACS data at different geographic levels. I have downloaded ACS data at both the census tract and the ZCTA levels. Does anyone know whether the same raw data is used for all of ACS's geographic levels (i.e. whether I should expect the ZCTA and census tract ACS data to be "equivalent"), or if the data from these two geographic levels draw from different samples? I am primarily asking because I am planning to crosswalk both of these datasets to the 5-digit ZIP code level, and would like to know what to expect when comparing the resulting numbers to each other.
Yes, the tract-level estimates and ZCTA-level estimates are derived from the same underlying sample.
I have a related question... we've noticed that total estimated population is about 20K lower for ZCTAs than it is for counties and tracts in the 2020 ACS 5-year. What could explain this "gap"? Is this a relic of sampling methods? Perhaps respondents didn't enter in a ZIP that has a counterpart in the ZCTA file?
I don't believe ZCTAs cover the entire country. It's definitely not based on the respondents' ZIP; ZCTAs are explicitly different from mailing ZIP codes, and are based on coordinate location, not address.
Bernie said: I don't believe ZCTAs cover the entire country.
That's right, they don't.
Margins of error might also account for some of the difference.
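To make that concrete, here is a rough sketch of how one might check whether a gap between two totals is within sampling error. It uses the root-sum-of-squares approximation that Census guidance gives for the margin of error of a sum or difference of estimates. The function names and the sample MOE values are made up for illustration, and the approximation ignores any correlation between the estimates.

```python
import math


def moe_of_sum(moes):
    """Approximate MOE of a sum of estimates (root sum of squares)."""
    return math.sqrt(sum(m * m for m in moes))


def gap_within_moe(total_a, total_b, moe_a, moe_b):
    """True if |a - b| is no larger than the approximate MOE of the difference."""
    moe_diff = math.sqrt(moe_a ** 2 + moe_b ** 2)
    return abs(total_a - total_b) <= moe_diff


# Hypothetical ZCTA-level margins of error (not real ACS values).
zcta_moes = [300.0, 400.0, 1200.0]
print(moe_of_sum(zcta_moes))
```

A gap that is well outside the combined MOE points to something other than sampling error, such as areas that one geography covers and the other does not.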
For 2017-2021, I'm counting a total tract (and county) population of 333,036,755, and a total ZCTA population of 333,032,934, or a difference of 3,821.
For 2016-2020, the difference is 21,905.
For 2015-2019, the difference is 17,490.
The ACS "raw" data, i.e. the survey form answers, is based on something called the Master Address File (MAF), which contains geocoded data for each structure in the US. You can look up an address in the MAF using the Census geocoder, https://geocoding.geo.census.gov/geocoder/geographies/onelineaddress?form. Every physical "address" has an assigned value for every level of geography: block group, tract, ZCTA (ZIP Code Tabulation Area), county, state, PUMA, etc. Note that ZCTAs are not the same as postal ZIP codes, which change all the time. ACS population totals and some "marginals" (Hispanic origin and race, for example) are controlled at the county level. If you take the ACS values for total population (B01003) and add them up across all the tracts in a county, you will get the total population for the county. Tracts in a ZCTA do not necessarily add up to the total population of the ZCTA. Counties add up to states. Some other characteristics are also "controlled." For example, the number of females in all the tracts in a county adds up to the number of females in the county (B01001), I think. In general, though, "things" don't "add up": the ACS is a survey, and the numbers on data.census.gov are estimates with sampling error.
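If you want to verify the tract-to-county claim yourself, a minimal sketch follows. It assumes you have B01003 estimates keyed by GEOID and relies on the fact that the first five digits of an 11-digit tract GEOID are the state and county FIPS codes. The function name and all population values are invented for illustration.

```python
from collections import defaultdict


def sum_tracts_by_county(tract_pop):
    """Sum tract estimates by county; tract_pop maps 11-digit tract GEOID -> estimate."""
    totals = defaultdict(int)
    for geoid, pop in tract_pop.items():
        totals[geoid[:5]] += pop  # first 5 digits = state FIPS + county FIPS
    return dict(totals)


# Made-up B01003 values, not real ACS estimates.
tract_pop = {"01001020100": 1900, "01001020200": 2100, "01003010100": 3500}
county_pop = {"01001": 4000, "01003": 3500}

# Counties where the tract sum differs from the published county total.
mismatches = {
    county: (total, county_pop.get(county))
    for county, total in sum_tracts_by_county(tract_pop).items()
    if county_pop.get(county) != total
}
print(mismatches)
```

The same check run against ZCTAs would need a tract-to-ZCTA relationship file, since tracts do not nest inside ZCTAs.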
www.census.gov/.../2021-01.html
Yes, it does appear that the margins of error are on average higher for ZCTAs than for county or tract. My math for 2016-2020:
| Geography | SUM E_TOTPOP | MAX E_TOTPOP | AVG E_TOTPOP | STDEV E_TOTPOP | MAX M_TOTPOP* | AVG M_TOTPOP* | STDEV M_TOTPOP* |
|---|---|---|---|---|---|---|---|
| County | 326,569,308 | 10,040,682 | 103,903 | 332,097 | 1,564 | 7 | 51.5 |
| Tract | | 39,373 | 3,882 | 1,657 | 5,676 | 555 | 295.95 |
| ZCTA | 326,549,615 | 126,310 | 9,898 | 14,762 | 6,567 | 598 | 660.2 |

*nulls excluded
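For anyone who wants to reproduce this kind of summary, here is a small sketch of the calculation with nulls excluded. The `summarize` helper is hypothetical, and it assumes the sample standard deviation; I can't tell from the numbers above whether the sample or population version was used.

```python
import statistics


def summarize(values):
    """Sum, max, mean and sample stdev of a column, skipping nulls (None)."""
    clean = [v for v in values if v is not None]
    return {
        "sum": sum(clean),
        "max": max(clean),
        "avg": statistics.mean(clean),
        "stdev": statistics.stdev(clean),  # assumption: sample standard deviation
    }


# Made-up margin-of-error column with one null, not real ACS values.
print(summarize([2, 4, None, 6]))
```

Running it once on the estimate column and once on the MOE column for each geography gives one row of a summary like this.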