Efficiently create long panel of 1-year estimates

Hi,

For research that I am currently conducting, I would like to create a panel data set with 1-year estimate data at the ZCTA level for 2009-2019. The data should have population counts by age, education level, and employment status, as well as median income. 

Using the FTP site, I see that I can identify the appropriate tables for each variable, use the sequence number to identify the right file, and download that file. For a relatively long panel, though, this is laborious because tables, sequence numbers, and variables sometimes change across years.

Has anyone created a panel of ACS 1-year estimates, so that I don't have to replicate this process?

If I do have to pull data for each year from the FTP site, are there any tips or best practices to efficiently wade through all those data?

  • Hi! Unfortunately, the ACS does not publish 1-year estimates at the ZCTA level. There isn't a big enough sample in one year to produce meaningful data at that small a geography.

  • That makes sense - thanks for pointing that out!

    If I were shooting for a higher level, like county-level measures of the variables from the 1-year estimates, do you have suggestions for how to efficiently create such a dataset? Must one rebuild it from the yearly files pulled from the FTP? 

    I'm interested in the 1-year estimates because I hope to capture year-to-year variation in these population counts. I appreciate the point about sample size, and hopefully using county-level data would help with the sampling error.

  • The 1-year estimates are published only for areas with at least 65,000 residents, which excludes most counties (though the included counties, with their larger populations, account for a significant majority of the US population).

    Even for larger areas, the 1-year estimates will in many cases have margins of error large enough that year-to-year changes are not statistically significant. I'd recommend looking at this ACS users handbook for guidance on making comparisons and considering statistical significance.

    To answer your main question, you could consider using the API for ACS data to request 1-year tables in a systematic, programmatic way that would be more efficient. If you're an R user, there's an R package, tidycensus, that provides R helper functions for interfacing with the API.

    If you can think of a way to do your analysis with 5-year estimates, you could use IPUMS NHGIS to get time series tables that link ACS 5-year data and census data across time, allowing you to select which tables and years to include in your data file. There's also an NHGIS API that would allow you to request 1-year or 5-year tables, though NHGIS has 1-year data only back to 2010.
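
To make the API suggestion above a bit more concrete, here is a minimal Python sketch of looping over years and requesting county-level 1-year estimates. It is only an illustration under some assumptions: the endpoint pattern `https://api.census.gov/data/{year}/acs/acs1`, the variable codes `B01003_001E` (total population) and `B19013_001E` (median household income), and all function names are my own choices, not something confirmed in this thread. Since tables and variables can change across years, check each year's variable list before relying on the codes.

```python
import json
import urllib.parse
import urllib.request

# Assumed endpoint pattern for ACS 1-year data; confirm it for each year you need.
BASE_URL = "https://api.census.gov/data/{year}/acs/acs1"

# Assumed variable codes (total population, median household income).
# Codes can change between years, so verify them against each year's variable list.
VARIABLES = ["NAME", "B01003_001E", "B19013_001E"]


def build_url(year, variables=VARIABLES, geography="county:*", api_key=None):
    """Build the request URL for one year of county-level estimates."""
    params = {"get": ",".join(variables), "for": geography}
    if api_key:
        params["key"] = api_key
    # Keep commas, colons, and asterisks readable in the query string.
    query = urllib.parse.urlencode(params, safe=",:*")
    return BASE_URL.format(year=year) + "?" + query


def rows_to_records(rows, year):
    """Turn a response (a header row followed by data rows) into dicts tagged with the year."""
    header, *data = rows
    return [dict(zip(header, row), year=year) for row in data]


def fetch_year(year, **kwargs):
    """Download and parse one year. Requires network access."""
    with urllib.request.urlopen(build_url(year, **kwargs)) as resp:
        return rows_to_records(json.load(resp), year)


def build_panel(years, **kwargs):
    """Stack the yearly records into one long list (one record per county-year)."""
    panel = []
    for year in years:
        panel.extend(fetch_year(year, **kwargs))
    return panel
```

Calling `build_panel(range(2009, 2020))` would then give one record per county and year. Values come back as strings, so convert them to numbers before analysis. Because of the 65,000-resident threshold, counties can enter or leave the panel from year to year. To judge significance, you would also want the matching margin-of-error variables.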
