Using ACS for a long study period to be blended with a large cohort

Hi,

I have a study looking to examine health outcomes over a long period and use Census data/ACS to provide contextual factors. Here are some hypothetical descriptors:

  • 1,000,000 individuals with addresses geocoded to X,Y coordinates
  • The addresses represent where the individuals were living at the time of a common medical procedure 
  • Data spatially joined to get 2020 FIPS codes at the block group level
  • Records span from 1998 to 2021
  • Several very different regions across the country are contributing.

Advice tends to say, "do not use overlapping periods for comparison", but over such a long period there have been substantial changes to these areas. The 1-year estimates have larger MOEs, but there are big jumps even in the 5-year estimates.
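To make the "big jumps vs. MOE" question concrete, here is a minimal sketch of the usual significance check between two non-overlapping estimates. It assumes the published MOEs are at the 90% confidence level (so SE = MOE / 1.645) and that the two estimates are independent; the function name and the example MOEs are made up for illustration.

```python
import math

Z_90 = 1.645  # critical value matching the 90% confidence level of published ACS MOEs

def z_score_of_difference(est1, moe1, est2, moe2):
    """Z statistic for the difference of two independent estimates.

    Assumes both MOEs are 90% MOEs and the periods do not overlap.
    """
    se1 = moe1 / Z_90
    se2 = moe2 / Z_90
    return (est2 - est1) / math.sqrt(se1 ** 2 + se2 ** 2)

# Hypothetical numbers: median income of 30,000 (MOE 4,000) vs 100,000 (MOE 12,000)
z = z_score_of_difference(30_000, 4_000, 100_000, 12_000)
is_significant = abs(z) > Z_90
```

With overlapping 5-year periods the independence assumption does not hold, which is why this check is normally limited to non-overlapping releases.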

The study would like to use block group level estimates, but the documentation suggests a few things:

  1. MOEs are much larger at the block group level
  2. Block groups should be combined prior to analysis: 
    1. https://www.federalregister.gov/documents/2018/11/13/2018-24570/block-groups-for-the-2020-census-final-criteria 
    2. I read this as, "you can do this, but you need to commit effort into aggregating block groups effectively".
  3. Only ~1% of households are sampled across the US for the ACS in a given year, which means between 6 and 30 survey responses were used for a given block group, whereas a tract would consistently have ~45.
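For reference, the aggregation in point 2 might look roughly like this. It is only a sketch, assuming count-type estimates (e.g., number of households) and the commonly documented approximation that the MOE of a sum is the square root of the sum of squared MOEs. The function names and sample numbers are hypothetical.

```python
import math

def combine_block_groups(estimates, moes):
    """Sum count estimates and approximate the MOE of the combined total.

    Uses MOE_sum = sqrt(sum(MOE_i^2)), which ignores any covariance
    between the component estimates.
    """
    if len(estimates) != len(moes):
        raise ValueError("estimates and moes must have the same length")
    total = sum(estimates)
    combined_moe = math.sqrt(sum(m ** 2 for m in moes))
    return total, combined_moe

def coefficient_of_variation(estimate, moe, z=1.645):
    """CV = SE / estimate, with SE derived from a 90% MOE."""
    return (moe / z) / estimate

# Two hypothetical block groups: 300 households (MOE 30) and 400 households (MOE 40)
total, moe = combine_block_groups([300, 400], [30, 40])
```

This only applies to counts; medians, ratios, and percentages need different handling, which is part of why the aggregation represents real work.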

Here's what I think I would propose to this study:

  1. Use tract-level data. You could use block groups, but that means work for the regions to aggregate block groups effectively.
  2. For 1998-2005 records: use the 2000 Decennial Long form SES values.
  3. For 2006-2010: use the 2006-2010 ACS 5-year. Although the sampling rate was lower, ACS was collected during this period.
  4. For 2011-2021: use the ACS release whose year matches the record year
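The year-to-source assignment in steps 2 to 4 could be sketched as a simple lookup. The function name and the returned labels are placeholders, and the sketch deliberately leaves open whether the 2011-2021 release is a 1-year or 5-year product.

```python
def source_for_record_year(year: int) -> str:
    """Map a record's year to the proposed contextual data source."""
    if 1998 <= year <= 2005:
        return "2000 Decennial long form"
    if 2006 <= year <= 2010:
        return "ACS 2006-2010 5-year"
    if 2011 <= year <= 2021:
        # Release matched to the record year (1-year vs 5-year not decided here)
        return f"ACS {year} release"
    raise ValueError(f"year {year} is outside the 1998-2021 study period")

# Example: label a few hypothetical record years
labels = {y: source_for_record_year(y) for y in (1998, 2008, 2015)}
```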

The rationale, at least from my experience in a rapidly changing city, is that there are some big changes. In the neighborhood where I work, the median household income went from ~$30k to >$100k between the 2010 and 2020 ACS 5-year releases.

Anyhow, does this seem like a rational methodology for blending ACS data with such a long study frame?

If not, can you tell me why, and what you would suggest instead?