I am working on refining and publicizing the Yost Index, which is a composite index incorporating information from 7 ACS tables (poverty, education, home value, income, employment, mortgage, and rent). It has been around since the early 2000s and does a good job of reducing the complexity of SES information in health studies where SES is an important confounder but not the primary focus of inquiry. It has been calculated for the nation and by state, at the block group and census tract level, and for a variety of years.
I have recently been asked to compute the margin of error for the index. That was not the request exactly - it was more in the form of a challenge: "aren't the MOEs for your variables so massive that what you are doing is not workable?" I disagree - for many of these variables the MOEs are typically around 10-20% of the estimate. One way to respond is to compute the MOEs and let the users decide whether they are massive or not. However, I am not sure of the best way to do this across 7 tables. It seems that the errors would not be independent - for a given census tract, if income is underestimated, then poverty is likely to be overestimated.
The ACS Handbook provides a set of formulas you can use to calculate MOEs. (See Chapter 8: https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_general_handbook_2020_ch08.pdf)
The only challenge is that (as you described above) the formulas do not take into account whether the errors in the components covary, so they likely overstate the error if the errors are correlated to begin with. To the best of my knowledge, no one has solved that "problem" yet (though if someone has, hopefully they'll post here!)
(To answer that question you could use PUMS microdata to produce the index and its error estimates for larger areas, compare those with the MOE obtained by aggregating tract data, and analyze the magnitude of the over- (or under-) estimates of the MOE.)
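To make the covariance point concrete, here is a rough sketch (not an official Census formula) of how the MOE of a weighted sum changes once a correlation between the component errors is included. It assumes the index can be approximated as a weighted sum of its components; the weights, the correlation matrix, and the function name `moe_linear_combo` are all hypothetical. Whether ignoring covariance overstates or understates the MOE depends on the signs of the weights and correlations.

```python
import numpy as np

Z = 1.645  # ACS MOEs are published at the 90 percent confidence level

def moe_linear_combo(weights, moes, corr=None):
    """MOE of sum(w_i * x_i) given component MOEs.

    corr is an ASSUMED correlation matrix of the sampling errors.
    With corr=None the errors are treated as independent, which is
    what the square-root-of-summed-squares approximation does.
    """
    w = np.asarray(weights, dtype=float)
    se = np.asarray(moes, dtype=float) / Z  # convert MOE to standard error
    if corr is None:
        corr = np.eye(len(w))
    cov = np.outer(se, se) * np.asarray(corr, dtype=float)
    return Z * np.sqrt(w @ cov @ w)

# Hypothetical two-component example with MOEs of 3 and 4
independent = moe_linear_combo([1, 1], [3, 4])
negatively_correlated = moe_linear_combo([1, 1], [3, 4], corr=[[1, -0.5], [-0.5, 1]])
print(independent, negatively_correlated)
```

In this toy case the negatively correlated errors give a smaller MOE than the independence assumption; flipping the sign of one weight reverses that.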
It wouldn’t work for all summary levels, but the Census Bureau publishes selected tables as variance replicate tables (https://www.census.gov/programs-surveys/acs/data/variance-tables.html). In each table you’ll find 80 replicate columns, each one holding the estimate computed with one set of replicate weights. From these tables, you could calculate 80 versions of the index (assuming all the components are available) and use the successive difference replication formula to estimate the standard error.
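As a minimal sketch of that calculation: the successive difference replication variance is 4/80 times the sum of squared differences between each replicate estimate and the full-sample estimate, and the 90 percent MOE is 1.645 times the resulting standard error. The function names and the `index_fn` argument are placeholders for however you compute the index from its components; they are not part of any Census tool.

```python
import numpy as np

def sdr_standard_error(estimate, replicates):
    """Standard error from replicate estimates: sqrt(4/R * sum((rep - est)^2))."""
    reps = np.asarray(replicates, dtype=float)
    return np.sqrt(4.0 / reps.size * np.sum((reps - estimate) ** 2))

def index_se(component_estimates, component_replicates, index_fn):
    """Compute the index once per replicate, then apply the SDR formula.

    component_estimates: shape (k,) full-sample estimates for one geography
    component_replicates: shape (80, k), one row per replicate
    index_fn: hypothetical function mapping k components to one index value
    """
    est = index_fn(np.asarray(component_estimates, dtype=float))
    reps = np.array([index_fn(row) for row in np.asarray(component_replicates, dtype=float)])
    return est, sdr_standard_error(est, reps)

# Toy data: two components, 80 made-up replicates, index = simple sum
estimates = np.array([1.0, 2.0])
signs = np.where(np.arange(80) % 2 == 0, 1.0, -1.0)[:, None]
replicates = estimates + 0.5 * signs
est, se = index_se(estimates, replicates, lambda x: x.sum())
print(est, se, 1.645 * se)  # estimate, standard error, 90 percent MOE
```

Because the index is recomputed within each replicate, any covariance between the components' errors is carried through automatically.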
Thanks, I'll start with this (assuming all components are available)
Thanks for your suggestion - it seems like that would make for a good paper on its own.
Unfortunately, only 3 of the 7 Yost index components are available.
That's a shame. Out of curiosity, which measures were not available?
I ask because while it may not be possible to generate the "true" Yost index, a close approximation may be feasible:
For instance, if median income is not directly available, you could substitute mean income or calculate the median from the distribution table (ACS medians are linear interpolations anyway, not true medians).
And the SE on the approximation may be closer to the true SE than other methods.
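For the median-from-distribution idea, a simple linear interpolation within the bin that contains the 50th percentile could look like the sketch below. This is only an illustration: the bin edges are made up, the open-ended top bin needs an assumed upper edge, and the Census Bureau's exact procedure may differ in its details.

```python
import numpy as np

def interpolated_median(bin_edges, counts):
    """Median of grouped data by linear interpolation within the median bin.

    bin_edges has len(counts) + 1 entries; the top bin must be closed
    with an ASSUMED upper edge.
    """
    counts = np.asarray(counts, dtype=float)
    edges = np.asarray(bin_edges, dtype=float)
    total = counts.sum()
    if total <= 0:
        raise ValueError("counts must sum to a positive number")
    half = total / 2.0
    cum = np.cumsum(counts)
    i = int(np.searchsorted(cum, half))  # first bin whose cumulative count reaches half
    below = cum[i - 1] if i > 0 else 0.0
    return edges[i] + (half - below) / counts[i] * (edges[i + 1] - edges[i])

# Hypothetical distribution: 10 households in 0-10, 20 in 10-20, 10 in 20-30
print(interpolated_median([0, 10, 20, 30], [10, 20, 10]))
```

Running the same function on each replicate's distribution would give replicate medians that can feed the standard error calculation.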
I have the education, occupation, and employment variables. I am missing median home value, median rent, and median income - but as you say, these could be developed. I have the wrong poverty value, but again it could be approximated.
However, since I can calculate the covariance structure for these 7 variables myself (the correlations all range from moderate to strong), couldn't I just simulate replicates myself that have the correct means, standard deviations, and covariance structure? (In fact, I went ahead and did this, and the results look OK.)
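One way such a simulation could be set up (a sketch, not necessarily what was actually done) is to draw the components from a multivariate normal centered on the published estimates, with standard errors taken from the MOEs and a supplied correlation matrix, then recompute the index for each draw. The names `simulate_index_se` and `index_fn` are hypothetical, and the sketch assumes the correlation matrix is an acceptable stand-in for the correlation of the sampling errors.

```python
import numpy as np

def simulate_index_se(estimates, ses, corr, index_fn, n_draws=1000, seed=0):
    """Simulated standard error of an index for one geography.

    estimates, ses: shape (k,) component estimates and standard errors
    corr: (k, k) ASSUMED correlation matrix of the sampling errors
    index_fn: hypothetical function mapping k components to one index value
    """
    est = np.asarray(estimates, dtype=float)
    se = np.asarray(ses, dtype=float)
    cov = np.outer(se, se) * np.asarray(corr, dtype=float)
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(est, cov, size=n_draws)  # shape (n_draws, k)
    index_draws = np.apply_along_axis(index_fn, 1, draws)
    return index_draws.std(ddof=1)

# Toy check: two independent components with SEs 3 and 4, index = simple sum
se_sim = simulate_index_se([0.0, 0.0], [3.0, 4.0], np.eye(2), lambda x: x.sum(), n_draws=20000)
print(se_sim)  # should land near 5 for this toy case
```

Multiplying the simulated standard error by 1.645 gives an MOE at the same 90 percent level the ACS uses.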
That works too.