# Using ACS to construct a socioeconomic index and margins of error

I am working on refining and publicizing the Yost Index, which is a composite index incorporating information from 7 ACS tables (poverty, education, home value, income, employment, mortgage, and rent). It has been around since the early 2000s and does a good job of reducing the complexity of SES information in health studies where SES is an important confounder but not the primary focus of inquiry. It has been calculated for the nation and by state, at the block group and census tract level, and for a variety of years.

I have recently been asked to compute the margin of error for the index. That was not the request exactly - it was more in the form of a challenge: "aren't the MOEs for your variables so massive that what you are doing is not workable?" I disagree - for many of these variables the MOEs are typically around 10-20% of the estimate. One way to respond is to compute the MOEs and let the users decide if they are massive or not. However, I am not sure of the best way to do this across 7 tables. It seems that the errors would not be independent - for a given census tract, if income is underestimated, then poverty is likely to be overestimated.
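To make the dependence concern concrete, here is a sketch under stated assumptions: suppose the index can be written as a weighted sum $I = \sum_i w_i X_i$ of the seven ACS estimates. The weights $w_i$ and the sampling-error correlations $\rho_{ij}$ are illustrative assumptions, not part of any published definition of the index. ACS MOEs are published at the 90% confidence level, so each standard error is the MOE divided by 1.645:

$$
\begin{aligned}
SE_i &= \frac{MOE_i}{1.645} \\
\operatorname{Var}(I) &= \sum_{i} w_i^2 \, SE_i^2 + 2 \sum_{i<j} w_i w_j \, \rho_{ij} \, SE_i \, SE_j \\
MOE_I &= 1.645 \sqrt{\operatorname{Var}(I)}
\end{aligned}
$$

If the errors were independent, every $\rho_{ij}$ would be zero and only the first sum would remain. If income and poverty errors are negatively correlated and the two variables enter the index with opposite signs, their cross term is positive, so ignoring it would understate the MOE.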

• That's a shame. Out of curiosity, which measures were not available?

I ask because, while it may not be possible to generate the "true" Yost index, it may be possible to compute a close approximation:

For instance, if median income is not directly available, you could substitute mean income or calculate the median from the distribution table (ACS medians are linear interpolations anyway, not true medians).

And the SE of the approximation may be closer to the true SE than what other methods would give.
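The interpolated-median substitution suggested above can be sketched roughly as follows. This is a plain linear interpolation within the bin that contains the median; the function name, bin edges, and counts are made up for illustration, and the Census Bureau's own procedure may differ in details such as bin definitions and the handling of open-ended top bins.

```python
def interpolated_median(bin_edges, counts):
    """Approximate a median from binned counts by linear interpolation.

    bin_edges: ascending edges, one more than the number of bins.
    counts:    number of households (or units) in each bin.
    Assumes every bin is closed; an open-ended top bin would need
    an explicit upper edge or a different rule.
    """
    if len(bin_edges) != len(counts) + 1:
        raise ValueError("need exactly one more edge than counts")
    total = sum(counts)
    if total <= 0:
        raise ValueError("counts must sum to a positive number")

    half = total / 2
    cumulative = 0
    for lower, upper, n in zip(bin_edges[:-1], bin_edges[1:], counts):
        # The median bin is the first one whose cumulative count reaches half.
        if n > 0 and cumulative + n >= half:
            fraction = (half - cumulative) / n
            return lower + fraction * (upper - lower)
        cumulative += n
    raise ValueError("median bin not found")


# Hypothetical income distribution for one tract (not real ACS data).
edges = [0, 10_000, 20_000, 30_000, 40_000]
households = [10, 20, 40, 30]
print(interpolated_median(edges, households))
```

Repeating the calculation on each replicate or simulated version of the distribution table would give an SE for the approximated median.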

• I have the education, occupation, and employment variables. I am missing median home value, median rent, and median income - but as you say, these could be developed. I have the wrong poverty value, but again it could be approximated.

However, since I can calculate the covariance structure for these 7 variables myself (the correlations all range from moderate to strong), couldn't I just simulate replications that have the correct means, standard deviations, and covariance structure? (In fact, I went ahead and did this, and the results look OK.)
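A minimal sketch of that kind of simulation, for one tract, might look like the following. Several things here are assumptions rather than anything stated in this thread: the errors are treated as multivariate normal, the supplied correlation matrix is taken as a stand-in for the correlation of the sampling errors, and a hypothetical weighted sum stands in for the actual index calculation. The numbers are invented.

```python
import numpy as np

Z90 = 1.645  # ACS MOEs are published at the 90% confidence level


def simulate_index_moe(estimates, moes, corr, weights, n_sims=20_000, seed=0):
    """Simulate correlated replicates and return the 90% MOE of a weighted sum."""
    estimates = np.asarray(estimates, dtype=float)
    se = np.asarray(moes, dtype=float) / Z90          # MOE -> standard error
    cov = np.outer(se, se) * np.asarray(corr, float)  # covariance from SEs and correlations
    rng = np.random.default_rng(seed)
    draws = rng.multivariate_normal(estimates, cov, size=n_sims)
    index = draws @ np.asarray(weights, dtype=float)  # hypothetical index: weighted sum
    return Z90 * index.std(ddof=1)


# Invented values for three percentage variables in one tract.
estimates = [15.0, 8.0, 12.0]
moes = [3.0, 2.0, 2.5]
corr = [[1.0, 0.5, 0.4],
        [0.5, 1.0, 0.3],
        [0.4, 0.3, 1.0]]
weights = [1 / 3, 1 / 3, 1 / 3]

print(simulate_index_moe(estimates, moes, corr, weights))
```

For a plain weighted sum the simulated MOE should land close to the analytic value `1.645 * sqrt(w @ cov @ w)`; the simulation becomes more useful when the index is a nonlinear function of the inputs.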

• That works too.