Standard errors for tract-level school enrollment ratios from ACS tables

My organization often needs to estimate the net school enrollment ratio of children and youth ages 3-24 (inclusive) for various levels of geography including census tracts. For larger geographies, we estimate this directly from the PUMS and calculate standard errors using replicate weights. For tracts though we’re stuck with what we can get from Table B14003 SEX BY SCHOOL ENROLLMENT BY TYPE OF SCHOOL BY AGE FOR THE POPULATION 3 YEARS AND OVER. Obtaining the numerator, the total number of students enrolled in public or private school between the ages of 3 and 24, requires summing 24 individual estimates in the table, which breaks enrollment down by age, gender, and control of school. I’ve used the methods outlined by Census in their “Accuracy of the Data” documents to obtain the approximated SE for this sum but I have serious doubts that these approximations tell us much of anything given that so many individual MoE estimates have to be aggregated. Does anyone have a suggestion for a better way to calculate or even indirectly estimate the SE for school enrollment of this age range at the census tract level? Thank you!
  • Hi Patrick,

    In the first session of the ACS Data User Group Conference, a number of us wrestled with this problem. The session was titled "Aggregating ACS Estimates and Calculating Margins of Error." I am afraid that there is no good answer for calculating the SE after combining geographies or collapsing categories. The problem lies with the unknown covariance term. Check the presentations to see if you pick up any useful suggestions.
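
    To spell out where the covariance term enters (a sketch in generic notation, not taken from the conference materials): for a sum of k estimates, the variance is

    ```latex
    \[
    \operatorname{Var}\Big(\sum_{i=1}^{k} \hat{X}_i\Big)
      = \sum_{i=1}^{k} \operatorname{Var}(\hat{X}_i)
      + 2 \sum_{i<j} \operatorname{Cov}(\hat{X}_i, \hat{X}_j)
    \]
    % The published approximation drops the second sum:
    \[
    \mathrm{SE}\Big(\sum_{i=1}^{k} \hat{X}_i\Big)
      \approx \sqrt{\sum_{i=1}^{k} \mathrm{SE}(\hat{X}_i)^2}
    \]
    ```

    The approximation ignores k(k-1)/2 covariance terms, which is 276 of them when k = 24 and 66 when k = 12. That is one reason to keep the number of combined cells as small as possible.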

    Best,
    Warren
  • At the ACS conference a number of commenters referenced a Census Bureau "guideline" that no more than 4 ACS categories can be combined before the effect of covariance undermines the utility of the result. It was not entirely clear where that figure came from, but I do recall seeing it myself.

    You might be able to limit the number of categories you use to 12, rather than 24, by subtracting rather than adding. For example, take the total number of Males Enrolled in a Public School and subtract the 25 to 34 and 35 and Over subcategories. The method for calculating the SE would be the same as for the addition of multiple values, but now you use three terms to get the result for each sex-by-school-type group instead of six. My recollection is that this is an acceptable procedure and might reduce the scale of your covariance issue.
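
As a minimal sketch of the two routes for one group (males enrolled in public school), assuming the usual Census approximation in which the MOE of a sum or difference is the square root of the sum of squared MOEs, and SE = MOE / 1.645 for the published 90 percent MOEs. All MOE values below are made up for illustration, and the function names are hypothetical:

```python
import math

Z90 = 1.645  # published ACS MOEs are at the 90 percent confidence level


def moe_of_sum(moes):
    """Approximate MOE of a sum or difference of estimates.

    Ignores covariance between the estimates, so it is only an approximation.
    """
    return math.sqrt(sum(m * m for m in moes))


def se_from_moe(moe):
    """Convert a 90 percent MOE to a standard error."""
    return moe / Z90


# Hypothetical MOEs for one tract (not real ACS values).
total_moe = 120.0              # males enrolled in public school, all ages
older_moes = [40.0, 55.0]      # ages 25 to 34, ages 35 and over
age_3_24_moes = [60.0, 45.0, 50.0, 35.0, 30.0, 25.0]  # six age cells, 3 to 24

# Route 1: add the six age cells for ages 3 to 24.
moe_add = moe_of_sum(age_3_24_moes)

# Route 2: total minus the two older cells (three terms).
moe_subtract = moe_of_sum([total_moe] + older_moes)

print("adding six cells:      MOE %.1f, SE %.1f" % (moe_add, se_from_moe(moe_add)))
print("subtracting two cells: MOE %.1f, SE %.1f" % (moe_subtract, se_from_moe(moe_subtract)))
```

Note that fewer terms does not guarantee a smaller approximated MOE, because the MOE of the total can itself be large. The benefit is that fewer covariance terms are being ignored.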
  • You could even get it down to five with...
    total population 3 and over - (males 25 To 34 Years not enrolled + males 35 Years And Over not enrolled + females 25 To 34 Years not enrolled + females 35 Years And Over not enrolled)

    A big picture approach to this problem would be for the Census Bureau to release more crosstabs. In this particular example, you could imagine an extra block of results not split by gender, which would get the computation down to three values.
  • Many thanks Warren, Cliff, and David for your thoughts on this. Great suggestions and I look forward to trying them out. Wish I had been able to make the conference! I will have a look at the presentations from the first session on this topic though. Thank you for your help!
  • This once again illustrates the fact that the detailed tables need a thorough review. I "screamed" loud and clear about this problem regarding labor force participation and unemployment rate calculation, until, at long last, a table was added with a simple tabulation (I think it's B40). Clearly similar lobbying is needed for B14003.

    The basic problem comes, originally, from the fact that the detailed tables were designed by the subject matter experts in the Bureau, who did (and still do) not understand the needs of small area data users. They use 3 and 4 way crosstabs at the national level. We're much more likely to need simple frequencies or two-way cross tabs.