Unexpectedly low margins of error for 2020 PUMS 1-yr file

I am using the 2020 1-year ACS, accessed through IPUMS. I created two variables: an AMI threshold category based on HUD income cutoffs for the region of interest, and a flag for whether the household had more than one occupant per room (overcrowded).

I joined my cleaned data to the replicate weights (IPUMS automatically supplies the experimental weights and experimental replicate weights for 2020). I then turn the cleaned data into a household-level survey object (h_repd) using the srvyr package in R, and compute a simple household count grouped by PUMA, AMI threshold, and overcrowding. See below:
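A simplified sketch of my code (ami_cat and overcrowded are my derived flags, clean_hh is my cleaned household file, and REPWT1-REPWT80 are the IPUMS household replicate weights; exact arguments may differ slightly from my actual script):

```r
library(dplyr)
library(srvyr)

# Household-level replicate-weight design. Fay's method with rho = 0.5
# reproduces the ACS successive difference replication (SDR) variance
# formula, Var(X) = (4/80) * sum_r (X_r - X)^2.
h_repd <- clean_hh %>%
  as_survey_rep(
    weights    = HHWT,
    repweights = matches("REPWT[0-9]+"),  # REPWT1-REPWT80
    type       = "Fay",
    rho        = 0.5,
    mse        = TRUE
  )

# Household counts by PUMA, AMI threshold, and overcrowding,
# with SE and CV, plus the unweighted cell size for context
counts <- h_repd %>%
  group_by(PUMA, ami_cat, overcrowded) %>%
  summarize(
    n_hh  = survey_total(vartype = c("se", "cv")),
    n_unw = unweighted(n())
  )
```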

However, when I run the numbers, my standard errors come out really low, suggesting the estimates are more "reliable" than expected. But we know the 2020 data are fraught with quality issues and should have larger standard errors.

For example, in the output below, when the unweighted sample size n is just 1, it shows a CV of 12% (line 3) or 28% (line 7). 

Compare this to the results for 2019 (below). When n is very small, such as 2, the CV is at least about 70%.

I assume there must be an error somewhere in my code, but the fact that 2019 looks correct while 2020 does not makes me wonder whether others have run into similar issues.

Does anyone have any guidance on this? Should my SEs and CVs be this low? Am I forgetting a step?

  • I have zero understanding of your code etc., but I wonder if you could run this for another area or two to see whether other data fall into the same low range, and what the variance from 2019 looks like as well. I assume this would be easy to do.

  • Here is 2019+2020 for Seattle-area PUMA 11603 (area containing Capitol Hill). A similar thing is happening.

  • Wow, that was fast. However, due to regional staffing and survey variations, it might be best to also use an Eastern location. Then I might run 2018 to confirm that this year is the anomaly. I'm new to this group, but I'm a big fan of filing tickets (assuming you have some clue there's an issue). Let's see what some of the experts say first. I log easily 20-30 tickets a year with the NIH and 95% are valid (not me being stupid), and I generally get great responses, and changes are made.

  • Using replicate weights will, in general, result in quite low standard errors compared to the published numbers that the Census puts out, if one follows a reasonable approach. This is especially true for income, since income is, of course, highly skewed. The reason is that the Bureau incorrectly uses the normal approximation in cases where it should not, and for income it uses an outdated approach to estimation. The most telling examples are published tables where the margin of error on a count implies a negative lower bound (quick illustration below). A long time ago, I sent them a memo on this that resulted in a long discussion, which included Rod Little when he was at the Bureau. If you want to read about it, the discussion, including my memo, is available here: www.dropbox.com/.../Memo_Regarding_ACS-With_Response.pdf
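    A quick numeric illustration of how a symmetric normal-approximation interval goes negative for a small count (the estimate and MOE here are made up for illustration):

    ```r
    # Published ACS MOEs are 90% intervals: MOE = 1.645 * SE
    est <- 20   # hypothetical count estimate
    moe <- 35   # hypothetical published MOE

    c(lower = est - moe, upper = est + moe)
    #>  lower  upper
    #>    -15     55   # a negative lower bound is impossible for a count
    ```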

  • Just as background, the formula for estimating variance from replicates does not include the number of observations used to create the estimate in its denominator. Rather, the variance is estimated from the variability across the estimates computed under each set of replicate weights (see the sketch at the end of this reply). Generally, the ACS uses certain "controls" that post-stratify the ACS weights to known population totals from the Population Estimates Program. When an ACS estimate becomes "close" to a controlled estimate, its variance decreases, since the estimate is designed to mirror the controlled estimates (often without error).

    All this is to say, the 2020 ACS experimental data used a different weighting scheme than the usual ACS data and brought in additional sources of auxiliary data to adjust the weights. Because these adjustments were designed to address differential nonresponse with respect to socioeconomic status, they may also have had the effect of reducing the variability of the weights when adjustments were applied to small strata. For example, if there was lower response among poorer populations, resulting in a smaller N within a particular stratum, those weights may have been adjusted so that the sum of the (replicate) weights matched a known total. In the extreme case of only one observation in the stratum, all replicates could be fixed to match the calibration estimate.

    tl;dr: The 2020 ACS experimental weights did things to address SES-related nonresponse, which could reduce the variances of related SES measures.
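    To make the variance formula concrete, here is a minimal sketch of the ACS successive difference replication (SDR) calculation, Var(X) = (4/80) * sum_r (X_r - X)^2, using made-up replicate estimates. Note that n appears nowhere: if calibration pins every replicate estimate to the full-sample estimate, the variance collapses to exactly zero no matter how few observations are in the cell.

    ```r
    # SDR standard error from 80 replicate estimates (made-up numbers)
    sdr_se <- function(full_est, rep_ests) {
      sqrt((4 / length(rep_ests)) * sum((rep_ests - full_est)^2))
    }

    set.seed(1)
    rep_ests <- 500 + rnorm(80, sd = 60)  # replicates scattered around 500
    sdr_se(500, rep_ests)                 # a nontrivial SE

    # Extreme calibration case: all replicates pinned to the control total
    sdr_se(500, rep(500, 80))             # exactly 0, even with n = 1
    ```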

  • Thank you for following up! I had only prepared an extract for Oregon and Washington (which is why I was "fast" at doing the same for Seattle, haha). Let me try to prepare another extract for my home state of Michigan.

  • Building off the earlier comment about the different weighting methodology: I remember reading this working paper a few months ago, and it really helped me understand what was going on with the experimental weights: https://www.census.gov/content/dam/Census/library/working-papers/2021/acs/2021_Rothbaum_01.pdf

    I feel like they may have mentioned that the methodology they used could lower the standard errors, or something to that effect (maybe something about efficiency gains on the factors the weights were tied to?), but it was too long ago for me to be sure. I wanted to post it just in case it's helpful to you!