I am using the 2020 1-year ACS, accessed from IPUMS. I created two variables: the AMI threshold, based on HUD income cutoffs within the region of interest, and an indicator for whether the household had more than one occupant per room (overcrowded).
I joined my cleaned data to the replicate weights (for 2020, IPUMS automatically supplies the experimental weights and experimental replicate weights). I then turned the cleaned data into a household-level survey object (h_repd) using the srvyr package in R. I am trying to get a simple household count grouped by PUMA, AMI threshold, and overcrowding. See below:
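A minimal sketch of what this workflow might look like (this is not the original code: the data frame name acs_2020 and the variables ami_threshold and overcrowded are illustrative, and REPWT1–REPWT80 are the standard IPUMS household replicate weight names):

```r
library(dplyr)
library(srvyr)

# Keep one record per household, then build a replicate-weight design.
# scale = 4/80 with rscales = rep(1, 80) reproduces the Census successive
# difference replication variance formula for the ACS, per IPUMS guidance.
h_repd <- acs_2020 %>%
  filter(PERNUM == 1) %>%
  as_survey_rep(
    weights    = HHWT,
    repweights = matches("REPWT[0-9]+"),
    type       = "JK1",
    scale      = 4 / 80,
    rscales    = rep(1, 80),
    mse        = TRUE
  )

# Weighted household counts with SE and CV by PUMA, AMI threshold,
# and overcrowding status.
h_repd %>%
  group_by(PUMA, ami_threshold, overcrowded) %>%
  survey_tally(vartype = c("se", "cv"))
```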
However, when I run the numbers, my standard errors are really low, suggesting the estimates are more "reliable" than they should be. But we know the 2020 data are fraught with data quality issues and should have larger standard errors.
For example, in the output below, when the unweighted sample size n is just 1, it shows a CV of 12% (line 3) or 28% (line 7).
Compare this to the results for the year 2019 (below). When n is very small, such as 2, the CV is at least about 70%.
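For reference, the CV reported here is just the standard error relative to the estimate,

$$\mathrm{CV} = \frac{\mathrm{SE}(\hat{N})}{\hat{N}} \times 100\%,$$

so a low CV in a cell with n = 1 means the estimate barely changes across the replicate weights, not that the cell is precisely measured.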
I assume there must be an error somewhere in my code, but the fact that 2019 looks correct while 2020 does not makes me wonder whether others have run into similar issues.
Does anyone have any guidance on this? Should my SEs and CVs be this low? Am I forgetting a step?
Just as background: the formula for estimating variance from replicates does not include, in its denominator, the number of observations used to create the estimate. Rather, the variance is estimated…
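For reference, the successive difference replication formula the Census Bureau uses for ACS variances (with 80 replicate weights) is

$$\widehat{\mathrm{Var}}(\hat{\theta}) = \frac{4}{80}\sum_{r=1}^{80}\left(\hat{\theta}_r - \hat{\theta}\right)^2,$$

where $\hat{\theta}$ is the full-sample estimate and $\hat{\theta}_r$ is the same estimate recomputed with the r-th replicate weight. The unweighted cell size n appears nowhere in the formula, so a cell containing a single household can still show a small variance if its replicate weights happen to vary little.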
Building off of Matthew Brault's comment about the different weighting methodology: I remember reading this article a few months ago, and it really helped me understand what was going on with the experimental…
Using replicate weights will, in general, produce quite low standard errors compared to the published numbers the Census puts out. This is especially true for income, since income is, of course, highly skewed, so any reasonable estimation approach will come in below the published figures. The reason is that the Bureau incorrectly uses the normal approximation in cases where it should not, and for income it uses an outdated approach to estimation. The most telling examples are published tables where the margin of error on a count implies a negative lower bound. A long time ago, I sent them a memo on this that resulted in a long discussion, which included Rod Little when he was at the Bureau. If you want to read about it, the discussion, including my memo, is available here: www.dropbox.com/.../Memo_Regarding_ACS-With_Response.pdf
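To make the negative-bound problem concrete (the numbers here are invented for illustration): a published count of $\hat{N} = 40$ with a 90% margin of error of $\pm 55$ implies the interval $[40 - 55,\; 40 + 55] = [-15,\; 95]$. A count can never be negative, so a symmetric normal-approximation interval is the wrong model for small counts.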