Unexpectedly low margins of error for 2020 PUMS 1-yr file

I am using the 2020 1-year ACS, accessed from IPUMS. I created two variables: an AMI threshold based on HUD income cutoffs for the region of interest, and a flag for whether the household had more than one occupant per room (overcrowded).

I joined my cleaned data to the replicate weights (for 2020, IPUMS automatically uses the experimental weights and experimental replicate weights). I then turned the cleaned data into a household-level survey object (h_repd) using the srvyr package in R. I am trying to get a simple household count grouped by PUMA, AMI threshold, and overcrowding. See below:
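For context, here is a minimal sketch of the kind of design described. The variable names (HHWT, REPWT1-REPWT80, PERNUM) follow IPUMS conventions, and acs_clean, ami_threshold, and overcrowded are stand-ins for my actual extract and derived variables; Fay's method with rho = 0.5 and mse = TRUE reproduces the Census successive-difference variance formula, SE² = (4/80) Σ(estᵣ − est)²:

```r
library(dplyr)
library(srvyr)

# Household-level replicate design: one record per household
# (PERNUM == 1), household weight plus 80 household replicate weights.
h_repd <- acs_clean %>%
  filter(PERNUM == 1) %>%
  as_survey_rep(
    weights    = HHWT,
    repweights = matches("REPWT[0-9]+"),
    type       = "Fay",
    rho        = 0.5,
    mse        = TRUE
  )

# Household counts by PUMA, AMI threshold, and overcrowding flag,
# with standard errors and CVs from the replicate weights
counts <- h_repd %>%
  group_by(PUMA, ami_threshold, overcrowded) %>%
  summarize(
    n_unw = unweighted(n()),
    n_hh  = survey_total(vartype = c("se", "cv"))
  )
```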

However, when I run the numbers, my standard errors are surprisingly low, which would suggest the estimates are more "reliable" than expected. But the 2020 experimental data are known to have data quality issues and should have larger standard errors.

For example, in the output below, when the unweighted sample size n is just 1, it shows a CV of 12% (line 3) or 28% (line 7). 

Compare this to the results for 2019 (below). When n is very small, such as 2, the CV is at least roughly 70%.
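One way to check whether the design object (rather than the tabulation) is producing the low CVs is to compute the successive-difference SE for a single cell by hand from the 80 replicate weights and compare it to srvyr's output. This is a sketch under the same IPUMS naming assumptions as above; some_puma and some_level are hypothetical placeholders for the cell of interest:

```r
library(dplyr)

# Pull one (PUMA, AMI, overcrowded) cell from the cleaned data
check_cell <- acs_clean %>%
  filter(PERNUM == 1,
         PUMA == some_puma,              # placeholder PUMA code
         ami_threshold == some_level,    # placeholder AMI level
         overcrowded)

# Full-sample estimate and the 80 replicate estimates
est  <- sum(check_cell$HHWT)
reps <- sapply(1:80, function(r) sum(check_cell[[paste0("REPWT", r)]]))

# Census successive-difference formula
se <- sqrt((4 / 80) * sum((reps - est)^2))
cv <- se / est
```

If the hand-computed CV matches srvyr's, the low values are coming from the replicate weights themselves rather than from the code.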

I assume there must be an error somewhere in my code, but the fact that 2019 looks correct while 2020 does not makes me wonder whether others have run into similar issues.

Does anyone have any guidance on this? Should my SEs and CVs be this low? Am I forgetting a step?

Parents
  • I have zero understanding of your code, etc., but I wonder whether you could run this for another area or two to see if the other data fall into the same low range, and check how the variance compares to 2019 as well. I assume this would be easy to do.

Children
  • Here is 2019+2020 for Seattle-area PUMA 11603 (area containing Capitol Hill). A similar thing is happening.

  • Wow, that was fast. However, given regional variations in survey staff and administration, it might be best to also try an Eastern location. Then I might run 2018 to confirm that this year is the anomaly. I'm new to this group, but I'm a big fan of filing tickets (assuming you have some reason to believe there's an issue). Let's see what some of the experts say first. I easily log 20-30 tickets a year with the NIH, 95% of them valid (not me being stupid), and I generally get great responses and changes get made.

  • Thank you for following up! I had only prepared an extract for Oregon and Washington (hence why I was so "fast" at doing the same for Seattle, haha). Let me try to prepare another extract for my home state of Michigan.