Unexpectedly low margins of error for 2020 PUMS 1-yr file

I am using the 2020 1-year ACS, accessed from IPUMS. I created two variables: an AMI threshold based on HUD income cutoffs for the region of interest, and an indicator for whether the household had more than one occupant per room (overcrowded).

I joined my cleaned data to the replicate weights (for 2020, IPUMS automatically supplies the experimental weights and experimental replicate weights). I then turned the cleaned data into a household-level survey object (h_repd) using the srvyr package in R. I am trying to get a simple household count grouped by PUMA, AMI threshold, and overcrowding. See below:

However, when I run the numbers, my standard errors are very low, which would suggest the estimates are more "reliable." But we know the 2020 data is fraught with quality issues, so I would expect larger standard errors, not smaller ones.

For example, in the output below, cells where the unweighted sample size n is just 1 show a CV of 12% (line 3) or 28% (line 7).

Compare this to the results for the year 2019 (below). When n is very small, such as 2, the CV is at least about 70%.

I assume there must be an error somewhere in my code, but the fact that 2019 looks correct while 2020 does not makes me wonder whether others have run into similar issues.

Does anyone have any guidance on this? Should my SEs and CVs be this low? Am I forgetting a step?

  • Just as background, the formula for estimating variance from replicates does not include the number of observations behind the estimate in its denominator. Rather, the variance is estimated from how much the estimate varies across the replicate weights. Generally, the ACS uses certain "controls" that post-stratify the ACS weights to known population totals from the Population Estimates Program. When an ACS estimate becomes "close" to a controlled estimate, the variance decreases, because the estimate is designed to mirror the controlled estimates (often without error).

    All this is to say, the 2020 ACS experimental data used a different weighting scheme than the usual ACS data and brought in additional sources of auxiliary data to adjust the weights. Because these adjustments were designed to address differential response with regard to socioeconomic status, they may also have had the effect of reducing the variability of the weights if adjustments were applied to small strata. For example, if there were lower response rates among poorer populations, which resulted in a smaller N within a particular stratum, those weights may have been adjusted so that the sum of the weights (in each replicate) matched a total. In the extreme case of only 1 observation in the stratum, all replicates could be fixed to match the calibration estimate.

    tl;dr: The 2020 ACS experimental weights included adjustments for nonresponse related to SES, which could reduce the variances of related SES measures.
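
To make the mechanics above concrete, here is a minimal Python sketch (an illustration only, not the R/srvyr code from the question). It applies the ACS replicate-weight variance formula, (4/80) times the sum of squared differences between each of the 80 replicate estimates and the full-sample estimate, to a cell containing a single sampled household. Everything else is an assumption made for illustration: the weight of 100, the stylized replicate factors (1, about 1.71, and about 0.29), and the ±5% "calibrated" pattern are invented, and none of these numbers come from the 2020 file.

```python
import math


def sdr_variance(estimate, replicate_estimates):
    """Replicate variance: (4 / R) * sum((theta_r - theta)^2).

    With R = 80 this is the 4/80 factor used for ACS replicate weights.
    Note that the unweighted sample size appears nowhere in the formula.
    """
    n_reps = len(replicate_estimates)
    return (4.0 / n_reps) * sum((rep - estimate) ** 2 for rep in replicate_estimates)


def cv(estimate, replicate_estimates):
    """Coefficient of variation: standard error divided by the estimate."""
    return math.sqrt(sdr_variance(estimate, replicate_estimates)) / estimate


# Hypothetical cell with ONE sampled household, so the household-count
# estimate in each replicate is just that household's replicate weight.
WEIGHT = 100.0  # assumed full-sample household weight
N_REPS = 80

# Scenario A (assumed, stylized): replicate factors of 1, 1 + sqrt(2)/2,
# and 1 - sqrt(2)/2, with no calibration afterwards.
SHIFT = math.sqrt(2) / 2
uncalibrated = (
    [WEIGHT] * 40
    + [WEIGHT * (1 + SHIFT)] * 20
    + [WEIGHT * (1 - SHIFT)] * 20
)

# Scenario B (assumed): calibration pulls every replicate weight to within
# 5% of the same control total.
calibrated = [WEIGHT * (1.05 if r % 2 == 0 else 0.95) for r in range(N_REPS)]

# Scenario C (assumed, extreme): every replicate is pinned to the control total.
pinned = [WEIGHT] * N_REPS

cv_uncalibrated = cv(WEIGHT, uncalibrated)
cv_calibrated = cv(WEIGHT, calibrated)
cv_pinned = cv(WEIGHT, pinned)

print(f"n = 1, uncalibrated replicates: CV = {cv_uncalibrated:.2f}")  # 1.00
print(f"n = 1, calibrated within 5%:    CV = {cv_calibrated:.2f}")    # 0.10
print(f"n = 1, pinned to control total: CV = {cv_pinned:.2f}")        # 0.00
```

The sample size is 1 in all three scenarios. Only the spread of the replicate weights changes, and that alone moves the CV from 100% to 10% to 0%. One way to check whether something similar is happening in a given cell is to compare the spread of the replicate weights for the households in that cell between the 2019 and 2020 files.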
