Hi all,
Does anyone know why the variance estimates for the 2020 PUMS estimates are smaller than in previous years?
We have noticed the same. It would seem that the variances should be larger since the estimates are based on experimental data. Would also appreciate any information on this issue.
Same here.
In the technical documentation the Census Bureau reports, "Due to the variance properties of the experimental estimation methodology, the variance estimates for some PUMS estimates may be smaller than expected when compared to the equivalent variance estimates from previous years" (page 8 in Accuracy of the Experimental Data.)
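For anyone comparing years, here is a minimal sketch of how a PUMS variance is usually computed from the 80 replicate weights (PWGTP1 to PWGTP80 for persons, alongside the full-sample weight PWGTP). It assumes the 2020 experimental file uses the same successive difference replication formula as the standard PUMS files. The function names are my own, not from any Census library.

```python
import math

def replicate_variance(full_estimate, replicate_estimates):
    """Successive difference replication variance: (4/80) * sum((rep - full)^2).

    full_estimate is the estimate computed with the full-sample weight;
    replicate_estimates are the same estimate recomputed with each of the
    80 replicate weights.
    """
    if len(replicate_estimates) != 80:
        raise ValueError("expected 80 replicate estimates")
    return 4.0 / 80.0 * sum((r - full_estimate) ** 2 for r in replicate_estimates)

def margin_of_error(variance, z=1.645):
    """Margin of error at the 90 percent confidence level (z = 1.645)."""
    return z * math.sqrt(variance)
```

If the replicate estimates sit closer to the full-sample estimate in 2020 than in earlier years, this formula produces a smaller variance, whatever the reason for the tighter spread.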
Thanks Alicia. I had read that readme doc, but I was looking for a more detailed explanation. One of the Census research papers explains a bit better why the reduced variance estimates are an essential byproduct of entropy-balanced weights (see page 33): https://www.census.gov/content/dam/Census/library/working-papers/2021/acs/2021_Rothbaum_01.pdf
In addition to changing the point estimates, the entropy balance weights also affect the standard errors of the estimates. It is generally understood that increased variability among the survey weights can increase the standard errors, so weighting adjustments aimed at reducing bias are often done at the expense of increasing variance. However, Little and Vartivarian (2005) show that this may not hold true if the variable used to adjust for nonresponse is correlated with the survey variable of interest, a property they call “super-efficiency.” For example, by reweighting respondents to match moments of the administrative income distribution among all occupied housing units, the standard error for reweighted estimates of survey income will be reduced because administrative income is highly correlated with survey income responses. Prior work has found similar effects. For example, Eggleston and Westra (2020) construct administrative-data-based weights for the SIPP and find that the standard error for median earnings at the national level decreased by 35 percent, although this decrease was not statistically significant. Rothbaum and Bee (2021) find that entropy balance weights in the CPS ASEC reduce the standard error for median household income in 2020 by 50 percent. Standard errors will also narrow for other variables that are not targeted using the linked data, but which are correlated with information that is targeted. For example, we do not have linked data on the education of respondents and nonrespondents. However, if education is correlated with income, homeownership, marital status, and other variables that we can target using the linked data, then the standard error on survey estimates of education will also be reduced through reweighting. In summary, it is important to note that the change in weighting methodology for the entropy balance weights should affect the margins of error in addition to affecting point estimates.
Even though the 2020 ACS sample was smaller because of lower response rates, margins of error for 2020 ACS estimates might not be larger than prior estimates if the weighting counteracts the effects of a smaller sample size.
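To make the "super-efficiency" point in the excerpt above concrete, here is a toy simulation sketch. It is not the Census Bureau's procedure. It assumes a single standardized administrative variable, a survey variable that is strongly correlated with it, response that depends on the administrative variable, and weights of the form exp(lambda * x) chosen so the respondents' weighted mean of x matches the mean over the full sample (respondents plus nonrespondents). All names and parameter values are made up for illustration.

```python
import numpy as np

def entropy_balance_weights(x, target_mean, iters=25, tol=1e-10):
    """Weights proportional to exp(lam * x) whose weighted mean of x hits target_mean.

    lam is found by Newton's method on the single moment condition.
    """
    xc = x - target_mean
    lam = 0.0
    for _ in range(iters):
        w = np.exp(lam * xc)
        w /= w.sum()
        gap = np.sum(w * xc)                 # weighted mean of centered x, want 0
        if abs(gap) < tol:
            break
        spread = np.sum(w * xc ** 2) - gap ** 2  # derivative of gap with respect to lam
        lam -= gap / spread
    w = np.exp(lam * xc)
    return w / w.sum()

def simulate(reps=1000, n=500, seed=0):
    """Return (sd of unweighted respondent mean, sd of entropy-balanced mean)."""
    rng = np.random.default_rng(seed)
    raw, balanced = [], []
    for _ in range(reps):
        x = rng.normal(size=n)                    # "administrative income", known for everyone
        y = x + 0.3 * rng.normal(size=n)          # "survey income", observed for respondents only
        p_respond = 1.0 / (1.0 + np.exp(-(0.4 + 0.6 * x)))
        responded = rng.random(n) < p_respond
        xr, yr = x[responded], y[responded]
        w = entropy_balance_weights(xr, x.mean())
        raw.append(yr.mean())
        balanced.append(np.sum(w * yr))
    return float(np.std(raw)), float(np.std(balanced))

if __name__ == "__main__":
    sd_raw, sd_bal = simulate()
    print(f"sd of unweighted mean: {sd_raw:.4f}")
    print(f"sd of balanced mean:   {sd_bal:.4f}")
```

In this setup the balanced estimator varies less across repeated samples than the unweighted respondent mean, even though the weights are unequal, because most of the variation in y is explained by the variable being matched. If y were unrelated to x, the unequal weights would add variance instead.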
Although the paper certainly offers a better explanation, I am still mulling over whether the variances are what they ought to be, given that the administrative data used to correct for large and presumably highly variable nonresponse bias across subgroups do not cover the full distribution. For example, IRS data on the income distribution of tax filers are used, which certainly helps match a known income distribution. But because individuals with incomes below the filing thresholds are missing from those data, I tend to think the variance is smaller than it should be.
I have asked the Census Bureau about this, so stay tuned!