I need some statistical advice. Paging Stas Kolenikov!
When calculating MOEs for derived estimates -- specifically, averages -- should I use the formula for ratio, or proportion?
The two formulas shown in the ACS handbook chapter 8 (pages 64 and 65) are nearly identical, except that the proportion formula uses a minus sign under the radical whereas the ratio formula uses a plus.
I'm finding that using the proportion formula sometimes results in an error from trying to take the square root of a negative number.
Here's an example from the ACS 2022 1-year data for the nation:
B25065_E001 (aggregate gross rent) = $63,086,890,700
B25065_M001 (MOE for above) = ±$234,476,359
B25063_E002 (count of cash renters) = 42,971,061
B25063_M002 (MOE for above) = ±162,515
I'm trying to derive average gross rent as (aggregate gross rent / count of cash renters), or about $1,468.13 for the USA. Seems about right. But plugging the numbers into the proportion formula leads to madness:
MOE(P-hat) = sqrt(B25065_M001**2 - ((B25065_E001 / B25063_E002)**2 * B25063_M002**2)) / B25063_E002
MOE(P-hat) = sqrt(5.4979e+16 - (2,155,391 * 2.6411e+10)) / 42,971,061
MOE(P-hat) = sqrt(-1.95e+15) / 42,971,061
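For anyone who wants to check the sign of the quantity under the square root directly, here is a minimal Python sketch using the four published values quoted above. The variable names are illustrative only and do not come from the handbook.

```python
# Published ACS 2022 1-year values quoted in the post above.
agg_rent = 63_086_890_700      # B25065_E001, aggregate gross rent ($)
agg_rent_moe = 234_476_359     # B25065_M001
renters = 42_971_061           # B25063_E002, count of cash renters
renters_moe = 162_515          # B25063_M002

# Derived average gross rent.
avg_rent = agg_rent / renters

# Quantity under the square root in the proportion formula (minus sign).
radicand = agg_rent_moe**2 - avg_rent**2 * renters_moe**2

print(f"average gross rent: {avg_rent:,.2f}")
print(f"value under the square root: {radicand:.3e}")
print("negative" if radicand < 0 else "non-negative")
```

The last line reports whether the square root can be taken at all with the minus sign.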
I feel like I'm missing something. Is it because the source estimates and MOEs are counting different things? Is there a different formula for calculating MOEs for derived averages?
Thanks for any guidance.
My comments refer to this document
https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_general_handbook_2020_ch08.pdf
The 2 formulas are (6) and (7) in this document.
Formula (6…
Just another note: formula (6), with the - sign, applies to the case where the data are counts and the numerator is a subset of the denominator. Formula (7) applies to a ratio of any 2 things. The 2 "things…
To possibly answer my own question and ask a new one: Could I use formula (9) to do this?
Calculating Measures of Error for the Product of Two Estimates
Since dividing x / y is the same as the product x * 1/y, could this work?
EDIT: trying this out....
MOE(X-hat * 1/Y-hat) = sqrt((X-hat**2 * (1/MOE[Y-hat])**2) + ((1/Y-hat)**2 * MOE[X-hat]**2))
= sqrt((B25065_E001**2 * (1/B25063_M002)**2) + ((1/B25063_E002)**2 * B25065_M001**2))
= sqrt((63086890700**2 * (1/162515)**2) + ((1/42971061)**2 * 234476359**2))
= sqrt(2.3886483 + 29.7746023)
= $5.67
...which seems low, but MOEs should be pretty low for the nation as a whole.
Better, worse, or just completely wrong? Thanks
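One point worth checking in the attempt above is the MOE assigned to 1/Y-hat: it is not 1/MOE[Y-hat]. As a sketch, and assuming the usual first-order (delta-method) approximation for the reciprocal, which is an assumption introduced here and not something stated in this thread, substituting into the product formula gives:

```latex
\mathrm{MOE}\!\left(\tfrac{1}{\hat{Y}}\right) \approx \frac{\mathrm{MOE}(\hat{Y})}{\hat{Y}^{2}}
\quad\Longrightarrow\quad
\mathrm{MOE}\!\left(\hat{X}\cdot\tfrac{1}{\hat{Y}}\right)
\approx \sqrt{\hat{X}^{2}\,\frac{\mathrm{MOE}(\hat{Y})^{2}}{\hat{Y}^{4}}
            + \frac{\mathrm{MOE}(\hat{X})^{2}}{\hat{Y}^{2}}}
= \frac{1}{\hat{Y}}\sqrt{\mathrm{MOE}(\hat{X})^{2}
            + \left(\frac{\hat{X}}{\hat{Y}}\right)^{2}\mathrm{MOE}(\hat{Y})^{2}}
```

The final expression has the same form as the ratio formula with the plus sign, so under this approximation the product route and the ratio route should agree.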
Formula (6) has a - sign under the square root.
The caveat for that formula is that
"Users should note that if the value under the square root is negative, then substitute a “plus” for the “minus” sign under the square root in formula (6). This modified formula is the same as the formula for the MOE of a ratio, which will be discussed in the next section." This occurs when either the proportion/ratio is large or the MoE of the denominator is large compared to the numerator MoE.
Formula (7) has a + sign.
The MoE given by (6) will be lower than the MoE given by (7).
Glenn Rice the average is the ratio: you have the total (across the U.S.) of cash payments in the numerator, and you have the total (# of households that are renters) in the denominator. So the calculations for the ratio give the MOE of $7.78 (which seems a bit too tight but what do I know).
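The ratio calculation, and the handbook's fallback from the minus sign to the plus sign, can be sketched in Python as follows. The function names `moe_ratio` and `moe_proportion` are hypothetical helpers written for this illustration, not part of any Census Bureau tool.

```python
import math


def moe_ratio(num, num_moe, den, den_moe):
    """MOE of num/den with a plus sign under the square root (ratio formula)."""
    r = num / den
    return math.sqrt(num_moe**2 + r**2 * den_moe**2) / den


def moe_proportion(num, num_moe, den, den_moe):
    """MOE of num/den with a minus sign under the square root (proportion
    formula), switching to the ratio formula when the value under the
    square root is negative, as the handbook caveat describes."""
    p = num / den
    radicand = num_moe**2 - p**2 * den_moe**2
    if radicand < 0:
        return moe_ratio(num, num_moe, den, den_moe)
    return math.sqrt(radicand) / den


# ACS 2022 1-year national values quoted earlier in the thread.
agg_rent, agg_rent_moe = 63_086_890_700, 234_476_359   # B25065_E001 / _M001
renters, renters_moe = 42_971_061, 162_515             # B25063_E002 / _M002

avg_rent = agg_rent / renters
moe = moe_ratio(agg_rent, agg_rent_moe, renters, renters_moe)
print(f"average gross rent: ${avg_rent:,.2f} +/- ${moe:,.2f}")
```

With these inputs the ratio formula comes out at about $7.78, and `moe_proportion` returns the same value because the minus-sign version goes negative under the square root.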
Analysis of microdata on IPUMS (https://sda.usa.ipums.org/sdaweb/analysis/exec?formid=mnf&sdaprog=means&dataset=all_acs_samples&sec508=false&dep=rent&row=year&filters=rent%281-**%29&weightlist=hhwt&main=means&transform=none&percentileopt=none&confidence=on&cflevel=95&se=on&wncases=on&color=on&ch_type=bar&ch_color=yes&ch_width=600&ch_height=400&ch_orientation=vertical&ch_effects=use2D&decmeans=2&dectotals=0&decdiffs=1&decmedian=2&decse=1&decsd=1&decminmax=2&decwn=1&deczstats=2&csvformat=no&csvfilename=means.csv) gives this (mean of rent by year, filter rent(1-**) i.e. non-missing) -- the total N of 95M probably means it counts the individuals rather than the households. The MOE is even tighter at about $1.8.
The sampling error of $234M should give a heart attack to any reasonable economist... but fortunately economists don't know survey statistics :).
Thanks Stas!