MOEs for derived averages

Glenn Rice over 1 year ago

I need some statistical advice. Paging Stas Kolenikov!

When calculating MOEs for derived estimates -- specifically, averages -- should I use the formula for ratio, or proportion?

The two formulas shown in the ACS handbook chapter 8 (pages 64 and 65) are nearly identical, except that the proportion formula uses a minus sign under the radical whereas the ratio formula uses a plus.

I'm finding that using the proportion formula sometimes results in an error from trying to take the square root of a negative number.

Here's an example from the ACS 2022 1-year data for the nation:

B25065_E001 (aggregate gross rent) = $63,086,890,700

B25065_M001 (MOE for above) = ±$234,476,359

B25063_E002 (count of cash renters) = 42,971,061

B25063_M002 (MOE for above) = ±162,515

I'm trying to derive average gross rent as (aggregate gross rent / count of cash renters), or about $1,468.13 for the USA. Seems about right. But plugging the numbers into the proportion formula leads to madness:

MOE(P-hat) = sqrt(B25065_M001² - ((B25065_E001 / B25063_E002)² * B25063_M002²)) / B25063_E002

MOE(P-hat) = sqrt(5.50e+16 - (2,155,391 * 2.64e+10)) / 42,971,061

MOE(P-hat) = sqrt(-1.95e+15) / 42,971,061
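
A quick way to double-check this arithmetic is to recompute the term under the square root directly. The sketch below is plain Python; the variable names are made up for illustration, and the inputs are just the four table values quoted above.

```python
# Inputs quoted in the post (ACS 2022 1-year, nation)
agg_rent = 63_086_890_700      # B25065_E001, aggregate gross rent
agg_rent_moe = 234_476_359     # B25065_M001
renters = 42_971_061           # B25063_E002, count of cash renters
renters_moe = 162_515          # B25063_M002

avg_rent = agg_rent / renters

# Term under the square root in the proportion formula (minus sign)
radicand = agg_rent_moe**2 - (avg_rent**2 * renters_moe**2)

print(f"average rent: {avg_rent:,.2f}")
print(f"radicand:     {radicand:.3e}")
print("negative" if radicand < 0 else "non-negative")
```

Whatever the exact rounding, the sign is what matters: the subtracted term is larger than the squared numerator MOE, so the square root is undefined.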

I feel like I'm missing something. Is it because the source estimates and MOEs are counting different things? Is there a different formula for calculating MOEs for derived averages?

Thanks for any guidance.

Glenn Rice over 1 year ago

To possibly answer my own question and ask a new one: Could I use formula (9) to do this?

Calculating Measures of Error for the Product of Two Estimates

Since dividing x / y is the same as the product x * 1/y, could this work?

EDIT: trying this out....

MOE(X-hat * 1/Y-hat) = sqrt((X-hat**2 * (1/MOE[Y-hat])**2) + ((1/Y-hat)**2 * MOE[X-hat]**2))

= sqrt((B25065_E001**2 * (1/B25063_M002)**2) + ((1/B25063_E002)**2 * B25065_M001**2))

= sqrt((63086890700**2 * (1/162515)**2) + ((1/42971061)**2 * 234476359**2))

= sqrt(2.3886483 + 29.7746023)

= $5.67

...which seems low, but MOEs should be pretty low for the nation as a whole.

Better, worse, or just completely wrong? Thanks

David Dorer over 1 year ago in reply to Glenn Rice

My comments refer to this document

https://www.census.gov/content/dam/Census/library/publications/2020/acs/acs_general_handbook_2020_ch08.pdf

The 2 formulas are (6) and (7) in this document.

Formula (6) has a - sign under the square root.

The caveat for that formula is that

"Users should note that if the value under the square root is negative, then substitute a “plus” for the “minus” sign under the square root in formula (6). This modified formula is the same as the formula for the MOE of a ratio, which will be discussed in the next section."

This occurs when either the proportion/ratio is large or the MoE of the denominator is large compared to the numerator MoE.

Formula (7) has a + sign.

The MoE given by (6) will be lower than the MoE given by (7).
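
The handbook's caveat translates fairly directly into code. Here is a minimal Python sketch of formula (6) with the fallback to the plus sign; the function name `moe_proportion` and the returned flag are illustrative choices, not anything defined by the Census Bureau, and the second call uses made-up numbers purely to show the no-fallback path.

```python
import math

def moe_proportion(num, num_moe, den, den_moe):
    """Formula (6); if the radicand is negative, switch to the plus sign (formula (7)).

    Returns (moe, used_fallback).
    """
    p = num / den
    radicand = num_moe**2 - (p**2 * den_moe**2)
    used_fallback = radicand < 0
    if used_fallback:
        radicand = num_moe**2 + (p**2 * den_moe**2)
    return math.sqrt(radicand) / den, used_fallback

# Glenn's rent example: the minus-sign radicand is negative, so the fallback is used
moe, fell_back = moe_proportion(63_086_890_700, 234_476_359, 42_971_061, 162_515)
print(round(moe, 2), fell_back)

# Made-up counts where the numerator is a subset of the denominator
moe2, fell_back2 = moe_proportion(500, 60, 2000, 100)
print(round(moe2, 4), fell_back2)
```

Returning the flag alongside the MOE makes it easy to see afterwards which rows needed the substitution.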

Glenn Rice over 1 year ago in reply to Glenn Rice

Responding to my own second question:

JUST COMPLETELY WRONG
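
One way to see where the attempt goes wrong: formula (9) needs the MOE of 1/Y-hat, and 1/MOE[Y-hat] is not that quantity. A first-order approximation is MOE(1/Y) ≈ MOE(Y) / Y²; that approximation is an assumption brought in here, not something quoted from the handbook chapter. With that substitution the product formula should land on the same number as the ratio formula (7). A rough Python sketch with illustrative variable names:

```python
import math

x, x_moe = 63_086_890_700, 234_476_359   # aggregate gross rent and its MOE
y, y_moe = 42_971_061, 162_515           # cash renters and its MOE

# Assumed first-order approximation for the MOE of the reciprocal 1/y
inv_y = 1 / y
inv_y_moe = y_moe / y**2

# Product formula (9) applied to x * (1/y)
moe_product = math.sqrt(x**2 * inv_y_moe**2 + inv_y**2 * x_moe**2)

# Ratio formula (7) for comparison
r = x / y
moe_ratio = math.sqrt(x_moe**2 + r**2 * y_moe**2) / y

print(round(moe_product, 2), round(moe_ratio, 2))
```

Algebraically the two expressions are the same once the reciprocal's MOE is written as MOE(Y) / Y², so the product route adds nothing beyond the ratio formula.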

Stas Kolenikov over 1 year ago in reply to David Dorer

Glenn Rice, the average is a ratio: you have the total (across the U.S.) of cash payments in the numerator, and you have the total (# of households that are renters) in the denominator. So the ratio formula gives an MOE of $7.78 (which seems a bit too tight, but what do I know).
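
For anyone who wants to reproduce that figure, here is a small Python sketch of the ratio formula. The function name `moe_ratio` is hypothetical. The conversion to a standard error assumes the published MOEs are 90 percent margins (z = 1.645), as the ACS documentation describes; the 95 percent figure is only there to put the number on the same footing as a 95 percent confidence interval.

```python
import math

def moe_ratio(num, num_moe, den, den_moe):
    """MOE of num/den using the ratio formula (plus sign under the root)."""
    r = num / den
    return math.sqrt(num_moe**2 + r**2 * den_moe**2) / den

avg = 63_086_890_700 / 42_971_061
moe90 = moe_ratio(63_086_890_700, 234_476_359, 42_971_061, 162_515)

# Assumption: ACS MOEs are 90 percent margins, so SE = MOE / 1.645
se = moe90 / 1.645
moe95 = 1.96 * se

print(f"average {avg:,.2f}, 90% MOE {moe90:.2f}, SE {se:.2f}, 95% MOE {moe95:.2f}")
```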

Analysis of microdata on IPUMS (https://sda.usa.ipums.org/sdaweb/analysis/exec?formid=mnf&sdaprog=means&dataset=all_acs_samples&sec508=false&dep=rent&row=year&filters=rent%281-**%29&weightlist=hhwt&main=means&transform=none&percentileopt=none&confidence=on&cflevel=95&se=on&wncases=on&color=on&ch_type=bar&ch_color=yes&ch_width=600&ch_height=400&ch_orientation=vertical&ch_effects=use2D&decmeans=2&dectotals=0&decdiffs=1&decmedian=2&decse=1&decsd=1&decminmax=2&decwn=1&deczstats=2&csvformat=no&csvfilename=means.csv) gives this (mean of rent by year, filter rent(1-**) i.e. non-missing) -- the total N of 95M probably means it counts the individuals rather than the households. The MOE is even tighter at about $1.8.

2021: mean 1,251.89; 95% CI (1,250.08-1,253.71); SE .925; weighted N 95,572,083.0

The sampling error of $234M should give a heart attack to any reasonable economist... but fortunately economists don't know survey statistics :).

Glenn Rice over 1 year ago in reply to Stas Kolenikov

Thanks Stas!

Glenn Rice over 1 year ago in reply to David Dorer

Thanks David! I've looked at that page a hundred times and never saw that note. It would have saved me a world of trouble.

David Dorer over 1 year ago in reply to Glenn Rice

Just another note: formula (6), with the - sign, applies to the case where the data are counts and the numerator is a subset of the denominator. Formula (7) applies to a ratio of any 2 things. The 2 "things" can even come from different tables, as they do in your case.

If you use R, I have some code that computes ratios, products and linear combinations (sums of variables with a fixed coefficient for each term), taking into account the MoEs of the terms. The ratio code uses a + sign under the square root. I use formula (7), which is conservative (larger MoE) when compared to formula (6). All these formulas are approximate and are based on the variance (or the standard deviation, which is the square root of the variance); the rest comes from the "delta method" (https://en.wikipedia.org/wiki/Delta_method). The delta method (multivariate version) for x/y depends on the derivative with respect to x == (1/y) and the derivative with respect to y == -x/(y * y). With these facts about the derivatives you can kind of see where the formulas in chapter 8 come from.
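
To make that last point a little more concrete, here is a sketch of the first-order delta-method expansion for R = X/Y using those two derivatives. The covariance term and the two special cases are one reading of how the plus and minus signs can arise, not a quotation from the handbook.

```latex
\begin{align*}
R &= \frac{X}{Y}, \qquad
\frac{\partial R}{\partial X} = \frac{1}{Y}, \qquad
\frac{\partial R}{\partial Y} = -\frac{X}{Y^{2}} \\[4pt]
\operatorname{Var}(\hat R) &\approx
\frac{1}{Y^{2}}\operatorname{Var}(\hat X)
+ \frac{X^{2}}{Y^{4}}\operatorname{Var}(\hat Y)
- \frac{2X}{Y^{3}}\operatorname{Cov}(\hat X,\hat Y) \\[4pt]
&= \frac{1}{Y^{2}}\Bigl[\operatorname{Var}(\hat X)
+ R^{2}\operatorname{Var}(\hat Y)
- 2R\operatorname{Cov}(\hat X,\hat Y)\Bigr] \\[8pt]
\operatorname{Cov}(\hat X,\hat Y) = 0
&\;\Longrightarrow\;
\operatorname{Var}(\hat R) \approx \frac{1}{Y^{2}}\Bigl[\operatorname{Var}(\hat X) + R^{2}\operatorname{Var}(\hat Y)\Bigr] \\[4pt]
\operatorname{Cov}(\hat X,\hat Y) = R\operatorname{Var}(\hat Y)
&\;\Longrightarrow\;
\operatorname{Var}(\hat R) \approx \frac{1}{Y^{2}}\Bigl[\operatorname{Var}(\hat X) - R^{2}\operatorname{Var}(\hat Y)\Bigr]
\end{align*}
```

Because each MOE is the same multiple of its standard error, squared MOEs can stand in for the variances, and taking square roots gives expressions with the same shape as the plus-sign and minus-sign formulas.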