Calculating a Median From a Weighted Distribution

vishal over 3 years ago

I'm reading through the PDF "Understanding and Using American Community Survey Data - What State and Local Government Users Need to Know" and need help recreating the calculation for the median salary in Case Study #1 as shown in Figure 3.8 on page 12, where the data used from the ACS 5-year estimates is population estimates for 12-month male earnings.

The example references this PUMS documentation, which on page 17 gives the following formula for the standard error of a 50 percent proportion. From the CSV of design factors, I chose 1.3 (the corresponding row is: 2019,5-Year,Minnesota,'27',POPULATION,Person Earnings/Income, 1.3).

I am not sure I'm calculating the B variable correctly. The PUMS documentation defines it as the "Denominator of Estimated Percentage" and "the weighted total" of the frequency distribution. I calculated it as 575226: following the example in the handbook, I summed the cumulative frequencies for male earnings associated with "Rural" RUCA codes. However, the standard error I calculate is 0.3736, whereas the handbook's value is 0.599.

I have two questions:

1) Did I choose the correct design factor?

2) What is the correct way to calculate B?

In case the screenshot does not display, the formula is: SE(50 percent) = Design Factor x sqrt((95 / (5 x B)) x 50^2)

vishal over 3 years ago

I was able to resolve it with help from the wonderful ACS Data Support Team! The value of DF is 1.3 (Minnesota, Population, Person Earnings/Income), and the value of B is the sum of the population estimates for which the median is being calculated (in this case, 82448).

In the formula for SE (50 percent) in the screenshot from the handbook, one of the terms is 95/(5B), where 95/5 is the finite population correction factor (100 - f) divided by the sample fraction (f), with f = 5%. The data used in the handbook case study is from 5-year estimates. 1-year estimates sample 2.5% of the population, so the 5-year estimates represent a 5 * 2.5 = 12.5% sample. Instead of 95/5, that ratio becomes (100 - 12.5)/12.5 = 87.5/12.5. So the final formula becomes:

SE (50 percent) = 1.3 * sqrt(87.5/(12.5 * 82448) * 50^2) = 0.599

This gives the same SE value as the handbook.
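
For anyone who wants to check the arithmetic, here is a minimal Python sketch of the formula above. The function name `se_50_percent` and its parameter names are my own, not from the handbook or the PUMS documentation; the sampling percentages (5 and 12.5) and the two B values are the ones discussed in this thread.

```python
import math

def se_50_percent(design_factor, base, sample_pct):
    """Standard error of a 50 percent proportion, in percentage points.

    design_factor: DF from the design factors CSV (1.3 here)
    base:          B, the denominator of the estimated percentage
    sample_pct:    sampling fraction f as a percentage (e.g. 5 or 12.5)
    """
    # (100 - f) / f, e.g. 95/5 or 87.5/12.5
    ratio = (100.0 - sample_pct) / sample_pct
    return design_factor * math.sqrt(ratio / base * 50.0 ** 2)

# My first attempt: f = 5% and B = 575226
first_attempt = se_50_percent(1.3, 575226, 5.0)
print(round(first_attempt, 4))  # 0.3736

# Corrected inputs: f = 12.5% and B = 82448
corrected = se_50_percent(1.3, 82448, 12.5)
print(round(corrected, 3))  # 0.599
```

The only differences between the two calls are B and the sampling percentage, which accounts for the gap between 0.3736 and the handbook's 0.599.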