MOE > value

jnigrine over 1 year ago

There are a number of 0 (and small) values where the margin of error is huge compared to the value, e.g. 0 +/- 15 Black / African-American residents of Burt, MI.

Similarly, the CI can be a big proportion of the value, e.g. the 2018 5-year estimate for the female 5-9 year old population of the subdivision of Clio City in Genesee County, MI is 64 +/- 52.

I've heard ad hoc advice to treat any value whose CI is 20% or more of the value as suspect. Even if that's reasonable, what would you do with a 0 estimate? Any thoughts or suggestions would be appreciated!

David Dorer over 1 year ago

Since you use the term "CI," I assume that you mean "confidence interval" and have some understanding of the concept. Often people write "+/-" to indicate that the "underlying" population number/count falls within an interval. The "confidence interval" is defined by what would happen if you repeated the survey, drawing many samples from the population and tabulating the data again and again. For a 95% confidence interval, the interval computed from each sample would cover the underlying value in about 95 out of 100 sample "draws." An interval built to that definition need not be symmetric, as the +/- symbol implies. From the definition of a 95% confidence interval you can see that, for any estimate, there are many possible intervals that provide 95% "coverage." Thus when you give a confidence interval you need to specify what method you used to compute it.

When the MoE > estimate and the estimate is > 0, the lower value of the "+/-" confidence interval is less than 0, which is impossible for a count. Setting the lower limit to zero does not work either, because you had a positive count for that particular ACS sample year (or years, for 5-year data), so the underlying count cannot be zero. People usually think that the confidence interval indicates where the "true" value lies. The "true" value is the value that you would get if you did a census, i.e. you "surveyed" everybody.
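To make the arithmetic concrete, here is a minimal Python sketch that applies the naive "+/-" reading to the two numbers quoted in the question, plus one made-up case (10 +/- 15) where the MoE exceeds a positive estimate. The function names are mine, chosen for illustration only; nothing here is a Census Bureau method.

```python
def naive_interval(estimate, moe):
    """Return (lower, upper, lower_is_negative) for the naive estimate +/- moe."""
    lower = estimate - moe
    upper = estimate + moe
    return lower, upper, lower < 0


def relative_moe(estimate, moe):
    """MoE as a fraction of the estimate; undefined (None) for a zero estimate."""
    if estimate == 0:
        return None
    return moe / estimate


# Values quoted in the question, plus one hypothetical case (10 +/- 15).
cases = {
    "Clio City, females 5-9": (64, 52),
    "Burt, Black / African-American": (0, 15),
    "hypothetical cell": (10, 15),
}

for label, (est, moe) in cases.items():
    lower, upper, negative = naive_interval(est, moe)
    ratio = relative_moe(est, moe)
    note = "lower bound below zero" if negative else "lower bound is non-negative"
    print(label, (lower, upper), ratio, note)
```

Any rule of thumb based on the ratio of MoE to estimate has nothing to say about the zero-estimate row, which is why `relative_moe` returns `None` there rather than a number.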

In any case, these problems can all be addressed by using "replicate weights" for microdata, or the variance replicate estimate tables. These are tables that include many estimates of the underlying population value. See this page to locate the tables: https://www.census.gov/programs-surveys/acs/data/variance-tables.html. It says, "Variance Replicate Estimate tables include estimates, margins of error, and 80 variance replicates for selected American Community Survey (ACS) 5-year Detailed Tables." If you want to make a calculation from the replicate table cell values, you repeat your calculation 80 times, taking a different "replicate" each time. You then apply a formula to the 80 results to calculate the interval for the underlying value that you computed. Any "function" or formula can be used for your calculation: sum, ratio, product, etc. Use your imagination.
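The replicate procedure above can be sketched in a few lines of Python. This sketch assumes the variance formula given in the Census Bureau's variance replicate table documentation as I understand it: variance = (4/80) times the sum of squared differences between each replicate result and the full-sample result, with the MoE at the 90% level equal to 1.645 times the standard error. Please confirm both constants against the documentation on the page linked above. The function names and the replicate values in the example are invented for illustration and are not real ACS data.

```python
import math


def replicate_moe(estimate, replicates, z=1.645):
    """MoE from 80 replicate results, assuming Var = (4/80) * sum((rep - est)^2)."""
    if len(replicates) != 80:
        raise ValueError("expected exactly 80 replicate values")
    variance = (4.0 / 80.0) * sum((r - estimate) ** 2 for r in replicates)
    return z * math.sqrt(variance)


def derived_with_moe(func, estimates, replicate_columns, z=1.645):
    """Apply func to the full-sample cells and to each of the 80 replicate sets.

    estimates: full-sample values, one per table cell used by func.
    replicate_columns: one list of 80 replicate values per table cell.
    """
    point = func(*estimates)
    # Repeat the same calculation once per replicate, as described above.
    replicate_results = [func(*cells) for cells in zip(*replicate_columns)]
    return point, replicate_moe(point, replicate_results, z)


# Made-up replicate values for two cells (NOT real ACS data).
cell_a = [64 + ((i % 5) - 2) * 3 for i in range(80)]
cell_b = [300 + ((i % 7) - 3) * 5 for i in range(80)]

# A ratio of the two cells; func must cope with its own edge cases
# (for example, a zero denominator in some replicate).
share, share_moe = derived_with_moe(lambda a, b: a / b, (64, 300), [cell_a, cell_b])
print(share, share_moe)
```

Because `func` is just a Python callable, the same helper covers sums, differences, ratios, or anything else you can compute from the cells.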

The census people "suppress" table cells so that people don't misinterpret the value in the published table. This drives statisticians / mathematicians crazy because there is always some information in the number that you get when you tabulate a survey. You just need to know how to interpret the result that you get. There is also a problem with a zero result when you take a survey. Are there no people in the underlying population in that table cell (a "structural" zero), or are you just getting a zero value for that sample? This is another topic worthy of discussion.

The way that I handle these issues in a presentation or report is to report the number that I get and put in a footnote to explain the situation. Also, when I report an estimate or confidence interval, I put in at least 1 or 2 digits beyond the decimal point to indicate that the number is an estimate and not an actual census ("count everybody") count.

Hope this helps

Dave