Advice on use of Blockgroup Data

We have a user who insists on using block group ACS data to conduct some exploratory mapping and analysis of hyperlocal conditions. I have recommended against this, suggesting the use of census tracts instead. My understanding, and opinion, is that block group data is useful insofar as block groups combine to form larger geographies. Individual block groups often have such relatively large margins of error that the data is insufficiently "stable" to be used reliably to draw conclusions and develop policy. (For perspective, the area under discussion includes 32 tracts and about 80 block groups.)
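For what it's worth, the "stability" concern can be made concrete with the coefficient of variation: ACS MOEs are published at the 90% confidence level, so SE = MOE/1.645, and a common rule of thumb (a heuristic, not official Census guidance) flags estimates with a CV above roughly 30% as unreliable. A minimal sketch, with made-up block-group and tract figures:

```python
# Rule-of-thumb reliability screen for ACS estimates.
# ACS MOEs are published at the 90% confidence level, so SE = MOE / 1.645.
# The 30% CV threshold and the sample figures are illustrative, not official.

def coefficient_of_variation(estimate, moe_90):
    """CV as a percentage; None when the estimate is zero."""
    if estimate == 0:
        return None
    return 100.0 * (moe_90 / 1.645) / estimate

# Hypothetical block-group vs. tract figures for the same variable:
for label, est, moe in [("block group", 120, 95), ("tract", 1450, 210)]:
    cv = coefficient_of_variation(est, moe)
    flag = "unreliable" if cv > 30 else "usable"
    print(f"{label}: CV = {cv:.1f}% ({flag})")
```

The tract-level figure passes the screen while the block-group figure fails it, which is the aggregation argument in numeric form.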

Is there any "official" guidance I can point toward to bolster my case that tracts are the better way to go?


  • Cliff, you are 100% correct. That's why I've always thought that ACS block group data should not be online at all. I don't know about "official" guidance, but tell your user just to look at the MOEs! That should frighten him/her off. Happy New Year.
  • In reply to Patty Becker:

    Dear All:

    Ah, the discussion of so-called MOEs from the Census. As some of you know, the MOEs are computed using a normal approximation. However, the Census also provides a series of tables using balanced repeated replicates, which compute the MOEs correctly; those MOEs are generally smaller, sometimes much smaller. Part of this has to do with the issue of treating each ACS as discrete (or the combined years of data as discrete), and some of it has to do with the fact that the Census actually created MOEs for the ACS but never did so for the Long Form.

    In 2011, I wrote a memo to the Census, and they responded that I had raised important issues that they were trying to address in a production environment. However, the issue persists in both the 5-year and 1-year ACS: they still report negative numbers in their margins of error.

    Standard errors for very small or very large proportions actually shrink, as do those for medians of skewed distributions. Unfortunately, I have run into court settings where people consider the ACS precision to be much worse than it actually is.

    Here is a Dropbox link to correspondence I had with the Bureau on this back in 2011.

    My own advice would be to map the variables of interest; if the data seem reasonably consistent, they probably are. This is in effect a "poor man's" version of a Bayesian approach.

    Margins of error are nowhere near as precise as people think; their accuracy depends upon the underlying assumptions used to generate them.

  • In reply to Andrew Beveridge:

    No offense, but this explanation was not very clear. Could you rephrase and provide a bit more detail?

    Thank You
  • In reply to David Nelson:

    They report margins of error that imply negative numbers of people; there are numerous examples in the correspondence I cited. For instance, for block groups in NYC that have no Hispanic Native Americans, they report 0 plus or minus 132. There are many examples. They say to ignore the negatives, but the negatives exist because of miscomputed MOEs. MOEs depend on a series of assumptions the Census makes regarding distributional shape.
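    The absurdity is easy to reproduce: a symmetric interval around a zero count dips below zero, while an exact one-sided binomial (Clopper-Pearson-style) bound cannot. The n below is a hypothetical effective sample size, not taken from the actual NYC tables:

```python
# A zero estimate published with a symmetric 90% MOE of 132 "people"
# implies the impossible interval [-132, 132].
estimate, moe = 0, 132
normal_interval = (estimate - moe, estimate + moe)
print("normal-style interval:", normal_interval)

# Exact one-sided binomial (Clopper-Pearson) upper bound for zero successes:
# solve (1 - p_upper) ** n = alpha, i.e. p_upper = 1 - alpha ** (1 / n).
# n is a hypothetical effective sample size for the block group.
n, alpha = 400, 0.10
p_upper = 1 - alpha ** (1 / n)
print(f"exact 90% interval: 0 to {p_upper * n:.1f} people")
# The lower bound is exactly 0: the interval is asymmetric and never negative.
```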

  • In reply to Andrew Beveridge:

    I'm always getting help to use ACS data at the block group level, so please tell me what's wrong with what I do. I have to do certified industrial valuations, defending my approaches and data to clients and sometimes in federal court. One result is that I never rely on just point estimates, even if they're called "medians." One approach I use is Monte Carlo simulation, which helps me avoid subjects like means, medians, modes, and any "weird" or so-called "undesirable" MOEs. To me, data is what it is, except I would of course delete negative MOEs, so thanks for that heads-up. I simply use Monte Carlo software to sample each parameter by block group in my modeling 10,000 times (or more or less) and of course get results as frequency distributions. Rarely do I find "point estimates" in data or results of much value. Ranges and confidence intervals in the results are as important to me as those for the data. I also change the associated distribution if the data warrants it. (The software lets me set the median or mean, the type of distribution, and standard deviations, all truncated at zero, eliminating any chance of negative values.) I'd appreciate some criticism. Thanks.
    mitch albert
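    The workflow Mitch describes can be sketched with the standard library alone: draw each parameter from a normal centered on the estimate with SE = MOE/1.645, truncate at zero by rejection, and summarize the draws as a frequency distribution. All numbers here are hypothetical; this illustrates the approach, not his actual software:

```python
import random
import statistics

random.seed(42)  # reproducible sketch

def draw_truncated_normal(mean, se, n_draws):
    """Sample normal(mean, se) truncated at zero via simple rejection."""
    draws = []
    while len(draws) < n_draws:
        x = random.gauss(mean, se)
        if x >= 0:  # discard impossible negative values
            draws.append(x)
    return draws

# Hypothetical block-group estimate and published 90% MOE:
estimate, moe = 150, 120
sims = draw_truncated_normal(estimate, moe / 1.645, 10_000)

cuts = statistics.quantiles(sims, n=20)  # 5th, 10th, ..., 95th percentiles
print(f"median of draws: {statistics.median(sims):.0f}")
print(f"central 90% of draws: {cuts[0]:.0f} to {cuts[-1]:.0f}")
```

Note that truncation pulls the center of the draws above the published estimate when the MOE is large relative to it, which is one way to see how much information a shaky block-group figure actually carries.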
  • In reply to MitchCats:

    Mitch, you are the perfect argument for the utility of the block group data. My only comment would be that you need to realize that the Census's computation of MOEs (which are based on standard deviations) can lead to absurdity. It is not just that they go negative, but that they go negative when the sample count is positive. This of course is impossible, since if you find any cases in a given category, then zero is impossible. The problem is that they use some sort of normal approximation to compute the margins of error, but given modern software one can use other techniques. Simply put, when you get near zero, you do not have a symmetrical MOE. The precise way to deal with this is to use a binomial. SAS and other software (SUDAAN) use random replicates and/or Lorenz transformations. The Census chose not to do this, and it has led to much confusion, and in many cases an overestimate of the actual error. I have had to defend their data in court settings going against their "guidance."

    For example, income is a skewed variable. However, their MOE for the median is derived from the MOE for the mean with a multiplier (I think 1.28). If you use resampling statistics, it is well known that the median MOE approaches zero.
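    The resampling point can be illustrated directly: bootstrap the median of a skewed sample and compare its spread to a mean-based MOE scaled by 1.28, as described above. The data are simulated lognormal "incomes," so the numbers only illustrate the shape of the argument, not actual ACS figures:

```python
import random
import statistics

random.seed(7)

# Simulated skewed "income" sample (lognormal, so mean > median).
sample = [random.lognormvariate(10.8, 0.8) for _ in range(500)]

# Normal-approximation MOE for the MEAN, then the 1.28-style scaling
# described above to get a median MOE.
se_mean = statistics.stdev(sample) / len(sample) ** 0.5
scaled_median_moe = 1.28 * 1.645 * se_mean

# Bootstrap the MEDIAN directly: resample with replacement many times.
boot_medians = [
    statistics.median(random.choices(sample, k=len(sample)))
    for _ in range(2000)
]
cuts = statistics.quantiles(boot_medians, n=20)
boot_half_width = (cuts[-1] - cuts[0]) / 2  # half of the central 90% range

print(f"mean-based, scaled median MOE: {scaled_median_moe:,.0f}")
print(f"bootstrap median half-width:   {boot_half_width:,.0f}")
```

Because the heavy right tail inflates the standard error of the mean but barely moves the median, the bootstrap spread for the median typically comes out well below the mean-based figure.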
  • In reply to Andrew Beveridge:

    Thanks, Andrew, for the other MOE tips. I enjoy your candor.
    I'll have to ask my "mentor" about medians drawn from a mean's MOE. I guess it's all algebra, solving for what you want. Never thought of that. I appreciate the insights, as I'm often on the "simple" side of things. I try not to correct the experts, but my certification requires me to address real or even perceived issues.
    My Monte Carlo software lets me use any distribution, by parameters or shape, and also truncate at zero. I appreciate again the heads-up on the issue with always using/assuming normal distributions and "absurdities" due to "hidden" methodology.
    One of my "defenses" is the "giggle test": if it looks too "dumb" and I laugh, I change it so that I, at least, don't laugh. Now you've offered some peer-oriented intellectual reasoning for me to use, which I'm always looking for. I'm happy to take the initiative to override "data" normal distributions and apply binomial, or positively or negatively skewed lognormal, etc., distributions. I always look at these modeling exercises as though I'm holding six 12' fly rods and trying to get their ends to meet, which I can never do, but the process gives me practical, usable input/output insights.
    regards, mitch, phd economics