I'm using ACS 2015-2019 five-year estimates for census block groups in LA County. I notice that some census block groups have zero occupied housing units but a positive population. How is that possible? If there are zero occupied housing units, where does this population live?
The following census block groups have large populations but zero occupied housing units: 060372653011 (population: 11977), 060379202001 (population: 5393), 060379010031 (population: 4895), 060375746011 (population: 2276). I saw on the map that 060372653011 basically covers UCLA. I cannot tell where the other census block groups are, but some of them seem to be in canyon/mountain areas.
Does anyone have any idea why census block groups with zero occupied housing units could have a positive population? Thank you.
Group quarters? Like prisons, retirement and nursing homes, etc
and dorms --UCLA :)
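To make the group quarters point concrete: the household population lives in occupied housing units, and everyone else is counted in group quarters, so a block group with zero occupied units and a positive population is made up entirely of group quarters residents. Here is a minimal Python sketch of that check. The four GEOIDs and populations are the ones from the question; the last row and all field names are hypothetical, added only for contrast.

```python
# Block groups from the question: positive population, zero occupied housing units.
# The final row is a made-up ordinary block group, included only for contrast.
block_groups = [
    {"geoid": "060372653011", "population": 11977, "occupied_units": 0},
    {"geoid": "060379202001", "population": 5393, "occupied_units": 0},
    {"geoid": "060379010031", "population": 4895, "occupied_units": 0},
    {"geoid": "060375746011", "population": 2276, "occupied_units": 0},
    {"geoid": "hypothetical_bg", "population": 1200, "occupied_units": 450},
]


def all_group_quarters(rows):
    """Return GEOIDs whose residents must all live in group quarters.

    With no occupied housing units there is no household population,
    so any positive population has to be in group quarters.
    """
    return [
        row["geoid"]
        for row in rows
        if row["occupied_units"] == 0 and row["population"] > 0
    ]


flagged = all_group_quarters(block_groups)
print(flagged)
```

Running the same filter over a full block group extract would give a list of GEOIDs to look up on a map for dorms, prisons, nursing homes, and the like.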
Ah, I did not see the bit about UCLA. In our college town, the dorms were closed on April 1 due to COVID.
I know that many of us were worried about the enumeration of college/university students, so I was happy to see that it seems pretty accurate in the Twin Cities. We do our own survey of college/university housing as part of our population estimates program, and 2020 Census counts for most schools were pretty close to what we gathered directly from college/university staff. (Like the Census Bureau, we requested what the approximate populations would have been in the absence of the pandemic.)
Off-campus housing is less clear, but the population counts for most block groups around colleges/universities seem reasonable.
The Census Bureau did put a lot of effort into getting the most accurate college population data possible under very challenging circumstances! That said, it's worth noting that while group quarters counts seem quite reliable in many communities, there are concerns in some communities in California. I found some discrepancies in the communities that I study, and when discussing them with state demographers I learned that there may be systematic undercounts at CSU and UC schools. So perhaps the best way to think about GQ is that 2020 Census data quality may vary from place to place.
They have set up a separate GQ Count Review for the 2020 Census. Apparently some of the GQs were placed in the wrong nearby blocks. One of the problems, however, is that differential privacy was applied to the block groups, so in New York State, after the prisoner relocations, some group quarters blocks ended up with negative populations, presumably because the original counts had already been distorted when the prison population was relocated. According to Applied Geographic Solutions, there are many problems with the 2020 Census:
Here is a brief excerpt with a web link for more:
by Gary Menger | Oct 7, 2021
A tremendous effort in the analytics world is devoted to the task of data preparation and cleansing. Data exchanges refer to ‘curated’ data, which suggests that the suppliers of that data have gone to the trouble of estimating missing fields, reining in outliers, and harmonizing the data with other known and trusted data sources at various geographic aggregations. Users of that curated data then rely on the dataset for their models, often automated, for making informed business decisions. If the goal of analytics is to reduce uncertainty in business decisions, the minimization of error must be a priority at all stages of the effort – since error in source data is not only propagated but magnified.
Consider it equivalent to having a termite infestation in your house. Superficially, everything looks perfectly fine, as the frame of the house is covered by layers of drywall, paint, siding and roofing materials. Sooner or later though, the foundational rot erupts at the surface – sagging floors and cracking drywall – but by then, the damage done is substantial. Structural rehabilitation of the bones of a house is an expensive and time-consuming effort. Tenting the house early in the process is in comparison extremely cheap, despite the neighbors mocking about the circus coming to town.
The notion that error would be deliberately induced into a foundational dataset is close to a moral issue. Who would do such a dastardly deed?
What is the Issue?
In years past, the census has – as required by law – made substantial efforts at protecting the privacy of individuals. As the genealogy world well knows, the physical records which have names and addresses, are sealed for decades. When the census included both the short form and the long form, the sensitive personal data found in the long form was reasonably well protected — it was based on a sample and techniques were employed to “borrow” characteristics between similar, nearby census blocks. With the demise of the long form – replaced by the American Community Survey (ACS) – the census consists of only completely (obviously with some error) enumerated geographic areas. As a result, the data for small areas can be used in conjunction with other databases (mailing lists, property records, etc.) to potentially identify individuals within them.
We have recently been talking about the census concept of a “privacy budget” and its potential effects on the 2020 data releases. Detailed discussions of those issues can be found on the AGS blog —
The unpleasant conclusion is that the data has been seriously corrupted, so much so that a significant number of census block groups have statistically impossible data, among them –
For every identified impossibility, there lurks underneath it at least ten improbabilities, and this is just the baseline numbers. The real meat of the 2020 census is found in the detailed tables which address key population characteristics (age, sex, race, Hispanic origin, ancestry) and household characteristics (household size and structure).