New Connecticut counties, er... regions

This regards those new Connecticut "county equivalents" or "planning regions."
My apologies if this issue has already been covered and I overlooked it.
www.federalregister.gov/.../change-to-county-equivalents-in-the-state-of-connecticut

Some users have helpfully pointed out how data for the new areas can be aggregated from lower-level geographies, such as county subdivisions.
That is fine, if one has the data at those more detailed geographic levels.

But what about the wealth of useful data published at the county level (such as health indicators)? How might we convert new data for the "county equivalents" back to the previous CT county structure so it can be used with pre-existing data for analysis, comparison, reporting, trends, etc.? We don't want to leave a big hole for the state of Connecticut until all agencies and companies switch over to the new counties in a year or two.
Or, when we switch over to the new CT regions to include in our list of all US counties, how should we convert data for the earlier CT counties to their new CT equivalent geographies?

The only workable solution I can imagine to convert new CT counties to previous counties (and vice versa) is to create an allocation table which shows the percent of each existing CT county in the new regions/equivalents.  The value for allocation would be either area or population.
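
As a minimal sketch of what such an allocation table might look like, the Python below builds one from town-level populations. It assumes each town (county subdivision) falls wholly within one old county and one new region; the town, county, and region names and all population figures are invented for illustration, and `build_allocation_table` is a hypothetical helper, not part of any Census tool.

```python
from collections import defaultdict

# Hypothetical town-level records: (town, old_county, new_region, population).
# Every name and number here is invented for illustration only.
TOWNS = [
    ("Town A", "Old County 1", "Region X", 60_000),
    ("Town B", "Old County 1", "Region X", 20_000),
    ("Town C", "Old County 1", "Region Y", 20_000),
    ("Town D", "Old County 2", "Region Y", 50_000),
]


def build_allocation_table(towns):
    """Return {(old_county, new_region): share of the old county's population}."""
    overlap = defaultdict(float)       # population in each county/region overlap
    county_total = defaultdict(float)  # total population of each old county
    for _town, old, new, pop in towns:
        overlap[(old, new)] += pop
        county_total[old] += pop
    return {
        (old, new): pop / county_total[old]
        for (old, new), pop in overlap.items()
    }


ALLOCATION = build_allocation_table(TOWNS)
for (old, new), share in sorted(ALLOCATION.items()):
    print(f"{old} -> {new}: {share:.0%}")
```

Swapping the population column for land area or housing units would give area-based or housing-based shares from the same function; a table for the reverse direction (new region to old county) would divide by region totals instead.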

Anyone have a better idea?  Would this approach give "good enough" estimates of the new counties?

Thanks,
Bert Sperling
bestplaces.net

  • Sorry to be the bearer of bad news, but in short, there is no general solution that produces reliably "good enough" estimates for all use cases.

    The first thing to try is what you noted: if you can get the needed pre-2022 data for county subdivisions, then aggregate that to the 2022 county equivalents. (One caveat here: this works smoothly only if your data are counts. It's not so easy to aggregate medians or quotients, etc.)

    A second good option is to allocate in the other direction, from post-2022 county subdivisions to pre-2022 counties.

    Note that for either of the above strategies, I think the most useful resource is the county-to-county-subdivision crosswalk available through the Bureau's Substantial Changes to Counties and County Equivalent Entities page (which is, unfortunately, not linked to anywhere on the Bureau's 2022 Relationship Files page... not yet anyway).

    If all else fails, then yes, you could create a crosswalk that goes directly from pre-2022 counties to the new county equivalents, or vice versa. But I'd do this only as a last resort. It will always be better if you can allocate from lower-level units: block groups or census tracts would also work well, or even ZIP Code Tabulation Areas, which don't nest within counties, but could still be used to model distributions of population characteristics within counties.

    If you _do_ use a direct county-to-county-equivalent crosswalk, avoid weighting by areas of intersection if possible. Population weights are significantly better than area weights in most cases, but even then, population weights assume that population characteristics are uniform within each county, i.e., perfectly integrated, demographically and socioeconomically, with the same % minority, % children, % in poverty, etc., in every sub-part of the county. As you can imagine, this assumption can be very wrong!

    Population weights also don't work well with housing data; if household sizes vary a lot and/or there are many vacant housing units or large group quarters populations, then housing and population distributions can differ substantially, so for housing data, I'd recommend using housing weights if possible.

    Despite these limitations, a county-to-county-equivalent allocation could still be "good enough" in some settings, but it'll always be up to you to decide; it can't generally be guaranteed.
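
To make the last-resort approach concrete, here is a rough Python sketch of a population-weighted allocation from old counties to new regions. It is only an illustration under the uniformity assumption described above: the weight table, county and region names, and all counts are invented, and `allocate_counts` is a hypothetical helper. It also shows one way to handle a rate, by allocating its numerator and denominator as counts and recomputing, rather than allocating the rate itself.

```python
from collections import defaultdict

# Hypothetical population weights: the share of each old county's population
# that falls in each new region. All values are invented for illustration.
POP_WEIGHTS = {
    ("Old County 1", "Region X"): 0.8,
    ("Old County 1", "Region Y"): 0.2,
    ("Old County 2", "Region Y"): 1.0,
}


def allocate_counts(county_counts, weights):
    """Distribute old-county counts to new regions in proportion to weights."""
    out = defaultdict(float)
    for (old, new), share in weights.items():
        out[new] += county_counts.get(old, 0) * share
    return dict(out)


# Invented county-level counts (a numerator and its denominator).
persons_in_poverty = {"Old County 1": 10_000, "Old County 2": 10_000}
population = {"Old County 1": 100_000, "Old County 2": 50_000}

# Counts can be allocated directly.
poverty_by_region = allocate_counts(persons_in_poverty, POP_WEIGHTS)
population_by_region = allocate_counts(population, POP_WEIGHTS)

# A rate is rebuilt from its allocated numerator and denominator.
poverty_rate_by_region = {
    region: poverty_by_region[region] / population_by_region[region]
    for region in population_by_region
}

for region in sorted(poverty_rate_by_region):
    print(f"{region}: {poverty_rate_by_region[region]:.1%}")
```

For housing counts, the same function could be called with a weight table built from housing units instead of population. Medians cannot be rebuilt this way, since they have no numerator and denominator to allocate.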
