Building 118th Congress Congressional Districts from Census Tracts

I am using the ACS to produce some demographic statistics by Congressional District (CD), based on the districts drawn for the 118th Congress. The ACS data tool (data.census.gov) is still using boundaries from the 116th Congress. The smallest geography that I can extract from the ACS data tables is the census tract, so I am attempting to sum up census tracts by congressional district.

To do this, I used the GeoCorr tool at the Missouri Census Data Center (https://mcdc.missouri.edu/) to generate a crosswalk from census tracts to CDs. This gives me factors to apply when a census tract crosses multiple CDs. But that gives me multiple records for such census tracts, causing a many-to-many merge situation when I try to merge this crosswalk with the ACS tract-level data.

First, is this the best way to go about constructing 118th Congress district-based estimates from the ACS?

Second, how should I resolve the many-to-many merge?

Thank you!

Top Replies

  • Hi - the Congressional District Health Dashboard calculates various estimates for the 118th Congress by aggregating tract or county estimates to the CD level. You can read about our methods in more detail in our technical document (p. 12). With the GeoCorr crosswalk and factors (assuming they're population-based?), you're on the right track. Does your ACS dataset have multiple ACS estimates in long format? I'm guessing that's what's producing the many-to-many merge: each tract appears once per variable in the ACS data and once per CD in the crosswalk. I don't necessarily think that's an issue, as long as you're grouping by the correct variables for the final sum. For a cleaner approach, you could also separate the ACS variables so that each one is a 1:many left join with the GeoCorr crosswalk. Feel free to email info@CDhealthdashboard.org to discuss in more depth.
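
The weighted aggregation described above can be sketched in pandas. This is only an illustration under stated assumptions, not the Dashboard's actual code: the tract IDs, district codes, and the column names `tract`, `cd118`, `variable`, and `estimate` are made up, and `afact` stands in for whatever allocation-factor column your GeoCorr export contains. It assumes the estimates are counts and that the factors for each tract sum to 1.

```python
import pandas as pd

# Hypothetical tract-level ACS counts in long format:
# one row per tract per variable.
acs = pd.DataFrame({
    "tract": ["01001020100", "01001020100", "01001020200", "01001020200"],
    "variable": ["total_pop", "poverty_pop", "total_pop", "poverty_pop"],
    "estimate": [1000.0, 200.0, 500.0, 50.0],
})

# Hypothetical crosswalk: one row per tract-CD piece.
# afact = assumed share of the tract allocated to that CD.
xwalk = pd.DataFrame({
    "tract": ["01001020100", "01001020100", "01001020200"],
    "cd118": ["0101", "0102", "0102"],
    "afact": [0.6, 0.4, 1.0],
})

# Sanity check: the factors for each tract should sum to 1,
# otherwise population is lost or double counted.
factor_sums = xwalk.groupby("tract")["afact"].sum()
assert ((factor_sums - 1.0).abs() < 1e-6).all()

# The merge is many-to-many on tract (several variables per tract,
# several CDs per tract). That is fine here because every
# variable-by-piece row is wanted before summing.
merged = acs.merge(xwalk, on="tract", how="left")

# Allocate each tract estimate to its CD pieces, then sum by CD
# and variable.
merged["weighted"] = merged["estimate"] * merged["afact"]
cd_estimates = (
    merged.groupby(["cd118", "variable"], as_index=False)["weighted"]
    .sum()
    .rename(columns={"weighted": "estimate"})
)
print(cd_estimates)
```

Grouping by both the district and the variable is what keeps the many-to-many merge from mixing estimates together. This kind of weighting only makes sense for counts; rates and percentages would need to be recomputed from the aggregated numerators and denominators.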