5-Yr PUMA Geography

Hello, I am accessing the 5-Yr estimates for 2022 but am not seeing PUMAs as an option under "Geography." Any ideas how to get this?

  • Yes, in MDAT, for the 2022 5-year PUMS, you'll find PUMA information through two _variables_, not geographies. The PUMA10 variable identifies 2010 PUMA codes for respondents from 2018 through 2021. PUMA20 identifies 2020 PUMA codes for respondents from 2022. This two-variable system will most likely continue for 5-year PUMS until the 2026 release when, once again, the 5-year PUMS will use only one set of PUMA definitions for the entire 5-year period. (MDAT uses the same setup for the 2012 through 2015 5-year PUMS releases, which also used two sets of PUMA definitions.)

    For IPUMS USA, we're working on providing PUMA codes for the 2022 5-year sample through a single "PUMA" variable (as we already do for the 2012 through 2015 5-year samples). We aim to release that update sometime in the next couple of weeks. We have several other resources related to PUMAs and PUMA changes through our Geographic Tools & Resources page.

  • Thank you, Jonathan. I found the variable, but there is no detail with it, so there is no telling which PUMA each row belongs to.

  • As Jonathan indicated, the 2018-2022 PUMS data is a "mixed geography" file. It has two variables that don't exist in the 2017-2021 file, PUMA10 and PUMA20. The file is 4/5ths PUMA10 records and 1/5th PUMA20 records. This makes the 2022 5-year file pretty useless; you are better off using the 2022 1-year PUMS data. For the 2018-2022 API, you can't use the "&for=public use microdata area:PUMAFIPS" construction. You need to use "&for=state:STATEFIPS" and then subset the records on the PUMA10 or PUMA20 code.
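A minimal sketch of the state-plus-filter construction described above, in Python. It only builds the query string and makes no network call. The endpoint path and the example variables (PWGTP, AGEP) are assumptions for illustration, and `build_pums_url` is a hypothetical helper, not part of any Census library; check the result against the Census API documentation before relying on it.

```python
from urllib.parse import urlencode

# Assumed endpoint for the 2018-2022 (2022 5-year) ACS PUMS microdata API.
BASE_URL = "https://api.census.gov/data/2022/acs/acs5/pums"


def build_pums_url(variables, state_fips, puma_var=None, puma_code=None):
    """Build a query that selects one state and optionally filters on PUMA10 or PUMA20."""
    if puma_var not in (None, "PUMA10", "PUMA20"):
        raise ValueError("puma_var must be 'PUMA10', 'PUMA20', or None")
    params = [
        ("get", ",".join(variables)),
        ("for", f"state:{state_fips}"),  # state geography instead of a PUMA geography
    ]
    if puma_var and puma_code:
        # Filtering on the variable is what replaces the old PUMA geography clause.
        params.append((puma_var, puma_code))
    # Keep ':' and ',' readable instead of percent-encoding them.
    return BASE_URL + "?" + urlencode(params, safe=":,")


url = build_pums_url(["PWGTP", "AGEP"], "25", "PUMA20", "00902")
print(url)
```

Including the state in every call keeps the housing and person queries restricted to the same area, since PUMA codes repeat across states.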

    From the people at ACSO (American Community Survey Operations) :

    Hi David, 
    I think what you are running into is the dual-PUMA issue. 
    This is what I  found in the 2022 PUMS 5-year User Guide:  The current PUMA boundaries are based on Census 2020 definitions, while records from 2021 and earlier use boundaries based on Census 2010 definitions. Therefore, multi-year files for 2022 will contain PUMA codes created from both Census 2010 and Census 2020. PUMA codes defined using Census 2010 are called PUMA10, while the newer PUMA codes defined from Census 2020 are called PUMA20. Each record on the PUMS files will contain either the PUMA10 or PUMA20 code, based on which year the record’s data were collected. Due to disclosure concerns, it is not possible to update the PUMA codes for the records from 2021 and earlier to 2020-based PUMAs by using their detailed geographic locations. Data users will need to crosswalk their data to obtain a single PUMA geography using other means, such as using allocation rates using GEOCORR.  
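As a rough illustration of what crosswalking with allocation rates means, the Python sketch below spreads weighted totals for 2010-based PUMAs across 2020-based PUMAs. The PUMA codes and allocation factors are invented placeholders, not real GEOCORR output, and `reallocate_to_puma20` is a hypothetical helper name.

```python
from collections import defaultdict


def reallocate_to_puma20(puma10_totals, allocation):
    """Spread weighted totals for 2010 PUMAs across 2020 PUMAs.

    allocation[puma10] maps each overlapping 2020 PUMA to the share of the
    2010 PUMA assigned to it; the shares for one 2010 PUMA should sum to 1.
    """
    totals = defaultdict(float)
    for puma10, total in puma10_totals.items():
        shares = allocation[puma10]
        if abs(sum(shares.values()) - 1.0) > 1e-6:
            raise ValueError(f"allocation factors for {puma10} do not sum to 1")
        for puma20, share in shares.items():
            totals[puma20] += total * share
    return dict(totals)


# Placeholder values for illustration only.
puma10_totals = {"00501": 1000.0, "00502": 400.0}
allocation = {
    "00501": {"00901": 0.75, "00902": 0.25},
    "00502": {"00902": 1.0},
}
print(reallocate_to_puma20(puma10_totals, allocation))
```

The same idea can be applied record by record, by multiplying each person or housing weight by the allocation factor, when tabulations rather than totals are needed.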

    I have reached out to the PUMS subject matter experts and have received the following responses:

    There's an error in the first API URL you provided, for the housing file call.  It's retrieving records where PUMA20=00902 regardless of state, while the person API call is restricting to PUMA20=00902 and state=25.

    Dual PUMAs were used for the DY22 5-year PUMS, as PUMA20 is only on the 2022 records, while PUMA10 is on the 2018 through 2021 records.  The information available is on p.12 of the ACS 5-year PUMS User Guide: https://www.census.gov/programs-surveys/acs/microdata/documentation.html

    You may benefit from using the PUMA10 and PUMA20 variables to narrow down the search with the universe of 00902 below, or you could use the state link and do the same thing. 
    This might be helpful if you want to only be in that PUMA area for the Geography:


    I have attached the PUMS Data Dictionary for you as well. 


    I hope this helps.  Let me know if you have any other questions. 

    Vicki

  • Thank you, David. I need to use the 5-yr estimates because I want to go down to the PUMA level (I actually want county data), but this does not seem to be an option with this file. As I said, I used the PUMA20 and PUMA10 variables, but they are pretty useless the way I'm using them: the PUMA value does not show in the table.

    Any ideas of how to accurately get to the county level for 2022 data?

    Thank you,

    Lorna

  • Dear Lorna,

    If you look at my post about a Small Area Estimation (SAE) program that I wrote, you can use the program to get county data. The problem is that a county may contain several PUMAs. This situation is pretty easy to handle: you can just "add up" the tables for the relevant PUMAs. For PUMAs that cross county lines, and so may contain parts of several counties, you are stuck. I wrote the SAE program to handle this situation. I start with PUMS data for all the relevant PUMAs (the large area) and then create PUMS-like tract-level data. Next I "stack" the tracts for the county that I want. There are many potential issues with this approach, but it seems reasonable. You can then create any county table that you want using the synthetic data.
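For the easy case, where the PUMAs nest entirely inside one county, "adding up" can be sketched in Python as below. This is not the SAE program. The records are made-up placeholders, the key "PUMA" assumes a file with a single PUMA vintage (such as a 1-year file), and POVPIP is assumed to be the income-to-poverty ratio variable.

```python
def county_total(records, county_pumas, predicate=lambda r: True):
    """Sum person weights for records in the county's PUMAs that satisfy predicate."""
    return sum(
        r["PWGTP"]
        for r in records
        if r["PUMA"] in county_pumas and predicate(r)
    )


# Placeholder records; real PUMS files have one row per sampled person.
records = [
    {"PUMA": "10501", "PWGTP": 20, "POVPIP": 150},
    {"PUMA": "10502", "PWGTP": 35, "POVPIP": 320},
    {"PUMA": "10502", "PWGTP": 15, "POVPIP": 90},
    {"PUMA": "10600", "PWGTP": 50, "POVPIP": 100},  # belongs to a different county
]
county_pumas = {"10501", "10502"}  # hypothetical PUMAs that make up one county

everyone = county_total(records, county_pumas)
below_200_fpl = county_total(records, county_pumas, lambda r: r["POVPIP"] < 200)
print(everyone, below_200_fpl)
```

This only works when every PUMA in the set lies wholly within the county; PUMAs that cross county lines need the allocation or synthetic-data approach described in the post.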

    The current version of the program does not produce useful MOEs. I'm working on an extension that produces replicate weights; you can produce MOEs using the replicate weights. To do all this you need to be able to use R. Do you have any experience with R? If you work for a nonprofit 501(c)(3) or government, you can get free support through my foundation, dorerfoundation.org.
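As a sketch of how replicate weights turn into an MOE, the Python below assumes the standard ACS PUMS approach: 80 replicate estimates, variance equal to 4/80 times the sum of squared differences from the full-sample estimate, and a 90 percent MOE of 1.645 times the standard error. Whether the SAE extension will follow exactly this formula is not stated in the post; `replicate_moe` is a hypothetical helper.

```python
import math


def replicate_moe(full_estimate, replicate_estimates, z=1.645):
    """Return the margin of error from a full-sample estimate and 80 replicate estimates."""
    if len(replicate_estimates) != 80:
        raise ValueError("expected 80 replicate estimates")
    # Variance = (4 / 80) * sum of squared deviations from the full-sample estimate.
    variance = (4.0 / 80.0) * sum(
        (rep - full_estimate) ** 2 for rep in replicate_estimates
    )
    return z * math.sqrt(variance)


# Made-up replicate estimates: half are 10 above and half 10 below the estimate.
replicates = [110] * 40 + [90] * 40
print(replicate_moe(100, replicates))
```

Each replicate estimate is the same tabulation recomputed with one replicate weight (for example PWGTP1 through PWGTP80 for person records) in place of the main weight.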

    Dave

  • Thanks again, David. I think I will switch over and use the PUMS data with SAS. I need demographics by county/zip for 200% FPL. I'm VERY interested in using the Supplemental Poverty data, though. Can you get me started? I work for the state of Washington, DSHS.

  • I haven't used SAS in years and years, so I don't recall how transferable this resource is, but these two links were extremely helpful with using PUMS data in R.

    https://walker-data.com/census-r/introduction-to-census-microdata.html

    https://walker-data.com/tidycensus/articles/pums-data.html

    Meghan

  • Dear Lorna,

    Go to dorerfoundation.org and look for the "Contact Us" tab across the top. Send an email to the address and we can communicate via email.

    Best,

    Dave