Tutorials on using PUMS data in R


I'm interested in using and analyzing IPUMS data in RStudio, but I'm finding it difficult to find a truly comprehensive walk-through on how to import it and analyze it successfully. I would greatly appreciate any pointers.



  • Hi Peter,

    I have some code that I can adapt/share. Can you mention some of the variables you're interested in?

    I use the ACS PUMS files, which come straight from the Census Bureau. IPUMS slightly modifies these files by renaming variables and recoding some of them so that they're generally consistent with decennial Census files. Still, you should be able to use the code to figure out how to work with PUMS data. I'll aim to have something up in the next few days.

    As far as importing it goes, at least with IPUMS, you'll have to go to their website and create a .csv extract that you download and then import into R.
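
A minimal sketch of that import step in base R, for anyone following along. The helper name `read_pums_csv` is hypothetical, and the demo file is a tiny made-up stand-in for a real extract (the columns follow IPUMS USA naming, but the values are invented):

```r
# Read an IPUMS-style .csv extract into a data frame.
# read_pums_csv is a hypothetical helper, not part of any package.
read_pums_csv <- function(path) {
  pums <- read.csv(path, stringsAsFactors = FALSE)
  names(pums) <- toupper(names(pums))  # keep variable names in one case
  pums
}

# Stand-in for a downloaded extract: three made-up person records.
demo_path <- tempfile(fileext = ".csv")
writeLines(c("YEAR,STATEFIP,PERWT,AGE",
             "2016,17,52,34",
             "2016,18,101,67",
             "2016,55,77,12"), demo_path)

pums <- read_pums_csv(demo_path)
str(pums)

# Each record stands for PERWT people, so weighted counts are sums of weights.
sum(pums$PERWT)  # 230
```

With a real extract you would pass the path of the downloaded file instead of `demo_path`.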
  • Hi Peter,

    I admittedly haven't used it yet, but IPUMS developed the ipumsr package to help with this. It comes with import and metadata functions. It also has good documentation with links to other packages for things like survey weighting. Here is the introduction documentation: cran.r-project.org/.../ipums.html.
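
A rough sketch of what the ipumsr import might look like. The file name `usa_00001.xml` is an assumption (use whatever DDI codebook file came with your extract), and ipumsr expects the data file to sit alongside it. The block does nothing if the package or the file is missing:

```r
# Hypothetical DDI (codebook) file name; replace with the one from your extract.
ddi_path <- "usa_00001.xml"

if (requireNamespace("ipumsr", quietly = TRUE) && file.exists(ddi_path)) {
  # Parse the codebook, then load the microdata it describes.
  ddi  <- ipumsr::read_ipums_ddi(ddi_path)
  pums <- ipumsr::read_ipums_micro(ddi)

  # Variable labels and descriptions taken from the codebook metadata.
  print(ipumsr::ipums_var_info(ddi))
} else {
  message("ipumsr or the extract files were not found; nothing loaded.")
}
```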
  • In reply to Rob Kemp:

    Hi Peter,

    Sorry for the delay here. These files have been on my to-do list for a while. They're still a work in progress, but I hope they're helpful. I expect I'll add a few sections to the analysis file over the next couple of weeks as well.

    You can find two R examples here:

    Specifically, see:
    180730_ACS_PUMS_Download.R - R script to download 5-year PUMS files for IL, IN, and WI
    180730_ACS_PUMS_Example_Analysis.R - Initial version of example analysis in R file

    If you get a chance to use them, please let me know if you find any errors.

    Also, FWIW, Anthony Damico has a trove of R files that show how to use many different Census survey products in R.

  • Here's a link to a GitHub repo that creates LaTeX Beamer slides from ACS PUMS (csv) files. There are scripts that will take you from downloading the files to rendering a pdf slide deck.

  • In reply to Mihir Iyer:

    Hi Mihir,

    What a great resource. This must have taken some time to put together. Thanks for making it public! I've been meaning to add some R files to my repo that go over how to set the survey design for ACS PUMS data, but you've beaten me to it. This is very valuable.

  • In reply to Vincent Palacios:

    No worries at all, Vincent! Glad to be of help :)
  • In reply to Vincent Palacios:

    I'm pretty new to R and have been using SPSS to analyze the PUMS files. However, having discovered that I need to use the replicate weights to generate my own MOEs, and having recently learned that SPSS cannot do that, I am deciding between R and Stata. Do you know if R is able to handle replicate weights? Do you happen to have a script for that? I'm looking at housing cost burden by race and want to use the replicate-weight method to generate my standard errors and margins of error.
    Many thanks for the resources above which have been helpful to me as I start using R.
  • In reply to Elise Cordle:

    Yes, R can handle replicate weights through the 'survey' package. Once you get the survey object set up (assigning the replicate weight columns), it is fairly straightforward after that. Take a look at the data prep code in this GitHub repo: github.com/.../vietvet-acsstats-2016 for some ideas.
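
To make the replicate-weight idea concrete, here is a minimal sketch on synthetic data (this is not the code from the repo above). The household records, race groups "A" and "B", and the `burdened` flag are all made up; only the column names `WGTP` and `WGTP1` to `WGTP80` follow the ACS PUMS housing-weight convention. The first part applies the Census Bureau's replicate-weight formula by hand, SE = sqrt(4/80 × sum of squared differences between each replicate estimate and the full-sample estimate). The second part, which runs only if the survey package is installed, sets up the same thing with `svrepdesign()`:

```r
set.seed(1)
n <- 200

# Synthetic household records (all values invented for illustration).
hh <- data.frame(
  race     = sample(c("A", "B"), n, replace = TRUE),
  burdened = rbinom(n, 1, 0.35),   # 1 = cost burdened (assumed definition)
  WGTP     = sample(20:200, n, replace = TRUE)
)

# Fake replicate weights WGTP1..WGTP80: noisy copies of the main weight.
rep_cols <- paste0("WGTP", 1:80)
hh[rep_cols] <- lapply(1:80, function(i) round(hh$WGTP * runif(n, 0.5, 1.5)))

# Weighted share of burdened households within each race group.
burden_share <- function(w) {
  num <- rowsum(w * hh$burdened, hh$race)
  den <- rowsum(w, hh$race)
  (num / den)[, 1]
}

est  <- burden_share(hh$WGTP)                                 # full-sample estimate
reps <- sapply(rep_cols, function(col) burden_share(hh[[col]]))  # 2 x 80 matrix

# Replicate-weight standard error and 90% margin of error.
se    <- sqrt(4 / 80 * rowSums((reps - est)^2))
moe90 <- 1.645 * se
print(cbind(estimate = est, se = se, moe90 = moe90))

# The same design expressed with the survey package, if it is available.
if (requireNamespace("survey", quietly = TRUE)) {
  des <- survey::svrepdesign(
    data       = hh,
    weights    = ~WGTP,
    repweights = "WGTP[0-9]+",   # regex matching the 80 replicate columns
    type       = "JK1",
    scale      = 4 / 80,
    rscales    = rep(1, 80),
    mse        = TRUE
  )
  print(survey::svyby(~burdened, ~race, des, survey::svymean))
}
```

For person-level estimates the analogous columns would be `PWGTP` and `PWGTP1` to `PWGTP80`. The two approaches should give matching standard errors, which is a handy check that the survey object is set up the way you intended.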