Hi!
I am looking to obtain the total population and total housing unit population for a PUMA using the 2018 ACS 1-year PUMS file for TX. I use these files often, but admittedly have never been tasked with deriving these numbers. I am using R in RStudio. So far, this is what I have:
# Set working directory
setwd(" ")

# Verify working directory
getwd()

# Load library
library(tidyverse)
library(tidycensus)
library(car)
library(dplyr)
library(matrixStats)
library(stringr)
library(survey)
library(srvyr)

# Load household data 2018 1-year ACS PUMS
pums2018h <- read.csv("")

# Load person data 2018 1-year ACS PUMS
pums2018p <- read.csv("")

# Merge household and person PUMS data
pums2018 <- inner_join(pums2018h, pums2018p, by = c("SERIALNO", "DIVISION", "PUMA", "REGION", "ST", "ADJINC"))

# Set variables of interest to include
pums2018_var <- pums2018 %>% select(SERIALNO, AGEP, PWGTP, RELP, SCH, SCHG, ST, PUMA, WAGP, WKL, ESR, ADJINC, BLD, HHT, HINCP, NP, WIF, NR, TEN, TYPE, WGTP)

# Filter observations to Grayson County
puma2900 <- pums2018_var %>% filter(PUMA == 2900)

# Total population
(tot_pop <- count(puma2900, wt = PWGTP))

# Housing unit population (RELP=0-15)
(hsg_pop <- puma2900 %>% filter(RELP < 16) %>% count(wt = PWGTP))
I hope I copied over my code correctly—if not, please let me know.
The last two lines of code give me the following error: Error in count(., puma2900, wt = PWGTP) : Argument 'x' is not a vector: list
What is going on and what is the correct way to tell R to calculate these items?
Thank you in advance.
Rafaelg: You have several options. Your script loads the tidycensus package but never uses it in the rest of your example. The tidycensus package was updated (spring 2021) to include a "get_pums" function. The full documentation and video lectures by Professor Kyle Walker at TCU are great, and his workshop videos from last year (on YouTube) are perfect. You can use get_pums with or without the replicate weights. Read Walker's book!
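Here is a rough sketch of what the get_pums route might look like for this PUMA. The helper names fetch_puma_pums and puma_totals are made up for illustration, and the sketch assumes a Census API key is already registered and that get_pums returns the person weight PWGTP alongside the requested variables (I believe it does by default, but check the output).

```r
# Hypothetical helper: pull 2018 1-year ACS PUMS person records for one PUMA.
# Assumes tidycensus is installed and a Census API key has been registered
# (for example with tidycensus::census_api_key()).
fetch_puma_pums <- function(state = "TX", puma = "02900", year = 2018) {
  tidycensus::get_pums(
    variables = c("AGEP", "RELP"),
    state = state,
    puma = puma,      # five-digit PUMA code as a string
    year = year,
    survey = "acs1"
  )
}

# Hypothetical helper: weighted totals from person-level records.
# RELP may come back as a character code ("00", "16", ...), so it is
# converted to integer before comparing. RELP 16 and 17 are group quarters.
puma_totals <- function(pums) {
  relp <- as.integer(as.character(pums$RELP))
  c(
    total_pop = sum(pums$PWGTP),
    housing_unit_pop = sum(pums$PWGTP[relp < 16])
  )
}

# Usage (needs network access and an API key):
# puma_totals(fetch_puma_pums())
```

These are point estimates only; margins of error would need the replicate weights.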
Another main option is to access the 2018 ACS PUMS (and most other PUMS data, for that matter) through iPUMS.org. If you want fast crosstabulations, use the SDA (Survey Documentation and Analysis) web software. Just select 2018 PUMS, Texas, and PUMA 2900 and you're on your way! If you need the microdata records themselves, I'm pretty sure that's an option in iPUMS.
iPUMS is absolutely amazing. If you need a NATIONWIDE table that can ONLY be produced using PUMS (e.g., households by vehicles in household by age of householder by race/ethnicity of householder by tenure by year) then iPUMS is the only way to go!
And if you're only needing the total population and household population for Texas PUMA 2900 (Grayson + Cooke + Fannin Counties, Texas), then I would recommend data.census.gov. PUMAs are now "standard" geographic areas, so PUMA-level single-year ACS tables (full data, not just the microdata sample) are readily available.
Hope this helps. Chuck
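On the error itself: a likely cause (an assumption based on the library order in the script, not something confirmed in the thread) is that matrixStats is loaded after dplyr and also exports a function named count, which expects a vector rather than a data frame. Calling dplyr::count explicitly sidesteps the masking. A minimal sketch on made-up toy data, not real PUMS records:

```r
# Toy person-level records; the values are invented for illustration.
toy <- data.frame(
  PUMA  = c(2900, 2900, 2900, 2900, 100),
  RELP  = c(0, 1, 16, 17, 0),      # 16 and 17 are group quarters
  PWGTP = c(10, 20, 5, 3, 50)      # person weights
)

# Keep only the PUMA of interest.
puma2900 <- dplyr::filter(toy, PUMA == 2900)

# Total population: sum of person weights.
# The dplyr:: prefix makes sure matrixStats::count is not picked up.
tot_pop <- dplyr::count(puma2900, wt = PWGTP)

# Housing unit population: drop group quarters (RELP 16, 17) first.
hsg_pop <- dplyr::count(dplyr::filter(puma2900, RELP < 16), wt = PWGTP)

print(tot_pop)
print(hsg_pop)
```

Dropping library(matrixStats) from the script, or loading it before dplyr, should have the same effect if nothing else in the script needs it.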