Hello,
I am currently working on calculating the median household income for a specific PUMA.
I am using the 2017 1-year ACS PUMS for my state (merged person and household files by SERIALNO). Also, I have converted household income dollars to 2017 dollars via the following code:
puma$hinc2017 <- puma$HINCP * (puma$ADJINC / 1000000)
So far, so good.
Additionally, I've generated a flag variable which identifies householders:
puma$hholder <- factor(ifelse(puma$RELP == 0, 1, NA))
Now, I need to calculate the median household income. In Stata, I have seen the code written like this:
sum hinc2017 if hholder==1 [fweight=wgtp], detail
gen hinc2017_all=r(p50)
How is this done in R?
I know there is a median() function within base R, but when I execute the following code, the result returns NA:
median(puma$hinc2017)
[1] NA
Any advice would be appreciated.
median() returns NA if the vector contains missing values. Start debugging by adding na.rm = TRUE, as in median(dataframe$varname, na.rm = TRUE). Note that this still gives an unweighted median.
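Here is a small sketch of the NA behavior, using a made-up vector rather than the actual PUMS data:

```r
# Toy vector with one missing value (values are invented for illustration)
x <- c(10, 20, NA, 40)

n_missing  <- sum(is.na(x))            # how many values are missing
med_raw    <- median(x)                # NA, because of the missing value
med_no_na  <- median(x, na.rm = TRUE)  # 20, the median of 10, 20, 40
```

Counting the missing values first, as with sum(is.na(x)), is a quick way to confirm that NAs are the cause.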
You need to rely on other packages for a weighted median, for example the survey package or the srvyr package. See here for an example using these along with tidycensus and others:
https://walker-data.com/tidycensus/articles…
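As a rough sketch of the survey-package route (not taken from the linked article): the toy data frame and its values below are invented, and the column names hinc2017, WGTP, and RELP are assumed to match the merged file described in the question. This gives a point estimate only; proper standard errors for PUMS would need the replicate weights, which are not shown here.

```r
library(survey)

# Toy stand-in for the merged PUMS file (values are made up)
toy <- data.frame(
  hinc2017 = c(20000, 50000, NA, 80000, 50000),
  WGTP     = c(10, 30, 5, 20, 30),
  RELP     = c(0, 0, 0, 0, 1)
)

# Keep one row per household: the householder (RELP == 0)
hh <- toy[toy$RELP == 0, ]

# Simple design that uses only the household weight
design <- svydesign(ids = ~1, weights = ~WGTP, data = hh)

# Weighted median of household income, ignoring missing incomes
med <- svyquantile(~hinc2017, design, quantiles = 0.5, na.rm = TRUE)
print(med)
```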
Try this weighted median function; it worked for me: www.rdocumentation.org/.../weighted.median
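If you would rather not install anything, here is a minimal base-R sketch of a weighted median with frequency weights, similar in spirit to the Stata command in the question. It is not the function linked above; weighted_median is a hypothetical helper name, and the toy data are invented. It returns the smallest value at which the cumulative weight reaches half of the total, so results may differ slightly from functions that interpolate.

```r
# Hypothetical helper: weighted median using frequency weights, base R only
weighted_median <- function(x, w) {
  keep <- !is.na(x) & !is.na(w) & w > 0  # drop missing values and zero weights
  x <- x[keep]
  w <- w[keep]
  ord <- order(x)                        # sort values, carrying weights along
  x <- x[ord]
  w <- w[ord]
  cum_share <- cumsum(w) / sum(w)        # cumulative share of total weight
  x[which(cum_share >= 0.5)[1]]          # first value reaching the 50% mark
}

# Toy stand-in for the merged PUMS file (values are made up)
toy <- data.frame(
  hinc2017 = c(20000, 50000, NA, 80000, 50000),
  WGTP     = c(10, 30, 5, 20, 30),
  RELP     = c(0, 0, 0, 0, 1)
)

# One row per household: keep householders only, then weight by WGTP
hh <- toy[toy$RELP == 0, ]
toy_median <- weighted_median(hh$hinc2017, hh$WGTP)
```

With the real data, the call would look like weighted_median(hh$hinc2017, hh$WGTP) after subsetting puma to the PUMA of interest and to RELP == 0.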