Calculating median household income for PUMA from PUMS files in R

Hello, 

I am currently working on calculating the median household income for a specific PUMA.

I am using the 2017 1-year ACS PUMS for my state (person and household files merged by SERIALNO). I have also adjusted household income to 2017 dollars with the following code:

puma$hinc2017 <- puma$HINCP * (puma$ADJINC / 1000000)

So far, so good. 

Additionally, I've generated a flag variable that identifies householders:
puma$hholder <- factor(ifelse(puma$RELP == 0, 1, NA))

Now, I need to calculate the median household income. In Stata, I have seen the code written like this:

sum hinc2017 if hholder==1 [fweight=wgtp], detail
gen hinc2017_all=r(p50)

How is this done in R?

I know there is a median() function within base R, but when I execute the following code, the result returns NA: 

median(puma$hinc2017)

[1] NA

Any advice would be appreciated.

  • median() returns NA if the input contains missing values. Start debugging by adding na.rm = TRUE, as in median(dataframe$varname, na.rm = TRUE)

  • Thank you Ani. This helped. The output is now a number, which is good, but it isn't the number that the Stata code produced. That code seems to use a condition and apply weights:

    sum hinc2017 if hholder==1 [fweight=wgtp], detail
    gen hinc2017_all=r(p50)

    How would I do this in R?
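
One possible way to approach this in base R is sketched below: keep only the householder rows (RELP == 0), so each household is counted once, then take a weighted median using the household weight. The helper weighted_median() and the toy data frame are made up for illustration, and the weight column is assumed to be named WGTP in your merged file. The helper treats the weights as frequency weights, which is what fweight does in Stata, but it has not been checked against Stata's r(p50), so small differences are possible.

```r
# Hypothetical helper: median of x where each value counts w times.
# Returns the first sorted value at which the cumulative weight reaches
# half of the total weight.
weighted_median <- function(x, w) {
  keep <- !is.na(x) & !is.na(w) & w > 0   # drop missing values and zero weights
  x <- x[keep]
  w <- w[keep]
  if (length(x) == 0) return(NA_real_)

  ord <- order(x)
  x <- x[ord]
  w <- w[ord]

  cum_w <- cumsum(w)
  half <- sum(w) / 2
  i <- which(cum_w >= half)[1]

  # If the cumulative weight lands exactly on the halfway point, average
  # with the next value, as median() does for an even number of values.
  # (Exact comparison is safe here because PUMS weights are integers.)
  if (cum_w[i] == half && i < length(x)) {
    return((x[i] + x[i + 1]) / 2)
  }
  x[i]
}

# Toy data standing in for the merged PUMS file (values are made up)
toy <- data.frame(
  RELP     = c(0, 1, 0, 0, 2),
  hinc2017 = c(50000, 50000, 20000, 90000, 90000),
  WGTP     = c(10, 10, 30, 20, 20)
)

hh  <- toy[toy$RELP == 0, ]               # one row per household
med <- weighted_median(hh$hinc2017, hh$WGTP)
print(med)
```

On the real data the same two steps would be hh <- puma[puma$RELP == 0, ] followed by weighted_median(hh$hinc2017, hh$WGTP), again assuming the weight column is called WGTP. Filtering on RELP directly avoids comparing against the factor-coded hholder flag.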
