I am attempting to double-check the number of people in each household for a specific PUMA.
I am grouping (group_by) by SERIALNO after merging the person and household files.
This is what I am attempting to do:
As you can see, I am grouping by SERIALNO. For example, 2017000001750 appears once and NP equals 1, which is correct as there is one person in this household. SERIALNO 2017000003510 appears four times, with an NP of 4 denoting four people in the household. I want to create a variable (HHSIZE) that counts the number of times a specific SERIALNO appears (2017000001750 = 1, 2017000003510 = 4) and then run a logical test, HHSIZE - NP == 0, that returns TRUE when the two agree.
So far, this is what I have:
df %>% group_by(SERIALNO) %>% mutate(HHSIZE = count(SERIALNO)) # I am sure this isn't quite right as I get 0s in the output
# I am also struggling to write code that tests whether NP - HHSIZE == 0 and returns TRUE
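For reference, here is a minimal sketch of one way the count and the test might be written. dplyr's `count()` is designed to take a data frame rather than a single column, so inside a grouped `mutate()` the group size is usually obtained with `n()` instead. The toy `df` below is made up for illustration (it is not the real PUMS extract), `NP_MATCH`, `checked`, and `mismatches` are hypothetical names, and SERIALNO is assumed to be stored as character.

```r
library(dplyr)

# Hypothetical toy data standing in for the merged person/household file:
# one row per person, with the household-level NP repeated on each row.
df <- tibble(
  SERIALNO = c("2017000001750", rep("2017000003510", 4)),
  NP       = c(1, 4, 4, 4, 4)
)

checked <- df %>%
  group_by(SERIALNO) %>%
  mutate(
    HHSIZE   = n(),               # number of rows (persons) sharing this SERIALNO
    NP_MATCH = NP - HHSIZE == 0   # TRUE when NP agrees with the row count
  ) %>%
  ungroup()

# Rows belonging to households where the two counts disagree, if any
mismatches <- checked %>% filter(!NP_MATCH)

# Single TRUE/FALSE answer for the whole data frame
all(checked$NP_MATCH)
```

If only the count is needed, `add_count(SERIALNO, name = "HHSIZE")` should give the same HHSIZE column without an explicit `group_by()`.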
Thanks as always for being patient with me.