This is my first time using ACS PUMS data and I am trying to create an estimate of the Type of Unit variable from the housing file(s) of the 2015 1-yr ACS. I am using the a, b, and Puerto Rico files (ss15husa.csv, ss15husb.csv, ss15hpr.csv). I am using R with Thomas Lumley's survey package.
When I run a raw tabulation of the variable TYPE, I get the results below, which show records for all three unit types (Housing, Ins. GQ, and Non-Ins. GQ).
      1       2       3
1363661   71728   77922
However, when I run the svytotal function on the housing survey design object I created, I get zeroes for the GQ levels:
> svytotal(~factor(TYPE), hou_prof_design)
                  total     SE
factor(TYPE)1 136367197 5089.4
factor(TYPE)2         0    0.0
factor(TYPE)3         0    0.0
This doesn't make sense to me, and I am looking for advice on how to rectify it. A colleague of mine has suggested that I use the person weight, but I am unclear on how to associate the person weights with the records in the housing file that correspond to the GQ levels of the TYPE variable. Any help will be greatly appreciated.
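One likely explanation, as far as the PUMS data dictionary describes it, is that GQ records in the housing file are placeholders that carry a housing weight (WGTP) of zero. A minimal sketch of how to check this, using a made-up toy data frame in place of the real housing file (the WGTP values here are invented for illustration):

```r
# Toy stand-in for a few housing-file rows; values are made up for illustration.
hou <- data.frame(
  SERIALNO = c(1, 2, 3, 4, 5),
  TYPE     = c(1, 1, 2, 3, 3),
  WGTP     = c(120, 85, 0, 0, 0)  # assumption: GQ placeholder rows have zero housing weight
)

# Unweighted record counts by TYPE: every level appears.
raw_counts <- table(hou$TYPE)

# Weighted totals by TYPE: the GQ levels sum to zero if their WGTP is zero.
weighted_totals <- tapply(hou$WGTP, hou$TYPE, sum)

print(raw_counts)
print(weighted_totals)
```

Running the same tapply() call on the real housing data frame would show whether the zero totals for TYPE 2 and 3 come from zero weights rather than from the design object.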
Beth - can't thank you enough for your thoughtful reply. Here's the background. The question I have been given is: How many Vietnam War Era Veterans live in Group Quarters? This question is part of a whole slew of others that together will form a Profile of Vietnam War Veterans. As such, this is not an analysis but a collection of "facts" that describe those Veterans.
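For a question like this, one possible approach is to attach TYPE from the housing file to the person records via SERIALNO and then sum the person weight (PWGTP). The sketch below uses toy data; `vietnam_vet` is a hypothetical 0/1 flag that would have to be derived from the person file's period-of-service variable, and all weights are made up.

```r
# Toy housing rows: SERIALNO links to the person file; TYPE 2 and 3 are group quarters.
hou <- data.frame(SERIALNO = c(1, 2, 3, 4),
                  TYPE     = c(1, 2, 3, 1))

# Toy person rows. vietnam_vet is a hypothetical flag, not a PUMS variable;
# the PWGTP values are invented for illustration.
per <- data.frame(SERIALNO    = c(1, 1, 2, 3, 4),
                  PWGTP       = c(100, 90, 40, 60, 75),
                  vietnam_vet = c(1, 0, 1, 1, 0))

# Attach TYPE to each person via the shared serial number.
per_hou <- merge(per, hou, by = "SERIALNO")

# Weighted count of flagged veterans living in group quarters (TYPE 2 or 3).
in_gq <- per_hou$TYPE %in% c(2, 3) & per_hou$vietnam_vet == 1
gq_vet_total <- sum(per_hou$PWGTP[in_gq])

print(gq_vet_total)
```

This gives a point estimate only. A standard error would need the person replicate weights (PWGTP1 through PWGTP80) in a replicate-weight design, which this sketch does not attempt.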
Another related question with respect to Housing Units vs. Group Quarters: I have noticed that for many of the variables in the housing file the "b" level is coded as NA (GQ/Vacant). I have been treating these values as NA, which means dropping them. Is this enough to prevent the mixing of the figurative apples and oranges? Or do I need to subset on the TYPE variable, selecting Housing Units only? Cheers!
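The two approaches are not necessarily equivalent, because the "b" level covers vacant housing units as well as GQ records. A small sketch with toy data (TEN is used here only as an example of a variable with a GQ/vacant "b" level; all values are made up) shows how the totals can differ:

```r
# Toy housing rows; NA stands in for the "b" (GQ/vacant) code.
hou <- data.frame(
  TYPE = c(1, 1, 1, 2, 3),
  TEN  = c(1, 3, NA, NA, NA),  # third row: a vacant housing unit (TYPE 1, TEN missing)
  WGTP = c(100, 80, 50, 0, 0)  # made-up weights; GQ rows assumed to carry zero
)

# Approach 1: drop every record where the variable is NA.
# This removes GQ records AND vacant housing units.
non_missing_total <- sum(hou$WGTP[!is.na(hou$TEN)])

# Approach 2: keep housing units only, whether occupied or vacant.
hu <- hou[hou$TYPE == 1, ]
hu_total <- sum(hu$WGTP)

# Weight of the vacant units that approach 1 silently drops.
vacant_total <- sum(hu$WGTP[is.na(hu$TEN)])

print(c(non_missing = non_missing_total, housing_units = hu_total, vacant = vacant_total))
```

With a survey design object, the corresponding step would be something like subset(hou_prof_design, TYPE == 1), so that the restriction is applied to the design rather than to the raw data frame.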