Estimate of Type of Unit (TYPE) Results in a value of 0 for Institutional & Non-Institutional GQ.

This is my first time using ACS PUMS data and I am trying to create an estimate of the Type of Unit variable from the housing file(s) of the 2015 1-yr ACS. I am using the a, b, and Puerto Rico files (ss15husa.csv, ss15husb.csv, ss15hpr.csv). I am using R with Thomas Lumley's survey package.

When I calculate the raw tabulation of the variable TYPE, I get the results below, indicating that there are records for all three types of units (Housing, Institutional GQ, and Non-Institutional GQ).

      1       2       3
1363661   71728   77922

However, when I run the svytotal function on the housing survey design object I created I get zeroes for the gq levels:

> svytotal(~factor(TYPE), hou_prof_design)
                  total     SE
factor(TYPE)1 136367197 5089.4
factor(TYPE)2         0    0.0
factor(TYPE)3         0    0.0

This doesn't make sense to me, and I am looking for advice on how to fix it. A colleague of mine has suggested that I use the person weight, but I am unclear on how to associate the person weights with the records in the housing file that correspond to the GQ levels of the TYPE variable. Any help will be greatly appreciated.

  • Hi,

    A few thoughts:

    1. If the svytotal command in R pulls in the "wgtp" (housing weight) variable, that weight is 00000 for group quarters because it's a placeholder (see www2.census.gov/.../PUMSDataDict15.pdf; pages 1-31 have all the Housing Unit variables). If you're wondering, "Why is the housing weight zero for group quarters?", it's because the ACS isn't trying to capture characteristics of group quarters the same way it captures characteristics of housing units (e.g., plumbing, heating, rent/housing value, internet access). The ACS does capture characteristics of the people who live in group quarters.

    2. I agree with your colleague: use the person weights. Unfortunately, this means you'll need additional csv files: ss15pusa, ss15pusb, and ss15ppr. The "h" or "p" after the 15 tells you whether it's a housing unit file or a person file. These files can be found here (factfinder.census.gov/.../productview.xhtml), which has population records as well as housing records. Link the population records to the housing unit records using the "serialno" variable as your matching key. Then the "pwgtp" variable is your person weight. (Data Dictionary, page 32+ has all the Person Record variables.)
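
To make the two points above concrete, here is a minimal R sketch. It runs on made-up toy data rather than the real PUMS files: the data frames `hou` and `per` and every value in them are invented for illustration, and only the column names SERIALNO, TYPE, WGTP, and PWGTP come from the PUMS files. It gives point estimates only, not standard errors.

```r
# Toy stand-ins for the housing and person files (all values invented).
hou <- data.frame(
  SERIALNO = c(1, 2, 3, 4),
  TYPE     = c(1, 1, 2, 3),     # 1 = housing unit, 2 = institutional GQ, 3 = non-institutional GQ
  WGTP     = c(100, 80, 0, 0)   # housing weight is 0 on the GQ records
)
per <- data.frame(
  SERIALNO = c(1, 1, 2, 3, 4),
  PWGTP    = c(90, 110, 85, 40, 25)
)

# Point 1: summing the housing weight by TYPE reproduces the zeros for GQ.
wgtp_by_type <- tapply(hou$WGTP, hou$TYPE, sum)

# Point 2: attach TYPE to each person record via SERIALNO,
# then sum the person weight by TYPE instead.
per_hou <- merge(per, hou[, c("SERIALNO", "TYPE")], by = "SERIALNO")
pwgtp_by_type <- tapply(per_hou$PWGTP, per_hou$TYPE, sum)

print(wgtp_by_type)
print(pwgtp_by_type)
```

With the real files, the same merge would be applied before building the person-level survey design object, so that TYPE is available as a variable in that design.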

    Hopefully this helps,
    Diana

  • Diana - I am grateful for your reply, as it gives me a definitive path forward. At the same time I'm grinding my teeth because of the work I have ahead of me. You may have read in my reply to Beth that I am working on creating a body of "facts" to describe Vietnam War Era Veterans. So I have already set up the Person file data, created a survey design object for this subpopulation, then used SERIALNO to identify their corresponding Housing file records, etc.

    Now, just to make sure I'm thinking straight: I have identified the records in the Housing file that represent Vietnam War Veterans. From those Housing records I will identify the ones where TYPE is GQ, then, using those SERIALNOs, I will go back to the Person records to pull in the weights. Is this logic right?


    P.S. Too embarrassed to say how much time I spent thinking about and researching "Why is the housing weight zero for group quarters?" Thanks a bunch :)

  • Yeah, if you've already identified the list of "serialno" values for Vietnam veterans, you can match those to the housing files and get the distribution of housing "type" and use "pwgtp" to get the estimated number of Vietnam veterans who lived in group quarters in 2015.

    You won't be able to get much info about the physical/structural characteristics of the group quarters, since, as you noticed with the "b", group quarters are "not in universe" for a lot of the housing characteristics variables (e.g., plumbing, heating, year built). The group quarters respondents most likely were simply not asked these questions.

    You can, however, subset your population records to only Vietnam veterans who lived in group quarters - SAS code would be "if type in ('2', '3')" - and then run various descriptive statistics on the population variables/population characteristics as of 2015 (e.g., age, race, sex, marital status, education, labor force participation, disability, commute).
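
Since you're working in R rather than SAS, here is a rough sketch of that subset and a few weighted descriptive statistics. The data frame `vets` and all of its values are invented toy data, and `is_vietnam_vet` is a hypothetical flag standing in for however you identified the veterans; AGEP, SEX, TYPE, and PWGTP are the PUMS column names. It assumes TYPE has already been merged onto the person records, and it produces point estimates only, so standard errors would still need the survey design object.

```r
# Toy person records already merged with TYPE from the housing file
# (all values invented; is_vietnam_vet is a hypothetical flag).
vets <- data.frame(
  SERIALNO       = c(1, 2, 3, 4, 5),
  TYPE           = c(1, 2, 2, 3, 1),
  is_vietnam_vet = c(TRUE, TRUE, TRUE, TRUE, FALSE),
  AGEP           = c(68, 70, 66, 72, 45),
  SEX            = c(1, 1, 2, 1, 2),
  PWGTP          = c(100, 40, 20, 40, 90)
)

# R counterpart of the SAS filter: keep Vietnam veterans living in group quarters.
gq_vets <- subset(vets, is_vietnam_vet & TYPE %in% c(2, 3))

# Weighted point estimates using the person weight.
est_gq_vets <- sum(gq_vets$PWGTP)                          # estimated number of people
mean_age    <- weighted.mean(gq_vets$AGEP, gq_vets$PWGTP)  # weighted mean age
by_sex      <- xtabs(PWGTP ~ SEX, data = gq_vets)          # weighted count by sex

print(est_gq_vets)
print(mean_age)
print(by_sex)
```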

    -Diana

    P.S. Don't worry at all! There are so many details and intricacies with ACS data. That's what this whole ACS Data Users Group site is for!