Family Structure and Income

I am working with a colleague to evaluate the median family income using PUMS data (in this case 2007-2011 5-year data that combines personal and household files based on serialno) and have run into a question about family structure versus household structure versus incomes.

For example, PUMS has some observations that look like the attached file (headers use PUMS variable names). All observations have the same serialno, the sporder 1-4 makes sense, the personal incomes (pincp) seem reasonable, and the household income (hincp) makes sense.

But npf says the family size is 3, while the family income variable (fincp) gives every member of the household the same family income. I would expect either that one member of the household would not have a family income (i.e. sporder = 1 to 4 but npf = 3, so one member of the household is not a member of the family), or that the household has two families, in which case it should have 2 different family incomes and the family sizes should all say 2. Any suggestions about how to interpret this type of outcome, alternative variables to use, or errors in my logic?

  • It looks like you definitely have a primary family / subfamily relationship going on. (SFR and SFN are not blank for 2 of the cases.) It looks like persons 2 and 3 are part of a subfamily, and persons 1 and 4 are in the primary family... Given that, I do not have much additional insight to offer on why the primary family and the subfamily show the same income.
  • Can you provide the geographic information: state code, PUMA #? I can look into this if I have that info.

    Doug Hillmer
  • Thanks to everyone for the quick feedback. The state code is 18, PUMA is 202.

    Between writing it down and thinking about everyone's feedback, I've realized the problem(s). The family income variable (fincp) is by definition a household level variable (i.e. in the household PUMS dataset rather than the person dataset) so there is only one family income for any given household. The same thing is true for the family size variable. When I merge the household dataset with the person dataset based on serialno, all of a household's characteristics are assigned to all of the persons in that household - this explains why I am getting a household size of 4, a family size of 3, all 4 people showing the same family income, etc. It seems like the solution is to create my own family income variable based on combining personal incomes (pincp) according to family unit rules? Make sense?
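
The merge behavior described above can be sketched in a few lines. This is a minimal illustration in plain Python, not the poster's actual workflow: the serial number "H1" and all income values are made up, and the dictionary layout is an assumption chosen only to mimic a household file joined to a person file on serialno.

```python
# Hypothetical records for illustration only; all values are made up.
households = {
    "H1": {"np": 4, "npf": 3, "hincp": 90000, "fincp": 60000},
}
persons = [
    {"serialno": "H1", "sporder": 1, "pincp": 40000},
    {"serialno": "H1", "sporder": 2, "pincp": 0},
    {"serialno": "H1", "sporder": 3, "pincp": 20000},
    {"serialno": "H1", "sporder": 4, "pincp": 30000},
]

# Merging on serialno copies every household-level field onto each person row.
merged = [{**person, **households[person["serialno"]]} for person in persons]

# All four person rows now carry the same npf and fincp.
fincp_values = {row["fincp"] for row in merged}
npf_values = {row["npf"] for row in merged}

# For household-level statistics, keep one value per serialno so that a
# household's fincp is not counted once per person.
fincp_per_household = {row["serialno"]: row["fincp"] for row in merged}
```

Collapsing back to one value per serialno before computing something like a median keeps a four-person household from contributing its fincp four times.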
  • Instead of creating your own family income variable, why not use the RELP variable to determine who counts toward family income? In other words, only merge/count the family income if RELP is 0-10, and maybe 13, depending on how you want to define family.

    [Updated on 1/29/2015 3:26 PM]
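
One way to read the RELP suggestion is as a filter on person rows before summing pincp. The sketch below assumes that RELP codes 0-10 identify the householder and relatives and that 13 is the unmarried partner, as stated in the reply above; check the data dictionary for your PUMS vintage. The function name `family_income`, the lowercase field names, and the income values are all hypothetical.

```python
# Assumption: RELP 0-10 marks the householder and relatives (per the reply
# above). Verify the codes against the data dictionary for your PUMS vintage.
FAMILY_RELP = set(range(0, 11))

def family_income(person_rows, family_relp=FAMILY_RELP):
    """Sum pincp over rows whose RELP is in family_relp, skipping missing pincp."""
    return sum(
        row["pincp"]
        for row in person_rows
        if row["relp"] in family_relp and row["pincp"] is not None
    )

# Hypothetical household: householder, sister, in-law, unmarried partner.
household = [
    {"sporder": 1, "relp": 0, "pincp": 40000},
    {"sporder": 2, "relp": 5, "pincp": 0},
    {"sporder": 3, "relp": 10, "pincp": 20000},
    {"sporder": 4, "relp": 13, "pincp": 30000},
]

strict = family_income(household)                            # relatives only
with_partner = family_income(household, FAMILY_RELP | {13})  # partner included
```

Passing the set of RELP codes as a parameter makes the definition of "family" an explicit choice rather than something buried in the merge.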
  • Yes, it is always tricky when merging the person and household characteristics. I think you may have found the source of your problem. I also agree that using RELP can help you isolate the family members in the household.
  • Could someone provide a little more background on the RELP variable? My exposure to PUMS is through the IPUMS site. A quick look there does not reveal RELP in the list of variable definitions.
  • RELP is indeed included in the PUMS documentation from the Census Bureau on pages 32-33 of the referenced document. The RELP variable appears to be essentially the same as the RELATE variable found on the IPUMS site, the difference being that census code for the householder/reference person is 00 whereas IPUMS uses 01, for historical reasons I suspect.
  • The other option when constructing family income is to use the SFN, which assigns a family ID number to each person within a subfamily within the household. Primary household members have a blank or missing value for the field, while sub-family members have a unique subfamily ID.
    Hope this helps!
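
A rough sketch of the SFN approach, assuming SFN is blank (represented here as `None`) for people outside any subfamily and holds a subfamily number otherwise. The function name `income_by_sfn`, the `"none"` label, and the income values are hypothetical.

```python
from collections import defaultdict

def income_by_sfn(person_rows):
    """Total pincp per SFN value; rows with a blank SFN go under 'none'."""
    totals = defaultdict(int)
    for row in person_rows:
        key = row["sfn"] if row["sfn"] is not None else "none"
        totals[key] += row["pincp"] or 0  # treat missing pincp as 0
    return dict(totals)

# Hypothetical household: persons 2 and 3 form subfamily 1.
household = [
    {"sporder": 1, "sfn": None, "pincp": 40000},
    {"sporder": 2, "sfn": 1, "pincp": 0},
    {"sporder": 3, "sfn": 1, "pincp": 20000},
    {"sporder": 4, "sfn": None, "pincp": 30000},
]

totals = income_by_sfn(household)
```

Note that the blank-SFN group can mix the householder with nonrelatives, so SFN alone does not isolate the primary family; it probably needs to be combined with RELP.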
  • Douglas/Doug, thanks for the suggestion re: RELP. Beth, thanks for the suggestion re: SFN. I've suddenly learned more about family structures than I ever intended :)

    For anyone that might look up this string later, using the example I provided, RELP shows the household of 4 people is comprised of a householder, the householder's unmarried partner (the unmarried partner is not part of the "family"), the householder's sister, and the householder's in-law. The sister and in-law are a subfamily of the householder's family. The sister has no income and the in-law's income is part of the "family" income using the fincp variable. The unmarried partner's income is not part of the "family income" since by definition that person is not a member of the family.

    It seems like an odd definition of family to include the sister/in-law but exclude the unmarried partner, but it is what it is. And, it seems a little more unusual when there is more than one subfamily (e.g. serialno 2011000737718 of the same dataset utilizing the SFN variable to evaluate), but again it is all about the definition of family and the concept that there can technically only be one formally defined family in a household.

    Thanks for everyone's feedback/help!
  • And, for anyone interested, I happened to come across this recent post on the topic as well:

    blogs.census.gov/.../