Implied household population by household size inconsistencies

Working with block groups, I've noticed discrepancies between the households-by-size counts and the household population. For example, suppose I have a block group with the following:

- population in occupied households: 1454

- households by household size:

  • 1: 89
  • 2: 273
  • 3: 72
  • 4: 97
  • 5: 31
  • 6: 0
  • 7+: 0

To get household population by household size, I would think that I could multiply each household count for sizes 1-6 by the corresponding household size. The population in 7+ households would then be the total household population minus the sum of the calculated populations for household sizes 1-6. So for this case:

  • 1: 89
  • 2: 546
  • 3: 216
  • 4: 388
  • 5: 155
  • 6: 0
  • total pop for household sizes 1 - 6: 1394
  • remaining for 7+: 60

Note the inconsistency here: we have zero 7+ households, but to match the household population we would theoretically need 60 people in this group. I've also noticed cases where the inverse is true: there are 7+ households, but the implied total across the 1-6 household size groupings is larger than the entire household population for the block group (i.e., the population in 7+ households would be negative).
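For anyone who wants to reproduce the check, here is a minimal Python sketch of the arithmetic described above, using the counts from this block group. The names (`households_by_size`, `implied_population`, and so on) are made up for illustration, and the consistency rule in the comments is only one reasonable reading of "inconsistent", not an official definition.

```python
# Household counts by size for the example block group (sizes 1-6).
# The open-ended "7+" category is kept separately because its
# population cannot be derived from the household count alone.
households_by_size = {1: 89, 2: 273, 3: 72, 4: 97, 5: 31, 6: 0}
households_7_plus = 0
household_population = 1454


def implied_population(counts):
    """Return {size: households * size} for the closed size categories."""
    return {size: n * size for size, n in counts.items()}


implied = implied_population(households_by_size)
implied_total_1_to_6 = sum(implied.values())

# Whatever is left over would have to live in 7+ households.
residual_7_plus = household_population - implied_total_1_to_6

# Flag the block group when the residual cannot be reconciled:
# - there are no 7+ households but people are left over, or
# - the residual is below 7 people per reported 7+ household
#   (which also covers a negative residual).
inconsistent = (households_7_plus == 0 and residual_7_plus != 0) or (
    residual_7_plus < 7 * households_7_plus
)

print("implied population, sizes 1-6:", implied_total_1_to_6)
print("residual for 7+ households:", residual_7_plus)
print("inconsistent:", inconsistent)
```

Running the same check over every block group would show how often the residual is impossible and in which direction.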

So I guess I'm looking for suggestions as to why this is occurring and on how to make this more consistent.

My first inclination is to scale the population variables so that they match the implied population in the household sizes, but I'm curious whether anyone has dealt with this differently.
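As a rough sketch of what that scaling could look like, the Python below proportionally rescales a set of population variables so that they sum to the implied household population. The age-group breakdown is hypothetical (it is not from the block group above, apart from summing to 1454), and `scale_to_target` is a made-up helper. The target of 1394 is the implied total for sizes 1-6, which is only a complete target when there are no 7+ households.

```python
def scale_to_target(values, target):
    """Proportionally rescale a dict of counts so they sum to `target`."""
    current_total = sum(values.values())
    if current_total == 0:
        raise ValueError("cannot scale a total of zero")
    factor = target / current_total
    return {key: value * factor for key, value in values.items()}


# Hypothetical population variables for illustration only; they sum to
# the reported household population of 1454.
population_by_age = {"under_18": 354, "18_to_64": 900, "65_plus": 200}

# Implied population from the households-by-size counts (sizes 1-6).
implied_total = 1394

scaled = scale_to_target(population_by_age, implied_total)
for group, value in scaled.items():
    print(f"{group}: {value:.1f}")
```

The scaled values are left as floats here; rounding them to whole people while preserving the total would need an extra step.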


  • The discrepancy here is likely the result of differences between household and person weights. The total population number (1454) is estimated using person weights. Person weights have been adjusted (post-stratified) to match independent population estimates of key demographic groups. This process may cause respondents within the same household to have different weights.

    When you estimate a total population by adding the number of 1-person households, plus 2x the number of 2-person households, plus 3x the number of 3-person households, and so on, this is effectively equivalent to estimating the population with person weights that are all equal to the respondent's household weight.

    I hope this helps.
  • Hi Scott -
    I first came across this problem about 10 years ago. It stems from controlling routines (my understanding is that it stems from the technique used to control group quarters). In fact, this problem was such an issue for the work that I was doing that I wrote a paper describing a technique for creating outside-of-ACS estimates of households by household size.

    The solution you use will be dependent upon your end goal with the data. (For example, do you need a robust estimate of household by size for some infrastructure planning?)

    Anyway - I'm happy to share more information. Please feel free to send me an email (bjarosz (at) prb (dot) org) and I'll share insights and papers.
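The first reply's point about person weights versus household weights can be illustrated with a toy example. The Python below uses entirely made-up weights for three sample households. It is not ACS microdata and not the Census Bureau's weighting procedure, just a sketch of why the two totals can diverge once person weights within a household stop matching the household weight.

```python
# Toy sample: each household has one household weight and one person
# weight per member. All numbers are invented for illustration.
sample_households = [
    {"hh_weight": 10.0, "person_weights": [10.0]},
    {"hh_weight": 10.0, "person_weights": [10.0, 12.0]},
    {"hh_weight": 8.0, "person_weights": [8.0, 9.0, 11.0]},
]

# Population estimated the way the published total is described in the
# first reply: sum of (post-stratified) person weights.
person_weighted_population = sum(
    weight for hh in sample_households for weight in hh["person_weights"]
)

# Population implied by households-by-size: every member effectively
# carries the household weight, i.e. hh_weight * household size.
household_implied_population = sum(
    hh["hh_weight"] * len(hh["person_weights"]) for hh in sample_households
)

print("person-weighted population:", person_weighted_population)
print("household-implied population:", household_implied_population)
print("gap:", person_weighted_population - household_implied_population)
```

The two totals agree only when every person weight equals the weight of that person's household, which the adjustment described in the first reply does not guarantee.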