# Distribution of Occupied Housing Units by number of persons in household

Can someone help me understand what's going on here?

I created the table below from the PUMS 2018 1yr housing file showing the number of Occupied Housing Units by number of persons in the household (variable NP).

I would expect to get the total population living in households by summing the product of occupied housing units and the number of persons (the last column in the table).

But it doesn't: the products total 301,348,072 persons, whereas the PUMS person file estimates 319,075,830 persons in occupied housing units. A difference of over 17 million is too large to ignore, which makes me doubt this distribution.

Thank you and I look forward to your insights.

Occupied Housing Units by Number of Persons in Household (NP)
Weighted housing file, PUMS 2018 1yr

| NP | Occupied Housing Units (OccHU) | Household Population (NP x OccHU) |
|---:|---:|---:|
| 1 | 34,064,779 | 34,064,779 |
| 2 | 41,606,974 | 83,213,948 |
| 3 | 18,801,990 | 56,405,970 |
| 4 | 15,325,904 | 61,303,616 |
| 5 | 7,157,039 | 35,785,195 |
| 6 | 2,791,217 | 16,747,302 |
| 7 | 1,009,842 | 7,068,894 |
| 8 | 425,925 | 3,407,400 |
| 9 | 176,614 | 1,589,526 |
| 10 | 81,548 | 815,480 |
| 11 | 38,768 | 426,448 |
| 12 | 20,448 | 245,376 |
| 13 | 8,708 | 113,204 |
| 14 | 4,465 | 62,510 |
| 15 | 2,671 | 40,065 |
| 16 | 728 | 11,648 |
| 17 | 1,381 | 23,477 |
| 18 | 223 | 4,014 |
| 20 | 961 | 19,220 |
| Total | 121,520,185 | 301,348,072 |
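
For anyone who wants to check the arithmetic, here is a minimal Python sketch that rebuilds the last column and the totals from the weighted counts in the table. The variable names are mine, and the person-file figure is simply the 319,075,830 quoted above, not something recomputed from the PUMS files.

```python
# Weighted occupied housing units by household size (NP), copied from the table.
occ_hu_by_np = {
    1: 34_064_779, 2: 41_606_974, 3: 18_801_990, 4: 15_325_904,
    5: 7_157_039, 6: 2_791_217, 7: 1_009_842, 8: 425_925,
    9: 176_614, 10: 81_548, 11: 38_768, 12: 20_448,
    13: 8_708, 14: 4_465, 15: 2_671, 16: 728,
    17: 1_381, 18: 223, 20: 961,
}

# Person-file estimate quoted in the post (taken as given, not recomputed here).
PERSON_FILE_TOTAL = 319_075_830

total_units = sum(occ_hu_by_np.values())
implied_population = sum(size * units for size, units in occ_hu_by_np.items())
gap = PERSON_FILE_TOTAL - implied_population

print(f"Occupied housing units: {total_units:,}")         # 121,520,185
print(f"Implied household pop.: {implied_population:,}")  # 301,348,072
print(f"Gap vs. person file:    {gap:,}")                 # 17,727,758
```

So the table itself is internally consistent; the discrepancy is between the housing-file distribution and the person-file total.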

• Hi Joseph -

One (albeit minor) issue is that the 20 is a top-code, so a household in that 20 category could be 20 people, or it could be 99 people. (But really, that's only a piece of what's going on here.)
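
To put a rough bound on the top-code effect, here is a small Python sketch using only the figures from the post. It assumes, purely for illustration, that every household in the 20 category could really hold 99 people; that ceiling is just the number mentioned above, not a documented limit.

```python
TOP_CODE = 20
TOP_CODED_UNITS = 961      # weighted households in the NP = 20 row
ASSUMED_MAX_SIZE = 99      # illustrative ceiling only, not a documented cap
GAP = 319_075_830 - 301_348_072  # person-file total minus NP x OccHU total

# Most persons the top-code could be hiding under the assumed ceiling.
max_extra_persons = TOP_CODED_UNITS * (ASSUMED_MAX_SIZE - TOP_CODE)
share_of_gap = max_extra_persons / GAP

# Average size the top-coded households would need to explain the whole gap.
size_needed = TOP_CODE + GAP / TOP_CODED_UNITS

print(f"Max extra persons from top-coding: {max_extra_persons:,}")  # 75,919
print(f"Share of the gap: {share_of_gap:.2%}")
print(f"Average size needed to close the gap: {size_needed:,.0f}")
```

Even under that extreme assumption the top-code accounts for well under 1% of the gap, which is why it can only be a small piece of the story.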

There are some known issues with the households-by-size distribution in ACS, mostly due to independent person- and household-level controlling. We had a thread on this issue a few months ago that may be useful for you:
acsdatacommunity.prb.org/.../566

In addition, PUMS has privacy protection measures that add noise to certain variables. Those measures may be part of the problem here.

Hope this helps. And I'm happy to chat more about this. It's an issue I've been wrangling with for years.
• I think your calculation assumes that each person has equal weight, or at least that the average weight in each of the categories is equal to 1. For example, the household population in 2-person households = 2 * 41,606,974 * the average weight of the persons in 2-person households. If the average weight is not equal to 1 in each category, you get a different total.
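
A toy Python sketch of the weighting point, in case it helps. I am assuming the housing-unit weight is WGTP and the person weight is PWGTP, linked by SERIALNO, as in the PUMS data dictionary; the records and weights below are invented for illustration and are not real PUMS data.

```python
# Hypothetical household records: (SERIALNO, NP, WGTP)
households = [
    ("A", 2, 100),
    ("B", 3, 80),
]

# Hypothetical person records: (SERIALNO, PWGTP)
persons = [
    ("A", 100), ("A", 120),
    ("B", 80), ("B", 95), ("B", 70),
]

# Housing-file approach: every person inherits the household weight.
housing_based = sum(np_ * wgtp for _, np_, wgtp in households)

# Person-file approach: every person carries their own weight.
person_based = sum(pwgtp for _, pwgtp in persons)

print(housing_based)                 # 440
print(person_based)                  # 465
print(person_based - housing_based)  # 25
```

The two totals agree only when the person weights in each household add up to NP times the household weight, which is the implicit assumption behind the NP x OccHU column.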