Hello,
I’m the City demographer with the City of Seattle and am looking at per capita income estimates for the City of Seattle from the 2021 ACS. These estimates indicate that among the 50 most populous cities in the U.S., Seattle had the highest per capita income after San Francisco. I am wondering if the Census Bureau uses top coding, systematic data swapping, or other methods for processing responses from individuals who have very high incomes, or if the Census Bureau uses some other procedure for handling very high incomes when computing estimates in regularly published tables for per capita income, aggregate income, aggregate earnings, and so forth. I recognize that top coding is applied in the Public Use Microdata Samples. However, I am asking instead about the regularly published tables that aggregate data from respondents.
I've also submitted this as a question to census.askdata@census.gov, but am not sure how long it will take them to get back to me, so figured I'd post it here in case someone here knows the answer. I'll also respond to my question here when I get an answer from the Bureau.
Thank you!
-Diana
Dear Diana,
In general, the ACS PUMS data and the data used in the "B", "S", and "DP" tables come from the same source data. For top coding, the PUMS codebook is a useful place to start. The disclosure avoidance techniques used in ACS tables and PUMS data are different, however.
For income, there are summary statistics that don't use top coding. For example, tables S1901 and S1902 report mean income for households. These tables also report the number of households. For a given geography, if you multiply the number of households by the mean household income reported in these tables, you get the "total pot of money" earned by households in that geography. You can then divide by the total population in households for the geography to get the per capita household income. There is also B19301, which reports per capita income directly for the total population, not just households. However, S1902 reports mean income for various categories if you want a population breakdown.
Note that "Households" does not include everybody. There are also "Group Quarters," for example nursing homes, college dorms, etc. (see table S2601A or B26001). For accounting purposes, B26001 gives the group quarters population down to the census tract.
Using mean income avoids the top coding issue. Except for a few statistics, "disclosure avoidance" is applied to the entries in tables. This is complicated, and one of the techniques applied is "top coding." A general article to get you oriented is here: https://www.census.gov/newsroom/blogs/random-samplings/2022/12/disclosure-avoidance-protections-acs.html
The complete details are here
https://nces.ed.gov/FCSM/pdf/spwp22.pdf
For tables, chapter II, part (D) is a good place to start.
PUMS data is in a separate chapter.
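The household arithmetic described above (mean household income times number of households, divided by the population in households) can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this post, the input figures are invented and are not ACS estimates, and deriving the household population as total population minus group quarters population is my assumption about how you would do the accounting with B26001.

```python
def aggregate_household_income(mean_household_income: float, households: int) -> float:
    """The "total pot of money": mean household income times number of households."""
    return mean_household_income * households


def household_population(total_population: int, group_quarters_population: int) -> int:
    """Population living in households (assumed: total minus group quarters)."""
    return total_population - group_quarters_population


def per_capita_household_income(
    mean_household_income: float, households: int, population_in_households: int
) -> float:
    """Aggregate household income divided by the population in households."""
    if population_in_households <= 0:
        raise ValueError("population_in_households must be positive")
    total = aggregate_household_income(mean_household_income, households)
    return total / population_in_households


# Invented figures for illustration only (not real ACS values).
pop_in_households = household_population(total_population=5_200, group_quarters_population=200)
print(per_capita_household_income(100_000.0, 2_000, pop_in_households))  # 40000.0
```

Keep in mind the result is a per capita figure for the household population only, so it will not necessarily match the per capita income for the total population reported in B19301.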
The best place to email for information about the specific techniques used with the ACS is acso.users.support@census.gov. They usually respond the same day, and you avoid the handoffs that occur if you use census.askdata@census.gov.
As an FYI, you should review how the ACS collects and defines household income.
Also, "Family Income" is different from "Household Income." Family members are a subgroup of household members, since not all household members are family members (check the definitions). The ACS has the concept of a "householder" (informally, the "head of household"), who is "Person 1" on the ACS form (I think). Family members need to be related to the householder.
Dave
Wow, always learning something new from reading these posts. I knew about S1901 and S1902. Did not know about B19301, but did know there can be quite a difference between mean and per capita. Also did not know about S2601A and B26001, though I did know households are not the full population. Now I know the specifics if I ever want to look. More than my little mind can handle right now, but I've put them in my notes.