Discrepancy in population totals between DP05 and summed tract-level values from B01001 tables?

I’ve recently identified an unusual discrepancy in city population values between ACS tables and would be grateful for anyone's expertise in examining this issue. I’ve summarized the problem below.

 We are interested in the total population of each city, both overall and for each of the following racial/ethnic groups: Black/African American (alone), American Indian and Alaska Native (alone), Asian (alone), Native Hawaiian and Other Pacific Islander (alone), some other race (alone), two or more races, white (alone, not Hispanic or Latino), and Hispanic or Latino.

 In short, we calculated 2016 city population values in three different ways using ACS 5-year estimates:

1)    Using reported values in Total Population of DP05

2)    Using Tract140 data from race-specific B01001B-I tables:

  1. Summing the tract-level population of each racial/ethnic group, by city, to calculate the total population of Black/African American, AIAN, Asian, etc. persons in a city;
  2. Summing those racial/ethnic group totals within a city to calculate the total population of each city.


3)    Using Tract140 data from B01001:

  1. Summing the total number of persons in each tract to estimate the total population in the city
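
The tract-to-city summing in analyses (2) and (3) can be sketched roughly as follows. This is only a minimal illustration, not the SAS syntax used for the original analysis: the city names, tract IDs, and population figures are made up, and the tract-to-city assignment is assumed to already exist. Note that a tract is counted in full here even if only part of it lies inside the city.

```python
from collections import defaultdict

# Hypothetical tract records: (city, tract ID, B01001 total population estimate).
# These figures are invented for illustration; they are not ACS values.
tracts = [
    ("City A", "00000000101", 3200),
    ("City A", "00000000102", 4100),
    ("City B", "00000000201", 2750),
]

def sum_by_city(records):
    """Add up tract-level estimates for each city."""
    totals = defaultdict(int)
    for city, _tract_id, estimate in records:
        totals[city] += estimate
    return dict(totals)

city_totals = sum_by_city(tracts)
print(city_totals)
```

The result depends entirely on which tracts are assigned to which city, so the assignment step is worth checking before comparing the sums with DP05.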



There is noticeable variation when comparing the values from analyses (1) and (2), both in terms of total population and by racial/ethnic group. While some DP05/B01001B-I values are equivalent, some values in B01001B-I are 50% (or more) greater than their corresponding values in DP05. I’ve done a cursory analysis of lower confidence limits for the summed B01001B-I values and don’t see a pattern of overlap with DP05. There is no apparent pattern by state or population size in the discrepancies we’re seeing, aside from a consistent overestimate in B01001B-I relative to DP05. This overestimate occurs both at the total-population level and in stratified comparisons of individual racial/ethnic groups.  I can forward output from proc univariate, scatterplots, histograms, etc. by email if these would be helpful (my email is Miriam.gofine@nyumc.org).

 Of note, analysis (2) above (summing tracts in B01001B-I to generate a city-level population) was performed in the course of a city/tract racial segregation analysis. I performed analysis (3) to understand the relationships between estimates in different tables after we identified the discrepancy between (1) and (2). The totals from analysis (2) also do not equal the tract sums from B01001 in analysis (3). To illustrate this, the values for Birmingham AL are as follows:

Total Population

  • DP05: total population = 212,424
  • Sum of tract values in B01001B-I = 394,480
  • Sum of tract values in B01001 = 400,463 (MOE of derived estimate = 4,262.1)


Black (Table B01001B)

  • DP05 (HISPANIC OR LATINO AND RACE > Not Hispanic or Latino > Black): total population = 152,326
  • Sum of tract values in B01001B = 206,075 (MOE of derived estimate = 3,594.7)
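
For reference, a margin of error for a summed estimate is commonly approximated as the square root of the sum of the squared component MOEs. The sketch below assumes that approximation (the post does not say how the derived MOEs above were computed) and then checks whether the DP05 total for Birmingham falls inside the interval around the B01001 tract sum. The function names and the two example tract MOEs are hypothetical.

```python
import math

def derived_sum_moe(moes):
    """Approximate MOE of a sum: square root of the summed squared MOEs."""
    return math.sqrt(sum(m * m for m in moes))

def interval(estimate, moe):
    """Lower and upper bounds of estimate +/- MOE."""
    return (estimate - moe, estimate + moe)

def within(value, bounds):
    """True if value lies inside the closed interval given by bounds."""
    low, high = bounds
    return low <= value <= high

# Hypothetical tract MOEs, only to show the arithmetic.
example_moe = derived_sum_moe([300.0, 400.0])

# Birmingham figures quoted in the post: B01001 tract sum and its derived MOE,
# compared against the DP05 total population.
bounds = interval(400_463, 4262.1)
dp05_inside = within(212_424, bounds)
print(example_moe, bounds, dp05_inside)
```

With the Birmingham figures, the DP05 total is far outside the interval, so sampling error alone does not account for the gap.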


I can provide additional analyses on the relationship between (2) and (3) as needed.

Finally, there is always the possibility that I’m using incorrect tables or have an error in my syntax. I should note here that I’ve validated the DP05 values in the data I downloaded from AFF and read into SAS against the values reported in American FactFinder’s online tables. I’ve also checked the syntax for summing within and across tracts against a manual calculation in Excel.


--> Does anyone have thoughts on what might explain these findings?

  • Are you using non-Hispanic Black, non-Hispanic Asian, etc. when summing with Hispanic/Latino? If not, you may be double-counting some individuals.
  • In reply to Jlee:

    Thanks for your comment. That is a great point and I'll make sure we're not double-counting.

    However, the issues I've described above are present even in stratified analyses of black alone, white alone, summed "AIAN + 2 or more + sum other race", summed "Asian+NHOPI". In/excluding Hispanic is not what's driving the issue.
  • City boundaries do not necessarily coincide with Census geography (tracts, block groups, etc.) - could that be your issue?
  • In reply to JamiRae:

    Wait, that is probably not your issue. But "alone" does not mean "not Hispanic"; it means someone indicated white only and no other race. So you could be double-counting, as Jlee points out.
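
The double-counting point raised in the replies can be illustrated with a small made-up population. In the sketch below the counts are hypothetical; the only assumption about the tables is the one discussed above: the "alone" race tables (e.g. B01001B, B01001D) include Hispanic members of that race, B01001H is white alone and not Hispanic, and B01001I is Hispanic or Latino of any race.

```python
# Hypothetical population: (race category, is_hispanic, count).
people = [
    ("Black alone", False, 900),
    ("Black alone", True, 100),
    ("Asian alone", False, 450),
    ("Asian alone", True, 50),
    ("White alone", False, 2000),
    ("White alone", True, 500),
]

true_total = sum(n for _, _, n in people)

# "Alone" race tables count everyone of that race, Hispanic or not.
black_alone = sum(n for race, _, n in people if race == "Black alone")
asian_alone = sum(n for race, _, n in people if race == "Asian alone")
# White alone, not Hispanic or Latino.
white_not_hispanic = sum(
    n for race, hisp, n in people if race == "White alone" and not hisp
)
# Hispanic or Latino of any race.
hispanic = sum(n for _, hisp, n in people if hisp)

summed_groups = black_alone + asian_alone + white_not_hispanic + hispanic
overcount = summed_groups - true_total
print(true_total, summed_groups, overcount)
```

The overcount equals the number of Hispanic persons in the Black alone and Asian alone groups, since they appear once in their race table and again in the Hispanic table.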