Hello, I want to know how I can access a table or set of tables at state - county - census track - block group level detail (and census blocks if possible) with information related to basic census stats such as total population, sex, age, income, (and retail sales if possible). My purpose is to ingest this information to BigQuery for different business and geographical analysis, so I would also want to know if it's possible to get the latitude and longitude at this level of detail.
I know this level of detail would result in an insane amount of data, and it will probably require pulling the data from different tables and ingesting it in many partitions. In case it's not feasible to get the information I need in a decent number of csv/excel files and API requests are the best bet, I would like to know if there is a suggested methodology to run this heavy ETL, such as looping through geography codes, using the * wildcard to select all codes for specific geographies (the ones where it applies), a combination of both, or any other approach.
My best guess is looping through geographic codes but I don't know the code ranges, so in case you guys suggest going to the API request with looping codes I would highly appreciate if you can share where to find that total code list for states, counties, census tracks, block groups and census blocks.
Thank you!
Ricardo.
Generally the GEOID (FIPS) codes can be found in the TIGER/Line shapefiles. They contain data fields with the geocodes and the latitude and longitude of the "internal point" of the geographic polygon (INTPTLAT, INTPTLON): https://www.census.gov/geographies/mapping-files/time-series/geo/tiger-line-file.2022.html
The technical documentation is at https://www.census.gov/programs-surveys/geography/technical-documentation/complete-technical-documentation/tiger-geo-line.2022.html#list-tab-TN2BGQZWFO8ATUC9LB
I don't believe that there is an API for the TIGER/Line files. They can be downloaded as zip files from the Census website; https://www2.census.gov/geo/tiger/TIGER2022/ has the most recent files (post 2020 census).
For example, for Massachusetts census tracts: https://www2.census.gov/geo/tiger/TIGER2022/TRACT/tl_2022_25_tract.zip (25 is the Massachusetts FIPS code).
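To loop over states, the download URL can be built from the state FIPS code. This is only a sketch that follows the pattern of the Massachusetts URL above; the function name and the short list of sample FIPS codes are made up for illustration, and the pattern is assumed to hold for the other states in the 2022 TRACT directory.

```python
# Assumption: every state's 2022 tract file follows the same naming pattern
# as the Massachusetts example (tl_2022_<state FIPS>_tract.zip).
TIGER_2022_TRACT_DIR = "https://www2.census.gov/geo/tiger/TIGER2022/TRACT"


def tract_zip_url(state_fips) -> str:
    """Build the 2022 tract shapefile URL for one state FIPS code."""
    # FIPS codes are two-digit strings, so 6 must become "06".
    code = str(state_fips).zfill(2)
    return f"{TIGER_2022_TRACT_DIR}/tl_2022_{code}_tract.zip"


if __name__ == "__main__":
    # Sample codes only: 06 = California, 25 = Massachusetts, 29 = Missouri.
    for fips in ["06", "25", "29"]:
        print(tract_zip_url(fips))
```

The full list of state codes would still have to come from one of the geocode lists mentioned in this thread.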
The smallest geography with ACS data is the "block group" (census blocks are smaller, but they are tabulated only in the decennial census). A census tract might have 1-4 block groups or so.
Passing on a former colleague’s advice for familiarizing yourself with what’s available— look at the list of tables and then you’ll know the table ID which helps (no retail sales, sorry)
https://www.census.gov/programs-surveys/acs/technical-documentation/table-shells.html
The files with the list of various geocodes, by geography type:
www.census.gov/.../gazetteer-files.2021.html
Factfinder was nice. It’s been gone for years now. Replaced by data.census.gov
www.census.gov/.../transition-from-aff.html
Also, for business stats you might look at County/ZIP Code Business Patterns. There is lots of munginess from nondisclosure, so read up, but you generally get the number of businesses by type, employees, and payroll: www.census.gov/.../cbp.html
IPUMS NHGIS provides online and API access to nationwide data files at the block group and block level. You can find and select any set of available summary tables and download them all in one file (per source dataset). Block data are available only for decennial censuses, most recently the 2020 census, and only for a limited set of subjects, so you can't, for example, get income info at the block level. The U.S. Census Bureau collects information on income and many other characteristics through the ACS, and block groups are the lowest level for which ACS data are reported. NHGIS also has the ACS block group data, and you can get it in nationwide files for multiple tables without having to loop through geographies or merge tables together.
I don't know of any data source for retail sales for small areas.
For geographic coordinates for block groups, I recommend using the Census Bureau's centers of population. NHGIS also provides these centers as shapefiles. For blocks or other levels, if you get NHGIS tables for a decennial census (not ACS), the data file should include longitude and latitude for a central point within each area. NHGIS also provides nationwide shapefiles of block group polygons and state-extent shapefiles of block polygons.
Thank you David. When I open the TIGER file I see many files with weird extensions: https://ibb.co/Jyt2rJb
Which one has the shape data, and how can I open it? Also, is it possible to have the state-county-track-block group details in a single file?
Thanks Jonathan, your information has been very useful for what I really need
These files are all parts of a single shapefile. In your example, tl_2022_25_tract is the shapefile name, and the six items are pieces of it. The DBF is a database, the SHP contains the geometry, the PRJ is the projection data, and the others are various kinds of metadata.
You would use a GIS program to open a shapefile. ArcGIS, QGIS, etc.
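If only the attribute table is needed (the geocodes and the INTPTLAT / INTPTLON fields), the DBF part can also be read without a GIS program. Below is a rough sketch using only the Python standard library. It assumes the plain dBASE III layout (32-byte header, 32-byte field descriptors ending with a 0x0D byte, fixed-width records) and UTF-8 text; the function name is made up, and all values are returned as strings.

```python
import struct


def read_dbf_attributes(data: bytes, encoding: str = "utf-8") -> list:
    """Parse raw .dbf bytes into a list of {field name: string value} dicts."""
    # Header bytes 4-11: record count (uint32), header size and
    # record size (uint16 each), all little-endian.
    n_records, header_len, record_len = struct.unpack_from("<IHH", data, 4)

    # Field descriptors start at byte 32 and end at the 0x0D terminator.
    fields = []
    pos = 32
    while data[pos] != 0x0D:
        name = data[pos:pos + 11].split(b"\x00", 1)[0].decode("ascii")
        length = data[pos + 16]  # field width in bytes
        fields.append((name, length))
        pos += 32

    rows = []
    for i in range(n_records):
        start = header_len + i * record_len
        record = data[start:start + record_len]
        if record[:1] == b"*":  # first byte "*" marks a deleted record
            continue
        offset = 1  # skip the deletion-flag byte
        row = {}
        for name, length in fields:
            raw = record[offset:offset + length]
            row[name] = raw.decode(encoding, errors="replace").strip()
            offset += length
        rows.append(row)
    return rows


# Usage sketch (assumes the zip from the example above was saved locally):
#   import zipfile
#   with zipfile.ZipFile("tl_2022_25_tract.zip") as z:
#       rows = read_dbf_attributes(z.read("tl_2022_25_tract.dbf"))
```

The resulting rows can be written out as CSV for loading elsewhere; the polygon geometry in the SHP part still needs a GIS program or library.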
jose_vides said: is it possible to have the state-county-track-block group details in a single file
It's tract (not "track") and no, the Bureau shapefiles include only a single geography type.
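That said, the GEOID of a lower-level geography embeds the codes of its parents, so the state, county and tract can be recovered from a block group record. A minimal sketch, assuming the standard 12-digit block group GEOID (2 digits state + 3 county + 6 tract + 1 block group); the function name and the sample GEOID are made up for illustration.

```python
def split_block_group_geoid(geoid: str) -> dict:
    """Split a 12-digit block group GEOID into its component codes."""
    if len(geoid) != 12 or not geoid.isdigit():
        raise ValueError(f"expected a 12-digit block group GEOID, got {geoid!r}")
    return {
        "state": geoid[0:2],        # 2-digit state FIPS
        "county": geoid[2:5],       # 3-digit county FIPS within the state
        "tract": geoid[5:11],       # 6-digit tract code within the county
        "block_group": geoid[11],   # 1-digit block group within the tract
    }


if __name__ == "__main__":
    # Illustrative value only, not looked up in any file.
    print(split_block_group_geoid("250250001001"))
```

Keep GEOIDs as strings when loading them anywhere, since leading zeros are significant.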
Jonathan Schroeder said: I don't know of any data source for retail sales for small areas.
Some states report taxable sales. At least, I know Missouri does, by city and Standard Industry Codes:
https://dor.mo.gov/public-reports/#pubtax
But I don't know of a central national source for such data.
Jose might find it easier to get internal point ("centroid") coordinates from the Bureau's gazetteer files. These are just flat files and don't require a GIS program to access.
https://www.census.gov/geographies/reference-files/time-series/geo/gazetteer-files.html
We also have these at MCDC: https://mcdc.missouri.edu/cgi-bin/uexplore?/data/gazeteer
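Since the gazetteer files are flat text, pulling out the coordinates takes only a few lines. This is a sketch under assumptions: it treats the file as tab-delimited with a header row containing GEOID, INTPTLAT and INTPTLONG columns (check the header of the file you download), and the sample row is made up.

```python
import csv
import io


def read_gazetteer_points(text: str) -> list:
    """Return [{'geoid', 'lat', 'lon'}, ...] from gazetteer file text."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    # Strip padding around column names (assumed possible in these files).
    header = [name.strip() for name in next(reader)]
    points = []
    for record in reader:
        if not record:  # skip blank lines
            continue
        row = dict(zip(header, (value.strip() for value in record)))
        points.append({
            "geoid": row["GEOID"],  # keep as string to preserve leading zeros
            "lat": float(row["INTPTLAT"]),
            "lon": float(row["INTPTLONG"]),
        })
    return points


if __name__ == "__main__":
    # Made-up sample in the assumed layout, not copied from a real file.
    sample = (
        "USPS\tGEOID\tALAND\tINTPTLAT\tINTPTLONG   \n"
        "MA\t25001010100\t1000\t42.059829\t-70.200407  \n"
    )
    print(read_gazetteer_points(sample))
```

The output rows are already in a shape that can be written to CSV and loaded into a table keyed on GEOID.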