Clean data downloads?

I'm a non-profit researcher and policy analyst and I feel like I'm constantly banging my head against a wall to download ACS data in a usable format. I have decent statistical analysis and data analysis chops but I am not a coder.

For example -- I'm currently trying to download S2701, Selected Characteristics of Health Insurance Coverage in the US. I just want to produce an excel spreadsheet where I can quickly reference the percent of residents in a certain age group who are uninsured, by county. But I cannot figure out how to download the excel data in a way that the spreadsheet is usable. It produces really gnarly tables with tons of merged cells, and it's impossible to use that format of table and then sort the data or use it for a vlookup. I need to pull the 254 counties in my state which is too much to go through manually.

I want to produce a table where I can have the counties in column A and the values in column B. Maybe margin of error in column C. Instead I have the county name every 10th cell down in column A.

Is there a fix that doesn't involve me writing macros or coding to get usable data?

Parents
  • OK, I do things a little differently, and will someday use R etc. I have excel sheets for each table and fields I want to pull down. For S2701 (which is a good choice) I have a list of all the fields (from Census data, advanced, table download - Col Metadata file) and then I have a little formula. So I just put the field names in rows (up to 49 names) and the formula creates the API for those fields for all counties, all states and the US, it takes about 1 minute. For example; (all counties)

    https://api.census.gov/data/2021/acs/acs5/subject?get=S2701_C01_001E,S2701_C01_017E,S2701_C01_023E,S2701_C01_024E,S2701_C01_032E,S2701_C01_035E,S2701_C05_001E,S2701_C05_002E,S2701_C05_011E,NAME&for=county:*

    S2701_C01_001E Estimate!!Total!!Civilian noninstitutionalized population
    S2701_C01_017E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!Black or African American alone
    S2701_C01_023E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!Hispanic or Latino (of any race)
    S2701_C01_024E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!White alone, not Hispanic or Latino
    S2701_C01_032E Estimate!!Total!!Civilian noninstitutionalized population!!NATIVITY AND U.S. CITIZENSHIP STATUS!!Foreign born
    S2701_C01_035E Estimate!!Total!!Civilian noninstitutionalized population!!DISABILITY STATUS!!With a disability
    S2701_C05_001E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population
    S2701_C05_002E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population!!AGE!!Under 6 years
    S2701_C05_011E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population!!AGE!!Under 19 years

    I can send you the file, you can use it for any table it takes 5 minutes to get the metadata file copy it into the sheet and then choose which fields you want.  There are 305 fields in S2701, otherwise, I'd just paste the API for all fields here. I think for short pulls (no heavy lifting) this works great. For heavy lifting tidycensus and other solutions that this group knows in their sleep are best.

    Again, I can send you the file, let me know and good luck.

    Tom

Reply
  • OK, I do things a little differently, and will someday use R etc. I have excel sheets for each table and fields I want to pull down. For S2701 (which is a good choice) I have a list of all the fields (from Census data, advanced, table download - Col Metadata file) and then I have a little formula. So I just put the field names in rows (up to 49 names) and the formula creates the API for those fields for all counties, all states and the US, it takes about 1 minute. For example; (all counties)

    https://api.census.gov/data/2021/acs/acs5/subject?get=S2701_C01_001E,S2701_C01_017E,S2701_C01_023E,S2701_C01_024E,S2701_C01_032E,S2701_C01_035E,S2701_C05_001E,S2701_C05_002E,S2701_C05_011E,NAME&for=county:*

    S2701_C01_001E Estimate!!Total!!Civilian noninstitutionalized population
    S2701_C01_017E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!Black or African American alone
    S2701_C01_023E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!Hispanic or Latino (of any race)
    S2701_C01_024E Estimate!!Total!!Civilian noninstitutionalized population!!RACE AND HISPANIC OR LATINO ORIGIN!!White alone, not Hispanic or Latino
    S2701_C01_032E Estimate!!Total!!Civilian noninstitutionalized population!!NATIVITY AND U.S. CITIZENSHIP STATUS!!Foreign born
    S2701_C01_035E Estimate!!Total!!Civilian noninstitutionalized population!!DISABILITY STATUS!!With a disability
    S2701_C05_001E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population
    S2701_C05_002E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population!!AGE!!Under 6 years
    S2701_C05_011E Estimate!!Percent Uninsured!!Civilian noninstitutionalized population!!AGE!!Under 19 years

    I can send you the file, you can use it for any table it takes 5 minutes to get the metadata file copy it into the sheet and then choose which fields you want.  There are 305 fields in S2701, otherwise, I'd just paste the API for all fields here. I think for short pulls (no heavy lifting) this works great. For heavy lifting tidycensus and other solutions that this group knows in their sleep are best.

    Again, I can send you the file, let me know and good luck.

    Tom

Children
No Data