Clean data downloads?

I'm a non-profit researcher and policy analyst and I feel like I'm constantly banging my head against a wall to download ACS data in a usable format. I have decent statistical analysis and data analysis chops but I am not a coder.

For example -- I'm currently trying to download S2701, Selected Characteristics of Health Insurance Coverage in the US. I just want to produce an excel spreadsheet where I can quickly reference the percent of residents in a certain age group who are uninsured, by county. But I cannot figure out how to download the excel data in a way that the spreadsheet is usable. It produces really gnarly tables with tons of merged cells, and it's impossible to use that format of table and then sort the data or use it for a vlookup. I need to pull the 254 counties in my state which is too much to go through manually.

I want to produce a table where I can have the counties in column A and the values in column B. Maybe margin of error in column C. Instead I have the county name every 10th cell down in column A.

Is there a fix that doesn't involve me writing macros or coding to get usable data?

Parents
  • I have three suggestions.

    1. For your immediate problem of health insurance info for Texas (the only state with 254 counties) use the SAHIE interactive data tool. That's from a separate Census program aimed specifically at health insurance. The table underneath the map should give you exactly what you want. 

    SAHIE (census.gov)

    2. Use Excel's PowerQuery (also known as get and transform data) subsystem to format the spreadsheet the way you wanted. I'm guessing (could be wrong, you didn't say) that you entered S2701 into data.census.gov, set the geography to counties, picked Texas, and then clicked the transpose button to get the counties into columns and that messed up the formatting. I'm suggesting you download the table as Excel without the transpose, open it in Excel PowerQuery, delete the headers that would break the formatting there, and then use PowerQuery for the transpose. FWIW is a fantastic tool for cleaning up spreadsheets and there are lots of short tutorials on Youtube and the Internet on how to use it. Also good for doing SQL-type joins through a GUI without any coding.

    3. Use one of the Census API packages in R or Python. I am very fond of tidycensus in R which can produce a dataframe that contains both data and map info (latter optional). You can then export that dataframe as Excel or CSV. That only involves a wee bit of coding (no algorithms). It's really well documented at tidycensus (walker-data.com)

Reply
  • I have three suggestions.

    1. For your immediate problem of health insurance info for Texas (the only state with 254 counties) use the SAHIE interactive data tool. That's from a separate Census program aimed specifically at health insurance. The table underneath the map should give you exactly what you want. 

    SAHIE (census.gov)

    2. Use Excel's PowerQuery (also known as get and transform data) subsystem to format the spreadsheet the way you wanted. I'm guessing (could be wrong, you didn't say) that you entered S2701 into data.census.gov, set the geography to counties, picked Texas, and then clicked the transpose button to get the counties into columns and that messed up the formatting. I'm suggesting you download the table as Excel without the transpose, open it in Excel PowerQuery, delete the headers that would break the formatting there, and then use PowerQuery for the transpose. FWIW is a fantastic tool for cleaning up spreadsheets and there are lots of short tutorials on Youtube and the Internet on how to use it. Also good for doing SQL-type joins through a GUI without any coding.

    3. Use one of the Census API packages in R or Python. I am very fond of tidycensus in R which can produce a dataframe that contains both data and map info (latter optional). You can then export that dataframe as Excel or CSV. That only involves a wee bit of coding (no algorithms). It's really well documented at tidycensus (walker-data.com)

Children