Proposed changes to the ACS Summary File format

The ACS Office at the Census Bureau is currently testing a new format for the ACS Summary File, which is a comma-delimited text file that contains all the Detailed Tables for the ACS.  

The proposed updates to the ACS Summary File are described on the Census Bureau's website.

We are starting this new Discussion Thread so that ACS data users can post any comments or questions about the proposed changes. ACS Summary File users are also encouraged to participate in the webinar on this topic scheduled for this afternoon.

Reply
  • This seems like something we could adapt to fairly readily. 

    I'd like to make a plea for structured metadata that is published in something other than a variety of XLSX files. These are things application builders need to know that may be taken for granted in data analysis use cases:

    • table name
    • table universe
    • column name
    • data type (int/float, or possibly count/median/etc)
    • parent/child relationships between columns (e.g. these children should sum to this parent)
    • geographies which are categorically excluded from a given table (basically Appendix B from this page on Data Suppression)
    • the character encoding used for text (only applies to geoheaders and metadata, but it's important)
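To make the request concrete, here is a minimal sketch of what one machine-readable table record could look like. Everything in it is an assumption for illustration: the class names, field names, table ID, and values are hypothetical and do not describe any existing or proposed Census Bureau format.

```python
import json
from dataclasses import asdict, dataclass, field
from typing import Dict, List, Optional


@dataclass
class ColumnMeta:
    column_id: str
    name: str
    data_type: str                   # e.g. "count" or "median" (hypothetical vocabulary)
    parent_id: Optional[str] = None  # column whose value the children should sum to


@dataclass
class TableMeta:
    table_id: str
    name: str
    universe: str
    encoding: str                    # character encoding of text fields
    excluded_geographies: List[str] = field(default_factory=list)
    columns: List[ColumnMeta] = field(default_factory=list)

    def children_of(self, parent_id: str) -> List[str]:
        """Return the IDs of columns that declare parent_id as their parent."""
        return [c.column_id for c in self.columns if c.parent_id == parent_id]


def children_sum_to_parent(table: TableMeta, values: Dict[str, int], parent_id: str) -> bool:
    """Check that the child columns of parent_id add up to the parent value."""
    children = table.children_of(parent_id)
    return sum(values[c] for c in children) == values[parent_id]


# Placeholder table, not a real ACS table.
example = TableMeta(
    table_id="T0001",
    name="Example table",
    universe="Example universe",
    encoding="utf-8",
    excluded_geographies=["example summary level"],
    columns=[
        ColumnMeta("T0001_001", "Total", "count"),
        ColumnMeta("T0001_002", "Group A", "count", parent_id="T0001_001"),
        ColumnMeta("T0001_003", "Group B", "count", parent_id="T0001_001"),
    ],
)

# Made-up estimates for a single geography.
row = {"T0001_001": 100, "T0001_002": 60, "T0001_003": 40}

if __name__ == "__main__":
    print(json.dumps(asdict(example), indent=2))
    print(children_sum_to_parent(example, row, "T0001_001"))
```

With the parent/child links stored explicitly, an application can validate rows or build nested displays without hand-parsing column labels.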

    and some things which would be really nice to have

    • whether a table is new or changed since the last release
    • clearer articulation of data suppressed on a per-geography level, currently just represented by blank values
    • which ACS question(s) are the source of the data for a given table
    • something which helps map when a table universe is a proper subset of another table, like table/column (I know not all universes are so straightforward)
    • a better explanation of the prefix part of geoheaders, specifically the "M4/M5" geographic variants used for CBSAs and CSAs, which map to specific delineation vintages, but not in a way that is made clear to data users
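On the suppression point, the sketch below shows how an application currently has to treat blank values in a comma-delimited file: the most it can do is record the value as missing, since the file does not say why it is blank. The column names (`GEO_ID`, `EST_001`, `EST_002`) and the sample rows are made up for illustration and are not taken from the actual Summary File layout.

```python
import csv
import io
from typing import Dict, Optional


def parse_estimates(text: str) -> Dict[str, Dict[str, Optional[float]]]:
    """Parse comma-delimited estimates, turning blank cells into None."""
    reader = csv.DictReader(io.StringIO(text))
    parsed: Dict[str, Dict[str, Optional[float]]] = {}
    for record in reader:
        geo_id = record.pop("GEO_ID")
        parsed[geo_id] = {
            # A blank cell carries no reason code, so None is all we can store.
            col: float(val) if val is not None and val.strip() else None
            for col, val in record.items()
        }
    return parsed


# Hypothetical sample: the second geography has blank (suppressed) values.
SAMPLE = "GEO_ID,EST_001,EST_002\nG1,100,40\nG2,,\n"

if __name__ == "__main__":
    for geo_id, estimates in parse_estimates(SAMPLE).items():
        print(geo_id, estimates)
```

An explicit per-geography suppression flag or reason code would let the `None` here be replaced with something a user interface could actually explain.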

    Sorry if this is just hijacking the thread...

Children
  • I think it is hijacking the thread, since the changes to the format of the data files won't affect the metadata files, but I like a lot of these ideas, and I think they'd be very much worth discussing in a separate thread.