The ACS Office at the Census Bureau is currently testing a new format for the ACS Summary File, which is a comma-delimited text file that contains all the Detailed Tables for the ACS.
Information about the proposed updates to the ACS Summary File is available on the Census Bureau's website.
We are starting this new Discussion Thread so that ACS data users can post any comments or questions about the proposed changes. ACS Summary File users are also encouraged to participate in the webinar scheduled for this afternoon on this topic.
This seems like something we could adapt to fairly readily.
I'd like to make a plea for structured metadata which is published in something other than a variety of XLSX files. Things that application…
As a longtime ACS Summary File user, this is a huge and welcome change. Perhaps the best improvement is having column headers in the data files. This not only reduces the complexity in using the files…
The FTP site includes a file that I think is the complete data file (acsdt5y2018.zip), but it's listed as 11 gigabytes. After unzipping, that's a TON of data to sift through. I also appreciate having…
As someone who uses SAS to build datasets from the raw ACS data files and perform subsequent data analysis, I would strongly advise against naming the fields/variables with an "E" or "M" at the end of the name. This would make it more difficult to use a range of variables in calculations, for example, when collapsing a table into broader categories like age groups, educational attainment, etc. So instead of fields/variables formatted like this:
B01001_001E  B01001_001M  B01001_002E  B01001_002M  B01001_003E  B01001_003M
I would suggest a naming convention more like this:
B01001_E001  B01001_M001  B01001_E002  B01001_M002  B01001_E003  B01001_M003
Just my .02
Maybe even something like:
B01001_E_001
B01001_E_002
B01001_E_003
B01001_M_001
B01001_M_002
B01001_M_003
B01001A_E_001
B01001A_E_002
...would accommodate SAS users while still reducing confusion
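For anyone who wants to try the alternative names against the test files, here is a rough Python sketch of the renaming. It assumes the current-style names always look like a table ID, an underscore, a zero-padded line number, and a trailing E or M; the function name `rename_acs_column`, the `sep` parameter, and the `GEO_ID` column in the sample list are made up for illustration and are not part of any Census Bureau tool or file layout.

```python
import re

# Assumed shape of the current-style names, e.g. B01001_001E or B01001A_002M:
# table ID, underscore, zero-padded line number, then E (estimate) or M (margin of error).
_ACS_NAME = re.compile(r"^([A-Z]\d+[A-Z]*)_(\d+)([EM])$")


def rename_acs_column(name: str, sep: str = "") -> str:
    """Move the E/M flag in front of the line number.

    sep=""  gives B01001_E001  (first suggestion in this thread)
    sep="_" gives B01001_E_001 (second suggestion in this thread)
    """
    match = _ACS_NAME.match(name)
    if match is None:
        # Leave anything that does not look like a table cell untouched.
        return name
    table_id, line, kind = match.groups()
    return f"{table_id}_{kind}{sep}{line}"


# Hypothetical header row; GEO_ID stands in for any non-table column.
columns = ["GEO_ID", "B01001_001E", "B01001_001M", "B01001A_002E"]
renamed = [rename_acs_column(c) for c in columns]
renamed_with_sep = [rename_acs_column(c, sep="_") for c in columns]
```

With either output style the line number ends the name, which is what a range of variables in SAS would rely on.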
I do like the underscore to separate the table id from the table item and I do prefer the table item padded with zeros. I could go either way with the second underscore between the E/M and the table item.
We should also identify any name length restrictions of any software packages users are using to work with the data and variable names that may exceed these limits. For example, I believe old DBF files had a 10-character field name restriction.
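A quick way to check candidate names against such limits is sketched below in Python. The only limit filled in is the 10-character DBF figure mentioned above; the dictionary `NAME_LIMITS` and the helper `names_exceeding` are hypothetical names, and limits for other packages would need to be added from their own documentation.

```python
# Field-name length limits by format. Only the DBF figure from this thread is
# included; add other packages after checking their documentation.
NAME_LIMITS = {"dbf": 10}


def names_exceeding(names, limit):
    """Return the names that are longer than the given character limit."""
    return [name for name in names if len(name) > limit]


# One example of each naming style discussed in this thread.
candidates = ["B01001_001E", "B01001_E001", "B01001_E_001", "B01001A_E_002"]

too_long = {
    fmt: names_exceeding(candidates, limit) for fmt, limit in NAME_LIMITS.items()
}
```

Under that assumed 10-character limit, every style in the list is too long, including the current one, so the extra underscore would not be what breaks DBF users.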
Sure. I merely added the second underscore as a possible mitigation for the confusion problem Bernie mentioned, one that would still be usable in SAS programs with only minor modification.