The ACS Office at the Census Bureau is currently testing a new format for the ACS Summary File, which is a comma-delimited text file that contains all the Detailed Tables for the ACS.
Information about the proposed updates to the ACS Summary File is available on the Census Bureau's website.
We are starting this new Discussion Thread so that ACS data users can post any comments or questions about the proposed changes. ACS Summary File users are also encouraged to participate in the webinar scheduled for this afternoon on this topic.
This seems like something we could adapt to fairly readily.
I'd like to make a plea for structured metadata which is published in something other than a variety of XLSX files. Things that application…
As a longtime ACS Summary File user, this is a huge and welcome change. Perhaps the best improvement is having column headers in the data files. This not only reduces the complexity in using the files…
The FTP site includes a file that I think is the complete data file (acsdt5y2018.zip) --but it's listed as 11 gigabytes. After unzipping, that's a TON of data to sift through. I also appreciate having…
As someone who uses SAS to build datasets from the raw ACS data files and perform subsequent data analysis, I would strongly advise against naming the fields/variables with an "E" or "M" at the end of the name. This would make it more difficult to use a range of variables in calculations, for example, when collapsing a table into broader categories like age groups, educational attainment, etc. So instead of fields/variables formatted like this:
B01001_001E B01001_001M B01001_002E B01001_002M B01001_003E B01001_003M
I would suggest a naming convention more like this:
B01001_E001 B01001_M001 B01001_E002 B01001_M002 B01001_E003 B01001_M003
Just my .02
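For anyone who wants the E/M-before-the-number layout regardless of what ends up in the files, the rename is mechanical once a header row exists. Here is a minimal Python sketch, assuming headers follow the pattern in the examples above (table ID, underscore, three-digit line number, then E or M). The function name `to_prefix_style` and the regular expression are illustrative only, not anything published by the Census Bureau, and the table ID pattern is a guess based on IDs like B01001 and B01001E.

```python
import re

# Assumed header shape: table ID (letter, digits, optional letter suffix),
# underscore, three-digit line number, then E (estimate) or M (margin of error).
_SUFFIX_STYLE = re.compile(r"^(?P<table>[A-Z]\d+[A-Z]*)_(?P<line>\d{3})(?P<kind>[EM])$")


def to_prefix_style(name):
    """Move the trailing E/M in front of the line number.

    Names that do not match the assumed pattern are returned unchanged.
    """
    m = _SUFFIX_STYLE.match(name)
    if m is None:
        return name
    return f"{m['table']}_{m['kind']}{m['line']}"


headers = ["GEO_ID", "B01001_001E", "B01001_001M", "B01001_002E", "B01001_002M"]
renamed = [to_prefix_style(h) for h in headers]
print(renamed)
```

Columns that are not estimate or MOE fields (the hypothetical `GEO_ID` above) pass through untouched, so the function can be applied to an entire header row.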
It'll be a bit confusing since some tables have letter suffixes (like B01001E_001). Someone could easily confuse B01001E_E001 with B01001_E001. Not to say it shouldn't be done, but something to consider. Also, it would kill continuity of column headers with previous years' data (which I assume will not be re-released in the new format). Also, I've been using this data since the beginning of ACS, and I still think "Error" and not "Estimate" every time I see that E. Am I the only one?
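On the B01001E_E001 versus B01001_E001 confusion: the two names are distinct to a program even if they blur together for a reader, so one option is to parse headers into labeled parts rather than eyeballing them. Below is a rough Python sketch that accepts either naming style. The `Header` type, the `parse_header` function, and the assumed table ID pattern (one letter, digits, optional letter suffix) are all illustrative assumptions, not part of any Census Bureau specification.

```python
import re
from typing import NamedTuple, Optional


class Header(NamedTuple):
    base: str    # base table ID, e.g. "B01001"
    suffix: str  # letter suffix on the table ID, "" if none
    kind: str    # "E" for estimate, "M" for margin of error
    line: int    # line number within the table


# Accepts both TABLE_E001 (E/M first) and TABLE_001E (E/M last).
_PATTERN = re.compile(r"^([A-Z]\d+)([A-Z]*)_(?:([EM])(\d{3})|(\d{3})([EM]))$")


def parse_header(name: str) -> Optional[Header]:
    """Split a column header into its parts, or return None if it does not match."""
    m = _PATTERN.match(name)
    if m is None:
        return None
    base, suffix, kind_first, line_first, line_last, kind_last = m.groups()
    return Header(base, suffix, kind_first or kind_last, int(line_first or line_last))


for name in ["B01001E_E001", "B01001_E001", "B01001E_001E"]:
    print(name, parse_header(name))
```

With the table suffix and the E/M flag held in separate fields, a lookalike pair such as B01001E_E001 and B01001_E001 differs in `suffix` alone, which is easy to check for in code.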
You're not the only one!
In reference to your statement,
"Also, it would kill continuity of column headers with previous years' data (which I assume will not be re-released in the new format).",
were the previous data ever released with the "E" or "M" appended to the end of the field/variable name? We use custom SAS programs to build the SAS datasets from the raw ACS data (not the CB-provided SAS programs). I don't recall any of the previous data files having a header row with field/variable names. From the CB-provided SAS programs, it appears the variables are named in the xxxe001 manner and not as xxx001e, but I could be mistaken. Regardless, it looks like any change that includes both the estimates and MOEs in the same data will necessitate naming the variables in such a way that it may "break" continuity with previous data releases, unless the end-user built the datasets to account for that.
Good point; my bad. The E/M was made as a prefix to the data filenames and worksheet tab name in the data templates. So it will necessitate a change, as you say.