CHAS vs ACS for housing problems

I'm wondering if anyone is familiar with the CHAS data set (a special tabulation of ACS data for HUD). 

I'd prefer to use ACS data directly to calculate the CHAS estimate of housing problems (defined as a housing unit experiencing at least 1 of 4 housing problems: overcrowding, cost burden, lack of plumbing facilities, or lack of kitchen facilities). However, it seems like CHAS is the only way to do so because it offers an unduplicated count of the units experiencing at least 1 of the 4 housing problems, whereas with ACS data there's potential overlap in the count of households (since they're separate questions). For example, tabulating from ACS, if a unit lacked plumbing, was overcrowded, and was cost burdened, it would appear that 3 units had problems, whereas it should be counted as a single unit with at least 1 of the 4 problems.
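To make the overlap concrete, here is a minimal sketch with made-up unit-level records. The field names and the three toy units are assumptions for illustration only, not actual ACS or CHAS variables. It compares summing the four separate problem counts against counting each unit once if it has any problem.

```python
# Hypothetical unit-level records; flag names are illustrative, not real ACS/CHAS fields.
units = [
    {"id": 1, "cost_burden": True,  "overcrowded": True,  "no_plumbing": True,  "no_kitchen": False},
    {"id": 2, "cost_burden": True,  "overcrowded": False, "no_plumbing": False, "no_kitchen": False},
    {"id": 3, "cost_burden": False, "overcrowded": False, "no_plumbing": False, "no_kitchen": False},
]

PROBLEMS = ("cost_burden", "overcrowded", "no_plumbing", "no_kitchen")

# What separate published tables give you: one count per problem.
per_problem_counts = {p: sum(u[p] for u in units) for p in PROBLEMS}

# Adding those counts double-counts units with more than one problem.
naive_total = sum(per_problem_counts.values())

# The unduplicated count: each unit counted once if it has at least one problem.
units_with_any = sum(any(u[p] for p in PROBLEMS) for u in units)

print(per_problem_counts)
print("sum of separate counts:", naive_total)
print("units with at least 1 of 4 problems:", units_with_any)
```

The unduplicated count needs the flags on the same record, which is why it cannot be recovered from separate pre-tabulated counts alone.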

My conclusion was that CHAS avoids this and therefore I shouldn't/cannot use ACS, but I'm wondering if there's a way around it? Anyone else have experience with CHAS and know if I'm on the right track?

  • If the housing problem tabulation is your variable of interest, then yes, CHAS makes compiling that data very elegant because it filters out the overlapping responses. It can be done manually using a microdata file, but then you're unable to assess any areas smaller than a PUMA. It all really depends on your exact research question. Happy to discuss further.

  • I'm also curious, do you know if it would be appropriate to calculate the MOEs for US estimates of housing problems by race/ethnicity in the CHAS data? Following the ACS handbook I know we can aggregate MOEs by geography or by subgroup but this would be across both (income, renter vs owner, and then by state) which I've never done before...

  • Hi Elise,

    I work for HUD and managed the development of the ACS-based CHAS data. As Bryan said (thanks, Bryan!), the CHAS data are perfect for the scenario you described. With standard ACS tables there's no way to eliminate duplicative housing problems. Since you're interested in state and US level estimates, you could also use PUMS data to come up with your own estimates, but that might be harder depending on how familiar you are with PUMS data and Census variable definitions.

    Regarding your second question: yes, you can calculate MOEs in the CHAS data. The raw data files HUD posts include MOEs. If you use the CHAS data to create derived estimates, you would need to calculate a derived MOE, the same way you would with regular ACS data (see chapter 8 of the handbook you referenced).

    Feel free to reach out if you have questions or want to discuss your results.


  • Thanks for your reply Paul! Glad to have you as a resource.

    I'm wondering if aggregating MOEs across both geography and groups is a valid way to calculate the CIs for US estimates by race? So combining all states, all income levels, and owner+renter for each race/ethnicity? I haven't seen an example of aggregating across both (for ACS in general) for the derived calculations.
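For reference, the approximation formulas that ACS guidance gives for derived MOEs can be sketched as below. This is a sketch, not HUD's or the Census Bureau's code: the function names and the example numbers are made up. The sum formula treats the component estimates as independent, so the result becomes a rougher approximation as more cells (states, income levels, tenure) are combined.

```python
import math

def moe_of_sum(moes):
    """Approximate MOE of a sum (or difference) of estimates:
    square root of the sum of squared component MOEs."""
    return math.sqrt(sum(m ** 2 for m in moes))

def moe_of_proportion(num, den, moe_num, moe_den):
    """Approximate MOE of a proportion num/den, where num is a subset of den."""
    p = num / den
    radicand = moe_num ** 2 - (p ** 2) * (moe_den ** 2)
    if radicand < 0:
        # When the value under the root is negative, the guidance is to
        # fall back to the ratio formula, which adds instead of subtracts.
        radicand = moe_num ** 2 + (p ** 2) * (moe_den ** 2)
    return math.sqrt(radicand) / den

# Made-up cells for one group, e.g. two states: (estimate, MOE).
cells = [(1000, 30), (2000, 40)]
total = sum(est for est, _ in cells)
total_moe = moe_of_sum(moe for _, moe in cells)

# Made-up numerator: units with a housing problem within that total.
share_moe = moe_of_proportion(num=300, den=total, moe_num=40, moe_den=total_moe)

print(total, total_moe)
print(round(share_moe, 4))
```

Published ACS MOEs are at the 90 percent confidence level, so the estimate plus or minus the derived MOE gives an approximate 90 percent confidence interval.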

  • MOE and CI have limited usefulness in something like the ACS, where you never really know how things are weighted, which is something sampling error can't address. Just as an example, I did an exercise for a conference showing tracts for a city with foreign-born residents, and one tract had exactly one foreign-born person, born in Colombia. Obviously, it is impossible under any sampling scenario to find the characteristics of exactly one person in a tract. I get more confident when I see lots of examples of a pattern in different areas and in different years. Then it seems real to me even if the exact numbers are obviously in doubt, especially in small areas.

  • Thanks very much for all your work on the CHAS data, Paul. Do you know when the 2013-2017 CHAS release will be coming out? That information would help me plan a couple projects that could benefit from it. 

