Languages data for population ages 5 and older, with breakdown of numbers by ability to speak English categories, for Queens, NYC (PUMS)


I would like to retrieve the following dataset from MDAT, but I'm having trouble confirming that my results are accurate:

2020 5-year ACS PUMS

Weight: Person

Variables: Language other than English spoken at home, Population ages 5+ only, Ability to speak English

Geographies: Queens Community Districts (PUMA), NYC

I appreciate any guidance or feedback that you have! This has been my workflow in MDAT:

1) Select Variables:

AGEP selected

Within age, I specify ages 5-99

ENG selected (Ability to speak English)

LANP selected (Language Other than English spoken at home)

2) Select geographies:

State: NY

All community districts in Queens selected

I then go to Customize Table and export the data to a CSV.

Ideally, I would like to format the data in this way:

Language | Population age 5+ | Proficient in English (Speaks English Very Well + Speaks English Well) | Limited English Proficient (Speaks English Not Well + Speaks English Not at All)

Just wanted to verify if this methodology is correct. Thanks!
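
For anyone who wants to cross-check the MDAT export against the raw PUMS person file, here is a minimal sketch of the same tabulation. It assumes the rows come from a PUMS person CSV already filtered to the Queens PUMAs and uses the standard PUMS column names (AGEP, ENG, LANP, PWGTP). The `tabulate` function and the sample records are hypothetical and only illustrate the grouping; the language codes are placeholders, not real LANP values.

```python
from collections import defaultdict

# ENG codes in the ACS PUMS data dictionary:
# 1 = Very well, 2 = Well, 3 = Not well, 4 = Not at all
PROFICIENT = {"1", "2"}
LIMITED = {"3", "4"}


def tabulate(rows):
    """Sum person weights (PWGTP) by language (LANP) and proficiency group."""
    table = defaultdict(lambda: {"total": 0, "proficient": 0, "limited": 0})
    for r in rows:
        if int(r["AGEP"]) < 5:
            continue  # keep population ages 5+ only
        eng = r["ENG"].strip()
        if eng not in PROFICIENT | LIMITED:
            continue  # blank ENG: speaks only English at home
        weight = int(r["PWGTP"])
        cell = table[r["LANP"]]
        cell["total"] += weight
        cell["proficient" if eng in PROFICIENT else "limited"] += weight
    return dict(table)


# Made-up records, purely for illustration (placeholder language codes).
sample = [
    {"AGEP": "34", "ENG": "1", "LANP": "lang_a", "PWGTP": "20"},
    {"AGEP": "61", "ENG": "3", "LANP": "lang_a", "PWGTP": "15"},
    {"AGEP": "3", "ENG": "", "LANP": "lang_a", "PWGTP": "10"},
    {"AGEP": "25", "ENG": "2", "LANP": "lang_b", "PWGTP": "12"},
    {"AGEP": "70", "ENG": "4", "LANP": "lang_b", "PWGTP": "8"},
    {"AGEP": "40", "ENG": "", "LANP": "", "PWGTP": "30"},
]

result = tabulate(sample)
for language, cell in sorted(result.items()):
    print(language, cell["total"], cell["proficient"], cell["limited"])
```

With a real file, the rows could come from `csv.DictReader`, which yields dictionaries in the same shape as the sample. Note that the "total" here counts only people who speak a language other than English at home, which matches one row per language in the table above.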



  • I think your data setup is fine. I'm always a bit wary of using results at the PUMA level, especially when you crosstab by ability to speak English. If you use IPUMS, the estimates will have confidence intervals, whereas I don't think the MDAT app offers this. Not sure if you're concerned about this or not.

  • Thanks! I generally avoid using PUMA-level data too, but detailed language data is not available at the tract level. Thanks for the recommendation not to crosstab PUMA-level data by ability to speak English. I've looked at IPUMS, but navigating the site has been a learning curve for me. Will look into it. Thanks again!
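
On the confidence-interval point: if you work from the raw PUMS person file rather than MDAT, a margin of error can be approximated with the 80 replicate weights (PWGTP1 through PWGTP80) that ship with the file. The sketch below assumes you have already computed the same weighted count once with PWGTP and once with each replicate weight. The function names are hypothetical, and the numbers in the example are made up.

```python
import math


def replicate_se(estimate, replicate_estimates):
    """Standard error from ACS PUMS replicate weights.

    SE = sqrt(4/80 * sum((replicate - estimate)^2)) over the 80 replicates.
    """
    if len(replicate_estimates) != 80:
        raise ValueError("expected 80 replicate estimates")
    squared_diffs = sum((r - estimate) ** 2 for r in replicate_estimates)
    return math.sqrt(4 / 80 * squared_diffs)


def moe90(se):
    """90 percent margin of error, the level the Census Bureau publishes."""
    return 1.645 * se


# Made-up example: a full-weight estimate of 1000 people, with replicate
# estimates scattered 10 above or below it.
estimate = 1000
replicates = [1010] * 40 + [990] * 40

se = replicate_se(estimate, replicates)
print(round(se, 2), round(moe90(se), 2))
```

For small cells, such as a single language crossed with a single English-ability category in one PUMA, the margin of error can be a large share of the estimate, which is the reason for caution at this geography.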