Using PUMS data for PUMAs that belong to only one county

Hello! I am using the 2008-2012 5-year PUMS data and have run into something strange: when I sum the weights (variable PWGTP in the dataset) for the PUMAs that belong to a specific county, the result does not match the known total population of that county (compared with the ACS 2008-2012 estimates).
Could anyone advise whether I am using the weights properly? The PUMS file covers the PUMAs of one whole state, and I am trying to extract only the PUMAs in the county I need. Is this possible, or is there a problem because the PUMS data are defined at the state level?
  • You are totally right! But I am currently working on Oklahoma County, which, according to the 2012 TIGER shapefile for PUMA10 in the state of Oklahoma, includes 6 PUMAs (I also checked this against other sources). So I downloaded the .csv file of the 2008-2012 PUMS for Oklahoma; this file has two columns, one for PUMA00 and one for PUMA10. Whether or not I combine the two columns into one to get all the existing PUMAs, the odd thing in my estimates is this:
    1) when I sum all the weights in the whole dataset, I correctly reproduce the total population of the state, but
    2) when I extract a small number of PUMAs (6) and sum the weights (PWGTP variable) for only the six PUMAs I am interested in (Oklahoma County), I get a population that is completely wrong compared with other official data for this county.
    So I am now trying to understand whether:
    a) I have made a mistake in the calculations when extracting the 6 PUMAs, or
    b) the PUMS data are defined at the state level and are not representative at the scale of a single PUMA.
    I am sorry to be a bother, but your help is really valuable to me!
    Thank you in advance!
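
    To make possibility (a) concrete, here is a minimal sketch of the extraction in Python, not a definitive method. It assumes that each person record carries a real code in only one of the two columns (PUMA00 or PUMA10) and a placeholder such as -9 in the other; that is an assumption to check against the PUMS data dictionary. The PUMA codes and weights in the sample are made up for illustration and are not the real Oklahoma County codes, and `sum_weights` is a hypothetical helper name.

    ```python
    import csv
    import io


    def sum_weights(rows, puma10_codes, puma00_codes):
        """Sum PWGTP over records whose PUMA10 or PUMA00 code is in the given sets.

        Returns (weighted population, number of matching records).
        """
        total = 0
        matched = 0
        for row in rows:
            # Compare codes as integers so that "01001" and "1001" are treated alike.
            p10 = int(row["PUMA10"])
            p00 = int(row["PUMA00"])
            if p10 in puma10_codes or p00 in puma00_codes:
                total += int(row["PWGTP"])
                matched += 1
        return total, matched


    # Made-up sample: -9 stands in for "no code in this column" (an assumption).
    SAMPLE = """PUMA00,PUMA10,PWGTP
    400,-9,10
    400,-9,5
    -9,1001,20
    -9,1002,7
    500,-9,3
    """

    # Strip the indentation used for display before parsing the sample.
    clean = "\n".join(line.strip() for line in SAMPLE.strip().splitlines())
    rows = list(csv.DictReader(io.StringIO(clean)))

    # Filtering on both columns picks up every record of the county.
    both = sum_weights(rows, puma10_codes={1001, 1002}, puma00_codes={400})
    # Filtering on PUMA10 alone silently drops the records coded only in PUMA00.
    only10 = sum_weights(rows, puma10_codes={1001, 1002}, puma00_codes=set())

    print(both)    # (42, 4)
    print(only10)  # (27, 2)
    ```

    If the county total comes out far too low, comparing the two results this way shows whether records coded only in the other PUMA column are being dropped. Note that this only works if the PUMA00 areas chosen cover the same territory as the county, which is a separate thing to verify.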