I am a relatively new user of PUMS data. I am using the 2016 1-year PUMS population data, which I subset to the individual records for one county within the age group I am interested in (the dataset ffx). I need to estimate the county's population by race and to calculate a margin of error for each estimate. Since the margin of error depends on the sampling design, I assumed that the PUMS sample was stratified by state, PUMA, and housing unit, and wrote the following code to get a 90% confidence interval; I can then take the margin of error as the half-width of that interval. Could anyone tell me whether my understanding of the PUMS sampling design is correct and whether the following SAS code is correct? Thank you so much!
proc surveyfreq data = ffx;
  table county*Race / CLWT CL ALPHA = 0.1;
  strata ST PUMA SERIALNO;
run;
The IPUMS-USA site has a helpful summary of using the PUMS replicate weights: https://usa.ipums.org/usa/repwt.shtml
The example given on that page uses PROC SURVEYREG, and it uses the IPUMS-USA variable names rather than the Census Bureau's variable names. To adapt the PROC SURVEYFREQ code you've written here, you can just drop the STRATA statement, since the replicate weights take the place of explicit design variables, and use the REPWEIGHTS statement found in the IPUMS-USA example, along with a WEIGHT statement for the person weight:
proc surveyfreq data = ffx;
  table county*Race / CLWT CL ALPHA = 0.1;
  weight PWGTP;
  repweights pwgtp1-pwgtp80 / jkcoefs=0.05;
run;
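For context on where jkcoefs=0.05 comes from: as I understand the Census Bureau's PUMS accuracy documentation, the variance of an estimate is computed from the 80 replicate estimates with a factor of 4/80 = 0.05. A sketch of the formula follows. The symbols are my own notation, not the Census Bureau's: X is the estimate computed with PWGTP, and X_r is the same estimate computed with the r-th replicate weight (PWGTP1 through PWGTP80).

```latex
\widehat{\mathrm{Var}}(X) = \frac{4}{80} \sum_{r=1}^{80} \left( X_r - X \right)^2,
\qquad
\mathrm{SE}(X) = \sqrt{\widehat{\mathrm{Var}}(X)},
\qquad
\mathrm{MOE}_{90\%}(X) = 1.645 \times \mathrm{SE}(X)
```

One caveat, which I believe is right but is worth checking against the SAS documentation: SAS builds its confidence limits from a t distribution rather than the normal value 1.645, so the half-width of the interval it reports may be slightly larger than the margin of error above. If you want to match the Census convention exactly, multiply the reported standard error by 1.645 rather than halving the interval.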
I hope this helps!
In reply to Matt Schroeder: