Top-coded values

Has anyone ever attempted to "fill in" the top-coded values for the 5-year ACS PUMS data? For example, in 2009, all house values (variable: valp) greater than 4 million dollars in Hawaii get cut off, so it's impossible to tell whether the observation you're looking at is a 50 million dollar home or a 4.01 million dollar home, since both look the same. The top-code cutoffs differ for every year-by-state combination, and I haven't had much luck recovering the values above the cutoff. I've mostly been running linear regressions on the data and am just trying to get a reasonable estimate for each top-coded observation. Any insight would be greatly appreciated.

[Updated on 2/24/2015 3:08 PM]
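
One way to make the problem concrete is to treat the top-coded records as right-censored and fit a tail distribution to the values just below the cutoff. The sketch below is not a method proposed in this thread; it assumes, purely for illustration, that house values above some threshold follow a Pareto distribution. The function name, the threshold, and the sample numbers are all hypothetical. It ignores the PUMS weights and returns a single replacement value per cutoff rather than a separate estimate for each observation.

```python
import math


def pareto_topcode_mean(values, topcode, threshold):
    """Estimate E[X | X > topcode], ASSUMING a Pareto tail above `threshold`.

    Any record at or above `topcode` is treated as censored: we only know
    that its true value is at least `topcode`.
    Returns (alpha, conditional_mean).
    """
    if not 0 < threshold < topcode:
        raise ValueError("need 0 < threshold < topcode")

    # Only the upper tail is used to fit the Pareto shape parameter.
    tail = [v for v in values if v >= threshold]
    n_uncensored = sum(1 for v in tail if v < topcode)

    # Censored maximum likelihood: each top-coded record contributes
    # log(topcode / threshold) instead of its unknown true log-ratio.
    log_sum = sum(math.log(min(v, topcode) / threshold) for v in tail)
    if n_uncensored == 0 or log_sum <= 0:
        raise ValueError("not enough uncensored tail values to fit")

    alpha = n_uncensored / log_sum
    if alpha <= 1:
        # A Pareto tail with alpha <= 1 has no finite mean.
        raise ValueError("fitted tail is too heavy for a finite mean")

    # Mean of a Pareto variable conditional on exceeding the top-code.
    return alpha, alpha * topcode / (alpha - 1)


if __name__ == "__main__":
    # Hypothetical values for one state-year cell, not real ACS data.
    sample = [350_000, 800_000] + [1_500_000] * 6 + [4_000_000]
    alpha, fill_value = pareto_topcode_mean(
        sample, topcode=4_000_000, threshold=1_000_000
    )
    print(f"alpha = {alpha:.3f}, fill-in value = {fill_value:,.0f}")
```

Because the cutoff differs by year and state, a function like this would have to be run separately for each year-by-state cell, and the result is sensitive to the choice of threshold and to the Pareto assumption itself.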
  • Referring to a second data source would be my advice as well.

    In some states you can access local tax assessor records for an approximate home valuation. (Assessed value is never the same as market value, but in some states the two are closer than in others.)

    But the technique also depends on your ultimate goal for the data. If you're trying to estimate some sort of regression formula linking household characteristics to the value of the home, I'd be a bit skeptical of using modeled inputs (i.e., filled-in top-coded values) in conjunction with survey data.