I've run into a bit of a conundrum when comparing one and five year results for Table B25038 TENURE BY YEAR HOUSEHOLDER MOVED INTO UNIT.
I am comparing values for the City of Cambridge, Massachusetts after calculating the percentage of householders who moved into their units in the prior two years, the most recent category provided.
For the period 2017 through 2021 I found the following proportions using one year data starting with 2017: 44%, 31%, 44%, null (no table for 2020), and 45%.
From the 2017-2021 ACS the same table generates the value of 17% for the same category.
Comparing published values, the 2021 ACS reports that 23,719 householders moved into their unit since 2019, whereas the 2017-21 five year ACS reports 8,192. Similar discrepancies exist for 2017, 2018, and 2019.
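As a quick sanity check on the size of the gap, here is the arithmetic using only the figures quoted above. The implied totals are rough, since they are backed out of percentages rounded to whole numbers.

```python
# Published counts quoted above for Cambridge, MA
# (table B25038, most recent move-in category).
one_year_2021 = 23_719      # 2021 1 year ACS
five_year_2017_21 = 8_192   # 2017-21 5 year ACS

# Ratio of the 1 year count to the 5 year count.
ratio = one_year_2021 / five_year_2017_21

# Rough implied totals of occupied units, backed out of the
# rounded shares reported above (45% and 17%).
implied_total_1yr = one_year_2021 / 0.45
implied_total_5yr = five_year_2017_21 / 0.17

print(f"1 year / 5 year count ratio: {ratio:.2f}")
print(f"implied occupied units, 1 year: {implied_total_1yr:,.0f}")
print(f"implied occupied units, 5 year: {implied_total_5yr:,.0f}")
```

The count ratio comes out at roughly 2.9, while the implied totals differ by far less, so the gap sits in the most recent category itself and not in the household base.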
While I do not expect one year and five year ACS data to match or have a simple mathematical relationship to one another, the level of the discrepancy is unusual in my experience. Have others had similar observations?
I've dug into this some more. I missed question 3 on the ACS form: "When did PERSON 1 (listed on page 2) move into this house, apartment, or mobile home?" I took the vintage 2021 B25038 1 year and 5 year tables and added together the corresponding "renter occupied" and "owner occupied" rows to get a "total number of households" number for each "look back" interval. I then compared the resulting 1 year and 5 year tables. There is quite a discrepancy for the "2019 or later" cell.
I just wrote an email to ACSO user support with the two tables. While making the calculations I came up with two thoughts. There could be a "recall bias" because in different survey years people have to remember back different lengths of time. Depending on how much time has passed since the event, people can make different degrees of error when they report their last move. Another potential bias could be due to the issues with the 2020 "Covid" year and the problems with producing weights for the estimates, since household visits were curtailed during the pandemic. I'll let you know what I hear back.
Since I'm a mathematician I need to think about this some. I would start by ignoring the sampling aspect of the ACS, and for simplicity assume that the ACS is collected on a single day each year. With a 2 year look back, a household would then show up as a recent mover in two consecutive 1 year datasets, provided that it doesn't move often. This simple description of double counting assumes that a person only moved once in the relevant look back interval. Since Cambridge has a fairly large population on one year leases, a significant population has probably moved twice in the 2 year look back interval. In any case, for the 1 year dataset there is some double counting.
Another thing is that the average of the ratios of counts is not the ratio of the averages of the counts. It sounds like you are working with the underlying counts, so you haven't fallen into that trap.
I assume that for the 5 year dataset there is a correction for the 2 year look back interval, but I can't find any documentation about how the 5 year "flow" estimates are compiled. The underlying statistical theory deals with stochastic processes and ergodic flows. This theory gives the relationship between "space" (single point in time) averages, as is (nearly) done for the 1 year dataset, averages across 1 year datasets for different years, and the across-time averages used in the 5 year datasets. The decennial census is a single-point-in-time count (1 April of the census year).
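To make the point about ratios of counts concrete, here is a minimal sketch with made-up counts for two hypothetical years. The numbers are not ACS figures; they are chosen only so that the two summaries differ visibly.

```python
# Hypothetical counts for two years, for illustration only (not ACS data).
recent_movers = [100, 300]    # householders in the most recent move-in category
all_households = [200, 1000]  # total occupied units

# Average of the yearly ratios.
yearly_shares = [m / t for m, t in zip(recent_movers, all_households)]
average_of_ratios = sum(yearly_shares) / len(yearly_shares)

# Ratio of the averages (identical to the ratio of the sums).
ratio_of_averages = sum(recent_movers) / sum(all_households)

print(f"average of ratios: {average_of_ratios:.3f}")
print(f"ratio of averages: {ratio_of_averages:.3f}")
```

The two summaries agree only when the denominators are the same in every year, which is why it is safer to work from the underlying counts.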
I'm trying to find a simple paper/webpage that gives an example of how this works. The Census recommends against comparing multiple overlapping ACS 5 year estimates. This also applies to comparing multiple 1 year estimates to 5 year ACS estimates. You can do it but the mathematics is complicated and you need to account for population movement over time. I'll keep looking for an article on this. A similar type of situation occurs in the intercensal estimates of the US population. Some of the kabuki appears here:
A knee jerk reaction to the numbers is that the 2 year look back accounts for the factor of about 2 in your calculation of counts. We may be able to contact the expert at the Census Bureau who knows how they compute the ACS 5 year estimates.
Another question that I often ask people when I am working with them is "What is the scientific question that you are trying to answer?" Not to be too humorous but one question could be how many garbage trucks you need on September 2nd to pick up all the junk that people put out on the sidewalk over the weekend. In that case the 2 year look back estimate may not be the number you want. You might be tempted to divide the 1 year ACS estimate by 2 to account for the 2 year look back, but the result will be too small, because some people will move twice during the 2 year interval. I used to be a student and I lived in Cambridge, so I know what things look like on moving day!
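Here is a toy calculation of why halving the 2 year look back number comes out too small. It assumes, purely for illustration, that every unit turns over independently with the same probability p each year; p = 0.30 is a made-up value, not an estimate for Cambridge.

```python
# Toy model: each unit gets a new householder with probability p in any
# given year, independently across years. p is made up for illustration.
p = 0.30

# Share of units whose current householder moved in during the last 2 years:
# 1 minus the chance of no turnover in either year.
share_two_year_lookback = 1 - (1 - p) ** 2

# Naive annual estimate: halve the 2 year look back share.
naive_annual_share = share_two_year_lookback / 2

print(f"annual turnover share (p):   {p:.3f}")
print(f"2 year look back share:      {share_two_year_lookback:.3f}")
print(f"half of the look back share: {naive_annual_share:.3f}")
```

In this model the halved figure falls short of p by p squared over 2, which is the share of units that turned over in both years and are counted only once in the look back.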
I'll post when I find out more.
I looked at the ACS questionnaire and the only question about moving is
15a. Did this person live in this house or apartment 1 year ago?
Yes, this house -> SKIP to question 16
No, outside the United States and Puerto Rico – Print name of foreign country, or U.S. Virgin Islands, Guam, etc., below; then SKIP to question 16
No, different house in the United States or Puerto Rico
15b. Where did this person live 1 year ago?
Address (Number and street name)
So table B25038 uses data "external" to the ACS questionnaire. I sent an email to ACSO user support. I have used this variable in the past and I will likely use it in the future, so I need to know what is going on.
UPDATE TUES 2-28-2022 11:40 EST
I just received an email from the people at ACSO user support and they said that only data collected on the ACS questionnaire is used to make the estimates in B25038. They must use some sort of statistical model (for the population "flows") with "linkage" on the previous address. The question on the survey form is "Did you live at this address 1 year ago?" If you answer "no", this indicates that you have moved at least once in the last year. Next they ask for the previous address. Later they ask whether you own or rent, which gives the breakdown in the table of renter v. owner.
Just emailed ACSO user support. See edit of previous post.
"Not to be too humorous but one question could be how many garbage trucks you need on September 2nd to pick up all the junk that people put out on the sidewalk over the weekend." As it happens, this question did originate from our solid waste staff, who want to get a handle on the number of new households needing education about trash disposal and recycling each year.
I somehow missed your original post. I assume that your edit is found in the first two paragraphs of your post.
This could indeed be one of the consequences of the disruptions to data collection caused by Covid, though I suspect something else is afoot. I am aware that comparing overlapping ACS estimates is fraught. That aside, I would not expect to see differences of the magnitude seen here. One aspect I had not paid enough attention to until now is that the most recent category in table B25038 varies from year to year and might start 0, 1 or 2 years in the past. Here is a table comparing most current values from the 2017-21 ACS and the preceding one year ACS tables:
Here is a similar table from the 2014-18 ACS, so we can rule out Covid effects:
I don't think the changing number of "years elapsed" accounts for what I'm seeing, but it certainly complicates the picture.
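One way to gauge how much the varying "years elapsed" could matter is a toy calculation. It rests on assumptions I have not confirmed with the Census Bureau: that the five year table simply pools responses from each survey year against the fixed "2019 or later" category, that every interview happens at year end, that the five years get equal weight, and that units turn over independently at a made-up annual rate p.

```python
# Toy model, not the Census Bureau's method. All inputs are assumptions.
p = 0.30  # made-up annual turnover probability per unit

# Whole years between the start of 2019 and a year-end interview, by survey
# year. Interviews in 2017 and 2018 cannot report a move in "2019 or later".
years_elapsed = {2017: 0, 2018: 0, 2019: 1, 2020: 2, 2021: 3}

def share_moved_since_2019(k: int) -> float:
    """Share of units with at least one turnover in k years."""
    return 1 - (1 - p) ** k

shares = {year: share_moved_since_2019(k) for year, k in years_elapsed.items()}

# Equal-weight pool of the five survey years (a simplification).
pooled_share = sum(shares.values()) / len(shares)
single_year_2021_share = shares[2021]

print(f"2021 single year share: {single_year_2021_share:.3f}")
print(f"pooled 2017-21 share:   {pooled_share:.3f}")
print(f"ratio:                  {single_year_2021_share / pooled_share:.2f}")
```

Under these assumptions the pooled share comes out well below the 2021 single year share, so the effect may be worth checking against the actual tables before ruling it out.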