I'm working on a project in which I take the difference between individuals' education in ACS data and the BLS typical education needed for employment to create an "underemployment" variable. While doing this, I noticed a high number of physical therapists with the maximum negative value, which signifies no high school education while practicing as a physical therapist, an occupation that requires a professional degree. A physical therapist aide is 'OCCP' = 3620 while a physical therapist is 3610, so a miscode seems possible, and I'm wondering if ACS retroactively fixes these types of errors. For instance, there is a 29-year-old in Georgia who apparently completed no grade school education (SCHL = 1). There are 35 of these physical therapists, 5 chiropractors, 2 judicial law clerks, and an audiologist. For some occupations that are aggregated in ACS but not in BLS, I weighted the BLS requirements, so the "underemployment" variable takes fractional values; these are almost as low for "Lawyers, and judges, magistrates, and other judicial workers" and for "other life scientists". My dataset is the 2018-2022 5-year national population files (a, b, c, and d), combined.
Oh, I love a good data mystery!
Two thoughts...
1 - Have you checked whether or not those records were imputed (using the relevant imputation flag variables)?
2 - There are some data that are recoded for logical consistency, and some that are not. I don't know all of the rules (have never seen them published) but I have seen enough inconsistent data in the workforce-related variables (occupation, industry, education, journey-to-work, etc...) to think that there are few logical consistency checks there (if any). So if it's not an imputed response, it may just be wonky* respondent behavior.
*Technical term.
I'm somewhat new to using the data, and I'm not familiar with the imputation flag variables. I'll be reading the data accuracy file, but so far, I've just worked through the dictionary: https://www2.census.gov/programs-surveys/acs/tech_docs/pums/data_dict/PUMS_Data_Dictionary_2018-2022.pdf
The "flag" variables typically have the same name as the variable you're using, preceded by an F (and sometimes with a P added at the end). For example, the imputation (or "allocation") flag for OCCP is FOCCP. For highest education (SCHL) it's FSCHLP. So you could try running your analysis excluding any cases where FOCCP = 1 or FSCHLP = 1 and see if you still get mismatches between education and occupation.
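If it helps, here is a rough pandas sketch of that exclusion. It assumes the person records are already loaded into a DataFrame with the PUMS columns OCCP, SCHL, FOCCP, FSCHLP, and the person weight PWGTP. The helper name `drop_allocated` and the toy rows are made up for illustration and are not real ACS records.

```python
import pandas as pd


def drop_allocated(df, flags=("FOCCP", "FSCHLP")):
    """Keep only rows where none of the given allocation flags equals 1.

    `drop_allocated` is a hypothetical helper, not part of any Census tool.
    """
    # Cast to int in case the flags were read from CSV as strings.
    allocated = (df[list(flags)].astype(int) == 1).any(axis=1)
    return df.loc[~allocated]


# Toy rows invented for illustration only (not real ACS data).
persons = pd.DataFrame({
    "OCCP":   [3610, 3610, 3620, 3610, 3610],
    "SCHL":   [1,    23,   16,   1,    1],
    "FOCCP":  [0,    0,    0,    1,    0],
    "FSCHLP": [1,    0,    0,    0,    0],
    "PWGTP":  [12,   30,   25,   8,    15],
})

# Drop records where occupation or education was allocated (imputed).
reported = drop_allocated(persons)

# Physical therapists (OCCP 3610) who still report SCHL = 1.
suspect = reported[(reported["OCCP"] == 3610) & (reported["SCHL"] == 1)]

print(len(suspect), "unweighted,", suspect["PWGTP"].sum(), "weighted")
```

Any mismatches that remain in `suspect` were reported rather than allocated, which would point toward respondent or coding error instead of imputation.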