Collaborative modeling project on PUMS

Hi all,
I'm a graduate student researcher in computer science. I am part of a project that is exploring a collaborative framework for building a predictive model on ACS data: we are trying to collaboratively predict personal income from the non-income responses in the ACS 1-year PUMS data (focusing on Massachusetts only). I think some of you may find this project interesting, and I would love for people to check it out and contribute! So far we have received contributions from 11 people around the world. We are also running a research study to evaluate how the framework supports collaborative modeling, which you are invited to join if you are interested. The full description is below.

Micah

##

Do you want to use your data science skills for good? Collaborative, open-source projects that create a machine learning model could have a significant impact in civic technology, social sciences, public health, and more.

I'd like to invite you to join a collaborative data science project using an experimental software framework called Ballet.

Your task will be to write a feature definition that can be used to predict personal income from raw survey responses to the US Census American Community Survey. The model built from features submitted by the community can then be used to optimize administration of the survey, direct public policy interventions, assist empirical researchers, and more.
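
To give a rough idea of what a feature definition could look like, here is a minimal sketch in plain Python. It does not use Ballet's actual API: `FeatureSketch` is a hypothetical stand-in, the sample rows are made up, and the 35-hour cutoff for full-time work is an assumption chosen only for illustration. `WKHP` (usual hours worked per week) and `AGEP` (age) are PUMS variable names.

```python
from dataclasses import dataclass
from typing import Callable, Dict, List, Optional, Sequence

# One survey response, keyed by PUMS variable name. None marks a missing answer.
Row = Dict[str, Optional[float]]


@dataclass
class FeatureSketch:
    """Hypothetical stand-in for a feature definition (not Ballet's API)."""

    name: str
    inputs: Sequence[str]            # raw columns the feature reads
    transform: Callable[..., float]  # maps the input values to one feature value

    def apply(self, rows: List[Row]) -> List[float]:
        # Pull the input columns out of each row and transform them.
        return [
            self.transform(*(row.get(col) for col in self.inputs))
            for row in rows
        ]


def full_time_flag(hours_per_week: Optional[float]) -> float:
    # Treat a missing answer as "not working full time".
    # The 35-hour cutoff is an illustrative assumption.
    if hours_per_week is None:
        return 0.0
    return 1.0 if hours_per_week >= 35 else 0.0


full_time = FeatureSketch(
    name="works_full_time",
    inputs=["WKHP"],
    transform=full_time_flag,
)

# Made-up rows, for illustration only.
rows: List[Row] = [
    {"AGEP": 34, "WKHP": 40},
    {"AGEP": 70, "WKHP": None},
    {"AGEP": 22, "WKHP": 20},
]

print(full_time.apply(rows))
```

The point of the sketch is the shape of a contribution: a small, named unit that declares which raw columns it reads and how it turns them into a value the model can use. Note that it never touches the income variable itself, since that is what the model predicts.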

This task is expected to take from 30 minutes to 2 hours. You will also be asked to complete a short survey about your experience and may additionally be contacted for a short interview with the researchers. If you complete the study, you will be entered in a raffle for a small Amazon gift card as a token of thanks for your time and effort.

To participate, please go to https://dai.lids.mit.edu/ballet-study and follow the instructions there by October 3. Ideally, you should have basic experience in Python programming and data science development.

You can also check out the project without signing up for the study here.

  • Hi, that is a good point about the difficulty of predicting income in the data from March to now. Thanks for the pointers! For the exercise we are fixing the development data set to a previous year's data (2018), but a full evaluation of our model will look at how it performs on unseen data, such as the ACS responses from pandemic months. And yes, one way of improving the model would be to merge in these other data sources. This would be a great feature for someone to contribute.
