In partnership with the NYU Furman Center, the Center for Innovation through Data Intelligence (CIDI) is evaluating how machine learning algorithms applied to administrative data can be used to predict which families are at high risk of homelessness and to identify which buildings are likely to house at-risk families.
The study links multiple years of data on family shelter applications and stays with microdata on families receiving benefits in New York City, administrative data from housing courts, building-level information from an array of city agencies, and neighborhood characteristics from the Census and city agencies. Using this data set, we evaluate the predictive performance of several machine learning algorithms, including random forests, neural networks, boosted trees, and logistic regression. We examine the accuracy of these predictive models across years and over different prediction windows, and we assess the contribution of different types of spatial variables. We then explore whether the families with the highest predicted risk of homelessness (among those not currently applying for or staying in shelter) are also applying for existing prevention services, to understand whether algorithm-driven outreach could improve targeting.
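The targeting question in the last step can be illustrated with a small sketch. It is not the study's code: the `Family` record, its field names, the `top_k_summary` helper, and the sample values are all hypothetical, and the scores stand in for the output of any of the models named above. The sketch ranks families by predicted risk, takes the top k, and reports how many of them entered shelter and how many were already applying for prevention services.

```python
from dataclasses import dataclass


@dataclass
class Family:
    """Hypothetical record; field names are illustrative, not from the study."""
    family_id: str
    predicted_risk: float     # model score in [0, 1]
    entered_shelter: bool     # outcome observed in the prediction window
    applied_prevention: bool  # already applying for prevention services


def top_k_summary(families, k):
    """Summarize the k families with the highest predicted risk."""
    if k <= 0 or k > len(families):
        raise ValueError("k must be between 1 and the number of families")
    ranked = sorted(families, key=lambda f: f.predicted_risk, reverse=True)
    top = ranked[:k]
    total_entries = sum(f.entered_shelter for f in families)
    hits = sum(f.entered_shelter for f in top)
    return {
        # Share of the top-k families who entered shelter.
        "precision_at_k": hits / k,
        # Share of all shelter entries captured within the top k.
        "recall_at_k": hits / total_entries if total_entries else 0.0,
        # Share of the top-k families already reached by prevention services.
        "share_already_applying": sum(f.applied_prevention for f in top) / k,
    }


# Made-up example data, for illustration only.
sample = [
    Family("a", 0.91, True, False),
    Family("b", 0.85, True, True),
    Family("c", 0.40, False, False),
    Family("d", 0.30, True, False),
    Family("e", 0.10, False, True),
]
summary = top_k_summary(sample, k=2)
```

A low `share_already_applying` among high-risk families would suggest that outreach guided by the model could reach families that existing prevention services are missing.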
NYC Department of Homeless Services
NYC Human Resources Administration/New York City Department of Social Services
NYC Department of Housing Preservation and Development
New York State Office of Court Administration