A multi-source analysis integrating Census ACS, FRED macroeconomic data, and HUD Fair Market Rent benchmarks to investigate what drives housing burden across all 62 New York counties — with a novel application of machine unlearning to assess how 2022–2024 interest rate shocks shaped predictive models.
This is the final project for CUNY's DATA 607 — but the research question came from genuine curiosity, not a rubric. Housing affordability is one of the defining pressures on New York residents, and I wanted to understand it rigorously: what actually drives the gap between what people earn and what they pay to live here?
The project integrates three heterogeneous government data sources into one county-year panel dataset, built from scratch via API calls and GitHub-hosted CSVs: Census ACS 5-Year estimates for 62 counties (2009–2024), FRED API data for mortgage rates and inflation, and HUD Fair Market Rent schedules as a policy-grounded rental benchmark.
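As a rough illustration of how the three sources can be combined, the sketch below joins toy ACS, FRED, and HUD tables into a county-year panel with dplyr. It is not the project's actual pipeline: the column names (`median_income`, `median_rent`, `mortgage_rate`, `fmr_2br`) and every numeric value are hypothetical placeholders, and the real data would come from `tidycensus`, `fredr`, and the HUD CSVs rather than hand-typed tibbles.

```r
library(dplyr)

# County-level ACS-style rows keyed by FIPS code and year.
# 36061 = New York County, 36047 = Kings County. All values are invented.
acs <- tibble(
  GEOID         = c("36061", "36061", "36047", "36047"),
  year          = c(2022L, 2023L, 2022L, 2023L),
  median_income = c(95000, 98000, 70000, 72000),
  median_rent   = c(2000, 2100, 1700, 1800)
)

# Macro series vary by year only, so they join on year alone (placeholder rates).
fred <- tibble(
  year          = c(2022L, 2023L),
  mortgage_rate = c(5.0, 6.5)
)

# HUD-style benchmark rents vary by county and year (placeholder values).
hud <- tibble(
  GEOID   = c("36061", "36061", "36047", "36047"),
  year    = c(2022L, 2023L, 2022L, 2023L),
  fmr_2br = c(2300, 2400, 2300, 2400)
)

panel <- acs |>
  left_join(fred, by = "year") |>
  left_join(hud, by = c("GEOID", "year")) |>
  mutate(
    rent_to_income = 12 * median_rent / median_income, # annual rent share of income
    rent_to_fmr    = median_rent / fmr_2br             # observed rent vs. benchmark
  )

print(panel)
```

Keying the county-level tables on both `GEOID` and `year`, and the macro series on `year` only, keeps one row per county-year, which is the shape the modeling steps assume.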
The workflow follows an OSEMN framework (Obtain, Scrub, Explore, Model, Interpret) — progressing from raw API ingestion and cleaning, through exploratory visualization, into predictive modeling with tidymodels and a novel experiment in machine unlearning.
The machine unlearning component is the methodological centerpiece. It selectively removes the 2022–2024 high-interest-rate observations by setting their case weights to zero, so they no longer influence the fitted models. That lets me ask: how much of what these models "know" is rate-cycle specific versus structurally stable? The answer has real implications for how HUD FMR benchmarks should be adjusted across rate-cycle transitions.
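The case-weight idea can be sketched in a few lines. The example below is a minimal illustration under stated assumptions, not the project's tidymodels code: it uses base R `lm()` on simulated data, and the variable names (`mortgage_rate`, `rent_burden`, `w_unlearn`) are hypothetical. It fits the same model twice, once with all weights equal to 1 and once with the 2022–2024 rows weighted to 0, then compares coefficients.

```r
# Simulated county-year style panel; every value here is synthetic.
set.seed(607)
panel <- data.frame(year = rep(2009:2024, each = 5))
panel$mortgage_rate <- ifelse(panel$year >= 2022, 6.5, 4.0) +
  rnorm(nrow(panel), sd = 0.3)
panel$rent_burden <- 25 + 1.2 * panel$mortgage_rate +
  rnorm(nrow(panel), sd = 1)

# Case weights: 1 keeps an observation, 0 "forgets" it.
panel$w_full    <- 1
panel$w_unlearn <- ifelse(panel$year >= 2022 & panel$year <= 2024, 0, 1)

fit_full    <- lm(rent_burden ~ mortgage_rate, data = panel, weights = w_full)
fit_unlearn <- lm(rent_burden ~ mortgage_rate, data = panel, weights = w_unlearn)

# Reference fit: the same model on data with the high-rate rows dropped.
fit_subset <- lm(rent_burden ~ mortgage_rate, data = subset(panel, year < 2022))

# How far the coefficients move once the high-rate years are forgotten.
coef_shift <- coef(fit_full) - coef(fit_unlearn)
print(coef_shift)

# For a linear model, zero weights match refitting without those rows.
print(all.equal(coef(fit_unlearn), coef(fit_subset)))
```

Keeping the rows in the data frame and switching them off through weights means the same panel and model specification serve both fits, so any difference in coefficients or predictions is attributable to the excluded years alone. In a tidymodels workflow, the analogous mechanism would likely be a `hardhat::importance_weights()` column registered with `workflows::add_case_weights()`, though whether the project did it that way is an assumption here.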
Two questions structure the modeling work — one about prediction, one about sensitivity. Both are answered through the same unified county-year panel of 62 New York counties from 2009 to 2024.
| Component | Details |
| --- | --- |
| Language | R · Quarto |
| Data | tidycensus, fredr, httr, jsonlite |
| Wrangling | tidyverse, lubridate, zoo, dplyr |
| Modeling | tidymodels, vip, randomForest |
| Spatial | leaflet, tigris, sf |
| Viz | ggplot2, patchwork, scales, gt |
| App | Shiny · ShinyApps.io |
| Publish | RPubs · GitHub |
| Dataset | Census ACS 5-Yr · FRED · HUD FMR (62 NY counties, 2009–2024) |
The published report, the interactive app, and all source code are publicly available.