Hi, I'm John. I'd like to understand the world better. I'm currently on leave from my PhD to work on Datamule.
Previously, I was a researcher at MIT & Berkeley.
Fun fact: My Erdős number is 4.
Affiliations
UCLA Econ PhD '28
MIT
Berkeley (Dropped out in 2020, Graduated 2021)
ODF23
Rapid Reviews: COVID-19
COVID-19 Policy Alliance
Selected Open Source
Developer of the datamule project
Developer of doc2dict
Developer of the Structured Output Organization
Selected Policy
Worked on the Puerto Rican Debt Crisis (Simon Johnson)
Wrote the English guidebook to the Korean CDC's COVID-19 Playbook (Simon Johnson)
Worked on Jump-Starting America (Simon Johnson)
Worked on the Massachusetts COVID-19 response (Simon Johnson)
Participated in the COVID-19 Policy Hackathon with a paper on using machine learning
to optimize radio education
Selected Papers (RA)
Strengthening State Capacity (AER)
Used classical machine learning and regex to clean extremely messy civil servant
name and job strings and link them to US Census data
Wrote a custom PyQt GUI to aid in manually validating unclear matches
Wrote a geocoding package in R that used Wikipedia for better accuracy with
historical names
Ideology and Performance in Public Organizations (Econometrica)
Scraped 5 TB of federal contract data from 1997 to 2022
Flattened the XML structure into tabular form using lxml, with memory
optimizations
The Costs of Employment Segregation (QJE)
Used fuzzy matching and regex techniques to link 700,000 messy job title
strings extracted from federal records to the US Census
Results surpassed the Census Linking Project, the academic gold standard, in
accuracy and percent linked
A Glimpse of Freedom: Political Resistance in East Germany (AEJ)
Made some complex graphics. Really fun to mess around with ggplot2
Bureaucratic Representation and State Responsiveness (Review of Economics &
Statistics)
Used Google Cloud Vision to convert 100,000 scanned British Blue Book tables
into text format
Played around with Google's beta table OCR parsing tool
Selected Papers
Transitioning Out of the Coronavirus Lockdown: A Framework for Evaluating Zone-Based
Social Distancing
Misc
Wrote an R Shiny app, "Bridge", that used the World Bank API to compare past living
standards to the present