BFI Working Paper Mar 13, 2023

Machine Learning as a Tool for Hypothesis Generation

While hypothesis testing is a highly formalized activity, hypothesis generation remains largely informal. We propose a procedure that uses machine learning algorithms, and their capacity to notice patterns people might not, to generate novel hypotheses about human behavior. We illustrate the procedure with a concrete application: judge decisions. We begin with a striking fact: up to half of the predictable variation in who judges jail is explained solely by the pixels in the defendant’s mugshot, that is, by the predictions from an algorithm built using just facial images. We develop a procedure that allows human subjects to interact with this black-box algorithm to produce hypotheses about what in the face influences judge decisions. The procedure generates hypotheses that are both interpretable and novel: they are not explained by factors implied by existing research (demographics, facial features emphasized by previous psychology studies), nor are they already known (even if just tacitly) to people or even experts. Though these results are specific, our procedure is general. It provides a way to produce novel, interpretable hypotheses from any high-dimensional dataset (e.g., cell phones, satellites, online behavior, news headlines, corporate filings, and high-frequency time series). A central tenet of our paper is that hypothesis generation is in and of itself a valuable activity, and we hope this encourages future work in this largely “pre-scientific” stage of science.

More Research From These Scholars

BFI Working Paper Sep 20, 2021

Fragile Algorithms and Fallible Decision-Makers: Lessons from the Justice System

Jens Ludwig, Sendhil Mullainathan
Topics: Technology & Innovation

BFI Working Paper May 21, 2019

Simplicity Creates Inequity: Implications for Fairness, Stereotypes, and Interpretability

Jon Kleinberg, Sendhil Mullainathan
Topics: Economic Mobility & Poverty, Technology & Innovation

BFI Working Paper Feb 16, 2021

Evaluating Contradictory Experimental and Non-Experimental Estimates of Neighborhood Effects on Economic Outcomes for Adults

David J. Harding, Lisa Sanbonmatsu, Greg J. Duncan, Lisa A. Gennetian, Lawrence F. Katz, Ronald C. Kessler, Jeffrey R. Kling, Matthew Sciandra, Jens Ludwig
Topics: Economic Mobility & Poverty