A new paper in the Quarterly Journal of Economics, published by Oxford University Press, finds that replacing certain judicial decision-making functions with algorithms could improve outcomes for defendants by eliminating some of the systemic biases of judges.
Decision makers make consequential choices based on predictions of unknown outcomes. Judges, in particular, decide whether to grant bail to defendants and how to sentence those convicted. Companies, too, increasingly rely on machine-learning-based models in high-stakes decisions. The deployment of such models rests on various assumptions about human behavior, assumptions that play out in product recommendations on Amazon, email spam filtering, and predictive text on one's phone.
The researchers developed a statistical test of one such behavioral assumption: whether decision makers make systematic prediction mistakes. They further developed methods for estimating the ways in which those predictions are systematically biased. Analyzing the New York City pretrial system, the research reveals that a substantial portion of judges make systematic prediction mistakes about pretrial misconduct risk given defendant characteristics, including race, age, and prior behavior.
The research used information from judges in New York City, who are quasi-randomly assigned to cases within each courtroom and shift. The study tested whether judges' release decisions reflect accurate beliefs about the risk of a defendant failing to appear for trial, among other outcomes. It drew on 1,460,462 New York City cases, of which 758,027 were subject to a pretrial release decision.
Applying this test to the pretrial release decisions of New York City judges, the paper estimates that at least 20% of judges make systematic prediction mistakes about defendant misconduct risk given defendant characteristics. Motivated by this analysis, the researchers estimated the effects of replacing judges with algorithmic decision rules.
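The intuition behind testing for systematic prediction mistakes can be conveyed with a stylized simulation. The sketch below is a toy illustration, not the paper's econometric test: it assumes a hypothetical judge who releases defendants when perceived risk falls below a threshold but systematically over-estimates the risk of one group. If the judge's beliefs were accurate, released defendants from both groups would show similar misconduct rates; a persistent gap is the kind of signature a formal test can detect. The threshold, bias size, and group labels are all invented for illustration.

```python
import random

random.seed(0)

# Hypothetical parameters for this toy illustration (not from the paper).
THRESHOLD = 0.5   # judge releases when perceived risk is below this
BIAS = 0.15       # judge over-estimates risk for group "B"

def simulate(n=100_000):
    """Simulate release decisions by a judge with biased risk beliefs."""
    outcomes = {"A": [], "B": []}
    for _ in range(n):
        group = random.choice(["A", "B"])
        true_risk = random.random()  # true failure-to-appear probability
        perceived = true_risk + (BIAS if group == "B" else 0.0)
        if perceived < THRESHOLD:    # judge releases the defendant
            # Released defendant fails to appear with prob = true_risk.
            outcomes[group].append(random.random() < true_risk)
    # Failure-to-appear rate among released defendants, by group.
    return {g: sum(v) / len(v) for g, v in outcomes.items()}

rates = simulate()
# Group B's released defendants show a markedly lower failure-to-appear
# rate: the biased judge holds them to a stricter effective standard.
print(rates)
```

In this simulation, group B defendants are released only when their true risk is well below the threshold, so their observed misconduct rate among released defendants is systematically lower than group A's — a gap that accurate beliefs would not produce under the toy model's assumptions.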
The paper found that the decisions of at least 32% of judges in New York City are inconsistent with defendants' actual ability to post a specified bail amount and their real risk of failing to appear for trial. When both defendant race and age are considered, the median judge makes systematic prediction mistakes on approximately 30% of the defendants assigned to them. When both defendant race and whether the defendant was charged with a felony are considered, the median judge makes systematic prediction mistakes on approximately 24% of the defendants assigned to them.
While the paper notes that replacing judges with an algorithmic decision rule has ambiguous effects that depend on the policymaker's objective (is the desired outcome one in which more defendants show up for trial, or one in which fewer defendants sit in jail awaiting trial?), it appears that such a replacement could improve trial outcomes by up to 20%, as measured by the failure-to-appear rate among released defendants and the pretrial detention rate.
“The effects of replacing human decision makers with algorithms depends on the trade-off between whether the human makes systematic prediction mistakes based on observable information available to the algorithm versus whether the human observes any useful private information,” said the paper’s lead author, Ashesh Rambachan. “The econometric framework in this paper enables empirical researchers to provide direct evidence on these competing forces.”
The paper, “Identifying prediction mistakes in observational data,” is available (on May 28, 2024) at https://academic.oup.com/qje/article-lookup/doi/10.1093/qje/qjae013.