A new study from researchers at the University of Washington has found that ChatGPT, OpenAI’s popular AI tool, consistently ranks resumes that mention disability-related honors and credentials lower than otherwise identical resumes that omit them. The finding raises concerns as more recruiters begin using AI tools like ChatGPT to summarize resumes and rank candidates.
Kate Glazko, a doctoral student in the UW’s Paul G. Allen School of Computer Science & Engineering and the study’s lead author, noticed this trend while seeking research internships last year. Glazko, who studies how generative AI can replicate and amplify real-world biases, wondered how such a system might rank resumes that implied someone had a disability.
ChatGPT Exhibits Explicit and Implicit Ableism
In the study, researchers used a publicly available curriculum vitae (CV) and created six enhanced CVs, each implying a different disability by including four disability-related credentials. When they used ChatGPT’s GPT-4 model to rank these enhanced CVs against the original version for a real “student researcher” job listing, the system ranked an enhanced CV first in only one quarter of the 60 trials.
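The paper’s exact prompts are not reproduced here, but the core procedure — asking GPT-4 to rank a control CV against an enhanced CV for a given job listing, repeated over many trials — can be approximated with a short script. The sketch below uses OpenAI’s Python client; the prompt wording, function names, and response parsing are illustrative assumptions, not the study’s actual materials.

```python
# Illustrative sketch of the ranking experiment described above.
# Prompt wording and parsing are assumptions, not the study's materials.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

def rank_cvs(job_listing: str, control_cv: str, enhanced_cv: str, trials: int = 60) -> int:
    """Ask the model to pick the stronger CV; return how often the enhanced CV wins."""
    enhanced_first = 0
    for _ in range(trials):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system",
                 "content": "You are screening candidates for the job listing below."},
                {"role": "user",
                 "content": (
                     f"Job listing:\n{job_listing}\n\n"
                     f"Candidate A:\n{control_cv}\n\n"
                     f"Candidate B:\n{enhanced_cv}\n\n"
                     "Rank the two candidates for this role and explain your reasoning. "
                     "Begin your answer with the letter of the stronger candidate."
                 )},
            ],
        )
        answer = response.choices[0].message.content.strip()
        if answer.upper().startswith("B"):
            enhanced_first += 1
    return enhanced_first
```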
When asked to explain the rankings, GPT-4’s responses exhibited explicit and implicit ableism. For example, it noted that a candidate with depression had “additional focus on DEI and personal challenges,” which “detract from the core technical and research-oriented aspects of the role.”
“Some of GPT’s descriptions would color a person’s entire resume based on their disability and claimed that involvement with DEI or disability is potentially taking away from other parts of the resume,” Glazko said.
Customizing ChatGPT to Reduce Bias
Researchers then used the GPTs Editor tool to customize GPT-4 with written instructions directing it not to exhibit ableist biases and instead to work with disability justice and DEI principles. When they ran the experiment again with this customized chatbot, it ranked the enhanced CVs higher than the control CV 37 times out of 60. However, for some disabilities, the improvement was minimal or absent.
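The study customized the model through OpenAI’s GPTs Editor interface; a rough API analogue is to prepend the same kind of written instructions as a system message before rerunning the ranking loop. The instruction text below is a paraphrase for illustration, not the researchers’ exact wording.

```python
# API analogue of the customization step: the study used the GPTs Editor UI,
# but comparable written instructions can be supplied as a system message.
# The instruction text is a paraphrase, not the researchers' exact wording.
from openai import OpenAI

client = OpenAI()

DEBIAS_INSTRUCTIONS = (
    "Do not exhibit ableist biases. Evaluate candidates according to "
    "disability justice and DEI principles: disability-related awards, "
    "leadership, and advocacy count as evidence of qualification, not a drawback."
)

def rank_cvs_debiased(job_listing: str, control_cv: str, enhanced_cv: str, trials: int = 60) -> int:
    """Same ranking loop as before, with the bias-mitigation instructions prepended."""
    enhanced_first = 0
    for _ in range(trials):
        response = client.chat.completions.create(
            model="gpt-4",
            messages=[
                {"role": "system", "content": DEBIAS_INSTRUCTIONS},
                {"role": "user",
                 "content": (
                     f"Job listing:\n{job_listing}\n\n"
                     f"Candidate A:\n{control_cv}\n\n"
                     f"Candidate B:\n{enhanced_cv}\n\n"
                     "Rank the two candidates and begin your answer with the "
                     "letter of the stronger candidate."
                 )},
            ],
        )
        if response.choices[0].message.content.strip().upper().startswith("B"):
            enhanced_first += 1
    return enhanced_first
```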
“People need to be aware of the system’s biases when using AI for these real-world tasks,” Glazko said. “Otherwise, a recruiter using ChatGPT can’t make these corrections, or be aware that, even with instructions, bias can persist.”
The researchers emphasize the importance of studying and documenting these biases to ensure technology is implemented and deployed in ways that are equitable and fair. They call for further research to test other AI systems, include more disabilities, explore intersections of bias, and investigate whether further customization could reduce biases more consistently across disabilities.
“It is so important that we study and document these biases,” said senior author Jennifer Mankoff, a UW professor in the Allen School. “We’ve learned a lot from and will hopefully contribute back to a larger conversation — not only regarding disability, but also other minoritized identities — around making sure technology is implemented and deployed in ways that are equitable and fair.”