Common test of mental state understanding is biased

How do clinicians rate how well a patient understands what other people are thinking and feeling? That is to say—how does the patient assess another person’s mental state?

An accurate tool is key for measuring treatment outcomes and carries profound consequences for the patient’s mental and physical well-being.

To that end, psychologists assess a person’s mental state understanding (MSU), a concept grounded in the theory that success in the social world hinges upon our ability to decipher and infer the hidden beliefs, emotions, and intentions of others. A large body of research has demonstrated that being able to do so results in a number of positive social effects: increased popularity, improved interpersonal rapport, prosocial behavior, and the like.

Conversely, those who struggle with MSU experience a variety of negative effects: few friends, isolation, and an elevated risk of severe psychiatric illness, such as schizophrenia spectrum disorders. The link between social isolation, psychiatric illness, and mortality is a strong one, hence the importance of a reliable assessment tool.

Problematic test

The National Institute of Mental Health (NIMH) recommends a test called the Reading the Mind in the Eyes Task (RMET). Here, participants view 36 black-and-white photographs, originally selected from magazine articles, showing only the eyes of Caucasian female and male actors. Participants then decide which of four adjectives (such as panicked, incredulous, despondent, or interested) best describes the mental state expressed in the eyes; the correct answer has been generated through consensus ratings.

But there’s a problem. Using data from more than 40,000 people, a new study published this month in Psychological Medicine concludes that the test is deeply flawed.

“It’s biased against the less educated, the less intelligent, and against ethnic and racial minorities,” says lead author David Dodell-Feder, an assistant professor of psychology at the University of Rochester. “It relies too heavily on a person’s vocabulary, intelligence, and culturally biased stimuli. That’s particularly problematic because it’s endorsed by the national authority in our field and therefore the most widely used assessment tool.”

What surprised the researchers most was that the difference in the performance of people of some races and certain levels of education was as large or even larger than the difference between neurotypical people and people with schizophrenia or autism—two groups that exhibit well-documented, marked, and pervasive social difficulties.

The team, made up of Rochester’s Dodell-Feder and Harvard Medical School and McLean Hospital’s Kerry Ressler and Laura Germine, studied 40,248 people between the ages of 10 and 70 who were native English speakers or primarily spoke English. Study participants completed one of five measures on TestMyBrain.org: the RMET, a shortened version of the RMET, a multiracial emotion identification task, an emotion discrimination task, or a non-social, non-verbal processing speed task of digit symbol matching.

The scientists found that education, race, and ethnicity explained more of the variance in a person’s RMET performance than in performance on the other three tasks, and that the differences between levels of education, race, and ethnicity were more pronounced for the RMET.

Specifically, more highly educated, non-Hispanic, and white or Caucasian individuals performed best on the RMET. The researchers concluded that the RMET may be unduly influenced by social class and culture, which poses a serious challenge to correctly assessing mental state understanding in clinical populations, especially given the strong link between social status and psychiatric illness. The team also discovered that, unlike performance on the other tasks, performance on the RMET improved across a person’s lifespan.

One would expect the greatest differences to exist between neurotypical people and those with schizophrenia or autism spectrum disorder, because the latter two groups tend to experience social difficulties. Instead, the differences tied to race and education were at least as large.

“The findings are troubling because they suggest that the RMET task may not be appropriately assessing mental state understanding in certain groups of people,” says Dodell-Feder, who also holds a secondary appointment in the Department of Neuroscience at the University of Rochester Medical Center.

On a practical level, inaccurate assessment can be costly, both monetarily and for the patient’s health. Missed MSU impairments could lead researchers and clinicians to overlook someone at risk for social difficulties, leaving that person on a path toward mental and physical decline, the researchers warn.

On the other hand, detecting impairments where they do not exist could lead to misidentifying someone as being at risk for social difficulties or, worse, psychopathology, causing potential stigma and unnecessary and costly interventions. Alternatively, clinicians could incorrectly conclude that a treatment for social dysfunction is working when it is not, and vice versa.

So, should the RMET be thrown out entirely?

Not necessarily, says Dodell-Feder. One could keep the design of the task but use multiracial stimuli and response options with less complicated vocabulary. Team member Germine is currently testing a new, multiracial version of the task. Another option would be to abandon it, or use it alongside other tasks that have been demonstrated to be valid cross-culturally, of which there are very few in the current literature.

“Either way, our findings show that it might be premature for NIMH to make strong recommendations regarding the use of certain tasks for measuring mental state understanding before we can thoroughly assess the validity of their usage across peoples,” says Dodell-Feder.

The data analyzed in this study are available on the Open Science Framework repository at https://osf.io/tn9vb/.

