Study illuminates trade-off between complex words and complex sentences

Bruce Willis’s recent announcement that he was retiring from acting brought widespread public attention to the neurological condition aphasia. While everyone occasionally struggles to find the right word or trips over a sentence, aphasia patients can lose the ability to comprehend language entirely.

Though Willis hasn’t confirmed it, some doctors suspect that he may have a particularly brutal and degenerative form called primary progressive aphasia (PPA).

Scientists have long understood that there are several subtypes of PPA — some versions come with lexical deficits, affecting a person’s ability to access words, while others cause syntactic deficits, making it difficult to construct sentences. 

A collaborative team of cognitive scientists and doctors from MIT and Massachusetts General Hospital (MGH) has now developed a quantitative way to identify these different deficits. In the process, they illuminated a fundamental trade-off between vocabulary and grammar that the brain makes when speaking. Their results show that PPA patients with grammar deficits use richer, more complex vocabulary to compensate for their syntax struggles, and vice versa.

The results were published June 16 in the Proceedings of the National Academy of Sciences.

Edward Gibson, a professor of brain and cognitive sciences at MIT and senior author of the study, says it’s an important first step to understanding how the language centers of the brain might be processing grammar and vocabulary independently.

The study also revealed new insights into healthy brain function. “From a deficit in patients, we were able to find a basic property in healthy language production,” says Neguine Rezaii, the study’s lead author and a neuropsychiatrist at MGH. She adds that improved methods for identifying deficits in aphasia patients could help increase access to disease diagnosis and inform our understanding of how other neurological conditions affect language production. 

Measuring complexity

Based on prior research into people with stroke-induced aphasia, the researchers hypothesized that the speech patterns of PPA patients would reflect this trade-off between complex words and complex sentences. Someone who can’t recall the word “sailboat,” for example, might construct a more roundabout phrase, such as “the thing that moves in water with wind,” to get their meaning across.

To measure lexical complexity, the researchers relied on a well-established concept called word frequency. High-frequency words — “the” being the classic example — are those that are more commonly used and easily accessible. Low-frequency words, in contrast, are those that are less familiar, often richer, and, for aphasia patients, the first words to go. Gibson gives the example of zebu versus zebra. Both are hooved animals found in Africa, but one of the words requires looking at a dictionary.

To quantify the frequency of different words, the researchers analyzed a database called Switchboard, which consists of random telephone conversations from over 500 American English speakers. They edited out the typical stutters, errors, and false starts of normal speech, and then analyzed how often different words appear. 
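The counting step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the three toy transcripts, the filler list, and the function names are hypothetical stand-ins, not the Switchboard data or the researchers' actual pipeline.

```python
import math
import re
from collections import Counter

# Hypothetical stand-in for a conversational corpus; NOT the Switchboard data.
TRANSCRIPTS = [
    "um the boat is on the water",
    "the the kids are uh flying a kite",
    "a zebu is near the water",
]

# Assumed filler list; a real cleanup step would be more careful.
FILLERS = {"um", "uh", "er"}

def clean_tokens(utterance):
    """Lowercase and tokenize, dropping fillers and immediate repetitions."""
    tokens = re.findall(r"[a-z']+", utterance.lower())
    cleaned = []
    for tok in tokens:
        if tok in FILLERS:
            continue  # skip filled pauses such as "um"
        if cleaned and cleaned[-1] == tok:
            continue  # skip stutter-like repeats such as "the the"
        cleaned.append(tok)
    return cleaned

def word_log_frequencies(transcripts):
    """Map each word to the log of its relative frequency in the corpus."""
    counts = Counter()
    for utterance in transcripts:
        counts.update(clean_tokens(utterance))
    total = sum(counts.values())
    return {word: math.log(count / total) for word, count in counts.items()}

WORD_LOGFREQ = word_log_frequencies(TRANSCRIPTS)
```

In this toy corpus, a common word such as "the" ends up with a higher (less negative) score than a rare one such as "zebu." Note that the simple repetition filter would also drop legitimate doubled words, so it is a rough approximation of disfluency cleanup.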

Previously, though, there was no equivalent measure for syntax complexity. 

“For words, we always had word frequency, but syntax frequency is something new to this project,” says Rezaii. 

While analyzing Switchboard, the researchers also classified each word by its part of speech to create a massive list of possible grammar rules. Subject + verb (“She left”) was one common construction. Subject + verb + object (“She left them”) was another. With their list of possible rules in hand, the team could quantify the frequency of different sentence structures to get a measure comparable to word frequency.
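One way to picture a syntax frequency measure is to treat each sentence's sequence of grammatical categories as a "rule" and count how often each rule occurs. The sketch below does this on hand-tagged toy sentences. The category labels, the data, and the function names are assumptions made for illustration, and the study's actual rule inventory is far larger and more detailed than this.

```python
import math
from collections import Counter

# Hand-tagged toy sentences (hypothetical data); each is a list of
# (word, category) pairs using simplified labels.
TAGGED = [
    [("she", "SUBJ"), ("left", "VERB")],
    [("he", "SUBJ"), ("slept", "VERB")],
    [("they", "SUBJ"), ("ran", "VERB")],
    [("she", "SUBJ"), ("left", "VERB"), ("them", "OBJ")],
    [("he", "SUBJ"), ("saw", "VERB"), ("it", "OBJ")],
    [("she", "SUBJ"), ("gave", "VERB"), ("him", "OBJ"), ("it", "OBJ")],
]

def rule_of(sentence):
    """Reduce a tagged sentence to its sequence of categories."""
    return tuple(category for _, category in sentence)

def rule_log_frequencies(tagged_sentences):
    """Map each rule to the log of its relative frequency."""
    counts = Counter(rule_of(sentence) for sentence in tagged_sentences)
    total = sum(counts.values())
    return {rule: math.log(count / total) for rule, count in counts.items()}

RULE_LOGFREQ = rule_log_frequencies(TAGGED)
```

Here the subject + verb pattern is the most frequent rule, subject + verb + object is less frequent, and the double-object pattern is the rarest, which mirrors how a rare word scores lower than a common one.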

Quantifying the trade-off

With their metrics in place, the team asked study participants to describe a scene doctors sometimes use to diagnose aphasia: a drawing of a busy family picnic. The researchers then calculated an average word frequency and syntax frequency of each sentence for each participant.

Just as hypothesized, among 79 PPA patients (alongside healthy speakers included to control for other factors that might affect language, such as age), there was a clear negative correlation between word frequency and syntax frequency, with the direction of the trade-off depending on which subtype of PPA a patient had. Patients with lexical deficits used low-frequency, complex sentence structures, while those with grammar deficits used more low-frequency, descriptive words.
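A negative correlation of this kind can be illustrated with a plain Pearson coefficient computed over per-sentence scores. The numbers below are made up for the example and are not results from the study. The article also does not say which statistical test the team used, so Pearson's r is an assumption here.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

# Made-up per-sentence averages (log frequencies): as the words in a
# sentence get rarer, its syntax gets more common, and vice versa.
word_freq = [-2.0, -2.5, -3.0, -3.5, -4.0]
syntax_freq = [-4.1, -3.4, -3.1, -2.4, -2.0]

r = pearson(word_freq, syntax_freq)
```

With these illustrative values, `r` comes out negative: sentences built from rarer words tend to use more common structures.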

The researchers then tested the same method on a sample of healthy English speakers. Surprisingly, the results held.

“It’s pretty cognitively demanding for the brain to use both complex syntax and complex words in one sentence,” says Rezaii, explaining that even those without aphasia seem to be making this trade-off between vocabulary and syntax. The difference, she says, is that healthy speakers can make a different trade-off sentence-to-sentence. Aphasia patients, though, have no choice and must constantly compensate depending on their deficit. 

Gibson says there are several possible explanations for this trade-off. Perhaps our brains are trying to create language that’s unambiguous and clear, but also efficient. Or perhaps there’s just not enough capacity to construct complex sentences that also include unusual words. More research, he says, is necessary to disentangle the various processes involved in different facets of language production.  

For PPA patients, though, this is a major step forward. 

“The way people have been categorized by clinicians, it’s very informal,” says Gibson. “This potentially provides formal, quantitative, computational ways you can analyze the speech of a person to figure out what category of deficit they might have.”

The material in this press release comes from the originating research organization. Content may be edited for style and length.