Sperm Whales Have Vowels and the Grammar to Go With Them

Off the coast of Dominica, in water deep enough to swallow a skyscraper, a female sperm whale draws breath and dives. Somewhere below, she clicks. Not randomly. Not reflexively. She clicks in sequences with internal structure: rhythm, duration, a vowel quality she actively controls and that her neighbours recognise. Until recently, we had no idea how rich that structure actually was.

A study published this week in Proceedings of the Royal Society B pushes the parallel between whale clicks and human speech further than anyone had previously managed. Gašper Beguš at the University of California, Berkeley, and colleagues demonstrate that sperm whale codas — short bursts of clicks used for social communication — don’t just resemble human vowels acoustically. They obey the same distributional rules that govern vowels in human language, rules that linguists spent decades working out from Sanskrit, Mandarin, Finnish and Hungarian.

Vowels in the Deep

The team analysed 3,948 codas recorded from fifteen female and immature sperm whales off Dominica between 2014 and 2018, using hydrophone tags attached directly to the whales. Each coda was manually annotated for what the researchers call vowel quality: whether the click sequence had one peak in its frequency spectrum (an “a-coda”) or two peaks (an “i-coda”). That distinction, first described in earlier work, maps onto roughly the same acoustic territory as the vowels in the English words father and feet.
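The one-peak versus two-peak distinction can be made concrete with a toy example. The study's codas were annotated by hand, so the sketch below is not the researchers' method. It only illustrates the idea, assuming a click's spectrum arrives as a plain list of magnitudes. The function names, the 0.5 threshold and the example spectra are all invented for illustration.

```python
def count_spectral_peaks(spectrum, rel_threshold=0.5):
    """Count local maxima that reach rel_threshold times the global maximum.

    `spectrum` is a list of non-negative magnitudes, one per frequency bin.
    The threshold is an arbitrary illustrative choice, not a value from the study.
    """
    if not spectrum:
        return 0
    floor = rel_threshold * max(spectrum)
    peaks = 0
    # Endpoints are skipped: a peak needs a lower neighbour on both sides.
    for i in range(1, len(spectrum) - 1):
        is_local_max = spectrum[i - 1] < spectrum[i] > spectrum[i + 1]
        if is_local_max and spectrum[i] >= floor:
            peaks += 1
    return peaks


def label_vowel(spectrum):
    """Map a peak count to the article's labels: one peak 'a', two peaks 'i'."""
    n = count_spectral_peaks(spectrum)
    if n == 1:
        return "a"
    if n == 2:
        return "i"
    return "unclear"


# Invented toy spectra, not real whale data.
one_peak = [0.1, 0.3, 0.9, 1.0, 0.8, 0.3, 0.1, 0.1, 0.1]
two_peaks = [0.1, 0.8, 1.0, 0.7, 0.2, 0.6, 0.9, 0.5, 0.1]

print(label_vowel(one_peak), label_vowel(two_peaks))
```

Real recordings are noisier than this, which is one reason a simple peak count like this one would need smoothing and careful threshold choices before it could stand in for a human annotator.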

What’s new is what happens when you look at the distribution.

In human languages, vowels don’t sit independently of other sound structure. Across languages, the low vowel /a/ tends to be longer than the high vowel /i/. Some languages (Hungarian, Arabic and Finnish among them) distinguish between short and long versions of the same vowel, so that changing duration alone changes meaning. And when you produce one vowel, your mouth is already preparing for the next sound: a phenomenon called coarticulation, where sounds bleed into each other at the edges. These are not quirks of a particular language family. They are phonological universals, patterns so reliable that linguists use them to infer the deep architecture of human speech production.

Beguš and colleagues report five properties of this kind in the whale data.

Five Properties, One Parallel

A-codas are reliably longer than i-codas within the same click pattern type. The i-codas show a bimodal distribution of durations, two distinct peaks in the histogram, suggesting that whales distinguish a short i from a long ī, the way Hungarian speakers distinguish bor (wine) from bór (boron). Different whales have individual baseline tempos, just as different people speak at different habitual rates. And when a whale transitions between an a-coda and an i-coda, the first click of the new coda often still carries the acoustic signature of the previous one. Coarticulation. In a whale.
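To picture what a short versus long contrast looks like in duration data, here is a minimal sketch that splits a list of durations into two groups using one-dimensional two-means clustering. This is a swapped-in illustration, not the analysis used in the study, and the durations are invented. A clean two-way split also does not by itself show that a distribution is bimodal; that takes a proper statistical test.

```python
from statistics import mean


def two_means_1d(values, iterations=20):
    """Split 1-D values into a 'short' and a 'long' group (k-means with k=2).

    Centres start at the minimum and maximum, and each value joins the
    nearer centre. Returns (short, long); `long` is empty if all values match.
    """
    lo, hi = min(values), max(values)
    short, long_ = list(values), []
    for _ in range(iterations):
        short = [v for v in values if abs(v - lo) <= abs(v - hi)]
        long_ = [v for v in values if abs(v - lo) > abs(v - hi)]
        if not long_:
            break
        lo, hi = mean(short), mean(long_)
    return short, long_


# Invented i-coda durations in seconds, for illustration only.
durations = [0.30, 0.32, 0.31, 0.29, 0.62, 0.60, 0.65, 0.61]

short, long_ = two_means_1d(durations)
print("short:", sorted(short))
print("long:", sorted(long_))
```

On real data the two groups would overlap, and the interesting question is whether the gap between them is wider than chance would produce.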

The coarticulation finding is perhaps the most striking, partly because it was hiding in an anomaly the team might have dismissed as noise. Some codas had a first click that didn’t match the rest: an a-click opening an otherwise i-coda sequence, or vice versa. Looking more closely, the researchers found these mismatches were not random. They clustered at transitions between vowel qualities. A whale switching from an a-coda sequence to an i-coda sequence was more likely to produce a mismatched first click than a whale continuing in the same quality. The acoustic filter, the distal air sac near the whale’s phonic lips, was being reset mid-sequence, and the edge clicks caught it in transition.
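The logic of that comparison can be sketched in a few lines. The example below assumes each coda has been reduced to two labels, the quality of its first click and the quality of the rest, and compares how often the first click mismatches at transitions versus continuations. The data structure, the function name and the toy sequence are all hypothetical; the study's actual statistics are not reproduced here.

```python
def mismatch_rates(codas):
    """Compare first-click mismatches at transitions and at continuations.

    `codas` is a list of (first_click, rest) labels, each 'a' or 'i', in the
    order produced. A coda is a transition if its `rest` quality differs from
    the previous coda's. Returns (rate_at_transitions, rate_at_continuations),
    with None where a category has no examples.
    """
    # is_transition -> [mismatch count, total count]
    counts = {True: [0, 0], False: [0, 0]}
    for (_, prev_rest), (first, rest) in zip(codas, codas[1:]):
        is_transition = rest != prev_rest
        counts[is_transition][1] += 1
        if first != rest:
            counts[is_transition][0] += 1

    def rate(mismatches, total):
        return mismatches / total if total else None

    return rate(*counts[True]), rate(*counts[False])


# Invented toy sequence, not real whale data.
sequence = [
    ("a", "a"), ("a", "a"),  # steady a-codas
    ("a", "i"),              # switch to i, first click still a-like
    ("i", "i"),
    ("i", "a"),              # switch back, first click still i-like
    ("a", "a"),
    ("i", "i"),              # switch with no mismatch
]

at_transitions, at_continuations = mismatch_rates(sequence)
print(at_transitions, at_continuations)
```

A higher rate at transitions than at continuations is the pattern the article describes. Whether such a difference is statistically meaningful depends on sample size, which this sketch ignores.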

That analogy to the vocal tract deserves some unpacking. Sperm whale clicks are produced by a pair of phonic lips near the top of the nasal passage. The frequency content of each click is shaped by a large air sac sitting adjacent to that structure. When the air sac is configured one way, the click has one spectral peak; configured differently, it has two. This is the source-filter model, familiar from human speech: vocal folds generate raw sound, the throat and mouth shape it. In whales, the same logic applies with different hardware. The fact that coarticulation appears at all implies that the filter doesn’t switch instantaneously between configurations, just as a human tongue can’t leap from /i/ position to /a/ position without traversing the space between.

What It Doesn’t Yet Prove

There are limits to how far the parallel can be pushed. Researchers still don’t know what any particular coda means. Whether a-codas signal something different from i-codas, or whether duration contrasts carry semantic load, remains an open question. The study makes no claims about whale language in any mentalist sense, only that the structural machinery appears to be in place.

Beguš and the team also note that all five properties they identify emerged independently in whales, since sperm whales and humans last shared a common ancestor perhaps 90 million years ago. Whatever evolutionary pressures drove hominins toward discrete vowel categories and coarticulation apparently drove cetaceans in the same direction from an entirely different starting point.

The research is part of Project CETI, a longer-running effort to record, annotate and eventually decode sperm whale communication using machine learning. Whether that goal is achievable remains a matter of debate among linguists and biologists alike, not least because meaning is so much harder to infer than structure. But structure is a reasonable place to start, and right now the structure is looking unexpectedly familiar. The whale clicking somewhere below the surface off Dominica may not be speaking, exactly. She is doing something that rhymes with it.

DOI: 10.1098/rspb.2025.2994


Frequently Asked Questions

What are sperm whale codas?

Codas are short bursts of clicks that sperm whales use for social communication. Each coda consists of a series of clicks with specific rhythms and timing patterns, and whales in the same cultural clan tend to use similar coda types. They appear to play a role in maintaining social bonds and coordinating group behaviour.

What does it mean that whale codas have “vowel quality”?

Researchers found that the individual clicks within a coda have distinct spectral shapes: either one frequency peak or two. These correspond roughly to the acoustic properties of human vowel sounds like “ah” and “ee.” Whales appear to actively control which configuration they use, and the same rhythmic coda type can occur in either quality.

Is this evidence that sperm whales have language?

The study demonstrates structural complexity that closely parallels properties found in human phonology, but it does not show that codas carry specific meanings. The researchers are careful to distinguish between having language-like structure and having language itself, noting that the semantic content of coda sequences remains unknown.

What is coarticulation and why does it matter in whales?

Coarticulation is when adjacent sounds influence each other during production: your mouth begins preparing for the next vowel before finishing the current one, leaving acoustic traces at the boundary. The discovery of this effect in whales suggests their vocal apparatus operates under physical constraints similar to those of the human vocal tract, implying an independently evolved parallel solution to the same production problem.

What is Project CETI and what are its goals?

Project CETI (Cetacean Translation Initiative) is a research effort combining bioacoustics, machine learning and linguistics to record and analyse sperm whale communication at scale. The long-term aim is to understand what information whale codas encode, though researchers acknowledge that inferring meaning from structure alone is a difficult and uncertain task.

