AI Tool Helps Revitalize Endangered Paiute Language, Bridging Generations

Jared Coleman, a recent Ph.D. graduate in computer science, has developed an AI-powered translation tool to help revitalize Owens Valley Paiute, a critically endangered Indigenous language. The work combines traditional rule-based linguistic methods with large language models to preserve and promote a language at risk of extinction.

Merging Ancient Tongues with Modern Tech

Coleman, a member of the Big Pine Paiute Tribe of Owens Valley, created the tool to address the challenges of translating a “no-resource language,” one with virtually no publicly available translated sentences for training machine learning models. His approach, called LLM-RBMT (Large Language Model-assisted Rule-Based Machine Translation), combines classical rule-based translation techniques with the natural language processing capabilities of large language models.

“Essentially, the LLM acts as a sophisticated intermediary, using its advanced understanding of language to make sure the rule-based system produces accurate translations,” Coleman explained. This hybrid method simplifies complex sentences and uses English placeholders for unknown words, mirroring how language learners naturally speak.
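For readers curious how such a pipeline might fit together, here is a minimal Python sketch of the idea described above. It is an illustration under stated assumptions, not Coleman's implementation: the `simplify` function is a hand-written stand-in for the LLM step, the lexicon entries (`T_DOG` and so on) are synthetic tokens rather than real Owens Valley Paiute words, and the subject-object-verb ordering is an arbitrary example rule, not a claim about Paiute grammar.

```python
from dataclasses import dataclass

# Hypothetical lexicon. The target-side strings are synthetic tokens,
# NOT real Owens Valley Paiute vocabulary.
LEXICON = {"dog": "T_DOG", "water": "T_WATER", "drink": "T_DRINK", "see": "T_SEE"}
ARTICLES = {"the", "a", "an"}


@dataclass
class SimpleSentence:
    subject: str
    verb: str
    obj: str


def simplify(sentence):
    """Stand-in for the LLM step: split a sentence into simple clauses.

    A real system would prompt a language model here. This stub only handles
    'SUBJ VERB OBJ and VERB OBJ ...' patterns so the example stays runnable.
    """
    words = [w for w in sentence.lower().rstrip(".").split() if w not in ARTICLES]
    clauses = [[]]
    for word in words:
        if word == "and":
            clauses.append([])
        else:
            clauses[-1].append(word)
    subject = clauses[0][0]
    simple = []
    for clause in clauses:
        if len(clause) == 2:  # subject omitted after "and": reuse the first one
            clause = [subject] + clause
        subj, verb, obj = clause
        # Crude lemmatizer: drop a trailing "s" from the verb ("sees" -> "see").
        lemma = verb[:-1] if verb.endswith("s") else verb
        simple.append(SimpleSentence(subj, lemma, obj))
    return simple


def translate_word(word):
    """Look the word up; keep unknown words as bracketed English placeholders."""
    return LEXICON.get(word, f"[{word}]")


def translate_simple(s):
    """Rule-based step. Subject-object-verb order is an illustrative rule only."""
    return " ".join(translate_word(w) for w in (s.subject, s.obj, s.verb))


def translate_sentence(sentence):
    """Simplify first, then apply the rule-based translator to each clause."""
    return [translate_simple(s) for s in simplify(sentence)]


print(translate_sentence("The dog sees the cat and drinks the water."))
```

Running the sketch prints `['T_DOG [cat] T_SEE', 'T_DOG T_WATER T_DRINK']`. The compound sentence is split into two simple ones, and "cat," which is missing from the toy lexicon, survives as the English placeholder `[cat]`.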

A Personal Mission with Broader Impact

For Coleman, this project is deeply personal. “My dad did not grow up speaking the language – like many families, it was forced out of use by boarding schools where speaking the language was forbidden,” he shared. “I’m lucky my great-grandparents sat down with linguists to document the language and to create recordings so I can hear their voices and words. And now, to listen to my great-grandfather and know what he is saying, there’s something very personally satisfying about that.”

Beyond the translation tool, Coleman has developed a suite of digital resources called Kubishi (meaning ‘brain’ in Paiute), including an online dictionary and a sentence-builder system. These tools are part of a larger effort to preserve and revitalize the Paiute language.

The research, presented at NAACL’s AmericasNLP workshop, demonstrates the potential of AI in supporting endangered language preservation. As Coleman prepares to join Loyola Marymount University as an assistant professor, he remains committed to expanding this work, acknowledging it as “one piece of a much larger puzzle” in the ongoing efforts to keep Indigenous languages alive and thriving.

