New study reveals major racial bias in leading genomics databases

A national group of researchers has confirmed for the first time that two of the top genomic databases, both widely used by clinical geneticists, show a measurable bias toward genetic data from people of European ancestry over data from people of African ancestry.

The results of their study were published in the latest issue of Nature Communications.

The research team was led by Timothy O’Connor, assistant professor at the University of Maryland School of Medicine (UM SOM) and a faculty member of the school’s Institute for Genome Sciences. He specializes in human evolutionary genomics, genotype/phenotype architecture, and computational biology. Other members of the study team included researchers from UM SOM’s Department of Medicine and the Program in Personalized and Genomic Medicine, and from the Johns Hopkins University, the University of Colorado, and the Henry Ford Health System.

This deficit in African-ancestry genomic data was identified during an 18-month study conducted under the auspices of the larger Consortium on Asthma among African-Ancestry Populations in the Americas (CAAPA). To create a benchmark for comparison with current databases, the researchers first created the largest high-quality non-European genome data set ever assembled, sequencing genetic samples from 642 subjects from the African diaspora, including representatives of US, African, and Afro-Caribbean populations. When they compared this data set with current clinical genomic databases, the researchers found a clear preference in those databases for European genetic variants over non-European variants.

“By better understanding the important role of African ancestry in clinical genetics, we can begin to actually identify a disease that has been forgotten or is not part of an individual’s self-identification,” says O’Connor. “For example, if an African-American patient walks in the door, he might have 20 percent European ancestry, while another might have 20 percent African ancestry. That difference will dramatically change how many variants are found in their genome, and what disease risks they might encounter. That’s why we need to expand these databases to include a broader range of ancestries, in order to produce more accurate medical genetic diagnoses.”

O’Connor points out that this shortfall in genomic data also comes at a financial cost. “If you translate the review time it takes for each one of these variants to be sequenced in terms of cost in a clinical setting, you’re looking at a difference of about $1,000 more to analyze an African American’s genome than a European American’s genome–and you still receive less accurate results,” he notes.

“This groundbreaking research by Dr. O’Connor and his team clearly underscores the need for greater diversity in today’s genomic databases,” says UM SOM Dean E. Albert Reece, MD, PhD, MBA, who is also Vice President of Medical Affairs at the University of Maryland and the John Z. and Akiko Bowers Distinguished Professor at UM SOM. “By applying the genetic ancestry data of all major racial backgrounds, we can perform more precise and cost-effective clinical diagnoses that benefit patients and physicians alike.”

