Genetic data crunching achieves milestone

The revolution was not televised. In the fall of 1999, the Stanford Microarray Database booted up, and a level of computing power was suddenly available to the field of molecular biology that only a few years earlier was inconceivable. On Oct. 19, the database recorded its 50,000th experiment, marking its place at the forefront of an information processing revolution that has yielded groundbreaking insights into the relationships between genes and illness, as well as fundamental biological discoveries.

From Stanford University:

Microarrays, developed in the lab of biochemistry professor Patrick Brown, MD, PhD, in the early 1990s, took molecular biology by storm. They’re small slides spotted with fixed samples of DNA, each for a different gene. When a researcher prepares a labeled cell extract and incubates it with the slide, messengers in the sample stick to the fixed DNA, showing which genes in the sample are active. Microarrays are especially useful for comparisons between normal and cancerous tissues or between different stages of development. Researchers use them to nose out the genes associated with such changes.

The problem, however, is that experiments with microarrays yield vast amounts of data. “Microarrays allow researchers to do in six months what previously would have taken six years of concerted effort,” explained Gavin Sherlock, PhD, assistant research professor in genetics, who has been involved in the Stanford database from the beginning.

The need for the university database became apparent in the late 1990s after Brown and David Botstein, PhD, former chair of the genetics department, had put together a database for their own microarray results. They soon found that they needed something more sophisticated: a system that could process and store microarray data efficiently and make it easy to retrieve data and compare them with other experiments. In addition, new information about the genes spotted on the slides is continuously discovered and needs to be incorporated into data from previous experiments. In late 1999 Botstein and Brown received a grant from the National Cancer Institute for a completely revamped database, and by April 2000 all 5,000 experiments from the old database had been transferred to the new one, officially known as the Stanford Microarray Database.

Since then, researchers have used data in the database to illuminate everything from cell division in yeast to cancer-causing genes to what happens to bacteria when they’re deprived of iron. Microarray data have also allowed scientists to understand how various drugs affect the malaria bug, to find out what the immune system attacks in patients with autoimmune diseases and to pinpoint genes involved in multiple sclerosis.

Sherlock estimates the database now supports 400 campus researchers doing work on 30 different organisms (more are added as needed), and he believes it to be the world’s largest academic microarray database. About 200 papers have been published by Stanford researchers based on its data and many more by other groups reanalyzing Stanford data.

About one-quarter to one-third of all publicly available microarray data in the world is in the Stanford system, Sherlock said. Most of the database’s experiments are not yet public; results are available only to Stanford researchers and their collaborators until an article using the data is published in a journal. The database is growing rapidly, with nearly 1,000 experiments added every month.

Statistics like these – combined with Stanford’s invention of the microarray and the nine-person team devoted to maintaining the database – make Stanford a natural leader in the field. Several years ago the team made the database’s source code publicly available, and Catherine Ball, PhD, the director of the Stanford Microarray Database, serves as president of the Microarray Gene Expression Database Society, an international group working to implement standards for such work with microarrays.

“It’s much less scary to be doing microarrays at Stanford than anywhere else,” Ball said. “In fact, if you’re not, you have to explain why.” It’s not only a result of Stanford’s long history with microarrays, she said. “Anyone on campus was able to walk in and ask a postdoc in the Brown-Botstein lab, ‘Can you please help me get this started in my lab?’” she explained. “Just having a team with expertise, enthusiasm and a cooperative nature has made this university much more likely to use microarray technology than anyplace else.”
