Awash in DNA data

Too much of a good thing?

Angela Wyant
  Patrick Brown has inspired scientists to make their own versions of a powerful genetic tool he invented.


The 200,000 or so American women who will be diagnosed with breast cancer this year have more treatments available than ever before. Unfortunately, medical science can offer no crystal ball to reveal which option is best for the individual. Clinicians making the treatment recommendations for breast cancer as well as for other cancers have to rely on statistics about how other women have fared.

If only doctors could see each tumor’s strengths and weaknesses to help them plan the most effective use of surgery, radiation, chemotherapy or — in the case of breast cancer — hormone therapy. Identifying the genetic fingerprint of tumors could spare patients chemotherapy experiences that they won’t benefit from, says associate professor of medicine George Fisher, MD, PhD, an oncologist and the director of Stanford’s cancer clinical trials office. “Conversely, it might identify patients who have little chance of cure without further chemo or radiation treatment,” he says.

A tool that came out of Stanford labs a decade ago is allowing researchers to start to tease apart the entangled molecular interiors of tumors, making more personal treatments a potential reality. The tool — called the microarray — transformed the field of genetics by allowing researchers in the basic sciences to do in days or months what previously would have taken years. And now, scientists are hoping to harness the power of microarrays in clinical applications.

What’s a microarray?

A microarray is a tool used to analyze the activity of thousands of genes in a cell at once. One of the most widely used types of microarrays was developed by biochemistry professor Patrick Brown, MD, PhD, and his colleagues a decade ago. This type of microarray consists of a piece of glass the size of a microscope slide with an orderly arrangement of spots of DNA strands attached to its surface.

When a researcher douses a microarray with a test sample, DNA strands in the sample stick to matching strands in the spots, effectively creating a snapshot of which genes in the sample are active. The type of microarray developed in Brown’s lab compares two samples at once — for example, a normal vs. a diseased tissue — to identify the differences between the two.


The microarray is a glass microscope slide with an orderly arrangement of DNA spots attached to its surface. Each spot represents a particular gene, which can grab onto the matching genes from a test sample. Once researchers see the minuscule matchups, they have a snapshot of which genes are active in that sample.

Although clinical uses of microarrays are still experimental and mired in complexity, they are beginning to show promise. This year, the first two microarray-based tools for cancer diagnosis gained U.S. Food and Drug Administration approval. One is for helping to predict the likelihood of breast cancer recurrence in women with newly diagnosed, early-stage invasive breast cancer. The other tests a liver enzyme to predict how quickly a person metabolizes certain drugs, to help in prescribing a proper amount. Many more microarray-based tests are in the testing pipeline.

A group of Stanford geneticists and biochemists frustrated with the limitations of their available research methods devised the microarray technology. Studying genes one at a time was not only laborious and time-consuming; it also could not address the question of how genes interact with each other in intricate networks that can involve hundreds, even thousands, of genes. What the researchers wanted was a streamlined, automated way to capture gene interactions. To do this, biochemistry professor Patrick Brown, MD, PhD, led a team that figured out how to apply techniques from materials science, physics, computer science and engineering to biological questions. They came up with the microarray and published the first proof of the tool’s potential in the journal Science in 1995.

Although the original microarray prototype reported in Science included only 45 genes from the mustard plant, it was immediately clear that researchers could scale up the technology to include thousands of genes. “We wanted to push biology to a new frontier; biology is very complicated and we knew we would never get to the answers looking at genes one at a time,” says biochemistry and genetics professor Ronald Davis, PhD, who was one of the article’s co-authors. “We needed to increase our ability by a factor of a thousand, and that’s exactly what a microarray allows.”

Patrick Brown
  A microarray holds thousands of gene spots, each spot representing a different snippet of DNA.

In 1995, industry was working on a similar tool. But the Stanford group’s publication was the first report of any research use of a DNA microarray. The team went on to create the first microarrays that carried all yeast genes, marking the first time an organism’s entire genome was available for analysis at once. The Stanford microarray developers jump-started researchers’ interest by freely sharing the technology to make microarrays and the protocols to use them. The developers would show researchers how to set up a system for making their own arrays spotted with the DNA of their choice and would even provide the software for analyzing the results, a tradition that continues to this day. Lowering the price barrier and allowing the flexibility of do-it-yourself array creation prompted thousands of scientists worldwide to try out microarrays.

“There is not a single field in the biological sciences that has not been touched by microarrays,” says Joseph DeRisi, PhD, who was a graduate student in Brown’s lab just as microarrays were taking off. DeRisi, now an associate professor of biochemistry and biophysics at UC-San Francisco, reels off some of their uses: “Studying Antarctic ice cores, salt ponds, infectious diseases, viruses, and every sequenced organism. The impact of this technology is literally immeasurable.”

“People outside of Stanford saw this as ground zero for where the most exciting stuff was happening,” recalls Gavin Sherlock, PhD, assistant professor (research) of genetics. “Everyone was wowed by what was going on at Stanford.” Researchers used microarrays to examine gene activity in everything from yeast cells to the unidentified dwellers of pond scum.

Patrick Brown
  The colors on the microarray show which genes in a sample are active.

The obvious appeal of the microarray is its ability to survey thousands of genes at once, illuminating complex interactions that might otherwise go unsuspected. Its Achilles’ heel is that the vast amount of data it generates, especially in the realm of human disease, is hard to interpret. Designing experiments that can provide statistically sound conclusions has proven challenging, as has gleaning biological meaning from thousands of spots.

A new breed of scientist has evolved to tackle the data generated by scanning thousands of genes at a time. Biologists are learning a language usually reserved for mathematicians, number crunchers are setting up test tubes and computer programmers are brushing up on their genetics.

Wing Hung Wong, PhD, embodies this new hybrid scientist. Trained in classical mathematics and applied statistics, Wong arrived at Stanford last year as a professor of statistics and of health research and policy. He also leads a biological research laboratory, where he’s developing software to help analyze microarray data.

Wong says that while many researchers have figured out how to use microarrays to simply show gene activity at a given time, the challenge is answering more difficult biological questions, such as what’s happening in a cell over time or under different conditions. Answers to these types of questions could lead to practical, clinical diagnostic tools.

“Microarrays are definitely not a magic tool that will allow us to personalize cancer treatment overnight,” says Wong. “But they are indispensable because we know too little about cancer biology to focus on just a few genes right now. Microarrays are more or less the only method now that can look at the whole genome at the same time.”

Patrick Brown
  If a DNA spot glows only in either green or red, that gene is active in only one sample or the other. Yellow means the gene is active in both samples, and black means the gene is active in neither.

In hopes of paving the path toward using microarrays in the clinic, assistant professor of pathology Jonathan Pollack, MD, PhD, and his colleagues are attempting to address the technical aspects of how to ensure reliable, reproducible results. Despite the challenges of analyzing clinical samples — for instance, picking the right cells out of a biopsy to study — he foresees a day when microarrays will be used routinely by the pathology service to provide supplementary information for cancer diagnoses. He has collaborated with several surgeons and oncologists using microarrays to classify prostate cancer, breast cancer and leukemia. And while he’s optimistic about the microarray’s potential, he’s cautious about its value today. “From beginning to end, there are a lot of variables that make it more difficult to develop microarrays as a clinical test,” he says.

“Like any other technology, what comes out is only as good as what goes in,” says Catherine Ball, PhD, who in the 1990s worked in the lab of former Stanford genetics professor David Botstein and now directs the Stanford Microarray Database, a public resource that contains data from more than 50,000 microarray experiments. She emphasizes that unless researchers meticulously design their experiments, the results will likely have negligible value. “That happened all the time,” she says, “because microarrays were this cool, fabulous, new thing and people just wanted to get their hands on them and do something fun. It’s not fun or exciting to be deliberate.”

At the other end of the experiment is the analysis: how to best extract biological meaning and clinically useful information from the expression patterns. “There is a lot of complexity in validating microarray data,” says Pollack. “Any problems are magnified because the data involve thousands of genes.”

To help get a handle on microarray data, Robert Tibshirani, PhD, professor of health research and policy (biostatistics), figures out where the analysis can go wrong. “The microarray is a very powerful technology,” he says, “but also a dangerous one because of the ease of making false positive conclusions,” which arise when researchers search so hard among so much data that an apparent relationship emerges by chance. He says that fully one-third of all published microarray data might be analyzed improperly, yielding meaningless results.
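
The false positive problem can be made concrete with a little arithmetic. The sketch below is a generic illustration, not Tibshirani's software or method: it assumes 10,000 genes that truly do not differ between two samples, a conventional significance cutoff of 0.05, and p-values that are uniformly distributed when nothing is going on. All function names are hypothetical. The Bonferroni correction shown at the end is one simple, well-known way to tighten the cutoff and is used here only for contrast.

```python
import random


def expected_false_positives(n_genes, alpha):
    """Expected number of genes flagged by chance alone when none truly differ."""
    return n_genes * alpha


def simulate_null_hits(n_genes, alpha, seed=0):
    """Draw one uniform p-value per gene (the no-effect assumption) and count
    how many fall below the cutoff purely by chance."""
    rng = random.Random(seed)
    return sum(1 for _ in range(n_genes) if rng.random() < alpha)


def bonferroni_threshold(n_genes, alpha):
    """Per-gene cutoff that keeps the chance of any false positive near alpha."""
    return alpha / n_genes


if __name__ == "__main__":
    n_genes, alpha = 10_000, 0.05  # illustrative values, not from the article
    print("Expected chance hits:", expected_false_positives(n_genes, alpha))
    print("Simulated chance hits:", simulate_null_hits(n_genes, alpha, seed=1))
    strict = bonferroni_threshold(n_genes, alpha)
    print("Bonferroni cutoff:", strict)
    print("Chance hits at strict cutoff:", simulate_null_hits(n_genes, strict, seed=1))
```

With thousands of genes tested at once, hundreds can look "significant" even when no real difference exists, which is why a per-gene cutoff that would be reasonable for a single test becomes misleading on an array.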

How a microarray experiment works:

Extract the mRNA from the cells of both samples. mRNA molecules are the protein-building instructions issued by genes; the more mRNA a gene produces, the more active the gene is.

Convert each mRNA to a fluorescently labeled cDNA — a more stable molecule than mRNA. Typically, cDNA from one sample is labeled green and the other red.

Mix the colored cDNA together and place drops of the liquid onto the top of the microarray slide. The cDNA in the drops will bind to their corresponding DNA spots.

Scan with a laser scanner to visualize what has stuck to the microarray.

Analyze the results by using computer programs. If a DNA spot glows only in either green or red, that gene is active in only one sample or the other. Yellow means the gene is active in both samples, and black means the gene is active in neither.
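
The color rules in the final step can be sketched as a small program. This is a simplified illustration under stated assumptions, not the analysis software the Stanford group distributes: the function names, the intensity threshold of 100 and the pseudocount are all hypothetical, and real analyses usually work with continuous intensity ratios rather than a hard on/off cutoff. The choice of red as the numerator in the ratio is also just a convention assumed here.

```python
import math


def classify_spot(green, red, threshold=100.0):
    """Map the two scanned channel intensities for one spot to the color a
    viewer would see. The threshold is an arbitrary illustrative cutoff."""
    green_on = green >= threshold
    red_on = red >= threshold
    if green_on and red_on:
        return "yellow"  # gene active in both samples
    if green_on:
        return "green"   # active only in the green-labeled sample
    if red_on:
        return "red"     # active only in the red-labeled sample
    return "black"       # active in neither sample


def log_ratio(green, red, pseudocount=1.0):
    """Log2 of red over green intensity; the pseudocount avoids division by
    zero. Positive values mean the red-labeled sample is brighter."""
    return math.log2((red + pseudocount) / (green + pseudocount))


if __name__ == "__main__":
    # Hypothetical (green, red) intensities for four spots.
    spots = {"gene A": (500, 20), "gene B": (15, 900),
             "gene C": (700, 650), "gene D": (10, 5)}
    for name, (g, r) in spots.items():
        print(name, classify_spot(g, r), round(log_ratio(g, r), 2))
```

The ratio captures what the four-color summary hides: two spots that both look yellow can still differ severalfold in relative activity between the samples.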

Rather than simply raising a red flag, Tibshirani offers a solution to the problem: he has created analysis tools that can take the “fuzziness” out of the data and provide assurance that what is identified as real is real. “Tools like these help you believe in your work,” says Tibshirani. “You have to remember that it is easy to fool yourself, too.” His software, which is free and plugs into basic spreadsheet software, helps researchers see the strength of their conclusions and realize when they are reaching too hard to find associations.

Brown, the architect of the home-brewed microarray effort launched at Stanford, adds that there is an even more complicated and challenging problem than developing statistical methods: figuring out the meaning of all the spots on the array. “The technology is fundamentally straightforward, but bringing it together is where you really get insight, integrating all kinds of information — about cell biology, physiology, about diseases, all the things we already know about the genes’ functions.” Continuing his interest in making microarrays as accessible as possible, Brown advocates freely available databases that compile the vast amounts of biological information needed to make sense of the spots on a microarray.

Ronald Levy, MD, the Robert K. and Helen K. Summy Professor in the School of Medicine, says that even as the technology’s analysis problems smooth out, microarrays in their current form might not be a practical tool for making medical diagnoses. But they’re very useful for pointing people in the right direction, says Levy, a Stanford medical alumnus, class of ’68. He is using them to zero in on genes that determine whether a person with lymphoma is likely to respond to a given treatment. A clinically useful tool, he says, will probably end up being an assay of a few genes, a couple dozen at most, originally identified by microarray experiments. The FDA-approved test for breast cancer, produced by Genomic Health, is exactly such an assay. Based on findings from microarray experiments, the company created a streamlined test using standard assay technology to report the activity of just 21 genes.

Only time will tell what role the microarray will eventually play in the clinic. Take the case of another revolutionizing technology, DNA sequencing, says Ball. “You can go into a lab now and have your DNA sequenced to see if you have a familial gene that predisposes you to breast cancer,” she says. “We weren’t doing that when sequencing first came out. We weren’t doing that even when sequencing was 10 years old. We have a lot to look forward to with microarrays.”

For clinicians treating breast cancer, the future can’t come soon enough. Most women diagnosed with breast cancer receive some type of systemic therapy after surgery, either chemotherapy or hormonal therapy, or both, to prevent the cancer from spreading, according to chief of surgical oncology research Stefanie Jeffrey, MD. But many of these patients would do just as well without it. “We end up giving expensive, toxic therapy because we can’t define who really needs it,” she says.

Jeffrey was a key member of the team that pioneered the use of microarrays to study breast cancer, and helped to make Stanford a focal point for microarray research. Her lab group has figured out how to expand small amounts of tumor tissue from biopsies to amounts needed for microarray analysis and is continuing to define the molecular signatures of different types of breast cancer, searching for clues that link breast cancer genetics to how the patient fares.

Even though the microarray-based test for breast cancer has been approved by the FDA for helping figure out which women would benefit from the extra therapy, so far it is only rarely used at Stanford; it can benefit only a subset of women with breast cancer, and most of the physicians consider it too experimental at this point. “The importance of this type of work still is principally in the area of revealing clues and understanding the biology of breast cancer,” says Frank Stockdale, MD, PhD, the Maureen Lyles D’Ambrogio Professor in the School of Medicine, Emeritus.

Large-scale head-to-head comparisons of microarray-guided treatment vs. standard care would start answering physicians’ questions about the new tests’ value. One of the few such trials under way is a Netherlands Cancer Institute effort comparing outcomes of breast cancer patients whose treatment is based on the results of a microarray profile with those whose care is guided by standard clinical measures, such as tumor size and how much the cancer has spread. Trials such as this one, which will include 13,000 women and will run for several years, will show if clues from microarrays can help to make the decision of whether a woman with breast cancer should undergo chemotherapy less agonizing.
