stanford medicine


Genomes, Get Yer Fresh Genomes

After honing their skills on organisms of all stripes, scientists are within reach of sequencing human genomes at bargain prices

The odd-looking platypus is often cited as proof that God has a sense of humor. If you accept that premise, you have to agree that he’s also a creative speller.

Organisms as diverse as tapeworms and elephants parlay a common genetic alphabet of just four letters, or nucleotides, into a host of biological differences that allow them to crawl, swim, fly, lay eggs, grow feathers, digest food and grow old in very distinctive ways.

Researchers have been striving for more than 30 years to create a genomic dictionary of sorts that will allow them to unravel the species-specific lexicon that makes an amoeba an amoeba, a human a human or a platypus a platypus. When complete, such a reference tool will provide a glimpse into both our evolutionary past and our promising future.

But the very technology necessary to develop such a finely focused lens also gives us an unparalleled opportunity for introspection — the possibility of holding our own genetic sequence in our hands. The incredible strides we’ve made in the speed, accuracy and cost-effectiveness of sequencing whole genomes are mirrored in a recent push by the National Human Genome Research Institute of the National Institutes of Health to reach the “$1,000 genome,” a holy grail of genetics that will allow the sequencing of an entire mammalian genome for $1,000 or less. Experts estimate this feat will be within our grasp in the next five to 10 years.

“Meeting this goal will be stunningly exciting,” says Jeffery Schloss, PhD, program director for technology development coordination in the division of extramural research at the NHGRI. However, it comes at a cost. For a while, at least, the focus on cost reduction will mean less money to devote to new sequencing projects. The NHGRI, which has funded the genomic sequencing of more than 160 organisms, has budgeted about $44 million for its large-scale sequencing program for fiscal year 2009, down from $150 million in 2005. In contrast, the amount slated for medical sequencing and efforts devoted to understanding the genetics of cancer has increased to about $69 million from zero in 2005. Although the shift is dramatic, the expectation is that the subsequent advances in technology, and the reduced price tag, can be applied to the sequencing of nearly any genome.

“While it’s appropriate for the NIH to be focusing more on human health,” says David Haussler, PhD, a Stanford consulting professor and Howard Hughes Medical Institute investigator at the University of California-Santa Cruz, “there is still an enormous amount to be learned about biology in general and about the animals that inhabit this planet.”

Though the NIH is the major funder of U.S. genome sequencing efforts, other institutions also play a role, including the Department of Energy, which is interested in how microbes might be used in environmental remediation and energy production, and NASA, which has been involved in sequencing and interpreting the human genome. An obvious candidate for picking up the sequencing slack is the National Science Foundation. However, the constant struggle for government dollars limits what any agency can do.

“Funding for the NSF has been inadequate for some time,” says Haussler, who has testified to that effect before Congress. “In the absence of additional federal support, we have to look elsewhere to keep this sequencing effort from falling through the cracks between organizations.”

In the meantime, all eyes are focused on the human genome. Schloss expects the $1,000 genome goal to be reached by about 2014, but some people think it might come to pass even sooner. In addition to the more than $50 million the NHGRI has allocated to the effort since 2004, the nonprofit X Prize Foundation has promised $10 million (the largest medical prize in history) to the first group to sequence 100 human genomes in 10 days or less at a cost of less than $10,000 per genome, catapulting genomic sequencing into the ranks of lunar landers, moon rovers, commercial space flight and ultra-efficient 100 mpg cars in the pantheon of major human endeavors.

It’s a startling rise to public prominence from modest beginnings only 30 years ago.

Field and stream

The era of genomic sequencing began in 1972 when researchers sequenced a small gene from one of the smallest of organisms — a bacterial parasite called a bacteriophage. The first complete genome — a string of just over 5,000 nucleotides, also from a bacteriophage — was sequenced in 1977. Since then, researchers have cracked the genetic codes of close to 200 different organisms, including baker’s yeast, fruit flies and Arabidopsis (a small flowering plant related to mustard often used in laboratory research). A watershed moment occurred in 2001 with the publication of a draft sequence of the human genome, followed in 2002 and 2004 by the mouse and the rat genomes, respectively.

But even as the rate and ease of discovery increased, researchers have had to be picky about which organism to target next, and why. The time, effort and money involved are still considerable, and have to be justified by a comparatively large payoff in expected biological knowledge.

“I’m always a little disappointed that we can’t just sequence every organism,” says Adam Felsenfeld, PhD, the director of the large-scale genome sequencing program at NHGRI. “But the overarching goal of the Human Genome Project has always been to better understand human disease.”

One way to do that is to compare the sequences of our closest relatives, the apes, with our own. Another way is to investigate the sequences of animals that are very different from us, like the elephant. Genes and regions that change very little over evolutionary time are likely to be important biological building blocks. Those that differ markedly between groups may delineate important competitive advantages or evolutionary leaps.

“We can learn a lot about ourselves by sequencing related species,” says Stanford consulting professor Haussler.

He has participated in NHGRI working groups assembled to recommend species for sequencing, and recently published research showing that the genomes of humans and chimpanzees differ significantly in their spelling of a gene involved in brain development.

“We can see how things differ and how they evolved to be the way they are now,” says Haussler. “Many evolutionary changes have occurred in response to environmental or pathogenic challenges.”

For a recent example you need look no further than the platypus. Researchers at Stanford’s School of Medicine used the publication of the genome of this odd Australian animal to explain an evolutionary tour de force that led to a reproductive advantage possessed by nearly all of today’s mammals — the ability to move the testicles away from the warm core of the body during development. Keeping the testicles’ heat-sensitive cargo, the sperm, in a cooler outer pouch called the scrotum freed the animals to raise their core body temperatures to levels conducive to running, jumping and hunting more efficiently.

“If you could bracket these kinds of evolutionary leaps, you could, in principle, find genomic correlates that correspond to these leaps,” says Felsenfeld. Understanding these drastic changes gets us closer to knowing what makes us human. It also highlights the incredible forces that have shaped our fellow planet dwellers during the past several million years.

“These sequences contain a tremendous amount of information not only about the basic principles of evolution, but also the problems that we face in conservation and ecology,” says Haussler. “It will be utterly transformative to be able to study the changes in the vertebrate sequences that led, for example, to the spectacular return of the blue whale, formerly a land mammal, to the ocean.”

Techno beat

Sequencing large genomes like that of the platypus requires money, however; when proposed, the platypus genome sequencing project was expected to cost more than $40 million. It also demands the collaboration of many people and sophisticated sequencing equipment. Before the advent of the first automated sequencing machine in 1986, legions of researchers would spend months or years relying on their eyesight to painstakingly interpret patterns of dark bands in columns on X-ray films.

The first sequencing technique was based on the concept of chain termination. By separating a DNA synthesis reaction into four tubes corresponding to the four nucleotides in DNA, the researchers were able to selectively terminate the growing chains at A, T, C or G. They could then decipher the order of the nucleotides in the gene by comparing the relative lengths of the reaction products, which were either radioactively or fluorescently labeled.
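The logic of reading a sequence from fragment lengths can be sketched in a few lines of Python. This is a toy illustration, not laboratory software: the function names and the example strand "GATTACA" are assumptions made for the sketch, and the chemistry is idealized so that every possible fragment length appears exactly once.

```python
def terminated_fragments(new_strand):
    """Simulate four idealized reactions, one per nucleotide.

    Each reaction yields fragments that stop at every position where
    its nucleotide occurs, so a fragment's length marks that position.
    """
    lanes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand, start=1):
        lanes[base].append(length)
    return lanes


def read_sequence(lanes):
    """Recover the strand by sorting fragments from shortest to longest."""
    base_at_length = {}
    for base, lengths in lanes.items():
        for length in lengths:
            base_at_length[length] = base
    return "".join(base_at_length[n] for n in sorted(base_at_length))


lanes = terminated_fragments("GATTACA")  # hypothetical example strand
recovered = read_sequence(lanes)
```

Sorting the fragments by length plays the part of the researcher scanning the bands on the film from the bottom up: the shortest fragment reveals the first letter, the next shortest the second, and so on.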

Advances in sequencing technology have now sped up that process immeasurably and continue to lead to significant reductions in both the time and cost of sequencing whole genomes. Upcoming techniques, including one that records the addition of individual nucleotides to a growing chain and another that spools single strands of DNA through tiny tubes, or nanopores, may soon allow researchers to read the genetic code in a way similar to running your finger along a line of text in a book.

“There’s been no dearth of interesting ideas,” says Schloss. “People keep coming up with new approaches, and that’s really good. We’re seeing technological advances that have very interesting implications for many other fields.”

High-throughput, automated machines now zip through thousands of nucleotides an hour, and sophisticated computer programs knit individual, shorter pieces into whole-genome strings of nucleotides. Re-sequencing previously sequenced genomes can take hours instead of months or years, in part because researchers can determine the proper order of the short pieces by overlaying them on the reference sequence.
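Overlaying short pieces on a reference can also be sketched in Python. The version below is a deliberately naive illustration under stated assumptions: the 16-letter reference, the three reads and both function names are invented for the example, and each read is simply slid along the reference to the spot with the fewest mismatches. Real alignment software relies on far more efficient indexing.

```python
from collections import Counter


def best_offset(read, reference):
    """Return the reference position where the read fits with fewest mismatches."""
    def mismatches(offset):
        window = reference[offset:offset + len(read)]
        return sum(a != b for a, b in zip(read, window))
    return min(range(len(reference) - len(read) + 1), key=mismatches)


def resequence(reads, reference):
    """Place each read on the reference, then take a majority vote per position."""
    votes = [Counter() for _ in reference]
    for read in reads:
        start = best_offset(read, reference)
        for i, base in enumerate(read):
            votes[start + i][base] += 1
    # Positions covered by no read fall back to the reference letter.
    return "".join(
        tally.most_common(1)[0][0] if tally else ref_base
        for tally, ref_base in zip(votes, reference)
    )


reference = "ACGTTGCATGCCGATA"                    # hypothetical reference
reads = ["ACGTTGCA", "TGCAAGCC", "AGCCGATA"]      # hypothetical overlapping reads
individual = resequence(reads, reference)
variants = [i for i, (a, b) in enumerate(zip(individual, reference)) if a != b]
```

In this toy case the reads reassemble into a sequence that differs from the reference at a single letter, which is the kind of individual variation that re-sequencing is meant to expose.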

For example, the original project to sequence the fruit fly genome took several years. The entire 120-megabase genome, comprising more than 13,000 genes, can now be re-sequenced in as little as an afternoon, Felsenfeld estimates, and the cost of such sequencing fell by half about every 20 months between 2000 and 2006.
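To see what a halving every 20 months adds up to, here is a short back-of-the-envelope calculation in Python. It assumes a smooth, steady halving over the whole period, which is a simplification; the function name and its default are made up for the sketch.

```python
def fold_reduction(months_elapsed, halving_period_months=20):
    """How many times cheaper sequencing becomes after the given number of months."""
    return 2 ** (months_elapsed / halving_period_months)


# 2000 to 2006 is 72 months, or 3.6 halving periods.
six_year_drop = fold_reduction(72)
```

Under that assumption, six years of steady halving works out to roughly a twelvefold drop in cost.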

Many other organisms are currently being sequenced, including the dolphin, the pangolin and the squirrel, to name just a few. Each one has been carefully selected by NHGRI working groups of experts like Haussler to add to the breadth and depth of knowledge necessary to put humans into context in the world around them. But the recent technological advances have caused a shift in focus toward more biomedical types of sequencing.

“Our sequencing capabilities have arrived at the point where we can now tackle large studies of human variation and disease that were previously unapproachable,” says Felsenfeld. “In particular, three new sequencing platforms deliver tenfold higher throughput for the same cost.”

Brave new alphabet

The differences in the time and cost to sequence human genomes are also striking.

According to the journal Nature, it took more than 13 years and about $2.7 billion to complete the first human genome sequencing project in 2003. In contrast, it took Celera and the J. Craig Venter Institute just four years and $100 million to sequence the genome of founder Venter, a result announced in 2007. Then in April 2008, genetics pioneer James Watson’s complete sequence was published after just four and a half months of work that cost about $1.5 million.

Of course, every subsequent effort builds on the technological and knowledge advances of the ones that preceded it. The first human genome project was so expensive because it required, among other things, the development of suitable technology, the construction of the first human genetic map and the sequencing of several model organisms. Splitting out the sequencing-specific costs from the total is tricky, but various sources estimate it to be between $600 million and $800 million.

So, where are we headed? Will we all one day tote around our genomic sequence on a microchip in our wallets, or under our skin? At a cost of $1,000, obtaining such information would be on par with other routine medical imaging or diagnostic procedures. And, unlike many other health-care costs, sequencing your DNA would likely need to be done only once. With the possible exception of mutations in cancerous cells, your genome is what it is, like your blood type or your eye color.

The rapid acquisition of genome sequences is likely to lead us to a day when access to an organism’s sequence is as commonplace as knowing its name; no longer will we have to choose between humans and animals when doling out research funds. In the laboratory, it will be as indispensable to research as pipettes or test tubes, and in the clinic, physicians will use what they’ve learned about human biology through the study of the platypus, the armadillo or the elephant shrew to diagnose conditions as varied as heart disease, HIV infection or schizophrenia. We might use our genetic fingerprints to determine which blood pressure or antidepressant medication is likely to work best, or to identify which chemotherapy would be most successful against our specific cancer.

Along the way we’ll have to determine how to handle questions of privacy, openness and risk. A national dialog has already begun about how to protect individuals who carry variations of genes associated with particular diseases from repercussions by employers or insurance companies. It’s a serious concern; James Watson himself, the co-discoverer of DNA’s structure, chose not to learn of his genetic risk of developing Alzheimer’s disease, in part because he preferred keeping himself and the public in the dark as to whether he was likely to develop the progressive, incurable condition.

One possible solution might be self-generated. As genomic data on individuals reach a critical mass, it may no longer be possible or efficient for businesses in the coming years to discriminate based on specific DNA mutations. What now seems like a clarion call of alarm might fade into so much white noise once all potential employees or insurance policy holders sport their own brave new alphabet.

“The bottom line is, as Dr. Collins says, we’ve all got glitches in our DNA,” says Schloss. Francis Collins, MD, PhD, is the former director of the NHGRI. “We’re all going to die of something, period. Furthermore, some health complications may be predicted in your genetic sequence, but not all.”

What deserves a bit more thought, according to Schloss, is the fact that, although your DNA sequence is uniquely yours, it’s also a composite of your parents’ and a portion of your children’s. As such, publicizing or sharing your own sequence necessarily reveals something about your relatives. “There are all sorts of interesting complications,” says Schloss. “But we’re not doing all this willy-nilly. We’re spending a lot of time thinking about these issues.”

In addition to promoting the $1,000 genome, the NHGRI has joined forces with the National Cancer Institute on a project known as The Cancer Genome Atlas, or TCGA. The goal of the collaboration is to explore the broad spectrum of genetic changes involved in cancer in humans. For example, a group of Stanford researchers recently participated in an effort by the TCGA network to identify genes that are activated in human glioblastomas. Another international effort, known as the 1,000 Genomes Project, aims to explore normal human diversity by sequencing the genomes of at least 1,000 people from around the world.

“It seems impossible to overestimate the importance of human variation,” says Felsenfeld. “We’ve known about it all along, of course, but it’s always been ignored or pushed to the side because it’s been very hard and very expensive to study until now.”

Finally, a number of projects are investigating how the genomes of different species, like those of pathogens and hosts, have evolved together. Understanding the delicate dance of infection and evasion is important to identifying reliable molecular targets for therapy.

As sequences become more and more commonplace, however, it’s also important to remember their limitations. The 3 billion letters that you harbor in nearly every cell in your body are only a recipe.

“Your genome isn’t the same as you,” says Felsenfeld. “It’s wonderful, but it’s not magical.”

It’s an exciting recipe, though. “The story of the flexibility and strength and beauty of life is written in DNA,” says Haussler. “I’m certain that one day we’ll have a genome to go along with each of the species that we study. It will become absolutely indispensable to the future of the life sciences. It’s a wonderfully exciting time to be alive.”


We know your name, we know your sequence, too

Revelations from a sampling of the more than 180 different organisms sequenced so far

Fruit fly

Sequence completed: 2000

Don’t cry, but you’re more than half fly. About 60 percent of our genes are also found in fruit flies. But there’s a silver lining. According to the National Human Genome Research Institute, when scientists inserted a human gene associated with early-onset Parkinson’s disease into fruit flies, the little buzzers displayed symptoms similar to those seen in humans with the disorder.


Pufferfish

Sequence completed: 2002

This Japanese pufferfish has the shortest known vertebrate genome. Seems like we could learn a thing or two from this fish; it pulls off the same basic body layout as ours with much less fuss by doing away with most of the “junk” DNA that peppers the human genome. It stands to reason that what’s left is really important, like the toxin that might kill you unless you have a good sushi chef…


Chicken

Sequence completed: 2004

People who raise chickens might not be surprised to know that about 85 percent of the birds’ genome is made up of what biologists call genetic “dark matter” — shorthand for “perfectly useless as far as we know.” However, researchers have found that eggshell-specific proteins are related to proteins involved in bone calcification in mammals. As might be expected, though, chickens outshine us in the presence of genes for making egg whites and yolks.

Honey bee

Sequence completed: 2006

Honey bees don’t just pollinate our crops and churn out honey. They also have a complex social structure and communication chain. Researchers found that the expression of some genes changes in response to the needs of the hive, a useful kind of peer pressure that keeps the colony running smoothly. It’s possible that similar pathways govern human behavior in social situations, giving a whole new meaning to “catching a buzz.”

Sea urchin

Sequence completed: 2006

Sea urchins don’t have eyes, ears or noses. But, oddly, they have the same genes for vision, hearing and smelling as you and I. In fact, some of the vision proteins are expressed in the urchin’s tube foot, with which the animal explores its world. Sea urchins share a common ancestor with humans, and they may be a valuable model for studies of Alzheimer’s and cancer.


Platypus

Sequence completed: 2008

Mix a bird, a reptile and a mammal together and what do you get? An animal that lays eggs, nurses its young and packs a venomous wallop in spurs on its back legs. It’s an enigma wrapped in a riddle, and researchers are only now figuring out what makes it tick. The platypus is an important “bridge” animal, an ancient cipher that holds the key to the split between several branches on the evolutionary tree.





©2008 Stanford University