Spotlight: Drawing the tree of life
Cambridge, MA, Wednesday, November 15, 2006
Photo courtesy of Tom Lloyd, iStock Photo
Thucydides, the world's first historian, aptly described the
difficulty that lies in reconstructing the past. In the opening
paragraph of his book, History of the Peloponnesian War,
he wrote: "The events of remote antiquity, and even those that more
immediately precede the war, could not from lapse of time be clearly
ascertained."
The quotation, though penned some 2,500 years ago,
highlights the challenge facing every historian — to accurately
reconstruct episodes of the past for which only scattered fragments of
information remain. Modern-day biologists, whose work often adopts a
historical perspective, face this challenge too. They seek to retrace
the evolutionary pathways that created all of today’s living organisms
by studying DNA, nature's own — and sometimes spotty — historical
record.
The first person to realize that organisms bear the
indelible stamp of their origin was none other than Charles Darwin. In
July 1837, only a month after he began the first of his famous
notebooks devoted to the Origin of Species,
he scribbled a crude but unmistakable evolutionary tree. The drawing,
with the most ancient species at the bottom and their descendants
branching off irregularly along the trunk, captured two key insights.
First, each living species is not created de novo
but is related to all other organisms through common ancestry. Second,
the genealogical relationships among them (called "phylogenies") can be
visualized in the form of a great tree of life.
Now, a century
and a half later, a complete and accurate tree of life remains an
elusive goal. Systematists — the archaeologists of DNA — have tried to
flesh out evolutionary relationships by comparing just one gene or
perhaps several of them from a group of organisms. But how reliable
are these phylogenies based on single genes?
Together with
researchers in Sean Carroll's laboratory at the University of
Wisconsin, we have explored this question using genomic sequences from
eight species of yeast, three of which were produced by the Broad’s
Fungal Genome Initiative. We collected more than 100 genes interspersed
throughout the yeasts' genomes and compared the evolutionary trees
obtained from each of the genes. The analysis revealed that single- and
few-gene datasets have a significant probability of generating
inaccurate and conflicting evolutionary trees. By contrast, datasets
composed of much larger sets of genes yielded a single, fully resolved
phylogeny with maximum statistical support.
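To make the idea of conflicting gene trees concrete, here is a minimal Python sketch. It is not the method used in the study, which relied on full phylogenetic inference and statistical support measures. It only assumes that each gene has already produced a rooted tree, written as nested tuples of made-up taxon names (`A` to `D`), and it counts how many distinct branching patterns appear and what share of genes back the most common one. The function names `clades`, `topology`, and `summarize` are hypothetical helpers introduced for illustration.

```python
from collections import Counter


def clades(tree):
    """Return (leaf set, set of clades) for a rooted tree of nested tuples."""
    if isinstance(tree, str):
        # A single taxon: it is a leaf and defines no internal clade.
        return frozenset([tree]), set()
    leaves = frozenset()
    found = set()
    for child in tree:
        child_leaves, child_clades = clades(child)
        leaves |= child_leaves
        found |= child_clades
    # Every internal node groups the leaves beneath it into one clade.
    found.add(leaves)
    return leaves, found


def topology(tree):
    """Order-independent fingerprint of a tree: the set of its clades."""
    return frozenset(clades(tree)[1])


def summarize(gene_trees):
    """Return (number of distinct topologies, share of genes on the commonest)."""
    counts = Counter(topology(t) for t in gene_trees)
    _, support = counts.most_common(1)[0]
    return len(counts), support / len(gene_trees)


# Hypothetical trees from four genes sampled in four taxa (toy data, not real).
gene_trees = [
    ((("A", "B"), "C"), "D"),
    (("C", ("B", "A")), "D"),   # same branching as the first, written differently
    ((("A", "C"), "B"), "D"),   # conflicts: groups A with C
    (("A", "B"), ("C", "D")),   # conflicts: groups C with D
]

distinct, agreement = summarize(gene_trees)
print(distinct, agreement)
```

With this toy input, the four genes yield three different trees, and only half of them agree on the most common one. That is the kind of disagreement single-gene analyses can produce.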
Because systematists are constrained by the limited data available,
they often have to strike a balance between the number of species they
study and the number of genes they use to reconstruct the species’
evolutionary history. Therefore, we expanded our data set to include an
additional six yeast species, allowing us to investigate the relative
contribution of gene number and species number to phylogenetic
accuracy. Importantly, we found that no matter how many species were
used, increasing the number of genes studied was a prerequisite for a
more accurate phylogeny.
The results from these two studies
indicate that more data could resolve many difficult phylogenetic
problems. So, we decided to test this hypothesis on a branch of the
tree of life that has proved particularly challenging — that of the
animal kingdom. We devised novel experimental protocols to
systematically amplify large numbers of genes from any animal, applied
them to several animal species, and combined the resulting sequences with bioinformatic
data from additional species. Despite the large amount of data
analyzed, we found that many of the phylogenetic relationships among
animals simply could not be resolved.
Not to be discouraged,
we decided to test our methods using genomic data from animals' closest
relatives at the kingdom level — the fungi. Thanks in part to
sequencing done here at the Broad, we had access to an abundance of
genomes throughout the fungal tree. We sampled exactly the same genes
from fungi that we had from animals and tested whether the lack of
resolution in the animal tree was due to the choice of genes or to the
branching pattern specific to the animal phylogeny.
Importantly,
we found that the genes robustly resolved phylogenetic relationships
within fungi, suggesting that the amount of data we had for animals was
potentially adequate to resolve relationships among them — even though
it didn't. We wondered if one possible explanation for this lack of
resolution might lie in the different shapes that the evolutionary
trees of fungi and animals have taken on in the course of evolution.
For instance, it is well recognized that, instead of looking like
arborescent trees, some evolutionary genealogies look more like bushes,
which can pose special problems. Through our work, we found that the
resolution of the animal phylogeny is dramatically affected by its
“bushiness” — how closely spaced its branches are and how frequently
lone branches appear. In fact, this bushiness raises concerns about whether
conventional molecular analyses will be sufficient to trace the
evolutionary genealogy of certain groups of organisms — like animals —
whose origins lie several hundred million years in the past.
While
DNA sequence information may not always suffice, other genome features,
such as large-scale DNA rearrangements, offer powerful alternatives for
addressing such phylogenetic riddles. The use of these rare changes is
feasible only in a genomic context but can yield remarkably precise
evolutionary trees. Working in the Broad's Microbial Analysis Group, we
are developing computational methods to find such rare events in
genomic data and use them to explore evolutionary relationships.
The
impact of genomics on the grand quest for a complete phylogenetic
encyclopedia is just beginning. Of course, the fraction of species for
which genome-scale data are available is truly minuscule: there are
about 2 million known species of organisms and another 10,000 are
discovered each year. Comparative genomics, by vastly increasing the
molecular data available for a small but critical number of species, is
bound to play a key role in efforts to assemble a comprehensive tree of
life.
Thus, it seems that some pieces are finally falling into place — Thucydides would be proud.
—Antonis Rokas
Further reading:
Rokas A. (2006) Genomics and the tree of life. Science; 313:1897-1899.
Rokas A, Carroll SB. (2006) Bushes in the tree of life. PLoS Biology; 4:e352.
Rokas A, Krueger D, Carroll SB. (2005) Animal evolution and the molecular signature of radiations compressed in time. Science; 310:1933-1938.
Rokas A, Williams BL, King N, Carroll SB. (2003) Genome-scale approaches to resolving incongruence in molecular phylogenies. Nature; 425:798-804.
For more information, contact:
Communications
news@broad.mit.edu

