BMC Evolutionary Biology
BioMed Central
Open Access
Research article
Site specific rates of mitochondrial genomes and the phylogeny of
eutheria
Karl M Kjer1 and Rodney L Honeycutt*2
Address: 1Rutgers University, Department of Ecology, Evolution, and Natural Resources, Blake Hall, 93 Lipman Drive, New Brunswick, New Jersey
08901-8524, USA and 2Pepperdine University, Natural Science Division, 24255 Pacific Coast Hwy, Malibu, California 90263-4321, USA
Email: Karl M Kjer - kjer@aesop.rutgers.edu; Rodney L Honeycutt* - rodney.honeycutt@pepperdine.edu
* Corresponding author
Published: 25 January 2007
BMC Evolutionary Biology 2007, 7:8
doi:10.1186/1471-2148-7-8
Received: 20 October 2006
Accepted: 25 January 2007
This article is available from: http://www.biomedcentral.com/1471-2148/7/8
© 2007 Kjer and Honeycutt; licensee BioMed Central Ltd.
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0),
which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Abstract
Background: Traditionally, most studies employing data from whole mitochondrial genomes to
diagnose relationships among the major lineages of mammals have attempted to exclude regions
that potentially complicate phylogenetic analysis. Components generally excluded are 3rd codon
positions of protein-encoding genes, the control region, rRNAs, tRNAs, and the ND6 gene
(encoded on the opposite strand). We present an approach that includes all the data, with the
exception of the control region. This approach is based on a site-specific rate model that
accommodates excessive homoplasy and that utilizes secondary structure as a reference for proper
alignment of rRNAs and tRNAs.
Results: Mitochondrial genomic data for 78 eutherian mammals, 8 metatherians, and 3
monotremes were analyzed with a Bayesian analysis and our site specific rate model. The resultant
phylogeny revealed strong support for most nodes and was highly congruent with more recent
phylogenies based on nuclear DNA sequences. In addition, many of the conflicting relationships
observed by earlier mitochondrial-based analyses were resolved without need for the exclusion of
large subsets of the data.
Conclusion: Rather than exclusion of data to minimize presumed noise associated with non-protein-encoding genes in the mitochondrial genome, our results indicate that selection of an
appropriate model that accommodates rate heterogeneity across data partitions and proper
treatment of RNA genes can result in a mitochondrial genome-based phylogeny of eutherian
mammals that is reasonably congruent with recent phylogenies derived from nuclear genes.
Background
The class Mammalia provides a classic example of an
adaptive radiation, characterized by a proliferation of lineages displaying a diverse array of ecomorphological specializations for feeding and locomotion [1]. Many
additional biological attributes (e.g., behavior, physiology), coupled with this diversity in form and function,
have allowed mammals to exploit a broad range of habitats worldwide. There are approximately 135 families of
living mammals apportioned into 26 orders and two
major subclasses, Prototheria and Theria, with the former
subclass containing the order Monotremata (duck-billed
platypus and spiny-anteaters) and the latter containing
the infraclasses Metatheria (marsupials) and Eutheria
(placentals), which are subdivided into 7 and 18 orders,
respectively [2,3]. Lineage-specific rate heterogeneity in
terms of morphological diversification [4] and molecular
divergence [5-7] is a trademark of the various orders and
families of mammals, especially within the Eutheria, and
this has complicated efforts to resolve phylogenetic relationships among the higher categories of mammals.
Until relatively recently, most contributions to the "mammal tree of life," as it relates to phylogeny and classification, were made by functional morphologists and
paleontologists [2,8-10]. More recent molecular efforts
have resulted in confirmation of some previous hypotheses, the refutation of others, and the proposal of novel
arrangements [10-13].
The most severe disagreements between morphology and
molecules originated from studies based on mitochondrial genome sequences. For example, monophyly of
Rodentia (the most speciose order of mammals) is based
on a combination of dentition, skull morphology, soft
anatomy, the postcranial skeleton, and the jaw mechanism [14], and early classifications never questioned the
naturalness of this clade. Nevertheless, several early studies of nuclear genes [15-17] and mitochondrial genomes
[18-20] argued that guinea pigs and presumably their relatives (hystricognath rodents from South America and
Africa) were "not rodents," but represented a separate and
more basal eutherian lineage, apart from muroid rodents
(rats and mice). These same data challenged the monophyly of Glires, a group recognized on the basis of morphology [10,21] and containing the orders Lagomorpha
(rabbits) and Rodentia, by suggesting a sister-group relationship between lagomorphs and primates [22]. The
morphological placement of the order Xenarthra (armadillos, sloths, and anteaters) at the base of the eutherian
radiation was also challenged, with mitochondrial data
suggesting either the Erinaceidae (hedgehogs [23]) or
rodents at the base. In contrast to the morphological placement, xenarthrans were considered sister to a clade containing the
orders Carnivora, Perissodactyla (horses, rhinos, and tapirs), Artiodactyla (pigs, antelope, deer, camels, etc.),
and Cetacea (whales and dolphins) [24]. Two of the more
startling results from the analysis of mitochondrial
genomes included: 1) the placement of the order
Monotremata as sister to Metatheria, thus making the subclass Theria paraphyletic [25], and 2) a sister-group relationship between the anthropoid primates and
Dermoptera (flying lemurs), thus rendering the order Primates paraphyletic [26]. Neither of these hypotheses is
supported by either other molecular data or morphology [9,10,27-29].
More extensive studies employing greater taxon sampling
as well as larger amounts of nucleotide sequence data
from mitochondrial RNA (primarily rRNA) and/or
nuclear genes [30-38] have resulted in higher levels of
congruence with earlier morphological studies, including
increased support for a more basal position of Xenarthra,
the monophyly of Rodentia, Glires, and Primates, a
monophyletic Theria, the Paenungulata (containing elephants, hyraxes, and sirenians), Tethytheria (elephants and
sirenians), and Euarchonta (Scandentia, Dermoptera, Primates).
In contrast to recent studies employing primarily nuclear
DNA sequences, a more recent study of whole mitochondrial genomes [26] failed to retrieve many of the well-supported clades identified by nuclear gene studies. Springer
et al.'s [36] comparison of mitochondrial and nuclear
gene sequences implied that mitochondrial data are less
effective at resolving relationships at deeper nodes of the
mammalian tree, and in many cases mitochondrial
sequences failed to recover "benchmark clades" that are
well-supported by both morphology and nuclear genes.
In this particular comparison, nuclear genes apparently
outperformed mitochondrial genomes because they
evolve at a rate appropriate for resolving more divergent
relationships among major lineages of mammals.
Unless mitochondrial genomes are evolving at rates where
saturation becomes a problem at deeper nodes, one
would expect the inclusion of analytical procedures that
accommodate asymmetries observed for mtDNA [29,39-42], coupled with appropriate placement of the root of
the eutherian tree [30,40,43] and increased taxon sampling [44-47], to result in mitochondrial phylogenies that
are more congruent with the consensus reached by
nuclear genes. For the most part, a consideration of these
factors has improved more recent results, primarily
because model-based analyses of more mitochondrial
genomes were employed [41]. Nevertheless, as with earlier studies employing whole mitochondrial genomes,
Reyes et al. [41] excluded several regions of the genome
prior to analysis with a model that accommodated multiple rates of substitution. For instance, 3rd codon positions,
first positions involving leucine, and the control region
are generally excluded to reduce homoplasy resulting from saturation effects. The ND6 gene, encoded on the L-strand, is
omitted because of presumed differences in constraints
(e.g., base composition) relative to genes encoded on the
H-strand. Finally, ribosomal genes (rRNAs) and transfer
RNAs (tRNAs) are frequently left out, presumably because
they are difficult to align.
It is our contention that exclusion of data is unnecessary if
appropriate model-based analyses are employed. If fast
evolving sites like 3rd codon positions can be appropriately modeled, then there is little reason for excluding
them from a likelihood-based analysis. Similarly, if rRNAs
and tRNAs can be reasonably well aligned with secondary
structure, we see little justification for excluding these
characters. In this paper we provide an analysis of whole
mitochondrial genomes from 89 mammalian taxa and
investigate relationships among major lineages of eutherians. Except for the control region, which is difficult to
align across highly divergent taxa, all sequences were used
in an analysis employing a pseudoreplicate-generated,
site-specific rate model, first proposed by Kjer et al. [48].
Our major goal is to evaluate whether this
model removes the need for a priori exclusion of potentially useful
data, and we base our conclusions on comparison of
results to more extensive studies based on a large panel of
nuclear gene sequences and extensive taxon sampling.
Results
The annotated Nexus file consists of 14,740 nucleotides,
includes 3,783 amino acid characters as well as additional
taxa (not used in this analysis), and is available on Kjer's
website [49]. The Nexus file on the website includes character set definitions ("charsets") that allow the user to
identify and analyze single gene partitions, codon positions, and rate classes separately, and taxon set definitions
("taxsets") that allow the user to evaluate relationships
among specific taxa. The most likely tree from the Bayesian analysis is shown in Fig. 1. This phylogeny reveals
strong support for several major groups of eutherians
including: 1) a monophyletic Afrotheria, a basal clade
containing Proboscidea (elephants), Sirenia (manatees
and dugongs), Hyracoidea (hyraxes), Macroscelidea (elephant shrews), Tubulidentata (aardvarks), and Afrosoricida
(insectivore families Chrysochloridae, or golden moles,
and Tenrecidae, or tenrecs); 2) a monophyletic Xenarthra
sister to Afrotheria; 3) Euarchontoglires represented by
two major clades, one containing the Primates (including
Anthropoidea, Tarsiiformes, and Lemuriformes), with
Dermoptera (flying lemurs) nested inside, and the other
containing a monophyletic Glires (rabbits and rodents);
4) the euarchontan order Scandentia (tree shrews) sister to
the two major groups of Euarchontoglires; 5) Laurasiatheria containing a paraphyletic Eulipotyphla (representing the insectivore families Erinaceidae, Soricidae, and
Talpidae), Chiroptera (bats), Pholidota (pangolins),
Cetartiodactyla (Artiodactyla and Cetacea, or whales and
dolphins), Perissodactyla (horses, rhinos, tapirs), and
Carnivora; 6) a sister-group relationship between Euarchontoglires and Laurasiatheria. In addition to these
major clades, monophyly of Paenungulata (containing
the orders Proboscidea, Sirenia, and Hyracoidea), Tethytheria (Sirenia and Proboscidea), and Cetartiodactyla
(Artiodactyla and Cetacea) with cetaceans sister to hippo
is strongly supported.
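For readers who wish to work with the character set definitions programmatically, the following is a minimal Python sketch of how "charset" statements in a Nexus sets block could be expanded into site lists. It is an illustration only, not part of the original analysis: the function name `parse_charsets` and the charset names `rate6` and `pos3` are hypothetical, and only simple `N`, `A-B`, and `A-B\step` tokens are handled.

```python
import re

def parse_charsets(nexus_text):
    """Return {charset_name: sorted list of 1-based site indices}.

    Assumption: each definition looks like
    'charset NAME = 1-100 205 300-330\\3;' (simple tokens only).
    """
    charsets = {}
    pattern = re.compile(r"charset\s+(\S+)\s*=\s*([^;]+);", re.IGNORECASE)
    for name, spec in pattern.findall(nexus_text):
        sites = set()
        for token in spec.split():
            # token is "N", "A-B", or "A-B\step"
            m = re.fullmatch(r"(\d+)(?:-(\d+))?(?:\\(\d+))?", token)
            if m is None:
                continue  # syntax this sketch does not handle is skipped
            start = int(m.group(1))
            end = int(m.group(2)) if m.group(2) else start
            step = int(m.group(3)) if m.group(3) else 1
            sites.update(range(start, end + 1, step))
        charsets[name] = sorted(sites)
    return charsets

# Hypothetical sets block, for illustration only.
example = r"""
begin sets;
  charset rate6 = 1-5 9;
  charset pos3 = 3-12\3;
end;
"""
sets = parse_charsets(example)
print(sets["rate6"])
print(sets["pos3"])
```

A dictionary of site lists like this can then be used to subset an alignment by gene, codon position, or rate class.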
Table 1 shows the number of characters in each class, the
rescaled consistency indices (RC), the mean model
parameters and rate classes associated with the six partitions. The RC values show that the rate classes are very different in terms of how well the data map onto the tree.
The fastest rate class is C-T rich (80%), just as C-T transitions are the fastest substitution class, while slower rate
classes are much less biased in terms of nucleotide composition (Table 1). Among-site rate variation is most pronounced in the slowest and the fastest rate classes. Figure
2 shows a characterization of the partitions in terms of
codon position and RNAs. RNA sequences tended to be
conservative, and in terms of rates were similar to 2nd
codon positions of protein-encoding genes. As expected,
3rd codon positions were associated with the faster rate
classes, although a portion of 3rd positions evolved slowly
(approximately 200 in rate classes 3–6). There were more
parsimony-informative RNA characters (786), as well as
first and second codon position characters (1928) in the
"fast" rate class 2, than in rate class 6 (the slowest; 197 parsimony-informative rRNA sites, and 258 parsimony-informative 1st and 2nd codon sites). There were about
the same number of variable RNA characters in rate class
6 (532) as there were second codon sites (543). We note
that many 1st and 2nd codon sites are fast-evolving (2,206
in the fastest two rate classes), and 186 parsimony-informative (of 1800) RNA characters that have been discarded from other analyses are members of the slowest
rate class, which is comparable to 131 (of 2541) parsimony-informative second codon positions in rate class 6.
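The rescaled consistency index reported in Table 1 is the product of the consistency index (CI) and the retention index (RI). As a worked illustration, the short Python sketch below computes it for a single character from its minimum possible, observed, and maximum possible numbers of steps. The function name and the example step counts are hypothetical, and returning 0 when RI is undefined is an assumption made for this sketch.

```python
def rescaled_consistency(min_steps, obs_steps, max_steps):
    """RC = CI * RI for one character.

    CI = min_steps / obs_steps
    RI = (max_steps - obs_steps) / (max_steps - min_steps)
    Assumption for this sketch: characters for which CI or RI is
    undefined (no observed change, or max == min) are given RC = 0.
    """
    if obs_steps == 0 or max_steps == min_steps:
        return 0.0
    ci = min_steps / obs_steps
    ri = (max_steps - obs_steps) / (max_steps - min_steps)
    return ci * ri

# Hypothetical character: at least 1 step, 4 observed on the tree, at most 10.
rc_noisy = rescaled_consistency(1, 4, 10)
# Hypothetical character that fits the tree perfectly.
rc_clean = rescaled_consistency(1, 1, 10)
print(round(rc_noisy, 3), rc_clean)
```

Characters with much homoplasy receive values near 0 and characters that fit the tree well receive values near 1, which matches the trend in RC from rate class 1 to rate class 6 in Table 1.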
Discussion
This analysis shows that third codon positions, redundant
first codon (leucine) positions, the ND6, and the RNA
genes can be included in a combined model-based analysis without drastically contradicting the general consensus
from previous molecular studies. In fact, all benchmark
clades for eutherian mammals that could be compared to
the list provided by Springer et al. [36] were retrieved in
our analysis and received high support. These benchmark
clades include (all posterior probabilities 100%): 1) Carnivora (Feliformia + Caniformia); 2) Cetacea (toothed
whales and dolphins + baleen whales); 3) Cetartiodactyla
(Artiodactyla + Cetacea); 4) Chiroptera (bats); 5) Diprotodontia (wombats, wallaroos, and brush-tailed possums); 6) Paenungulata (hyrax + elephants and Sirenia);
7) Perissodactyla (rhino and tapir + horses); 8) Ruminantia
(bovines, sheep, deer); and 9) Xenarthra (armadillo +
tamandua). The mitochondrial genome-based phylogeny
shown in Fig. 1 is congruent with previous nuclear gene
studies [32-34,50] in several respects. Although placement of the root varies among studies [51], the nuclear
gene studies and our study place the groups Afrotheria
and Xenarthra at the base of the eutherian phylogeny followed by a sister-group relationship between the monophyletic groups Euarchontoglires and Laurasiatheria
(collectively called the Boreoeutheria). Several other
monophyletic groups appear to be well-supported and
congruent between our mtDNA and previous nuclear
Figure 1: Most likely phylogram derived from the Bayesian analysis (-ln 533753.675). Numerals indicate estimated posterior probability.
These values are either placed on top of the node they represent (or with arrows pointing to the top of the internode) or
directly to the left of the node. Nodes without numerals are supported at 100%. Higher taxa are indicated either on top of
their representative internode, directly to the left of the node or to the right of the clade, and are delimited with brackets.
Scale bar: 0.1.
Table 1: Mean model parameters and six character partitions and rate classes

Partition  1                2                3                4                5                6
Character  1460             5138             1585             241              41               6275
Const.     0                0                0                0                0                4719
Inform     1460             5138             1585             241              41               459
RC         0.02             0.048            0.172            0.332            0.448            0.818
r(A<->C)   1E-05 ± 4E-05    0.26 ± 0.001     0.131 ± 0.005    0.113 ± 0.011    0.060 ± 0.026    0.124 ± 0.008
r(A<->G)   0.833 ± 0.107    0.444 ± 0.006    0.307 ± 0.010    0.235 ± 0.107    0.200 ± 0.066    0.299 ± 0.012
r(A<->T)   0.008 ± 0.007    0.042 ± 0.001    0.092 ± 0.004    0.129 ± 0.011    0.066 ± 0.029    0.110 ± 0.006
r(C<->G)   5E-05 ± 8E-05    0.021 ± 0.001    0.063 ± 0.005    0.118 ± 0.015    0.224 ± 0.070    0.128 ± 0.009
r(C<->T)   0.137 ± 0.101    0.280 ± 0.006    0.341 ± 0.010    0.284 ± 0.101    0.230 ± 0.064    0.274 ± 0.011
r(G<->T)   0.022 ± 0.003    0.186 ± 0.003    0.066 ± 0.004    0.121 ± 0.013    0.221 ± 0.071    0.065 ± 0.005
pi(A)      0.18 ± 0.04      0.44 ± 0.00      0.31 ± 0.01      0.32 ± 0.02      0.47 ± 0.09      0.25 ± 0.00
pi(C)      0.44 ± 0.02      0.29 ± 0.00      0.21 ± 0.01      0.21 ± 0.01      0.24 ± 0.05      0.23 ± 0.00
pi(G)      0.03 ± 0.01      0.06 ± 0.00      0.18 ± 0.01      0.19 ± 0.01      0.09 ± 0.03      0.21 ± 0.00
pi(T)      0.36 ± 0.02      0.21 ± 0.00      0.30 ± 0.01      0.28 ± 0.02      0.21 ± 0.04      0.32 ± 0.01
alpha      0.623 ± 0.170    0.932 ± 0.012    3.361 ± 0.260    42.83 ± 5.770    27.39 ± 615.515  0.879 ± 0.100
m          5.76 ± 0.41      1.17 ± 0.11      0.15 ± 0.01      0.11 ± 0.01      0.31 ± 0.40      0.01 ± 0.00

"Character" refers to the number of characters in a partition. "Const." is the number of constant (invariant) sites, and "Inform" is the number of
parsimony-informative sites. "RC" is the rescaled consistency index. The next six lines are the values from the rmatrix, followed by the frequencies
of each of the nucleotides. "Alpha" is the shape parameter from the gamma distribution, and "m" refers to the relative rates among partitions. Rates
decrease from class 1 (fastest) to class 6 (slowest).
DNA studies including Paenungulata (Hyracoidea, Sirenia, and Proboscidea), Cetartiodactyla (Artiodactyla and
Cetacea), Chiroptera, and Glires (Lagomorpha and
Rodentia).
Although several groups are identified by both our whole
mitochondrial genome analysis and nuclear genes, not all
of these molecularly-defined groups are necessarily congruent with morphological data. For instance, some morphological studies support a monophyletic Archonta
containing the euarchontans as well as Chiroptera [9,10],
and although a relationship between the orders Artiodactyla and Cetacea has support from morphology, a sister-group relationship between Cetacea and the family Hippopotamidae (hippos) denoted by both nuclear genes
and mitochondrial genomes [52] is supported by some
[53] but not all morphological analyses [54,55]. Some
earlier morphological comparisons [9], but none of the
molecular data, support Volitantia, a group containing
Chiroptera and Dermoptera. More recent molecular studies, including the one presented here, have indicated paraphyly for the chiropteran suborder Microchiroptera with
the family Rhinolophidae grouping closer to the Megachiroptera, a clade containing non-echolocating taxa [56-58],
and this is not corroborated by morphological data.
Our phylogenetic results are similar to those presented by
Reyes et al. [41], which was based on a GTR+I+G Bayesian
analysis that excluded RNAs, ND6, and redundant codon
positions. Gibson et al. [39] also showed that there were
lineage and gene specific biases of C and T compositions,
and performed an analysis with a model that reduced the
character complexity of these nucleotides to Y, creating a
three-state model. While Gibson et al. [39] included
RNAs, they also excluded third codon positions and the
ND6, resulting in a dataset of 7,402 sites. While we agree
with the corrections proposed by both Gibson et al. [39]
and Reyes et al. [41] in reducing the influence of homoplastic and biased characters, our approach differed in
including a site specific rate model that rendered noisy
sites less influential at deeper nodes, while retaining them
as characters toward the tips of the tree. Our matrix is
nearly twice the size of the largest previous analyses. In
performing the pseudoreplicate reweighting, the noisiest
sites are presumably identified and accommodated in a
model. Many different partitions, including those that
were excluded by others, can be explored by downloading
the Nexus file and including specific "charsets" such as the
ND6. For example, a parsimony analysis of the ND6 gene
results in the recovery of therians, metatherians, eutherians, anthropoid primates (in the same order as the combined analysis), whales, and carnivores, among other
groups (not shown). Clearly, the ND6 contains some
non-random signal, including 26% of its 535 nucleotides
in rate class 6 (the slowest).
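The three-state recoding described above for Gibson et al. [39], in which C and T are collapsed to the pyrimidine code Y, can be illustrated with a few lines of Python. This is a minimal sketch of the general idea and not their implementation; the function name `recode_ct_to_y` and the example sequence are hypothetical.

```python
# Translation table: C and T (either case) become Y/y; all else is unchanged.
_CT_TO_Y = str.maketrans({"C": "Y", "T": "Y", "c": "y", "t": "y"})

def recode_ct_to_y(sequence):
    """Collapse C and T into Y, leaving A, G, gaps, and other symbols as-is."""
    return sequence.translate(_CT_TO_Y)

# Hypothetical aligned fragment with a gap and an ambiguous base.
recoded = recode_ct_to_y("ACGTTC-N")
print(recoded)
```

After recoding, C-T transitions are no longer visible as character change, which removes that class of homoplasy at the cost of the signal those sites carry near the tips of the tree.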
The trees in our analysis of the combined data differ from
others in the placement of Xenarthra; ours with Afrotheria
(Fig. 1), supporting a northern-southern hemisphere split,
and Gibson et al. [39] and Reyes et al. [41] with Euarchontoglires. Note, both this analysis and the analysis of Gibson et al. [39] compensate for the large number of
homoplastic C-T transitions but in different ways. Kriegs
et al. [59], using retrotransposed elements (which they
suppose to be "homoplasy free"), supported Xenarthra as
the sister taxon of the rest of Eutheria. While we agree that
Figure 2: Rate Classes and Partition of Variable Sites. Top: A visualization with pie graphs of the proportion of sites in each rate-class
partition that are RNAs (white), first codon positions (light grey), second codon positions (dark grey), and third codon positions (black). Rate classes are listed across the top, from fastest (class 1) to slowest (class 6). Bottom: A bar-graph visualization of the numbers of each of these classes among partitions, using the same color coding, as indicated in the key. Constant
sites, found only in rate class six, are indicated with hatched bars. Raw numbers of each of the values in the bar graph are given
below the bars. Fifteen sites from the origin belong in rate class 6, one in rate class 4, and two in rate class 3 (not shown).
the two retrotransposed elements supporting this relationship are exceedingly strong characters, we prefer to
consider the independent loss of these in the sloth and the
armadillo as "possible but unlikely." The rest of Kriegs et
al.'s [59] conclusions are supported by our analysis. The
placement of Manis (pangolin) also differs between this
hypothesis and those of Gibson et al. [39] and Reyes et al. [41].
Although we show 100% posterior probability for our
hypothesis, we also note the exceedingly short branch
length of the internode placing Manis as the sister taxon to
(Cetartiodactyla(Perissodactyla(Carnivora))). Lewis et al.
[60] describe conditions under which Bayesian posterior
probabilities may be inflated, and we have not corrected
for potentially inflated support for our placement of both
Manis and Xenarthra. The placement of Xenarthra with
Afrotheria and the position of Manis in our phylogenetic
hypothesis are congruent with Hudelot et al. [31], who
used a 7-state doublet model to accommodate paired
RNA sites. Similarities between this study and that of Hudelot et
al. [31] could be attributed to the inclusion of RNAs in
both studies, while differences are more likely due to differences between models.
Finally, the mitochondrial genome data, even after inclusion of all sequences and a model that incorporates multiple rate classes, reveal several anomalies that are not
congruent with recent nuclear gene phylogenies. Some
particular anomalies appear to be inherent to all mitogenomic analyses [26,28,39,41], regardless of either taxon
sampling or the phylogenetic methods employed. Rather
than a monophyletic Primates, as revealed by nuclear
genes, our analyses as well as previous mitochondrial phylogenies indicate a paraphyletic Primates with the order
Dermoptera (flying lemurs) sister to anthropoid primates
(monkeys, lesser and great apes) to the exclusion of the
other primate lineages such as tarsiers and prosimians
(lemurs). Monophyly of the insectivore group Eulipotyphla, containing the families Erinaceidae, Soricidae, and
Talpidae, is supported by nuclear gene phylogenies [3234,61] but not by mitochondrial data, which in our case
indicates eulipotyphlan diphyly with the Erinaceidae
(hedgehogs) at the base of the Laurasiatheria clade. The
order Scandentia (tree shrews) is generally considered sister to either Dermoptera or Primates based on recent
molecular and morphological data [10,33,34,50],
whereas mitogenomic analyses place scandentians at the
base of Euarchontoglires. Additionally, mitochondrial
data support a monophyletic Tethytheria (elephants and
manatees), whereas the more recent nuclear studies [34]
do not, and although recent molecular data [62] place
marsupial moles (Notoryctes) as part of a monophyletic
group (Australidelphia) confined to Australia, our analysis places them basal to other lineages of Metatheria.
Persistent incongruence between mitochondrial and
nuclear gene phylogenies relative to the placement of
some mammalian lineages may have more than one
explanation. Long-branch attraction is often used as an
explanation for misplacement of taxa [63,64], and many
of the ambiguous placements involve lineages with longer
branches (Fig. 1). As indicated by Bergsten [63], outgroups can often influence placement of ingroup taxa,
which may be the case for the position of the marsupial
mole. Increased taxon sampling and the incorporation of
maximum likelihood models for mitogenomic analyses
[63] did remove the Erinaceidae from a basal position in
the placental phylogeny to one associated with the Laurasiatheria. Nevertheless, these modifications do not result
in a monophyletic Eulipotyphla, as suggested by nuclear
genes. In the case of the placement of Dermoptera, there
is no apparent reason to consider this as the result of
either long branches or branch support from character
partitions in the higher rate classes. Schmitz et al. [28] suggested that the association between dermopteran and anthropoid primate mitochondrial sequences is the result of
similarities in nucleotide and amino acid composition.
However, Hudelot et al. [31] recovered a monophyletic
Primates with their doublet model, with the flying lemur
as its sister taxon, despite similarities in nucleotide composition at third positions between the flying lemur and
Anthropoidea. Finally, if these areas of incongruence are
the result of similarities in base composition, covariotide/
covarion effects, or some other source of heterogeneity
[64], it may very well be that no existing model adequately corrects for all anomalies observed for the mammalian mitochondrial genome.
Conclusion
Although some incongruence still remains between phylogenies derived from mitochondrial and nuclear
sequences, our results indicate that the exclusion of data is
not necessary for an effective reconstruction of eutherian
relationships (although we still excluded the control
region and unalignable RNA sites). Rather, selection of an
appropriate model that accommodates rate heterogeneity
across data partitions and proper treatment of RNA genes
can yield information highly congruent with more extensive nuclear sequences, even when addressing the deepest
nodes of the eutherian phylogeny. And while we are using
"expected" clades to support our conclusions, we note
that we are not using phylogenetic expectations as a
rationale to exclude data, as is often the case, but rather to
retain data. Arguments to retain data should be met with
a lower burden of proof than arguments to exclude data.
Methods
Mitochondrial genomes were downloaded from GenBank. A Nexus file was constructed, with each block in the
file corresponding to either one gene or a block of data
between 100 and 150 nucleotides for manually aligned
rRNAs (the number of nucleotides that are visible on one
computer screen without scrolling). Nucleotides between
genes were manually aligned, and unaligned regions were
placed between brackets (which eliminates them from the
dataset, while retaining them for visual inspection).
Ribosomal RNAs and tRNAs were aligned manually with
reference to secondary structure, according to recommendations of Kjer [65] and Gutell et al. [66]. Models for
rRNA secondary structure came from the Comparative
RNA Web (CRW) Site [67]. The control region was eliminated. All other genes and codon positions were included.
Genes encoded on the reverse strand were reverse-complemented.
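Reverse-complementing a gene encoded on the opposite strand can be sketched in Python as follows. This is an illustration only; the function name `reverse_complement` is hypothetical, and the treatment of IUPAC ambiguity codes, gaps, and N is an assumption of this sketch rather than a description of the tools used here.

```python
# Complement table for A, C, G, T and the two-letter/three-letter IUPAC codes.
# Symbols not listed (gaps, N, S, W) are their own complement and pass through.
_COMPLEMENT = str.maketrans(
    "ACGTRYKMBDHVacgtrykmbdhv",
    "TGCAYRMKVHDBtgcayrmkvhdb",
)

def reverse_complement(sequence):
    """Return the reverse complement of a DNA sequence string."""
    return sequence.translate(_COMPLEMENT)[::-1]

# Hypothetical fragments, for illustration only.
rc_plain = reverse_complement("AAC")
rc_gapped = reverse_complement("ATGC-N")
print(rc_plain, rc_gapped)
```

Applying the function twice returns the original sequence, which is a convenient sanity check before a reversed gene is added to an alignment.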
A site specific rate model was constructed according to
Kjer et al. [48]. Briefly, a fast heuristic bootstrap analysis,
with 1000 replicates, was completed in PAUP, having
saved one tree per replicate. The characters were then separated into 6 discrete rate classes by first selecting the
"reweight characters" option in PAUP, according to the
"best" CI from among the 1000 bootstrap-generated trees.
By selecting "view character weights," and editing the
resultant output, we constructed a file in Microsoft Excel
that was sorted according to the weights, and then reimported into the Nexus file to construct 6 partitions or
"charsets" from fastest to slowest. These charsets were
then used in a partitioned Bayesian analysis, with each
partition free to vary according to its own GTR + gamma
model.
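The sorting step described above (per-site weights exported, sorted, and turned into six "charsets") can be sketched in Python as follows. This is a hypothetical illustration, not the spreadsheet procedure that was actually used: the function names, the example weights, and in particular the five cut points are assumptions, since the text does not state how the class boundaries were chosen.

```python
from bisect import bisect_right

def assign_rate_classes(weights, cutoffs):
    """Map each per-site weight to a rate class.

    weights: per-site weights in [0, 1]; low weight = much homoplasy = fast.
    cutoffs: ascending cut points (hypothetical); 5 cut points give 6 classes.
    Returns class labels from 1 (fastest) to len(cutoffs) + 1 (slowest).
    """
    return [bisect_right(cutoffs, w) + 1 for w in weights]

def charset_lines(classes):
    """Format one Nexus-style charset statement per rate class (1-based sites)."""
    lines = []
    for k in sorted(set(classes)):
        sites = [str(i + 1) for i, c in enumerate(classes) if c == k]
        lines.append(f"charset rate{k} = {' '.join(sites)};")
    return lines

# Hypothetical weights for seven sites and hypothetical cut points.
weights = [0.01, 0.10, 0.25, 0.40, 0.60, 1.0, 1.0]
cutoffs = [0.03, 0.12, 0.28, 0.42, 0.70]
classes = assign_rate_classes(weights, cutoffs)
lines = charset_lines(classes)
print("\n".join(lines))
```

Each resulting charset would then be assigned its own GTR + gamma model in the partitioned analysis.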
Each Bayesian analysis was performed with three heated chains and one
cold chain. Burnin periods were determined graphically
from the MrBayes .p files, viewed in Excel. The
first set of two independent Bayesian analyses was run for
7.5 million generations in MrBayes 3.0 [68]. Since the
likelihood scores from these two runs were not the
same, another pair of analyses was conducted in MrBayes
3.1 [68]. This analysis was terminated by a power failure after 5 million generations. However, these runs had stabilized on the same likelihood plateau, which was the
same as the better of two earlier runs of 7.5 million. Therefore, after discarding the burnin, trees from all three optimal analyses were pooled into a single tree file, from
which a majority rule consensus was used to visualize posterior probability values. The best tree was visualized with
Treeview [69], and the likelihood phylogram was
exported as a pict file for modification.
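The comparison of likelihood plateaus among runs was made graphically, but the same check can be sketched numerically. The Python below is a hedged stand-in, not the authors' procedure: it assumes a tab-delimited .p file with an optional bracketed ID line followed by a header containing `Gen` and `LnL` columns, and the function names and the tolerance `tol` are hypothetical.

```python
import csv
import io
from statistics import mean
from typing import List, Tuple

Trace = List[Tuple[int, float]]


def read_lnl_trace(p_text: str) -> Trace:
    """Parse (generation, lnL) pairs from the text of a MrBayes .p file."""
    # Skip blank lines and the bracketed ID line that precedes the header.
    rows = [ln for ln in p_text.splitlines()
            if ln.strip() and not ln.startswith("[")]
    reader = csv.DictReader(io.StringIO("\n".join(rows)), delimiter="\t")
    return [(int(row["Gen"]), float(row["LnL"])) for row in reader]


def post_burnin_mean(trace: Trace, burnin_gen: int) -> float:
    """Mean lnL of the samples taken after generation burnin_gen."""
    kept = [lnl for gen, lnl in trace if gen > burnin_gen]
    if not kept:
        raise ValueError("no samples remain after the burn-in")
    return mean(kept)


def same_plateau(a: Trace, b: Trace, burnin_gen: int, tol: float = 5.0) -> bool:
    """True if two runs have post-burn-in mean lnL within tol of each other.

    tol is an arbitrary illustrative threshold, not a value from the paper.
    """
    diff = post_burnin_mean(a, burnin_gen) - post_burnin_mean(b, burnin_gen)
    return abs(diff) <= tol
```

Under these assumptions, runs for which `same_plateau` returns true would be the candidates for pooling after the burn-in samples are discarded; a plot of the trace remains the more informative check.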
Authors' contributions
KMK collected genome sequences from GenBank, aligned
sequences, and performed initial analyses. RLH provided
a detailed comparison of the new phylogeny to previous
phylogenetic hypotheses for mammalian relationships
and interpreted results relative to ideas concerning the
evolution of mammals.
Acknowledgements
We thank Kenneth (Tripp) MacDonald, William J. Murphy, and two anonymous reviewers for helpful comments on the manuscript. KMK acknowledges financial support from NSF DEB 0423834 and the New Jersey
Agricultural Experiment Station, and RLH thanks Pepperdine University for
defraying costs of publication.
References
1. Osborn HF: The Age of Mammals in Europe, Asia and North America. New York: MacMillan; 1910.
2. Simpson GG: The principles of classification and a classification of mammals. Amer Mus Nat Hist Bull 1945, 85:1-350.
3. Wilson DE, Reeder DM: Mammal Species of the World: A Taxonomic and Geographic Reference. Washington DC: Smithsonian Institution Press; 1993.
4. Simpson GG: Tempo and Mode in Evolution. New York: Columbia University Press; 1944.
5. Li W-H, Ellsworth DL, Krushkal J, Chang BH-J, Hewett-Emmett D: Rates of nucleotide substitution in primates and rodents and the generation-time effect hypothesis. Mol Phylogenet Evol 1996, 5:182-187.
6. Martin AP, Palumbi SR: Body size, metabolic rate, generation time and the molecular clock. Proc Natl Acad Sci USA 1993, 90:4087-4091.
7. Springer MS: Molecular clocks and the timing of the placental and marsupial radiations in relation to the Cretaceous-Tertiary boundary. J Mammal Evol 1997, 4:285-302.
8. McKenna MC, Bell SK: Classification of Mammals: Above the Species Level. New York: Columbia University Press; 1997.
9. Novacek MJ, Wyss AR, McKenna MC: The major groups of eutherian mammals. In The Phylogeny and Classification of the Tetrapods. Edited by: Benton MJ. Oxford: Clarendon Press; 1988:31-71.
10. Novacek MJ: Mammal phylogenies: shaking the tree. Nature 1992, 356:121-125.
11. de Jong WW: Molecules remodel the mammalian tree. Trends Ecol Evol 1998, 13:270-275.
12. Honeycutt RL, Adkins RM: Higher level systematics of eutherian mammals: an assessment of molecular characters and phylogenetic hypotheses. Ann Rev Ecol Syst 1993, 24:279-305.
13. Springer MS, Stanhope MJ, Madsen O, de Jong WW: Molecules consolidate the placental mammal tree. Trends Ecol Evol 2004, 19:430-438.
14. Luckett W, Hartenberger J-L: Monophyly or polyphyly of the order Rodentia: possible conflict between morphological and molecular interpretations. J Mammal Evol 1993, 1:127-147.
15. Graur D, Hide W, Li W-H: Is the guinea-pig a rodent? Nature 1991, 351:649-652.
16. Graur D, Hide W, Zharkikh AA, Li W-H: The biochemical phylogeny of guinea pigs and gundis and the paraphyly of the order Rodentia. Comp Biochem Physiol B 1992, 101:495-498.
17. Li W-H, Hide WA, Zharkikh A, Ma D-P, Graur D: The molecular taxonomy and evolution of the guinea pig. J Heredity 1992, 83:174-181.
18. D'Erchia AM, Gissi C, Pesole G, Saccone C, Arnason U: The guinea-pig is not a rodent. Nature 1996, 381:597-600.
19. Reyes A, Pesole G, Saccone C: Complete mitochondrial DNA sequence of the fat dormouse, Glis glis: further evidence of rodent paraphyly. Mol Biol Evol 1998, 15:499-505.
20. Reyes A, Gissi C, Pesole G, Catzeflis FM, Saccone C: Where do rodents fit? Evidence from the complete mitochondrial genome of Sciurus vulgaris. Mol Biol Evol 2000, 17:979-983.
21. Novacek MJ: Cranial evidence for rodent affinities. In Evolutionary Relationships Among Rodents: A Multidisciplinary Analysis. Edited by: Luckett WP, Hartenberger JL. New York: Plenum Press; 1985:59-81.
22. Graur D, Duret L, Gouy M: Phylogenetic position of the order Lagomorpha (rabbits, hares and allies). Nature 1996, 379:333-335.
23. Mouchaty SK, Gullberg A, Janke A, Arnason U: The phylogenetic position of the Talpidae within Eutheria based on analysis of complete mitochondrial sequences. Mol Biol Evol 2000, 17:60-67.
24. Arnason U, Gullberg A, Janke A: Phylogenetic analyses of mitochondrial DNA suggest a sister group relationship between Xenarthra (Edentata) and ferungulates. Mol Biol Evol 1997, 14:762-768.
25. Janke A, Xu X, Arnason U: The complete mitochondrial genome of the wallaroo (Macropus robustus) and the phylogenetic relationship among Monotremata, Marsupialia, and Eutheria. Proc Natl Acad Sci USA 1997, 94:1276-1281.
26. Arnason U, Adegoke JA, Bodin K, Born EW, Esa YB, Gullberg A, Nilsson M, Short RV, Xu X, Janke A: Mammalian mitogenomic relationships and the root of the eutherian tree. Proc Natl Acad Sci USA 2002, 99:8151-8156.
27. Allard MW, Honeycutt RL, Novacek MJ: Advances in higher level mammalian relationships. Cladistics 1999, 15:213-219.
28. Schmitz J, Ohme M, Suryobroto B, Zischler H: The colugo (Cynocephalus variegatus, Dermoptera): the primates' gliding sister? Mol Biol Evol 2002, 19:2308-2312.
29. Schmitz J, Ohme M, Zischler H: The complete mitochondrial sequence of Tarsius bancanus: evidence for an extensive nucleotide compositional plasticity of primate mitochondrial DNA. Mol Biol Evol 2002, 19:544-553.
30. Douzery EJP, Huchon D: Rabbits, if anything, are likely Glires. Mol Phylogenet Evol 2004, 33:922-935.
31. Hudelot C, Gowri-Shankar V, Jow H, Rattray M, Higgs PG: RNA-based phylogenetic methods: application to mammalian RNA sequences. Mol Phylogenet Evol 2003, 28:241-252.
32. Madsen O, Scally M, Douady CJ, Kao DJ, DeBry RW, Adkins R, Amrine HM, Stanhope MJ, de Jong WW, Springer MS: Parallel adaptive radiations in two major clades of placental mammals. Nature 2001, 409:610-614.
33. Murphy WJ, Eizirik E, Johnson WE, Zhang YP, Ryder OA, O'Brien SJ: Molecular phylogenetics and the origins of placental mammals. Nature 2001, 409:614-618.
34. Murphy WJ, Eizirik E, O'Brien SJ, Madsen O, Scally M, Douady CJ, Teeling E, Ryder OA, Stanhope MJ, de Jong WW, Springer MS: Resolution of the early placental mammal radiation using Bayesian phylogenetics. Science 2001, 294:2348-2351.
35. Springer MS, Cleven GC, Madsen O, de Jong WW, Waddell VG, Amrine HM, Stanhope MJ: Endemic African mammals shake the phylogenetic tree. Nature 1997, 388:61-64.
36. Springer MS, DeBry RW, Douady C, Amrine HM, Madsen O, de Jong WW, Stanhope MJ: Mitochondrial versus nuclear gene sequences in deep-level mammalian phylogeny reconstruction. Mol Biol Evol 2001, 18:132-143.
37. Stanhope MJ, Waddell VG, Madsen O, de Jong WW, Hedges SB: Molecular evidence for multiple origins of Insectivora and for a new order of endemic African insectivore mammals. Proc Natl Acad Sci USA 1998, 95:9967-9972.
38. Waddell PJ, Shelley S: Evaluating placental inter-ordinal phylogenies with novel sequences including RAG1, γ-fibrinogen, ND6, and mt-tRNA, plus MCMC-driven nucleotide, amino acid, and codon models. Mol Phylogenet Evol 2003, 28:197-224.
39. Gibson A, Gowri-Shankar V, Higgs PG, Rattray M: A comprehensive analysis of mammalian mitochondrial genome base composition and improved phylogenetic methods. Mol Biol Evol 2005, 22:251-264.
40. Penny D, Hasegawa M: The platypus put in its place. Nature 1997, 387:549-550.
41. Reyes A, Gissi C, Catzeflis F, Nevo E, Pesole G, Saccone C: Congruent mammalian trees from mitochondrial and nuclear genes using Bayesian methods. Mol Biol Evol 2004, 21:397-403.
42. Sullivan J, Swofford DL: Are guinea pigs rodents? The importance of adequate models in molecular phylogenetics. J Mammal Evol 1997, 4:77-86.
43. Phillips MJ, Penny D: The root of the mammalian tree inferred from whole mitochondrial genomes. Mol Phylogenet Evol 2003, 28:171-185.
44. Delsuc F, Scally M, Madsen O, Stanhope MJ, de Jong WW, Catzeflis FM, Springer MS, Douzery EJP: Molecular phylogeny of living xenarthrans and the impact of character and taxon sampling on the placental tree rooting. Mol Biol Evol 2002, 19:1656-1671.
45. Halanych KM: Lagomorphs misplaced by more characters and fewer taxa. Syst Biol 1998, 47:138-146.
46. Lin YH, McLenachan PA, Gore AR, Phillips MJ, Ota R, Hendy MD, Penny D: Four new mitochondrial genomes and the increased stability of evolutionary trees of mammals from improved taxon sampling. Mol Biol Evol 2002, 19:2060-2070.
47. Lin YH, Waddell P, Penny D: Pika and vole mitochondrial genomes increase support for both rodent monophyly and Glires. Gene 2002, 294:119-129.
48. Kjer KM, Blahnik RJ, Holzenthal RW: Phylogeny of Trichoptera (Caddisflies): characterization of signal and noise within multiple datasets. Syst Biol 2001, 50:781-816.
49. Phylogenetic Datasets [http://www.rci.rutgers.edu/~insects/pdata.htm]
50. Springer MS, Stanhope MJ, Madsen O, de Jong WW: Molecules consolidate the placental mammal tree. Trends Ecol Evol 2005, 19:430-438.
51. Asher RJ, Novacek MJ, Geisler JH: Relationships of endemic African mammals and their fossil relatives based on morphological and molecular evidence. J Mammal Evol 2003, 10:131-194.
52. Ursing BM, Arnason U: Analyses of mitochondrial genomes strongly support a hippopotamus-whale clade. Proc R Soc London [Biol] 1998, 265:2251-2255.
53. Geisler JH, Uhen MD: Morphological support for a close relationship between hippos and whales. J Vertebrate Paleont 2003, 23:991-996.
54. O'Leary MA, Geisler JH: The position of Cetacea within Mammalia: phylogenetic analysis of morphological data from extinct and extant taxa. Syst Biol 1999, 48:455-490.
55. Theodor JM: Molecular clock divergence estimates and the fossil record of Cetartiodactyla. J Paleont 2004, 78:39-44.
56. Springer MS, Teeling EC, Madsen O, Stanhope MJ, de Jong WW: Integrated fossil and molecular data reconstruct bat echolocation. Proc Natl Acad Sci USA 2001, 98:6241-6246.
57. Teeling EC, Madsen O, van den Bussche RA, de Jong WW, Stanhope M: Microbat paraphyly and the convergent evolution of a key innovation in Old World rhinolophoid microbats. Proc Natl Acad Sci USA 2002, 99:1431-1436.
58. Teeling EC, Springer MS, Madsen O, Bates P, O'Brien SJ: A molecular phylogeny for bats illuminates biogeography and the fossil record. Science 2005, 307:580-584.
59. Kriegs JO, Churakov G, Kiefmann M, Jordan U, Brosius J, Schmitz J: Retrotransposed elements as archives for the evolutionary history of placental mammals. PLoS Biology 2006, 4:e91.
60. Lewis PO, Holder MT, Holsinger KE: Polytomies and Bayesian phylogenetic inference. Syst Biol 2005, 54:241-53.
61. Amrine-Madsen H, Koepfli K-P, Wayne RK, Springer MS: A new phylogenetic marker, apolipoprotein B, provides compelling evidence for eutherian relationships. Mol Phylogenet Evol 2003, 28:225-240.
62. Amrine-Madsen H, Scally M, Westerman M, Stanhope MJ, Krajewski C, Springer MS: Nuclear gene sequences provide evidence for the monophyly of australidelphian marsupials. Mol Phylogenet Evol 2003, 28:186-196.
63. Bergsten J: A review of long-branch attraction. Cladistics 2005, 21:163-193.
64. Sanderson MJ, Shaffer HB: Troubleshooting molecular phylogenetic analyses. Ann Rev Ecol Syst 2002, 33:49-72.
65. Kjer KM: Use of rRNA secondary structure in phylogenetic studies to identify homologous positions: an example of alignment and data presentation from the frogs. Mol Phylogenet Evol 1995, 4:314-330.
66. Gutell RR, Larsen N, Woese CR: Lessons from an evolving rRNA: 16S and 23S rRNA structures from a comparative perspective. Microbiol Rev 1994, 58:10-26.
67. Cannone JJ, Subramanian S, Schnare MN, Collett JR, D'Souza LM, Du Y, Feng B, Lin N, Madabusi LV, Müller KM, Pande N, Shang Z, Yu N, Gutell RR: The comparative RNA web (CRW) site: an online database of comparative sequence and structure information for ribosomal, intron, and other RNAs. BMC Bioinformatics 2002, 3:2. [Correction: BMC Bioinformatics 2002, 3:15.]
68. Ronquist F, Huelsenbeck JP: MrBayes 3: Bayesian phylogenetic inference under mixed models. Bioinformatics 2003, 19:1572-1574.
69. Page RD: TreeView: an application to display phylogenetic trees on personal computers. Comp Appl BioScience 1996, 12:357-358.