Package ‘iteRates’
February 20, 2015
Type Package
Title Parametric rate comparison
Version 3.1
Date 2012-12-03
Author Premal Shah, Benjamin Fitzpatrick, James Fordyce
Maintainer Ben Fitzpatrick <benfitz@utk.edu>
Description Iterates through a phylogenetic tree to identify regions
of rate variation using the parametric rate comparison test.
License GPL (>= 3)
LazyLoad yes
Depends partitions, stats, VGAM, MASS, ape, apTreeshape, geiger,
gtools
NeedsCompilation no
Repository CRAN
Date/Publication 2013-05-03 21:40:36
R topics documented:
iteRates-package
color.tree.plot
comp.fit.subs
comp.subs
FP.comp.subs
id.subtrees
tab.summary
tree.na.Count
tree.rand.test
trimTree
iteRates-package
iteRates
Description
Iterates through a phylogenetic tree to identify regions of rate variation using the parametric rate
comparison test.
Details
Package: iteRates
Type: Package
Version: 3.0
Date: 2011-05-24
License: GPL 3.0
LazyLoad: yes
The user provides a phylogenetic tree of object class phylo. The package will iterate through all
useable subtrees and identify regions of the tree with different rates of diversification using the
parametric rate comparison test.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
Maintainer: Ben Fitzpatrick <benfitz@utk.edu>
color.tree.plot
color.tree.plot
Description
This function plots phylogenetic trees on the current graphical device and indicates potential regions
of the tree that might have undergone a shift in diversification rate.
Usage
color.tree.plot(out, tree, p.thres = 1, evid.thres=0, PorE=1, show.node.label = FALSE,
NODE = TRUE, PADJ = NULL, scale = 1, col.rank = TRUE, breaks = 50, ...)
Arguments
out
the output object from comp.subs.
tree
an object of class "phylo" used in the comp.subs analysis.
p.thres
a numeric between 0 and 1 setting the threshold to plot rate-shifts with p-value<=p.thres.
Default is 1.0.
evid.thres
a numeric setting the threshold to plot rate-shifts with evidence ratio >=evid.thres.
Default is 0.
PorE
a switch indicating whether rate-shifts are flagged based on the p-value (PorE=1)
or the evidence ratio (the source also gives PorE=1 here, apparently a typo for a different value).
show.node.label
a logical indicating whether the node labels need to be plotted with the tree.
Default is FALSE.
NODE
a logical switch between identifying rate-shifts on trees by coloring "nodes" or
"branches". Default is TRUE.
PADJ
a character vector to adjust p-values from comp.subs for multiple comparison.
Options are identical to the ones in p.adjust in the stats package including
"holm","hochberg", "hommel", "bonferroni", "BH", "BY", "fdr", "none". Default is NULL.
scale
a numeric that controls the size of the colored nodes or thickness of colored
branch lengths used to indicate rate-shifts. Default is 1.
col.rank
a logical indicating whether various instances of potential rate-shifts should be
colored based on the rank of the p-value or the absolute magnitude of the rate-shift.
Default is TRUE, indicating use of ranks instead of magnitude.
breaks
a numeric indicating the range of colors to be used for plotting. Choosing a
smaller value will lead to big differences in colors while a bigger value will lead
to finer variations in colors.
...
additional arguments to be passed to plot.phylo in the ape package.
Details
When passing an object of class "phylo" (tree) follow the guidelines in plot.phylo in the ape
package. Also make sure that the tree passed to color.tree.plot is the same as the one used to
generate out from comp.subs.
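The PADJ options are described above as identical to the methods of p.adjust in the stats package. The following is a minimal base R sketch of what such an adjustment does to a set of p-values before a threshold like p.thres is applied. The p-values are made up for illustration and do not come from comp.subs.

```r
# Hypothetical raw p-values, standing in for the p.val column of comp.subs output
p_raw <- c(0.001, 0.01, 0.02, 0.04, 0.30)

# Adjust for multiple comparisons with two of the methods listed for PADJ
p_holm <- p.adjust(p_raw, method = "holm")
p_bh   <- p.adjust(p_raw, method = "BH")

# Compare how many subtrees pass a 0.05 threshold before and after adjustment
n_raw  <- sum(p_raw  <= 0.05)
n_holm <- sum(p_holm <= 0.05)
print(data.frame(p_raw, p_holm, p_bh))
```

With these values, fewer comparisons pass the threshold after the Holm adjustment than before it, which is the behavior to expect when a correction is requested through PADJ.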
Value
color.tree.plot returns only a graphical device output.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James A. Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
comp.subs, plot.phylo
Examples
data(geospiza)
attach(geospiza)
output.geospiza <- comp.subs(geospiza.tree)
color.tree.plot(out=output.geospiza, tree=geospiza.tree)
color.tree.plot(out=output.geospiza, tree=geospiza.tree, NODE=FALSE)
color.tree.plot(out=output.geospiza, tree=geospiza.tree, p.thres=1)
color.tree.plot(out=output.geospiza, tree=geospiza.tree, scale=2)
comp.fit.subs
comp.fit.subs
Description
The function implements the K-clades parametric rate comparison test. This function compares
rate estimates among defined subtrees and evaluates various groupings from 1 to k groups for these
subtrees.
Usage
comp.fit.subs(trees, focal, k, mod.id = c(1, 0, 0, 0), min.val = 0.01)
Arguments
trees
A list from the function id.subtrees.
focal
A vector indicating the subtrees to compare
k
A value indicating the maximum number of groupings of subtrees to examine
mod.id
A vector with four elements of 0 or 1 indicating which models to consider. 1
indicates that the model should be considered. 0 indicates the model is not
considered. These four elements refer to the exponential, Weibull, lognormal, and
variable rates models, respectively.
min.val
A value for determining the minimum edge length for a tree scaled against the
longest edge length. A value of 0.01 (the default) rescales the minimum edge
length to 1% of the longest edge length.
Details
The list of possible subtrees is provided by the function id.subtrees. The function will explore all
possible groupings of subtrees into k defined groups choosing the best fit model for each partition
from among the models identified by mod.id.
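To get a feel for how quickly the number of groupings grows, the sketch below counts the ways a set of focal subtrees can be split into 1 to k non-empty groups. It assumes that each grouping is a set partition (counted by Stirling numbers of the second kind). That assumption is an interpretation of the description above, not something confirmed from the package source, and the helper names stirling2 and n_groupings are hypothetical.

```r
# Stirling number of the second kind: ways to split n items into j non-empty groups
stirling2 <- function(n, j) {
  if (j == 0) return(as.numeric(n == 0))
  if (j > n) return(0)
  if (j == n || j == 1) return(1)
  j * stirling2(n - 1, j) + stirling2(n - 1, j - 1)
}

# Total number of groupings examined for 1..k groups (assumed to be set partitions)
n_groupings <- function(n_subtrees, k) {
  sum(vapply(seq_len(k), function(j) stirling2(n_subtrees, j), numeric(1)))
}

# Four focal subtrees with k = 4, as in the HIV example for this function
total_4 <- n_groupings(4, 4)
print(total_4)

# The count rises steeply with more focal subtrees
print(n_groupings(10, 4))
```

Under this assumption, four focal subtrees with k = 4 give 15 groupings, and the count climbs rapidly as more subtrees are added.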
Value
A dataframe that consists of the following:
k
The number of groups
Groups
The groupings for each subtree numbered as 1 to the number of subtrees indicated. The numbering corresponds to the order in which subtrees are identified
by focal. Groups are separated with vs.
gi_Pj
The jth parameter value for the ith group in the analysis
gi_mod.id
The best model chosen for the ith group
gi_n.param
The number of parameters in the best model for the ith group
AIC
Akaike information criterion score for the entire model for a grouping scheme
AICc
Akaike information criterion corrected for sample size
dAICc
The delta AIC across all grouping schemes and k values relative to the best fit
model
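The AIC, AICc and dAICc columns follow standard information-criterion definitions. The sketch below computes them in base R for three hypothetical grouping schemes. The log-likelihoods, parameter counts and sample size n are invented for illustration, and treating n as the number of edge lengths is an assumption about the package rather than a documented fact.

```r
# Hypothetical log-likelihoods and parameter counts for three grouping schemes
tab <- data.frame(
  scheme = c("one group", "two groups", "three groups"),
  LL     = c(-20, -15, -14.5),
  n_par  = c(1, 2, 3)
)
n <- 30  # assumed sample size (e.g. number of edge lengths)

# Standard definitions of AIC and the small-sample correction AICc
tab$AIC   <- 2 * tab$n_par - 2 * tab$LL
tab$AICc  <- tab$AIC + 2 * tab$n_par * (tab$n_par + 1) / (n - tab$n_par - 1)

# Delta relative to the best (lowest AICc) scheme
tab$dAICc <- tab$AICc - min(tab$AICc)
print(tab)
```

Here the two-group scheme has the lowest AICc, so its dAICc is zero and the other schemes are reported relative to it.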
Note
The output can get very large as k increases. Function tab.summary is useful for reducing the size
of the result table.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
tab.summary, id.subtrees
Examples
data(hivtree.newick)
cat(hivtree.newick, file = "hivtree.phy", sep = "\n")
tree.hiv <- read.tree("hivtree.phy") # load tree
unlink("hivtree.phy") # delete the file "hivtree.phy"
idHIV<-id.subtrees(tree.hiv)
plot(idHIV$tree,show.node.label=TRUE)
cfsHIV<-comp.fit.subs(idHIV$subtree,focal=c(153,119,96,5),k=4)
comp.subs
comp.subs
Description
The function implements the parametric rate comparison test. The function iterates through all
subtrees of a phylogenetic tree and compares the distribution of branch lengths in the subtree to the
"remainder" tree. It is intended to be used with a chronogram in order to test whether diversification
rates differ among clades within a broader phylogeny. A variety of truncated distributions can be
used and compared via likelihood.
Usage
comp.subs(tree, thr = 6, srt = "drop", min.val = 0.01,
mod.id = c(1, 0, 0, 0),verbose=TRUE)
Arguments
tree
An object of class phylo. To test variation in diversification rates, this should
be a chronogram.
thr
Threshold subtree or remainder tree size below which comparisons should not
be performed. thr is the minimum number of edges (in either the subtree or
remainder tree) for a comparison to be made.
srt
Treatment of subtree root edge. Default is "drop" meaning the edge subtending
each subtree will be left out of the comparison for that subtree. Alternatives
"in" or "out" classify the subtree root edge as part of the subtree or part of the
remainder tree, respectively.
min.val
Replacement of zero-length branches with a small positive number to avoid spurious
zeros in likelihood calculations. This value is treated as a fraction of the
maximum branch length: it is multiplied by the maximum edge length, and the result
is substituted for zero-length branches in tree.
mod.id
Indicator vector specifying statistical distributions to be fit to the data. In order,
the distributions are exponential, Weibull, lognormal, and variable rates (Venditti
et al. 2010). Default is exponential only.
verbose
A logical indicating whether progress is updated on the screen
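The min.val replacement described in the arguments above can be sketched in a few lines of base R. This is only an illustration of the stated rule (min.val times the maximum edge length replaces zero-length branches). The edge lengths are made up and the code is not taken from the package.

```r
# Hypothetical edge lengths, including two zero-length branches
edge_length <- c(0.8, 0, 2.0, 0.5, 0, 1.2)
min_val <- 0.01

# The replacement value is a fraction of the longest edge
replacement <- min_val * max(edge_length)

# Substitute it for every zero-length branch
edge_length[edge_length == 0] <- replacement
print(edge_length)
```

After the substitution no edge has length zero, so log-likelihood terms that depend on the edge lengths remain finite.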
Details
All distributions are fit using the likelihood for the truncated form.
Value
A data frame containing up to 15 variables for each subtree of tree. Each row corresponds to a
subtree and the order is that returned by the function subtrees. Subtrees that are not tested (owing
to failure to meet the thr threshold) have NA’s for all variables:
Par1.tot
First estimated parameter of the best fit model for the pooled edge lengths of the
subtree and remainder tree. For exponential, this is the rate. For Weibull it is the
"shape" parameter. For lognormal it is mu. For the variable rates distribution it
is alpha.
Par2.tot
Second estimated parameter of the best fit model for the pooled edge lengths.
For exponential, it is NA. For Weibull it is the "scale" parameter. For lognormal,
it is sigma. For variable rates, it is beta.
Par1.tr1
First estimated parameter for the best fit model for the subtree
Par2.tr1
Second estimated parameter for the best fit model for the subtree
Par1.tr2
First estimated parameter for the best fit model for the remainder tree
Par2.tr2
Second estimated parameter for the best fit model for the remainder tree
llk.1r
log likelihood of the best fit model for the pooled set of edges: the one-rate
model.
llk.2r
log likelihood for the best two-rate model
mod.1r.tot
Best fit distribution for the one-rate model: 1=exponential, 2=Weibull, 3=lognormal, 4=variable rates
mod.2r.tr1
Best fit distribution for the subtree under the two-rate model
mod.2r.tr2
Best fit distribution for the remainder tree under the two-rate model
node1
Identifies the node corresponding to the most recent common ancestor of the
subtree and its sister clade. That is, the node ancestral to the branch along which
a rate change might have occurred.
node2
Identifies the most recent common ancestor of all taxa in the subtree. That is, the
descendant node of the branch along which a rate change might have occurred.
p.val
P-value from the likelihood ratio test of the two-rate vs. one-rate model for the
subtree defined by node2
EvidRatio
The evidence ratio from the AICc scores of the two-rate vs. one-rate model for
the subtree defined by node2
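The relationship between llk.1r, llk.2r, p.val and EvidRatio can be illustrated with a simplified base R sketch. It fits an ordinary (untruncated) exponential to made-up edge lengths, whereas comp.subs uses truncated likelihoods, so the numbers are illustrative only. The one-degree-of-freedom test and the evidence ratio formula exp(0.5 * difference in AICc) are standard choices assumed here, not confirmed details of the package.

```r
# Hypothetical edge lengths for a subtree and for the remainder tree
sub_edges  <- c(0.2, 0.3, 0.1, 0.25, 0.15, 0.2)
rest_edges <- c(1.1, 0.9, 1.4, 0.8, 1.2, 1.0, 0.7)

# Log-likelihood of an untruncated exponential at its MLE (rate = 1 / mean)
exp_loglik <- function(x) sum(dexp(x, rate = 1 / mean(x), log = TRUE))

llk_1r <- exp_loglik(c(sub_edges, rest_edges))            # one pooled rate
llk_2r <- exp_loglik(sub_edges) + exp_loglik(rest_edges)  # separate rates

# Likelihood ratio test: the two-rate model has one extra parameter
lr_stat <- 2 * (llk_2r - llk_1r)
p_val   <- pchisq(lr_stat, df = 1, lower.tail = FALSE)

# Evidence ratio from AICc scores (assumed definition)
aicc <- function(llk, k, n) -2 * llk + 2 * k + 2 * k * (k + 1) / (n - k - 1)
n <- length(sub_edges) + length(rest_edges)
evid_ratio <- exp(0.5 * (aicc(llk_1r, 1, n) - aicc(llk_2r, 2, n)))

print(c(p_val = p_val, evid_ratio = evid_ratio))
```

In this toy case the subtree edges are much shorter than the remainder edges, so the two-rate model is favored by both the likelihood ratio test and the evidence ratio.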
Author(s)
Premal Shah, James A. Fordyce, Benjamin M. Fitzpatrick
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
Venditti, C., A. Meade, and M. Pagel. 2010. Phylogenies reveal new interpretation of speciation and the red queen. Nature 463:349-352.
Examples
data(geospiza)
attach(geospiza)
comp.subs(geospiza.tree)
FP.comp.subs
FP.comp.subs
Description
This function simulates pure birth trees with a given number of taxa and NA subtrees and calculates
the null expectation for the number of significant rate differences.
Usage
FP.comp.subs(tree.size, na.present, sims = 100, missing = 0,
alpha = 0.05, verbose = FALSE, ...)
Arguments
tree.size
A value for the number of terminal taxa in the tree to simulate.
na.present
A value for the number of NA subtrees in the simulated trees.
sims
A value for the number of trees to simulate.
missing
A value indicating the number of missing taxa from the tree.
alpha
A value indicating the threshold for statistical significance.
verbose
A boolean indicating whether a summary of the simulations is printed to the
screen.
...
Arguments passed on to comp.subs function
Details
This function is useful if the user wants to know the expected number of significant rate differences
for a tree of a given size and number of NA subtrees. This function calls on comp.subs, and
arguments can be passed on.
Value
A list that consists of the following:
tree.size
The number of terminal taxa provided by the user.
missing
The number of missing taxa from the tree.
sims
The number of simulated trees.
FPRthres
The number of significant rate difference detections expected based upon the
alpha value provided by the user.
Note
comp.subs is an exploratory data analysis tool and concerns of false positives should be considered
accordingly. The argument "missing" can be used for trees with incomplete taxon sampling. Thus,
if a group should have 100 taxa included, but only 90 are present in the tree, tree.size=100 and
missing=10.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
comp.subs
Examples
## Not run:
data(geospiza)
tree<-geospiza$geospiza.tree
na.count<-tree.na.Count(tree)
FP.comp.subs(tree.size=14,na.present=na.count,verbose=TRUE)
## End(Not run)
id.subtrees
id.subtrees
Description
This function identifies and numbers all subtrees within a tree of object class phylo. It creates the
object required for function comp.fit.subs.
Usage
id.subtrees(tree)
Arguments
tree
A tree of object class phylo.
Details
This function identifies all the subtrees in a tree. These identifiers are used to identify the focal
subtrees used in the comp.fit.subs function.
Value
A list that consists of the following:
tree
The original tree as object class phylo, with node labels giving the identification number of each subtree.
subtree
A list of all possible subtrees as object class phylo.
Note
This function will rename all node labels.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
comp.fit.subs
Examples
## Not run:
data(hivtree.newick)
cat(hivtree.newick, file = "hivtree.phy", sep = "\n")
tree.hiv <- read.tree("hivtree.phy") # load tree
unlink("hivtree.phy") # delete the file "hivtree.phy"
idHIV<-id.subtrees(tree.hiv)
plot(idHIV$tree,show.node.label=TRUE)
## End(Not run)
tab.summary
tab.summary
Description
This function provides an abridged output of results obtained from the comp.fit.subs function by
restricting the output to a user provided delta AIC threshold.
Usage
tab.summary(res, daic = 2, show.rate = FALSE)
Arguments
res
A dataframe obtained from comp.fit.subs function.
daic
A value indicating a threshold of delta AIC relative to the best fit model for each
k to be included in the output.
show.rate
A boolean indicating whether the rate parameters are included in the output.
Details
This function will provide a reduced output of the results provided by the comp.fit.subs function
by allowing the user to choose a critical delta AIC for each value of k that determines which comparisons are included in the output. The best fit model for each k is included in the output regardless
of delta AIC. The show.rate argument indicates whether the rate estimate for each of the subtrees is
included in the output.
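The filtering rule described above (keep every grouping whose delta AIC, relative to the best model for its k, is within the daic threshold) can be sketched in base R. The data frame below is a made-up stand-in for comp.fit.subs output. Its Groups strings and AICc values are hypothetical, and the code is an illustration of the rule rather than the package's implementation.

```r
# Hypothetical results table for three focal subtrees
res <- data.frame(
  k      = c(1, 2, 2, 2, 3),
  Groups = c("1 2 3", "1 vs. 2 3", "1 2 vs. 3", "1 3 vs. 2", "1 vs. 2 vs. 3"),
  AICc   = c(40.0, 31.2, 32.5, 39.0, 33.0),
  stringsAsFactors = FALSE
)
daic <- 2

# Delta relative to the best (lowest AICc) grouping within each value of k
res$delta_k <- res$AICc - ave(res$AICc, res$k, FUN = min)

# Keep groupings within the threshold; the best per k has delta 0, so it is always kept
kept <- res[res$delta_k <= daic, ]
print(kept)
```

With daic = 2, one of the three two-group schemes is dropped while the best scheme for each k is retained.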
Value
A dataframe that consists of the following:
k
The number of groups
Groups
the groupings for each subtree numbered as 1 to the number of subtrees indicated. The numbering corresponds to the order in which subtrees are identified
by focal. Groups are separated with ’vs.’.
gi_rate
The rate for the ith group in the analysis.
LL
The log likelihood for the entire model for a grouping scheme.
AIC
Akaike information criterion score for the entire model for a grouping scheme.
AICc
Akaike information criterion corrected for sample size.
dAICc
The delta AIC across all grouping schemes and k values relative to the best fit
model.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
tab.summary, id.subtrees
Examples
## Not run:
data(hivtree.newick)
cat(hivtree.newick, file = "hivtree.phy", sep = "\n")
tree.hiv <- read.tree("hivtree.phy") # load tree
unlink("hivtree.phy") # delete the file "hivtree.phy"
idHIV<-id.subtrees(tree.hiv)
plot(idHIV$tree,show.node.label=TRUE)
cfsHIV<-comp.fit.subs(idHIV$subtree,focal=c(153,119,96,5),k=4)
tab.summary(cfsHIV)
tab.summary(cfsHIV,daic=1)
tab.summary(cfsHIV,daic=0.01)
## End(Not run)
tree.na.Count
tree.na.Count
Description
This function will identify the number of NA subtrees present in a given phylogenetic tree.
Usage
tree.na.Count(tree, thr = 6, srt = "drop", min.val = 0.01,
mod.id = c(1, 0, 0, 0))
Arguments
tree
A tree of object class phylo.
thr
The threshold for the minimum number of edges to be used for calculating the
rate of a subtree.
srt
Determines how the edge leading to a subtree is dealt with when calculating
rates. The default, "drop", excludes the edge leading to the subtree from the
analysis. "in" will include the edge as part of the subtree and "out" will include
the edge as part of the remaining tree.
min.val
A value for determining the minimum edge length for a tree scaled against the
longest edge length. A value of 0.01 (the default) rescales the minimum edge
length to 1% of the longest edge length.
mod.id
A vector with four elements of 0 or 1 indicating which models to consider. 1
indicates that the model should be considered. 0 indicates the model is not
considered. These four elements refer to the exponential, Weibull, lognormal, and
variable rates models, respectively.
Details
This function identifies the number of NA subtrees present in a given phylogenetic tree. This information might be useful if the user is interested in simulating trees with the same amount of
information (i.e., useable edges) for calculating rates.
Value
A number indicating the number of NAs in the given tree.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
See Also
FP.comp.subs
Examples
## Not run:
data(geospiza)
tree<-geospiza$geospiza.tree
tree.na.Count(tree)
## End(Not run)
tree.rand.test
tree.rand.test
Description
This function performs a randomization test for rate variation among clades.
Usage
tree.rand.test(tree, reps=1000, mod.id=c(1,0,0,0), trace=TRUE)
Arguments
tree
An ultrametric tree of object class phylo.
reps
Desired number of randomizations.
mod.id
Indicator vector specifying statistical distributions to be fit to the data. In order,
the distributions are exponential, Weibull, lognormal, and variable rates (Venditti
et al. 2010). Default is exponential only.
trace
If TRUE, progress will be indicated by printing to the screen.
Details
This function addresses the potential for spurious inference of diversification rate variation when a
phylogeny deviates from the pure birth model. Deviation from pure birth (e.g., when extinction is
important or speciation probabilities change over time) distorts the distribution of branching times
such that internode lengths do not satisfy the independent and identical distribution (iid) assumption
of the PRC test. This function distinguishes among-clade rate variation from rate variation through
time by holding the set of branching times constant and randomizing tree topologies. That is, it
simulates the null hypothesis that speciation and extinction probabilities are constant across lineages
at any given time. The function provides a null distribution for the false detection rate - the fraction
of subtrees appearing to have deviant diversification rates when there is no true among-clade rate
variation.
Value
A list that consists of the following:
tree
The original tree as object class phylo.
obs.p
Observed set of p-values from comp.subs.
ncs
A (potentially large) list of output (p-values and evidence ratios) from each randomization.
obs.detection
Detection rate for the observed tree. This is the fraction of qualified subtrees
with rate variation according to a p-value less than 0.05
p.detection
The fraction of null trees that have more detections than the observed.
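The two summary values obs.detection and p.detection can be illustrated with a base R sketch. The observed p-values are invented, and the null p-values are drawn from a uniform distribution purely as a stand-in. The actual function obtains its null sets by randomizing tree topologies, which is not reproduced here.

```r
set.seed(1)

# Hypothetical observed p-values; NA marks subtrees that were not tested
obs_p <- c(0.001, 0.03, 0.2, 0.6, NA, 0.04, 0.5, NA)

# Detection rate: fraction of tested subtrees with p-value below 0.05
obs_detection <- mean(obs_p < 0.05, na.rm = TRUE)

# Stand-in null replicates: 200 sets of six uniform p-values (an assumption)
null_p <- replicate(200, runif(6), simplify = FALSE)
null_detection <- vapply(null_p, function(p) mean(p < 0.05), numeric(1))

# Fraction of null replicates with a higher detection rate than observed
p_detection <- mean(null_detection > obs_detection)
print(c(obs_detection = obs_detection, p_detection = p_detection))
```

A small p.detection indicates that the observed tree shows more apparent rate shifts than expected under the null, in the sense described in the Details section.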
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
Examples
## Not run:
data(geospiza)
tree <- geospiza$geospiza.tree
tree.rand.test(tree,reps=50) # few reps used to illustrate without taking too much time
## End(Not run)
trimTree
trimTree
Description
This function will trim a specified amount of time, or branch length, from the tips of an ultrametric
tree.
Usage
trimTree(phy, Time)
Arguments
phy
An ultrametric tree of object class phylo.
Time
A value indicating the amount of branch length (time) to be removed from the
tips of the tree
Details
This function is useful if there is some ambiguity regarding the resolution of the tips. This might
include possible over-splitting of taxa, or incomplete taxon sampling. For example, it might be
desirable to analyze a tree where the most recent 1 million years is excluded to account for the
possibility of incomplete sampling. It is important to note that analyses conducted on the trimmed
tree are based on lineages that are still extant and cannot account for lineages that might have been
present at the time of the trimming but have subsequently gone extinct.
Value
A list that consists of the following:
o.tree
The original tree as object class phylo.
t.tree
The tree after the designated amount of branch length has been trimmed from
the tips as object class phylo.
new.tip.clades
A vector in the t.tree phylo object that, for each tip name after trimming,
identifies the original tip names in the newly defined clade.
Author(s)
Premal Shah, Benjamin Fitzpatrick and James Fordyce.
References
Shah, P., B. M. Fitzpatrick, and J. A. Fordyce. 2013. A parametric method for assessing diversification rate variation in phylogenetic trees. Evolution 67:368-377.
Examples
## Not run:
data(hivtree.newick)
cat(hivtree.newick, file = "hivtree.phy", sep = "\n")
tree.hiv <- read.tree("hivtree.phy") # load tree
unlink("hivtree.phy") # delete the file "hivtree.phy"
trim.hiv<-trimTree(phy=tree.hiv,Time=0.1)#trims 0.1 branchlength units from the tree
par(mfrow=c(1,2))
plot.phylo(trim.hiv$o.tree);plot.phylo(trim.hiv$t.tree)
# Identify the names of the original terminal taxa
# that correspond to the newly defined, numbered tips.
trim.hiv$t.tree$new.tip.clades
## End(Not run)