Abstract
The Robinson-Foulds (RF) distance is a well-established measure for the comparative analysis of phylogenetic trees, and comparing unrooted gene trees with rooted species trees is a standard application in phylogenetics. However, the RF distance is not defined to compare unrooted trees with rooted trees. Here we propose urRF a new measure based on the standard RF distance to compare an unrooted tree with a rooted tree, and describe a linear time algorithm to compute urRF. Further, we show that our measure relates to several intriguing properties that can be crucial for the in-depth comparative study and the credible rooting of trees. Finally, we present detailed empirical studies based on real data sets demonstrating the performance of our novel RF measure, the efficiency of our linear time algorithm and the quality of determining optimal rootings of unrooted gene trees.
Support was provided to PG by the grant of MNiSW #N N301 065236, to OE by the NSF (#0830012 and #106029), and to PG and OE by NCN #2011/01/B/ST6/02777 and the NIMBioS Working Group: Gene Tree Reconciliation through NSF #EF-0832858.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Preview
Unable to display preview. Download preview PDF.
Similar content being viewed by others
References
Bansal, M.S., Burleigh, J.G., Eulenstein, O., Fernández-Baca, D.: Robinson-foulds supertrees. Algorithms for Molecular Biology 5, 18 (2010)
Bender, M.A., Farach-Colton, M.: The lca Problem Revisited. In: Gonnet, G.H., Viola, A. (eds.) LATIN 2000. LNCS, vol. 1776, pp. 88–94. Springer, Heidelberg (2000)
Bordewich, M., Semple, C.: On the computational complexity of the rooted subtree prune and regraft distance. Annals of Combinatorics 8, 409–423 (2004)
Bourque, M.: Abres de Steiner et reseaux dont varie I’emplagement de certains sommnets. PhD thesis, Department d’Informatique, University Montréal, Montréal (1978)
Bryant, D., Steel, M.: Computing the distribution of a tree metric. IEEE/ACM Trans. Comput. Biol. Bioinform. 6(3), 420–426 (2009)
Bryant, D., Tsang, J., Kearney, P.E., Li, M.: Computing the quartet distance between evolutionary trees. In: Symposium on Discrete Algorithms, pp. 285–286 (2000)
Chaudhary, R., Burleigh, J.G., Fernández-Baca, D.: Fast Local Search for Unrooted Robinson-Foulds Supertrees. In: Chen, J., Wang, J., Zelikovsky, A. (eds.) ISBRA 2011. LNCS, vol. 6674, pp. 184–196. Springer, Heidelberg (2011)
Chen, F., Mackey, A.J., Stoeckert, C.J., Roos, D.S.: Orthomcl-db: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Research 34(suppl.1), D363–D368
DasGupta, B., He, X., Jiang, T., Li, M., Tromp, J., Zhang, L.: On distances between phylogenetic trees. In: SODA, pp. 427–436 (1997)
Day, W.H.E.: Optimal algorithms for comparing trees with labeled leaves. Journal of Classification 2(1), 7–28 (1985)
Edgar, R.C.: MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Research 32, 1792–1797 (2004)
Eulenstein, O., Mirkin, B., Vingron, M.: Duplication-based measures of difference between gene and species trees. J. Comput. Biol. 5(1), 135–148 (1998)
Felsenstein, J.: Inferring phylogenies. Sinauer Associates (2004)
Goodman, M., Czelusniak, J., Moore, G.W., Romero-Herrera, A.E., Matsuda, G.: Fitting the gene lineage into its species lineage, a parsimony strategy illustrated by cladograms constructed from globin sequences. Systematic Zoology 28(2), 132–163 (1979)
Górecki, P., Tiuryn, J.: Dls-trees: A model of evolutionary scenarios. Theor. Comput. Sci. 359(1-3), 378–399 (2006)
Górecki, P., Tiuryn, J.: Inferring phylogeny from whole genomes. Bioinformatics 23(2), e116–e222 (2007)
Guindon, S., Delsuc, F., Dufayard, J., Gascuel, O.: Estimating maximum likelihood phylogenies with PhyML. Methods Mol. Biol. 537, 113–137 (2009)
Mecham, C.A.: Theoretical and computational considerations of the compatibility of qualitative taxonomic characters. NATO ASI Series, vol. 1, pp. 304–314. Springer, Berlin (1983)
Mirkin, B., Muchnik, I.B., Smith, T.F.: A biologically consistent model for comparing molecular phylogenies. J. Comput. Biol. 2(4), 493–507 (1995)
Page, R.D.M.: Maps between trees and cladistic analysis of historical associations among genes, organisms, and areas. Systematic Biology 43(1), 58–77 (1994)
Pattengale, N.D., Gottlieb, E.J., Moret, B.M.E.: Efficiently computing the robinson-foulds metric. J. Comput. Biol. 14(6), 724–735 (2007)
Robinson, D.F., Foulds, L.R.: Comparison of weighted labelled trees. Lecture Notes in Mathematics, vol. 748, pp. 119–126 (1979)
Sanderson, M., McMahon, M.: Inferring angiosperm phylogeny from est data with widespread gene duplication. BMC Evolutionary Biology 7(suppl.1) (2007)
Sayers, E.W., et al.: Database resources of the national center for biotechnology information. Nucleic Acids Research 37(suppl.1), D5–D15 (2009)
Semple, C., Steel, M.A.: Phylogenetics. Oxford University Press (2003)
Smith, A.: Rooting molecular trees: problems and strategies. Biol. J. Linn. Soc. 51, 279–292
Strimmer, K., von Haeseler, A.: Quartet puzzling: A quartet maximum likelihood method for reconstructing tree topologies. Molecular Biology and Evolution 13, 964–969 (1996)
Wheeler, W.: Nucleic acid sequence phylogeny and random outgroups. Cladistics – The International Journal of the Willi Hennig Society 51, 363–368 (1990)
Author information
Authors and Affiliations
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2012 Springer-Verlag Berlin Heidelberg
About this paper
Cite this paper
Górecki, P., Eulenstein, O. (2012). A Robinson-Foulds Measure to Compare Unrooted Trees with Rooted Trees. In: Bleris, L., Măndoiu, I., Schwartz, R., Wang, J. (eds) Bioinformatics Research and Applications. ISBRA 2012. Lecture Notes in Computer Science(), vol 7292. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-642-30191-9_12
Download citation
DOI: https://doi.org/10.1007/978-3-642-30191-9_12
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-642-30190-2
Online ISBN: 978-3-642-30191-9
eBook Packages: Computer ScienceComputer Science (R0)