Abstract
The rooted triplet distance measures the structural dissimilarity between two rooted phylogenetic trees (unordered trees with distinct leaf labels and no outdegree-1 nodes) having the same leaf label set. It is defined as the number of 3-subsets of the leaf label set that induce two different subtrees in the two trees. The fastest currently known algorithm for computing the rooted triplet distance was designed by Brodal et al. (SODA 2013). It runs in \(O(n \log n)\) time, where n is the number of leaf labels in the input trees, and a long-standing open question is whether this is optimal or not. In this paper, we present two new \(o(n \log n)\)-time algorithms for the special case of caterpillars (rooted phylogenetic trees in which every node has at most one non-leaf child), thus breaking the \(O(n \log n)\)-time bound for a fundamental class of trees. Our first algorithm makes use of a dynamic rank-select data structure by Raman et al. (WADS 2001) and runs in \(O(n \log n / \log \log n)\) time. Our second algorithm relies on an efficient orthogonal range counting algorithm invented by Chan and Pǎtraşcu (SODA 2010) and runs in \(O(n \sqrt{\log n})\) time.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
References
Bansal, M.S., Dong, J., Fernández-Baca, D.: Comparing and aggregating partially resolved trees. Theoret. Comput. Sci. 412(48), 6634–6652 (2011)
Brodal, G.S., Fagerberg, R., Mailund, T., Pedersen, C.N.S., Sand, A.: Efficient algorithms for computing the triplet and quartet distance between trees of arbitrary degree. In: Proceedings of the Twenty-Fourth Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2013), pp. 1814–1832. SIAM (2013)
Brodal, G.S., Mampentzidis, K.: Cache oblivious algorithms for computing the triplet distance between trees. In: 25th Annual European Symposium on Algorithms (ESA 2017). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2017)
Chan, T.M., Pătraşcu, M.: Counting inversions, offline orthogonal range counting, and related problems. In: Proceedings of the Twenty-First Annual ACM-SIAM Symposium on Discrete Algorithms (SODA 2010), pp. 161–173. SIAM (2010)
Critchlow, D.E., Pearl, D.K., Qian, C.: The triples distance for rooted bifurcating phylogenetic trees. Syst. Biol. 45(3), 323–334 (1996)
Dasgupta, B., He, X., Jiang, T., Li, M., Tromp, J.: On computing the nearest neighbor interchange distance. In: Proceedings of the DIMACS Workshop on Discrete Problems with Medical Applications, DIMACS Series in Discrete Mathematics and Theoretical Computer Science, vol. 55, pp. 125–143. American Mathematical Soc. (2000)
Day, W.H.E.: Optimal algorithms for comparing trees with labeled leaves. J. Classif. 2(1), 7–28 (1985)
Dobson, A.J.: Comparing the shapes of trees. In: Street, A.P., Wallis, W.D. (eds.) Combinatorial Mathematics III. LNM, vol. 452, pp. 95–100. Springer, Heidelberg (1975). https://doi.org/10.1007/BFb0069548
Huelsenbeck, J.P., Nielsen, R., Ronquist, F., Bollback, J.P.: Bayesian inference of phylogeny and its impact on evolutionary biology. Science 294(5550), 2310–2314 (2001)
Jansson, J., Mampentzidis, K., T.P, S.: Building a small and informative phylogenetic supertree. In: 19th International Workshop on Algorithms in Bioinformatics (WABI 2019). Schloss Dagstuhl-Leibniz-Zentrum fuer Informatik (2019)
Jansson, J., Rajaby, R.: A more practical algorithm for the rooted triplet distance. J. Comput. Biol. 24(2), 106–126 (2017)
Kendall, M., Colijn, C.: Mapping phylogenetic trees to reveal distinct patterns of evolution. Mol. Biol. Evol. 33(10), 2735–2743 (2016)
Liao, W., et al.: Alignment-free transcriptomic and metatranscriptomic comparison using sequencing signatures with variable length Markov chains. Sci. Rep. 6(1), 1–15 (2016)
Moreno-Dominguez, D., Anwander, A., Knösche, T.R.: A hierarchical method for whole-brain connectivity-based parcellation. Hum. Brain Mapp. 35(10), 5000–5025 (2014)
Nakhleh, L., Sun, J., Warnow, T., Linder, C.R., Moret, B.M.E., Tholse, A.: Towards the development of computational tools for evaluating phylogenetic network reconstruction methods. In: Biocomputing 2003, pp. 315–326. World Scientific (2002)
Page, R.D.M., Cruickshank, R., Johnson, K.P.: Louse (Insecta: Phthiraptera) mitochondrial 12S rRNA secondary structure is highly variable. Insect Mol. Biol. 11(4), 361–369 (2002)
Raman, R., Raman, V., Rao, S.S.: Succinct dynamic data structures. In: Dehne, F., Sack, J.-R., Tamassia, R. (eds.) WADS 2001. LNCS, vol. 2125, pp. 426–437. Springer, Heidelberg (2001). https://doi.org/10.1007/3-540-44634-6_39
Robinson, D.F.: Comparison of labeled trees with valency three. J. Combin. Theory Ser. B 11(2), 105–119 (1971)
Robinson, D.F., Foulds, L.R.: Comparison of phylogenetic trees. Math. Biosci. 53(1–2), 131–147 (1981)
Sand, A., Brodal, G.S., Fagerberg, R., Pedersen, C.N.S., Mailund, T.: A practical \({O}(n \log ^2 n)\) time algorithm for computing the triplet distance on binary trees. BMC Bioinform. 14, S18 (2013). https://doi.org/10.1186/1471-2105-14-S2-S18
Acknowledgment
JJ was partially funded by RGC/GRF project 15217019.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2021 Springer Nature Switzerland AG
About this paper
Cite this paper
Jansson, J., Lee, W.L. (2021). Fast Algorithms for the Rooted Triplet Distance Between Caterpillars. In: Bampis, E., Pagourtzis, A. (eds) Fundamentals of Computation Theory. FCT 2021. Lecture Notes in Computer Science(), vol 12867. Springer, Cham. https://doi.org/10.1007/978-3-030-86593-1_23
Download citation
DOI: https://doi.org/10.1007/978-3-030-86593-1_23
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-86592-4
Online ISBN: 978-3-030-86593-1
eBook Packages: Computer ScienceComputer Science (R0)