Abstract
In this work, we introduce a variant of the Single Cut or Join distance that accounts for duplicated genes, in the context of directed evolution from an ancestral genome to a descendant genome where orthology relations between ancestral genes and their descendant are known. Our model includes two duplication mechanisms: single-gene tandem duplication and creation of single-gene circular chromosomes. We prove that in this model, computing the distance and a parsimonious evolutionary scenario in terms of SCJ and single-gene duplication events can be done in linear time. Simulations show that the inferred number of cuts and joins scales linearly with the true number of such events even at high rates of genome rearrangements and segmental duplications. We also show that the median problem is tractable for this distance.
Access this chapter
Tax calculation will be finalised at checkout
Purchases are for personal use only
Similar content being viewed by others
Notes
- 1.
We use the generic term “gene” here to identify a genomic locus.
- 2.
References
Fertin, G., Labarre, A., Rusu, I., Tannier, E., Vialette, S.: Combinatorics of Genome Rearrangements. MIT Press, Cambridge (2009)
Wang, D., Li, D., Ning, K., Wang, L.: Core-genome scaffold comparison reveals the prevalence that inversion events are associated with pairs of inverted repeats. BMC Genom. 18(1), 268 (2017)
Neafsey, D., Waterhouse, R., et al.: Mosquito genomics. Highly evolvable malaria vectors: the genomes of 16 Anopheles mosquitoes. Science 347(6217), 1258522 (2015)
Hannenhalli, S., Pevzner, P.: Transforming cabbage into turnip (polynomial algorithm for sorting signed permutations by reversals). In: 27th Annual ACM Symposium on the Theory of Computing (STOC 1995), pp. 178–189 (1995)
Feijão, P., Meidanis, J.: SCJ: a breakpoint-like distance that simplifies several rearrangement problems. IEEE/ACM Trans. Comput. Biol. Bioinform. 8(5), 1318–1329 (2011)
da Silva, P., Machado, R., Dantas, S., Braga, M.: DCJ-indel and DCJ-substitution distances with distinct operation costs. Algorithms Mol. Biol. 8(1), 21 (2013)
Braga, M., Willing, E., Stoye, J.: Double cut and join with insertions and deletions. J. Comput. Biol. 18(9), 1167–1184 (2011)
Tannier, E., Zheng, C., Sankoff, D.: Multichromosomal median and halving problems under various different genomic distances. BMC Bioinform. 10, 120 (2009)
Shao, M., Lin, Y., Moret, B.: An exact algorithm to compute the double-cut-and-join distance for genomes with duplicate genes. J. Comput. Biol. 22(5), 425–435 (2015)
Compeau, P.E.C.: DCJ-Indel sorting revisited. Algorithms Mol. Biol. 8, 6 (2013)
Rubert, D., Feijão, P., Braga, M., Stoye, J., Martinez, F.: Approximating the DCJ distance of balanced genomes in linear time. Algorithms Mol. Biol. 12, 3 (2017)
Bryant, D.: The complexity of calculating exemplar distances. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment and Evolution of Gene Families, vol. 1, pp. 207–211. Springer, Dordrecht (2000). doi:10.1007/978-94-011-4309-7_19
Blin, G., Chauve, C., Fertin, G.: The breakpoint distance for signed sequences. In: Algorithms and Computational Methods for Biochemical and Evolutionary Networks (CompBioNets 2004). Text in Algorithms, vol. 3, pp. 3–16 (2004)
Angibaud, S., Fertin, G., Rusu, I., Thevenin, A., Vialette, S.: On the approximability of comparing genomes with duplicates. J. Graph Algorithms Appl. 13(1), 19–53 (2009)
Blin, G., Fertin, G., Sikora, F., Vialette, S.: The ExemplarBreakpointDistance for non-trivial genomes cannot be approximated. In: Das, S., Uehara, R. (eds.) WALCOM 2009. LNCS, vol. 5431, pp. 357–368. Springer, Heidelberg (2009). doi:10.1007/978-3-642-00202-1_31
Shao, M., Moret, B.: A fast and exact algorithm for the exemplar breakpoint distance. J. Comput. Biol. 23(5), 337–346 (2016)
Shao, M., Moret, B.: On computing breakpoint distances for genomes with duplicate genes. J. Comput. Biol. (2016, ahead of print). doi:10.1089/cmb.2016.0149
Wei, Z., Zhu, D., Wang, L.: A dynamic programming algorithm for (1,2)-exemplar breakpoint distance. J. Comput. Biol. 22(7), 666–676 (2014)
Angibaud, S., Fertin, G., Rusu, I., Thévenin, A., Vialette, S.: Efficient tools for computing the number of breakpoints and the number of adjacencies between two genomes with duplicate genes. J. Comput. Biol. 15(8), 1093–1115 (2008)
Zeira, R., Shamir, R.: Sorting by cuts, joins, and whole chromosome duplications. J. Comput. Biol. 24(2), 127–137 (2017)
Sankoff, D., El-Mabrouk, N.: Duplication, rearrangement and reconciliation. In: Sankoff, D., Nadeau, J.H. (eds.) Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics Map, Alignment and Evolution of Gene Families, vol. 1, pp. 537–550. Springer, Dordrecht (2000). doi:10.1007/978-94-011-4309-7_46
Sankoff, D.: Genome rearrangement with gene families. Bioinformatics 15(11), 909–917 (1999)
Chauve, C., El-Mabrouk, N., Guéguen, L., Semeria, M., Tannier, E.: Duplication, rearrangement and reconciliation a follow-up 13 years later. In: Chauve, C., El-Mabrouk, N., Tannier, E. (eds.) Models and Algorithms for Genome Evolution, vol. 19, pp. 47–62. Springer, London (2013). doi:10.1007/978-1-4471-5298-9_4
Duchemin, W., Anselmetti, Y., Patterson, M., Ponty, Y., Bérard, S., Chauve, C., Scornavacca, C., Daubin, V., Tannier, E.: DeCoSTAR: reconstructing the ancestral organization of genes or genomes using reconciled phylogenies. Genome Biol. Evol. 9(5), 1312–1319 (2017)
Plummer, M.D., Lovász, L.: Matching Theory. Elsevier, Amsterdam (1986)
Luhmann N., Lafond M., Thevenin A., Ouangraoua A., Wittler R., Chauve C.: The SCJ small parsimony problem for weighted gene adjacencies. IEEE/ACM Trans. Comput. Biol. Bioinform. (2017, ahead of print). doi:10.1109/TCBB.2017.2661761
Miklós, I., Kiss, S., Tannier, E.: Counting and sampling SCJ small parsimony solutions. Theoret. Comput. Sci. 552, 83–98 (2014)
Biller, P., Guéguen, L., Tannier, E.: Moments of genome evolution by Double Cut-and-Join. BMC Bioinform. 16(Suppl 14), S7 (2015)
Acknowledgments
CC is supported by a Discovery Grant from the Natural Sciences and Engineering Research Council of Canada. PF is supported by the Genome Canada grant PathoGiST.
Author information
Authors and Affiliations
Corresponding author
Editor information
Editors and Affiliations
Rights and permissions
Copyright information
© 2017 Springer International Publishing AG
About this paper
Cite this paper
Feijão, P., Mane, A., Chauve, C. (2017). A Tractable Variant of the Single Cut or Join Distance with Duplicated Genes. In: Meidanis, J., Nakhleh, L. (eds) Comparative Genomics. RECOMB-CG 2017. Lecture Notes in Computer Science(), vol 10562. Springer, Cham. https://doi.org/10.1007/978-3-319-67979-2_2
Download citation
DOI: https://doi.org/10.1007/978-3-319-67979-2_2
Published:
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-67978-5
Online ISBN: 978-3-319-67979-2
eBook Packages: Computer ScienceComputer Science (R0)