Embedding the de Bruijn graph, and applications to metagenomics
Menegaux et al., 2020
- Document ID
- 13163438963225677568
- Author
- Menegaux R
- Vert J
- Publication year
- 2020
- Publication venue
- BioRxiv
Snippet
Fast mapping of sequencing reads to taxonomic clades is a crucial step in metagenomics, but it raises computational challenges as the numbers of reads and taxonomic clades increase. Besides alignment-based methods, which are accurate but computational …
Classifications
- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
        - G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
          - G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
          - G06F19/14—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for phylogeny or evolution, e.g. evolutionarily conserved regions determination or phylogenetic tree construction
          - G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
            - G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
        - G06F17/20—Handling natural language data
      - G06F7/00—Methods or arrangements for processing data by operating upon the order or content of the data handled
    - G06N—COMPUTER SYSTEMS BASED ON SPECIFIC COMPUTATIONAL MODELS
      - G06N3/00—Computer systems based on biological models
        - G06N3/12—Computer systems based on biological models using genetic models
          - G06N3/126—Genetic algorithms, i.e. information processing using digital simulations of the genetic system
      - G06N99/00—Subject matter not provided for in other groups of this subclass
        - G06N99/005—Learning machines, i.e. computer in which a programme is changed according to experience gained by the machine itself during a complete run
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| Kopylova et al. | SortMeRNA: fast and accurate filtering of ribosomal RNAs in metatranscriptomic data | |
| Vervier et al. | Large-scale machine learning for metagenomics sequence classification | |
| Kuchaiev et al. | Integrative network alignment reveals large regions of global network similarity in yeast and human | |
| Flicek et al. | Sense from sequence reads: methods for alignment and assembly | |
| Zakov et al. | An algorithmic approach for breakage-fusion-bridge detection in tumor genomes | |
| Nicholson et al. | Bioinformatic evidence of widespread priming in type I and II CRISPR-Cas systems | |
| Bernard et al. | Alignment-free microbial phylogenomics under scenarios of sequence divergence, genome rearrangement and lateral genetic transfer | |
| Ghaly et al. | Predicting the taxonomic and environmental sources of integron gene cassettes using structural and sequence homology of attC sites | |
| Lusk et al. | Evolutionary mirages: selection on binding site composition creates the illusion of conserved grammars in Drosophila enhancers | |
| Jiang et al. | DEPP: deep learning enables extending species trees using single genes | |
| Kawulok et al. | CoMeta: classification of metagenomes using k-mers | |
| Fusi et al. | In silico predictive modeling of CRISPR/Cas9 guide efficiency | |
| Bernard et al. | Recapitulating phylogenies using k-mers: from trees to networks | |
| Li et al. | Alignment-free approaches for predicting novel Nuclear Mitochondrial Segments (NUMTs) in the human genome | |
| Sanabria et al. | The human genome’s vocabulary as proposed by the DNA language model GROVER | |
| Menegaux et al. | Embedding the de Bruijn graph, and applications to metagenomics | |
| Bickmann et al. | TEclass2: Classification of transposable elements using Transformers | |
| Bergeron et al. | Formal models of gene clusters | |
| Dinu et al. | A rank-based sequence aligner with applications in phylogenetic analysis | |
| Chen et al. | A new statistic for efficient detection of repetitive sequences | |
| Alipanahi et al. | Disentangled long-read de Bruijn graphs via optical maps | |
| Schulz et al. | Sequence-based pangenomic core detection | |
| Ahmed et al. | Spumoni 2: Improved pangenome classification using a compressed index of minimizer digests | |
| Deaton et al. | Mini‐Metagenomics and Nucleotide Composition Aid the Identification and Host Association of Novel Bacteriophage Sequences | |
| He et al. | ViTax: adaptive hierarchical viral taxonomy classification with a taxonomy belief tree on a foundation model |