Schulz et al., 2008
The generalised k-Truncated Suffix Tree for time-and space-efficient searches in multiple DNA or protein sequences
- Document ID
- 3834276305531614941
- Author
- Schulz M
- Bauer S
- Robinson P
- Publication year
- 2008
- Publication venue
- International journal of bioinformatics research and applications
Snippet
Efficient searching for specific subsequences in a set of longer sequences is an important component of many bioinformatics algorithms. Generalised suffix trees and suffix arrays allow searches for a pattern of length n in time proportional to n independent of the length of …
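The search property described in the snippet can be illustrated with a minimal sketch. This is not the construction from Schulz et al.: it is a naive, uncompressed trie over the first k characters of every suffix, and the names `TruncatedSuffixTrie`, `_Node`, and `find` are hypothetical. It only shows why a lookup for a pattern of length n takes a number of steps proportional to n, regardless of how much sequence data was indexed.

```python
class _Node:
    """One trie node: outgoing edges plus the occurrences of its path label."""
    __slots__ = ("children", "occurrences")

    def __init__(self):
        self.children = {}      # next character -> child node
        self.occurrences = []   # (sequence index, start offset) pairs


class TruncatedSuffixTrie:
    """Illustrative index over all substrings of length <= k (assumed design)."""

    def __init__(self, sequences, k):
        if k < 1:
            raise ValueError("k must be at least 1")
        self.k = k
        self.root = _Node()
        for seq_id, seq in enumerate(sequences):
            for start in range(len(seq)):
                node = self.root
                # Insert only the first k characters of each suffix.
                for ch in seq[start:start + k]:
                    node = node.children.setdefault(ch, _Node())
                    node.occurrences.append((seq_id, start))

    def find(self, pattern):
        """Return every (sequence index, offset) where pattern occurs."""
        if len(pattern) > self.k:
            raise ValueError("pattern is longer than the truncation depth k")
        node = self.root
        # One dictionary lookup per pattern character.
        for ch in pattern:
            node = node.children.get(ch)
            if node is None:
                return []
        return list(node.occurrences)


index = TruncatedSuffixTrie(["GATTACA", "TTAC"], k=3)
print(index.find("TA"))   # [(0, 3), (1, 1)]
print(index.find("GG"))   # []
```

Storing occurrences at every node keeps the sketch short but costs memory proportional to k times the total sequence length; a compressed suffix tree would avoid that overhead.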
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
- G06F17/30964—Querying
- G06F17/30979—Query processing
- G06F17/30985—Query processing by using string matching techniques
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
- G06F17/30321—Indexing structures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/24—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for machine learning, data mining or biostatistics, e.g. pattern finding, knowledge discovery, rule extraction, correlation, clustering or classification
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
- G06F17/21—Text processing
- G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30861—Retrieval from the Internet, e.g. browsers
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
- G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
- G06F21/55—Detecting local intrusion or implementing counter-measures
- G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
- G06F21/562—Static detection
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Giegerich et al. | Efficient implementation of lazy suffix trees | |
Canzar et al. | Short read mapping: an algorithmic tour | |
Karasikov et al. | Metagraph: Indexing and analysing nucleotide archives at petabase-scale | |
Liu et al. | A method for aligning RNA secondary structures and its application to RNA motif detection | |
Shrestha et al. | A bioinformatician’s guide to the forefront of suffix array construction algorithms | |
US11062793B2 (en) | Systems and methods for aligning sequences to graph references | |
Khan et al. | A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays | |
Meyer et al. | Structator: fast index-based search for RNA sequence-structure patterns | |
Vinga et al. | Pattern matching through Chaos Game Representation: bridging numerical and discrete data structures for biological sequence analysis | |
Schulz et al. | The generalised k-Truncated Suffix Tree for time-and space-efficient searches in multiple DNA or protein sequences | |
Belazzougui et al. | Bidirectional variable-order de Bruijn graphs | |
Ma et al. | Chaining for accurate alignment of erroneous long reads to acyclic variation graphs | |
Ben-Bassat et al. | String graph construction using incremental hashing | |
Karim et al. | A MapReduce framework for mining maximal contiguous frequent patterns in large DNA sequence datasets | |
Barsky et al. | Suffix trees for inputs larger than main memory | |
Allali et al. | The at-most $k$-deep factor tree | |
Makris et al. | An intelligent grammar-based platform for RNA H-type pseudoknot prediction | |
US20200265923A1 (en) | Efficient Seeding For Read Alignment | |
Iliopoulos et al. | Maximal motif discovery in a sliding window | |
Laurio et al. | Regular biosequence pattern matching with cellular automata | |
Bonnici et al. | A k-mer based sequence similarity for pangenomic analyses | |
Ferragina et al. | Computational biology | |
Neelapala et al. | SPINE: Putting backbone into string indexing | |
Iliopoulos et al. | An algorithm for mapping short reads to a dynamically changing genomic sequence | |
Gog et al. | Multi-pattern matching with bidirectional indexes |