Schulz et al., 2008

The generalised k-Truncated Suffix Tree for time- and space-efficient searches in multiple DNA or protein sequences

Document ID: 3834276305531614941
Authors: Schulz M; Bauer S; Robinson P
Publication year: 2008
Publication venue: International Journal of Bioinformatics Research and Applications

Snippet

Efficient searching for specific subsequences in a set of longer sequences is an important component of many bioinformatics algorithms. Generalised suffix trees and suffix arrays allow searches for a pattern of length n in time proportional to n independent of the length of …
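
As a rough illustration of the length-independent search the snippet describes, the Python sketch below indexes every substring of length at most k from several sequences in an uncompressed trie. It is not the authors' implementation: the names `Node`, `build_index` and `find`, the per-node hit lists, and the uncompressed layout are assumptions made for brevity. A real k-truncated suffix tree would compress unary paths and store occurrences far more compactly.

```python
class Node:
    """One trie node: outgoing edges plus the occurrences that reach it."""
    __slots__ = ("children", "hits")

    def __init__(self):
        self.children = {}   # character -> Node
        self.hits = []       # (sequence index, start offset) pairs


def build_index(sequences, k):
    """Insert the first k characters of every suffix of every sequence."""
    root = Node()
    for seq_id, seq in enumerate(sequences):
        for start in range(len(seq)):
            node = root
            # Truncate each suffix at depth k, so the trie never grows deeper.
            for ch in seq[start:start + k]:
                node = node.children.setdefault(ch, Node())
                node.hits.append((seq_id, start))
    return root


def find(root, pattern):
    """Return all (sequence index, start offset) occurrences of pattern.

    Patterns longer than k are not indexed and yield an empty list here.
    """
    node = root
    for ch in pattern:
        node = node.children.get(ch)
        if node is None:
            return []
    return node.hits


if __name__ == "__main__":
    index = build_index(["ACGTAC", "GTACG"], k=3)
    print(find(index, "TAC"))   # [(0, 3), (1, 1)]
```

With dictionary-based children, `find` performs one lookup per pattern character, so its cost depends on the pattern length and not on the total length of the indexed sequences.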

Classifications

- G—PHYSICS
  - G06—COMPUTING; CALCULATING; COUNTING
    - G06F—ELECTRICAL DIGITAL DATA PROCESSING
      - G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
        - G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
          - G06F17/30943—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type
            - G06F17/30946—Information retrieval; Database structures therefor; File system structures therefor details of database functions independent of the retrieved data type indexing structures
            - G06F17/30964—Querying
              - G06F17/30979—Query processing
                - G06F17/30985—Query processing by using string matching techniques
          - G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
            - G06F17/30312—Storage and indexing structures; Management thereof
              - G06F17/30321—Indexing structures
          - G06F17/30861—Retrieval from the Internet, e.g. browsers
        - G06F17/20—Handling natural language data
          - G06F17/21—Text processing
            - G06F17/22—Manipulating or registering by use of codes, e.g. in sequence of text characters
      - G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
        - G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
          - G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
          - G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
          - G06F19/24—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for machine learning, data mining or biostatistics, e.g. pattern finding, knowledge discovery, rule extraction, correlation, clustering or classification
      - G06F9/00—Arrangements for programme control, e.g. control unit
        - G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
      - G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
      - G06F21/00—Security arrangements for protecting computers, components thereof, programs or data against unauthorised activity
        - G06F21/50—Monitoring users, programs or devices to maintain the integrity of platforms, e.g. of processors, firmware or operating systems
          - G06F21/55—Detecting local intrusion or implementing counter-measures
            - G06F21/56—Computer malware detection or handling, e.g. anti-virus arrangements
              - G06F21/562—Static detection
      - G06F8/00—Arrangements for software engineering
        - G06F8/40—Transformations of program code

Similar Documents

Publication	Publication Date	Title
Giegerich et al.	2003	Efficient implementation of lazy suffix trees
Canzar et al.	2015	Short read mapping: an algorithmic tour
Karasikov et al.	2020	Metagraph: Indexing and analysing nucleotide archives at petabase-scale
Liu et al.	2005	A method for aligning RNA secondary structures and its application to RNA motif detection
Shrestha et al.	2014	A bioinformatician’s guide to the forefront of suffix array construction algorithms
US11062793B2 (en)	2021-07-13	Systems and methods for aligning sequences to graph references
Khan et al.	2009	A practical algorithm for finding maximal exact matches in large sequence datasets using sparse suffix arrays
Meyer et al.	2011	Structator: fast index-based search for RNA sequence-structure patterns
Vinga et al.	2012	Pattern matching through Chaos Game Representation: bridging numerical and discrete data structures for biological sequence analysis
Schulz et al.	2008	The generalised k-Truncated Suffix Tree for time- and space-efficient searches in multiple DNA or protein sequences
Belazzougui et al.	2016	Bidirectional variable-order de Bruijn graphs
Ma et al.	2023	Chaining for accurate alignment of erroneous long reads to acyclic variation graphs
Ben-Bassat et al.	2014	String graph construction using incremental hashing
Karim et al.	2012	A MapReduce framework for mining maximal contiguous frequent patterns in large DNA sequence datasets
Barsky et al.	2011	Suffix trees for inputs larger than main memory
Allali et al.	2004	The at-most k-deep factor tree
Makris et al.	2022	An intelligent grammar-based platform for RNA H-type pseudoknot prediction
US20200265923A1 (en)	2020-08-20	Efficient Seeding For Read Alignment
Iliopoulos et al.	2018	Maximal motif discovery in a sliding window
Laurio et al.	2002	Regular biosequence pattern matching with cellular automata
Bonnici et al.	2021	A k-mer based sequence similarity for pangenomic analyses
Ferragina et al.	2018	Computational biology
Neelapala et al.	2004	SPINE: Putting backbone into string indexing
Iliopoulos et al.	2012	An algorithm for mapping short reads to a dynamically changing genomic sequence
Gog et al.	2014	Multi-pattern matching with bidirectional indexes