Wandelt et al., 2014 - Google Patents

Trends in genome compression

Document ID
4169689631332189644
Author
Wandelt S
Bux M
Leser U
Publication year
2014
Publication venue
Current bioinformatics

Snippet

Technological advancements in high-throughput sequencing have led to a tremendous increase in the amount of genomic data produced. With the cost down to 2,000 USD for a single human genome, sequencing dozens of individuals is an undertaking that is …

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/22 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30067 File systems; File servers
    • G06F17/30129 Details of further file system functionalities
    • G06F17/3015 Redundancy elimination performed by the file system
    • G06F17/30156 De-duplication implemented within the file system, e.g. based on file segments
    • H ELECTRICITY
    • H03 BASIC ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/3084 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method
    • H03M7/3086 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction using adaptive string matching, e.g. the Lempel-Ziv method employing a sliding window, e.g. LZ77
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F17/00 Digital computing or data processing equipment or methods, specially adapted for specific functions
    • G06F17/30 Information retrieval; Database structures therefor; File system structures therefor
    • G06F17/30286 Information retrieval; Database structures therefor; File system structures therefor in structured data stores
    • G PHYSICS
    • G06 COMPUTING; CALCULATING; COUNTING
    • G06F ELECTRICAL DIGITAL DATA PROCESSING
    • G06F19/00 Digital computing or data processing equipment or methods, specially adapted for specific applications
    • G06F19/10 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
    • G06F19/28 Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
    • H ELECTRICITY
    • H03 BASIC ELECTRONIC CIRCUITRY
    • H03M CODING; DECODING; CODE CONVERSION IN GENERAL
    • H03M7/00 Conversion of a code where information is represented by a given sequence or number of digits to a code where the same information or similar information or a subset of information is represented by a different sequence or number of digits
    • H03M7/30 Compression; Expansion; Suppression of unnecessary data, e.g. redundancy reduction
    • H03M7/40 Conversion to or from variable length codes, e.g. Shannon-Fano code, Huffman code, Morse code

Similar Documents

Publication / Publication Date / Title
Wandelt et al. Trends in genome compression
Wandelt et al. FRESCO: Referential compression of highly similar sequences
Kuruppu et al. Optimized relative Lempel-Ziv compression of genomes
Deorowicz et al. Robust relative compression of genomes with random access
Deorowicz et al. Data compression for sequencing data
Zhang et al. Light-weight reference-based compression of FASTQ data
Bakr et al. DNA lossless compression algorithms
Sardaraz et al. Advances in high throughput DNA sequence data compression
EP2595076B1 (en) Compression of genomic data
Yao et al. HRCM: an efficient hybrid referential compression method for genomic big data
CN110168652B (en) Methods and systems for storing and accessing bioinformatics data
Saha et al. NRGC: a novel referential genome compression algorithm
Grabowski et al. MBGC: multiple bacteria genome compressor
KR102729412B1 (en) Efficient compression method and system for genome sequence reads
Grassi et al. KungFQ: A simple and powerful approach to compress fastq files
Chen et al. Efficient sequencing data compression and FPGA acceleration based on a two-step framework
Yanovsky ReCoil - an algorithm for compression of extremely large datasets of DNA data
JP2022549580A (en) Methods for compression of genomic sequence data
JP2020509474A (en) Methods and systems for reconstructing genomic reference sequences from compressed genomic sequence reads
Liu et al. Quality scores compression of genomic sequencing data: a comprehensive review and performance evaluation
Selva et al. SRComp: short read sequence compression using burstsort and Elias omega coding
Roy et al. Sbvrldnacomp: an effective DNA sequence compression algorithm
Roy et al. An efficient compression algorithm for forthcoming new species
Roy et al. A survey of data structures and algorithms used in the context of compression upon biological sequence
Law Application of signal processing for DNA sequence compression