Nothaft, 2017 - Google Patents
Scalable systems and algorithms for genomic variant analysis
Nothaft, 2017
- Document ID
- 5681884128893361648
- Author
- Nothaft F
- Publication year
- 2017
Snippet
With the cost of sequencing a human genome dropping below $1,000, population-scale sequencing has become feasible. With projects that sequence more than 10,000 genomes becoming commonplace, there is a strong need for genome analysis tools that can scale …
Classifications
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30386—Retrieval requests
- G06F17/30424—Query processing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30312—Storage and indexing structures; Management thereof
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/30—Information retrieval; Database structures therefor; File system structures therefor
- G06F17/30286—Information retrieval; Database structures therefor; File system structures therefor in structured data stores
- G06F17/30289—Database design, administration or maintenance
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/22—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for sequence comparison involving nucleotides or amino acids, e.g. homology search, motif or SNP [Single-Nucleotide Polymorphism] discovery or sequence alignment
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/18—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for functional genomics or proteomics, e.g. genotype-phenotype associations, linkage disequilibrium, population genetics, binding site identification, mutagenesis, genotyping or genome annotation, protein-protein interactions or protein-nucleic acid interactions
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/40—Transformations of program code
- G06F8/41—Compilation
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F19/00—Digital computing or data processing equipment or methods, specially adapted for specific applications
- G06F19/10—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology
- G06F19/28—Bioinformatics, i.e. methods or systems for genetic or protein-related data processing in computational molecular biology for programming tools or database systems, e.g. ontologies, heterogeneous data integration, data warehousing or computing architectures
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F8/00—Arrangements for software engineering
- G06F8/70—Software maintenance or management
- G06F8/76—Adapting program code to run in a different environment; Porting
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F17/00—Digital computing or data processing equipment or methods, specially adapted for specific functions
- G06F17/20—Handling natural language data
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F9/00—Arrangements for programme control, e.g. control unit
- G06F9/06—Arrangements for programme control, e.g. control unit using stored programme, i.e. using internal store of processing equipment to receive and retain programme
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F11/00—Error detection; Error correction; Monitoring
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING; COUNTING
- G06F—ELECTRICAL DIGITAL DATA PROCESSING
- G06F3/00—Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
Similar Documents
Publication | Title | Publication Date |
---|---|---|
Garrison et al. | A spectrum of free software tools for processing the VCF variant call format: vcflib, bio-vcf, cyvcf2, hts-nim and slivar | |
Van der Auwera et al. | From FastQ data to high‐confidence variant calls: the genome analysis toolkit best practices pipeline | |
US10600217B2 (en) | Methods for the graphical representation of genomic sequence data | |
Timón-Reina et al. | An overview of graph databases and their applications in the biomedical domain | |
JP5791149B2 (en) | Computer-implemented method, computer program, and data processing system for database query optimization | |
Coombe et al. | ntLink: a toolkit for de novo genome assembly scaffolding and mapping using long reads | |
Nothaft | Scalable systems and algorithms for genomic variant analysis | |
Li et al. | ntEdit+ Sealer: efficient targeted error resolution and automated finishing of long‐read genome assemblies | |
Nothaft | Scalable Genome Resequencing with ADAM and avocado | |
Ricketts et al. | Using LICHeE and BAMSE for reconstructing cancer phylogenetic trees | |
Shajii et al. | A Python-based optimization framework for high-performance genomics | |
Robinson et al. | Postprocessing the Alignment | |
US12164516B2 (en) | Click-to-script reflection | |
Forer et al. | Cloudflow - A framework for MapReduce pipeline development in biomedical research | |
Petkau | A framework for the indexing, querying, clustering, and visualization of microbial genomes for surveillance and outbreak investigation | |
Schatz | High performance computing for DNA sequence alignment and assembly | |
Spiegelberg et al. | Tuplex: robust, efficient analytics when Python rules | |
Kaye | Approaches to genome analysis through the application of graph theory | |
Ruano | Implementing Bioinformatics Pipelines and User Interfaces for Selection of Immunotherapeutic Targets in Colorectal Cancer | |
Purohit | PostGUI: A Modern Web Application for Sharing Biological Big Data | |
Oguchi | A Comparison of Sensitive Splice Aware Aligners in RNA Sequence Data Analysis in Leaping towards Benchmarking | |
Novak | Infrastructure for Scalable Analysis of Genomic Variation | |
Rengasamy | Engineering High Performance Workflows for End-to-End Acceleration of Genomic Applications | |
Yuan | BESS: bounded evaluation SQL systems | |
Kozanitis | Compressing and Querying the Human Genome |