A microarray database is a repository containing microarray gene expression data. The key uses of a microarray database are to store the measurement data, manage a searchable index, and make the data available to other applications for analysis and interpretation (either directly, or via user downloads).
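The three uses named above (storing measurements, maintaining a searchable index, and exposing data to other applications) can be illustrated with a minimal sketch. It does not reflect the schema of any real repository: the table names, column names, sample identifiers, and intensity values are hypothetical, chosen only to show the idea with Python's standard `sqlite3` module.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory store with hypothetical tables and toy values."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE samples (
            sample_id  TEXT PRIMARY KEY,
            experiment TEXT,
            tissue     TEXT
        );
        CREATE TABLE measurements (
            sample_id TEXT REFERENCES samples(sample_id),
            probe_id  TEXT,
            intensity REAL
        );
        -- Searchable index: find all measurements for a probe quickly.
        CREATE INDEX idx_probe ON measurements(probe_id);
    """)
    conn.executemany("INSERT INTO samples VALUES (?, ?, ?)", [
        ("S1", "EXP-A", "liver"),
        ("S2", "EXP-A", "brain"),
    ])
    conn.executemany("INSERT INTO measurements VALUES (?, ?, ?)", [
        ("S1", "probe_1", 7.5),
        ("S2", "probe_1", 9.25),
        ("S1", "probe_2", 3.0),
    ])
    conn.commit()
    return conn


def expression_for_probe(conn, probe_id):
    """Return (sample_id, tissue, intensity) rows for one probe."""
    cursor = conn.execute(
        "SELECT s.sample_id, s.tissue, m.intensity "
        "FROM measurements AS m "
        "JOIN samples AS s ON m.sample_id = s.sample_id "
        "WHERE m.probe_id = ? "
        "ORDER BY s.sample_id",
        (probe_id,),
    )
    return cursor.fetchall()


def export_tsv(rows):
    """Serialize query results as tab-separated text for downstream tools."""
    return "\n".join("\t".join(str(value) for value in row) for row in rows)
```

Calling `export_tsv(expression_for_probe(build_demo_db(), "probe_1"))` would yield one tab-separated line per sample, which stands in for the "user download" path. A production repository would add annotation metadata, access control, and a standard exchange format.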
Microarray databases can fall into two distinct classes:
- A peer-reviewed public repository that adheres to academic or industry standards and is designed to be used by many analysis applications and groups. Good examples are the Gene Expression Omnibus (GEO) from NCBI and ArrayExpress from EBI.
- A specialized repository associated primarily with the brand of a particular entity (lab, company, university, consortium, group), an application suite, a topic, or an analysis method, whether it is commercial, non-profit, or academic. These databases might have one or more of the following characteristics:
  - A subscription or license may be needed to gain full access.
  - The content may come primarily from a specific group (e.g. SMD, UPSC-BASE, or the Immunological Genome Project).
  - There may be constraints on who can use the data or for what purpose the data can be used.
  - Special permission may be required to submit new data, or there may be no obvious submission process at all.
  - Only certain applications may be equipped to use the data, often ones associated with the same entity (for example, caArray at NCI is specialized for caBIG).
  - Further processing or reformatting of the data may be required for standard applications or analysis.
  - There is a claim to address the 'urgent need' for a standard, centralized repository for microarray data (see, for example, YMD, last updated in 2003).
  - There is a claim to an incremental improvement over one of the public repositories.
  - The database is a meta-analysis application that incorporates studies from one or more public databases (e.g. Gemma primarily uses GEO studies; NextBio uses various sources).
Some of the best-known public, curated microarray databases are:
Database | Scope | Microarray experiment sets | Sample profiles | As of date |
---|---|---|---|---|
ArrayTrack | Hosts both public and private data, including MAQC benchmark data, with integrated analysis tools | 1,622 | 50,093 | February 2012 |
NCI mAdb | Hosts NCI data with integrated analysis and statistics tools | ? | 105,000 | March 2012 |
ImmGen database | Open access across all immune system cells; expression data, differential expression, coregulated clusters, regulation | 267 | 1,059 | January 2012 |
Genevestigator | Gene expression search engine based on manually curated, well-annotated public and proprietary microarray and RNA-seq datasets | 3,228 | 232,855 | October 2016 |
Gene Expression Omnibus - NCBI | Any curated MIAME-compliant molecular abundance study | 25,859 | 641,770 | October 28, 2011 |
ArrayExpress at EBI | Any curated MIAME- or MINSEQE-compliant transcriptomics data | 24,838 | 708,914 | October 28, 2011 |
Stanford Microarray database | Private and published microarray and molecule abundance database (now defunct) | 82,542 | ? | October 23, 2011 |
The Cancer Genome Atlas (TCGA) | Collection of expression data for different cancers | 21,229 | ? | August 30, 2013 |
GeneNetwork system | Open access standard arrays, exon arrays, and RNA-seq data for genetic analysis (eQTL studies) with analysis suite | ~100 | ~10,000 | July 2010 |
UNC modENCODE Microarray database | Nimblegen customer 2.1 million array | ~6 | 180 | July 17, 2009 |
UPSC-BASE | Data generated by microarray analysis within Umeå Plant Science Centre (UPSC) | ~100 | ? | November 15, 2007 |
UPenn RAD database | MIAME-compliant public and private studies, associated with ArrayExpress | ~100 | ~2,500 | September 1, 2007 |
UNC Microarray database | Provides microarray data storage, retrieval, analysis, and visualization services | ~31 | 2,093 | April 1, 2007 |
MUSC database | Repository for DNA microarray data generated by MUSC investigators as well as researchers in the global research community | ~45 | 555 | April 1, 2007 |
caArray at NCI | Cancer data, prepared for analysis on caBIG | 41 | 1,741 | November 15, 2006 |
External links
- ArrayExpress: Quick Tour on EBI Train OnLine
- Exploring functional genomics data with the ArrayExpress Archive on EBI Train OnLine
- Investigating gene expression patterns with the Gene Expression Atlas on EBI Train OnLine
- ArrayExpress: Submitting data using MAGE-TAB on EBI Train OnLine
- ArrayExplorer - A free tool to compare microarrays side by side.