Transcriptomics
Most readers will have heard of ribonucleic acid (RNA) and deoxyribonucleic acid (DNA)
and their roles in protein synthesis and gene expression. Whereas genomics studies an
organism’s genome, its entire DNA sequence, transcriptomics studies all of the RNA
transcripts produced under a given condition. This entire collection of RNA transcripts
in a cell, the transcriptome, allows researchers to determine when and where genes are
turned on and off in an organism.
Transcriptomics technologies are the techniques used to study an organism's transcriptome,
the sum of all of its RNA transcripts. The information content of an organism is recorded in
the DNA of its genome and expressed through transcription. Here, mRNA serves as a
transient intermediary molecule in the information network, whilst non-coding
RNAs perform additional diverse functions. A transcriptome captures a snapshot in time of
the total transcripts present in a cell. Transcriptomics technologies provide a broad account of
which cellular processes are active and which are dormant. A major challenge in molecular
biology is to understand how a single genome gives rise to a variety of cells. Another is to
understand how gene expression is regulated.
Transcriptomics functions as a hypothesis generator; the extensive data gives rise to new
questions in the field of genetics. Clinically, transcriptomics has assisted doctors with
diagnoses by identifying biomarkers of diseases. Agriculturally, identifying how plants are
affected by biotic and abiotic stressors allows for the development of new crops with
improved traits. Some limitations remain, such as degradation and fragmentation of the
RNA within formalin-fixed paraffin-embedded (FFPE) tissue (a method used to preserve
cellular information and tissue morphology). Even so, transcriptomics can be applied
broadly to better understand life, how it functions, and how changes in genes contribute
to the overall health of the organism.
The first attempts to study the whole transcriptome began in the early 1990s, and
technological advances since the late 1990s have made transcriptomics a widespread
discipline. Transcriptomics has been defined by repeated technological innovations that
transform the field. There are two key contemporary techniques in the field: microarrays,
which quantify a set of predetermined sequences, and RNA sequencing (RNA-Seq), which
uses high-throughput sequencing to capture all sequences.
Transcriptomics has been characterised by the development of new techniques which have
redefined what is possible every decade or so and rendered previous technologies obsolete. The
first attempt at capturing a partial human transcriptome was published in 1991 and reported
609 mRNA sequences from the human brain. In 2008, two human transcriptomes composed
of millions of transcript-derived sequences covering 16,000 genes were published [2][3], and,
by 2015, transcriptomes had been published for hundreds of individuals. Transcriptomes of
different disease states, tissues, or even single cells are now routinely generated. This
explosion in transcriptomics has been driven by the rapid development of new technologies
with improved sensitivity and economy.
The word “transcriptome” was first used in the 1990s. In 1995, one of the earliest
sequencing-based transcriptomic methods was developed, serial analysis of gene expression
(SAGE), which worked by Sanger sequencing of concatenated random transcript fragments.
Transcripts were quantified by matching the fragments to known genes. A variant of SAGE
using high-throughput sequencing techniques, called digital gene expression analysis, was
also briefly used. However, these methods were largely overtaken by high-throughput
sequencing of entire transcripts, which provided additional information on transcript
structure, e.g., splice variants.
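
The quantification step described for SAGE, matching sequenced fragments to known genes
and counting the matches, can be illustrated with a minimal sketch. The tag sequences,
gene names, and the count_tags function below are hypothetical and chosen only for
illustration; real SAGE analysis also has to handle sequencing errors and tags that map
to more than one gene, which this sketch ignores.

```python
from collections import Counter

# Hypothetical lookup table: short tag sequence -> gene name (illustrative values only).
TAG_TO_GENE = {
    "ACGTACGTAC": "geneA",
    "TTGACCGTAA": "geneB",
    "GGCATTCAGA": "geneC",
}


def count_tags(tags, tag_to_gene):
    """Count exact tag matches per known gene; tally unmatched tags separately."""
    counts = Counter()
    unmatched = 0
    for tag in tags:
        gene = tag_to_gene.get(tag)
        if gene is None:
            unmatched += 1  # tag does not correspond to any known gene
        else:
            counts[gene] += 1
    return counts, unmatched


# Made-up list of tags, as if read from sequenced concatenated fragments.
observed_tags = ["ACGTACGTAC", "TTGACCGTAA", "ACGTACGTAC", "CCCCCCCCCC", "ACGTACGTAC"]
counts, unmatched = count_tags(observed_tags, TAG_TO_GENE)
```

In this sketch, the count for each gene stands in for its relative expression level: a
gene whose tag is observed more often is taken to be more abundantly transcribed.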