DNA Microarrays
Dr. Abdul Wahab
Department of Microbiology
University of Karachi
What is a Microarray?
• “Microarray” has become a general term; there are
many types now
– DNA microarrays
– Protein microarrays
– Tissue microarray
–…
• We’ll be discussing DNA microarrays
What is a DNA Microarray?
• A grid of DNA spots (probes) on a substrate used
to detect complementary sequences
• The DNA spots can be deposited by
– Piezoelectric (ink jet style)
– Pen
– Photolithography (Affymetrix)
• The substrate can be plastic, glass, silicon
(Affymetrix)
• RNA/DNA of interest is labelled & hybridizes with
the array
• Hybridization with probes is detected optically.
Several types of arrays
• Spotted DNA arrays
– Developed by Pat Brown’s lab at Stanford
– PCR products of full-length genes (>100nt)
• Affymetrix gene chips
– Photolithography technology from the computer
industry allows building many 25-mers
• Ink-jet microarrays from Agilent
– 25-60-mers “printed” directly on glass slides
– Flexible, rapid, but expensive
GeneChip Technology
Affymetrix Inc
Miniaturized, high-density arrays of 1,300,000 DNA
oligos in a 1-cm by 1-cm area
Manufacturing Process
Solid-phase chemical synthesis and
photolithographic fabrication techniques employed
in the semiconductor industry
Array Fabrication Photolithography
• Light activated synthesis
• synthesize oligonucleotides on glass
slides
• 10^7 copies per oligo in 24 x 24 µm square
• Use 20 pairs of different 25-mers per gene
• Perfect match and mismatch
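The perfect match / mismatch pairing can be sketched in code. This is a minimal illustration, assuming the common convention that the mismatch (MM) probe differs from its perfect match (PM) partner only at the central base, which is swapped for its complement; the slide itself does not state which base is changed. The probe sequence and the function name make_mismatch are made up for this example.

```python
# Complement lookup for DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def make_mismatch(pm_probe: str) -> str:
    """Return an MM partner for a PM probe by complementing its central base.

    Assumption: the mismatch sits at the middle position (the 13th base of a 25-mer).
    """
    middle = len(pm_probe) // 2  # index 12 for a 25-mer
    return pm_probe[:middle] + COMPLEMENT[pm_probe[middle]] + pm_probe[middle + 1:]


# Hypothetical 25-mer, invented for illustration only.
pm = "ATGCGTACGTTAGCCTAGGATCCAT"
mm = make_mismatch(pm)
print(pm)
print(mm)
```

The two printed sequences differ at a single position, so any signal on the MM probe can be read as an estimate of non-specific hybridization.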
Affymetrix Microarrays
Raw image (scale labels: 1.28 cm, 50 µm)
~10^7 oligonucleotides:
half perfectly match the mRNA (PM),
half have one mismatch (MM)
Raw gene expression is the
intensity difference: PM - MM
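The PM - MM difference can be turned into a per-gene value in several ways. The sketch below simply averages PM - MM over the probe pairs of one gene; the averaging step, the function name average_difference and the intensity values are assumptions made for illustration, not the vendor's algorithm.

```python
from statistics import mean


def average_difference(pm, mm):
    """Mean of PM - MM over the probe pairs of one gene."""
    if len(pm) != len(mm):
        raise ValueError("PM and MM lists must have the same length")
    return mean(p - m for p, m in zip(pm, mm))


# Made-up intensities for four probe pairs (a real probe set would have about 20).
pm_intensities = [1200, 950, 1430, 1100]
mm_intensities = [300, 410, 380, 290]

signal = average_difference(pm_intensities, mm_intensities)
print(signal)
```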
Printed cDNA or Oligonucleotide Arrays
• Robotically spotted cDNAs (50-mers) or oligonucleotides (70-mers), vs.
Affymetrix arrays, which use 25-mers
• Printed on Nylon, Plastic, or Glass surface
Steel spotting pin
Spotted arrays
chemically modified slides
384 well source plate
1 nanolitre spots
90-120 µm diameter
Building the chip
PCR amplification
Directly from colonies with SP6-T7
primers in 96-well plates
Consolidate into
384-well plates
Arrayed Library
(96 or 384-well plates of
bacterial glycerol stocks)
Spot as microarray
on glass slides
Microarray Life Cycle
Biological
Question
Data Analysis &
Modelling
Sample
Preparation
Microarray Detection
Microarray Reaction
Biological question
Differentially expressed genes
Sample class prediction etc.
Experimental design
Microarray experiment
16-bit TIFF files
Image analysis
(Rfg, Rbg), (Gfg, Gbg)
Normalization
R, G
Estimation Testing Clustering Discrimination
Biological verification
and interpretation
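In the workflow above, image analysis yields a foreground and a background intensity per spot and channel, (Rfg, Rbg) and (Gfg, Gbg), which are reduced to R and G. A minimal sketch of that reduction, assuming simple subtraction of the local background; the small positive floor is an added assumption to keep later log ratios defined, and the numbers are invented.

```python
def background_correct(fg, bg, floor=1.0):
    """Subtract local background from foreground, clamping at a small positive floor."""
    return max(fg - bg, floor)


# Made-up intensities for one spot, on the 16-bit scale (0 to 65535).
Rfg, Rbg = 5200.0, 400.0  # red channel
Gfg, Gbg = 1500.0, 300.0  # green channel

R = background_correct(Rfg, Rbg)
G = background_correct(Gfg, Gbg)
print(R, G)
```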
Cartridge-based Expression
Microarrays
Involves fluorescently tagged,
biotinylated cRNA
-One chip per sample
-Uses single fluorescent dye
-More expensive
Affymetrix GeneChip Image
Spotted cDNA and Oligo Glass
Arrays
Involves two dyes on the same slide
• Red dye-Cy5
• Green dye-Cy3
• Control and experimental
cDNA on same chip
Pros/Cons of Different
Technologies
Spotted Arrays
• relatively cheap to make (~$10 per slide)
• flexible - spot anything you want
• cheap, so experiments can be repeated many times
• highly variable spot deposition
• usually have to make your own
• accuracy at extremes of the range may be less
Affy Gene Chips
• expensive ($500 or more)
• limited types available, no chance of specialized chips
• usually fewer repeated experiments
• more uniform DNA features
• can buy off the shelf
• dynamic range may be slightly better
Purposes
• So why do we use DNA microarrays?
– To measure changes in gene expression levels: gene
expression can be compared between two
samples, such as cells at different stages
of mitosis
– To observe genomic gains and losses (microarray
Comparative Genomic Hybridization, CGH)
– To observe mutations in DNA
Central “Assumption” of Gene
Expression Microarrays
• The level of a given mRNA is positively correlated
with the expression of the associated protein.
– Higher mRNA levels mean higher protein
expression, lower mRNA means lower protein
expression
• Other factors:
– Protein degradation, mRNA degradation,
polyadenylation, codon preference, translation
rates, alternative splicing, translation lag…
• This is relatively obvious, but worth emphasizing
An Array Experiment
STEP 1: Collect Samples
Samples can come from a variety of organisms. We’ll use two
samples: cancerous human skin tissue & healthy
human skin tissue
STEP 2: Isolate mRNA
• Extract the RNA from the samples, using either a
column or a solvent such as phenol-chloroform.
• After isolating the RNA, we need to separate the mRNA
from the rRNA and tRNA. mRNA has a poly-A tail, so
we can use a column containing beads with poly-T tails
to bind the mRNA.
• Rinse with buffer to release the mRNA from the beads.
The buffer changes the pH, disrupting the base pairing
between the poly-A tails and the poly-T beads.
STEP 3: Create Labelled DNA
• Add a labelling mix to the
RNA. The labelling mix
contains poly-T (oligo dT)
primers, reverse transcriptase
(to make cDNA), and
fluorescently dyed nucleotides.
• We will add cyanine 3
(fluoresces green) to the RNA
from the healthy cells and
cyanine 5 (fluoresces red) to
the RNA from the cancerous cells.
• The primer and RT bind to the
mRNA first; the RT then
incorporates the fluorescently
dyed nucleotides, creating a
complementary strand of DNA (cDNA)
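At the sequence level, reverse transcription produces a DNA strand complementary to the mRNA template. A minimal sketch of that relationship, ignoring the dye labels; the transcript and the function name reverse_transcribe are hypothetical.

```python
# Watson-Crick pairing from an RNA template base to the DNA base built opposite it.
RNA_TO_CDNA = {"A": "T", "U": "A", "G": "C", "C": "G"}


def reverse_transcribe(mrna: str) -> str:
    """Return the first-strand cDNA, written 5'->3', for an mRNA written 5'->3'."""
    return "".join(RNA_TO_CDNA[base] for base in reversed(mrna))


# Hypothetical transcript ending in a short poly-A tail.
mrna = "AUGGCCUUAAAAAA"
cdna = reverse_transcribe(mrna)
print(cdna)
```

The cDNA begins with a run of T bases, which corresponds to the oligo dT primer annealed to the poly-A tail.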
STEP 4: Hybridization
• Apply the cDNA we
have just created to a
microarray plate.
• When comparing two
samples, apply both
samples to the same
plate.
• The labelled single-stranded
cDNA will bind to the
complementary probe DNA
already present on the plate.
Hybridization chamber
[Figure labels: hybridization chamber, 3X SSC, array, LifterSlip, slide, slide label]
• Humidity
• Temperature
• Formamide
(Lowers the Tm)
STEP 5: LASERS!
STEP 5: Microarray Scanner
The scanner has a laser, a computer,
and a camera.
The laser excites the fluorescent dyes on the
hybridized cDNA, causing them to fluoresce.
The camera records the images
produced when the laser scans the
plate.
The computer allows us to
immediately view our results and it
also stores our data.
Scan
Green: down-regulated in the diseased sample
Red: up-regulated in the diseased sample
Yellow: equal level in both samples
STEP 6: Analyze the Data
GREEN – the healthy sample hybridized
more than the diseased sample.
RED – the diseased/cancerous sample
hybridized more than the non-diseased
sample.
YELLOW - both samples hybridized
equally to the target DNA.
BLACK - areas where neither sample
hybridized to the target DNA.
By comparing the differences in gene
expression between the two samples, we
can understand more about the
genomics of a disease.
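The colour reading above can be expressed as a rule on the two channel intensities. This is a sketch only: the minimum-signal cutoff of 100 and the 2-fold threshold are illustrative assumptions, not values given in these slides, and classify_spot is a made-up name.

```python
from math import log2


def classify_spot(cy5, cy3, min_signal=100.0, min_fold=2.0):
    """Label a spot by the colour it would show in a merged red/green image."""
    if cy5 < min_signal and cy3 < min_signal:
        return "black"  # neither sample hybridized
    # Clamp at 1.0 so the ratio and its logarithm are always defined.
    ratio = log2(max(cy5, 1.0) / max(cy3, 1.0))
    if ratio >= log2(min_fold):
        return "red"  # diseased (Cy5) sample hybridized more
    if ratio <= -log2(min_fold):
        return "green"  # healthy (Cy3) sample hybridized more
    return "yellow"  # both samples hybridized about equally


for cy5, cy3 in [(10000, 200), (300, 9000), (4800, 4800), (40, 55)]:
    print(cy5, cy3, classify_spot(cy5, cy3))
```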
Image Analysis & Data Visualization
Cy3     Cy5     Cy5/Cy3   log2(Cy5/Cy3)
200     10000   50.00     5.64
4800    4800    1.00      0.00
9000    300     0.03      -4.91
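The ratio and log2 columns follow directly from the Cy3 and Cy5 intensities. A short sketch that recomputes them from the three intensity pairs in the table; rounding to two decimals is chosen here to match the table's display.

```python
from math import log2

# (Cy3, Cy5) intensity pairs taken from the table.
spots = [(200, 10000), (4800, 4800), (9000, 300)]

rows = []
for cy3, cy5 in spots:
    ratio = cy5 / cy3
    rows.append((cy3, cy5, round(ratio, 2), round(log2(ratio), 2)))

for row in rows:
    print(row)
```

On the log2 scale a 50-fold increase and a 30-fold decrease sit at similar distances from zero (5.64 and -4.91), whereas the raw ratios 50.00 and 0.03 are badly asymmetric.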
[Figure: heat map with genes as rows and experiments as columns; scale runs from 8-fold underexpressed, through 4- and 2-fold, to 8-fold overexpressed]
Data Normalization
• Normalize data to correct for variances
– Dye bias
– Location bias
– Intensity bias
– Pin bias
– Slide bias
• Control vs. non-control spots
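One simple correction for an overall dye bias is to centre the log ratios of a slide on their median. This is a sketch of that single idea, assuming most genes are not differentially expressed; it does not address location, intensity, pin or slide bias, and the intensities and the function name median_center are invented for illustration.

```python
from math import log2
from statistics import median


def median_center(log_ratios):
    """Shift log2(Cy5/Cy3) values so that their median becomes zero."""
    shift = median(log_ratios)
    return [m - shift for m in log_ratios]


# Made-up (Cy5, Cy3) intensities in which red is systematically under-detected.
spots = [(500, 1000), (2000, 4000), (1000, 1000), (250, 1000), (1500, 3000)]

raw = [log2(cy5 / cy3) for cy5, cy3 in spots]
normalized = median_center(raw)
print(raw)
print(normalized)
```

Before centring, most spots appear 2-fold down; after centring, the typical spot sits at zero and only the genuinely different spots stand out.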
Data Normalization
Uncalibrated: red light under-detected
Calibrated: red and green equally detected
Hierarchical clustering
Genomic Reprogramming in Response to Oxidant
Time course (minutes): 0, 10, 20, 40, 60, 120
One-third of genome expression is
transiently reprogrammed
6218 genes
Scale: fold repression >9, >6, >3; 1:1; fold induction >3, >6, >9
Heat map representing the expression level of selected genes of Salmonella wild type and baeR mutant in LB
+/− 20 mM sodium tungstate.
Appia-Ayme C, Patrick E, Sullivan MJ, Alston MJ, et al. (2011) Novel Inducers of the Envelope Stress Response BaeSR in Salmonella
Typhimurium: BaeR Is Critically Required for Tungstate Waste Disposal. PLoS ONE 6(8): e23713. doi:10.1371/journal.pone.0023713
http://www.plosone.org/article/info:doi/10.1371/journal.pone.0023713
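Heat maps like the ones above are usually ordered by hierarchical clustering, so that genes with similar expression profiles end up next to each other. The sketch below shows one variant, agglomerative clustering with Euclidean distance and average linkage; the slides do not say which distance or linkage was used, so both are assumptions, and the gene names and values are made up.

```python
from itertools import combinations
from math import dist


def average_linkage(a, b, profiles):
    """Mean Euclidean distance between all pairs of genes drawn from clusters a and b."""
    return sum(dist(profiles[g], profiles[h]) for g in a for h in b) / (len(a) * len(b))


def hierarchical_clustering(profiles):
    """Agglomerative clustering; returns the merges in the order they happen."""
    clusters = [(gene,) for gene in profiles]
    merges = []
    while len(clusters) > 1:
        # Find the two clusters that are closest under average linkage.
        a, b = min(combinations(clusters, 2),
                   key=lambda pair: average_linkage(pair[0], pair[1], profiles))
        clusters.remove(a)
        clusters.remove(b)
        clusters.append(a + b)
        merges.append((a, b))
    return merges


# Made-up log2 fold changes at four time points for four hypothetical genes.
profiles = {
    "geneA": [0.1, 2.0, 2.5, 0.3],
    "geneB": [0.0, 1.8, 2.7, 0.2],
    "geneC": [0.2, -1.9, -2.4, 0.1],
    "geneD": [0.1, -2.2, -2.2, 0.0],
}

merges = hierarchical_clustering(profiles)
for left, right in merges:
    print(left, "+", right)
```

The two induced genes merge first, then the two repressed genes, and the two groups join only at the last step, which is the tree structure usually drawn beside a heat map.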
Differentially expressed genes during cell cycle
Microarray Limitations
• Cross-hybridization of sequences with high identity
• Chip to chip variation
• True measure of abundance?
• Do mRNA levels reflect protein levels?
• Generally, do not “prove” new biology - simply
suggest genes involved in a process, a hypothesis
that will require traditional experimental
verification.
• What fold change has biological relevance?
• Need cloned EST or some sequence knowledge --
rare messages may be undetected
• Expensive!! Not every lab can afford to repeat
experiments.
• The real limitation is Bioinformatics
Microarray Potential Applications
• Biological discovery
– new and better molecular diagnostics
– new molecular targets for therapy
– finding and refining biological pathways
– Mutation and polymorphism detection
• Recent examples
– molecular diagnosis of leukemia, breast
cancer, ...
– appropriate treatment for genetic signature
– potential new drug targets