EP1444362A2 - High-throughput transcriptome analysis and functional validation - Google Patents
- Publication number
- EP1444362A2 (application EP02780490A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- cell
- dsrna
- candidate
- gene
- cdna
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Classifications
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1072—Differential gene expression library synthesis, e.g. subtracted libraries, differential screening
- C12N15/1079—Screening libraries by altering the phenotype or phenotypic trait of the host
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
- C12N15/11—DNA or RNA fragments; Modified forms thereof; Non-coding nucleic acids having a biological activity
- C12N15/111—General methods applicable to biologically active non-coding nucleic acids
- C12N2310/00—Structure or type of the nucleic acid
- C12N2310/10—Type of nucleic acid
- C12N2310/14—Type of nucleic acid interfering nucleic acids [NA]
- C12N2320/00—Applications; Uses
- C12N2320/10—Applications; Uses in screening processes
- C12N2320/12—Applications; Uses in screening processes in functional genomics, i.e. for the determination of gene function
- C12N2330/00—Production
- C12N2330/30—Production chemically synthesised
Definitions
- The term "transcriptome" has been coined to describe the set of all genes expressed, at any given time, under defined conditions in a given tissue (Velculescu et al., 1997, Cell 88:243-51).
- the detection of changes to the transcriptome can provide useful information regarding the identity of genes and gene products important in development, drug response, and, particularly, human disease processes.
- methods now used for identifying changes in the transcriptome suffer from a variety of deficiencies, e.g., they are expensive, require relatively large quantities of starting material, and/or do not efficiently identify low abundance transcripts important in mediating cell processes.
- the method for identifying and producing an active dsRNA comprises: (a) producing a plurality of cDNA, wherein each cDNA comprises at least a portion of a gene that is expressed in a cell; (b) producing a candidate dsRNA from at least one of the cDNAs; (c) introducing the candidate dsRNA into a reference cell having a gene expression similar to the cell in step (a); and (d) identifying an active dsRNA by determining whether the candidate dsRNA attenuates a desired gene expression in the reference cell.
- methods of the present invention can also include producing the identified active dsRNA from the corresponding cDNA of step (a). Since methods of the present invention provide a library, preferably a comprehensive library, of cDNA, once the active dsRNA has been identified it can be readily synthesized by transcription of the corresponding cDNA. Therefore, methods of the present invention do not require conventional chemical oligonucleotide synthesis and/or availability of known gene sequences to produce the active dsRNA.
- Identification of the active dsRNA includes selecting a candidate gene and identifying whether the dsRNA of at least a portion of the candidate gene is an active dsRNA by determining whether modulation of expression of the candidate gene by dsRNA in a reference cell has a functional effect in the reference cell.
- the candidate gene is a gene that is expressed in a test cell and/or a control cell, and/or is expressed at a detectably different level with respect to the test cell and the control cell.
- the candidate gene can be an endogenous gene of the reference cell, or it can be present in the reference cell as an extrachromosomal gene.
- the test cell and control cell differ with respect to a particular cellular characteristic of interest.
- the active dsRNA alters a cellular activity or a cellular state in the reference cell by modulating the expression of the candidate gene.
- Active dsRNA can be identified by a variety of methods, including by introducing the candidate dsRNA into the reference cell and detecting an alteration in a cellular activity or a cellular state in the reference cell.
- the alteration in a cellular activity or a cellular state in the reference cell indicates that the candidate gene plays a functional role in the reference cell and that the candidate dsRNA is an active dsRNA.
- the candidate dsRNA is selected such that it is substantially identical to at least a part of the candidate gene.
- the cellular characteristic is cell health
- the test cell is a diseased cell and the control cell is a healthy cell
- the candidate gene is potentially correlated with a disease.
- the cellular characteristic is stage of development and the test cell and the control cell are at different stages of development, and the candidate gene is potentially correlated with mediating the change between the different stages of development.
- the cellular characteristic is cellular differentiation and the candidate gene is potentially correlated with controlling cellular differentiation.
- the plurality of cDNA, which is used to synthesize dsRNA, is produced from at least one mRNA which is isolated from the cell.
- the isolated mRNA is then reverse transcribed by any of the methods conventionally known to one skilled in the art to produce the cDNA.
- the cDNA is then digested with one or more, preferably two, restriction enzymes to produce a plurality of similar length cDNAs.
- the restriction enzyme is selected from the group consisting of DpnI and RsaI.
- a plasmid or PCR fragment is then generated from the digested cDNAs by any of the conventional methods known to one skilled in the art.
- the candidate dsRNA is then produced by transcription of the plasmid or the PCR fragment.
- the cDNA is produced from all mRNAs that are isolated from the control cell.
- This provides a comprehensive cDNA library which comprises at least a portion of substantially all genes that are actively expressed in the cell.
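As a rough illustration of the digestion step described above, the following Python sketch cuts a cDNA sequence in silico with the two named enzymes. The recognition sequences (GATC for DpnI, GTAC for RsaI) and the blunt cut in the middle of each site are assumptions taken from general enzyme references, not from this text, and the sketch ignores the fact that DpnI cleaves only methylated sites. The function name `digest` is hypothetical.

```python
import re

# Assumed recognition sequences (not stated in the text): DpnI = GATC,
# RsaI = GTAC. Both are treated as blunt cutters that cut mid-site.
SITES = {"DpnI": "GATC", "RsaI": "GTAC"}
CUT_OFFSET = 2  # cut position inside each 4-bp site


def digest(cdna: str, enzymes=("DpnI", "RsaI")) -> list[str]:
    """Return the fragments obtained by cutting cdna at every listed site."""
    cdna = cdna.upper()
    cuts = set()
    for name in enzymes:
        site = SITES[name]
        # A lookahead pattern finds every occurrence, including overlaps.
        for m in re.finditer(f"(?={site})", cdna):
            cuts.add(m.start() + CUT_OFFSET)
    bounds = [0] + sorted(cuts) + [len(cdna)]
    return [cdna[a:b] for a, b in zip(bounds, bounds[1:]) if b > a]


if __name__ == "__main__":
    for fragment in digest("AAGATCTTGTACGG"):
        print(len(fragment), fragment)
```

Using two frequent four-base cutters together shortens the average fragment, which is one plausible reading of why the text prefers two enzymes for producing cDNAs of similar length.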
- Another aspect of the present invention provides a method for identifying and validating activity of an active dsRNA which attenuates a desired gene expression in a cell. The method generally comprises producing a candidate dsRNA, introducing the candidate dsRNA into a reference cell and identifying whether the candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell.
- Yet another aspect of the present invention provides a high-throughput method for correlating genes and gene function, said method comprising: (a) producing a plurality of candidate dsRNAs from a plurality of cDNAs of a control cell such that each candidate dsRNA comprises at least a portion of a gene that is expressed in the control cell; (b) introducing each of the candidate dsRNAs into a plurality of separate reference cells each having a gene expression similar to the control cell in step (a); and (c) identifying which candidate dsRNA is an active dsRNA by detecting an alteration in a cellular activity or a cellular state in the reference cell, a desired alteration indicating that the gene corresponding to the candidate dsRNA plays a functional role in the reference cell.
- the plurality of cDNAs is produced from a plurality of mRNAs as described herein.
- each candidate dsRNA is substantially identical to at least a portion of the candidate gene.
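The decision in step (c) of the high-throughput method can be sketched in Python as follows. This is only an illustration under stated assumptions: the readout (for example a viability score per well), the use of a single control value, the relative-change threshold of 0.5 and the name `find_active_dsrnas` are all hypothetical and do not come from the text.

```python
def find_active_dsrnas(readouts, control_readout, min_relative_change=0.5):
    """Return ids of candidate dsRNAs whose readout differs from control.

    readouts: mapping of candidate dsRNA id -> value measured in the
        separate reference-cell sample that received that dsRNA.
    control_readout: value measured in control reference cells
        (hypothetical, e.g. cells given an unrelated dsRNA).
    min_relative_change: assumed cutoff for calling an alteration.
    """
    if control_readout == 0:
        raise ValueError("control readout must be non-zero")
    active = []
    for dsrna_id, value in readouts.items():
        change = abs(value - control_readout) / abs(control_readout)
        if change >= min_relative_change:
            active.append(dsrna_id)
    return active


if __name__ == "__main__":
    wells = {"cand_A": 0.9, "cand_B": 0.35, "cand_C": 1.6}
    print(find_active_dsrnas(wells, control_readout=1.0))
```

In practice the cutoff would be chosen from replicate variability rather than fixed in advance; the fixed value here only keeps the sketch short.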
- Detecting an alteration in a cellular activity or a cellular state in the reference cell can involve a variety of methods. For example, one can detect modulation of ligand binding to a protein, detect a change in phenotype or determine whether the protein encoded by the candidate gene binds to another protein to form a complex that can be coimmunoprecipitated. Detecting a change in phenotype is particularly useful when the reference cell is a part of an organism.
- detecting an alteration in a cellular activity or a cellular state in the reference cell can involve determining whether interference with expression of the candidate gene in the reference cell is correlated with alteration of a cellular activity or cellular state. Interference can be achieved by introducing a double-stranded RNA into the reference cell that can specifically hybridize to the candidate gene.
- the candidate gene can be selected from a normalized library prepared from cells of the same type as the test cell or the control cell. In one particular embodiment, the candidate gene is present in low abundance in the normalized library.
- the candidate gene is a differentially expressed gene selected from a subtracted library that is enriched for genes that are differentially expressed with respect to the test cell and the control cell.
- the subtracted library is also normalized and the candidate gene is one of the genes that is both present in low abundance and differentially expressed in the subtracted and normalized library.
- the candidate gene is selected by a method comprising: (i) preparing (A) a tester-normalized cDNA library which is a normalized library prepared from test cells; (B) a driver-normalized cDNA library which is a normalized library prepared from control cells; (C) a tester-subtracted cDNA library which is enriched in one or more genes that are up-regulated with respect to the test cell and the control cell, and (D) a driver-subtracted cDNA library which is enriched in one or more genes that are down-regulated with respect to the test cell and the control cell; and (ii) identifying one or more clones from the normalized libraries and/or the subtracted libraries, wherein the candidate gene is one of the clones identified.
- identification of one or more clones from the normalized libraries comprises: (A) contacting clones from the tester-normalized cDNA library with labeled probes derived from mRNA from test cells and contacting clones from the driver-normalized cDNA library with labeled probes derived from mRNA from control cells under conditions whereby probes specifically hybridize with complementary clones to form a first set of hybridization complexes; and (B) detecting at least one hybridization complex from the first set of hybridization complexes to identify a clone from one of the normalized libraries which is present in low abundance.
- identification of one or more clones from the subtracted libraries comprises: (A) contacting clones from the tester-subtracted cDNA library and contacting clones from the driver-subtracted cDNA library with a population of labeled probes under conditions whereby probes from the population of probes specifically hybridize with complementary clones to form a second set of hybridization complexes, and wherein the population of labeled probes is derived from mRNA from test cells and control cells; and (B) detecting at least one hybridization complex from the second set of hybridization complexes to identify a clone from one of the subtracted libraries which is differentially expressed above a threshold level with respect to the subtracted libraries.
- the test cell is obtained from a mammal that has had a stroke or is at risk for stroke.
- the test cell is obtained from a mammal that has a neurological disorder or develops phenotypes mimicking human neurological disorders.
- the reference cell can be part of a cell culture, a tissue, part of an organism or an embryo, and can be a neural cell, a glial cell or a neuroblastoma cell.
- the reference cell can be a mammalian cell.
- the reference cell is a human cell or a model system which is useful for investigating a variety of human diseases and/or illnesses.
- the reference cell is useful as a model system for investigating neurological disorders in humans.
- the reference cell has increased sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof.
- the detecting step can comprise detecting a decrease in cellular sensitivity to N-methyl-D-aspartate, β-amyloid, peroxide, oxygen-glucose deprivation, or combinations thereof.
- FIGURES Figure 1 shows duplicate arrays probed using the "knock-down" methods of the invention. Arrows show (A) presence of hybridization signal (triplicate spots) and (B) reduction of signal due to inclusion of knock-down polynucleotide during hybridization. This figure shows a portion (detail) of a larger array.
- eGFP dsRNA Green Fluorescent Protein
- Figures 6B-6D: Results showing that RNAi-mediated inhibition of PARP expression induces resistance to oxygen glucose deprivation (OGD).
- Figures 6B and 6C show views of neuroblastoma cells (AGYNB-010 cells) subjected to 6 hours of OGD. Cell viability was assayed by staining with a fluorescent dye that preferentially stains healthy cells rather than dead cells. Cells transfected with dsPARP 3 hours after initiation of OGD show significantly less cell death (Figure 6C) as compared to control cells transfected with dsEGFP (Figure 6B).
- Figure 6D is a chart showing that AGYNB-010 cells transfected with dsPARP are rescued from cell death following 3 hours of OGD, whereas control cells that are either untransfected (mock cells) or transfected with dsEGFP show significant cell death after 6 hours of OGD.
- Figures 7A-7C: Charts showing sensitivity of the AGYNB-010 neuroblastoma cell line to β-amyloid (Figure 7A), N-methyl-D-aspartate (NMDA) (Figure 7B) and oxygen glucose deprivation (OGD) (Figure 7C).
- Figures 8A and 8B are graphs depicting the expression of EGFP and UCP2 in the presence of dsRNA.
- Figures 9A-9D show dsRNA-mediated inhibition of expression of caspase-3 (A), fas-activated kinase (FASTK, B), 14-3-3 (C) and 3-hydroxy-3-methylglutaryl-Coenzyme A synthase (D).
- Control level of each mRNA was determined in cells transfected with dsEGFP RNA and in mock transfected cells. Levels of GAPDH expression served as controls to ensure the quality of the mRNA and that an equal amount of cDNA was used in each reaction.
- Figure 10 is a graph depicting the effect of dsRNA in differentiated N2a cells.
- Real-time PCR was used to measure the levels of 14-3-3 mRNA from cells transfected with lipofectamine alone, dsRNA 14-3-3, and dsRNA EGFP. Data presented are the mean of two technical repeats. Similar results were obtained in two independent experiments.
- The term "tissue" refers to any aggregation of morphologically or functionally related cells, or cell systems, and thus includes cells (including in vitro cultured cells), tissues, organs, and the like.
- The term "library" refers to a collection of polynucleotides (usually in the form of double-stranded cDNA) derived from mRNA of a particular tissue. The polynucleotides of a library may be, but are not necessarily, cloned into a vector.
- The terms "nucleic acid," "polynucleotide" and "oligonucleotide" refer to a deoxyribonucleotide or ribonucleotide polymer in either single- or double-stranded form, and unless otherwise limited, encompass known analogs of natural nucleotides that hybridize to nucleic acids in a manner similar to naturally-occurring nucleotides. Examples of such analogs include, without limitation, phosphorothioates, phosphoramidates, methyl phosphonates, chiral-methyl phosphonates, 2'-O-methyl ribonucleotides, and peptide-nucleic acids (PNAs).
- a “subsequence” or “segment” refers to a sequence of nucleotides that comprise a part of a longer sequence of nucleotides.
- a “gene,” for the purposes of the present disclosure, includes a DNA region encoding a gene product (see infra).
- the region can also include DNA regions that regulate the production of the gene product, whether or not such regulatory sequences are adjacent to coding and/or transcribed sequences.
- a gene can include, without limitation, promoter sequences, terminators, translational regulatory sequences such as ribosome binding sites and internal ribosome entry sites, enhancers, silencers, insulators, boundary elements, replication origins, matrix attachment sites and locus control regions.
- Gene expression refers to the conversion of the information, contained in a gene, into a gene product.
- a gene product can be the direct transcriptional product of a gene (e.g., mRNA, tRNA, rRNA, antisense RNA, ribozyme, structural RNA or any other type of RNA) or a protein produced by translation of a mRNA.
- Gene products also include RNAs which are modified, by processes such as capping, polyadenylation, methylation, and editing, and proteins modified by, for example, methylation, acetylation, phosphorylation, ubiquitination, ADP-ribosylation, myristoylation, and glycosylation.
- Modulation refers to a change in the level or magnitude of an activity or process. The change can be either an increase or a decrease. For example, modulation of gene expression includes both gene activation and gene repression. Modulation can be assayed by determining any parameter that is indirectly or directly affected by the expression of the target gene.
- Such parameters include, e.g., changes in RNA or protein levels, changes in protein activity, changes in product levels, changes in downstream gene expression, changes in reporter gene transcription (luciferase, CAT, β-galactosidase, β-glucuronidase, green fluorescent protein (see, e.g., Misteli & Spector, Nature Biotechnology 15:961-964 (1997))); changes in signal transduction, phosphorylation and dephosphorylation, receptor-ligand interactions, second messenger concentrations (e.g., cGMP, cAMP, IP3, and Ca2+), and cell growth.
- the term "complementary" means that one nucleic acid is identical to, or hybridizes selectively to, another nucleic acid molecule.
- Selectivity of hybridization exists when hybridization occurs that is more selective than total lack of specificity. Typically, selective hybridization will occur when there is at least about 55% identity over a stretch of at least 14-25 nucleotides, preferably at least 65%, more preferably at least 70%, at least about 75%, and most preferably at least 90%. Preferably, one nucleic acid hybridizes specifically to the other nucleic acid. See M. Kanehisa, Nucleic Acids Res. 12:203 (1984).
- The term "exogenous," when used with reference to a molecule (e.g., a nucleic acid), refers to a molecule that is not normally present in a cell, but can be introduced into a cell by one or more genetic, biochemical or other methods. Normal presence in the cell is determined with respect to the particular developmental stage and environmental conditions of the cell. Thus, for example, a molecule that is present only during embryonic development of muscle is an exogenous molecule with respect to an adult muscle cell.
- An exogenous molecule can comprise, for example, a functioning version of a malfunctioning endogenous molecule or a malfunctioning version of a normally-functioning endogenous molecule.
- An exogenous molecule can be, among other things, a small molecule, such as is generated by a combinatorial chemistry process, or a macromolecule such as a protein, nucleic acid, carbohydrate, lipid, glycoprotein, lipoprotein, polysaccharide, any modified derivative of the above molecules, or any complex comprising one or more of the above molecules.
- An exogenous molecule can be the same type of molecule as an endogenous molecule, e.g., protein or nucleic acid (i.e., an exogenous gene), providing it has a sequence that is different from an endogenous molecule.
- Methods for introducing exogenous molecules into cells include lipid-mediated transfer (i.e., liposomes, including neutral and cationic lipids), electroporation, direct injection, cell fusion, particle bombardment, calcium phosphate co-precipitation, DEAE-dextran-mediated transfer and viral vector-mediated transfer.
- The term "endogenous," when used in reference to a molecule, refers to one that is normally present in a particular cell at a particular developmental stage under particular environmental conditions.
- The terms "identical" or percent "identity," in the context of two or more nucleic acids or polypeptides, refer to two or more sequences or subsequences that are the same or have a specified percentage of nucleotides or amino acid residues that are the same, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm such as those described below, for example, or by visual inspection.
- The phrase "substantially identical," in the context of two nucleic acids, refers to two or more sequences or subsequences that have at least 75%, preferably at least 80% or 85%, more preferably at least 90%, 95% or higher nucleotide identity, when compared and aligned for maximum correspondence, as measured using a sequence comparison algorithm such as those described below, for example, or by visual inspection.
- the substantial identity exists over a region of the sequences that is at least about 40-60 nucleotides in length, in other instances over a region at least 60-80 nucleotides in length, in still other instances at least 90-100 nucleotides in length, and in yet other instances the sequences are substantially identical over the full length of the sequences being compared, such as the coding region of a nucleotide for example.
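The identity thresholds above can be checked with a short Python sketch. It assumes the two sequences have already been aligned to equal length (the alignment step itself is not shown), counts gap characters as mismatches, and uses the 75% and 40-nucleotide minimums from the definition as defaults. The function names are hypothetical.

```python
def percent_identity(a: str, b: str) -> float:
    """Percent of aligned positions carrying the same nucleotide.

    a and b must be pre-aligned and of equal length; a gap ('-') in
    either sequence counts as a mismatch.
    """
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be aligned, equal length, non-empty")
    matches = sum(x == y and x != "-" for x, y in zip(a.upper(), b.upper()))
    return 100.0 * matches / len(a)


def substantially_identical(a: str, b: str,
                            min_identity: float = 75.0,
                            min_length: int = 40) -> bool:
    """Apply the lower bounds of the definition to one aligned region."""
    return len(a) >= min_length and percent_identity(a, b) >= min_identity


if __name__ == "__main__":
    ref = "ACGT" * 10
    test = "ACGA" * 10  # one mismatch in every four positions
    print(percent_identity(ref, test), substantially_identical(ref, test))
```

Stricter embodiments (for example 90% identity over 90-100 nucleotides) correspond to passing larger `min_identity` and `min_length` values.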
- For sequence comparison, typically one sequence acts as a reference sequence, to which test sequences are compared.
- test and reference sequences are input into a computer, subsequence coordinates are designated, if necessary, and sequence algorithm program parameters are designated.
- the sequence comparison algorithm then calculates the percent sequence identity for the test sequence(s) relative to the reference sequence, based on the designated program parameters.
- Optimal alignment of sequences for comparison can be conducted, e.g., by the local homology algorithm of Smith & Waterman, Adv. Appl. Math. 2:482 (1981), by the homology alignment algorithm of Needleman & Wunsch, J. Mol. Biol. 48:443 (1970), by the search for similarity method of Pearson & Lipman, Proc. Natl. Acad. Sci. USA 85:2444 (1988), by computerized implementations of these algorithms (GAP, BESTFIT, FASTA, and TFASTA in the Wisconsin Genetics Software Package, Genetics Computer Group, 575 Science Dr., Madison, WI), or by visual inspection [see generally, Current Protocols in Molecular Biology, (Ausubel, F.M. et al., eds.) John Wiley & Sons, Inc., New York (1987-1999, including supplements such as supplement 46 (April 1999))]. Sequence comparisons with these programs are typically conducted using the default parameters specific for each program.
- HSPs: high scoring sequence pairs
- the word hits are then extended in both directions along each sequence for as far as the cumulative alignment score can be increased. Cumulative scores are calculated using, for nucleotide sequences, the parameters M (reward score for a pair of matching residues; always > 0) and N (penalty score for mismatching residues; always < 0). For amino acid sequences, a scoring matrix is used to calculate the cumulative score. Extension of the word hits in each direction is halted when: the cumulative alignment score falls off by the quantity X from its maximum achieved value; the cumulative score goes to zero or below, due to the accumulation of one or more negative-scoring residue alignments; or the end of either sequence is reached.
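The extension rule just described can be sketched in Python for the nucleotide case. This is a simplified, ungapped, one-direction illustration and not the BLAST implementation: the values M = 1, N = -3 and X = 5 are assumed for the example, and `extend_right` is a hypothetical name.

```python
def extend_right(query, subject, q_start, s_start, word_len,
                 match=1, mismatch=-3, x_drop=5):
    """Extend an exact word hit to the right without gaps.

    Returns (best_score, best_len): the highest cumulative score seen and
    the number of residues past the word needed to reach it.
    """
    score = best = word_len * match  # the seed word matches exactly
    best_len = 0
    steps = 0
    i, j = q_start + word_len, s_start + word_len
    # Stop condition 3: the end of either sequence is reached.
    while i < len(query) and j < len(subject):
        score += match if query[i] == subject[j] else mismatch
        steps += 1
        if score > best:
            best, best_len = score, steps
        # Stop conditions 1 and 2: score dropped by X from its maximum,
        # or fell to zero or below.
        if best - score >= x_drop or score <= 0:
            break
        i += 1
        j += 1
    return best, best_len


if __name__ == "__main__":
    print(extend_right("ACGTACGTAAAA", "ACGTACGTCCCC", 0, 0, word_len=4))
```

Extension to the left follows the same logic with the indices decreasing, and the two results are combined to give the score of the high scoring pair.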
- the default parameters of the BLAST programs are suitable.
- the BLASTP program uses as defaults a word length (W) of 3, an expectation (E) of 10, and the BLOSUM62 scoring matrix.
- the TBLASTN program (using protein sequence for nucleotide sequence) uses as defaults a word length (W) of 3, an expectation (E) of 10, and a BLOSUM 62 scoring matrix (see Henikoff & Henikoff, Proc. Natl. Acad. Sci. USA 89:10915 (1992)).
- In addition to calculating percent sequence identity, the BLAST algorithm also performs a statistical analysis of the similarity between two sequences (see, e.g., Karlin & Altschul, Proc. Natl. Acad. Sci. USA 90:5873-5877 (1993)).
- One measure of similarity provided by the BLAST algorithm is the smallest sum probability (P(N)), which provides an indication of the probability by which a match between two nucleotide or amino acid sequences would occur by chance.
- P(N): the smallest sum probability
- a nucleic acid is considered similar to a reference sequence if the smallest sum probability in a comparison of the test nucleic acid to the reference nucleic acid is less than about 0.1, more preferably less than about 0.01, and most preferably less than about 0.001.
- nucleic acid sequences are substantially identical.
- The phrase "bind(s) substantially" refers to complementary hybridization between a probe nucleic acid and a target nucleic acid and embraces minor mismatches that can be accommodated by reducing the stringency of the hybridization media to achieve the desired detection of the target polynucleotide sequence.
- The phrase “hybridizing specifically to” or “specifically hybridizing to” refers to the binding, duplexing, or hybridizing of a molecule only to a particular nucleotide sequence under stringent conditions when that sequence is present in a complex mixture (e.g., total cellular) DNA or RNA.
- The term "stringent conditions" refers to conditions under which a probe or primer will hybridize to its target subsequence, but to no other sequences. Stringent conditions are sequence-dependent and will be different in different circumstances. Longer sequences hybridize specifically at higher temperatures. Generally, stringent conditions are selected to be about 5 °C lower than the thermal melting point (Tm) for the specific sequence at a defined ionic strength and pH. In other instances, stringent conditions are chosen to be about 20 °C or 25 °C below the melting temperature of the sequence and a probe with exact or nearly exact complementarity to the target. As used herein, the melting temperature is the temperature at which a population of double-stranded nucleic acid molecules becomes half-dissociated into single strands.
- Tm: thermal melting point
- Tm = 81.5 + 0.41 (% G + C), when a nucleic acid is in aqueous solution at 1 M NaCl (see e.g., Anderson and Young, "Quantitative Filter Hybridization," in Nucleic Acid Hybridization (1985)).
- Other references include more sophisticated computations which take structural as well as sequence characteristics into account for the calculation of Tm.
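The simple approximation Tm = 81.5 + 0.41 (% G + C) can be worked through in a few lines of Python. The sketch assumes a plain DNA string and the 1 M NaCl condition stated above, and it applies the "about 5 °C below Tm" guideline as a default offset. The function names are hypothetical.

```python
def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in seq."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must be non-empty")
    return 100.0 * sum(base in "GC" for base in seq) / len(seq)


def melting_temperature(seq: str) -> float:
    """Approximate Tm in degrees C: 81.5 + 0.41 * (%G+C), at 1 M NaCl."""
    return 81.5 + 0.41 * gc_percent(seq)


def stringent_temperature(seq: str, offset: float = 5.0) -> float:
    """Hybridization temperature set a fixed offset below Tm."""
    return melting_temperature(seq) - offset


if __name__ == "__main__":
    probe = "GCGCATAT"  # 50% G+C, chosen only for easy arithmetic
    print(round(melting_temperature(probe), 1))
    print(round(stringent_temperature(probe), 1))
```

The approximation has no length or salt term, so it should not be expected to hold for short oligonucleotides or for salt concentrations other than the one stated.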
- the melting temperature of a hybrid is affected by various factors such as the length and nature (DNA, RNA, base composition) of the probe or primer and nature of the target (DNA, RNA, base composition, present in solution or immobilized, and the like), and the concentration of salts and other components (e.g., the presence or absence of formamide, dextran sulfate, polyethylene glycol).
- stringent conditions will be those in which the salt concentration is less than about 1.0 M Na ion, typically about 0.01 to 1.0 M Na ion concentration (or other salts) at pH 7.0 to 8.3 and the temperature is at least about 30 °C for short probes or primers (e.g., 10 to 50 nucleotides) and at least about 60 °C for long probes or primers (e.g., greater than 50 nucleotides).
- Stringent conditions can also be achieved with the addition of destabilizing agents such as formamide.
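The numeric ranges given for stringent conditions can be collected into a small Python check. This is only a sketch of the stated bounds: the "about" qualifiers are treated as hard limits, probes shorter than 10 nucleotides are treated like short probes (an assumption, since the text does not cover them), and the effect of formamide is ignored. The function name is hypothetical.

```python
def meets_stringent_ranges(probe_len: int, temp_c: float,
                           na_molar: float, ph: float) -> bool:
    """Check hybridization settings against the ranges in the text."""
    if probe_len <= 0:
        raise ValueError("probe length must be positive")
    # At least about 30 C for short probes (up to 50 nt), 60 C for longer.
    min_temp = 30.0 if probe_len <= 50 else 60.0
    salt_ok = 0.01 <= na_molar <= 1.0   # typical Na ion range, in mol/L
    ph_ok = 7.0 <= ph <= 8.3
    return temp_c >= min_temp and salt_ok and ph_ok


if __name__ == "__main__":
    print(meets_stringent_ranges(25, temp_c=42.0, na_molar=0.5, ph=7.5))
    print(meets_stringent_ranges(200, temp_c=42.0, na_molar=0.5, ph=7.5))
```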
- detectably labeled means that an agent (e.g., a probe) has been conjugated with a label that can be detected by physical, chemical, electromagnetic and other related analytical techniques.
- detectable labels include, but are not limited to, radioisotopes, fluorophores, chromophores, mass labels, electron dense particles, magnetic particles, spin labels, molecules that emit chemiluminescence, electrochemically active molecules, enzymes, cofactors, and enzyme substrates.
- the present invention provides methods for efficiently identifying and characterizing genes that play important roles in cellular processes such as aging and development, response to environmental challenges (e.g., injury or drug exposure), and pathologic processes.
- the methods disclosed herein permit the rapid and economical generation of "libraries" of differentially expressed and low abundance sequences likely to play roles in pathogenesis and treatment of human disease.
- the methods of the invention are well suited to use with very small amounts of tissue. This permits comprehensive libraries to be produced even when only a small amount of starting material is available.
- the methods also include a process in which genes identified as being present in low abundance and/or as being differentially expressed (“candidate genes”) are functionally validated.
- This validation process involves determining whether a candidate gene does in fact play a functional role in a cell by, for example, determining if modulation of expression of the candidate gene is correlated with an alteration in a cellular activity or cellular state in the cell in which expression is modulated.
- RNAi double-stranded RNA interference
- methods involve introducing a dsRNA that specifically hybridizes to at least a segment of the candidate gene into a reference cell or tissue and then determining whether interference with expression is associated with an alteration of cellular activity or state. Detection of such an alteration provides evidence that the candidate gene is correlated with the particular cellular state or process under investigation.
- RNAi RNA interference
- methods other than RNAi can be utilized to functionally validate candidate genes identified in the libraries.
- Such methods include interference with gene expression by use of antisense technology, ribozymes and gene knock-out approaches. Additional approaches include co-immunoprecipitation and epistasis investigations.
- cDNA libraries are prepared that are highly enriched for gene sequences likely to play a role in the molecular and cellular mechanisms of disease, or which are involved in other important cellular processes.
- four related, or "cognate,” libraries are prepared and selected sequences analyzed.
- although fewer than four libraries may be prepared, by screening multiple (e.g., four) libraries the coverage of the transcriptome is maximized and the likelihood of identifying low-abundance and differentially-expressed genes is increased.
- the use of four libraries also facilitates the validation techniques described infra.
- the libraries of the invention are prepared using mRNA from pairs of tissues that are of the same type, but which differ in one major characteristic, such as disease state (e.g., diseased & normal brain tissue), age (e.g., adult and fetal liver tissue), exposure to drugs, state of differentiation, stage of development, or other state (e.g., stimulated & unstimulated; activated & unactivated).
- the tissue source may be human or non-human.
- the tissues are from a mammal such as a human, non-human primate, rat, or mouse.
- the tissues are from an animal or tissue culture model of a human disease, e.g., stroke, Alzheimer's disease, and neuropathy. Examples of tissue pairs useful for library preparation are shown in Table 1.
- one tissue in the pair is designated the "driver tissue," "control tissue," or simply "control cell" (from which "driver" cDNA may be made) and the second tissue in the pair is designated the "tester" tissue, "test tissue," or simply "test cell" (from which "tester" cDNA may be made).
- driver tissue the tissue in the first column
- test tissue the tissue in the second column
- driver the driver
- the four cognate libraries are referred to herein as: (1) driver-normalized, (2) tester-normalized, (3) driver-subtracted, and (4) tester-subtracted.
- Libraries (1) and (2) are normalized, and thus enriched in sequences corresponding to low abundance transcripts.
- Library 1 is made using one tissue of a pair (driver tissue) and Library 2 is made using the specified tester tissue.
- Libraries (3) and (4) are subtracted (or normalized and subtracted) libraries and thus enriched in sequences that are differentially expressed between pairs of tissue states.
- Libraries (3) and (4) of a cognate group are made using both tissues in the tissue pair.
- Double-stranded cDNA is prepared from tissues using standard protocols, i.e., by reverse transcription of messenger (poly(A)+) RNA from a specified RNA source using a primer to produce single-stranded cDNA. Methods for isolation of total or poly(A) RNA and for making cDNA libraries are well known in the art, and are described in detail in Ausubel and Sambrook (supra). In one embodiment, the library is made using oligo(dT) primers for first strand synthesis. The single-stranded cDNA is converted into double-stranded cDNA (dscDNA) using routine methods (see, e.g., Ausubel supra).
- the dscDNA from each tissue source is digested with one restriction enzyme or, in an alternative embodiment, the dscDNA from each tissue source is separately digested with two or more restriction enzymes, with different specificities, that cut at recognition sequences found frequently in the dscDNA.
- two enzymes are used (and the discussion and examples below will refer to use of two enzymes).
- the digestion with each of the two or more enzymes is carried out separately (e.g., in separate reaction tubes). The digested fragments may be combined later for further processing.
- the dual digestion steps allow for the efficient generation of libraries that are more comprehensive (e.g., containing more different species of expressed or differentially expressed species) than libraries made by other methods.
- the digestion is intended, in part, to generate fragments in a size range that allows efficient hybridization during the annealing steps of library construction. Only fragments of the target size range will efficiently anneal under the conditions used, and non-annealing molecules are excluded from amplification or cloning in some embodiments of the invention.
- a further advantage of the dual digestion steps is that by digesting with multiple (e.g., 2) enzymes with different specificities as taught herein, the resulting libraries are more comprehensive.
- the restriction enzymes used are selected that will produce a calculated (or "predicted") average fragment size of between about 100 and about 500 basepairs, preferably about 300-500 basepairs (e.g., an average length of between 300 bases and 500 bases).
- the two or more different enzymes should produce fragments of similar lengths (e.g., so that each has a calculated average fragment size of within about 150 bases, more often about 100 bases, of the calculated average fragment size of the other). Because PCR is generally more efficient for shorter fragments, the use of fragments of similar length also ensures non-biased PCR amplification between fragments resulting from digestion with different enzymes at subsequent steps in library construction.
- the calculated average fragment size produced by digestion of a particular sample with a particular enzyme can be determined in a variety of ways.
- a database of mRNA/cDNA sequences corresponding to a selected class of mRNAs is used as a representative proxy for the entire population of mRNAs of that class.
- GenBank (accessible at, e.g., http://www.ncbi.nlm.nih.gov/).
- a set of mRNA sequences known to be expressed in a specified tissue e.g., brain
- organism e.g., rat, human
- phylum e.g., mammalia
- sequences in databases such as GenBank are annotated, so that an investigator can select sequences with particular properties.
- the frequency and distribution of particular restriction enzyme recognition sites in the selected population of sequences is then determined, e.g., by inspection, but most conveniently by using a computer program such as GCG (Genetics Computer Group Inc., Madison, WI) or Sequencher (Gene Codes Corp, Ann Arbor, MI).
- GCG Genetics Computer Group Inc., Madison, WI
- Sequencher Gene Codes Corp, Ann Arbor, MI
- the distribution of restriction sites in the population can be determined using publicly available computer software, and enzymes that frequently cut at clustered sites identified; such enzymes are less desirable than those that recognize more evenly distributed sites.
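The calculation of a predicted average fragment size from a set of database sequences can be sketched as an in silico digest. This is a minimal illustration, not the GCG or Sequencher procedure: it assumes a palindromic recognition site (so scanning one strand is sufficient) and a known cut position within the site, for example GT^AC for Rsa I. The function names are hypothetical.

```python
def fragment_lengths(seq: str, site: str, cut_offset: int) -> list[int]:
    """Lengths of the fragments produced by cutting seq at every occurrence of site.

    cut_offset is the position within the site where the enzyme cleaves,
    e.g. 2 for a blunt cutter that cleaves GT^AC.
    """
    seq = seq.upper()
    cuts = []
    start = seq.find(site)
    while start != -1:
        cuts.append(start + cut_offset)
        start = seq.find(site, start + 1)
    bounds = [0] + cuts + [len(seq)]
    # Skip zero-length pieces that arise when a cut falls at either end.
    return [b - a for a, b in zip(bounds, bounds[1:]) if b > a]


def average_fragment_size(seqs: list[str], site: str, cut_offset: int) -> float:
    """Mean fragment length over all sequences in the reference set."""
    lengths = [n for s in seqs for n in fragment_lengths(s, site, cut_offset)]
    if not lengths:
        raise ValueError("no fragments: the sequence set is empty")
    return sum(lengths) / len(lengths)
```

Running `average_fragment_size` once per candidate enzyme over the same sequence set gives the values needed to compare enzymes against the target size range and against each other.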
- Table II summarizes an experiment in which enzymes suitable for use with dscDNA prepared from rat mRNA were identified.
- a collection of 489 full-length rat mRNA/cDNA sequences was assembled from GenBank.
- the selected sequences from rat included a poly A-signal at the 3' end as well as an entire protein coding sequence (ORF) and at least 100 base pairs of 5' UTR.
- the mRNA sequences analyzed had an average length of 2257 bases (with an average coding sequence length of 1509 bases and an average 3' untranslated region of 604 bases).
- the restriction pattern predicted for digestion of this polynucleotide set was determined using the GCG program described supra.
- Exemplary enzymes for digestion of mammalian sequences include Alu I, Cvi RI, Dpn I, Hae III, Rsa I, Cvi JI and Tha I. As is apparent from the table, most suitable enzymes recognize 4-base restriction sites and are blunt-cutters. As determined in the experiment summarized in Table II, a preferred combination of enzymes for construction of libraries from mammalian sequences is Dpn I and Rsa I, because they produce fragments of similar size in the desired size range.
- the average fragment size can be determined empirically. For example, average fragment size can be determined by PCR amplification of large number (e.g., at least 500) of clones from a normalized or subtracted library with vector-specific primers, followed by size determination of inserts on agarose gels.
- Table III provides a flowchart illustrating the production of restriction digested dscDNA from a tissue pair using restriction enzymes Dpn I and Rsa I.
- Parenthetical numbers are used to refer to specific products (i.e., reagents) produced or used for library production.
- the digested fragments are divided into two aliquots and each aliquot is ligated to an adaptor oligonucleotide, i.e., the first aliquot is ligated to a first adaptor and the second aliquot is ligated to a second adaptor.
- the adaptors used are usually designed to create a 22 to 40 base upper strand hybridized to an 8-12 base lower strand (i.e., partially double-stranded).
- Adaptors are ligated to dscDNA fragments using methods well known in the art.
- unphosphorylated oligonucleotides may be ligated to dscDNA fragments in a standard ligation reaction (e.g., a buffered mixture containing adaptors, fragments, 0.3 mM ATP and T4 DNA ligase, incubated for 12h at 14°C).
- the adaptors are designed according to the following criteria: 1) the ligation of the adaptor to the fragment should reconstitute the restriction enzyme recognition sequence for the restriction enzyme used to produce the fragments; 2) the adaptor should have a sequence sufficiently long and complex to serve as a target for amplification by the polymerase chain reaction (PCR), e.g., nested PCR; 3) the first and second adaptors should have different sequences so that a molecule containing both adaptor sequences at opposite ends of a fragment can be differentiated from a molecule containing the same adaptor sequence at each end by PCR amplification using suitable primers.
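The first and third design criteria, together with the 22 to 40 base upper-strand length mentioned earlier, lend themselves to a quick computational check. The sketch below assumes a blunt-cutting enzyme and considers only the adaptor's upper strand written 5' to 3'. The adaptor sequences in the usage note are invented for illustration and are not the adaptors of the disclosure.

```python
def reconstitutes_site(adaptor_upper: str, site: str, cut_offset: int) -> bool:
    """True if the adaptor's 3' end joined to a blunt fragment end re-forms the site.

    After cleavage at cut_offset, each fragment begins with site[cut_offset:],
    so the adaptor must supply the bases that precede the cut.
    """
    fragment_start = site[cut_offset:]
    junction = adaptor_upper.upper() + fragment_start
    return junction.endswith(site)


def check_adaptor_pair(first: str, second: str, site: str, cut_offset: int) -> dict[str, bool]:
    """Evaluate a first/second adaptor pair against three simple criteria."""
    return {
        "reconstitutes_site": all(
            reconstitutes_site(a, site, cut_offset) for a in (first, second)
        ),
        "length_22_to_40": all(22 <= len(a) <= 40 for a in (first, second)),
        "distinct_sequences": first.upper() != second.upper(),
    }
```

For example, with the hypothetical upper strands `CTAATACGACTCACTATAGGGCGT` and `TGTAGCGTGAAGACGACAGAAAGT` and the site `GTAC` cut at offset 2, all three checks pass, because both strands end in `GT`.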
- PCR polymerase chain reaction
- Table V provides, in schematic terms, a flowchart illustrating the addition of adaptors to the products of Table III.
- the first adaptor is designated “Adaptor A” or “Adaptor C”
- the second adaptor is designated “Adaptor B” or “Adaptor D”
- different first and second adaptors being used for fragments produced using different restriction enzymes.
- although pairs such as A and C or B and D will have different sequences at the end ligated to the fragment (so that the appropriate restriction site is regenerated upon ligation), to the extent possible the adaptors are designed to share the same sequence, e.g., to facilitate subsequent PCR amplification.
- adaptor-ligated fragments corresponding to each of the separate digestion reactions can be, and typically are, combined before proceeding to the subsequent subtraction and normalization protocols.
- 1A + 2C, 1B + 2D, 3A + 4C, 3B + 4D may be combined if adaptors A and C and adaptors B and D differ only at the 3' end (in order to reconstitute the restriction site).
- the reactions may be combined at later stages, or, alternatively, they may be kept separate.
Production of Subtracted Libraries
- Subtracted libraries are used to efficiently identify genes that are differentially expressed in a pair of tissues. Two subtracted libraries are produced, a "driver-subtracted" library and a "tester-subtracted" library. When the "tester tissue" is stimulated tissue and the "driver tissue" is unstimulated, the "driver-subtracted" library will be enriched for genes down-regulated by stimulation and the "tester-subtracted" library will be enriched for genes up-regulated by stimulation.
- the normalized-subtracted libraries of the invention are made essentially according to Diatchenko et al. supra.
- the production of the normalized-subtracted libraries includes the following steps:
First Annealing Step
- the following mixtures of adaptor-free digested fragments and adaptor-linked fragments are prepared and annealing reactions carried out (Table VI).
- the adaptor-free fragments are added in excess over the adaptor-linked fragments, e.g., at an about 20:1, 10:1, or 5:1 ratio. Multiple ratios can be used.
- the mixture is heat-denatured and allowed to anneal, e.g., by heat-denaturation for
- HEPES pH 8.3
- 4 mM Cetyltrimethylammonium bromide. Annealing is allowed to proceed to multiple different Cot values by incubating samples or aliquots for varying times.
- the use of multiple Cot values results in a more completely normalized library and/or increases the likelihood of enrichment of all differentially regulated genes. It will be recognized that in the annealing step, abundant sequences represented in the adaptor-ligated population will become double-stranded most rapidly, so that, as to adaptor-ligated single-stranded molecules, the library becomes enriched for low-copy number molecules present in the adaptor-ligated population.
- the products can be combined prior to the second annealing step, infra, or, alternatively, can be maintained separately throughout the amplification and optional cloning steps.
- Annealing is allowed to proceed to different Cot values by incubating samples or aliquots for various times (e.g., 4-20 h).
Amplification
- After hybridization, PCR amplification is performed to isolate sequences of interest.
- Normalization is the process by which redundant clones in a library are removed, without reducing the complexity of the library. After successful normalization, approximately equal numbers of all expressed genes are present in a library.
- normalization methods are based on reassociation kinetics of re-annealing of nucleic acids in which denatured DNA is hybridized to an excess amount of denatured complementary DNA. Because re-annealing nucleic acids follow approximately second- order kinetics, the most abundant species form double-stranded hybrids most quickly. Thus, at any given Cot, rare or less abundant species will preferentially remain single stranded and abundant species will enter the population of double-stranded molecules.
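The kinetic argument can be made concrete with the textbook ideal second-order reassociation relation, C(t) = C0 / (1 + k·C0·t), in which the product C0·t is the Cot value. The sketch below is an idealization: each species is treated as reannealing independently, and the rate constant and concentrations are arbitrary illustrative numbers rather than values from the disclosure.

```python
def single_stranded_remaining(c0: float, k: float, t: float) -> float:
    """Concentration still single-stranded after time t under ideal
    second-order reassociation: C(t) = C0 / (1 + k * C0 * t)."""
    return c0 / (1.0 + k * c0 * t)


def abundance_ratio_after(c0_abundant: float, c0_rare: float, k: float, t: float) -> float:
    """Ratio of abundant to rare species within the single-stranded fraction."""
    return (single_stranded_remaining(c0_abundant, k, t)
            / single_stranded_remaining(c0_rare, k, t))
```

Under these assumptions, two species that start at a 100:1 ratio (with k = 1 in matching arbitrary units) are at roughly 1.1:1 within the single-stranded fraction after 10 time units, which is the equalizing effect that normalization exploits.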
- Several methods are available for distinguishing, separating, or differentially amplifying the single stranded species. Exemplary normalization methods are found in Soares et al., 1994, Proc. Natl. Acad. Sci.
- tester-normalized and driver-normalized libraries are produced.
- each normalized library is produced essentially according to the protocol described in § F, supra, except that the driver and tester are identical.
- the following reactions in Table VIII are carried out.
- sequences likely to be of particular interest include genes in the low abundance classes from normalized libraries and differentially expressed genes.
- the preferentially amplified or cloned products of subtraction, normalization or combination subtraction-normalization methods are obtained, as described above or by other methods of normalization and/or subtraction.
- clones are subcloned by ligation into a vector capable of propagation in a bacterial or eukaryotic cell. Typically, the clones are propagated in bacterial cells.
- suitable vectors and cloning methods are known (see, e.g., Sambrook, and Ausubel, both supra), including "TA" cloning of PCR products (Stratagene, La Jolla, CA) or blunt-end ligation into a vector of fragments following a fill-in reaction using T4 DNA polymerase and dNTPs.
- clones (i.e., by growing a large number of colonies or plaques containing clones from the library(s)).
- typically, at least about 5000 clones, more often 10,000, sometimes 15,000 and frequently 25,000 clones are propagated.
- multiwell plates e.g., 384-well plates
- robotic means for growing and picking colonies (suitable means are known in the art and are described at, e.g., Nguyen et al., 1995, Genomics 29:207-216).
- large numbers of clones can be grown and picked manually.
- the insert i.e., cloned sequences
- the insert DNAs are immobilized at identified positions in a matrix suitable for hybridization analysis.
- high-density filter arrays containing up to 12,000 PCR products per 8x12 cm membrane are used (Nguyen et al., supra).
- sequences may be "printed" onto glass plates, as is described generally by Schena et al., 1995, Science 270:467-470.
- the insert corresponding to each clone is amplified by PCR using vector specific primers for spotting on the array.
- DNA from each clone can be isolated, the DNA can be digested with a restriction enzyme(s) that cuts at the boundary of the vector and insert, and the insert sequence can be isolated and spotted on the array.
- the arrayed sequences are then probed with labeled cDNA derived from "driver” (e.g., unstimulated) tissue or "tester” (e.g., stimulated) tissue.
- Labeled probes can be prepared using methods known in the art, e.g., by reverse transcription of isolated RNA from the driver and tester tissues in the presence of radiolabeled or fluorescently-labeled nucleotides (see, e.g., Ausubel, supra; Kricka, 1992, Nonisotopic DNA Probe Techniques, Academic Press San Diego, CA.; Zhao et al., 1995, Gene 156:207; Pietu et al., 1996 Genome Res. 6:492).
- Alternative methods for preparing probes, e.g., riboprobes are well known and their use is contemplated in some embodiments of the invention.
- Optimal hybridization conditions for probing will depend on the type of array (e.g., filter, slide, etc.) selected, the method of labeling probe, and other factors. Hybridization is carried out under conditions of excess immobilized (arrayed) nucleic acid. General parameters for specific (i.e., stringent) hybridization conditions for nucleic acids are described in Sambrook and Ausubel. Suitable hybridization conditions for probing high density arrays are provided in Schena et al., 1996, Proc. Natl. Acad. Sci. USA, 93:10614, and Nguyen, supra.
- the fluorescence emissions at each site of a transcript array are detected (e.g., by scanning confocal laser microscopy or laser illumination, see, e.g., Shalon et al., 1996, Genome Research 6:639-645; Schena et al., 1996, Genome Res. 6:639-645; Ferguson et al., 1996, Nature Biotech. 14:1681-1684).
- autoradiography or quantitative imaging systems (e.g., FUJIX BAS 1000 (Fuji)) may be used. See Nguyen et al., supra, and references cited therein.
- multiple copies of a specific array can be prepared and separately probed, the hybridization intensity determined for each clone, and a ratio determined.
- a single array can be repeatedly probed, with washing steps between hybridizations.
- multiple (e.g., 2) differently labeled probes may be simultaneously hybridized to the same matrix (e.g., rhodamine-labeled driver cDNA and fluorescein-labeled tester cDNA), and, for any particular hybridization site on the transcript array, a ratio of the emission of the two fluorophores can be calculated from simultaneous hybridization to the same array.
- One goal of the hybridization is to identify clones corresponding to mRNAs expressed at low abundance in driver and tester tissues, particularly clones corresponding to differentially expressed sequences.
- both driver-normalized and tester-normalized libraries are probed with labeled cDNA from the tissue from which they are derived, as indicated in Table IX. Because the signal intensity for any arrayed clone will correspond to the abundance of the corresponding mRNA in the tissue, clones with low intensity signals (i.e., "low signal clones") will correspond to low abundance transcripts (i.e., mRNAs rare in the transcriptome).
- a "low intensity signal" or "low signal clone" refers to a clone having a hybridization signal in the lowest (e.g., 1st to 20th percentile) or very lowest (e.g., 1st to 5th percentile) range in a ranking of a large number (e.g., 1000) of clone signals in the array. This mRNA class is believed to be enriched for sequences of pharmaceutical importance.
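Selecting "low signal clones" by percentile rank can be sketched as follows. The clone identifiers and the mapping of clone to signal are hypothetical inputs, and the rule used here (take the lowest `percentile` percent of clones after sorting by signal) is one simple reading of the definition above.

```python
def low_signal_clones(signals: dict[str, float], percentile: float = 20.0) -> list[str]:
    """Return clone IDs whose hybridization signal ranks in the lowest
    `percentile` percent of all clones on the array, weakest first."""
    ranked = sorted(signals, key=signals.get)
    n_selected = int(len(ranked) * percentile / 100.0)
    return ranked[:n_selected]
```

With 1000 arrayed clones, the default setting returns the 200 weakest signals, and `percentile=5.0` returns the 50 weakest.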
- both are probed using labeled probes (e.g., cDNA probes) from both RNA sources (i.e. cDNA from driver tissues and cDNA from tester tissues).
- the ratio between the signals obtained by tester and driver probes indicates the up-regulation or down-regulation of a given clone in response to a stimulus.
- probing both driver-subtracted and tester-subtracted libraries will identify all genes that change in expression, either by up-regulation (tester-subtracted) or down-regulation (driver-subtracted).
- genes showing at least a 20% (1.2-fold) change are of interest, with genes showing a 2-fold difference in expression considered to be of particular interest.
- the genes show at least about a 3-fold, 5-fold or 10-fold difference in expression. Clones exhibiting these differences in expression, as detected by hybridization of different probes, are referred to as "high ratio" clones.
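The fold-change thresholds described above can be applied to paired tester and driver signals as in the sketch below. Using 3-fold as the cutoff for the "high ratio" label is an assumption for illustration, since the text lists 3-fold, 5-fold and 10-fold as alternatives. The function name and return labels are likewise hypothetical.

```python
def classify_clone(tester_signal: float, driver_signal: float) -> tuple[str, str]:
    """Return (direction, label) for one clone from its tester and driver signals."""
    if tester_signal <= 0 or driver_signal <= 0:
        raise ValueError("signals must be positive")
    ratio = tester_signal / driver_signal
    direction = "up" if ratio >= 1.0 else "down"
    # Express the change as a fold difference of at least 1 in either direction.
    fold = ratio if ratio >= 1.0 else 1.0 / ratio
    if fold >= 3.0:
        label = "high ratio"          # assumed cutoff; 5- or 10-fold are alternatives
    elif fold >= 2.0:
        label = "particular interest"
    elif fold >= 1.2:
        label = "of interest"
    else:
        label = "unchanged"
    return direction, label
```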
- the hybridization analysis described provides an efficient way for prioritizing clones of likely high pharmaceutical significance for further analysis.
- Selected clones are usually characterized by DNA sequencing and homology analysis.
- Genes derived from such normalized libraries are used as a representative, relevant and non-redundant gene collection of a particular tissue and a particular biological question for a variety of downstream applications. These genes can serve as targets for array analysis allowing one to quantitate gene expression changes in the same or other biological models and complement the gene collection identified by normalized-subtracted libraries.
- the analysis of a number of normalized libraries from a variety of central and peripheral tissues under different conditions of stimulation provides an avenue for the ultimate identification of all genes expressed in the species under investigation.
- the arrayed sequences are screened with other probes; for example, an array of sequences differentially expressed in stroke vs. normal brain can be screened with cDNA probe made from mRNA of Alzheimer's Disease brain tissue.
"Knock-Down" Analysis
- One advantage of the present method is that, among the genes selected for further analysis on the basis of hybridization, the level of redundancy is low (i.e., the number of genes that are repeatedly sequenced is low) and the percentage of novel genes detected (genes not previously reported in GenBank) is high.
- DNA libraries can contain clones representing a small number of parent genes that comprise a large proportion of all the clones in the library.
- highly represented (or highly redundant) genes are particularly common in non-normalized libraries, or in libraries from less complex sources, such as specific sub-regions of tissue or cell lines. Random selection of genes from such a library for analysis (e.g. sequencing) results in significant redundancy of effort and expense.
- the knock-down methods of the invention can be used to further reduce redundancy both in the libraries described herein supra, and in libraries prepared by altogether other means (including non-normalized libraries or libraries prepared from specific sub-regions of tissue or cell lines).
- the knock-down method is used to identify clones that are redundant in a library (i.e., clones generated from transcripts having the same sequence) so that the effort and expense of characterizing the redundant sequences is avoided.
- redundant sequences in the library are identified by "prior sampling." That is, prior to the hybridization analysis described in Section III(A), supra, or the equivalent of such hybridization, the DNA sequence is determined for a representative number of clones, usually at least 50, often between about 100 and about 400 clones, and sometimes more, for example, about 1000 clones. These analyzed clones are referred to as the "prior sample." It is not necessary to sequence the entire clone; rather only one, or optionally both, termini need be sequenced (e.g., typically at least about 50 bases are determined, more often between about 200 and 350 bases). The sequences are analyzed, for example by BLAST searching (Altschul et al., 1990, J. Mol. Biol. 215:403-10). A redundant sequence will appear more often than average: for example, a BLAST-identified sequence appearing as more than 4% of the sample is considered redundant.
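The 4% rule for the prior sample amounts to a frequency count over the identities assigned to the sequenced clones. In the sketch below, each clone in the prior sample is represented by the name of its best database match; how that name is obtained (e.g., from a BLAST report) is outside the sketch, and the gene names in the usage note are invented.

```python
from collections import Counter


def redundant_sequences(prior_sample_hits: list[str], threshold: float = 0.04) -> set[str]:
    """Return the identities that make up more than `threshold` of the prior sample."""
    counts = Counter(prior_sample_hits)
    total = len(prior_sample_hits)
    return {name for name, n in counts.items() if n / total > threshold}
```

For a prior sample of 100 clones in which 10 match a hypothetical "geneA" and the other 90 each match a different gene, only "geneA" is flagged as redundant. A sequence at exactly 4% is not flagged, matching the "more than 4%" wording.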
- a set of previously identified genes is included as "knock-down" (e.g., unlabeled) polynucleotides in the "knock-down" method, to identify and avoid further processing of clones that have already been characterized (e.g., sequenced).
- knock-down e.g., unlabeled
- DNA may be isolated from the clone(s) (e.g., by PCR amplification of the fragment or insert) and included as an unlabeled (e.g., blocking), or distinctly labeled polynucleotide, during a hybridization of a labeled probe mixture against an array of clones from the library, as described in Section III(A) supra.
- an unlabeled or distinctly labeled "knock-down" polynucleotide is included at a concentration of about 5 to about 100 ng/ml in the hybridization mixture, often from about 5 to about 40 ng/ml.
- the unlabeled or distinctly labeled polynucleotides are referred to herein as "knock-down" polynucleotides.
- a small number of redundant genes e.g., one to ten
- many or all genes appearing in the "prior sample” can be included as “knock-down” polynucleotides.
- the included unlabeled (or distinctly labeled) "knock-down" polynucleotide will hybridize to complementary sequences in the labeled probe mixture, reducing the amount of specific labeled probe species available for hybridization to the array.
- Comparison of the signal of the probe with and without the addition of knock-down polynucleotide will show that the inclusion of the knock-down clone(s) reduces hybridization signals at particular sites on the matrix.
- the sites of reduced signal correspond to sequences that are represented in the set of "knock-down" polynucleotides (i.e., redundant sequences by frequency or known sequences by prior sampling). Having identified such clones, a decision may be made not to further analyze (e.g., sequence) the clones, saving time and effort.
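Identifying the sites of reduced signal is a per-site comparison of two hybridizations, one with and one without the knock-down polynucleotides. The text does not state how large a reduction counts, so the 50% cutoff below is purely an assumed parameter, and the site labels are hypothetical.

```python
def knocked_down_sites(without_kd: dict[str, float],
                       with_kd: dict[str, float],
                       min_reduction: float = 0.5) -> list[str]:
    """Return array sites whose signal falls by at least `min_reduction`
    (as a fraction of the signal without knock-down) when knock-down
    polynucleotides are included in the hybridization."""
    sites = []
    for site, baseline in without_kd.items():
        if baseline <= 0:
            continue  # no measurable signal to reduce
        reduction = (baseline - with_kd[site]) / baseline
        if reduction >= min_reduction:
            sites.append(site)
    return sites
```

Clones at the returned sites are candidates for exclusion from further sequencing.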
- redundant clones will be identifiable by the presence of the distinct signal at the matrix site. This requires an additional labeling step for the "knock-down" polynucleotides and, in one embodiment, requires an additional duplicate hybridization matrix or a measurement of the distinct signal. This is similar to the effort of measuring the signal of the primary (non-knock-down) labeled probe with and without the inclusion of "knock-down" polynucleotides.
- redundant clones are identified by hybridization of single clones against an array representing the library, rather than by sequence analysis as discussed supra.
- cDNA libraries are a critical reagent used by biologists in the analysis of gene expression and function. Various methods have been used to produce normalized and/or subtracted cDNA libraries (see, e.g., § II supra and Ausubel, supra).
- the art lacks a convenient and economical method for evaluating the quality of normalized and/or subtracted cDNA libraries.
- the "quality" of a subtracted (or normalized-subtracted) library is assessed by the degree to which differentially expressed genes are enriched in the library relative to non-differentially expressed genes.
- the "quality" of a normalized library e.g., a tester-normalized or driver-normalized library
- the present invention provides methods for conveniently assessing library quality.
- the superior method can be identified (by virtue of producing a higher quality library).
- the method involves making libraries from the same tester and driver RNA but varying parameters.
- Detectably labeled probe is made from DNA from each library, using standard methods (e.g., nick translation, Ausubel, supra).
- the resulting probes are hybridized to an array of immobilized polynucleotides under conditions of specific hybridization.
- Suitable polynucleotide arrays may be produced by any of a variety of methods, but typically are spotted onto glass slides or nylon membranes (e.g., Schena et al., 1995, Science 270:467-470, and Zhao et al., 1995, Gene 156:207-213).
- the array is selected to contain at least some polynucleotide sequences representing genes that are differentially expressed in the tester RNA tissue compared to the driver RNA tissue. This may be accomplished generally in two different ways. In one method, a reference library (e.g. a tester-subtracted library) is produced from tester and driver RNA (e.g., as described supra).
- the tester and driver RNA used for preparation of the reference library is made from the same tissue sources as used for the libraries to be assessed, although it will be appreciated that this is not strictly necessary.
- the resulting library is cloned (e.g., by ligation to a vector and transformation of bacteria) and DNA corresponding to individual clones prepared (e.g., by PCR amplification using vector primers).
- DNA from a plurality of the clones typically at least 50, more often at least 100, more often at least 1000
- a substrate e.g., glass slide
- the resulting cDNAs are spotted onto a substrate (e.g., nylon or glass) and the substrate is treated to affix the cDNAs.
- the array will include differentially expressed sequences (reflecting the library from which the clones were prepared).
- a second method for selection of genes can rely on publications for selection of genes previously reported to be expressed in the tester RNA at higher levels than the driver
- RNA can be identified by their Genbank identifier number, and many can be ordered from commercial sources, and these can be amplified by gene specific primers with
- the resulting arrays are then prehybridized, and hybridized with probe described supra. After hybridization (including appropriate washing), the degree of hybridization of each library to various immobilized polynucleotides is detected and compared (e.g., the detectable signal is quantitated). As shown in the Examples, and in Figures 2-4, the intensity of hybridization of the labeled probe to an immobilized polynucleotide in the array is indicative of the relative abundance of the probe sequence in the library. For example, the more enriched a library is for a differentially expressed gene, the greater the intensity of the hybridization of probe from that library to the immobilized gene sequence.
- a higher quality library is identified because at least one differentially expressed sequence shows higher hybridization signal (compared to a library of lower quality). More often, a higher quality library is characterized by a higher hybridization signal to a plurality of different differentially expressed genes on the array, e.g., at least about 5, 10, 20 or 30 sequences or at least about 5%, 10% or 50% of the genes on the array that are differentially expressed (i.e., show an at least 1.2-fold, preferably an at least 2-fold, often at least 3-fold difference in expression between the tester and driver RNAs). If the differentially expressed sequence is rare (i.e.
- the hybridization signal of the rare sequence in the improved subtracted-normalized library will increase relative to a tester-subtracted library. Conversely, if a differentially expressed sequence is abundant (i.e. expressed at a higher level relative to the average sequence expression level), the hybridization signal of the abundant sequence in the improved subtracted-normalized library will decrease relative to a tester-subtracted library.
- the method provides for the detection of rare clones that are differentially expressed between two conditions.
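- As an illustrative sketch only (not part of the described method), the comparison of two libraries by hybridization signal can be written in a few lines of Python. The gene names, signal values, and the function names `count_enriched` and `fraction_enriched` are hypothetical; the sketch simply counts the differentially expressed array features for which one library gives a stronger quantitated signal than the other.

```python
def count_enriched(signal_a, signal_b, de_genes):
    """Count differentially expressed genes with a higher signal in library A.

    signal_a, signal_b: dicts mapping gene name -> quantitated hybridization signal.
    de_genes: iterable of gene names treated as differentially expressed.
    """
    return sum(1 for gene in de_genes if signal_a[gene] > signal_b[gene])


def fraction_enriched(signal_a, signal_b, de_genes):
    """Fraction of differentially expressed genes with a higher signal in library A."""
    genes = list(de_genes)
    if not genes:
        raise ValueError("de_genes must not be empty")
    return count_enriched(signal_a, signal_b, genes) / len(genes)


# Hypothetical quantitated signals (arbitrary units) for two libraries.
library_a = {"gene1": 900.0, "gene2": 400.0, "gene3": 120.0, "gene4": 75.0}
library_b = {"gene1": 300.0, "gene2": 450.0, "gene3": 100.0, "gene4": 60.0}
de_genes = ["gene1", "gene2", "gene3", "gene4"]

print(count_enriched(library_a, library_b, de_genes))
print(fraction_enriched(library_a, library_b, de_genes))
```

- Under this simplified criterion, a library that scores higher on more of the differentially expressed features would be regarded as the more enriched one; real data would also need replicate spots and background correction, which the sketch omits.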
- Candidate genes can potentially be correlated with a wide variety of cellular states or activities. Examples of such states and activities include, but are not limited to, states related to exposure to certain stimuli (e.g., drugs, toxins, environmental stimuli), disease, age, cellular differentiation and/or stage of development.
- the term "functional validation" as used herein refers to a process whereby one determines whether modulation of expression of a candidate gene or set of such genes causes a detectable change in a cellular activity or cellular state for a reference cell, which cell can be a population of cells such as a tissue or an entire organism.
- the detectable change or alteration that is detected can be any activity carried out by the reference cell.
- alterations include, but are not limited to, phenotypic changes (e.g., cell morphology, cell proliferation, cell viability and cell death); cells acquiring resistance to a prior sensitivity or acquiring a sensitivity which previously did not exist; protein/protein interactions; cell movement; intracellular or intercellular signaling; cell/cell interactions; cell activation (e.g., T cell activation, B cell activation, mast cell degranulation); release of cellular components (e.g., hormones, chemokines and the like); and metabolic or catabolic reactions.
- candidate genes generally correspond to genes expressed at low levels and/or genes that are differentially expressed with respect to different cells (e.g., diseased cells versus healthy cells).
- Low level candidate genes are those whose mRNA is about 20% or less of the total mRNA within a cell or a library prepared therefrom, preferably about 15% or less, more preferably about 10% or less, still more preferably about 5% or less, yet still more preferably about 1% or less, and most preferably about 0.1% or less.
- the low abundance genes are 1% or less of the total mRNA in the cell or library prepared therefrom.
- Genes that are differentially expressed are genes in which there is a detectable difference in expression between the different cells/tissues being compared.
- the difference usually is one that is statistically significant, meaning that the probability of the difference occurring by chance (the P-value) is less than some predetermined level (e.g., 0.05).
- the confidence level P is < 0.05, more typically < 0.01, and in other instances, < 0.001.
- One particular aspect of the present invention provides a high-throughput functional validation, which generally involves using the transcriptome procedure described herein. In this manner, once the expression of a gene is determined to correlate with a particular cellular state and/or cellular activity, at least a partial clone of the gene is already available from the transcriptome in the form of a plasmid containing a T7/T3 promoter. Alternatively, a promoter can be added to such a partial clone of the gene, e.g., using a PCR approach.
- RNAi Double-stranded RNA interference
- RNAi technology is an effective approach for functionally validating candidate genes identified through the foregoing gene identification methods.
- RNAi technology refers to a process in which double-stranded RNA is introduced into cells expressing a candidate gene to inhibit expression of the candidate gene, i.e., to "silence" its expression.
- the dsRNA is selected to have substantial identity with the candidate gene.
- dsRNA suppresses the expression of endogenous genes by a post-transcriptional mechanism. Specificity in inhibition is important because accumulation of dsRNA in mammalian cells can result in the global blocking of protein synthesis. This blockage appears to result because even low doses of dsRNA (such as occasioned by viral infection, for example) can induce what is called the interferon response. It is believed that in some cases, this response leads to the activation of a dsRNA-responsive protein kinase simply referred to as PKR.
- PKR phosphorylates and inactivates eIF2α, thereby causing global suppression of translation, which in turn triggers cellular apoptosis.
- AGYNB-010 cells there is a minor upregulation of IFN- ⁇ , with no significant global suppression of translation, which in turn results in no apoptosis.
- the gene identification procedures set forth herein, when coupled with RNAi technology, enable high throughput analysis and validation of a large number of genes for any particular cellular state or activity of interest. In general such methods initially involve transcribing a nucleic acid containing all or part of a candidate gene into single- or double-stranded RNA.
- Sense and anti-sense RNA strands are allowed to anneal under appropriate conditions to form dsRNA.
- the resulting dsRNA is introduced into reference cells via various methods and the degree of attenuation in expression of the candidate gene is measured using various techniques. Usually one detects whether inhibition alters a cellular state or cellular activity.
- Nature of the dsRNA
- The dsRNA is prepared to be substantially identical to at least a segment of a candidate gene. In general, the dsRNA is selected to have at least 70%, 75%, 80%, 85% or
- sequence identity with the candidate gene over at least a segment of the candidate gene.
- sequence identity is even higher, such as 95%, 97% or 99%, and in still other instances, there is 100% sequence identity with the candidate gene over at least a segment of the candidate gene.
- the size of the segment over which there is sequence identity can vary depending upon the size of the candidate gene. In general, however, there is substantial sequence identity over at least 15, 20, 25, 30, 35, 40 or 50 nucleotides. In other instances, there is substantial sequence identity over at least 100, 200, 300, 400, 500 or 1000 nucleotides; in still other instances, there is substantial sequence identity over the entire length of the candidate gene, i.e., the coding and non-coding region of the candidate gene. Suitable regions of the gene include the 5' untranslated region, the 3' untranslated region, and the coding sequence.
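- To make the notion of sequence identity over a segment concrete, the following minimal Python sketch computes the percent identity between one strand of a dsRNA (written here as its cDNA equivalent) and every same-length window of a candidate gene, and reports the best value. It assumes an ungapped, position-by-position comparison, which is a simplification of what alignment programs do; the sequences and function names are hypothetical.

```python
def percent_identity(a, b):
    """Percent of positions at which two equal-length sequences match."""
    if len(a) != len(b) or len(a) == 0:
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return 100.0 * matches / len(a)


def best_ungapped_identity(dsrna_strand, gene):
    """Highest percent identity of the dsRNA strand against any same-length
    window of the gene (ungapped comparison only)."""
    n = len(dsrna_strand)
    if n == 0 or n > len(gene):
        raise ValueError("dsRNA strand must be non-empty and no longer than the gene")
    return max(
        percent_identity(dsrna_strand, gene[i:i + n])
        for i in range(len(gene) - n + 1)
    )


gene = "ATGGCTAGCTAGGCTA"  # hypothetical candidate gene segment
print(best_ungapped_identity("GCTAGCTAGG", gene))  # strand identical to one window
print(best_ungapped_identity("GCTAGCTAGC", gene))  # one mismatch in 10 positions
```

- The returned percentage can then be compared with whichever identity threshold is being applied over the chosen segment length.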
- the dsRNA can include various modified nucleotides or nucleotide analogs.
- the dsRNA consists of two separate complementary RNA strands.
- the dsRNA may be formed by a single strand of RNA that is self-complementary, such that the strand loops back upon itself to form a hairpin loop. Regardless of form, RNA duplex formation can occur inside or outside of a cell.
- the size of the dsRNA that is utilized varies according to the size of the candidate gene whose expression is to be suppressed and is sufficiently long to be effective in reducing expression of the candidate gene in a cell.
- the dsRNA is at least 10-15 nucleotides long. In certain applications, the dsRNA is less than 20, 21, 22, 23, 24 or 25 nucleotides in length. In other instances, the dsRNA is at least 50, 100, 150 or 200 nucleotides in length.
- the dsRNA can be longer still in certain other applications, such as at least 300, 400, 500 or 600 nucleotides. Typically, the dsRNA is not longer than 3000 nucleotides.
- dsRNA can be prepared according to any of a number of methods that are known in the art, including in vitro and in vivo methods, as well as by synthetic chemistry approaches.
- Certain methods generally involve inserting the segment corresponding to the candidate gene that is to be transcribed between a promoter or pair of promoters that are oriented to drive transcription of the inserted segment and then utilizing an appropriate RNA polymerase to carry out transcription.
- One such arrangement involves positioning a DNA fragment corresponding to the candidate gene or segment thereof into a vector such that it is flanked by two opposable polymerase-specific promoters that can be same or different. Transcription from such promoters produces two complementary RNA strands that can subsequently anneal to form the desired dsRNA.
- Exemplary plasmids for use in such systems include the plasmid (PCR 4.0 TOPO) (available from Invitrogen).
- Another example is the vector pGEM-T (Promega, Madison, WI) in which the oppositely oriented promoters are T7 and SP6; the T3 promoter can also be utilized.
- DNA fragments corresponding to the segment of the candidate gene that is to be transcribed are inserted both in the sense and antisense orientation downstream of a single promoter.
- the sense and antisense fragments are cotranscribed to generate a single RNA strand that is self-complementary and thus can form dsRNA.
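- The single-promoter arrangement can be sketched in Python as follows. This is only an illustration of why a transcript of such an insert is self-complementary: the insert is built from a sense copy of the segment followed by its reverse complement. The optional `spacer` argument and the example sequences are assumptions introduced here and are not taken from the description.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(dna):
    """Reverse complement of a DNA sequence containing only A, C, G, T."""
    dna = dna.upper()
    if set(dna) - set("ACGT"):
        raise ValueError("sequence must contain only A, C, G, T")
    return dna.translate(_COMPLEMENT)[::-1]


def inverted_repeat_insert(segment, spacer=""):
    """Sense copy, optional spacer (hypothetical), then antisense copy."""
    return segment.upper() + spacer.upper() + reverse_complement(segment)


insert = inverted_repeat_insert("ATGC", spacer="TTTT")
print(insert)

# Without a spacer the insert equals its own reverse complement, which is
# what allows the single transcript to fold back on itself.
no_spacer = inverted_repeat_insert("ATGC")
print(no_spacer == reverse_complement(no_spacer))
```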
- Single-stranded RNA can also be produced using a combination of enzymatic and organic synthesis or by total organic synthesis.
- the use of synthetic chemical methods enables one to introduce desired modified nucleotides or nucleotide analogs into the dsRNA.
- dsRNA can also be prepared in vivo according to a number of established methods (see, e.g., Sambrook, et al. (1989) Molecular Cloning: A Laboratory Manual, 2nd ed.; Transcription and Translation (B.D. Hames, and S.J. Higgins, Eds., 1984); DNA Cloning, volumes I and II (D.N. Glover, Ed., 1985); and Oligonucleotide Synthesis (M.J. Gait, Ed., 1984), each of which is incorporated herein by reference in its entirety).
- Annealing Single-Stranded RNA
- RNAase free water or a buffer of suitable composition.
- dsRNA is generated by annealing the sense and anti-sense RNA in vitro. Generally, the strands are initially denatured to keep the strands separate and to avoid self-annealing. During the annealing process, typically certain ratios of the sense and antisense strands are combined to facilitate the annealing process.
- a molar ratio of sense to antisense strands of 3:7 is used; in other instances, a ratio of 4:6 is utilized; and in still other instances, the ratio is 1:1.
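- As a small worked example of the molar ratios mentioned above, the following Python sketch splits a chosen total amount of RNA between the sense and antisense strands. The total of 100 pmol and the function name `strand_amounts` are hypothetical and serve only to show the arithmetic.

```python
def strand_amounts(total_pmol, sense_parts, antisense_parts):
    """Split a total molar amount between sense and antisense strands.

    Returns (sense_pmol, antisense_pmol) for a sense:antisense molar ratio
    of sense_parts:antisense_parts.
    """
    if total_pmol <= 0 or sense_parts <= 0 or antisense_parts <= 0:
        raise ValueError("all arguments must be positive")
    sense = total_pmol * sense_parts / (sense_parts + antisense_parts)
    return sense, total_pmol - sense


for ratio in [(3, 7), (4, 6), (1, 1)]:
    print(ratio, strand_amounts(100.0, *ratio))
```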
- the buffer composition utilized during the annealing process can in some instances affect the efficacy of the annealing process and subsequent transfection procedure. While some have indicated that the buffered solution used to carry out the annealing process should include a potassium salt such as potassium chloride (at a concentration of about 80 mM), the current inventors have found that the use of buffered solutions that are substantially potassium free can provide improved results.
- the term "substantially potassium free" means that a potassium salt is not added to the buffer solution; as a consequence, the potassium level is generally less than 1 µM, and more typically less than 1 nM.
- the sodium chloride concentration in the annealing buffer solution generally is at least 10 mM, and generally in the range 20 mM to 50 mM.
- the present inventors have also found that further improved results can be obtained using sodium chloride free (i.e., < 1 nM of sodium chloride) ammonium acetate at a concentration range of from about 10 µM to about 50 mM.
- annealing reactions are conducted in a solution containing 20 mM NaCl at 65 °C for 30 minutes, followed by cooling for 15 minutes.
- the annealing reaction is conducted in a solution containing 10 mM TRIS (pH 7.5) and 20 mM NaCl at 95 °C for 1 minute, after which the solution is allowed to cool at room temperature overnight.
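- The buffer composition given above (10 mM TRIS, 20 mM NaCl) can be turned into pipetting volumes with the usual dilution relation C1·V1 = C2·V2. The Python sketch below assumes hypothetical stock solutions of 1 M TRIS (pH 7.5) and 5 M NaCl and a hypothetical final volume of 100 µL; none of these values come from the description.

```python
def stock_volume(final_conc_mM, final_volume_uL, stock_conc_mM):
    """Volume of stock (in uL) needed, from C1 * V1 = C2 * V2."""
    if stock_conc_mM <= 0 or final_conc_mM > stock_conc_mM:
        raise ValueError("stock must be positive and at least as concentrated as the target")
    return final_conc_mM * final_volume_uL / stock_conc_mM


def annealing_mix(final_volume_uL=100.0, tris_stock_mM=1000.0, nacl_stock_mM=5000.0):
    """Volumes for a 10 mM TRIS / 20 mM NaCl annealing solution.

    Stock concentrations and final volume are assumptions for illustration.
    The remainder is RNA solution plus RNAase-free water.
    """
    tris = stock_volume(10.0, final_volume_uL, tris_stock_mM)
    nacl = stock_volume(20.0, final_volume_uL, nacl_stock_mM)
    return {
        "tris_uL": tris,
        "nacl_uL": nacl,
        "rna_plus_water_uL": final_volume_uL - tris - nacl,
    }


mix = annealing_mix()
print(mix)
```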
- any single-strand overhangs are removed using an enzyme that specifically cleaves such overhangs (e.g., RNAase A or RNAase T).
- a reference cell which can include an individual cell or a population of cells (e.g., a tissue, an embryo and an entire organism).
- the cell can be from essentially any source, including animal, plant, viral, bacterial, fungal and other sources.
- the tissue can include dividing or nondividing and differentiated or undifferentiated cells. Further, the tissue can include germ line cells and somatic cells.
- examples of differentiated cells include, but are not limited to, neurons, glial cells, blood cells, megakaryocytes, lymphocytes, macrophages, neutrophils, eosinophils, basophils, mast cells, leukocytes, granulocytes, keratinocytes, adipocytes, osteoblasts, osteoclasts, hepatocytes, cells of the endocrine or exocrine glands, fibroblasts, myocytes, cardiomyocytes, and endothelial cells.
- the cell can be an individual cell of an embryo, and can be a blastocyte or an oocyte.
- certain methods provided herein are conducted with a neuroblastoma cell line that serves as a model system for investigating genes that are correlated with various neurological diseases.
- diseases that can be studied with this particular cell line include, but are not limited to, Alzheimer's disease, Parkinson's disease, brain tumor, epilepsy, stroke, especially ischemic stroke, and other neurodegenerative diseases.
- One specific cell line is referred to by the present inventors as the AGYNB-010 cell line.
- This cell line is prepared as follows. Neuronal cells (ATCC CCL131) are passaged at least 30 times on media containing 0.10 mg/L of Fe(NO3)3 and 4500 mg/L of glucose. Cells so prepared have been found to be sensitive to oxygen-glucose deprivation (OGD), N-methyl-D-aspartate (NMDA) and β-amyloid. As such, this particular line of cells serves as a useful model system for studying stroke (e.g., ischemic stroke), Alzheimer's disease and other neurological disorders. Other cell lines can be utilized as model systems to study obesity and brain tumor.
- RNA can be directly introduced intracellularly.
- Various physical methods are generally utilized in such instances, such as administration by microinjection (see, e.g., Zernicka-Goetz, et al. (1997) Development 124:1133-1137; and Wianny, et al. (1998) Chromosoma 107: 430-439).
- Other options for cellular delivery include permeabilizing the cell membrane and electroporation in the presence of the dsRNA, liposome-mediated transfection, or transfection using chemicals such as calcium phosphate.
- a number of established gene therapy techniques can also be utilized to introduce the dsRNA into a cell. By introducing a viral construct within a viral particle, for instance, one can achieve efficient introduction of an expression construct into the cell and transcription of the RNA encoded by the construct.
- when dsRNA is to be introduced into an organism or tissue, gene gun technology is an option that can be employed. This generally involves immobilizing the dsRNA on a gold particle which is subsequently fired into the desired tissue.
- mammalian cells have transport mechanisms for taking in dsRNA (see, e.g., Asher, et al. (1969) Nature 223:715-717). Consequently, another delivery option is to administer the dsRNA extracellularly into a body cavity, interstitial space or into the blood system of the mammal for subsequent uptake by such transport processes.
- the blood and lymph systems and the cerebrospinal fluid are potential sites for injecting dsRNA.
- Oral, topical, parenteral, rectal and intraperitoneal administration are also possible modes of administration.
- the composition introduced can also include various other agents in addition to the dsRNA.
- agents include, but are not limited to, those that stabilize the dsRNA, enhance cellular uptake and/or increase the extent of interference.
- the dsRNA is introduced in a buffer that is compatible with the composition of the cell into which the RNA is introduced to prevent the cell from being shocked.
- the minimum size of the dsRNA that effectively achieves gene silencing can also influence the choice of delivery system and solution composition.
- Quantity of dsRNA introduced
- Sufficient dsRNA is introduced into the tissue to cause a detectable change in expression of the candidate gene (assuming the candidate gene is in fact being expressed in the cell into which the dsRNA is introduced) using available detection methodologies such as those described in the following section.
- sufficient dsRNA is introduced to achieve at least a 5-10% reduction in candidate gene expression as compared to a cell in which the dsRNA is not introduced.
- inhibition is at least 20, 30, 40 or 50%.
- the inhibition is at least 60, 70, 80, 90 or 95%. Expression in some instances is essentially completely inhibited to undetectable levels.
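- The percentage reductions discussed above follow from a simple comparison of candidate gene expression in treated cells with that in cells into which no dsRNA was introduced. The Python sketch below shows one way to compute it; the expression values and the function name `percent_inhibition` are hypothetical, and the measurement units are arbitrary as long as both values share them.

```python
def percent_inhibition(treated_level, control_level):
    """Percent reduction in expression relative to an untreated control.

    treated_level: expression measured in cells receiving dsRNA.
    control_level: expression measured in cells without dsRNA.
    """
    if control_level <= 0:
        raise ValueError("control_level must be positive")
    if treated_level < 0:
        raise ValueError("treated_level must not be negative")
    return 100.0 * (1.0 - treated_level / control_level)


# Hypothetical expression measurements in arbitrary units.
print(percent_inhibition(25.0, 100.0))
print(percent_inhibition(100.0, 100.0))
```

- A negative result would indicate that expression was higher in the treated cells than in the control.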
- the amount of dsRNA introduced depends upon various factors such as the mode of administration utilized, the size of the dsRNA, the number of cells into which dsRNA is administered, and the age and size of an animal if dsRNA is introduced into an animal.
- An appropriate amount can be determined by those of ordinary skill in the art by initially administering dsRNA at several different concentrations, for example. In certain instances when dsRNA is introduced into a cell culture, the amount of dsRNA introduced into the cells varies from about 0.5 to 3 µg per 10^6 cells.
- Detecting Interference of Expression
- a number of options are available to detect interference of candidate gene expression (i.e., to detect candidate gene silencing).
- inhibition in expression is detected by detecting a decrease in the level of the protein encoded by the candidate gene, determining the level of mRNA transcribed from the gene and/or detecting a change in phenotype associated with candidate gene expression.
- Various methods can be utilized to detect changes in protein levels. Exemplary methods include, but are not limited to, Western blot analysis, performing immunological analyses utilizing an antibody that specifically binds to the protein followed by detection of complex formed between the antibody and protein, and activity assays, provided the protein has a detectable activity. Similarly, a number of methods are available for detecting attenuation of candidate gene mRNA levels. Such methods include, for example, dot blot analysis, in-situ hybridization, RT-PCR, quantitative reverse-transcription PCR (i.e., the so-called "TaqMan" methods), Northern blots and nucleic acid probe array methods.
- the phenotype of the cell can also be observed to detect a phenotypical change that is correlated with inhibition of expression of the candidate gene.
- phenotypical changes can include, for instance, apoptosis, morphological changes and changes in cell proliferation as well as other cellular activities listed supra.
- Antisense technology can be utilized to functionally validate a candidate gene.
- an antisense polynucleotide that specifically hybridizes to a segment of the coding sequence for the candidate gene is administered to inhibit expression of the candidate gene in those cells into which it is introduced.
- Methods relating to antisense polynucleotides are well known, see e.g., Melton, D., Ed., 1988, ANTISENSE RNA AND DNA, Cold Spring Harbor Laboratory, Cold Spring Harbor, NY; Dagle et al., 1991, Nucleic Acids Research, 19:1805; and Uhlmann et al., Chem. Reviews, 90:543-584 (1990).
- the antisense polynucleotide should be long enough to form a stable duplex but short enough, depending on the mode of delivery, to be administered in vivo, if desired.
- the minimum length of a polynucleotide required for specific hybridization to a target sequence depends on several factors, such as G/C content, positioning of mismatched bases (if any), degree of uniqueness of the sequence as compared to the population of target polynucleotides, and chemical nature of the polynucleotide (e.g., methylphosphonate backbone, peptide nucleic acid, phosphorothioate), among other factors.
- the antisense polynucleotides used in the functional validation methods comprise an antisense sequence that usually is at least about 10 contiguous nucleotides long, in other instances at least 12 or 14 contiguous nucleotides long, and in still other instances up to about 100 contiguous nucleotides long, which sequence specifically hybridizes to a sequence from an mRNA encoding the candidate gene.
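- As a hedged illustration of the length and G/C considerations above, the Python sketch below derives an antisense sequence as the reverse complement of a chosen target window and reports its G/C fraction. The target sequence (written as cDNA), the window position, and the function names are hypothetical; the minimum length of 10 mirrors the lower bound mentioned in the text, and no claim is made that an oligonucleotide chosen this way would be effective.

```python
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def antisense_oligo(cdna_sense, start, length):
    """Reverse complement of cdna_sense[start:start + length].

    Rejects windows shorter than 10 nucleotides or outside the sequence.
    """
    if length < 10:
        raise ValueError("antisense sequence should be at least 10 nucleotides")
    if start < 0 or start + length > len(cdna_sense):
        raise ValueError("window lies outside the target sequence")
    target = cdna_sense[start:start + length].upper()
    return target.translate(_COMPLEMENT)[::-1]


def gc_fraction(seq):
    """Fraction of G and C bases in a non-empty sequence."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return (seq.count("G") + seq.count("C")) / len(seq)


target = "ATGGCCATTGTAATGGGCCGC"  # hypothetical sense-strand cDNA
oligo = antisense_oligo(target, start=0, length=12)
print(oligo, gc_fraction(oligo))
```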
- the antisense sequence is complementary to relatively accessible sequences of the candidate gene mRNA (e.g., relatively devoid of secondary structure). This can be determined by analyzing predicted RNA secondary structures using, for example, the MFOLD program (Genetics Computer Group, Madison, WI) and testing in vitro or in vivo as is known in the art.
- Another useful method for optimizing antisense compositions uses combinatorial arrays of oligonucleotides (see, e.g., Milner et al., 1997, Nature Biotechnology 15:537).
- the antisense nucleic acids can be made using any suitable method for producing a nucleic acid, such as chemical synthesis and recombinant methods that are well known in the art.
- the functional role that a candidate gene plays in a cell can also be assessed using gene "knockout" approaches in which the candidate gene is deleted, modified, or inhibited on either a single or both alleles.
- the cells or animals can optionally be reconstituted with a wild-type candidate gene as part of a further analysis.
- Certain "knockout" approaches are based on the premise that the level of expression of a candidate gene in a mammalian cell can be decreased or completely abrogated by introducing into the genome a new DNA sequence that serves to interrupt some portion of the DNA sequence of the candidate gene.
- simple mutations that either alter the reading frame or disrupt the promoter can be suitable.
- a "gene trap insertion" can be used to disrupt a candidate gene, and embryonic stem (ES) cells (e.g., from mice) can be used to produce knockout transgenic animals (see, e.g., Holzschu (1997) Transgenic Res 6: 97-106).
- the insertion of the exogenous sequence is typically by homologous recombination between complementary nucleic acid sequences.
- the exogenous sequence is some portion of the candidate gene which one seeks to modify, such as exonic, intronic or transcriptional regulatory sequences, or any genomic sequence which is able to affect the level of expression of the candidate gene; or a combination thereof.
- the construct can also be introduced into other (i.e., non-candidate gene) locations in the genome. Gene targeting via homologous recombination in pluripotential embryonic stem cells allows one to modify precisely the candidate gene of interest.
- the exogenous sequence is typically inserted in a construct, usually also with a marker gene to aid in the detection of the knockout construct and/or a selection gene.
- the construct can be any of a variety of expression vectors, plasmids, and the like.
- the knockout construct is inserted in a cell, typically an embryonic stem (ES) cell, using a variety of established techniques. As noted above, the insertion of the exogenous DNA usually occurs by homologous recombination.
- the resultant transformed cell can be a single gene knockout (i.e., only one of the two copies of the candidate gene has been modified) or a double gene knockout (i.e., both copies of the candidate gene have been modified).
- typically only one to five percent of the ES cells that take up the knockout construct actually integrate exogenous DNA in these regions of complementarity; thus, identification and selection of cells with the desired phenotype is usually necessary. This can be accomplished by detecting expression of the selection or marker sequence described above. Cells that have incorporated the construct are selected for prior to inserting the genetically manipulated cell into a developing embryo.
- selection and marker techniques are well known in the art (e.g., antibiotic resistance selection or beta-galactosidase marker expression).
- insertion of the exogenous sequence and levels of expression of the endogenous candidate gene or marker/selection genes can be detected by hybridization or amplification techniques or by antibody-based assays.
- the cells are inserted into an embryo (e.g., a mouse embryo). Insertion can be accomplished by a variety of techniques, such as microinjection, in which about 10 to 30 cells are collected into a micropipet and injected into embryos that are at the proper stage of development to integrate the ES cell into the developing embryonic blastocyst, at about the eight cell stage (for mice, this is about 3.5 days after fertilization). The embryos are obtained by perfusing the uterus of pregnant females.
- After the ES cell has been introduced into the embryo, the embryo is implanted into the uterus of a pseudopregnant foster mother, which is typically prepared by mating with vasectomized males of the same species. In mice, the optimal time to implant is about two to three days pseudopregnant.
- Offspring are screened for integration of the candidate gene. Offspring that have the desired phenotype are crossed to each other to generate a homozygous knockout. If it is unclear whether germline cells of the offspring have modified candidate gene, they can be crossed with a parental or other strain and the offspring screened for heterozygosity of the desired trait.
- mice that have a knocked out candidate gene is provided in the following sources, for example: Bijvoet (1998) Hum. Mol. Genet. 7:53-62; Moreadith (1997) J. Mol. Med. 75:208-216; Tojo (1995) Cytotechnology 19:161-165; Mudgett (1995) Methods Mol. Biol. 48:167-184; Longo (1997) Transgenic Res. 6:321-328; U.S. Patents Nos.
- Ribozymes can also be utilized to inhibit candidate gene expression in a cell or animal.
- Useful ribozymes can comprise 5'- and 3'-terminal sequences complementary to the candidate gene and can be engineered by one of skill on the basis of the sequence of the candidate gene.
- Various types of ribozymes can be utilized in the functional validation studies, including, for example, those that have characteristics of group I intron ribozymes (see, e.g., Cech, 1995, Biotechnology 13:323) and those that have the characteristics of hammerhead ribozymes (see, e.g., Edgington, 1992, Biotechnology 10:256).
- Ribozymes and antisense polynucleotides can be delivered by a number of techniques known in the art, including liposomes, immunoliposomes, ballistics, direct uptake into cells, and the like (see, e.g., U.S. Patent 5,272,065).
- Co-immunoprecipitations can be used to functionally validate the role of a protein in a pathway. If two proteins interact and antibodies are available, co-immunoprecipitations can be used to quickly confirm their role in a pathway.
- Alternative Methods for Identifying Candidate Genes
- although the functional validation procedures (e.g., RNAi methods) have been discussed primarily with respect to candidate genes identified from subtractive and/or normalized libraries prepared according to the methods described supra, it should be understood that these functional validation procedures can be utilized to functionally validate genes that have been identified by any of a number of other methods.
- the functional validation procedures can be used to functionally validate low abundance genes and differentially expressed genes identified using other techniques.
- differential display PCR (see, e.g., U.S. Patent Nos. 5,262,311; 5,599,672; and Liang, P. and Pardee, A.B., (1992) Science 257:967-971)
- nucleic acid probe arrays (see, e.g., WO 97/10365; WO 97/27317; and the entire supplement of Nature Genetics, vol. 21 (1999))
- Quantitative RT-PCR see, e.g., U.S. Patent Nos.
- a microglia cell line was stimulated with lipopolysaccharide (LPS, 100 ng/ml) and γ-interferon (γIFN, 100 U/ml) in a culture dish. Stimulated and unstimulated cells were harvested at 12 hours and a tester-subtracted library prepared (SL18). In this specific case, the tester and driver dscDNAs were digested with Rsa I, and adaptor set 1 (see Table IV, supra) was used for tester ligations. The first and second hybridizations were for 8 and 16 h, respectively.
- PCR amplification (primary PCR: 25 cycles, secondary PCR: 12 cycles) was with primer set 1, and products were cloned in pCR 2.1. Primer set 1 is shown supra in Table IV.
- Gene classification is based on BlastN results using the most recent version of Genbank as database. Genes are considered to be "known" if they display a high degree of similarity (>80% identity on nucleotide level) to a database entry, as "similar" if they display a distant similarity (40-80% identity on nucleotide level) and as "unknown" if they do not show any homology or an insignificant homology to a database entry.
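- The three classes can be expressed as a small decision function. The Python sketch below is an assumption-laden reading of the rule: it treats exactly 80% identity as "similar", treats any identity below 40% (or no hit at all, passed as `None`) as "unknown", and takes the identity value as already extracted from the BlastN output; the function name `classify_hit` is hypothetical.

```python
def classify_hit(percent_identity):
    """Classify a clone from its best BlastN nucleotide identity.

    percent_identity: best identity (0-100), or None when no hit was found.
    Boundary handling (80% -> "similar", <40% -> "unknown") is an assumption.
    """
    if percent_identity is None or percent_identity < 40.0:
        return "unknown"
    if percent_identity > 80.0:
        return "known"
    return "similar"


for value in (95.0, 80.0, 40.0, 12.0, None):
    print(value, classify_hit(value))
```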
- a mouse microglial cell line known to respond to stimulation by incubation in media containing lipopolysaccharide (LPS) and gamma interferon (γIFN) was used.
- a normalized and subtracted cDNA library was prepared and cloned in bacteria ("Library 1"). For a representative number of clones (670), sufficient sequence was determined to assign a Genbank identifier tag (GID) based on a BLAST comparison. Clones matching a GID for MERANTES (GID X70675) were highly represented in the sample (10 clones of 670, or approximately 1.5%).
- Radiolabeled cDNA probes were prepared from approximately 0.5 micrograms of tester or driver mRNA. The knockdown cDNA was boiled for 5 minutes, cooled on ice, and approximately 1 microgram was added to aliquots of radiolabeled tester probe. Equivalent aliquots of radiolabeled tester probe and driver probe were used without the addition of knockdown cDNA. The probe or probe/knockdown mixtures were incubated at 68°C for 20 minutes and hybridization solution (50% formamide, 5X SSC, 5X Denhardt's reagent, 1% SDS, 0.025% sodium pyrophosphate) was added.
- Each of the probe mixtures was hybridized to nylon membranes onto which PCR-amplified cDNA prepared from the 670 partially sequenced clones from Library 1 had been spotted. Hybridization was for 20 hours at 42°C and was followed by washing and signal detection.
- Quantitation of the signal level of tester, knockdown-tester and driver hybridizations allowed the selection of clones upregulated by LPS and γIFN, based on their tester/driver ratios. Further, the signal ratio of tester/knockdown-tester allowed for the identification of clones that match the knockdown cDNA. All 10 clones corresponding to MERANTES were identified by an elevated tester/knockdown-tester ratio, with an average tester/knockdown-tester signal ratio of 6.4 fold (stdev 2.2). In contrast, the average tester/knockdown-tester signal ratio for all clones was 1.38 (stdev 0.7). There was one clone with a tester/knockdown-tester ratio above 3 fold that was not MERANTES. The effort of selecting and further handling redundant clones (e.g., MERANTES) can be reduced by rejecting clones having an elevated tester/knockdown-tester ratio (e.g., greater than 3).
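The rejection rule described above (discard clones whose tester/knockdown-tester ratio exceeds about 3) can be sketched as follows. The function name, the input layout and the example signal values are hypothetical; only the cutoff of 3 comes from the text.

```python
def flag_redundant_clones(signals, ratio_cutoff=3.0):
    """Return ids of clones whose tester / knockdown-tester ratio exceeds the cutoff.

    signals: dict mapping clone id -> (tester_signal, knockdown_tester_signal)
    """
    flagged = []
    for clone_id, (tester, knockdown_tester) in signals.items():
        if knockdown_tester <= 0:
            # Ratio is undefined; leave the clone for manual inspection.
            continue
        if tester / knockdown_tester > ratio_cutoff:
            flagged.append(clone_id)
    return flagged


# Hypothetical signal values, for illustration only.
example_signals = {
    "clone_A": (6400.0, 1000.0),   # ratio 6.4, likely matches the knockdown cDNA
    "clone_B": (1380.0, 1000.0),   # ratio 1.38, kept
    "clone_C": (500.0, 0.0),       # no knockdown-tester signal, skipped
}
print(flag_redundant_clones(example_signals))
```

As the text notes, a ratio cutoff can occasionally flag a clone that does not correspond to the knockdown cDNA, so flagged clones may still merit a sequence check.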
- Human fibroblasts (ATCC CRL 2091) were grown to approximately 60% confluence in 15 cm Petri dishes in Dulbecco's Modified Eagle Medium (DMEM), 10% Fetal Calf Serum (FCS). The cells were washed 3 times with DMEM lacking FCS. After a 48 hour incubation in DMEM with 0.1% FCS, the medium was replaced with fresh medium containing 10% FCS (serum stimulation). Cells were collected at two different time points. One batch of cells was collected just prior to serum stimulation (serum starved cells). This sample served as a time zero reference from which "driver" RNA was prepared. Another batch was collected 6 hours after the addition of FCS (serum stimulated cells). This sample served as a stimulated sample from which "tester" RNA was prepared.
- DMEM Dulbecco's Modified Eagle Medium
- FCS Fetal Calf Serum
- RNA from these samples was prepared using Trizol (Life Technologies). mRNA was selected using the Oligotex Kit (Qiagen). The poly A+ RNA was reverse transcribed using an oligo dT priming method and converted into double-stranded cDNA (ds cDNA) using standard methods.
- The ds cDNA was digested with Rsa I (NEB).
- The Rsa I-digested tester and driver ds cDNA were each divided into two aliquots, and each aliquot was ligated to an adapter oligonucleotide (Adapter set No. 1, shown in Table IV, supra).
- The ligation reaction was performed for 12 hours at 16°C using T4 DNA Ligase (2000 U/µl).
- PCR primer for first amplification: PCR1, (SEQ ID NO:14) CTAATACGACTCACTATAGGGC; PCR primer pair for second, nested amplification: nPCR1, (SEQ ID NO:15) TCGAGCGGCCGCCCGGGCAGGT; nPCR2, (SEQ ID NO:16) AGCGTGGTCGCGGCCGAGGT.
C. Evaluation of Library Quality
i) Array Preparation
- Arrays can be prepared using various materials and protocols (for examples, see Schena, Mark et al., "Quantitative monitoring of Gene Expression patterns with a complementary DNA microarray", Science (1995) v270:467-470, and Zhao, Nanding et al., "High-Density cDNA Filter Analysis: A Novel Approach for Large-Scale, Quantitative Analysis of Gene Expression", Gene (1995) v156:207-213).
- An array can comprise a large number of clonal cDNAs on a substrate.
- The cDNAs can be produced by various methods, including purification of plasmids and PCR amplification.
- The cDNAs are commonly attached by treatment with heat, ultraviolet light, chemicals or enzymes, or by reaction with a preactivated surface.
- One typical array starts with the PCR amplification of 11520 bacterial clones containing cDNAs inserted into a plasmid. These clones are commonly from a normalized-subtracted library and therefore contain genes that differ in expression level between tester and driver mRNA. Aliquots of the PCR reactions are spotted onto nylon membrane (Schleicher & Schuell) to produce the array.
- cDNA fragments are denatured by wetting the membrane in a solution of 0.5M sodium hydroxide, 1.5M sodium chloride to allow better availability for hybridization, then neutralized and crosslinked by ultraviolet light (Stratalinker, Stratagene).
- A cDNA array suitable for analysis of library production methods was prepared. Clones corresponding to 80 genes were selected because their mRNA expression levels in fibroblasts varied upon stimulation by serum, based on cDNA microarray data as described in Iyer, Vishwanath et al., 1999 Science v283:83-87, incorporated herein by reference in its entirety for all purposes. Recombinant clones were purchased from Research Genetics and verified by DNA sequencing.
- The cDNA insert of each clone was PCR-amplified using vector-specific primers. PCR products were verified by gel electrophoresis. PCR products were spotted in sextuplicate on nylon membranes.
ii) Probe Preparation
- ds cDNA from each of libraries A-K described supra (i.e., the products of the second PCR amplification) was gel purified using a QiaEx Gel purification kit. The purified products were labeled with 32P-dCTP (Klenow, Decamer labeling Kit, Ambion) and unincorporated nucleotides were removed by spin column P30 (BioRad).
iii) Evaluation of Library Quality
- The probes were hybridized to the cDNA arrays at 42°C in 5x SSC/50% formamide for 20 hours.
- The hybridized arrays were washed in 0.1x SSC at 60°C and exposed to phosphorimager screens (Packard Instruments) for approximately 64 h.
- Hybridization signal intensities were determined by a Cyclone scanner and Optiquant software (Packard
- The arrayed genes were also grouped into classes that increase, maintain, or decrease signal intensity (i.e., were regulated in the amount of mRNA produced under the conditions of tester and driver, e.g., serum-stimulation and serum-starvation).
- Genes were considered up-regulated if the ratio of their tester/driver signals was greater than 2, unchanged if the ratio of their tester/driver signals was greater than 0.85 and less than 1.15, and down-regulated if the ratio of their driver/tester signals was greater than 1:5.
- A gene could be of low abundance in driver (i.e., low signal of hybridization, herein less than 5000 DLU) and upregulated (i.e., ratio of tester/driver signals greater than 2).
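The grouping rules above can be sketched as a small classifier. The up-regulated cutoff (tester/driver greater than 2), the unchanged window (0.85 to 1.15) and the 5000 DLU low-abundance limit come from the text. The down-regulation cutoff is passed in explicitly because the printed value is ambiguous; the 1.5 used in the example call is purely a hypothetical choice, as are the function name and the signal values.

```python
def classify_regulation(tester, driver, down_cutoff, low_abundance_dlu=5000):
    """Classify one arrayed gene from its tester and driver signals (in DLU).

    Returns (label, low_abundance_upregulated).
    down_cutoff: minimum driver/tester ratio for "down-regulated"
                 (supplied by the caller; assumption, see lead-in).
    """
    if tester <= 0 or driver <= 0:
        raise ValueError("signals must be positive to form a ratio")

    ratio = tester / driver
    if ratio > 2:
        label = "up-regulated"
    elif 0.85 < ratio < 1.15:
        label = "unchanged"
    elif driver / tester > down_cutoff:
        label = "down-regulated"
    else:
        label = "unclassified"   # falls between the stated windows

    # Low abundance in driver combined with up-regulation.
    low_abundance_up = label == "up-regulated" and driver < low_abundance_dlu
    return label, low_abundance_up


# Hypothetical signals and a hypothetical down-regulation cutoff of 1.5.
print(classify_regulation(9000, 3000, down_cutoff=1.5))
print(classify_regulation(10000, 10000, down_cutoff=1.5))
print(classify_regulation(2000, 8000, down_cutoff=1.5))
```

Genes whose ratios fall between the stated windows (for example tester/driver of 1.5) are reported as "unclassified" rather than forced into a class.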
- The enrichment factors describe the change in abundance of a particular gene in normalized and subtracted cDNA libraries and are indicators for the success/quality of that library.
- The quality of a normalized-subtracted library is assessed by the degree to which differentially expressed genes are enriched in the library.
- upregulated genes of abundance higher in tester than in driver
- down regulated genes are decreased.
- In reverse subtraction the reverse is true (e.g., down regulated genes are increased in abundance in the resulting library).
- particular conditions e.g. F25
- F25 can further increase the signal and abundance of low, medium and high abundance genes where their initial abundance is higher in tester than in driver.
- The quality of a tester-normalized or driver-normalized library is assessed by the degree to which sequences in the library are present in the same abundance, as assessed by a similar intensity of hybridization to the arrayed clones. In a perfectly normalized library, all of the sequences represented are present in the same abundance. Normalizing the abundance of clones gives a more equal chance of discovering genes that were initially abundant and genes that were initially non-abundant, saving time by reducing redundancy of the clone fragments.
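One way to put a number on "similar intensity of hybridization" is the coefficient of variation of the array signals. The text does not name a statistic, so the choice of the coefficient of variation, the function name and the example intensities below are all assumptions made for illustration.

```python
from statistics import mean, pstdev


def coefficient_of_variation(intensities):
    """Population standard deviation divided by the mean of the signals.

    A perfectly normalized library (all sequences equally abundant)
    would give 0; larger values indicate a wider spread of abundance.
    """
    average = mean(intensities)
    if average == 0:
        raise ValueError("mean intensity is zero; ratio undefined")
    return pstdev(intensities) / average


# Hypothetical hybridization signals (DLU) for the same arrayed clones.
before_normalization = [200, 5000, 40000, 800]
after_normalization = [3000, 3500, 2800, 3200]

print(coefficient_of_variation(before_normalization))
print(coefficient_of_variation(after_normalization))
```

Comparing the value for a normalized library probe against the value for the starting material gives a single figure for how much the spread of abundance was reduced.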
- particular conditions e.g. library B
- The quality of a tester-subtracted normalized library is demonstrated by an increase in the occurrence of genes that are more abundant in tester than in driver, a decrease in the occurrence of genes that are more abundant in driver than in tester, and normalization of the abundance of the genes that remain in the library. This leads to an increase in the abundance of low-abundance genes that are more prevalent in tester than in driver.
- The normalization will also decrease the redundancy of very abundant genes that are more prevalent in tester than in driver. This effect of normalization will ease the discovery of rare genes that are more specific to tester, and increase the efficiency of identifying all genes in the subtracted library. An equivalent assessment of quality can be made for a driver-subtracted normalized library.
- The AGYNB-010 cell line utilized in this particular investigation was derived from a neuroblastoma cell line called Neuro 2A (ATCC No. CCL131). As described further below, the AGYNB-010 cell line was shown by the current inventors to be sensitive to OGD, NMDA and β-amyloid relative to the Neuro 2A cell line.
- The sensitivities exhibited by the AGYNB-010 cell line make the cell line a good model system for studying various neurological and non-neurological conditions such as ischemia, excitotoxicity, Alzheimer's disease and oxidative stress because these conditions are associated with the foregoing sensitivities.
- The AGYNB-010 cell line was transfected with a green fluorescent protein (GFP) expressing plasmid to provide an assay system to determine rapidly and quantitatively the reduction in specific protein levels achieved by RNAi.
- A neuroblastoma derived cell line expressing the enhanced Green Fluorescent Protein (eGFP)
- Neuro 2A cells were grown in DMEM and then plated in a six well plate at a concentration of 5x10^5 cells/ml.
- A plasmid expressing eGFP was obtained from Clontech (pEGFP-C1). Twenty-four hours after seeding the plates with Neuro 2A cells, the cells were co-transfected with 0.5 microgram of pCMVneo (available from Stratagene) and 3 micrograms of pEGFP-C1. Forty-eight hours after cotransfection, cells were transferred to media containing G418 to select for transfected cells.
- Cells resistant to G418 were selected, tested for GFP by visualization with a light microscope, and replated, and independent clonal lines were established. The established cell line was further tested for OGD, β-amyloid, and NMDA sensitivity according to the assays set forth below in this section.
RNA transcription
- Single strands of sense and anti-sense RNA from the full-length pEGFP clones, or from about 500 bp of EGFP-C (i.e., about 500 bp of the C-terminus of the pEGFP), were transcribed in vitro using T3 and T7 promoters. Addition of SP6 polymerase results in the transcription of sense RNA, and addition of T7 polymerase results in the transcription of antisense RNA (Ambion). Transcripts were purified of proteins using phenol-chloroform extraction.
- RNA was precipitated by adding 20 microliters of 10 M ammonium acetate and 220 microliters of isopropanol to 200 microliters of the extracted mix and then incubating the resulting mixture at -20°C for 15 minutes. The mixture was centrifuged and the RNA pellet dried and resuspended in 100 microliters of RNase-free double distilled water. The concentration of RNA was determined to be approximately 1 microgram/ml. The length of the transcripts was typically 500 bases or more.
- dsRNA corresponding to the full length coding region of UCP-2 (uncoupling protein 2) gene was prepared in a similar manner.
- In vitro transcription can also be done in 96-well format using both T3 and T7 promoter to generate sense and antisense strands.
- Purification of the transcripts is done using RNA purification columns, such as, but not limited to, the RNeasy kit (available from Qiagen). Annealing of both strands in the absence of potassium chloride or sodium chloride can be achieved using ammonium acetate, e.g., at about 10 µM to 1 mM concentration.
- The reaction buffer is then adjusted to 500 mM of sodium chloride before RNase T1 treatment.
- RNase T1 is added to degrade any non-annealed single-stranded RNA.
- The resulting products are passed through RNA purification columns again to remove RNase T1. Concentration of the final dsRNA products can be measured using a plate reader.
- RNA Double-stranded RNA.
- Equimolar quantities of sense and antisense RNA strands from either eGFP or UCP-2 were added to a reaction solution of annealing buffer; annealing of the sense and antisense strands was carried out by incubation at 60°C for thirty minutes followed by cooling at room temperature.
- Various annealing buffers can be used. For example, when an annealing solution containing 20 mM sodium chloride is used, the reaction mixture is incubated at 60°C for thirty minutes and cooled for about 15 minutes to afford a dsRNA.
- The RNA can be added to 10 mM Tris (pH 7.5) buffer containing 20 mM of sodium chloride.
- RNA is precipitated in 1 M ammonium acetate solution and resuspended in double distilled water.
- The mixture is then incubated at 60°C for thirty minutes and cooled for about 15 minutes to afford a dsRNA.
- The latter buffer solution differs from annealing buffers used by others, which contain potassium or sodium chloride.
- The approach described here also differs from other approaches in that incubation typically is only for 30 minutes, whereas others typically incubate the mixture overnight (see, e.g., Tuschl et al., Genes and Dev't, 1999, 13, 3191-3197).
- Solution A and B were gently mixed and incubated for 15 minutes at room temperature, then 0.8 ml of serum free DMEM was added to the transfection mixture and this mixture was overlaid on the washed cells. Care was taken to ensure that the final volume of the transfection mixture overlaid on the cells did not exceed 1 ml.
- The cells were incubated at 37°C in a CO2 incubator for 18-24 hours. The transfection mixture was then drained from the cells and replaced with fresh DMEM containing 10% FBS.
Oxygen-glucose deprivation
- To measure the sensitivity of cells to combined oxygen-glucose deprivation, cells were resuspended in glucose free deoxygenated media (Earle's balanced salt solution (EBSS) containing 116 mM NaCl, 5.4 mM KCl, 0.8 mM MgSO4, 1 mM NaH2PO4, and 0.9 mM CaCl2) bubbled with 5% H2/85% N2/5% CO2. The cells were transferred to an anaerobic chamber for 5 or 60 min at 37°C, containing the following gas mixture: 5% H2, 85% N2, and 5% CO2.
- EBSS glucose free deoxygenated media
- Oxygen glucose deprivation was terminated simply by removing the cells from the anaerobic chamber and replacing the EBSS solution with oxygenated growth media. Sensitivity of the cells to OGD was determined by measuring cell death. The cells were stained with calcein and ethidium homodimer (Molecular Probes), which stain live cells and dead cells, respectively; the staining was quantitated on a cytofluor plate reader, and the percentage of dead cells determined.
- calcein and ethidium homodimer Molecular Probes
NMDA Sensitivity
- Cells were washed with control salt solution (CSS) containing 120 mM NaCl, 5.4 mM KCl, 1.8 mM CaCl2, 25 mM Tris-HCl, 15 mM glucose, pH 7.4. N-Methyl-D-aspartic acid (NMDA) was applied in CSS for 5 min, and after this incubation time the NMDA solution was removed from the cells and replaced with growth medium. Toxicity was assayed 20-24 hrs after exposure to the NMDA solution.
β-Amyloid Sensitivity
- Cells were plated the day before exposure to either β-amyloid or peroxide in a 24 well plate at a concentration of 1x10^5 cells/well.
- β-Amyloid was made by first solubilizing it in DMSO or an aqueous solution and then diluting the resulting solution in DMEM. In both instances, sensitivity was assessed by measuring cell death using the staining procedure described in the section on assays for OGD.
Results
- Figure 5 shows the results of a Western Blot analysis. Lanes 1 and 2 of the gel show eGFP and MAP2 protein levels for untransfected cells (i.e., "mock" cells).
- Lanes 6-8 show a significant reduction in eGFP levels for AGYNB-010 cells transfected with 3 µg of eGFP-C dsRNA; likewise, cells transfected with 3 µg of enhanced green fluorescent protein (i.e., eGFP) dsRNA also showed a significant reduction in eGFP levels (lanes 9-10).
- The results demonstrate selectivity in inhibition in that eGFP expression is inhibited by eGFP dsRNA but not UCP-2 dsRNA.
- The consistent bands for MAP2 across all lanes confirm consistency in protein loading.
- The AGYNB-010 neuroblastoma derived cell line was shown to be sensitive to β-amyloid, NMDA and OGD as compared to Neuro 2A cells from which the AGYNB-010 cells are derived (see Figures 7A, 7B and 7C, respectively). As indicated supra, these sensitivities mean this particular cell line can serve as a useful model for conducting studies of various biological phenomena associated with such sensitivities. For instance, the cell line can be used in studying stroke (e.g., ischemic stroke), as stroke is associated with oxygen deprivation.
- Ischemic stroke results from transient or permanent reduction of the cerebral blood flow.
- Neuronal cells require high oxygen levels for viability and normal function. Deprivation of oxygen thus leads to neuronal death causing brain damage.
- Shorter exposures to ischemia result in protection from neuronal damage, a phenomenon known as ischemic tolerance, or ischemic preconditioning.
- PARP poly-ADP-ribose polymerase
- PARP inhibitors or inhibition of PARP may have neuroprotective effects.
- AGYNB-010 cells are sensitive to oxygen glucose deprivation and thus provide a sensitized system for studying ischemia.
Transfection of dsRNA into cells
- RNA from the C terminus or N terminus of the PARP gene (NM_007415, e.g., PARP-N 79-1171 and PARP-C 2200-2797 regions) or from UCP-2 as control was transcribed, purified and concentrated according to the general procedure set forth in Example 4. The single strands were converted to dsRNA and then transfected into AGYNB-010 cells, also as described in Example 4.
- dsUCP-2 UCP-2 dsRNA
- dsPARP-C PARP dsRNA from the C terminus
- dsPARP-N PARP dsRNA from the N-terminus
- FIG. 6B is a view showing the number of stained cells (i.e., healthy cells) present for cells transfected with dsEGFP 3 hours after the start of oxygen glucose deprivation.
- Figure 6C shows a similar view of cells similarly treated, except the cells are transfected with dsPARP.
- Figure 6D is a chart showing the same results as in Figures 6B and 6C. The chart also shows results for two controls: (1) the extent of cell death for cells not exposed to OGD, and (2) mock cells (i.e., untransfected cells) subject to 3 hours of OGD. Collectively, these results show the ability of dsPARP to rescue cells having been previously subjected to 3 hours of OGD.
- The functional validation results obtained by RNAi are consistent with the gene expression data indicating that up-regulation of PARP is correlated with harmful cellular effects caused by ischemia.
- The results with the model system provided herein indicate that inhibition of PARP can provide a neuroprotective effect, particularly against ischemia. This makes PARP an attractive target for treatment of stroke.
- dsRNA corresponding to the full-length (dsEGFP) or to the C-terminal part of the EGFP (dsEGFP-C) open reading frame was generated for transfection into these N2a-EGFP cells.
- Control dsRNA was made from the entire coding region of the uncoupling protein-2 (UCP2). N2a-EGFP cells were plated 24 hours before transfection.
- The reporter gene EGFP was used to facilitate detection of expression.
- dsRNA was transfected into a stable cell line expressing EGFP protein that mimics expression of endogenous genes.
- The dsRNA-induced inhibitory effects we observed were clearly gene-specific.
- Control dsRNA corresponding to an unrelated gene shows non-significant suppression of EGFP expression compared with mock-transfected cells.
- Cells remain alive and healthy with no significant apoptosis induced by transfection of double-stranded RNA.
- siRNA of 21 nt can induce efficient gene-specific silencing in mammalian cells.
- In N2a cells, it is demonstrated herein that siRNA derived from the FASTK ORF indeed induced gene-specific silencing, confirming the efficacy of siRNA in these neuronal cells.
- Higher concentrations of siRNA appear to be needed compared with long dsRNA, consistent with the current hypothesis that dsRNA is processed into 21-23 nt siRNA before degrading the target mRNA.
- We have shown that the level of inhibition of EGFP expression is dependent on the amount of dsEGFP-RNA.
- Long dsRNA may be processed into 21-23 nt siRNA to induce degradation of target mRNA in these neuronal cells, which may explain the higher efficiency of equal concentrations of long dsRNA compared to siRNA.
- RNA Double-stranded RNA.
- The plasmid EGFP-C1 was used as template to produce PCR fragments corresponding to the full-length coding region and the C-terminal fragment of EGFP.
- The PCR fragments were then subcloned in PCRTOPO4.0 plasmid.
- The plasmid was linearized and transcribed with T3 and T7 RNA polymerase.
- Oligonucleotides used for PCR amplification for full-length EGFP ORF are (SEQ ID NO: 17) ATGGTGAGCAAGGGCGAGGAGCTG and (SEQ ID NO: 18)
- dsRNA-EGFP and EGFP-C were 727 and 620 bp long, respectively.
- The dsRNA preparation was treated with RNase T1 to eliminate any remaining ssRNA.
- The quality of the dsRNA preparations was analyzed on a 1.2% native agarose gel.
- Sense and anti-sense RNA derived from UCP2 were generated using T7 and SP6 polymerase, respectively. The template used was a PCR fragment.
- UCP2 dsRNA was generated similarly as described.
- To test RNAi-mediated gene silencing of endogenous genes, several types of genes were chosen, including genes involved in apoptotic pathways such as caspase-3, p53 and 14-3-3, kinases such as MAP kinase p38 and fas-activated serine threonine kinase (FASTK), and housekeeping enzymes such as 3-hydroxy-3-methylglutaryl-Coenzyme A (HMG-CoA) synthase 1.
- dsRNAs corresponding to the above mouse genes were cloned into PCR4.0TOPO (Invitrogen).
- The average length of the dsRNAs corresponding to the partial sequences of the above genes is about 600-800 bp.
- The table below shows the positions of the oligonucleotides used for cloning the PCR fragments, the length of each dsRNA, and the positions of the oligonucleotides used for quantification of the mRNA level after transfection.
- Table 1 dsRNA-mediated inhibition of gene expression quantified with real-time PCR.
- Inhibition of mouse p38 expression was observed using the dsRNA derived from the 3' UTR of rat p38, which shares about 80% homology with the mouse sequence.
- 5 nM dsRNA was transfected into N2a cells. Three days after transfection, cells were harvested and total RNA was extracted for RT-PCR as described in Materials and Methods. As shown in Figure 9A-F, gene specific dsRNA induces profound silencing of the cognate mRNA while control dsRNA-EGFP shows little or no silencing effect. To test whether dsRNA induces non-specific silencing, we used GAPDH as our internal control. No difference in the expression levels of GAPDH was observed between dsRNA-transfected and mock-transfected cells, indicating that there was no non-specific silencing induced by long dsRNA in these mouse neuroblastoma cells under current experimental conditions (Fig A-
- RNAi machinery is highly active in N2a cells and the silencing effects we observed at the protein level are not due to post-translational mechanisms, but are mediated at the transcriptional level.
- dsRNA-induced inhibition appears to be transient when transfected into mammalian cells.
- The present experiments show dsRNA-induced inhibition that is both transient and time-dependent.
- Different genes require different lengths of post-transfection time for efficient inhibition of gene expression.
- dsRNA-p53 induced maximal inhibition in 24-48 hours.
- Both dsRNA-PARP and dsRNA-EGFP induce maximal inhibition in 96 hours.
- This phenomenon can best be explained by the fact that the mRNA and protein of different genes have different stabilities and turnover times.
- EGFP is known to be much more stable than p53 protein, which has a typical half-life of 20 minutes.
- dsRNA corresponding to the partial ORF of mouse p38 induces efficient silencing of p38 mRNA in mouse neuroblastoma N2a cells (Fig. 9C). It was then tested whether dsRNA corresponding to the 3'UTR of the rat p38 gene (dsRNA-rat-p38), which shares about 80% identity with the mouse p38 gene sequence in that region, also can induce gene-specific silencing in N2a cells. Indeed, dsRNA-rat-p38 induces efficient silencing of p38 mRNA in N2a cells, indicating that effective dsRNA is not restricted to sequences in the ORF, and that 100% sequence identity is not required.
- N2a cells undergo serum withdrawal, which induces partial differentiation. It was then tested whether RNAi is active in fully differentiated N2a cells.
- Cells were transfected with 5 nM dsRNA and then incubated in Neurobasal medium (Invitrogen) with N2 supplement (Invitrogen) in the presence of 20 µM retinoic acid to induce neuronal differentiation. After three days, N2a cells were fully differentiated with long processes. The proliferation rate decreased dramatically after 2-3 days in differentiation media, as measured by the amount of incorporated BrdU.
- RNAi is not restricted to non-differentiated cells or cells of embryonic origin, but is active in these fully-differentiated neuronal cells.
- SYBR Green real-time PCR amplifications were performed in an iCycler Real-Time Detection System (Bio-Rad Laboratories, Hercules, CA). Primers were designed using Primer3, developed by the Whitehead Institute for Biomedical Research, and primer (Operon Technologies, Alameda, CA) concentrations were optimized for use with the SYBR green PCR master mix reagents kit. The sizes of the amplicons were checked by running out the PCR product on a 1.5% agarose gel.
- The thermal profile for all SYBR Green PCRs was 50°C for 2 minutes and 95°C for 10 minutes, followed by 45 cycles of 95°C for 15 seconds, 60°C for 30 seconds and 72°C for 40 seconds.
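As a worked check of the thermal profile above, the programmed hold times can be written down as data and summed. The temperatures, hold times and cycle count are taken from the text; the variable and function names are illustrative, and instrument ramp times between temperatures are deliberately left out because the text does not state them.

```python
# Thermal profile as stated in the text: (temperature in C, hold in seconds).
INITIAL_HOLDS = [(50, 2 * 60), (95, 10 * 60)]
CYCLE_STEPS = [(95, 15), (60, 30), (72, 40)]
CYCLES = 45


def programmed_hold_seconds(initial_holds, cycle_steps, cycles):
    """Sum the programmed hold times; ramp times are not included."""
    initial = sum(seconds for _, seconds in initial_holds)
    per_cycle = sum(seconds for _, seconds in cycle_steps)
    return initial + cycles * per_cycle


total_s = programmed_hold_seconds(INITIAL_HOLDS, CYCLE_STEPS, CYCLES)
print(total_s, "s =", total_s / 60, "min")
```

The programmed holds add up to 720 s of initial steps plus 45 cycles of 85 s, or 4545 s (75.75 min); an actual run takes longer once ramping is included.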
- The standard curves are used to calculate the PCR efficiency of the primer set.
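The text does not say how efficiency is derived from the standard curve. A common approach, sketched here as an assumption, fits a straight line to CT versus log10 of template input and converts the slope to a per-cycle amplification factor with E = 10^(-1/slope), where E = 2 corresponds to perfect doubling. The function names and the synthetic dilution series are hypothetical.

```python
import math


def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    return sxy / sxx


def amplification_factor(log10_inputs, ct_values):
    """Per-cycle amplification factor from a standard curve (2.0 = doubling)."""
    slope = fit_slope(log10_inputs, ct_values)
    return 10 ** (-1.0 / slope)


# Synthetic tenfold dilution series constructed to behave as perfect doubling:
# each tenfold increase in input lowers CT by log2(10), about 3.32 cycles.
log10_inputs = [0, 1, 2, 3]
ct_values = [30 - math.log2(10) * x for x in log10_inputs]

factor = amplification_factor(log10_inputs, ct_values)
print(factor)
```

With real triplicate data the fitted factor is typically somewhat below 2, and it is this per-primer-set value that an efficiency-corrected quantification would use.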
- GAPDH glyceraldehyde-3-phosphate dehydrogenase
- All PCR reactions were performed in triplicate. Quantification was performed using the comparative cycle threshold (CT) method, where CT is defined as the cycle number at which fluorescence reaches a set threshold value.
- CT comparative cycle threshold
- The target transcript was normalized to an endogenous reference (simultaneous triplicate GAPDH reactions), and relative differences were calculated using the PCR efficiencies.
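The text does not give the formula used to combine CT values with PCR efficiencies. One widely used efficiency-corrected form, shown here as an assumption, divides the fold change of the target by the fold change of the reference, each computed as its amplification factor raised to the CT difference (control minus sample). All names and CT values below are hypothetical.

```python
from statistics import mean


def relative_expression(e_target, ct_target_control, ct_target_sample,
                        e_ref, ct_ref_control, ct_ref_sample):
    """Efficiency-corrected expression of the target in sample vs control.

    e_target, e_ref: per-cycle amplification factors (2.0 = perfect doubling).
    CT arguments: mean cycle thresholds of the triplicate reactions.
    """
    target_change = e_target ** (ct_target_control - ct_target_sample)
    reference_change = e_ref ** (ct_ref_control - ct_ref_sample)
    return target_change / reference_change


# Hypothetical triplicate CT values: mock-transfected (control) versus
# dsRNA-transfected (sample) cells, with GAPDH as the endogenous reference.
target_mock = mean([20.1, 19.9, 20.0])
target_dsrna = mean([23.0, 23.1, 22.9])
gapdh_mock = mean([18.0, 18.0, 18.0])
gapdh_dsrna = mean([18.0, 18.0, 18.0])

ratio = relative_expression(2.0, target_mock, target_dsrna,
                            2.0, gapdh_mock, gapdh_dsrna)
print(ratio)
```

In this made-up example the target CT rises by three cycles while GAPDH is unchanged, which corresponds to roughly one eighth of the mock-transfected mRNA level.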
- dsRNAs corresponding to FASTK, caspase-3, p53, 14-3-3, p38, and 3-hydroxy-3-methylglutaryl-Coenzyme A synthase were generated as follows. Briefly, partial sequences of these mouse genes were cloned using RT-PCR and inserted into PCR4.0TOPO to serve as templates for in vitro transcription.
- The oligonucleotides used for PCR to generate the partial clone of FASTK (accession #NM_023229) are (SEQ ID NO: 19) GTCTCCACCACCCAGCTCCATG and (SEQ ID NO:20)
- The oligonucleotides used for PCR to generate the partial clone of caspase-3 are (SEQ ID NO:21) TGGAGAACAACAAAACCTCAGTGG and (SEQ ID NO:22)
- The oligonucleotides used for PCR to generate the partial clone of p53 are (SEQ ID NO:23) ACCTCACTGCATGGACGATCTG and (SEQ ID NO:24) GCAGTTCAGGGCAAAGGACTTC.
- The oligonucleotides used for PCR to generate the partial clone of 14-3-3 (accession #D87663) are (SEQ ID NO:25) CGGCAAATGGTTGAAACTGA and (SEQ ID NO:26) CCTGCAGCGCTTCTTTATTCT.
- The oligonucleotides used for PCR to generate the partial clone of p38 (accession NM_011951) are (SEQ ID NO:27) GCAGGAGAGGCCCACGTTCT and (SEQ ID NO:28) CATCATCAGTGTGCCGAGCCA.
- The oligonucleotides used for PCR to generate the partial clone of 3-hydroxy-3-methylglutaryl-Coenzyme A synthase 1 are (SEQ ID NO:28) CGTGGTATCTGGTCAGAGTGGA and (SEQ ID NO:29) GCCAGACCACAACAGGAAGCAT.
- All dsRNA used for transfections are blunt-ended.
- The primers used to quantify caspase-3 are: (SEQ ID NO:30) GTACGCGCACAAGCTAGAAT and (SEQ ID NO:31) AAAGTGGAGTCCAGGGAGAAG; for FASTK, the primers are (SEQ ID NO:32) GGTGGTCAAAGGTTGGAAGT and (SEQ ID NO:33) CCATTACGTGAGGAGTCAGTTC; for p53, the primers are (SEQ ID NO:34) GCGTAAACGCTTCGAGATG and (SEQ ID NO:35) AGTAGACTGGCCCTTCTTGGT; for synthase, the primers are (SEQ ID NO:36) CTGGCCAGTGGTAAATGTACTG and (SEQ ID NO:37) CTCTGCCTTTTGCTGTCAGA; for 14-3-3, the primers are (SEQ ID NO:38) CGCTGTGGACCTCAGACAT and (SEQ ID NO:39) GGGGTAGTCAGAGATGGTTTCT; for p38, the primers are
Applications Claiming Priority (5)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US10/027,807 US6841351B2 (en) | 1999-07-30 | 2001-10-19 | High-throughput transcriptome and functional validation analysis |
US27807 | 2001-10-19 | ||
US116437 | 2002-04-03 | ||
US10/116,437 US6924109B2 (en) | 1999-07-30 | 2002-04-03 | High-throughput transcriptome and functional validation analysis |
PCT/US2002/033425 WO2003033673A2 (fr) | 2001-10-19 | 2002-10-17 | High-throughput transcriptome and functional validation analysis |
Publications (2)
Publication Number | Publication Date |
---|---|
EP1444362A2 true EP1444362A2 (fr) | 2004-08-11 |
EP1444362A4 EP1444362A4 (fr) | 2006-06-21 |
Family
ID=26702900
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP02780490A Withdrawn EP1444362A4 (fr) | 2001-10-19 | 2002-10-17 | High-throughput transcriptome and functional validation analysis |
Country Status (4)
Country | Link |
---|---|
US (2) | US6924109B2 (fr) |
EP (1) | EP1444362A4 (fr) |
CA (1) | CA2461171A1 (fr) |
WO (1) | WO2003033673A2 (fr) |
Families Citing this family (14)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR101147147B1 (ko) | 2004-04-01 | 2012-05-25 | 머크 샤프 앤드 돔 코포레이션 | Modified polynucleotides for reducing off-target effects in RNA interference |
US20060166234A1 (en) | 2004-11-22 | 2006-07-27 | Barbara Robertson | Apparatus and system having dry control gene silencing compositions |
US7923207B2 (en) * | 2004-11-22 | 2011-04-12 | Dharmacon, Inc. | Apparatus and system having dry gene silencing pools |
US7935811B2 (en) | 2004-11-22 | 2011-05-03 | Dharmacon, Inc. | Apparatus and system having dry gene silencing compositions |
US7369898B1 (en) * | 2004-12-22 | 2008-05-06 | Pacesetter, Inc. | System and method for responding to pulsed gradient magnetic fields using an implantable medical device |
WO2007035962A2 (fr) * | 2005-09-23 | 2007-03-29 | California Institute Of Technology | Gene blocking method |
EP2081949B1 (fr) | 2006-09-22 | 2014-12-10 | GE Healthcare Dharmacon, Inc. | Tripartite oligonucleotide complexes and methods for gene silencing by RNA interference |
US7845686B2 (en) * | 2007-12-17 | 2010-12-07 | S & B Technical Products, Inc. | Restrained pipe joining system for plastic pipe |
US8188060B2 (en) | 2008-02-11 | 2012-05-29 | Dharmacon, Inc. | Duplex oligonucleotides with enhanced functionality in gene regulation |
GB0804690D0 (en) * | 2008-03-13 | 2008-04-16 | Netherlands Cancer Inst The | Method |
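TWI380992B (zh) | 2010-03-11 | 2013-01-01 | Univ Ishou | A small interfering RNA for inhibiting NMDA receptor NR1, a method of using the small interfering RNA to inhibit NMDA receptor NR1 in subcutaneous tissue, use of the small interfering RNA, and a pharmaceutical compound for relieving inflammatory skin pain |
TWI417109B (zh) | 2010-12-30 | 2013-12-01 | Univ Ishou | A short hairpin RNA for inhibiting NMDA receptor NR1, use of a short hairpin RNA in preparing an agent for inhibiting NMDA receptor NR1, and a drug for relieving inflammatory skin pain |
EP4367236A4 (fr) * | 2021-07-05 | 2025-05-21 | Heligenics, Inc. | System and method for high-throughput assay of cellular molecular functions |
CN116525008A (zh) * | 2023-04-28 | 2023-08-01 | 中国科学院软件研究所 | A high-performance gene alignment method and system for independently controllable heterogeneous many-core clusters |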
Family Cites Families (17)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6027876A (en) * | 1992-12-30 | 2000-02-22 | Kreitman; Martin | Method for monitoring pesticide resistance |
GB9323884D0 (en) * | 1993-11-19 | 1994-01-05 | Eisai London Res Lab Ltd | Physiological modulation |
US6251928B1 (en) * | 1994-03-16 | 2001-06-26 | Eli Lilly And Company | Treatment of alzheimer's disease employing inhibitors of cathepsin D |
US5801026A (en) * | 1994-09-26 | 1998-09-01 | Carnegie Institution Of Washington | Use of plant fatty acyl hydroxylases to produce hydroxylated fatty acids and derivatives in plants |
US5593837A (en) * | 1995-05-26 | 1997-01-14 | The Jackson Laboratory | Clinical disorders associated with carboxypeptidase E mutation |
US5858777A (en) * | 1995-09-08 | 1999-01-12 | Geron Corporation | Methods and reagents for regulating telomere length and telomerase activity |
US6077686A (en) * | 1996-02-29 | 2000-06-20 | Mount Sinai Hospital Corporation | Shc proteins |
US6074872A (en) * | 1996-05-15 | 2000-06-13 | The Scripps Research Institute | Cortistatin: nucleic acids that encode these neuropeptides |
US5846721A (en) * | 1996-09-19 | 1998-12-08 | The Trustees Of Columbia University In The City Of New York | Efficient and simpler method to construct normalized cDNA libraries with improved representations of full-length cDNAs |
US6197557B1 (en) * | 1997-03-05 | 2001-03-06 | The Regents Of The University Of Michigan | Compositions and methods for analysis of nucleic acids |
US6124091A (en) * | 1997-05-30 | 2000-09-26 | Research Corporation Technologies, Inc. | Cell growth-controlling oligonucleotides |
US6242253B1 (en) * | 1997-10-09 | 2001-06-05 | Regents Of The University Of California | IkB kinase, subunits thereof, and methods of using same |
US6506559B1 (en) * | 1997-12-23 | 2003-01-14 | Carnegie Institute Of Washington | Genetic inhibition by double-stranded RNA |
GB9827152D0 (en) * | 1998-07-03 | 1999-02-03 | Devgen Nv | Characterisation of gene function using double stranded rna inhibition |
US6135942A (en) * | 1998-11-30 | 2000-10-24 | Leptin; Maria | Nucleic acids proteins of a D. melanogaster insulin-like gene and uses thereof |
EP1147204A1 (fr) | 1999-01-28 | 2001-10-24 | Medical College Of Georgia Research Institute, Inc. | Composition and method for in vivo and in vitro attenuation of gene expression using double stranded RNA |
GB9927444D0 (en) * | 1999-11-19 | 2000-01-19 | Cancer Res Campaign Tech | Inhibiting gene expression |
2002
- 2002-04-03: US, application US10/116,437, publication US6924109B2 (en), Expired - Fee Related (not active)
- 2002-10-17: WO, application PCT/US2002/033425, publication WO2003033673A2 (fr), Application Discontinuation (not active)
- 2002-10-17: CA, application CA002461171A, publication CA2461171A1 (fr), Abandoned (not active)
- 2002-10-17: US, application US10/491,509, publication US20050053936A1 (en), Abandoned (not active)
- 2002-10-17: EP, application EP02780490A, publication EP1444362A4 (fr), Withdrawn (not active)
Also Published As
Publication number | Publication date |
---|---|
US6924109B2 (en) | 2005-08-02 |
WO2003033673A2 (fr) | 2003-04-24 |
US20050053936A1 (en) | 2005-03-10 |
WO2003033673A3 (fr) | 2003-11-13 |
CA2461171A1 (fr) | 2003-04-24 |
EP1444362A4 (fr) | 2006-06-21 |
US20030082570A1 (en) | 2003-05-01 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN100582117C (zh) | | Compositions for DNA-mediated gene silencing |
Heidersbach et al. | | microRNA-1 regulates sarcomere formation and suppresses smooth muscle gene expression in the mammalian heart |
US6924109B2 (en) | | High-throughput transcriptome and functional validation analysis |
EP1462525B1 (fr) | | siRNA expression system and method for producing functional gene knock-down cell or the like using the system |
EP2314687B1 (fr) | | Gene constructs permitting inducible expression of small RNA molecules for targeted gene silencing |
US20050197315A1 (en) | | siRNA expression system and method for producing functional gene knock-down cell using the system |
US20020150945A1 (en) | | Methods for making polynucleotide libraries, polynucleotide arrays, and cell libraries for high-throughput genomics analysis |
AU2634800A (en) | | Composition and method for in vivo and in vitro attenuation of gene expression using double stranded rna |
US20030143597A1 (en) | | Methods for making polynucleotide libraries, polynucleotide arrays, and cell libraries for high-throughput genomics analysis |
US20040235764A1 (en) | | Methods of inhibiting expression of a target gene in mammalian cells |
US7524653B2 (en) | | Small interfering RNA libraries and methods of synthesis and use |
US20060088837A1 (en) | | Expression system for stem-loop rna molecule having rnai effect |
US20120058917A1 (en) | | Nucleic Acids and Libraries |
US6841351B2 (en) | | High-throughput transcriptome and functional validation analysis |
AU2002343543A1 (en) | | High-throughput transcriptome and functional validation analysis |
US20130029876A1 (en) | | Nucleic Acids and Libraries |
US20050059019A1 (en) | | Gene-related RNAi transfection method |
US20040137490A1 (en) | | Methods for making polynucleotide libraries, polynucleotide arrays, and cell libraries for high-throughput genomics analysis |
WO2005068630A1 (fr) | | Interfering double-stranded RNA |
JP2002306183A (ja) | | Method for identifying gene function |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PUAI | Public reference made under Article 153(3) EPC to a published international application that has entered the European phase | Free format text: ORIGINAL CODE: 0009012 |
| 17P | Request for examination filed | Effective date: 20040518 |
| AK | Designated contracting states | Kind code of ref document: A2; Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR IE IT LI LU MC NL PT SE SK TR |
| AX | Request for extension of the European patent | Extension state: AL LT LV MK RO SI |
| A4 | Supplementary search report drawn up and despatched | Effective date: 20060522 |
| RIC1 | Information provided on IPC code assigned before grant | Ipc: C12N 15/11 20060101ALI20060516BHEP; Ipc: C12N 15/10 20060101ALI20060516BHEP; Ipc: C12Q 1/68 20060101AFI20040128BHEP |
| STAA | Information on the status of an EP patent application or granted EP patent | Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
| 18D | Application deemed to be withdrawn | Effective date: 20060819 |