CA2278645A1 - Characterization of the yeast transcriptome - Google Patents
- Publication number
- CA2278645A1
- Authority
- CA
- Canada
- Prior art keywords
- gene
- group
- norf
- phase
- yeast
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Abandoned
Classifications
- C—CHEMISTRY; METALLURGY
- C07—ORGANIC CHEMISTRY
- C07K—PEPTIDES
- C07K14/00—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof
- C07K14/37—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi
- C07K14/39—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts
- C07K14/395—Peptides having more than 20 amino acids; Gastrins; Somatostatins; Melanotropins; Derivatives thereof from fungi from yeasts from Saccharomyces
Landscapes
- Chemical & Material Sciences (AREA)
- Life Sciences & Earth Sciences (AREA)
- Organic Chemistry (AREA)
- Mycology (AREA)
- Health & Medical Sciences (AREA)
- General Health & Medical Sciences (AREA)
- Biochemistry (AREA)
- Biophysics (AREA)
- Gastroenterology & Hepatology (AREA)
- Genetics & Genomics (AREA)
- Medicinal Chemistry (AREA)
- Molecular Biology (AREA)
- Proteomics, Peptides & Aminoacids (AREA)
- Microbiology (AREA)
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
- Medicines Containing Material From Animals Or Micro-Organisms (AREA)
- Micro-Organisms Or Cultivation Processes Thereof (AREA)
Abstract
Yeast genes which are differentially expressed during the cell cycle are described. They can be used to study, affect, and monitor the cell cycle of a eukaryotic cell. They can be used to obtain human homologs involved in cell cycle regulation. They can be used to identify antifungal agents. They can be formed into arrays on solid supports for interrogation of a cell's transcriptome under various conditions.
Description
CHARACTERIZATION OF THE YEAST TRANSCRIPTOME
TECHNICAL FIELD OF THE INVENTION
This invention is related to the characterization of the expressed genes of the yeast genome. More particularly, it is related to the identification and use of previously unrecognized genes.
BACKGROUND OF THE INVENTION
It is by now axiomatic that the phenotype of an organism is largely determined by the genes expressed within it. These expressed genes can be represented by a "transcriptome", conveying the identity of each expressed gene and its level of expression for a defined population of cells. Unlike the genome, which is essentially a static entity, the transcriptome can be modulated by both external and internal factors. The transcriptome thereby serves as a dynamic link between an organism's genome and its physical characteristics.
The transcriptome as defined above has not been characterized in any eukaryotic or prokaryotic organism, largely because of technological limitations. However, some general features of gene expression patterns were elucidated two decades ago through RNA-DNA hybridization measurements (Bishop et al., 1974; Hereford and Rosbash, 1977). In many organisms, it was thus found that at least three classes of transcripts could be identified, with either high, medium, or low levels of expression, and the number of transcripts per cell was estimated (Lewin, 1980). These data of course provided little information about the specific genes that were members of each class. Data on the expression levels of individual genes have accumulated as new genes were discovered. However, in only a few instances have the absolute levels of expression of particular genes been measured and compared to other genes in the same cell type.
Description of any cell's transcriptome would therefore provide new information useful for understanding numerous aspects of cell biology and biochemistry.
SUMMARY OF THE INVENTION
It is an object of the present invention to provide genes which are involved in cell cycle progression.
It is another object of the present invention to provide methods of using the genes to affect the cell cycle.
It is an object of the present invention to provide methods for screening candidate antifungal drugs.
Another object of the invention is to provide a method for obtaining human homologs of the yeast genes which are involved in cell cycle progression.
Another object of the invention is to provide probes for ascertaining phase in the cell cycle of a cell.
These and other objects of the invention are achieved by providing the art with one or more of the embodiments described below. According to one embodiment of the invention an isolated DNA molecule is provided. It comprises a yeast gene which is involved in cell cycle progression selected from the group of NORF genes identified in Table 3 or 4.
According to another embodiment of the invention a method of using yeast genes is provided. The method is for affecting the cell cycle of a cell.
The method comprises the step of administering to a cell an isolated DNA molecule comprising a yeast gene which is involved in cell cycle progression selected from the differentially expressed genes identified in Tables 1, 2, 3 and 4.
In yet another embodiment of the invention a method for screening candidate antifungal drugs is provided. The method comprises the steps of: contacting a test substance with a yeast cell; and monitoring expression of a yeast gene which is involved in cell cycle progression selected from the group of yeast genes identified in Tables 1, 2, 3 and 4, wherein a test substance which modifies the expression of the yeast gene is a candidate antifungal drug.
In still another embodiment of the invention a method for identifying human genes which are involved in cell cycle progression is provided. The method comprises the step of hybridizing a probe comprising at least 14 contiguous nucleotides of a yeast gene which is differentially expressed between at least two phases selected from the group consisting of log phase, S phase, and G2/M phase, wherein the yeast gene is identified in Table 1, 2, 3, or 4.
Also provided by the present invention are isolated DNA molecules, which comprise probes for ascertaining phase in the cell cycle of a cell, wherein the probe comprises at least 14 contiguous nucleotides of a NORF gene as identified in Table 3 or 4.
These and other embodiments of the invention, which will be apparent to those of skill in the art upon reading the detailed disclosure provided below, make available to the art hitherto unrecognized genes, and information about the expression of genes globally at the organismal level. We provide the first description of a transcriptome, determined in S. cerevisiae cells.
This organism was chosen because it is widely used to clarify the biochemical and physiologic parameters underlying eukaryotic cellular functions and because it is the only eukaryote in which the entire genome has been defined at the nucleotide level (Goffeau, et al., 1996).
TECH1VICAL FIEhD OF THE INVENTION
This invention is related to the characterization of the expressed genes of the yeast genome. More particularly, it is related to the identification and use of previously unrecognized genes.
BACKGROUND OF THE ~~'VENTION
It is by now axiomatic that the phenotype of an organism is largely determined by the genes expressed within it. These expressed genes can be represented by a "transcriptome", conveying the identity of each expressed gene and its level of expression for a defined population of cells. Unlike the genome, which is essentially a static entity, the transcriptome can be modulated by both external and internal factors. The transcriptome thereby serves as a dynamic link between an organism's genome and its physical characteristics.
The transcriptome as defined above has not been characterized in any eukaryotic or prokaryotic organism, largely because of technological limitations. However, some general features of gene expression patterns were elucidated two decades ago through RNA-DNA hybridization measurements (Bishop et ai., 1974; Hereford and Rosbash, 1977). In many organisms, it was thus found that at least three classes of transcripts could be identified, with either high, medium, or low levels of expression, and the number of transcripts per cell were estimated (Lewin, 1980). These data of course provided little information about the specific genes that were members of each class. Data on the expression levels of individual genes have accumulated as new genes were discovered. However, in only a few instances have the absolute levels of expression of particular genes been measured and compared to other genes in the same cell type.
Description of any cell's transcriptome would therefore provide new information useful for understanding numerous aspects of cell biology and biochemistry.
gTTMNLARY OF THE INVENTION
It is an object of the present invention to provide genes which are involved in cell cycle progression.
It is another object of the present invention to provide methods of using the genes to affect the cell cycle.
It is an object of the present invention to provide methods for screening candidate antifungal drugs.
Another object of the invention is to provide a method for obtaining human homologs of the yeast genes which are involved in cell cycle progression.
Another object of the invention is to provide probes for ascertaining phase in the cell cycle of a cell.
These and other objects of the invention are achieved by providing the art with one or more of the embodiments described below. According to one embodiment of the invention an isolated DNA molecule is provided. It comprises a yeast gene which is involved in cell cycle progression selected from the group of NORF genes identified in Table 3 or 4.
According to another embodiment of the invention a method of using yeast genes is provided. The method is for affecting the cell cycle of a cell.
The method comprises the step of administering to a cell an isolated DNA molecule comprising a ._ . ... . r r . .__. . .
yeast gene which is involved in cell cycle progression selected from the differentially expressed genes identified in Tables 1, 2, 3 and 4.
In yet another embodiment of the invention a method for screening candidate antifungal drugs is provided. The method comprises the steps of S contacting a test substance with a yeast cell;
monitoring expression of a yeast gene which is involved in cell cycle progression selected from the group of yeast genes identified in Tables 1, 2, 3 and 4, wherein a test substance which modifies the expression of the yeast gene is a candidate antifungal drug.
In still another embodiment of the invention a method for identifying human genes which are involved in cell cycle progression is provided. The method comprises the step of hybridizing a probe comprising at least 14 contiguous nucleotides of a yeast gene which is differentially expressed between at least two phases selected from the group consisting of log phase, S phase, and G2/M phase, wherein the yeast gene is identified in Table 1, 2, 3, or 4.
Also provided by the present invention are isolated DNA molecules, which comprise probes for ascertaining phase in the cell cycle of a cell, wherein the probe comprises at least 14 contiguous nucleotides of a NORF gene as identified in Table 3 or 4.
These and other embodiments of the invention, which will be apparent to those of skill in the art upon reading the detailed disclosure provided below, make available to the art hitherto unrecognized genes, and information about the expression of genes globally at the organismal level. We provide the first description of a transcriptome, determined in S. cerevisiae cells.
This organism was chosen because it is widely used to clarify the biochemical and physiologic parameters underlying eukaryotic cellular functions and because it is the only eukaryote in which the entire genome has been defined at the nucleotide level (Goffeau, et al., 1996).
WO 98/32847 PCT/US98/01216
BRIEF DESCRIPTION OF THE DRAWINGS
Figure 1. Schematic of SAGE Method and Genome Analysis.
In applying SAGE to the analysis of yeast gene expression patterns, the 3' most NlaIII site was used to define a unique position in each transcript and to provide a site for ligation of a linker with a BsmFI site. The type IIS enzyme BsmFI, which cleaves a defined distance from its non-palindromic recognition site, was then used to generate a 15 bp SAGE tag (designated by the black arrows), which includes the NlaIII site. Automated sequencing of concatenated SAGE tags allowed the routine identification of about a thousand tags per sequencing gel. Once sequenced, the abundance of each SAGE tag was calculated, and each tag was used to search the entire yeast genome to identify its corresponding gene. The lower panel shows a small region of Chromosome 15. Gray arrows indicate all potential SAGE tags (NlaIII sites) and black arrows indicate 3' most SAGE tags. The total number of tags observed for each potential tag is indicated above (+ strand) or below (- strand) the tag. As expected, the observed SAGE tags were associated with the 3' end of expressed genes.
Figure 2. Sampling of Yeast Gene Expression.
Analysis of increasing amounts of ascertained tags reveals a plateau in the number of unique expressed genes. Triangles represent genes with known functions, squares represent genes predicted on the basis of sequence information, and circles represent total genes.
Figure 3. Virtual Rot.
(a) Abundance Classes in the Yeast Transcriptome. The transcript abundance is plotted in reverse order on the abscissa, whereas the fraction of total transcripts with at least that abundance is plotted on the ordinate. The dotted lines identify the three components of the curve, 1, 2, and 3. This is analogous to a Rot curve derived from reassociation kinetics where the product of initial RNA concentration and time is plotted on the abscissa, and T , the percent of labeled cDNA that hybridizes to excess mRNA is plotted on the ordinate.
(b) Comparison of Virtual Rot and Rot Components. Transitions and data from virtual Rot components were calculated from the data in Figure 3A, while data for Rot components were obtained from Hereford and Rosbash, 1977.
Figure 4. Chromosomal Expression Map for S. cerevisiae. Individual yeast genes were positioned on each chromosome according to their open reading frame (ORF) start coordinates. Abundance levels of tags corresponding to each gene are displayed on the vertical axis, with transcription from the + strand indicated above the abscissa and that from the - strand indicated below. Yellow bands at ends of the expanded chromosome represent telomeric regions that are undertranscribed (see text for details).
Figure 5. Northern Blot Analysis of Representative Genes. TDH2/3, TEF1/2, and NORF1 are expressed relatively equally in all three states (lane 1, G2/M arrested; lane 2, S phase arrested; lane 3, log phase), while RNR4, RNR2, and NORF5 are highly expressed in S-phase arrested cells. The expression level observed by SAGE (number of tags) is noted below each lane and was highly correlated with quantitation of the Northern blot by PhosphorImager analysis (r² = 0.97).
Table Legends
Table 1. Highly Expressed Genes
Tag represents the 10 bp SAGE tag adjacent to the NlaIII site; Gene represents the gene or genes corresponding to a particular tag (multiple genes that match unique tags are from related families, with an average identity of 93%); Locus and Description denote the locus name and functional description of each ORF, respectively; Copies/cell represents the abundance of each transcript in the SAGE library, assuming 15,000 total transcripts per cell and 60,633 ascertained transcripts.
Table 2. Expression of Putative Coding Sequences
Table columns are the same as for Table 1.
Table 3. Expression of NORF genes
SAGE Tag, Locus, and Copies/cell are the same as for Table 1; Chr and Tag Pos denote the chromosome and position of each tag; ORF Size denotes the size of the ORF corresponding to the indicated tag. In each case, the tag was located within or less than 250 bp 3' of the NORF.
DETAILED DESCRIPTION
It is a discovery of the present invention that certain hitherto unknown genes (the NORFs) exist and are expressed in yeast. These genes, as well as other previously identified and previously postulated genes, can be used to study, monitor, and affect phase of cell cycle. The present invention provides information on which genes are differentially expressed during the cell cycle.
Differentially expressed genes can be used as markers of phases of the cell cycle. They can also be used to effect a change in the phase of the cell cycle.
In addition, they can be used to screen for drugs which affect the cell cycle, by affecting expression of the genes. Human homologs of these eukaryotic genes are also presumed to exist, and can be identified using the yeast genes as probes or primers to identify the human homologs.
New genes termed NORFs (not previously assigned open reading frames) have been found. They are uniquely identified by their SAGE tags.
In addition their entire nucleotide sequence is known and publicly available.
In general, these were not previously identified as genes due to their small size. However, they have now been found to be expressed.
Differentially expressed yeast genes are those whose expression varies by a statistically significant difference (to greater than 95% confidence level) within different growth phases, particularly log phase, S phase, and G2/M.
Preferably the difference is greater than 10%, 25%, 50%, or 100%. The genes which have been found to have such differential expression characteristics are: NORF N° 1, 2, 4, 5, 6, 17, 25, 27, TEF1/TEF2, ENO2, ADH1, ADH2, PGK1, CUP1A/CUP1B, PYK1, YKL056C, YMR116C, YEL033W, YOR182C, YCR013C, ribonucleotide reductase 2 and 4, and YJR085C.
The DNA molecules according to the invention can be genomic or cDNA. Preferably they are isolated free of other cellular components such as membrane components, proteins, and lipids. They can be made by a cell and isolated, or synthesized using PCR or an automatic synthesizer. Any technique for obtaining a DNA of known sequence may be used. Methods for purifying and isolating DNA are routine and are known in the art.
To administer yeast genes to cells, any DNA delivery technique known in the art may be used, without limitation. These include liposomes, transfection, transduction, transformation, viral infection, and electroporation.
Vectors for particular purposes and characteristics can be selected by the skilled artisan for their known properties. Cells which can be used as gene recipients are yeast and other fungi, mammalian cells, including humans, and bacterial cells.
Antifungal drugs can be identified using yeast cells as described herein.
Expression of a differentially expressed gene can be monitored by any means known in the art. When a test substance affects the expression of such a differentially expressed gene, it is a candidate drug for affecting the growth properties of fungi, and may be useful as an antifungal agent.
Because differentially expressed genes are likely to be involved in cell cycle progression, it is likely that these genes are conserved among species.
The differentially expressed genes identified by the present invention can be used to identify homologs in humans and other mammals. Means for identifying homologous genes among different species are well known in the art. Briefly, stringency of hybridization can be reduced so that imperfectly matching sequences hybridize. This can be in the context of inter alia Southern blots, Northern blots, colony hybridization or PCR. Any hybridization technique which is known in the art can be used.
Probes according to the present invention are isolated DNA molecules which have at least 10, and preferably at least 12, 14, 16, 18, 20, or 25 contiguous nucleotides of a particular NORF gene or other differentially expressed gene. The probes may or may not be labeled. They may be used as primers for PCR or for Southern or Northern blots. Preferably the probes are anchored to a solid support. More preferably they are present on an array so that multiple probes can simultaneously hybridize to a single biological sample. The probes can be spotted onto the array or synthesized in situ on the array. See Lockhart et al., Nature Biotechnology, Vol. 14, December 1996, "Expression monitoring by hybridization to high-density oligonucleotide arrays." A single array can contain more than 100, 500 or even 1,000 different probes in discrete locations.
The above disclosure generally describes the present invention. A more complete understanding can be obtained by reference to the following specific examples which are provided herein for purposes of illustration only, and are not intended to limit the scope of the invention.
Summary
We have analyzed the set of genes expressed from the yeast genome, herein called the transcriptome, using serial analysis of gene expression (SAGE).
Analysis of 60,633 transcripts revealed 4,665 genes, with expression levels ranging from 0.3 to over 200 transcripts per cell. Of these genes, 1,981 had known functions, while 2,684 were previously uncharacterized. Integration of positional information with gene expression data allowed the generation of chromosomal expression maps, identifying physical regions of transcriptional activity, and identified genes that had not been predicted by sequence information alone. These studies provide insight into global patterns of gene expression in yeast and demonstrate the feasibility of genome-wide expression studies in eukaryotes.
Results
Characteristics and Rationale of SAGE Approach
Several methods have recently been described for the high throughput evaluation of gene expression (Nguyen et al., 1995; Schena et al., 1995; Velculescu et al., 1995). We used SAGE (Serial Analysis of Gene Expression) because it can provide quantitative gene expression data without the prerequisite of a hybridization probe for each transcript. The SAGE technology is based on two basic principles (Figure 1). First, a short sequence tag (9-11 bp) contains sufficient information to uniquely identify a transcript, provided that it is derived from a defined location within that transcript. Second, many transcript tags can be concatenated into a single molecule and then sequenced, revealing the identity of multiple tags simultaneously. The expression pattern of any population of transcripts can be quantitatively evaluated by determining the abundance of individual tags and identifying the gene corresponding to each tag.
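The first principle can be illustrated with a minimal Python sketch. It assumes that NlaIII recognizes the sequence CATG, that the tag is the 10 bases immediately 3' of the 3'-most site (following the legend to Table 1), and that transcripts are supplied as plain strings; the function names are hypothetical and are not part of the SAGE protocol.

```python
from collections import Counter

ANCHOR_SITE = "CATG"   # NlaIII recognition sequence (assumed here)
TAG_LENGTH = 10        # bases kept 3' of the anchoring site, per the Table 1 legend


def extract_sage_tag(transcript, tag_length=TAG_LENGTH):
    """Return the tag adjacent to the 3'-most NlaIII site, or None.

    None is returned when the transcript has no NlaIII site or when fewer
    than tag_length bases follow the 3'-most site.
    """
    seq = transcript.upper()
    site = seq.rfind(ANCHOR_SITE)          # rfind gives the 3'-most occurrence
    if site == -1:
        return None
    start = site + len(ANCHOR_SITE)
    tag = seq[start:start + tag_length]
    return tag if len(tag) == tag_length else None


def count_tags(transcripts):
    """Tally tags over a collection of transcripts, skipping untaggable ones."""
    tags = (extract_sage_tag(t) for t in transcripts)
    return Counter(tag for tag in tags if tag is not None)
```

Tallying the tags from a sequenced library in this way gives the per-tag abundances that are then matched against the genome.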
Genome-wide expression
In order to maximize representation of genes involved in normal growth and cell-cycle progression, SAGE libraries were generated from yeast cells in three states: log phase, S phase arrested and G2/M phase arrested. In total, SAGE tags corresponding to 60,633 total transcripts were identified (including 20,184 from log phase, 20,034 from S phase arrested, and 20,415 from G2/M phase arrested cells). Of these tags, 56,291 tags (93%) precisely matched the yeast genome, 88 tags matched the mitochondrial genome, and 91 tags matched the 2 micron plasmid.
The number of SAGE tags required to define a yeast transcriptome depends on the confidence level desired for detecting low abundance mRNA molecules. Assuming the previously derived estimate of 15,000 mRNA molecules per cell (Hereford and Rosbash, 1977), 20,000 tags would represent a 1.3 fold coverage even for mRNA molecules present at a single copy per cell, and would provide a 72% probability of detecting such transcripts (as determined by Monte Carlo simulations). Analysis of 20,184 tags from log phase cells identified 3,298 unique genes. As an independent confirmation of mRNA copy number per cell, we compared the expression level of SUP44/RPS4, one of the few genes whose absolute mRNA levels have been reliably determined by quantitative hybridization experiments (Iyer and Struhl, 1996), with expression levels determined by SAGE. SUP44/RPS4 was measured by hybridization at 75 +/- 10 copies/cell (Iyer and Struhl, 1996), in good accord with the SAGE data of 63 copies/cell, suggesting that the estimate of 15,000 mRNA molecules per cell was reasonably accurate. Analysis of SAGE tags from S phase arrested and G2/M phase arrested cells revealed similar expression levels for this gene (range to 55 copies/cell), as well as for the vast majority of expressed genes. As less than 1% of the genes were expressed at dramatically different levels among these three states (see below), SAGE tags obtained from all libraries were combined and used to analyze global patterns of gene expression.
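The conversion from a raw tag count to a copies-per-cell estimate follows from the two figures stated in the text (15,000 mRNA molecules per cell and 60,633 ascertained transcripts). The Python sketch below is illustrative only; the function name and the default arguments are assumptions introduced here.

```python
TRANSCRIPTS_PER_CELL = 15_000   # estimate from Hereford and Rosbash, 1977
TOTAL_TAGS = 60_633             # tags ascertained across the three libraries


def copies_per_cell(tag_count,
                    total_tags=TOTAL_TAGS,
                    transcripts_per_cell=TRANSCRIPTS_PER_CELL):
    """Scale a tag count to an estimated number of mRNA copies per cell."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return tag_count * transcripts_per_cell / total_tags
```

Under these assumptions a tag observed once corresponds to roughly 0.25 copies per cell, which is close to the lower bound of 0.3 copies per cell quoted in the text.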
Analysis of ascertained tags at increasing increments revealed that the number of unique transcripts plateaued at 60,000 tags (Figure 2). This suggested that generation of further SAGE tags would yield few additional genes, consistent with the fact that sixty thousand transcripts represented a four-fold redundancy for genes expressed as low as 1 transcript per cell. Likewise, Monte Carlo simulations indicated that analysis of 60,000 tags would identify at least one tag for a given transcript 97% of the time if its expression level was one copy per cell.
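The detection probabilities can be approximated in closed form. The sketch below is not the Monte Carlo simulation referred to in the text; it assumes instead that each tag is an independent draw from a pool of 15,000 transcripts, so that the chance of missing a given transcript in every draw is (1 - copies/15,000) raised to the number of tags. The function name is hypothetical.

```python
def detection_probability(n_tags, copies=1.0, transcripts_per_cell=15_000):
    """Probability of seeing at least one tag from a transcript.

    Assumes independent sampling with replacement, which is a
    simplification of the simulation described in the text.
    """
    p_single_draw = copies / transcripts_per_cell
    return 1.0 - (1.0 - p_single_draw) ** n_tags


p_20k = detection_probability(20_000)   # about 0.74 under this model
p_60k = detection_probability(60_000)   # about 0.98 under this model
```

These values are close to, but not identical with, the 72% and 97% figures reported from the simulations; the small difference presumably reflects details of the simulation that are not given here.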
The 56,291 tags that precisely matched the yeast genome represented 4,665 different genes. This number is in agreement with the estimate of 3,000 to 4,000 expressed genes obtained by RNA-DNA reassociation kinetics (Hereford and Rosbash, 1977). These expressed genes included 85% of the genes with characterized functions (1,981 of 2,340), and 76% of the total genes predicted from analysis of the yeast genome (4,665 of 6,121). These numbers are consistent with a relatively complete sampling of the yeast transcriptome given the limited number of physiological states examined and the large number of genes predicted solely on the basis of genomic sequence analysis.
The transcript expression per gene was observed to vary from 0.3 to over 200 copies per cell. Analysis of the distribution of gene expression levels revealed several abundance classes that were similar to those observed in previous studies using reassociation kinetics. A "virtual Rot" of the genes observed by SAGE (Figure 3A) identified three main components of the transcriptome with abundances ranging over three orders of magnitude. A Rot curve derived from RNA-cDNA reassociation kinetics also contained three main components distributed over a similar range of abundances (Hereford and Rosbash, 1977). Although the kinetics of reassociation of a particular class of RNA and cDNA may be affected by numerous experimental variables, there were striking similarities between Rot and virtual Rot analyses (Figure 3B). Because Rot analysis may not detect all transcripts of low abundance (Lewin, 1980), it is not surprising that SAGE revealed both a larger total number of expressed genes and a higher fraction of the transcriptome belonging to the low abundance transcript class.
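A virtual Rot curve of the kind described in the legend to Figure 3 can be sketched as a cumulative distribution: for each abundance level, taken from highest to lowest, the fraction of all transcripts contributed by genes with at least that abundance. The Python below is an illustrative reconstruction from that description; the function name and the input format (one copies-per-cell value per gene) are assumptions.

```python
from collections import Counter


def virtual_rot(copies_per_gene):
    """Return [(abundance, cumulative fraction of transcripts), ...].

    Abundance levels are listed in decreasing order; each fraction is the
    share of total transcripts from genes at or above that abundance.
    """
    total = sum(copies_per_gene)
    if total <= 0:
        raise ValueError("total transcript abundance must be positive")
    genes_at_level = Counter(copies_per_gene)
    curve = []
    cumulative = 0.0
    for level in sorted(genes_at_level, reverse=True):
        cumulative += level * genes_at_level[level]
        curve.append((level, cumulative / total))
    return curve
```

Inflection points in such a curve would correspond to the transitions between abundance classes.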
Integration of Expression Information with the Genomic Map
The SAGE expression data could be integrated with existing positional information to generate chromosomal expression maps (Figure 4). These maps were generated using the sequence of the yeast genome and the position coordinates of ORFs obtained from the Stanford Yeast Genome Database.
Although there were a few genes that were noted to be physically proximal and have similarly high levels of expression, there did not appear to be any clusters of particularly high or low expression on any chromosome. Genes like histones H3 and H4, which are known to have coregulated divergent promoters and are immediately adjacent on chromosome 14 (Smith and Murray, 1983), had very similar expression levels (5 and 6 copies per cell, respectively). The distribution of transcripts among the chromosomes suggested that overall transcription was evenly dispersed, with total transcript levels being roughly linearly related to chromosome size (r² = 0.85, data not shown). However, regions within 10 kb of telomeres appeared to be uniformly undertranscribed, containing on average 3.2 tags per gene as compared with 12.4 tags per gene for non-telomeric regions (Figure 4). This is consistent with the previously described observations of "telomeric silencing" in yeast (Gottschling et al., 1990). Recent studies have reported telomeric position effects as far as 4 kb from telomere ends (Renauld et al., 1993).
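The telomeric comparison can be sketched as follows, assuming each gene is described by its chromosome, its ORF start coordinate, and its tag count, and that a gene is called telomeric when its start lies within 10 kb of either chromosome end. The data structures and the function name are hypothetical, and the exact criterion used to produce the figures above is not stated in the text.

```python
TELOMERE_WINDOW_BP = 10_000   # window stated in the text


def mean_tags_per_gene(genes, chrom_lengths, window=TELOMERE_WINDOW_BP):
    """Return (telomeric_mean, non_telomeric_mean) of tags per gene.

    genes: iterable of (chromosome, orf_start, tag_count) tuples.
    chrom_lengths: mapping from chromosome name to length in bp.
    A mean is None when no gene falls in that class.
    """
    totals = {True: [0, 0], False: [0, 0]}   # is_telomeric -> [tag sum, gene count]
    for chrom, start, tags in genes:
        length = chrom_lengths[chrom]
        near_end = start <= window or start > length - window
        totals[near_end][0] += tags
        totals[near_end][1] += 1

    def mean(pair):
        return pair[0] / pair[1] if pair[1] else None

    return mean(totals[True]), mean(totals[False])
```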
Gene Expression Patterns
Table 1 lists the 30 most highly expressed genes, all of which are expressed at greater than 60 mRNA copies per cell. As expected, these genes mostly correspond to well characterized enzymes involved in energy metabolism and protein synthesis and were expressed at similar levels in all three growth states (Examples in Figure 5). Some of these genes, including ENO2 (McAlister and Holland, 1982), PDC1 (Schmitt et al., 1983), PGK1 (Chambers et al., 1989), PYK1 (Nishizawa et al., 1989), and ADH1 (Denis et al., 1983), are known to be dramatically induced in the glucose-rich growth conditions used in this study. In contrast, glucose repressible genes such as the GAL1/GAL7/GAL10 cluster (St John and Davis, 1979), and GAL3 (Bajwa et al., 1988) were observed to be expressed at very low levels (0.3 or fewer copies per cell). As expected for the yeast strain used in this study, mating type a specific genes, such as the a factor genes (MFA1, MFA2) (Michaelis and Herskowitz, 1988), and alpha factor receptor (STE2) (Burkholder and Hartwell, 1985) were all observed to be expressed at significant levels (range 2 to 10 copies per cell), while mating type alpha specific genes (MFα1, MFα2, STE3) (Hagen et al., 1986; Kurjan and Herskowitz, 1982; Singh et al., 1983) were observed to be expressed at very low levels (<0.3 copies/cell).
Three of the highly expressed genes in Table 1 had not been previously characterized. One contained an ORF with predicted ribosomal function, previously identified only by genomic sequence analysis. Analyses of all SAGE data suggested that there were 2,684 such genes corresponding to uncharacterized ORFs which were transcribed at detectable levels. The 30 most abundant of these transcripts were observed more than 30 times, corresponding to at least 8 transcripts per cell (Table 2). The other two highly expressed uncharacterized genes corresponded to ORFs not predicted by analysis of the yeast genome sequence (NORF = Nonannotated ORF). Analyses of SAGE data suggested that there were approximately 160 NORF genes transcribed at detectable levels. The 30 most abundant of these transcripts were observed at least 9 times (Table 3 and examples in Figure 5).
Interestingly, one of the NORF genes (NORF5) was only expressed in S phase arrested cells and corresponded to the transcript whose abundance varied the most in the three states analyzed (> 49 fold, Figure 5).
Comparison of S phase arrested cells to the other states also identified greater than 9 fold elevation of the RNR2 and RNR4 transcripts (Figure 5). Induction of these ribonucleotide reductase genes is likely to be due to the hydroxyurea treatment used to arrest cells in S phase (Elledge and Davis, 1989).
Likewise, comparison of G2/M arrested cells identified elevation of RBL2 and dynein light chain, both microtubule associated proteins (Archer et al., 1995; Dick et al., 1996). As with the RNR inductions, these elevated levels seem likely to be related to the nocodazole treatment used to arrest cells in the G2/M phase. While there were many relatively small differences between the states (for example, NORF1, Figure 5), overall comparison of the three states revealed surprisingly few dramatic differences; there were only 29 transcripts whose abundance varied more than 10 fold among the three different states analyzed.
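A fold-variation screen across the three libraries might be sketched as below. The library sizes are those stated in the text. The normalization to a common library size and the substitution of one tag for a zero count are assumptions made here so that a ratio can be computed; they are not taken from the text, and the sketch does not address statistical significance.

```python
LIBRARY_SIZES = {"log": 20_184, "S": 20_034, "G2/M": 20_415}   # tags per library
COMMON_SIZE = 20_000   # arbitrary common scale (assumption)


def max_fold_change(counts, library_sizes=LIBRARY_SIZES, floor=1.0):
    """Largest ratio between normalized tag counts in any two states.

    counts: mapping from state name to raw tag count for one transcript.
    Counts below `floor` are raised to `floor` before normalization
    (hypothetical convention to avoid division by zero).
    """
    normalized = [
        max(counts.get(state, 0), floor) * COMMON_SIZE / size
        for state, size in library_sizes.items()
    ]
    return max(normalized) / min(normalized)


def varies_more_than(counts, threshold=10.0):
    """True when the transcript varies more than `threshold` fold."""
    return max_fold_change(counts) > threshold
```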
Discussion
Analysis of a yeast transcriptome affords a unique view of the RNA components defining cellular life. We observed gene expression levels to vary over three orders of magnitude, with the transcripts involved in energy metabolism and protein synthesis the most highly expressed. Key transcripts, such as those encoding enzymes required for DNA replication (e.g. POL1 and POL3), kinetochore proteins (NDC10 and SKP1), and many other interesting proteins, were present at 1 or fewer copies per cell on average. These abundances are consistent with previous qualitative data from reassociation kinetics which suggested that the largest number of expressed genes was present at 1 or 2 copies per cell. These observations indicate that low transcript copy numbers are sufficient for gene expression in yeast, and suggest that yeast possess a mechanism for rigid control of RNA abundance.
The synthesis of chromosomal expression maps presents a cataloging of the expression level of genes, organized by their genomic positions. It is not surprising that gene expression is well balanced throughout the 16 chromosomes of S. cerevisiae. As most genes have independent regulatory elements, it would have been surprising to find a large number of physically adjacent genes that had similar high levels of expression. Of the few genes that were known to have coregulated divergent promoters, like the H3/H4 pair, SAGE data confirmed concordant levels of expression. For areas like telomere ends that are known to be transcriptionally suppressed, SAGE data corroborated low levels of expression. Other expected expression patterns, such as high levels of glucose induced glycolytic enzymes, low levels of glucose repressed GAL genes, expression of mating type a specific genes, and low levels of expression of mating type alpha genes, were observed. Finally, identification of tags corresponding to NORF genes suggests that there is a significant number of small proteins encoded by the yeast genome that were undetected by the criteria used for systematic sequence analysis. The yeast genome sequence has been annotated for all ORFs larger than 300 bp (encoding proteins of 100 amino acids or greater). Genes encoding proteins below this cutoff are therefore commonly unannotated. This class of genes might also be underrepresented in mutational collections because of the small target size for mutagenesis, and given their small size, may encode proteins with novel functions. The systematic knockout of these NORF genes will therefore be of great interest.
Comparison of gene expression patterns from altered physiologic states can provide insight into genes that are important in a variety of processes. Comparison of transcriptomes from a variety of physiologic states should provide a minimum set of genes whose expression is required for normal vegetative growth, and another set composed of genes that will be expressed only in response to specific environmental stimuli, or during specialized processes. For example, recent work has defined a minimal set of 250 genes required for prokaryotic cellular life (Mushegian and Koonin, 1996). Examination of the yeast genome readily identified homologous genes for 196 of these, over 90% of which were observed to be expressed in the SAGE analysis. Detailed analyses of yeast transcriptomes, as well as transcriptomes from other organisms, should ultimately allow the generation of a minimal set of genes required for eukaryotic life.
Like other genome-wide analyses, SAGE analysis of yeast transcriptomes has several potential limitations. First, a small number of transcripts would be expected to lack an NlaIII site and therefore would not be detected by our analysis. Second, our analysis was limited to transcripts found at least as frequently as 0.3 copies per cell. Transcripts expressed in only a minute fraction of the cell cycle, or transcripts expressed in only a fraction of the cell population, would not be reliably detected by our analysis.
Finally, mRNA sequence data are practically unavailable for yeast, and consequently, some SAGE tags cannot be unambiguously matched to corresponding genes. Tags which were derived from overlapping genes, or genes which have unusually long 3' untranslated regions may be misassigned.
Increased availability of 3' UTR sequences in yeast mRNA molecules should help to resolve the ambiguities.
Despite these potential limitations, it is clear that the analyses described here furnish both global and local pictures of gene expression, precisely defined at the nucleotide level. These data, like the sequence of the yeast genome itself, provide simple, basic information integral to the interpretation of many experiments in the future. The availability of mRNA sequence information from EST sequencing, as well as various genome projects, will soon allow definition of transcriptomes from a variety of organisms, including human. The data recorded here suggest that a reasonably complete picture of a human cell transcriptome will require only about 10- to 20-fold more tags than evaluated here, a number well within the practical realm achievable with a small number of automated sequencers. The analysis of global expression patterns in higher eukaryotes is expected, in general, to be similar to those reported here for S. cerevisiae. However, the analysis of the transcriptome in different cells and from different individuals should yield a wealth of information regarding gene function in normal, developmental, and disease states.
Experimental Procedures
Yeast cell culture
The source of transcripts for all experiments was S. cerevisiae strain YPH499 (MATa ura3-52 lys2-801 ade2-101 leu2-Δ1 his3-Δ200 trp1-Δ63) (Sikorski and Hieter, 1989). Logarithmically growing cells were obtained by growing yeast cells to early log phase (3 x 10^6 cells/ml) in YPD (Rose et al., 1990) rich medium (YPD supplemented with 6 mM uracil, 4.8 mM adenine and 24 mM tryptophan) at 30°C. For arrest in the G1/S phase of the cell cycle, hydroxyurea (0.1 M) was added to early log phase cells, and the culture was incubated an additional 3.5 hours at 30°C. For arrest in the G2/M phase of the cell cycle, nocodazole (15 µg/ml) was added to early log phase cells and the culture was incubated for an additional 100 minutes at 30°C. Harvested cells were washed once with water prior to freezing at -70°C. The growth states of the harvested cells were confirmed by microscopic and flow cytometric analyses (Basrai et al., 1996).
RNA isolation and Northern Blot Analysis
Total yeast RNA was prepared using the hot phenol method as described (Leeds et al., 1991). mRNA was obtained using the MessageMaker Kit (GibcoBRL) following the manufacturer's protocol. Northern blot analysis was performed as described (El-Deiry et al., 1993), using probes PCR amplified from yeast genomic DNA.
SAGE protocol
The SAGE method was performed as previously described (Velculescu et al., 1995), with exceptions noted below. PolyA RNA was converted to double-stranded cDNA with a BRL synthesis kit using the manufacturer's protocol except for the inclusion of primer biotin-5'-T18-3'. The cDNA was cleaved with NlaIII (Anchoring Enzyme). As NlaIII sites were observed to occur once every 309 base pairs in three arbitrarily chosen yeast chromosomes (1, 5, 10), 95% of yeast transcripts were predicted to be detectable with a NlaIII-based SAGE approach. After capture of the 3' cDNA fragments on streptavidin coated magnetic beads (Dynal), the bound cDNA was divided into two pools, and one of the following linkers containing recognition sites for BsmFI was ligated to each pool:
Linker 1: 5'-TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACATG-3' (SEQ ID NO:1) and 5'-TCCCTATTAAGCCTAGTTGTACTGCACCAGCAAATCC[amino mod. C7]-3' (SEQ ID NO:2);
Linker 2: 5'-TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACATG-3' (SEQ ID NO:3) and 5'-TCCCCGTACATCGTTAGAAGCTTGAATTCGAGCAG[amino mod. C7]-3' (SEQ ID NO:4).
As BsmFI (Tagging Enzyme) cleaves 14 bp away from its recognition site, and the NlaIII site overlaps the BsmFI site by 1 bp, a 15 bp SAGE tag was released with BsmFI. SAGE tag overhangs were filled in with Klenow, and tags from the two pools were combined and ligated to each other. The ligation product was diluted and then amplified by PCR for 28 cycles with 5'-GGATTTGCTGGTGCAGTACA-3' (SEQ ID NO:5) and 5'-CTGCTCGAATTCAAGCTTCT-3' (SEQ ID NO:6) as primers. The PCR product was analyzed by polyacrylamide gel electrophoresis (PAGE), and the PCR product containing two tags ligated tail to tail (ditag) was excised. The PCR product was then cleaved with NlaIII, and the band containing the ditags was excised and self-ligated. After ligation, the concatenated products were separated by PAGE and products between 500 bp and 2 kb were excised.
These products were cloned into the SphI site of pZero (Invitrogen).
Colonies were screened for inserts by PCR with M13 forward and M13 reverse sequences located outside the cloning site as primers.
PCR products from selected clones were sequenced with the TaqFS DyePrimer kits (Perkin Elmer) and analyzed using a 377 ABI automated sequencer (Perkin Elmer), following the manufacturer's protocol. Each successful sequencing reaction identified an average of 26 tags; given a 90% sequencing reaction success rate, this corresponded to an average of about 850 tags per sequencing gel.
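As a rough illustration of how tags can be read back out of a sequenced concatemer, the following Python sketch splits a clone sequence at the NlaIII site (CATG), keeps fragments of ditag size, and takes one 10 bp tag from each end. The 20 to 24 bp ditag window, the function names, and the rule of treating a ditag and its reverse complement as the same molecule are assumptions made for illustration; this is not the SAGE program group itself.

```python
from collections import Counter

ANCHOR = "CATG"                 # NlaIII recognition site
TAG_LEN = 10                    # bases kept per tag, adjacent to the anchor
MIN_DITAG, MAX_DITAG = 20, 24   # assumed size window for a valid ditag

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def extract_ditags(concatemer: str) -> list[str]:
    """Return fragments lying between two anchor sites that have ditag size."""
    pieces = concatemer.upper().split(ANCHOR)
    # The first and last pieces are flanking sequence, not bounded by two sites.
    return [p for p in pieces[1:-1] if MIN_DITAG <= len(p) <= MAX_DITAG]


def count_tags(concatemers: list[str]) -> Counter:
    """Count tags across clones, using each distinct ditag only once."""
    seen_ditags = set()
    counts = Counter()
    for clone in concatemers:
        for ditag in extract_ditags(clone):
            # A ditag read from either strand is the same molecule.
            key = min(ditag, reverse_complement(ditag))
            if key in seen_ditags:
                continue
            seen_ditags.add(key)
            counts[ditag[:TAG_LEN]] += 1
            counts[reverse_complement(ditag[-TAG_LEN:])] += 1
    return counts
```

Counting each distinct ditag once is meant to limit the effect of PCR over-amplification on the tag counts; a production tool would also need to handle sequencing ambiguities, which this sketch ignores.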
SAGE data analysis
Sequence files were analyzed by means of the SAGE program group (Velculescu et al., 1995), which identifies anchoring enzyme sites with the proper spacing, extracts the two intervening tags, and records them in a database. The 68,691 tags obtained contained 62,965 tags from unique ditags and 5,726 tags from repeated ditags. The latter were counted only once to eliminate potential PCR bias of the quantitation, as described (Velculescu et al., 1995). Of 62,965 tags, 2,332 tags corresponded to linker sequences, and were excluded from further analysis. Of the remaining tags, 4,342 tags could not be assigned, and were likely due to sequencing errors (in the tags or in the yeast genomic sequence). If all of these were due to tag sequencing errors, this corresponds to a sequencing error rate of about 0.7% per base pair (for a 10 bp tag), not far from what we would have expected under our automated sequencing conditions. However, some unassigned tags had a much higher than expected frequency of A's as the last five base pairs of the tag (5 of the 52 most abundant unassigned tags), suggesting that these tags were derived from transcripts containing anchoring enzyme sites within several base pairs from their polyA tails. Given the frequency of NlaIII sites in the genome (one in 309 base pairs), approximately 3% of transcripts were predicted to contain NlaIII sites within 10 bp of their polyA tails.
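The error-rate and polyA-proximity figures above can be checked with a few lines of arithmetic. The sketch below assumes that base errors are independent and that an unassigned tag carries at least one error; the variable names are illustrative only.

```python
# Tag counts reported in the text.
tags_from_unique_ditags = 62_965
linker_tags = 2_332
unassigned_tags = 4_342
tag_length = 10  # bp

# Tags left after removing linker-derived tags.
analyzed_tags = tags_from_unique_ditags - linker_tags

# Fraction of tags with at least one error, if every unassigned tag is an error.
per_tag_error = unassigned_tags / analyzed_tags

# Per-base error rate under independence: (1 - e)^10 = 1 - per_tag_error.
per_base_error = 1 - (1 - per_tag_error) ** (1 / tag_length)

# Chance that an NlaIII site (about one per 309 bp) falls in a 10 bp window.
nlaiii_spacing = 309
window = 10
fraction_near_polya = window / nlaiii_spacing

print(f"analyzed tags:       {analyzed_tags}")
print(f"per-tag error:       {per_tag_error:.3%}")
print(f"per-base error:      {per_base_error:.3%}")
print(f"near-polyA fraction: {fraction_near_polya:.3%}")
```

Under these assumptions the per-base rate comes out a little above 0.7% and the near-polyA fraction a little above 3%, consistent with the rounded values quoted in the text.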
As very sparse data are available for yeast mRNA sequences and efforts to date have not been able to identify a highly conserved polyadenylation signal (Irniger and Braus, 1994; Zaret and Sherman, 1982), we used 14 bp of SAGE tags (i.e. the NlaIII site plus the adjacent 10 bp) to search the yeast genome directly (yeast genome sequence obtained from the Stanford yeast genome ftp site (genome-ftp.stanford.edu) on August 7, 1996). Because only coding regions are annotated in the yeast genome, and SAGE tags can be derived from 3' untranslated regions of genes, a SAGE tag was considered to correspond to a particular gene if it matched the ORF or the region 500 bp 3' of the ORF (locus names, gene names and ORF chromosomal coordinates were obtained from the Stanford yeast genome ftp site, and ORF descriptions were obtained from the MIPS www site (http://www.mips.biochem.mpg.de) on August 14, 1996). ORFs were considered genes with known functions if they were associated with a three letter gene name, while ORFs without such designations were considered uncharacterized.
As expected, SAGE tags matched transcribed portions of the genome in a highly non-random fashion, with 88% matching ORFs or their adjacent 3' regions in the correct orientation (chi-squared P value < 10^[exponent illegible]). In instances when more than one tag matched a particular ORF in the correct orientation, the abundance was calculated to be the sum of the matched tags (for Figure 2, Figure 3, and Figure 4). Tags that matched ORFs in the incorrect orientation were not used in abundance calculations. In instances when a tag matched more than one region of the genome (for example an ORF and a non-ORF region), only the matched ORF was considered. In some cases the 15th base of the tag could also be used to resolve ambiguities. For Figure 4, only tags that matched the genome once were used.
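The assignment rules in the two preceding paragraphs can be sketched in Python as follows. The data classes, function names, and example gene names are hypothetical, and the sketch simplifies one point: tags that remain ambiguous between two ORFs are skipped rather than resolved with the 15th base.

```python
from collections import Counter
from dataclasses import dataclass

DOWNSTREAM = 500  # bp 3' of an ORF that is still attributed to that ORF


@dataclass(frozen=True)
class Orf:
    name: str
    chrom: int
    start: int   # leftmost genomic coordinate
    end: int     # rightmost genomic coordinate
    strand: str  # "+" or "-"


@dataclass(frozen=True)
class Hit:
    chrom: int
    pos: int
    strand: str  # strand on which the tag matches in sense orientation


def orf_for_hit(hit, orfs):
    """Return the name of the ORF a genomic hit belongs to, or None."""
    for orf in orfs:
        if orf.chrom != hit.chrom or orf.strand != hit.strand:
            continue  # wrong chromosome, or antisense to this ORF
        # The 3' flank lies to the right on "+" and to the left on "-".
        lo = orf.start - (DOWNSTREAM if orf.strand == "-" else 0)
        hi = orf.end + (DOWNSTREAM if orf.strand == "+" else 0)
        if lo <= hit.pos <= hi:
            return orf.name
    return None


def abundance_per_orf(tag_counts, tag_hits, orfs):
    """Sum tag counts per ORF, preferring ORF hits over non-ORF hits."""
    totals = Counter()
    for tag, count in tag_counts.items():
        names = {orf_for_hit(h, orfs) for h in tag_hits.get(tag, [])}
        names.discard(None)      # non-ORF hits are ignored if an ORF matched
        if len(names) == 1:      # simplification: skip multi-ORF ambiguity
            totals[names.pop()] += count
    return totals
```

A linear scan over ORFs is enough for a sketch; for a whole genome an interval index per chromosome would be the more usual choice.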
For the identification of NORF genes, only tags were considered that matched portions of the genome that were further than 500 bp 3' of a previously identified ORF, and were observed at least two times in the SAGE libraries.
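A minimal sketch of this NORF filter, in Python, is given below. It assumes the annotated intervals passed in already cover each ORF plus its 500 bp 3' flank, ignores strand for brevity, and uses hypothetical names and data layouts throughout.

```python
MIN_OBSERVATIONS = 2  # a tag must be seen at least twice


def norf_candidates(tag_counts, tag_positions, annotated_intervals):
    """Return tags seen often enough that fall outside annotated regions.

    tag_counts:          tag -> number of observations
    tag_positions:       tag -> (chromosome, genomic position)
    annotated_intervals: list of (chromosome, low, high), each assumed to
                         span an ORF plus its 500 bp 3' flank
    """
    candidates = []
    for tag, count in tag_counts.items():
        if count < MIN_OBSERVATIONS:
            continue
        chrom, pos = tag_positions[tag]
        inside = any(
            c == chrom and low <= pos <= high
            for c, low, high in annotated_intervals
        )
        if not inside:
            candidates.append(tag)
    return sorted(candidates)
```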
[Tables omitted: the tabulated SAGE tags, gene assignments, descriptions, and expression levels on these pages, including the numbered NORF entries, are not legible in this copy.]
Additional NORFs
Tag         Chr  Position  Copies/cell
TTTAGTTAAT  2    477623    2
TAGTTGCTCC  7    317108    1
CAATTCCTAC  1    172182    0.8
TTTGATTTGA  2    46431     0.8
GGCTCTGGTT  2    414510    0.8
CAGAAATAGC  2    565130    0.8
CTGTTATTTT  2    616054    0.8
CGAAGTCAAA  2    680605    0.8
CTCTAGATAA  3    171584    0.8
AGTCAAAATG  4    192750    0.8
GCGAGTTTAG  4    691301    0.8
GCTCCAATAG  4    1131020   0.8
TTTATTTGAG  4    1237501   0.8
GTTATATTGA  4    1401803   0.8
TGGGTTGAAG  5    251266    0.8
ATTTTATTTG  5    447729    0.8
ATCATAAAAA  5    548612    0.8
TTATATAAAA  6    223182    0.8
CTACTTCTGC  8    34653     0.8
ATAAGACAGT  10   227802    0.8
TTCATAAGTT  10   471894    0.8
TAAATCTGAG  11   145617    0.8
CTGGTAGAAA  11   151174    0.8
CACGTACACA  11   403208    0.8
CCAAGATCAA  11   425882    0.8
AGCTTGTTCC  12   234966    0.8
CACATTCGTT  12   759953    0.8
CTTACATATA  12   789781    0.8
TCTATAGCAA  13   228936    0.8
CCTTTCTGAA  13   297985    0.8
CCTTTAGAAT  13   777999    0.8
AATTAACACC  13   842122    0.8
GCGCAGGGGC  14   440984    0.8
TGTTTATAAA  14   661710    0.8
AAAAGTCATT  15   32081     0.8
TTCGTAAACT  15   680625    0.8
TTTTTGGAGT  15   888343    0.8
AGGCATCTTG  16   250284    0.8
AAATCAAAAC  16   453890    0.8
AATTGACGAA  16   560169    0.8
TTGATGATTT  16   582360    0.8
CCTGTTTTTG  16   643476    0.8
TTTTTAAAAA  1    101436    0.5
AAGTTTGATC  1    199848    0.5
AGCACCTATG  2    46913     0.5
TGATTTATCC  2    418946    0.5
ACTGCATCTG  2    680860    0.5
CAAGTTAGGA  2    744770    0.5
ATACCCAATT  3    29939     0.5
AACTTTGTAT  3    30056     0.5
GCGGCGGGTG  3    41645     0.5
AAAATTGTTC  3    57108     0.5
TCAAGTACTC  3    157855    0.5
AACTGTATGC  3    223882    0.5
CTATCGGCCA  3    278840    0.5
ACAAGCCCAA  3    289917    0.5
GTACAGGGCT  4    93873     0.5
AAGATCATCG  4    254851    0.5
GAACTCCTGG  4    340891    0.5
GAACGAGAAG  4    371850    0.5
TZTTTAATAC  4    372058    0.5
TCTCCAGTTG  4    381712    0.5
AATACGTTAC  4    471791    0.5
ACGATTGGCT  4    509158    0.5
TGTTTATAAG  4    521709    0.5
CGTTTTCGTC  4    538839    0.5
TCGAACCTCT  4    578702    0.5
TCCACACACA  4    930972    0.5
CCGTGCGTGC  4    1324367   0.5
TTTCTTCAAC  5    116099    0.5
CCAAGTCTCG  5    159320    0.5
AGAGCGAATT  5    207517    0.5
TGTAGATTAT  5    280465    0.5
AAAAGTAGTT  5    286387    0.5
ACTTGGTATG  5    422942    0.5
TTAATGTTAT  5    544523    0.5
TACACGCGCG  5    544555    0.5
GGTCACTCCT  6    62983     0.5
AAGTGATGAA  6    76141     0.5
TTTATCTTGT  6    130327    0.5
AGTGATTGTT  6    256223    0.5
GCTTTGTTGT  7    72577     0.5
TCATTGATTC  7    110590    0.5
TTCACCGGAA  7    323655    0.5
ACTATTCTGT  7    423957    0.5
GGGCCAACCC  7    433787    0.5
AAAATATCTT  7    559397    0.5
TAGTAGTAAC  7    622201    0.5
AAGCGCACAA  7    735909    0.5
TCGCTGTTTT  7    800300    0.5
TGTATTTTTG  7    836202    0.5
CTAAACAAAG  7    836587    0.5
TAGGAAGAAA  7    905046    0.5
GGAAAAATTA  7    958839    0.5
TTTGGATAGT  7    974754    0.5
CGTTTGTGTA  8    202655    0.5
AGAAAAAAAC  8    386651    0.5
TAAAGTCCAG  8    518998    0.5
TAAGCAGATT  8    529129    0.5
ATGAGCATTT  9    97114     0.5
AGGTGCAAAA  9    229077    0.5
TAACAAAGAG  10   628227    0.5
CAATTGGCAA  10   721781    0.5
ACTCCCTGTA  11   93528     0.5
CTCTATTGAT  11   144281    0.5
GCTTTCCTTT  11   146665    0.5
ACCGCAAAGA  11   231872    0.5
CTTGTTCAAA  12   230972    0.5
AATGTGCTGT  12   320426    0.5
GCAGATAGCG  12   341324    0.5
TCTGACTTAG  12   368780    0.5
CCCGGATGTT  12   433912    0.5
GTAACGATTG  12   449917    0.5
GAATAACGAA  12   673851    0.5
ACTGCTATTT  12   712476    0.5
GTTCTCTAGC  12   712712    0.5
CATCACCATC  12   794710    0.5
TTGCACTTCT  12   806833    0.5
ACTGTTTATG  12   867350    0.5
TTGCTATATA  12   1017911   0.5
TACATTCTAA  13   95707     0.5
CTCTTAGTTG  13   158970    0.5
ACGAACACTT  13   278341    0.5
TGCGCAAGTC  13   283795    0.5
TTTTTCTTAA  13   363037    0.5
CAAATGCATT  13   390802    0.5
CAAATTGTGT  13   395599    0.5
GCAATACTAT  13   826521    0.5
AGTGACGATG  14   60143     0.5
TACTGGTTTA  14   118854    0.5
GTTTGACCTA  14   335512    0.5
AGCGTTTGAT  14   478481    0.5
CTCTGTTGCG  14   728251    0.5
[illegible] 15   35952     0.5
TTTGCTTGGT  15   242742    0.5
AGTTTTCCTG  15   304813    0.5
TTTAAAGATA  15   331453    0.5
AAGGAGACAC  15   448624    0.5
CTATATATCA  15   544530    0.5
GATGGAATAG  15   571210    0.5
TCGAGTCGAA  15   758202    0.5
[illegible] 15   882567    0.5
TTTCCAGAAT  15   969884    0.5
TGGACAATGT  15   970607    0.5
GGAATTAAGA  15   979894    0.5
ACTATATGTT  16   582230    0.5
GATATATCAT  16   589647    0.5
AGAATTGATT  16   744406    0.5
CACTGTCTCC  16   824649    0.5

References
Archer, J. E., Vega, L. R., and Solomon, F. (1995). Rbl2p, a yeast protein that binds to beta-tubulin and participates in microtubule function in vivo. Cell 82, 425-434.
Bajwa, W., Torchia, T. E., and Hopper, J. E. (1988). Yeast regulatory gene GAL3: carbon regulation; UASGal elements in common with GAL1, GAL2, GAL7, GAL10, GAL80, and MEL1; encoded protein strikingly similar to yeast and Escherichia coli galactokinases. Mol Cell Biol 8, 3439-3447.
Basrai, M. A., Kingsbury, J., Koshland, D., Spencer, F., and Hieter, P. (1996). Faithful chromosome transmission requires Spt4p, a putative regulator of chromatin structure in Saccharomyces cerevisiae. Mol Cell Biol 16, 2838-2847.
Bishop, J. O., Morton, J. G., Rosbash, M., and Richardson, M. (1974). Three abundance classes in HeLa cell messenger RNA. Nature 250, 199-204.
Burkholder, A. C., and Hartwell, L. H. (1985). The yeast alpha-factor receptor: structural properties deduced from the sequence of the STE2 gene. Nucleic Acids Res 13, 8463-8475.
Chambers, A., Tsang, J. S., Stanway, C., Kingsman, A. J., and Kingsman, S. M. (1989). Transcriptional control of the Saccharomyces cerevisiae PGK gene by RAP1. Mol Cell Biol 9, 5516-5524.
Denis, C. L., Ferguson, J., and Young, E. T. (1983). mRNA levels for the fermentative alcohol dehydrogenase of Saccharomyces cerevisiae decrease upon growth on a nonfermentable carbon source. J Biol Chem 258, 1165-1171.
Dick, T., Surana, U., and Chin, W. (1996). Molecular and genetic characterization of SLC1, a putative Saccharomyces cerevisiae homolog of the metazoan cytoplasmic dynein light chain 1. Mol Gen Genet 251, 38-43.
El-Deiry, W. S., Tokino, T., Velculescu, V. E., Levy, D. B., Parsons, R., Trent, J. M., Lin, D., Mercer, W. E., Kinzler, K. W., and Vogelstein, B. (1993). WAF1, a potential mediator of p53 tumor suppression. Cell 75, 817-825.
Elledge, S. J., and Davis, R. W. (1989). DNA damage induction of ribonucleotide reductase. Mol Cell Biol 9, 4932-4940.
Goffeau, A., Barrell, B. G., Bussey, H., Davis, R. W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J. D., Jacq, C., Johnston, M., Louis, E. J., Mewes, H. W., Murakami, Y., Philippsen, P., Tettelin, H., and Oliver, S. G. (1996). Life with 6000 genes. Science 274, 546-567.
Gottschling, D. E., Aparicio, O. M., Billington, B. L., and Zakian, V. A. (1990). Position effect at S. cerevisiae telomeres: reversible repression of Pol II transcription. Cell 63, 751-762.
Hagen, D. C., McCaffrey, G., and Sprague, G. F., Jr. (1986). Evidence the yeast STE3 gene encodes a receptor for the peptide pheromone a factor: gene sequence and implications for the structure of the presumed receptor. Proc Natl Acad Sci U S A 83, 1418-1422.
Hereford, L. M., and Rosbash, M. (1977). Number and distribution of polyadenylated RNA sequences in yeast. Cell 10, 453-462.
Irniger, S., and Braus, G. H. (1994). Saturation mutagenesis of a polyadenylation signal reveals a hexanucleotide element essential for mRNA 3' end formation in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 91, 257-261.
Iyer, V., and Struhl, K. (1996). Absolute mRNA levels and transcriptional initiation rates in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 93, 5208-5212.
Kurjan, J., and Herskowitz, I. (1982). Structure of a yeast pheromone gene (MF alpha): a putative alpha-factor precursor contains four tandem copies of mature alpha-factor. Cell 30, 933-943.
Leeds, P., Peltz, S. W., Jacobson, A., and Culbertson, M. R. (1991). The product of the yeast UPF1 gene is required for rapid turnover of mRNAs containing a premature translational termination codon. Genes Dev 5, 2303-2314.
Lewin, B. (1980). Gene Expression 2, (New York, New York: John Wiley and Sons), pp. 694-727.
McAlister, L., and Holland, M. J. (1982). Targeted deletion of a yeast enolase structural gene. Identification and isolation of yeast enolase isozymes. J Biol Chem 257, 7181-7188.
Michaelis, S., and Herskowitz, I. (1988). The a-factor pheromone of Saccharomyces cerevisiae is essential for mating. Mol Cell Biol 8, 1309-1318.
Mushegian, A. R., and Koonin, E. V. (1996). A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc Natl Acad Sci U S A 93, 10268-10273.
Nguyen, C., Rocha, D., Granjeaud, S., Baldit, M., Bernard, K., Naquet, P., and Jordan, B. R. (1995). Differential gene expression in the murine thymus assayed by quantitative hybridization of arrayed cDNA clones. Genomics 29, 207-216.
Nishizawa, M., Araki, R., and Teranishi, Y. (1989). Identification of an upstream activating sequence and an upstream repressible sequence of the pyruvate kinase gene of the yeast Saccharomyces cerevisiae. Mol Cell Biol 9, 442-451.
Renauld, H., Aparicio, O. M., Zierath, P. D., Billington, B. L., Chhablani, S. K., and Gottschling, D. E. (1993). Silent domains are assembled continuously from the telomere and are defined by promoter distance and strength, and by SIR3 dosage. Genes Dev 7, 1133-1145.
Rose, M. D., Winston, F., and Hieter, P. (1990). Methods in Yeast Genetics. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press), pp. 177.
Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470.
Schmitt, H. D., Ciriacy, M., and Zimmermann, F. K. (1983). The synthesis of yeast pyruvate decarboxylase is regulated by large variations in the messenger RNA level. Mol Gen Genet 192, 247-252.
Sikorski, R. S., and Hieter, P. (1989). A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics 122, 19-27.
Singh, A., Chen, E. Y., Lugovoy, J. M., Chang, C. N., Hitzeman, R. A., and Seeburg, P. H. (1983). Saccharomyces cerevisiae contains two discrete genes coding for the alpha-factor pheromone. Nucleic Acids Res 11, 4049-4063.
Smith, M. M., and Murray, K. (1983). Yeast H3 and H4 histone messenger RNAs are transcribed from two non-allelic gene sets. J Mol Biol 169, 641-661.
St John, T. P., and Davis, R. W. (1979). Isolation of galactose-inducible DNA sequences from Saccharomyces cerevisiae by differential plaque filter hybridization. Cell 16, 443-452.
Velculescu, V. E., Zhang, L., Vogelstein, B., and Kinzler, K. W. (1995). Serial analysis of gene expression. Science 270, 484-487.
Zaret, K. S., and Sherman, F. (1982). DNA sequence required for efficient transcription termination in yeast. Cell 28, 563-573.
In addition their entire nucleotide sequence is known and publicly available.
In general, these were not previously identified as genes due to their small size. However, they have now been found to be expressed.
Differentially expressed yeast genes are those whose expression varies by a statistically significant difference (to greater than 95% confidence level) within different growth phases, particularly log phase, S phase, and G2/M.
Preferably the difference is greater than 10%, 25%, 50%, or 100%. The genes which have been found to have such differential expression characteristics are: NORF N° 1, 2, 4, 5, 6, 17, 25, 27, TEF1/TEF2, ENO2, ADH1, ADH2, PGK1, CUP1A/CUP1B, PYK1, YKL056C, YMR116C, YEL033W, YOR182C, YCR013C, ribonucleotide reductase 2 and 4, and YJR085C.
The DNA molecules according to the invention can be genomic or cDNA. Preferably they are isolated free of other cellular components such as membrane components, proteins, and lipids. They can be made by a cell and isolated, or synthesized using PCR or an automatic synthesizer. Any technique for obtaining a DNA of known sequence may be used. Methods for purifying and isolating DNA are routine and are known in the art.
To administer yeast genes to cells, any DNA delivery technique known in the art may be used, without limitation. These include liposomes, transfection, transduction, transformation, viral infection, and electroporation.
Vectors for particular purposes and characteristics can be selected by the skilled artisan for their known properties. Cells which can be used as gene recipients are yeast and other fungi, mammalian cells, including human cells, and bacterial cells.
Antifungal drugs can be identified using yeast cells as described herein.
Expression of a differentially expressed gene can be monitored by any means known in the art. When a test substance affects the expression of such a differentially expressed gene, it is a candidate drug for affecting the growth properties of fungi, and may be useful as an antifungal agent.
Because differentially expressed genes are likely to be involved in cell cycle progression, it is likely that these genes are conserved among species.
The differentially expressed genes identified by the present invention can be used to identify homologs in humans and other mammals. Means for identifying homologous genes among different species are well known in the art. Briefly, stringency of hybridization can be reduced so that imperfectly matching sequences hybridize. This can be in the context of inter alia Southern blots, Northern blots, colony hybridization or PCR. Any hybridization technique which is known in the art can be used.
Probes according to the present invention are isolated DNA molecules which have at least 10, and preferably at least 12, 14, 16, 18, 20, or 25 contiguous nucleotides of a particular NORF gene or other differentially expressed gene. The probes may or may not be labeled. They may be used as primers for PCR or for Southern or Northern blots. Preferably the probes are anchored to a solid support. More preferably they are present on an array so that multiple probes can simultaneously hybridize to a single biological sample. The probes can be spotted onto the array or synthesized in situ on the array. See Lockhart et al., Nature Biotechnology, Vol. 14, December 1996, "Expression monitoring by hybridization to high-density oligonucleotide arrays." A single array can contain more than 100, 500 or even 1,000 different probes in discrete locations.
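To make the length requirement concrete, the short Python sketch below lists every contiguous stretch of a chosen length from a gene sequence. The function name and the default length of 25 are illustrative assumptions; real probe design would also weigh specificity and hybridization behavior, which are not modeled here.

```python
MIN_PROBE_LENGTH = 10  # shortest probe length named in the text


def candidate_probes(gene_sequence: str, length: int = 25) -> list[str]:
    """Return every contiguous window of `length` nucleotides in the gene."""
    if length < MIN_PROBE_LENGTH:
        raise ValueError(f"probe length must be at least {MIN_PROBE_LENGTH}")
    sequence = gene_sequence.upper()
    return [sequence[i:i + length] for i in range(len(sequence) - length + 1)]
```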
The above disclosure generally describes the present invention. A more complete understanding can be obtained by reference to the following specific examples which are provided herein for purposes of illustration only, and are not intended to limit the scope of the invention.
Summary
We have analyzed the set of genes expressed from the yeast genome, herein called the transcriptome, using serial analysis of gene expression (SAGE). Analysis of 60,633 transcripts revealed 4,665 genes, with expression levels ranging from 0.3 to over 200 transcripts per cell. Of these genes, 1,981 had known functions, while 2,684 were previously uncharacterized. Integration of positional information with gene expression data allowed the generation of chromosomal expression maps, identifying physical regions of transcriptional activity, and identified genes that had not been predicted by sequence information alone. These studies provide insight into global patterns of gene expression in yeast and demonstrate the feasibility of genome-wide expression studies in eukaryotes.
Results
Characteristics and Rationale of SAGE Approach
Several methods have recently been described for the high throughput evaluation of gene expression (Nguyen et al., 1995; Schena et al., 1995; Velculescu et al., 1995). We used SAGE (Serial Analysis of Gene Expression) because it can provide quantitative gene expression data without the prerequisite of a hybridization probe for each transcript. The SAGE technology is based on two basic principles (Figure 1). First, a short sequence tag (9-11 bp) contains sufficient information to uniquely identify a transcript, provided that it is derived from a defined location within that transcript. Second, many transcript tags can be concatenated into a single molecule and then sequenced, revealing the identity of multiple tags simultaneously. The expression pattern of any population of transcripts can be quantitatively evaluated by determining the abundance of individual tags and identifying the gene corresponding to each tag.
Genome-wide expression
In order to maximize representation of genes involved in normal growth and cell-cycle progression, SAGE libraries were generated from yeast cells in three states: log phase, S phase arrested and G2/M phase arrested. In total, SAGE tags corresponding to 60,633 total transcripts were identified (including 20,184 from log phase, 20,034 from S phase arrested, and 20,415 from G2/M phase arrested cells). Of these tags, 56,291 tags (93%) precisely matched the yeast genome, 88 tags matched the mitochondrial genome, and 91 tags matched the 2 micron plasmid.
The number of SAGE tags required to define a yeast transcriptome depends on the confidence level desired for detecting low abundance mRNA molecules. Assuming the previously derived estimate of 15,000 mRNA molecules per cell (Hereford and Rosbash, 1977), 20,000 tags would represent a 1.3-fold coverage even for mRNA molecules present at a single copy per cell, and would provide a 72% probability of detecting such transcripts (as determined by Monte Carlo simulations). Analysis of 20,184 tags from log phase cells identified 3,298 unique genes. As an independent confirmation of mRNA copy number per cell, we compared the expression level of SUP44/RPS4, one of the few genes whose absolute mRNA levels have been reliably determined by quantitative hybridization experiments (Iyer and Struhl, 1996), with expression levels determined by SAGE. SUP44/RPS4 was measured by hybridization at 75 +/- 10 copies/cell (Iyer and Struhl, 1996), in good accord with the SAGE data of 63 copies/cell, suggesting that the estimate of 15,000 mRNA molecules per cell was reasonably accurate. Analysis of SAGE tags from S phase arrested and G2/M phase arrested cells revealed similar expression levels for this gene (range [lower bound illegible] to 55 copies/cell), as well as for the vast majority of expressed genes. As less than 1% of the genes were expressed at dramatically different levels among these three states (see below), SAGE tags obtained from all libraries were combined and used to analyze global patterns of gene expression.
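The text does not spell out how tag counts were converted to copies per cell. One natural reading, assumed in the Python sketch below, is simple proportional scaling against the estimate of 15,000 mRNA molecules per cell; the function name is illustrative.

```python
MRNA_PER_CELL = 15_000  # estimate cited from Hereford and Rosbash (1977)


def copies_per_cell(tag_count: int, total_tags: int,
                    mrna_per_cell: int = MRNA_PER_CELL) -> float:
    """Scale a tag count to transcripts per cell (assumed proportional)."""
    if total_tags <= 0:
        raise ValueError("total_tags must be positive")
    return tag_count / total_tags * mrna_per_cell


# One tag among the 60,633 combined tags corresponds to roughly a quarter
# of a transcript per cell under this assumption.
single_tag_level = copies_per_cell(1, 60_633)
print(f"{single_tag_level:.3f} copies per cell")
```

Under this scaling, a gene seen once in the combined libraries sits near the low end of the expression range reported in the Summary.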
Analysis of ascertained tags at increasing increments revealed that the number of unique transcripts plateaued at 60,000 tags (Figure 2). This suggested that generation of further SAGE tags would yield few additional genes, consistent with the fact that sixty thousand transcripts represented a four-fold redundancy for genes expressed as low as 1 transcript per cell. Likewise, Monte Carlo simulations indicated that analysis of 60,000 tags would identify at least one tag for a given transcript 97% of the time if its expression level was one copy per cell.
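The coverage and detection figures can be approximated in closed form. The Python sketch below is not the Monte Carlo simulation used in the text, whose details are not given; it assumes instead that each sequenced tag is an independent draw from a pool of 15,000 mRNA molecules. Names are illustrative.

```python
MRNA_PER_CELL = 15_000  # estimate cited from Hereford and Rosbash (1977)


def fold_coverage(n_tags: int, mrna_per_cell: int = MRNA_PER_CELL) -> float:
    """Tags sequenced per mRNA molecule in one cell."""
    return n_tags / mrna_per_cell


def detection_probability(n_tags: int, copies_per_cell: float = 1.0,
                          mrna_per_cell: int = MRNA_PER_CELL) -> float:
    """Chance of seeing at least one tag for a transcript.

    Assumes each tag independently comes from the transcript with
    probability copies_per_cell / mrna_per_cell.
    """
    fraction = copies_per_cell / mrna_per_cell
    return 1.0 - (1.0 - fraction) ** n_tags


for n in (20_000, 60_000):
    print(n, round(fold_coverage(n), 2), round(detection_probability(n), 3))
```

For a single-copy transcript this approximation gives roughly 74% at 20,000 tags and 98% at 60,000 tags, in the same range as, though slightly above, the 72% and 97% obtained by simulation. The difference may reflect factors the simulation modeled that this sketch does not.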
The 56,291 tags that precisely matched the yeast genome represented 4,665 different genes. This number is in agreement with the estimate of 3,000 to 4,000 expressed genes obtained by RNA-DNA reassociation kinetics (Hereford and Rosbash, 1977). These expressed genes included 85% of the genes with characterized functions (1,981 of 2,340), and 76% of the total genes predicted from analysis of the yeast genome (4,665 of 6,121). These numbers are consistent with a relatively complete sampling of the yeast transcriptome given the limited number of physiological states examined and the large number of genes predicted solely on the basis of genomic sequence analysis.
The transcript expression per gene was observed to vary from 0.3 to over 200 copies per cell. Analysis of the distribution of gene expression levels revealed several abundance classes that were similar to those observed in previous studies using reassociation kinetics. A "virtual Rot" of the genes observed by SAGE (Figure 3A) identified three main components of the transcriptome with abundances ranging over three orders of magnitude. A
Rot curve derived from RNA-cDNA reassociation kinetics also contained three main components distributed over a similar range of abundances (Hereford and Rosbash, 1977). Although the kinetics of reassociation of a particular class of RNA and cDNA may be affected by numerous experimental variables, there were striking similarities between Rot and virtual Rot analyses (Figure 3B). Because Rot analysis may not detect all transcripts of low abundance (Lewin, 1980), it is not surprising that SAGE
revealed both a larger total number of expressed genes and a higher fraction of the transcriptome belonging to the low abundance transcript class.
Integration of Expression Information with the Genomic Map
The SAGE expression data could be integrated with existing positional information to generate chromosomal expression maps (Figure 4). These maps were generated using the sequence of the yeast genome and the position coordinates of ORFs obtained from the Stanford Yeast Genome Database.
Although there were a few genes that were noted to be physically proximal and have similarly high levels of expression, there did not appear to be any clusters of particularly high or low expression on any chromosome. Genes like histones H3 and H4, which are known to have coregulated divergent promoters and are immediately adjacent on chromosome 14 (Smith and Murray, 1983), had very similar expression levels (5 and 6 copies per cell, respectively). The distribution of transcripts among the chromosomes suggested that overall transcription was evenly dispersed, with total transcript levels being roughly linearly related to chromosome size (r² = 0.85, data not shown). However, regions within 10 kb of telomeres appeared to be uniformly undertranscribed, containing on average 3.2 tags per gene as compared with 12.4 tags per gene for non-telomeric regions (Figure 4). This is consistent with the previously described observations of "telomeric silencing" in yeast (Gottschling et al., 1990). Recent studies have reported telomeric position effects as far as 4 kb from telomere ends (Renauld et al., 1993).
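The telomeric comparison can be sketched as a per-gene average taken separately for genes near chromosome ends and for all other genes. The example below is an illustration under stated assumptions: the function name, the toy coordinates and the tag counts are hypothetical, and a gene is treated as telomeric when any part of it lies within the 10 kb window of either chromosome end, a rule the text does not spell out.

```python
def mean_tags_per_gene(genes, chrom_lengths, window=10_000):
    """Average tag count for telomeric versus internal genes.

    genes: iterable of (chrom, start, end, tag_count), 1-based coordinates;
           start may exceed end for genes on the reverse strand.
    chrom_lengths: mapping of chromosome name to length in bp.
    """
    sums = {"telomeric": [0, 0], "internal": [0, 0]}  # [total tags, gene count]
    for chrom, start, end, tags in genes:
        length = chrom_lengths[chrom]
        lo, hi = min(start, end), max(start, end)
        near_end = lo <= window or hi > length - window
        key = "telomeric" if near_end else "internal"
        sums[key][0] += tags
        sums[key][1] += 1
    return {k: (total / n if n else float("nan")) for k, (total, n) in sums.items()}


# Hypothetical toy data, not values from the study.
toy_lengths = {"I": 230_000}
toy_genes = [
    ("I", 2_000, 3_500, 3),
    ("I", 50_000, 52_000, 12),
    ("I", 100_000, 98_000, 14),
    ("I", 225_000, 226_500, 4),
]
toy_means = mean_tags_per_gene(toy_genes, toy_lengths)

if __name__ == "__main__":
    print(toy_means)
```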
Gene Expression Patterns
Table 1 lists the 30 most highly expressed genes, all of which are expressed at greater than 60 mRNA copies per cell. As expected, these genes mostly correspond to well characterized enzymes involved in energy metabolism and protein synthesis and were expressed at similar levels in all three growth states (Examples in Figure 5). Some of these genes, including ENO2 (McAlister and Holland, 1982), PDC1 (Schmitt et al., 1983), PGK1 (Chambers et al., 1989), PYK1 (Nishizawa et al., 1989), and ADH1 (Denis et al., 1983), are known to be dramatically induced in the glucose-rich growth conditions used in this study. In contrast, glucose repressible genes such as the GAL1/GAL7/GAL10 cluster (St John and Davis, 1979), and GAL3 (Bajwa et al., 1988) were observed to be expressed at very low levels (0.3 or fewer copies per cell). As expected for the yeast strain used in this study, mating type a specific genes, such as the a factor genes (MFA1, MFA2) (Michaelis and Herskowitz, 1988), and alpha factor receptor (STE2) (Burkholder and Hartwell, 1985) were all observed to be expressed at significant levels (range 2 to 10 copies per cell), while mating type alpha specific genes (MFα1, MFα2, STE3) (Hagen et al., 1986; Kurjan and Herskowitz, 1982; Singh et al., 1983) were observed to be expressed at very low levels (<0.3 copies/cell).
Three of the highly expressed genes in Table 1 had not been previously characterized. One contained an ORF with predicted ribosomal function, previously identified only by genomic sequence analysis. Analyses of all SAGE data suggested that there were 2,684 such genes corresponding to uncharacterized ORFs which were transcribed at detectable levels. The 30 most abundant of these transcripts were observed more than 30 times, corresponding to at least 8 transcripts per cell (Table 2). The other two highly expressed uncharacterized genes corresponded to ORFs not predicted by analysis of the yeast genome sequence (NORF = Nonannotated ORF).
Analyses of SAGE data suggested that there were approximately 160 NORF
genes transcribed at detectable levels. The 30 most abundant of these transcripts were observed at least 9 times (Table 3 and examples in Figure 5).
Interestingly, one of the NORF genes (NORF5) was only expressed in S phase arrested cells and corresponded to the transcript whose abundance varied the most in the three states analyzed (> 49 fold, Figure 5).
Comparison of S phase arrested cells to the other states also identified greater than 9 fold elevation of the RNR2 and RNR4 transcripts (Figure 5). Induction of these ribonucleotide reductase genes is likely to be due to the hydroxyurea treatment used to arrest cells in S phase (Elledge and Davis, 1989).
Likewise, comparison of G2/M arrested cells identified elevation of RBL2 and dynein light chain, both microtubule associated proteins (Archer et al., 1995; Dick et al., 1996). As with the RNR inductions, these elevated levels seem likely to be related to the nocodazole treatment used to arrest cells in the G2/M phase. While there were many relatively small differences between the states (for example, NORF1, Figure 5), overall comparison of the three states revealed surprisingly few dramatic differences; there were only 29 transcripts whose abundance varied more than 10 fold among the three different states analyzed.
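A comparison of this kind can be sketched as the ratio of the highest to the lowest abundance across the three states. The example is a hedged illustration: the function names and toy counts are hypothetical, the counts are assumed to be already normalized to equal library size, and the floor of 0.5 used to avoid division by zero for undetected transcripts is an assumption, since the text does not state how zero counts were handled.

```python
def max_fold_change(counts, floor=0.5):
    """Largest ratio between any two states, with a floor for zero counts."""
    adjusted = [max(c, floor) for c in counts]
    return max(adjusted) / min(adjusted)


def variable_transcripts(table, threshold=10.0, floor=0.5):
    """Names of transcripts whose abundance varies more than `threshold` fold.

    table: mapping of transcript name to counts in (log phase, S, G2/M).
    """
    return sorted(
        name for name, counts in table.items()
        if max_fold_change(counts, floor) > threshold
    )


# Hypothetical counts, not data from the study.
toy_table = {
    "geneA": (20, 22, 19),  # essentially unchanged
    "geneB": (0, 30, 1),    # detected mainly in one state
    "geneC": (5, 48, 6),    # 9.6 fold, just under the threshold
}
toy_variable = variable_transcripts(toy_table)

if __name__ == "__main__":
    print(toy_variable)
```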
Discussion
Analysis of a yeast transcriptome affords a unique view of the RNA components defining cellular life. We observed gene expression levels to vary over three orders of magnitude, with the transcripts involved in energy metabolism and protein synthesis the most highly expressed. Key transcripts, such as those encoding enzymes required for DNA replication (e.g. POL1 and POL3), kinetochore proteins (NDC10 and SKP1), and many other interesting proteins, were present at 1 or fewer copies per cell on average. These abundances are consistent with previous qualitative data from reassociation kinetics which suggested that the largest number of expressed genes was present at 1 or 2 copies per cell. These observations indicate that low transcript copy numbers are sufficient for gene expression in yeast, and suggest that yeast possess a mechanism for rigid control of RNA abundance.
The synthesis of chromosomal expression maps presents a cataloging of the expression level of genes, organized by their genomic positions. It is not surprising that gene expression is well balanced throughout the 16 chromosomes of S. cerevisiae. As most genes have independent regulatory elements, it would have been surprising to find a large number of physically adjacent genes that had similar high levels of expression. Of the few genes that were known to have coregulated divergent promoters, like the H3/H4 pair, SAGE data confirmed concordant levels of expression. For areas like telomere ends that are known to be transcriptionally suppressed, SAGE data corroborated low levels of expression. Other expected expression patterns such as high levels of glucose induced glycolytic enzymes, low levels of glucose repressed GAL genes, expression of mating type a specific genes, and low levels of expression of mating type alpha genes, were observed. Finally, identification of tags corresponding to NORF genes suggests that there is a significant number of small proteins encoded by the yeast genome that were undetected by the criteria used for systematic sequence analysis. The yeast genome sequence has been annotated for all ORFs larger than 300 bp (encoding proteins 100 amino acids or greater). Genes encoding proteins below this cutoff are therefore commonly unannotated. This class of genes might also be underrepresented in mutational collections because of the small target size for mutagenesis, and given their small size, may encode proteins with novel functions. The systematic knockout of these NORF genes will therefore be of great interest.
Comparison of gene expression patterns from altered physiologic states can provide insight into genes that are important in a variety of processes.
Comparison of transcriptomes from a variety of physiologic states should provide a minimum set of genes whose expression is required for normal vegetative growth, and another set composed of genes that will be expressed only in response to specific environmental stimuli, or during specialized processes. For example, recent work has defined a minimal set of 250 genes required for prokaryotic cellular life (Mushegian and Koonin, 1996).
Examination of the yeast genome readily identified homologous genes for 196 of these, over 90% of which were observed to be expressed in the SAGE
analysis. Detailed analyses of yeast transcriptomes, as well as transcriptomes from other organisms, should ultimately allow the generation of a minimal set of genes required for eukaryotic life.
Like other genome-wide analyses, SAGE analysis of yeast transcriptomes has several potential limitations. First, a small number of transcripts would be expected to lack an NlaIII site and therefore would not be detected by our analysis. Second, our analysis was limited to transcripts found at least as frequently as 0.3 copies per cell. Transcripts expressed in only a minute fraction of the cell cycle, or transcripts expressed in only a fraction of the cell population, would not be reliably detected by our analysis.
Finally, mRNA sequence data are practically unavailable for yeast, and consequently, some SAGE tags cannot be unambiguously matched to corresponding genes. Tags which were derived from overlapping genes, or genes which have unusually long 3' untranslated regions may be misassigned.
Increased availability of 3' UTR sequences in yeast mRNA molecules should help to resolve the ambiguities.
Despite these potential limitations, it is clear that the analyses described here furnish both global and local pictures of gene expression, precisely defined at the nucleotide level. These data, like the sequence of the yeast genome itself, provide simple, basic information integral to the interpretation of many experiments in the future. The availability of mRNA sequence information from EST sequencing as well as various genome projects, will soon allow definition of transcriptomes from a variety of organisms, including human. The data recorded here suggest that a reasonably complete picture of a human cell transcriptome will require only about 10 - 20 fold more tags than evaluated here, a number well within the practical realm achievable with a small number of automated sequencers. The analysis of global expression patterns in higher eukaryotes is expected, in general, to be similar to those reported here for S. cerevisiae. However, the analysis of the transcriptome in different cells and from different individuals should yield a wealth of information regarding gene function in normal, developmental, and disease states.
Experimental Procedures
Yeast cell culture
The source of transcripts for all experiments was S. cerevisiae strain YPH499 (MATa ura3-52 lys2-801 ade2-101 leu2-Δ1 his3-Δ200 trp1-Δ63) (Sikorski and Hieter, 1989). Logarithmically growing cells were obtained by growing yeast cells to early log phase (3 x 10^6 cells/ml) in YPD (Rose et al., 1990) rich medium (YPD supplemented with 6 mM uracil, 4.8 mM adenine and 24 mM tryptophan) at 30°C. For arrest in the G1/S phase of the cell cycle, hydroxyurea (0.1 M) was added to early log phase cells, and the culture was incubated an additional 3.5 hours at 30°C. For arrest in the G2/M phase of the cell cycle, nocodazole (15 µg/ml) was added to early log phase cells and the culture was incubated for an additional 100 minutes at 30°C.
Harvested cells were washed once with water prior to freezing at -70°C. The growth states of the harvested cells were confirmed by microscopic and flow cytometric analyses (Basrai et al., 1996).
RNA isolation and Northern Blot Analysis
Total yeast RNA was prepared using the hot phenol method as described (Leeds et al., 1991). mRNA was obtained using the MessageMaker Kit (GibcoBRL) following the manufacturer's protocol. Northern blot analysis was performed as described (El-Deiry et al., 1993), using probes PCR
amplified from yeast genomic DNA.
SAGE protocol
The SAGE method was performed as previously described (Velculescu et al., 1995), with exceptions noted below. PolyA RNA was converted to double-stranded cDNA with a BRL synthesis kit using the manufacturer's protocol except for the inclusion of primer biotin-5'-T18-3'. The cDNA was cleaved with NlaIII (Anchoring Enzyme). As NlaIII sites were observed to occur once every 309 base pairs in three arbitrarily chosen yeast chromosomes (1, 5, 10), 95% of yeast transcripts were predicted to be detectable with a NlaIII-based SAGE approach. After capture of the 3' cDNA fragments on streptavidin coated magnetic beads (Dynal), the bound cDNA was divided into two pools, and one of the following linkers containing recognition sites for BsmFI was ligated to each pool: Linker 1, 5'-TTTGGATTTGCTGGTGCAGTACAACTAGGCTTAATAGGGACATG-3' (SEQ ID NO:1), 5'-TCCCTATTAAGCCTAGTTGTACTGCACCAGCAAATCC[amino mod. C7]-3' (SEQ ID NO:2); Linker 2, 5'-TTTCTGCTCGAATTCAAGCTTCTAACGATGTACGGGGACATG-3' (SEQ ID NO:3), 5'-TCCCCGTACATCGTTAGAAGCTTGAATTCGAGCAG[amino mod. C7]-3' (SEQ ID NO:4).
As BsmFI (Tagging Enzyme) cleaves 14 bp away from its recognition site, and the NlaIII site overlaps the BsmFI site by 1 bp, a 15 bp SAGE tag was released with BsmFI. SAGE tag overhangs were filled in with Klenow, and tags from the two pools were combined and ligated to each other. The ligation product was diluted and then amplified with PCR for 28 cycles with 5'-GGATTTGCTGGTGCAGTACA-3' (SEQ ID NO:5) and 5'-CTGCTCGAATTCAAGCTTCT-3' (SEQ ID NO:6), as primers. The PCR product was analyzed by polyacrylamide gel electrophoresis (PAGE), and the PCR product containing two tags ligated tail to tail (ditag) was excised. The PCR product was then cleaved with NlaIII, and the band containing the ditags was excised and self-ligated. After ligation, the concatenated products were separated by PAGE and products between 500 bp and 2 kb were excised.
These products were cloned into the SphI site of pZero (Invitrogen).
Colonies were screened for inserts by PCR with M13 forward and M13 reverse sequences located outside the cloning site as primers.
PCR products from selected clones were sequenced with the TaqFS DyePrimer kits (Perkin Elmer) and analyzed using a 377 ABI automated sequencer (Perkin Elmer), following the manufacturer's protocol. Each successful sequencing reaction identified an average of 26 tags; given a 90% sequencing reaction success rate, this corresponded to an average of about 850 tags per sequencing gel.
SAGE data analysis
Sequence files were analyzed by means of the SAGE program group (Velculescu et al., 1995), which identifies the anchoring enzyme site with the proper spacing and extracts the two intervening tags and records them in a database. The 68,691 tags obtained contained 62,965 tags from unique ditags and 5,726 tags from repeated ditags. The latter were counted only once to eliminate potential PCR bias of the quantitation, as described (Velculescu et al., 1995). Of 62,965 tags, 2,332 tags corresponded to linker sequences, and were excluded from further analysis. Of the remaining tags, 4,342 tags could not be assigned, and were likely due to sequencing errors (in the tags or in the yeast genomic sequence). If all of these were due to tag sequencing errors, this corresponds to a sequencing error rate of about 0.7% per base pair (for a 10 bp tag), not far from what we would have expected under our automated sequencing conditions. However, some unassigned tags had a much higher than expected frequency of A's as the last five base pairs of the tag (5 of the 52 most abundant unassigned tags), suggesting that these tags were derived from transcripts containing anchoring enzyme sites within several base pairs from their polyA tails. Given the frequency of NlaIII sites in the genome (one in 309 base pairs), approximately 3% of transcripts were predicted to contain NlaIII sites within 10 bp of their polyA tails.
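The tag accounting and the two rate estimates in this paragraph can be checked with a few lines of arithmetic. This is a worked sketch under simple assumptions that the text does not state explicitly: sequencing errors are taken to be independent at each of the 10 tag positions, and NlaIII sites are taken to occur independently at a rate of one per 309 bp.

```python
# Counts quoted in the text.
total_tags = 68_691
unique_ditag_tags = 62_965
repeated_ditag_tags = 5_726
linker_tags = 2_332
unassigned_tags = 4_342

# The two ditag classes add up to the total.
assert unique_ditag_tags + repeated_ditag_tags == total_tags

analysed_tags = unique_ditag_tags - linker_tags  # tags kept after linker removal
matched_tags = analysed_tags - unassigned_tags   # tags matching the genome

# If every unassigned tag reflects at least one erroneous base out of 10,
# the per-base error rate e satisfies (1 - e) ** 10 = 1 - per_tag_error.
per_tag_error = unassigned_tags / analysed_tags
per_base_error = 1.0 - (1.0 - per_tag_error) ** (1.0 / 10)

# Chance of at least one NlaIII site in the last 10 bp before the polyA tail.
site_near_tail = 1.0 - (1.0 - 1.0 / 309) ** 10

if __name__ == "__main__":
    print(analysed_tags, matched_tags)
    print(f"per-base error rate = {per_base_error:.2%}")
    print(f"site within 10 bp of the tail = {site_near_tail:.1%}")
```

The result of 56,291 matched tags agrees with the number of genome-matched tags used in the expression analysis.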
As very sparse data are available for yeast mRNA sequences and efforts to date have not been able to identify a highly conserved polyadenylation signal (Irniger and Braus, 1994; Zaret and Sherman, 1982), we used 14 bp of SAGE tags (i.e. the NlaIII site plus the adjacent 10 bp) to search the yeast genome directly (yeast genome sequence obtained from the Stanford yeast genome ftp site (genome-ftp.stanford.edu) on August 7, 1996). Because only coding regions are annotated in the yeast genome, and SAGE tags can be derived from 3' untranslated regions of genes, a SAGE tag was considered to correspond to a particular gene if it matched the ORF or the region 500 bp 3' of the ORF (locus names, gene names and ORF chromosomal coordinates were obtained from Stanford yeast genome ftp site, and ORF descriptions were obtained from MIPS www site (http://www.mips.biochem.mpg.de/) on August 14, 1996). ORFs were considered genes with known functions if they were associated with a three letter gene name, while ORFs without such designations were considered uncharacterized.
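The assignment rule described above (a tag belongs to a gene if it falls in the ORF or within 500 bp 3' of it) can be sketched as an interval test. This is an illustration only: the `Orf` record, the function name and the ORF names are hypothetical, coordinates are assumed to be 1-based with start less than end, and the strand check reflects the "correct orientation" requirement applied in the abundance calculations.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Orf:
    name: str
    chrom: str
    start: int   # 1-based, start < end regardless of strand
    end: int
    strand: str  # "+" or "-"


def assign_tag(chrom, pos, strand, orfs, downstream=500):
    """Return names of ORFs whose body or 3' flank contains the tag position."""
    hits = []
    for orf in orfs:
        if orf.chrom != chrom or orf.strand != strand:
            continue
        if orf.strand == "+":
            lo, hi = orf.start, orf.end + downstream  # 3' flank lies to the right
        else:
            lo, hi = orf.start - downstream, orf.end  # 3' flank lies to the left
        if lo <= pos <= hi:
            hits.append(orf.name)
    return hits


# Hypothetical annotation, not taken from the yeast genome.
toy_orfs = [
    Orf("ORF_A", "IV", 1000, 2000, "+"),
    Orf("ORF_B", "IV", 3000, 4000, "-"),
]

if __name__ == "__main__":
    print(assign_tag("IV", 2400, "+", toy_orfs))  # inside the 3' flank of ORF_A
    print(assign_tag("IV", 2700, "-", toy_orfs))  # inside the 3' flank of ORF_B
```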
As expected, SAGE tags matched transcribed portions of the genome in a highly non-random fashion, with 88% matching ORFs or their adjacent 3' regions in the correct orientation (chi-squared P value < 10''~. In instances when more than one tag matched a particular ORF in the correct orientation, the abundance was calculated to be the sum of the matched tags (for Figure 2, Figure 3, and Figure 4). Tags that matched ORFs in the incorrect orientation were not used in abundance calculations. In instances when a tag matched more than one region of the genome (for example an ORF and non-ORF region) only the matched ORF was considered. In some cases the 15th base of the tag could also be used to resolve ambiguities. For Figure 4, only tags that matched the genome once were used.
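The abundance rules in this paragraph can be sketched as a small aggregation step. The example is hypothetical in its names and data. It sums the counts of all tags matching an ORF in the sense orientation, ignores antisense matches, and prefers an ORF match over a non-ORF match for the same tag. Skipping tags that match more than one ORF is an assumption made here for simplicity; the text only notes that the 15th base could sometimes resolve such cases.

```python
from collections import defaultdict


def gene_abundance(tag_counts, tag_hits):
    """Sum tag counts per ORF.

    tag_counts: mapping of tag sequence to observed count.
    tag_hits: mapping of tag sequence to a list of
              (region_name, is_orf, is_sense) genome matches.
    """
    totals = defaultdict(int)
    for tag, count in tag_counts.items():
        # Keep ORF matches in the correct orientation; drop everything else.
        sense_orfs = {
            name for name, is_orf, is_sense in tag_hits.get(tag, [])
            if is_orf and is_sense
        }
        if len(sense_orfs) == 1:  # assumed rule: ambiguous tags are skipped
            totals[next(iter(sense_orfs))] += count
    return dict(totals)


# Hypothetical tags and matches.
toy_counts = {"AAAAAAAAAA": 5, "CCCCCCCCCC": 3, "GGGGGGGGGG": 4, "TTTTTTTTTT": 2}
toy_hits = {
    "AAAAAAAAAA": [("ORF_A", True, True)],
    "CCCCCCCCCC": [("ORF_A", True, True), ("intergenic_1", False, True)],
    "GGGGGGGGGG": [("ORF_B", True, False)],                       # antisense
    "TTTTTTTTTT": [("ORF_A", True, True), ("ORF_B", True, True)],  # ambiguous
}
toy_abundance = gene_abundance(toy_counts, toy_hits)

if __name__ == "__main__":
    print(toy_abundance)
```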
For the identification of NORF genes, only tags were considered that matched portions of the genome that were further than 500 bp 3' of a previously identified ORF, and were observed at least two times in the SAGE libraries.
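The two NORF criteria can be expressed as a simple filter. This is a sketch with hypothetical names and toy data. It assumes the annotated intervals passed in have already been extended by 500 bp on the 3' side of each ORF, so that a tag outside every interval is "further than 500 bp 3' of a previously identified ORF".

```python
def norf_candidates(tag_hits, annotated, min_count=2):
    """Select tags that lie outside annotated regions and recur in the data.

    tag_hits: iterable of (tag, chrom, pos, count).
    annotated: iterable of (chrom, lo, hi) intervals, each an ORF plus its
               500 bp 3' flank (assumed to be precomputed).
    """
    annotated = list(annotated)
    candidates = []
    for tag, chrom, pos, count in tag_hits:
        if count < min_count:
            continue  # observed fewer than two times
        inside = any(c == chrom and lo <= pos <= hi for c, lo, hi in annotated)
        if not inside:
            candidates.append(tag)
    return candidates


# Hypothetical tags and a single annotated interval.
toy_annotated = [("II", 1000, 2500)]
toy_tag_hits = [
    ("AAAAACCCCC", "II", 3000, 3),  # outside annotation, seen 3 times
    ("GGGGGTTTTT", "II", 2400, 5),  # within an ORF flank
    ("ACACACACAC", "II", 9000, 1),  # outside annotation but seen only once
]
toy_norfs = norf_candidates(toy_tag_hits, toy_annotated)

if __name__ == "__main__":
    print(toy_norfs)
```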
Table 1. The 30 most highly expressed genes (table not legible in the source scan).
Table 2. The 30 most abundant transcripts corresponding to uncharacterized ORFs (table not legible in the source scan).
Table 3. The 30 most abundant NORF transcripts (table not legible in the source scan).
Additional NORFs ::::
:.:....:::::::::::::::.:,...::::...:::....::.:...:.::..::.::>::::.::.::....:::.
::<...:,:;.::.::.::.:....:::::::....::.'.:.::..:::..''<...:.....:::.::::.......
:::>..-.::::::':<.....::~~.::.::....::::....:..::::.:.:;'..::....::....::
......:................' ..~::.....'...:.:.,.::::..,.::
...:".....'...'.
. .' ...'..."
....:....:.::::::
..:.....:' ....... .:':,.
....
:. .. .. .,'.
...".'....
...... . .
. ........
. .. ......' .. . ...:
:::'.,',:::..:.:.,.:::
.: .
::i:::.: .:
' :: > , ' :..., : 'st2o-::a':-'.'::::".';.'..;
' . ,................
. ' : : :
. _:;_:_:y w ..' . ' .
:.~'::'.'-.'.''.'.:::.,.::.'.
..
.. . . .:.:.:..:.~:::.::::.,::..":::.:~...............'.::..:..
. ::::::.:::..:....
... .. .........
... .. . .
..............'.
...... ...
..............
.. ..... ...
.,~ ..' :.",......
.........'...,.....'......
.... .. .
......................,.,..,;~.;:::::~::::::::::;:::....::;:;ox;;;;;
::.,.::a:.:;..:;:;t.:::;::".
:.'.:.::'..:":",:..:,.....":.........",....:..::......,.."...."..,...
~
~
Tag         Chr  Position  Copies/cell
TTTAGTTAAT    2    477623  2
TAGTTGCTCC    7    317108  1
CAATTCCTAC    1    172182  0.8
TTTGATTTGA    2     46431  0.8
GGCTCTGGTT    2    414510  0.8
CAGAAATAGC    2    565130  0.8
CTGTTATTTT    2    616054  0.8
CGAAGTCAAA    2    680605  0.8
CTCTAGATAA    3    171584  0.8
AGTCAAAATG    4    192750  0.8
GCGAGTTTAG    4    691301  0.8
GCTCCAATAG    4   1131020  0.8
TTTATTTGAG    4   1237501  0.8
GTTATATTGA    4   1401803  0.8
TGGGTTGAAG    5    251266  0.8
ATTTTATTTG    5    447729  0.8
ATCATAAAAA    5    548612  0.8
TTATATAAAA    6    223182  0.8
CTACTTCTGC    8     34653  0.8
ATAAGACAGT   10    227802  0.8
TTCATAAGTT   10    471894  0.8
TAAATCTGAG   11    145617  0.8
CTGGTAGAAA   11    151174  0.8
CACGTACACA   11    403208  0.8
CCAAGATCAA   11    425882  0.8
AGCTTGTTCC   12    234966  0.8
CACATTCGTT   12    759953  0.8
CTTACATATA   12    789781  0.8
TCTATAGCAA   13    228936  0.8
CCTTTCTGAA   13    297985  0.8
CCTTTAGAAT   13    777999  0.8
AATTAACACC   13    842122  0.8
GCGCAGGGGC   14    440984  0.8
TGTTTATAAA   14    661710  0.8
AAAAGTCATT   15     32081  0.8
TTCGTAAACT   15    680625  0.8
TTTTTGGAGT   15    888343  0.8
AGGCATCTTG   16    250284  0.8
AAATCAAAAC   16    453890  0.8
AATTGACGAA   16    560169  0.8
TTGATGATTT   16    582360  0.8
CCTGTTTTTG   16    643476  0.8
TTTTTAAAAA    1    101436  0.5
AAGTTTGATC    1    199848  0.5
AGCACCTATG    2     46913  0.5
TGATTTATCC    2    418946  0.5
ACTGCATCTG    2    680860  0.5
CAAGTTAGGA    2    744770  0.5
ATACCCAATT    3     29939  0.5
AACTTTGTAT    3     30056  0.5
GCGGCGGGTG    3     41645  0.5
AAAATTGTTC    3     57108  0.5
TCAAGTACTC    3    157855  0.5
AACTGTATGC    3    223882  0.5
CTATCGGCCA    3    278840  0.5
ACAAGCCCAA    3    289917  0.5
GTACAGGGCT    4     93873  0.5
AAGATCATCG    4    254851  0.5
GAACTCCTGG    4    340891  0.5
GAACGAGAAG    4    371850  0.5
TZTTTAATAC    4    372058  0.5
TCTCCAGTTG    4    381712  0.5
AATACGTTAC    4    471791  0.5
ACGATTGGCT    4    509158  0.5
TGTTTATAAG    4    521709  0.5
CGTTTTCGTC    4    538839  0.5
TCGAACCTCT    4    578702  0.5
TCCACACACA    4    930972  0.5
CCGTGCGTGC    4   1324367  0.5
TTTCTTCAAC    5    116099  0.5
CCAAGTCTCG    5    159320  0.5
AGAGCGAATT    5    207517  0.5
TGTAGATTAT    5    280465  0.5
AAAAGTAGTT    5    286387  0.5
ACTTGGTATG    5    422942  0.5
TTAATGTTAT    5    544523  0.5
TACACGCGCG    5    544555  0.5
GGTCACTCCT    6     62983  0.5
AAGTGATGAA    6     76141  0.5
TTTATCTTGT    6    130327  0.5
AGTGATTGTT    6    256223  0.5
GCTTTGTTGT    7     72577  0.5
TCATTGATTC    7    110590  0.5
TTCACCGGAA    7    323655  0.5
ACTATTCTGT    7    423957  0.5
GGGCCAACCC    7    433787  0.5
AAAATATCTT    7    559397  0.5
TAGTAGTAAC    7    622201  0.5
AAGCGCACAA    7    735909  0.5
TCGCTGTTTT    7    800300  0.5
TGTATTTTTG    7    836202  0.5
CTAAACAAAG    7    836587  0.5
TAGGAAGAAA    7    905046  0.5
GGAAAAATTA    7    958839  0.5
TTTGGATAGT    7    974754  0.5
CGTTTGTGTA    8    202655  0.5
AGAAAAAAAC    8    386651  0.5
TAAAGTCCAG    8    518998  0.5
TAAGCAGATT    8    529129  0.5
ATGAGCATTT    9     97114  0.5
AGGTGCAAAA    9    229077  0.5
TAACAAAGAG   10    628227  0.5
CAATTGGCAA   10    721781  0.5
ACTCCCTGTA   11     93528  0.5
CTCTATTGAT   11    144281  0.5
GCTTTCCTTT   11    146665  0.5
ACCGCAAAGA   11    231872  0.5
CTTGTTCAAA   12    230972  0.5
AATGTGCTGT   12    320426  0.5
GCAGATAGCG   12    341324  0.5
TCTGACTTAG   12    368780  0.5
CCCGGATGTT   12    433912  0.5
GTAACGATTG   12    449917  0.5
GAATAACGAA   12    673851  0.5
ACTGCTATTT   12    712476  0.5
GTTCTCTAGC   12    712712  0.5
CATCACCATC   12    794710  0.5
TTGCACTTCT   12    806833  0.5
ACTGTTTATG   12    867350  0.5
TTGCTATATA   12   1017911  0.5
TACATTCTAA   13     95707  0.5
CTCTTAGTTG   13    158970  0.5
ACGAACACTT   13    278341  0.5
TGCGCAAGTC   13    283795  0.5
TTTTTCTTAA   13    363037  0.5
CAAATGCATT   13    390802  0.5
CAAATTGTGT   13    395599  0.5
GCAATACTAT   13    826521  0.5
AGTGACGATG   14     60143  0.5
TACTGGTTTA   14    118854  0.5
GTTTGACCTA   14    335512  0.5
AGCGTTTGAT   14    478481  0.5
CTCTGTTGCG   14    728251  0.5
,A           15     35952  0.5
TTTGCTTGGT   15    242742  0.5
AGTTTTCCTG   15    304813  0.5
TTTAAAGATA   15    331453  0.5
AAGGAGACAC   15    448624  0.5
CTATATATCA   15    544530  0.5
GATGGAATAG   15    571210  0.5
TCGAGTCGAA   15    758202  0.5
             15    882567  0.5
TTTCCAGAAT   15    969884  0.5
TGGACAATGT   15    970607  0.5
GGAATTAAGA   15    979894  0.5
ACTATATGTT   16    582230  0.5
GATATATCAT   16    589647  0.5
AGAATTGATT   16    744406  0.5
CACTGTCTCC   16    824649  0.5
References
Archer, J. E., Vega, L. R., and Solomon, F. (1995). Rbl2p, a yeast protein that binds to beta-tubulin and participates in microtubule function in vivo. Cell 82, 425-434.
Bajwa, W., Torchia, T. E., and Hopper, J. E. (1988). Yeast regulatory gene GAL3: carbon regulation; UASGal elements in common with GAL1, GAL2, GAL7, GAL10, GAL80, and MEL1; encoded protein strikingly similar to yeast and Escherichia coli galactokinases. Mol Cell Biol 8, 3439-3447.
Basrai, M. A., Kingsbury, J., Koshland, D., Spencer, F., and Hieter, P. (1996). Faithful chromosome transmission requires Spt4p, a putative regulator of chromatin structure in Saccharomyces cerevisiae. Mol Cell Biol 16, 2838-2847.
Bishop, J. O., Morton, J. G., Rosbash, M., and Richardson, M. (1974). Three abundance classes in HeLa cell messenger RNA. Nature 250, 199-204.
Burkholder, A. C., and Hartwell, L. H. (1985). The yeast alpha-factor receptor: structural properties deduced from the sequence of the STE2 gene.
Nucleic Acids Res 13, 8463-8475.
Chambers, A., Tsang, J. S., Stanway, C., Kingsman, A. J., and Kingsman, S. M. (1989). Transcriptional control of the Saccharomyces cerevisiae PGK gene by RAP1. Mol Cell Biol 9, 5516-5524.
Denis, C. L., Ferguson, J., and Young, E. T. (1983). mRNA levels for the fermentative alcohol dehydrogenase of Saccharomyces cerevisiae decrease upon growth on a nonfermentable carbon source. J Biol Chem 258, 1165-1171.
Dick, T., Surana, U., and Chin, W. (1996). Molecular and genetic characterization of SLC1, a putative Saccharomyces cerevisiae homolog of the metazoan cytoplasmic dynein light chain 1. Mol Gen Genet 251, 38-43.
El-Deiry, W. S., Tokino, T., Velculescu, V. E., Levy, D. B., Parsons, R., Trent, J. M., Lin, D., Mercer, W. E., Kinzler, K. W., and Vogelstein, B.
(1993). WAF1, a potential mediator of p53 tumor suppression. Cell 75, 817-825.
Elledge, S. J., and Davis, R. W. (1989). DNA damage induction of ribonucleotide reductase. Mol Cell Biol 9, 4932-4940.
Goffeau, A., Barrell, B.G., Bussey, H., Davis, R.W., Dujon, B., Feldmann, H., Galibert, F., Hoheisel, J.D., Jacq, C., Johnston, M., Louis, E.J., Mewes, H.W., Murakami, Y., Philippsen, P., Tettelin, H., and Oliver, S.G. (1996).
Life with 6000 genes. Science 274, 546-567.
Gottschling, D. E., Aparicio, O. M., Billington, B. L., and Zakian, V. A. (1990). Position effect at S. cerevisiae telomeres: reversible repression of Pol II transcription. Cell 63, 751-762.
Hagen, D. C., McCaffrey, G., and Sprague, G. F., Jr. (1986). Evidence the yeast STE3 gene encodes a receptor for the peptide pheromone a factor: gene sequence and implications for the structure of the presumed receptor. Proc Natl Acad Sci U S A 83, 1418-1422.
Hereford, L. M., and Rosbash, M. (1977). Number and distribution of polyadenylated RNA sequences in yeast. Cell 10, 453-462.
Irniger, S., and Braus, G. H. (1994). Saturation mutagenesis of a polyadenylation signal reveals a hexanucleotide element essential for mRNA 3' end formation in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 91, 257-261.
Iyer, V., and Struhl, K. (1996). Absolute mRNA levels and transcriptional initiation rates in Saccharomyces cerevisiae. Proc Natl Acad Sci U S A 93, 5208-5212.
Kurjan, J., and Herskowitz, I. (1982). Structure of a yeast pheromone gene (MF alpha): a putative alpha-factor precursor contains four tandem copies of mature alpha-factor. Cell 30, 933-943.
Leeds, P., Peltz, S. W., Jacobson, A., and Culbertson, M. R. (1991). The product of the yeast UPF1 gene is required for rapid turnover of mRNAs containing a premature translational termination codon. Genes Dev 5, 2303-2314.
Lewin, B. (1980). Gene Expression 2, (New York, New York: John Wiley and Sons), pp. 694-727.
McAlister, L., and Holland, M. J. (1982). Targeted deletion of a yeast enolase structural gene. Identification and isolation of yeast enolase isozymes. J Biol Chem 257, 7181-7188.
Michaelis, S., and Herskowitz, I. (1988). The a-factor pheromone of Saccharomyces cerevisiae is essential for mating. Mol Cell Biol 8, 1309-1318.
Mushegian, A. R., and Koonin, E. V. (1996). A minimal gene set for cellular life derived by comparison of complete bacterial genomes. Proc Natl Acad Sci U S A 93, 10268-10273.
Nguyen, C., Rocha, D., Granjeaud, S., Baldit, M., Bernard, K., Naquet, P., and Jordan, B. R. (1995). Differential gene expression in the murine thymus assayed by quantitative hybridization of arrayed cDNA clones. Genomics 29, 207-216.
Nishizawa, M., Araki, R., and Teranishi, Y. (1989). Identification of an upstream activating sequence and an upstream repressible sequence of the pyruvate kinase gene of the yeast Saccharomyces cerevisiae. Mol Cell Biol 9, 442-451.
Renauld, H., Aparicio, O. M., Zierath, P. D., Billington, B. L., Chhablani, S. K., and Gottschling, D. E. (1993). Silent domains are assembled continuously from the telomere and are defined by promoter distance and strength, and by SIR3 dosage. Genes Dev 7, 1133-1145.
Rose, M. D., Winston, F., and Hieter, P. (1990). Methods in Yeast Genetics. (Cold Spring Harbor, New York: Cold Spring Harbor Laboratory Press), pp. 177.
Schena, M., Shalon, D., Davis, R. W., and Brown, P. O. (1995). Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science 270, 467-470.
Schmitt, H. D., Ciriacy, M., and Zimmermann, F. K. (1983). The synthesis of yeast pyruvate decarboxylase is regulated by large variations in the messenger RNA level. Mol Gen Genet 192, 247-252.
Sikorski, R. S., and Hieter, P. (1989). A system of shuttle vectors and yeast host strains designed for efficient manipulation of DNA in Saccharomyces cerevisiae. Genetics 122, 19-27.
Singh, A., Chen, E. Y., Lugovoy, J. M., Chang, C. N., Hitzeman, R. A., and Seeburg, P. H. (1983). Saccharomyces cerevisiae contains two discrete genes coding for the alpha-factor pheromone. Nucleic Acids Res 11, 4049-4063.
Smith, M. M., and Murray, K. (1983). Yeast H3 and H4 histone messenger RNAs are transcribed from two non-allelic gene sets. J Mol Biol 169, 641-661.
St John, T. P., and Davis, R. W. (1979). Isolation of galactose-inducible DNA sequences from Saccharomyces cerevisiae by differential plaque filter hybridization. Cell 16, 443-452.
Velculescu, V. E., Zhang, L., Vogelstein, B., and Kinzler, K. W. (1995). Serial analysis of gene expression. Science 270, 484-487.
Zaret, K. S., and Sherman, F. (1982). DNA sequence required for efficient transcription termination in yeast. Cell 28, 563-573.
Claims (41)
1. An isolated DNA molecule comprising a yeast gene which is involved in cell cycle progression selected from the group of NORF genes identified in Tables 3 and 4.
2. The isolated DNA molecule of claim 1 wherein expression of the NORF gene varies by at least 10% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
3. The isolated DNA molecule of claim 1 wherein expression of the NORF gene varies by at least 25% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
4. The isolated DNA molecule of claim 1 wherein expression of the NORF gene varies by at least 50% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
5. The isolated DNA molecule of claim 1 wherein expression of the NORF gene varies by at least 100% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
6. The isolated DNA molecule of claim 1 wherein expression of the NORF gene varies by a statistically significant difference (greater than 95% confidence level) between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
7. The isolated DNA molecule of claim 6 wherein the NORF is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
8. The isolated DNA molecule of claim 1 wherein the NORF gene is not expressed in at least one phase of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
9. The isolated DNA molecule of claim 1 which is genomic.
10. The isolated DNA molecule of claim 1 which is cDNA.
11. A method of using yeast genes to affect the cell cycle, comprising the step of:
administering to a cell an isolated DNA molecule comprising a yeast gene which is involved in cell cycle progression selected from the differentially expressed genes identified in Tables 1, 2, 3, and 4.
12. The method of claim 11 wherein the cell is a yeast cell.
13. The method of claim 11 wherein the cell is a fungal cell.
14. The method of claim 11 wherein the cell is a mammalian cell.
15. The method of claim 11 wherein the yeast gene is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
16. The method of claim 11 wherein the yeast gene is selected from the group consisting of TEF1/TEF2, ENO2, ADH1, ADH2, PGK1, CUP1A/CUP1B, and PYK1.
17. The method of claim 11 wherein the yeast gene is selected from the group consisting of YKL056C, YMR116C, YEL033W, YOR182C, YCR013C, and YJR085C.
18. A method for screening candidate antifungal drugs, comprising the steps of:
contacting a test substance with a yeast cell;
monitoring expression of a yeast gene which is involved in cell cycle progression selected from the group of yeast genes identified in Tables 1, 2, 3, and 4, wherein a test substance which modifies the expression of the yeast gene is a candidate antifungal drug.
19. The method of claim 18 wherein the yeast gene is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
20. The method of claim 18 wherein the yeast gene is selected from the group consisting of TEF1/TEF2, ENO2, ADH1, ADH2, PGK1, CUP1A/CUP1B, and PYK1.
21. The method of claim 18 wherein the yeast gene is selected from the group consisting of YKL056C, YMR116C, YEL033W, YOR182C, YCR013C, and YJR085C.
22. A method for identifying human genes which are involved in cell cycle progression, comprising the steps of:
hybridizing a probe comprising at least 10 contiguous nucleotides of a yeast gene which is differentially expressed between at least two phases selected from the group consisting of log phase, S phase, and G2/M phase, wherein the yeast gene is identified in Table 1, 2, 3, or 4.
23. The method of claim 22 wherein the yeast gene is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
24. The method of claim 22 wherein the yeast gene is selected from the group consisting of TEF1/TEF2, ENO2, ADH1, ADH2, PGK1, CUP1A/CUP1B, and PYK1.
25. The method of claim 22 wherein the yeast gene is selected from the group consisting of YKL056C, YMR116C, YEL033W, YOR182C, YCR013C, and YJR085C.
26. A probe for ascertaining phase in the cell cycle of a cell, wherein the probe comprises at least 14 contiguous nucleotides of a NORF gene as identified in Table 3 or 4.
27. The probe of claim 26 wherein expression of the NORF gene varies by at least 10% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
28. The probe of claim 26 wherein expression of the NORF gene varies by at least 25% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
29. The probe of claim 26 wherein expression of the NORF gene varies by at least 50% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
30. The probe of claim 26 wherein expression of the NORF gene varies by at least 100% between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
31. The probe of claim 26 wherein the NORF gene is not expressed in at least one phase of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
32. The probe of claim 26 wherein expression of the NORF gene varies by a statistically significant difference (greater than 95% confidence level) between any two phases of the cell cycle selected from the group consisting of log phase, S phase, and G2/M.
33. The probe of claim 32 wherein the gene is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
34. The method of claim 18 wherein said step of monitoring expression is performed using nucleic acid molecules which are immobilized on a solid support.
35. The method of claim 34 wherein the nucleic acid molecules are in an array.
36. The method of claim 19 wherein a probe which comprises a portion of said yeast gene is in an array on a solid support.
37. An array of probes on a solid support wherein at least one probe comprises at least 14 contiguous nucleotides of a NORF gene as identified in Table 3 or 4.
38. The array of claim 37 wherein the NORF gene is selected from the group consisting of NORF N° 1, 2, 4, 5, 6, 17, 25, and 27.
39. The array of claim 37 which comprises at least 100 probes of distinct sequence.
40. The array of claim 37 which comprises at least 500 probes of distinct sequence.
41. The array of claim 37 which comprises at least 1,000 probes of distinct sequence.
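The quantitative limits recited in the claims (expression varying by at least 10%, 25%, 50%, or 100% between phases, and probes comprising at least 14 contiguous nucleotides of a gene) can be illustrated with a minimal Python sketch. The claims do not state how the percentage is calculated, so the code assumes it is measured relative to the lower of the two expression values and that one qualifying pair of phases is enough. The function names, the dictionary layout, and the example sequences are hypothetical and do not come from the patent, and the probe check compares only the given strand.

```python
from itertools import combinations

# Phases named in the claims.
PHASES = ("log", "S", "G2/M")


def max_percent_variation(levels):
    """Largest pairwise percent difference in expression across the phases.

    Assumption: the difference is taken relative to the lower value of
    each pair. A gene silent in one phase but expressed in another is
    treated as varying without bound.
    """
    best = 0.0
    for a, b in combinations(PHASES, 2):
        lo, hi = sorted((levels[a], levels[b]))
        if hi == lo:
            continue  # no difference for this pair
        if lo == 0:
            return float("inf")
        best = max(best, 100.0 * (hi - lo) / lo)
    return best


def meets_threshold(levels, percent):
    """True if at least one pair of phases differs by `percent` or more."""
    return max_percent_variation(levels) >= percent


def shares_contiguous_run(probe, gene, min_len=14):
    """True if `probe` contains at least `min_len` contiguous nucleotides of `gene`."""
    probe, gene = probe.upper(), gene.upper()
    if len(probe) < min_len:
        return False
    return any(
        probe[i:i + min_len] in gene
        for i in range(len(probe) - min_len + 1)
    )


if __name__ == "__main__":
    # Hypothetical expression measurements for one gene.
    example = {"log": 10, "S": 15, "G2/M": 10}
    print(max_percent_variation(example))
    print(meets_threshold(example, 25), meets_threshold(example, 100))
    # Hypothetical sequences, not taken from Tables 3 or 4.
    print(shares_contiguous_run("ACGTACGTACGTAC", "TTACGTACGTACGTACGG"))
```

A check that also accepts the reverse complement of the probe would need one more comparison; it is left out to keep the sketch short.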
Applications Claiming Priority (3)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US3591797P | 1997-01-23 | 1997-01-23 | |
US60/035,917 | 1997-01-23 | ||
PCT/US1998/001216 WO1998032847A2 (en) | 1997-01-23 | 1998-01-22 | Characterization of the yeast transcriptome |
Publications (1)
Publication Number | Publication Date |
---|---|
CA2278645A1 | 1998-07-30 |
Family
ID=21885540
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CA002278645A Abandoned CA2278645A1 (en) | 1997-01-23 | 1998-01-22 | Characterization of the yeast transcriptome |
Country Status (5)
Country | Link |
---|---|
EP (1) | EP0970202A2 (en) |
JP (1) | JP2001509017A (en) |
AU (1) | AU749606C (en) |
CA (1) | CA2278645A1 (en) |
WO (1) | WO1998032847A2 (en) |
Families Citing this family (7)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7504493B2 (en) | 1997-01-23 | 2009-03-17 | The John Hopkins University | Characterization of the yeast transcriptome |
AU5485600A (en) * | 1999-06-16 | 2001-01-02 | Johns Hopkins University, The | Characterization of the yeast transcriptome |
FR2821087B1 (en) * | 2001-02-16 | 2004-01-02 | Centre Nat Rech Scient | PROCESS FOR QUALITATIVE AND QUANTITATIVE ANALYSIS OF A POPULATION OF NUCLEIC ACIDS CONTAINED IN A SAMPLE |
DE10160660A1 (en) | 2001-12-11 | 2003-06-18 | Bayer Cropscience Ag | Polypeptides to identify fungicidally active compounds |
US20060147926A1 (en) | 2002-11-25 | 2006-07-06 | Emmert-Buck Michael R | Method and apparatus for performing multiple simultaneous manipulations of biomolecules in a two-dimensional array |
ATE469172T1 (en) * | 2004-07-23 | 2010-06-15 | Ge Healthcare Uk Ltd | CELL CYCLE PHASE MARKERS |
CN108348556A (en) * | 2015-11-02 | 2018-07-31 | 欧瑞3恩公司 | Cell-cycle arrest improves the efficiency for generating induced multi-potent stem cell |
1998
- 1998-01-22 JP JP53211798A patent/JP2001509017A/en not_active Ceased
- 1998-01-22 AU AU59280/98A patent/AU749606C/en not_active Expired
- 1998-01-22 CA CA002278645A patent/CA2278645A1/en not_active Abandoned
- 1998-01-22 WO PCT/US1998/001216 patent/WO1998032847A2/en active IP Right Grant
- 1998-01-22 EP EP98902680A patent/EP0970202A2/en not_active Withdrawn
Also Published As
Publication number | Publication date |
---|---|
EP0970202A2 (en) | 2000-01-12 |
AU749606C (en) | 2007-05-17 |
AU749606B2 (en) | 2002-06-27 |
JP2001509017A (en) | 2001-07-10 |
WO1998032847A3 (en) | 1998-11-26 |
WO1998032847A2 (en) | 1998-07-30 |
AU5928098A (en) | 1998-08-18 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
Velculescu et al. | Characterization of the yeast transcriptome | |
Entian et al. | Functional analysis of 150 deletion mutants in Saccharomyces cerevisiae by a systematic approach | |
US7504493B2 (en) | Characterization of the yeast transcriptome | |
Inada et al. | One-step affinity purification of the yeast ribosome and its associated proteins and mRNAs | |
van Hoof et al. | Function of the ski4p (Csl4p) and Ski7p proteins in 3′-to-5′ degradation of mRNA | |
Loo et al. | Roles of ABF1, NPL3, and YCL54 in silencing in Saccharomyces cerevisiae. | |
Kasten et al. | Identification of the Saccharomyces cerevisiae genes STB1–STB5 encoding Sin3p binding proteins | |
McKee et al. | Mutations in Saccharomyces cerevisiae that block meiotic prophase chromosome metabolism and confer cell cycle arrest at pachytene identify two new meiosis-specific genes SAE1 and SAE3 | |
Liu et al. | The use of global transcriptional analysis to reveal the biological and cellular events involved in distinct development phases of Trichophyton rubrum conidial germination | |
Nugent et al. | Gene expression during Ustilago maydis diploid filamentous growth: EST library creation and analyses | |
WO2000077214A2 (en) | Characterization of the yeast transcriptome | |
AU749606C (en) | Characterization of the yeast transcriptome | |
EP1242593B1 (en) | A functional gene array in yeast | |
Schlecht et al. | Genome-wide expression profiling, in vivo DNA binding analysis, and probabilistic motif prediction reveal novel Abf1 target genes during fermentation, respiration, and sporulation in yeast | |
Vollmer et al. | [15] High expression cloning, purification, and assay of Ypt—GTPase-activating proteins | |
AU2005278901A1 (en) | Method for analyzing genes of industrial yeasts | |
Naitou et al. | Expression profiles of transcripts from 126 open reading frames in the entire chromosome VI of Saccharomyces cerevisiae by systematic northern analyses | |
US6265165B1 (en) | Methods for EST-specific full length cDNA cloning | |
JPH10512447A (en) | Nonsense-mediated production of heterologous polypeptides in the absence of mRNA decay function | |
Gromadka et al. | The KRR1 gene encodes a protein required for 18S rRNA synthesis and 40S ribosomal subunit assembly in Saccharomyces cerevisiae. | |
Boles | Yeast as a model system for studying glucose transport | |
Karkusiewicz et al. | Functional and physical interactions of Faf1p, a Saccharomyces cerevisiae nucleolar protein | |
JP2003512007A (en) | Drug targets in Candida albicans | |
JP2002525073A (en) | C. Albicans-derived essential gene and method for screening antifungal substance using the gene | |
Basrai et al. | Transcriptome analysis of Saccharomyces cerevisiae using serial analysis of gene expression |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
EEER | Examination request | ||
FZDE | Discontinued |