CN101115848A - Transcriptome microarray technology and methods of using the same - Google Patents
Transcriptome microarray technology and methods of using the same
- Publication number
- CN101115848A (CNA2005800457745A, CN200580045774A)
- Authority
- CN
- China
- Prior art keywords
- genes
- list
- transcript
- array
- tissue
- Legal status
- Pending
Landscapes
- Measuring Or Testing Involving Enzymes Or Micro-Organisms (AREA)
Abstract
Arrays containing a transcriptome of a diseased tissue and methods of using the arrays for diagnosis, prognosis, screening, and identification of disease are provided herein. The transcriptome arrays from diseased tissue are useful for diagnosis of a disease by analysis of the genetic profile of a tissue sample specific to a disease state. The genetic profiles are then correlated with data on the effectiveness of specific therapeutic agents. Correlating expression profiles to the effectiveness of therapeutic agents provides a way to screen and select further patients predicted to respond to those therapeutic agents, thereby minimizing needless exposure to ineffective therapy.
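The correlation step summarized above can be illustrated with a minimal sketch. It assumes a nearest-centroid comparison using Pearson correlation, which is one possible approach and is not stated in the source. The function names (`pearson`, `centroid`, `predict_responder`) and all expression values are hypothetical and serve illustration only.

```python
from math import sqrt
from statistics import mean


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def centroid(profiles):
    """Average expression per transcript across a group of profiles."""
    return [mean(values) for values in zip(*profiles)]


def predict_responder(profile, responders, non_responders):
    """Return True if the profile correlates better with the responder centroid."""
    r_resp = pearson(profile, centroid(responders))
    r_non = pearson(profile, centroid(non_responders))
    return r_resp > r_non


# Hypothetical log-expression values for four transcripts (illustrative only).
responders = [[8.1, 2.0, 5.5, 7.9], [7.8, 2.4, 5.1, 8.3]]
non_responders = [[3.0, 6.9, 5.2, 2.8], [2.7, 7.3, 5.6, 3.1]]
new_patient = [7.5, 2.9, 5.0, 7.7]

print(predict_responder(new_patient, responders, non_responders))
```

In this sketch a new profile is assigned to whichever group's average profile it correlates with more strongly. A real analysis would involve many more transcripts, normalization, and validation on independent samples.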
Description
Cross-reference to priority and related applications
This application claims priority to European patent applications Nos. 04105479.2, 04105482.6, 04105483.4, 04105484.2, 04105507.0 and 04105485.9, filed November 3, 2004; U.S. Provisional Patent Application No. 60/662,276, filed March 14, 2005; and U.S. Provisional Patent Application No. 60/700,293, filed July 18, 2005.
Technical field
This application relates to the field of gene and RNA expression array technology, and in particular to sequences comprising transcripts expressed in diseased tissue and their use in diagnosis and treatment planning.
Index of files on the concurrently submitted CD-R
Three identical CD-R discs (labeled "Copy 1", "Copy 2" and "Copy 3") are submitted herewith, each containing the following electronic text documents. The CD-R discs were created on November 1, 2005, and the size of each file is listed below. All electronic files on the CD-R discs are hereby incorporated by reference in their entirety.
List of genes A.txt | (30.7Mb) | List of genes S.txt | (6.1Mb) |
List of genes B.txt | (1.9Mb) | List of genes T.txt | (29.6Mb) |
List of genes C.txt | (2Mb) | List of genes U.txt | (1.7Mb) |
List of genes D.txt | (1.1Mb) | List of genes V.txt | (13.3Mb) |
List of genes E.txt | (58.6Mb) | List of genes W.txt | (18.9Mb) |
List of genes F.txt | (3.5Mb) | List of genes X.txt | (10kb) |
List of genes G.txt | (30.7Mb) | List of genes Y.txt | (28kb) |
List of genes H.txt | (4.1Mb) | List of genes Z.txt | (5.7Mb) |
List of genes I.txt | (30Mb) | List of genes AA.txt | (14.6Mb) |
List of genes J.txt | (18kb) | List of genes BB.txt | (5.1Mb) |
List of genes K.txt | (20kb) | List of genes CC.txt | (34Mb) |
List of genes L.txt | (9.7Mb) | List of genes DD.txt | (26.6Mb) |
List of genes M.txt | (5.1Mb) | List of genes EE.txt | (4kb) |
List of genes N.txt | (238kb) | List of genes FF.txt | (324kb) |
List of genes O.txt | (35.8Mb) | List of genes GG.txt | (8.6Mb) |
List of genes P.txt | (11.8Mb) | List of genes HH.txt | (18.8Mb) |
List of genes Q.txt | (3.9Mb) | List of genes II.txt | (9.6Mb) |
List of genes R.txt | (10.1Mb) | List of genes JJ.txt | (46.1Mb) |
In addition, three identical CD-R discs (electronic media labeled "Copy 1-Sequence Listing Part", "Copy 2-Sequence Listing Part" and "Copy 3-Sequence Listing Part") are submitted herewith, each containing the sequence listing of all sequences described herein. In accordance with Section 801 of the PCT Administrative Instructions concerning international applications containing large nucleotide and/or amino acid sequence listings and/or tables relating thereto, the sequence listing is filed only in computer-readable form on the media referred to in Section 802. The computer-readable electronic media on the CD-R discs are incorporated herein by reference in their entirety.
Background
The pharmaceutical industry continually pursues new drug treatment options that are more effective, more specific, or have fewer side effects than the drugs currently in use. Alternatives to drug therapy are constantly being developed because human genetic variation causes substantial differences in the effectiveness of many drugs. Thus, although a wide variety of drug treatments is currently available, more therapies are often needed for patients who fail to respond.
Traditionally, the treatment paradigm used by physicians has been to prescribe a first-line drug therapy that is likely to yield the highest success rate for treating the disease. If the first drug therapy is ineffective, an alternative drug therapy is prescribed. Clearly, this paradigm is not the best treatment approach for certain diseases. For example, in human cancer, the first treatment is often the most important and offers the best chance of successful therapy, so there is a heightened need to select the initial drug that will be most effective against a particular patient's disease.
Identifying the best first-line drug has been impossible because no method is available to predict which drug treatment will be most effective for a particular cancer physiology. As a result, patients often needlessly undergo ineffective, toxic drug therapy. For example, in colorectal cancer, no method exists to determine which patients will respond to adjuvant chemotherapy after surgery. Only one third of the 40% of patients at risk of relapse after surgery benefit from chemotherapy. This means that administering adjuvant chemotherapy results in many patients receiving unnecessary treatment. Cancer therapy and colorectal cancer clinical trials are still explored on the basis of the availability of new active compounds, rather than on an integrated pharmacogenomic approach that uses the genetic makeup of the tumor and the genotype of the patient.
The advent of microarrays and molecular genomics has the potential to significantly affect the diagnostic capability and prognostic classification of disease, which may help predict an individual patient's response to a given treatment regimen. Microarrays allow the analysis of large amounts of genetic information and thereby provide a genetic fingerprint of an individual. It is widely believed that this technology will ultimately provide the tools needed for customized treatment regimens. However, compiling the correct information to fully characterize and predict an individual's response to a specific drug treatment has been problematic, and the high expectations for applied pharmacogenomics have been somewhat disappointed (Nebert et al. 2003. Am J Pharmacogenomics; 3(6):361-70).
A major problem with current microarrays is that they are usually based on generic information content derived from partial sequencing projects, which generate expressed sequence tag (EST) information across different tissue types. Alternatively, this information may come from genome sequencing projects that use algorithms to predict the existence of genes. A significant problem with this approach is that microarray manufacturers continually update the information content as more sequence information becomes available. This has resulted in multiple array versions, each containing more information than its predecessor. It also creates a considerable barrier to routine use of the technology in patient management, because investigators face multiple array platforms with different content, which makes data validation very difficult. Even within a given manufacturer's platform, it is difficult to cross-validate information between early and late array versions, which makes the design of long-term studies very difficult.
Another problem with currently available microarrays is that different forms of a disease may respond differently to treatment with different therapeutic agents. The usefulness of an array is limited by how well the specific diseased tissue is represented on it. Conventional whole-genome arrays are therefore disadvantageous, because extraneous signal from genes unrelated to the disease state produces a high volume of assay noise, which complicates analysis of the disease transcriptome.
Traditional generic arrays provide limited information across many gene types. They do not, however, contain detailed content on the specific transcripts expressed in a given discrete setting. The general approach of the generic microarray industry has been to increase the density and volume of information as more information becomes available. This has caused confusion in the use of the technology in pharmacogenomics-based research. The main problem is the difficulty of comparing different configurations of generic arrays; that is, it is difficult to correlate data derived from a 20k-sequence array with data derived from a 40k-sequence array. This confusion arises from problems with differing annotation and controls.
Summary of the invention
The invention provides arrays comprising biomolecules corresponding to the transcriptome of a diseased tissue, and methods of using these arrays in assays. Described herein are arrays comprising nucleic acid molecules corresponding to the transcriptome of a diseased tissue, and methods of using these arrays in assays. A diseased-tissue transcriptome is the collection of nucleic acid transcripts, including coding and non-coding nucleic acid sequences, that are expressed in a specific diseased tissue. Also described herein are arrays comprising other biomolecules corresponding to the transcriptome of a diseased tissue. These biomolecules include proteins, polypeptides and antibodies. The arrays provide a powerful tool for studying the global expression profile of diseased tissue and for identifying new transcripts associated with disease states.
The microarrays described herein solve the problems encountered with previous arrays by adopting a unique approach: defining the complete transcriptome information content in a given disease setting and placing that content on an array. The complete information content is derived from multiple diseased tissue samples at different stages of disease progression, and so encompasses population and disease heterogeneity. This approach ensures that all relevant information in a given disease setting is available for interrogation, which greatly increases the potential for developing robust signatures that are diagnostic, prognostic or predictive of response to therapy in that setting. In addition, the approach yields an array with complete information content that does not require repeated updates, and thus lends itself to stable long-term study design. Moreover, because the approach presents a complete and stable platform, it facilitates cross-validation studies between multiple patient cohorts in a given disease setting.
Disease-specific transcriptome arrays contain the complete information content in a given disease setting and therefore present a stable, long-term solution for pharmacogenomics-based study design.
In one aspect of the methods provided herein, a transcriptome array is used to diagnose a disease by identifying the genetic profile of a diseased tissue sample from a patient. The genetic profile is identified by reacting transcripts from the diseased or suspected diseased tissue sample with the transcriptome array. Hybridization or binding of the transcripts to complementary sequences on the array is then detected. Preferably, the transcriptome array is immobilized on a computer chip, and hybridization of the sample's nucleic acid molecules to the array is detected using computer technology. The genetic profile of the diseased tissue sample is then correlated with data on the effectiveness of specific therapeutic agents against, and the responsiveness to them of, tissue with that profile. Correlating the resulting expression profiles with the effectiveness of therapeutic agents provides a way to screen and select further patients predicted to respond to a particular therapeutic agent, thereby minimizing needless exposure of patients to unsuccessful therapy.
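By way of illustration only, the correlation step in the preceding paragraph might be sketched as follows. The sketch assumes a simple nearest-centroid rule using Pearson correlation, which is one possible approach and is not stated in this specification; the function names and all intensity values are hypothetical.

```python
from math import sqrt

def pearson(x, y):
    """Pearson correlation between two equal-length expression vectors."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

def centroid(profiles):
    """Per-probe mean across a group of expression profiles."""
    return [sum(col) / len(col) for col in zip(*profiles)]

def predict_response(profile, responders, non_responders):
    """Assign the label of the reference group whose centroid correlates best."""
    r_resp = pearson(profile, centroid(responders))
    r_non = pearson(profile, centroid(non_responders))
    return "responder" if r_resp > r_non else "non-responder"

# Hypothetical log-intensity values for four probes (illustrative only).
responders = [[8.1, 2.0, 7.5, 1.2], [7.9, 2.4, 7.8, 1.0]]
non_responders = [[2.2, 7.7, 1.5, 8.0], [2.5, 8.1, 1.1, 7.6]]
new_patient = [7.5, 2.9, 7.0, 1.8]

print(predict_response(new_patient, responders, non_responders))  # responder
```

A real analysis would involve normalization, many more probes and a validated statistical model; the sketch only shows how a new profile can be compared against profiles with known treatment outcomes.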
Another aspect of the methods of the invention includes the use of the transcriptome (for example, in the array analyses described herein) to detect early-stage diseases and conditions of an organism that are undetectable by other methods. Such organisms include humans, animals, plants and bacteria.
The arrays described herein and the methods of using them provide for the detection, monitoring and identification of numerous diseases and conditions using the transcriptome. Diseases can generally be classified as neoplastic, inflammatory or degenerative. These classes include, without limitation, diseases such as cancer, arthritis, asthma, neurodegenerative disease, cardiovascular disease, hypertension, psychiatric disorders, infectious disease, metabolic disease and immune disease.
In one embodiment, the transcriptome array provides what is believed to be the most complete compilation of the colorectal transcriptome identified to date. Approximately 69,000 transcripts derived from colorectal tissue were compiled and used to generate a colorectal transcriptome-based high-density oligonucleotide array. Approximately 40,000 of these transcripts are described in U.S. Provisional Patent Application No. 60/662,276. A further approximately 23,000 transcripts derived from colorectal tissue and approximately 5,000 antisense transcripts are described herein, supplementing the colorectal transcriptome sequences described in U.S. Provisional Patent Application No. 60/662,276.
The transcriptomes provided herein for use in arrays are believed to be the most complete versions of the lung, breast, colon/rectum, liver and brain tissue transcriptomes identified to date. The transcripts compiled herein are used to generate transcriptome-based high-density oligonucleotide arrays for diseased lung, breast, colon/rectum, liver and brain tissue.
Thus, the arrays described herein provide a large amount of information on the substantial changes that may underlie disease progression or treatment resistance.
Pharmacogenomics has the potential to greatly reduce the estimated 100,000 deaths and 2,000,000 hospitalizations caused by adverse drug reactions (ADRs) in the United States (Lazarou et al. JAMA. Apr 15, 1998. 279(15):1200-5). Rather than using standard trial and error to match patients with drugs, the arrays and assays described herein enable a physician to analyze the genetic profile of a patient sample and prescribe the most suitable drug therapy for that patient from the initial diagnosis. The arrays described herein not only provide a means of improving the accuracy of prescribing an effective drug the first time, but also improve safety, because the likelihood of an ADR is reduced.
Accordingly, an object of the present invention is to provide nucleic acid arrays comprising genes, polynucleotides, nucleotides and fragments from diseased tissue for screening the expression of disease-related genes in a target sample.
Another object of the present invention is to provide methods for identifying new nucleic acid transcripts expressed in diseased tissue.
Another object of the present invention is to provide methods for screening tissue for genetic variation indicative of the presence of a disease or condition that is undetectable by other methods.
Another object of the present invention is to provide methods for diagnosing disease based on analysis of the transcriptome in diseased tissue.
Another object of the present invention is to provide methods for the complete analysis of RNA expression changes affecting all genes or transcripts identified in a given disease.
Another object of the present invention is to provide methods for characterizing the expression profile of specific genes/RNAs of an individual in diseased tissue and correlating RNA expression with an appropriate and effective treatment regimen.
Another object of the present invention is to provide methods for distinguishing different forms of a disease and correlating expression profiles with successful therapeutic regimens.
Another object of the present invention is to provide methods for correlating expression profiles with appropriate therapeutic regimens.
Another object of the present invention is to provide methods for predicting relapse after cancer therapy.
These and other objects, features and advantages of the present invention will become apparent after a review of the following detailed description of the disclosed embodiments and the appended claims.
Brief description of the drawings
Fig. 1: A diagram of a transcriptome microarray showing the expression profiles of a therapeutic-agent-sensitive tumor and a therapeutic-agent-resistant tumor.
Fig. 2: A schematic of a BLAST comparison of all publicly available colon, prostate and breast tissue data.
Detailed Description Of The Invention
Provided herein are transcriptome arrays and methods of using them. Described are transcriptome arrays comprising nucleic acid molecules derived from transcripts of diseased tissue, wherein the nucleic acid molecules are arranged in an array format. The nucleic acid molecules on the array hybridize to complementary nucleic acid transcriptome sequences from a diseased tissue sample. A disease-specific transcriptome is defined herein as the collection of coding and non-coding transcripts transcribed in a specific diseased tissue. Other arrays described herein comprise other biomolecules, such as polypeptides or antibodies representing transcripts in the diseased tissue transcriptome.
Thus, the arrays provided herein include nucleic acid arrays, polypeptide arrays and antibody arrays. Unless the context requires otherwise, where a nucleic acid array is referred to in a particular embodiment, the corresponding protein arrays and antibody arrays should be understood to be contemplated as well. In such embodiments, the nucleic acid is replaced by the polypeptide encoded by the transcript, or by an antibody specific for that polypeptide.
The compositions and methods described herein may be understood more readily by reference to the following detailed description of particular embodiments. Although the compositions and methods are described with reference to specific details of certain embodiments, these details should not be construed as limitations on the scope of the invention.
Those skilled in the art will appreciate that cellular DNA in the form of genes is transcribed into RNA; coding RNA is translated into protein; and RNA may optionally be reverse transcribed into cDNA. Preferably, the transcriptome arrays described herein comprise all or substantially all of the RNA transcripts of a diseased tissue.
A disease-specific transcriptome comprises transcripts of known and unknown function, and optionally includes the proteins translated from coding RNA transcripts as an extension and reflection of gene transcription within the transcriptome. A disease-specific transcriptome may change as the disease progresses or in response to external stimuli or influences such as chemotherapy or radiotherapy.
As used herein, the term "transcript" means an RNA molecule produced by transcription from a DNA or cDNA template. A transcript may also be represented by the protein translated from the RNA transcript or by a cDNA molecule formed by reverse transcription of the RNA transcript.
As used herein, the term "gene product" means an RNA molecule produced by transcription from a DNA or cDNA template, and the polypeptide molecule translated from that RNA molecule.
As used herein, the term "transcriptome" refers to the collection of coding and non-coding RNA transcripts transcribed in a specific tissue, and preferably comprises all or substantially all of the RNA transcripts produced in the tissue. These transcripts include messenger RNA (mRNA), alternatively spliced mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), and numerous other transcripts that are not translated into protein, such as small nuclear RNA (snRNA), antisense molecules such as short interfering RNA (siRNA) and microRNA, and other RNA transcripts of unknown function. The transcriptome also includes the proteins translated from RNA transcripts within the transcriptome, as an extension and reflection of gene transcription within the transcriptome.
As used herein, the term "diseased tissue" means tissue from a specific organ or tissue type that has a particular disease classification associated with the tissue (such as colorectal cancer, breast cancer, neurodegenerative disease, etc.). Diseased tissue also refers to a single cell type from diseased tissue, such as epithelial cells, stromal cells or stem cells. For example, diseased colorectal tissue refers to any colorectal tissue diagnosed with a disease or condition such as cancer. Although a distinction between cancer types is made in certain embodiments, in most embodiments of the transcriptome arrays of the invention no deliberate attempt is made to distinguish between the various cancer types within a tissue.
In addition, it is understood that a sample of diseased tissue may contain some normal, non-diseased tissue or cells sampled along with the diseased tissue.
Nucleic acid
The nucleic acid molecules, nucleic acid elements or polynucleotides included in the arrays provided herein may be any type of nucleic acid or nucleic acid analog, including without limitation RNA, DNA, peptide nucleic acids, or mixtures and/or fragments thereof. As used herein, the term "fragment" refers to a partial sequence of sequences such as those provided herein, where the fragment retains sufficient nucleotide sequence to preserve the specificity and selectivity of the full sequence from which it is derived. A fragment may be complementary to the full sequence and retains the ability to hybridize selectively to it. The nucleic acid molecules may be isolated, cloned or synthetically prepared. A nucleic acid element may comprise vector sequence or may be substantially pure. Under conventional hybridization conditions, a nucleic acid element can hybridize to a complementary transcript in a nucleic acid sample comprising transcript-specific molecules or elements derived from a tissue sample. Those of ordinary skill in the art can select hybridization elements that provide optimal hybridization and signal production for a given hybridization, and that provide the required resolution among different genes and genomic locations.
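As a minimal sketch of the complementarity relationship described above, the following assumes DNA sequences written 5' to 3' and treats a probe as matching only when it is the exact reverse complement of a stretch of the transcript. This is a simplification, since real hybridization depends on conditions and tolerates some mismatches. The sequences and function names are made up for illustration.

```python
# Watson-Crick pairing for DNA bases.
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(COMPLEMENT)[::-1]

def probe_matches(probe, transcript):
    """True if the probe is the exact reverse complement of part of the transcript."""
    return reverse_complement(probe) in transcript.upper()

transcript = "ATGGCGTACCTGAAGTTC"      # made-up transcript sequence
fragment = transcript[3:11]             # partial sequence: GCGTACCT
probe = reverse_complement(fragment)    # probe complementary to the fragment

print(probe)                                   # AGGTACGC
print(probe_matches(probe, transcript))        # True
print(probe_matches("AAAAAAAA", transcript))   # False
```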
The following transcript lists provide sequences specific to particular diseased tissues. The lists are summarized in Table 1 below. As used in the table and throughout the specification, the term "list of genes" means "list of nucleic acid transcripts" and includes both coding and non-coding regions.
Table 1: Summary of the transcript lists in the sequence listing
Tissue / list of genes | Number of sequences | Sequence listing range |
Colorectal sequences | | |
List of genes A | 16,350 | SEQ ID NO:1 to SEQ ID NO:16,350 |
List of genes B | 2,773 | SEQ ID NO:16,351 to SEQ ID NO:19,123 |
List of genes C | 1,805 | SEQ ID NO:19,124 to SEQ ID NO:20,928 |
List of genes D | 1,318 | SEQ ID NO:20,929 to SEQ ID NO:22,246 |
List of genes E | 10,556 | SEQ ID NO:22,247 to SEQ ID NO:32,802 |
List of genes F | 7,134 | SEQ ID NO:32,803 to SEQ ID NO:39,936 |
List of genes G | 22,376 | SEQ ID NO:39,937 to SEQ ID NO:62,312 |
List of genes H | 5,672 | SEQ ID NO:62,313 to SEQ ID NO:67,984 |
Lung sequences | | |
List of genes I | 36,431 | SEQ ID NO:67,985 to SEQ ID NO:104,415 |
List of genes J | 24 | SEQ ID NO:104,416 to SEQ ID NO:104,439 |
List of genes K | 22 | SEQ ID NO:104,440 to SEQ ID NO:104,461 |
List of genes L | 9,727 | SEQ ID NO:104,462 to SEQ ID NO:114,188 |
List of genes M | 5,208 | SEQ ID NO:114,189 to SEQ ID NO:119,396 |
List of genes N | 452 | SEQ ID NO:119,397 to SEQ ID NO:119,848 |
List of genes O | 42,790 | SEQ ID NO:119,849 to SEQ ID NO:162,638 |
Breast sequences | | |
List of genes P | 17,291 | SEQ ID NO:162,639 to SEQ ID NO:179,929 |
List of genes Q | 3,278 | SEQ ID NO:179,930 to SEQ ID NO:183,207 |
List of genes R | 6,915 | SEQ ID NO:183,208 to SEQ ID NO:190,122 |
List of genes S | 4,857 | SEQ ID NO:190,123 to SEQ ID NO:194,979 |
List of genes T | 34,141 | SEQ ID NO:194,980 to SEQ ID NO:229,120 |
List of genes U | 3,911 | SEQ ID NO:229,121 to SEQ ID NO:233,031 |
List of genes V | 16,666 | SEQ ID NO:233,032 to SEQ ID NO:249,697 |
Liver sequences | | |
List of genes W | 24,744 | SEQ ID NO:249,698 to SEQ ID NO:274,441 |
List of genes X | 13 | SEQ ID NO:274,442 to SEQ ID NO:274,454 |
List of genes Y | 32 | SEQ ID NO:274,455 to SEQ ID NO:274,486 |
List of genes Z | 6,565 | SEQ ID NO:274,487 to SEQ ID NO:281,051 |
List of genes AA | 14,789 | SEQ ID NO:281,052 to SEQ ID NO:295,840 |
List of genes BB | 11,851 | SEQ ID NO:295,841 to SEQ ID NO:307,691 |
List of genes CC | 39,979 | SEQ ID NO:307,692 to SEQ ID NO:347,670 |
Brain sequences | | |
List of genes DD | 33,275 | SEQ ID NO:347,671 to SEQ ID NO:380,945 |
List of genes EE | 5 | SEQ ID NO:380,946 to SEQ ID NO:380,950 |
List of genes FF | 341 | SEQ ID NO:380,951 to SEQ ID NO:381,291 |
List of genes GG | 8,486 | SEQ ID NO:381,292 to SEQ ID NO:389,777 |
List of genes HH | 19,081 | SEQ ID NO:389,778 to SEQ ID NO:408,858 |
List of genes II | 21,845 | SEQ ID NO:408,859 to SEQ ID NO:430,703 |
List of genes JJ | 53,293 | SEQ ID NO:430,704 to SEQ ID NO:483,996 |
The sequences in each of Lists of genes A-JJ are included on the CD-R accompanying this specification and are incorporated herein by reference in their entirety.
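The internal consistency of Table 1 can be checked arithmetically: the number of sequences in each list should equal last minus first plus one, and each list should begin at the SEQ ID NO immediately following the previous list. The sketch below applies both checks to the lung rows (Lists of genes I to O) as transcribed from the table; the function name and row layout are illustrative choices, not part of the specification.

```python
# (list name, stated number of sequences, first SEQ ID NO, last SEQ ID NO)
LUNG_LISTS = [
    ("I", 36431, 67985, 104415),
    ("J", 24, 104416, 104439),
    ("K", 22, 104440, 104461),
    ("L", 9727, 104462, 114188),
    ("M", 5208, 114189, 119396),
    ("N", 452, 119397, 119848),
    ("O", 42790, 119849, 162638),
]

def check_lists(rows):
    """Return a list of inconsistencies between stated counts and SEQ ID ranges."""
    problems = []
    prev_last = None
    for name, count, first, last in rows:
        span = last - first + 1  # inclusive range size
        if span != count:
            problems.append(f"List {name}: range holds {span}, stated {count}")
        if prev_last is not None and first != prev_last + 1:
            problems.append(f"List {name}: starts at {first}, expected {prev_last + 1}")
        prev_last = last
    return problems

print(check_lists(LUNG_LISTS))  # []
```

The same check can be run on the rows for the other tissues to detect transcription errors in counts or range boundaries.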
Transcripts in diseased colorectal tissue
List of genes A (SEQ ID NO:1 to SEQ ID NO:16,350)
Provided herein is a set of 16,350 transcripts previously identified as expressed in colorectal tissue.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 4,000 of the nucleic acid molecules listed in List of genes A. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6,000, 8,000, 10,000, 12,000, 14,000 or 16,000 of the sequences listed in List of genes A.
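The "at least N of the listed sequences" language used in this and the following embodiments can be read as a coverage count. The sketch below assumes each probe on an array is annotated with the SEQ ID NO of the transcript it is complementary to; that annotation scheme, the function name and the example array design are hypothetical.

```python
def meets_embodiment(probe_targets, gene_list_ids, minimum):
    """Count distinct listed sequences covered by the array and compare to a threshold.

    probe_targets: SEQ ID NOs to which the array's probes are complementary.
    gene_list_ids: SEQ ID NOs belonging to the list of genes.
    """
    covered = set(probe_targets) & set(gene_list_ids)
    return len(covered) >= minimum, len(covered)

gene_list_a = range(1, 16351)  # SEQ ID NO:1 to SEQ ID NO:16,350

# Hypothetical array design: 4,500 probes against List A plus two others.
probe_targets = list(range(1, 4501)) + [20000, 20001]

print(meets_embodiment(probe_targets, gene_list_a, 4000))  # (True, 4500)
print(meets_embodiment(probe_targets, gene_list_a, 6000))  # (False, 4500)
```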
List of genes B (SEQ ID NO:16,351 to SEQ ID NO:19,123)
Described herein is a set of 2,773 transcripts that correspond neither to the publicly available expressed sequence tag (EST) libraries generated from rectal cancer nor to annotated genes in GenBank. These genes are newly identified herein.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 1,000 of the nucleic acid molecules listed in List of genes B. In another embodiment, the array comprises nucleic acid molecules complementary to at least 50, 100, 500, 1,000, 1,500, 2,000 or 2,500 of the sequences listed in List of genes B.
List of genes C (SEQ ID NO:19,124 to SEQ ID NO:20,928)
cDNA libraries were generated from diseased human colorectal tissue, and high-throughput sequencing identified herein 1,805 nucleotide sequences, which have also previously been identified as expressed in colorectal cancer tissue.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 500 of the nucleic acid molecules listed in List of genes C. In another embodiment, the array comprises nucleic acid molecules complementary to at least 50, 200, 500, 750, 1,000, 1,400 or 1,750 of the sequences listed in List of genes C.
List of genes D (SEQ ID NO:20,929 to SEQ ID NO:22,246)
Alternative pre-mRNA splicing is a major cellular process by which functionally distinct proteins are produced from the primary transcript of a single gene, often in a tissue-specific pattern.
Newly identified herein is a set of 1,318 nucleotide sequences that exist (are expressed) in colorectal cancer tissue as significantly altered (spliced) forms of previously annotated genes or ESTs. Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 500 of the nucleic acid molecules listed in List of genes D. In another embodiment, the array comprises nucleic acid molecules complementary to at least 50, 100, 250, 500, 750, 1,000 or 1,250 of the sequences listed in List of genes D.
List of genes E (SEQ ID NO:22,247 to SEQ ID NO:32,802)
cDNA libraries were constructed from diseased human colorectal tissue, and 10,556 nucleotide sequences were identified herein that had not previously been identified as expressed in colorectal cancer tissue.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 500 of the nucleic acid molecules listed in List of genes E. In another embodiment, the array comprises nucleic acid molecules complementary to at least 1,000, 2,000, 5,000 or 10,000 of the sequences listed in List of genes E.
List of genes F (SEQ ID NO:32,803 to SEQ ID NO:39,936)
cDNA libraries were constructed from diseased human colorectal tissue, and 7,134 nucleotide sequences were identified herein that had not previously been identified as expressed in colorectal cancer tissue.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 500 of the nucleic acid molecules listed in List of genes F. In another embodiment, the array comprises nucleic acid molecules complementary to at least 1,000, 2,500, 5,000 or 7,000 of the sequences listed in List of genes F.
List of genes G (SEQ ID NO:39,937 to SEQ ID NO:62,312)
Identified herein is a set of 22,376 nucleotide sequences that had not previously been identified as expressed in colorectal cancer tissue.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 4,000 of the nucleic acid molecules listed in List of genes G. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6,000, 8,000, 10,000, 12,000, 14,000, 16,000 or 19,000 of the sequences listed in List of genes G.
List of genes H (SEQ ID NO:62,313 to SEQ ID NO:67,984)
Newly identified herein is a set of 5,672 nucleotide sequences that constitute antisense and corresponding reverse-complement transcripts.
The inclusion of antisense transcripts and their corresponding sense transcripts is an important feature of the arrays. Commercially available generic arrays focus mainly on detecting sense transcripts that encode proteins. With growing interest in the role of endogenous antisense RNA transcripts in cancer and other diseases, antisense sequences in the colorectal transcriptome have now been identified.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 2,000 of the nucleic acid molecules listed in List of genes H. In another embodiment, the array comprises nucleic acid molecules complementary to at least 3,000, 4,000 or 5,000 of the sequences listed in List of genes H.
Transcripts in diseased lung tissue
List of genes I (SEQ ID NO:67,985 to SEQ ID NO:104,415)
Provided herein is a set of 36,431 transcripts previously shown to be involved in lung cancer.
Accordingly, in one embodiment, an array is provided comprising nucleic acid molecules complementary to at least 4,000 of the nucleic acid molecules listed in List of genes I. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6,000, 8,000, 15,000, 20,000, 30,000 or 35,000 of the sequences listed in List of genes I.
List of genes J (SEQ ID NO:104,416 to SEQ ID NO:104,439)
Described herein is a set of 24 transcripts that do not match publicly available EST libraries prepared from cancerous lung tissue or annotated genes in Genbank. These genes are newly identified herein.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 5 of the nucleic acid molecules listed in list of genes J. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6, 10, 15, 18, 20 or 22 of the sequences listed in list of genes J.
List of genes K (SEQ ID NO:104,440 to SEQ ID NO:104,461)
Identified herein by high-throughput sequencing is a set of 22 expressed sequence tags not previously reported to be expressed in lung tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 5 of the nucleic acid molecules listed in list of genes K. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6, 10, 15, 18 or 20 of the sequences listed in list of genes K.
List of genes L (SEQ ID NO:104,462 to SEQ ID NO:114,188)
Newly identified herein is a set of 9,727 transcripts identified as containing sequences that exist as significantly altered (spliced) forms of previously annotated lung cancer-associated genes or ESTs.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 3,000 of the nucleic acid molecules listed in list of genes L. In another embodiment, the array comprises nucleic acid molecules complementary to at least 4,000, 5,000, 7,000 or 9,000 of the sequences listed in list of genes L.
List of genes M (SEQ ID NO:114,189 to SEQ ID NO:119,396)
Identified herein is a set of 5,208 annotated genes that are newly identified as being expressed in diseased lung tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 2,500 of the nucleic acid molecules listed in list of genes M. In another embodiment, the array comprises nucleic acid molecules complementary to at least 3,000, 4,000 or 5,000 of the sequences listed in list of genes M.
List of genes N (SEQ ID NO:119,397 to SEQ ID NO:119,848)
Identified herein is a set of 452 transcripts that are single-copy EST nucleotide sequences; these transcripts are expressed in cancerous lung tissue and had not previously been identified as annotated genes.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 200 of the nucleic acid molecules listed in list of genes N. In another embodiment, the array comprises nucleic acid molecules complementary to at least 250, 300, 350 or 400 of the sequences listed in list of genes N.
List of genes O (SEQ ID NO:119,849 to SEQ ID NO:162,638)
Newly identified herein is a set of 42,790 transcripts that constitute the antisense and corresponding reverse-complement transcripts of sequences expressed in cancerous lung tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 20,000 of the nucleic acid molecules listed in list of genes O. In another embodiment, the array comprises nucleic acid molecules complementary to at least 25,000, 30,000, 35,000 or 40,000 of the sequences listed in list of genes O.
List of genes P (SEQ ID NO:162,639 to SEQ ID NO:179,929)
Provided herein is a set of 17,291 expressed sequence tags previously shown to be expressed in breast cancer tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 3,000 of the nucleic acid molecules listed in list of genes P. In another embodiment, the array comprises nucleic acid molecules complementary to at least 4,000, 5,000, 7,000, 10,000, 12,000, 15,000 or 17,000 of the sequences listed in list of genes P.
List of genes Q (SEQ ID NO:179,930 to SEQ ID NO:183,207)
Described herein is a set of 3,278 transcripts that do not match publicly available EST libraries prepared from breast cancer tissue or annotated genes in Genbank. These genes are newly identified herein.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 1,000 of the nucleic acid molecules listed in list of genes Q. In another embodiment, the array comprises nucleic acid molecules complementary to at least 4,000 or 6,000 of the sequences listed in list of genes Q.
List of genes R (SEQ ID NO:183,208 to SEQ ID NO:190,122)
Identified herein by high-throughput sequencing is a set of 6,915 transcripts not previously reported to be expressed in diseased breast tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 2,000 of the nucleic acid molecules listed in list of genes R. In another embodiment, the array comprises nucleic acid molecules complementary to at least 4,000 or 6,000 of the sequences listed in list of genes R.
List of genes S (SEQ ID NO:190,123 to SEQ ID NO:194,979)
Newly identified herein is a set of 4,857 transcripts identified as containing sequences that exist in diseased breast tissue as significantly altered (spliced) forms of previously annotated genes or ESTs.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 1,000 of the nucleic acid molecules listed in list of genes S. In another embodiment, the array comprises nucleic acid molecules complementary to at least 2,000 or 4,000 of the sequences listed in list of genes S.
List of genes T (SEQ ID NO:194,980 to SEQ ID NO:229,120)
Identified herein is a set of 34,141 transcripts expressed in breast tissue. These transcripts had not previously been confirmed as expressed in breast cancer tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 10,000 of the nucleic acid molecules listed in list of genes T. In another embodiment, the array comprises nucleic acid molecules complementary to at least 15,000, 20,000, 25,000 or 30,000 of the sequences listed in list of genes T.
List of genes U (SEQ ID NO:229,121 to SEQ ID NO:233,031)
Identified herein is a set of 3,911 transcripts that are single-copy EST nucleotide sequences; these transcripts are expressed in breast cancer tissue and had not previously been identified as annotated genes.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 1,000 of the nucleic acid molecules listed in list of genes U. In another embodiment, the array comprises nucleic acid molecules complementary to at least 1,500, 2,000, 2,500 or 3,000 of the sequences listed in list of genes U.
List of genes V (SEQ ID NO:233,032 to SEQ ID NO:249,697)
Newly identified herein is a set of 16,666 transcripts that constitute the antisense transcripts, and the corresponding sense transcripts, of sequences expressed in breast cancer tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 8,000 of the nucleic acid molecules listed in list of genes V. In another embodiment, the array comprises nucleic acid molecules complementary to at least 10,000, 12,000, 14,000 or 16,000 of the sequences listed in list of genes V.
Transcripts in diseased liver tissue
List of genes W (SEQ ID NO:249,698 to SEQ ID NO:274,441)
Provided herein is a set of 24,744 transcripts previously identified as expressed in hepatitis-associated liver tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 4,000 of the nucleic acid molecules listed in list of genes W. In another embodiment, the array comprises nucleic acid molecules complementary to at least 6,000, 8,000, 10,000, 12,000, 14,000, 16,000, 19,000 or 21,000 of the sequences listed in list of genes W.
List of genes X (SEQ ID NO:274,442 to SEQ ID NO:274,454)
Described herein is a set of 13 transcripts that do not match publicly available EST libraries prepared from hepatitis-associated liver tissue or annotated genes in Genbank. These genes are newly identified herein.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 8 of the nucleic acid molecules listed in list of genes X. In another embodiment, the array comprises nucleic acid molecules complementary to at least 10 or 12 of the sequences listed in list of genes X.
List of genes Y (SEQ ID NO:274,455 to SEQ ID NO:274,486)
Identified herein by high-throughput screening is a set of 32 transcripts not previously reported to be expressed in hepatitis-associated liver tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 15 of the nucleic acid molecules listed in list of genes Y. In another embodiment, the array comprises nucleic acid molecules complementary to at least 20, 25 or 30 of the sequences listed in list of genes Y.
List of genes Z (SEQ ID NO:274,487 to SEQ ID NO:281,051)
Identified herein is a set of 6,565 transcripts that exist as significantly altered (spliced) forms of previously annotated genes or ESTs and are expressed in hepatitis-associated liver tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 3,000 of the nucleic acid molecules listed in list of genes Z. In another embodiment, the array comprises nucleic acid molecules complementary to at least 4,000, 5,000 or 6,000 of the sequences listed in list of genes Z.
List of genes AA (SEQ ID NO:281,052 to SEQ ID NO:295,840)
Newly identified herein is a set of 14,789 transcripts expressed in hepatitis-associated liver tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 8,000 of the nucleic acid molecules listed in list of genes AA. In another embodiment, the array comprises nucleic acid molecules complementary to at least 8,000, 10,000, 12,000 or 14,000 of the sequences listed in list of genes AA.
List of genes BB (SEQ ID NO:295,841 to SEQ ID NO:307,691)
Identified herein is a set of 11,851 transcripts that are single-copy EST nucleotide sequences; these transcripts are expressed in hepatitis-associated liver tissue and had not previously been identified as annotated genes.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 6,000 of the nucleic acid molecules listed in list of genes BB. In another embodiment, the array comprises nucleic acid molecules complementary to at least 8,000 or 10,000 of the sequences listed in list of genes BB.
List of genes CC (SEQ ID NO:307,692 to SEQ ID NO:347,670)
Newly identified herein is a set of 39,979 transcripts that constitute the antisense transcripts, and the corresponding sense transcripts, of sequences expressed in hepatitis-associated liver tissue.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 20,000 of the nucleic acid molecules listed in list of genes CC. In another embodiment, the array comprises nucleic acid molecules complementary to at least 25,000, 30,000 or 35,000 of the sequences listed in list of genes CC.
Transcripts in diseased brain tissue
List of genes DD (SEQ ID NO:347,671 to SEQ ID NO:380,945)
Provided herein is a set of 33,275 transcripts previously identified as expressed in brain tissue associated with neurodegenerative disease.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 15,000 of the nucleic acid molecules listed in list of genes DD. In another embodiment, the array comprises nucleic acid molecules complementary to at least 20,000, 25,000 or 30,000 of the sequences listed in list of genes DD.
List of genes EE (SEQ ID NO:380,946 to SEQ ID NO:380,950)
Newly identified herein is a set of 5 transcripts containing sequences that do not match publicly available EST libraries prepared from brain tissue associated with neurodegenerative disease or annotated genes in Genbank. These genes are newly identified herein.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 3 of the nucleic acid molecules listed in list of genes EE.
List of genes FF (SEQ ID NO:380,951 to SEQ ID NO:381,291)
Identified herein by high-throughput sequencing is a set of 341 transcripts not previously reported to be expressed in brain tissue associated with neurodegenerative disease.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 150 of the nucleic acid molecules listed in list of genes FF. In another embodiment, the array comprises nucleic acid molecules complementary to at least 200 or 300 of the sequences listed in list of genes FF.
List of genes GG (SEQ ID NO:381,292 to SEQ ID NO:389,777)
Newly identified herein is a set of 8,486 transcripts that exist as significantly altered (spliced) forms of previously annotated genes or ESTs and are expressed in brain tissue associated with neurodegenerative disease.
List of genes HH (SEQ ID NO:389,778 to SEQ ID NO:408,858)
Provided herein is a newly identified set of 19,081 transcripts expressed in brain tissue associated with neurodegenerative disease.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 8,000 of the nucleic acid molecules listed in list of genes HH. In another embodiment, the array comprises nucleic acid molecules complementary to at least 12,000, 15,000, 17,000 or 19,000 of the sequences listed in list of genes HH.
List of genes II (SEQ ID NO:408,859 to SEQ ID NO:430,703)
Identified herein is a set of 21,845 transcripts that are single-copy EST nucleotide sequences; these transcripts are expressed in brain tissue associated with neurodegenerative disease and had not previously been identified as annotated genes.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 10,000 of the nucleic acid molecules listed in list of genes II. In another embodiment, the array comprises nucleic acid molecules complementary to at least 12,000, 15,000, 17,000 or 20,000 of the sequences listed in list of genes II.
List of genes JJ (SEQ ID NO:430,704 to SEQ ID NO:483,996)
Newly identified herein is a set of 53,293 transcripts that constitute the antisense transcripts, and the corresponding sense transcripts, of sequences expressed in brain tissue associated with neurodegenerative disease.
Therefore, in one embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 30,000 of the nucleic acid molecules listed in list of genes JJ. In another embodiment, the array comprises nucleic acid molecules complementary to at least 35,000, 40,000, 45,000 or 50,000 of the sequences listed in list of genes JJ.
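The size stated for each list follows from its SEQ ID NO range, since both endpoints are inclusive. The following is a minimal Python sketch of that arithmetic, using four ranges copied from the list headings above. The names `GENE_LIST_RANGES`, `list_size` and `meets_threshold` are illustrative assumptions and do not appear in the disclosure.

```python
# Illustrative sketch only: ranges copied from four of the list headings above.
GENE_LIST_RANGES = {
    "H": (62_313, 67_984),
    "I": (67_985, 104_415),
    "J": (104_416, 104_439),
    "JJ": (430_704, 483_996),
}


def list_size(name):
    """Number of sequences in a list; first and last SEQ ID NO are both counted."""
    first, last = GENE_LIST_RANGES[name]
    return last - first + 1


def meets_threshold(name, n_represented, minimum):
    """True if n_represented sequences satisfy an 'at least <minimum>' embodiment.

    Raises ValueError when the threshold is larger than the list itself,
    because such an embodiment could never be satisfied.
    """
    if minimum > list_size(name):
        raise ValueError(f"threshold {minimum} exceeds size of list {name}")
    return n_represented >= minimum


if __name__ == "__main__":
    for name in GENE_LIST_RANGES:
        print(name, list_size(name))
```

With these ranges the computed sizes are 5,672 (H), 36,431 (I), 24 (J) and 53,293 (JJ), matching the counts stated in the text for those lists.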
Array
As mentioned above, the transcript lists provided herein can be used to prepare diseased-tissue transcriptome arrays by using nucleic acid molecules complementary to the sequences provided herein. The terms "array" and "microarray" are used interchangeably herein. Those skilled in the art commonly use the latter term to denote a class of miniature arrays associated with computer chips. As used herein, the term "tissue-specific element" refers to a biomolecule attached to the array that is specific for a transcript-specific element derived from the diseased target sample, and includes nucleic acids, polypeptides and antibody molecules.
List of genes A-H provides transcript sequences associated with diseased colorectal tissue. In one embodiment, an array is provided that comprises at least one nucleic acid molecule complementary to a diseased colorectal tissue transcript in list of genes B, C, D, E, F, G, H or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased colorectal tissue transcripts in list of genes B, C, D, E, F, G, H or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased colorectal tissue transcripts in each of list of genes B, C, D, E, F, G and H.
List of genes I-O provides transcript sequences associated with diseased lung tissue. In one embodiment, an array is provided that comprises nucleic acid molecules complementary to the diseased lung tissue transcripts in list of genes J, K, L, M, N, O or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased lung tissue transcripts in list of genes J, K, L, M, N, O or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example 80% or 90%, of the diseased lung tissue transcripts in each of list of genes J, K, L, M, N and O.
List of genes P-V provides transcript sequences associated with diseased breast tissue. In one embodiment, an array is provided that comprises nucleic acid molecules complementary to the diseased breast tissue transcripts in list of genes Q, R, S, T, U, V or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased breast tissue transcripts in list of genes Q, R, S, T, U, V or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased breast tissue transcripts in each of list of genes Q, R, S, T, U and V.
List of genes W-CC provides transcript sequences associated with diseased liver tissue. In one embodiment, an array is provided that comprises nucleic acid molecules complementary to the diseased liver tissue transcripts in list of genes X, Y, Z, AA, BB, CC or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased liver tissue transcripts in list of genes X, Y, Z, AA, BB, CC or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased liver tissue transcripts in each of list of genes X, Y, Z, AA, BB and CC.
List of genes DD-JJ provides transcript sequences associated with diseased brain tissue. In one embodiment, an array is provided that comprises nucleic acid molecules complementary to the diseased brain tissue transcripts in list of genes EE, FF, GG, HH, II, JJ or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased brain tissue transcripts in list of genes EE, FF, GG, HH, II, JJ or a combination thereof. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the diseased brain tissue transcripts in each of list of genes EE, FF, GG, HH, II and JJ.
In another embodiment, an array is provided that comprises nucleic acid molecules complementary to nucleotide sequences in list of genes A-H, J-O and Q-V, which derive from two or more different cancer tissues, so as to address multiple types of cancer. In another embodiment, an array is provided that comprises nucleic acid molecules complementary to at least 70%, for example at least 80% or at least 90%, of the transcripts in list of genes A-H, J-O and Q-V or a combination thereof.
Preferably, the array in each embodiment described herein comprises one or more of the nucleic acid molecules newly identified herein, or a combination of such newly identified nucleic acid molecules. Included are combinations containing nucleic acid molecules newly identified for a specific disease, for a type of disease, or for a broad range of diseases.
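The percentage embodiments above distinguish coverage of a combination of lists from coverage of each list individually. A minimal Python sketch of that distinction follows, assuming each list is represented as a collection of SEQ ID NO integers and the array as the set of SEQ ID NOs for which it carries a complementary nucleic acid molecule. The function names and the toy data are hypothetical.

```python
def coverage(list_ids, probed_ids):
    """Fraction of a list's transcripts that have a complementary probe on the array."""
    list_ids = set(list_ids)
    if not list_ids:
        raise ValueError("gene list is empty")
    return len(list_ids & set(probed_ids)) / len(list_ids)


def covers_combination(gene_lists, probed_ids, minimum=0.70):
    """'At least 70% of the transcripts in lists ... or a combination thereof':
    the lists are pooled and coverage is measured over the pool."""
    pooled = set().union(*gene_lists.values())
    return coverage(pooled, probed_ids) >= minimum


def covers_each(gene_lists, probed_ids, minimum=0.70):
    """'At least 70% of the transcripts in each of lists ...':
    every list must reach the threshold on its own."""
    return all(coverage(ids, probed_ids) >= minimum for ids in gene_lists.values())


# Toy data (hypothetical): a small list X and a large list Y.
toy_lists = {"X": range(1, 11), "Y": range(11, 111)}
toy_probed = set(range(11, 111))  # every transcript of Y, none of X
```

On the toy data the pooled criterion is satisfied while the per-list criterion is not, because a large list can mask a small list that is entirely absent from the array.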
Expression
To obtain expression of a nucleotide sequence encoding a protein, the sequence is incorporated into a vector having one or more control sequences operably linked to the nucleic acid to control its expression. The vector optionally comprises other sequences, such as a promoter or enhancer to drive expression of the inserted nucleic acid, a nucleotide sequence so that the peptide is produced as a fusion protein, and/or a nucleic acid encoding a secretion signal so that the polypeptide produced in the host cell is secreted from the cell.
In one aspect of the invention, a vector comprising the isolated polynucleotide is provided.
In another aspect of the invention, a host cell comprising the vector is provided.
The peptide is obtained by transfecting a vector incorporating the specific nucleic acid sequence into a host cell in which the vector is functional; culturing the host cell to produce the peptide; and recovering the peptide from the host cell or the surrounding medium.
Thus, methods of producing the polypeptide are within the scope of the invention. The method comprises expressing the polypeptide from the encoding nucleic acid molecule. This is conveniently achieved by culturing a host cell containing the vector in culture medium under appropriate conditions that cause or allow expression of the polypeptide.
Vectors and host cells
Suitable vectors can be chosen or constructed to contain appropriate regulatory sequences, including but not limited to promoter sequences, terminator fragments, polyadenylation sequences, enhancer sequences, marker genes and other sequences as appropriate.
The vector may be a plasmid or a virus, such as a suitable phage or phagemid. For further details see, for example, MOLECULAR CLONING: A LABORATORY MANUAL, 2nd edition, Sambrook et al., 1989, Cold Spring Harbor Laboratory Press. Many known techniques and protocols for manipulating nucleic acids, for example for preparing nucleic acid constructs, mutagenesis, sequencing, introducing DNA into cells and gene expression, and for analyzing proteins, are described in detail in CURRENT PROTOCOLS IN MOLECULAR BIOLOGY, Ausubel et al. eds., John Wiley & Sons, 1992.
Systems for cloning and expressing polypeptides in a variety of host cells are well known. Suitable host cells include bacteria, eukaryotic cells such as mammalian cells and yeast, and baculovirus systems.
Accordingly, the invention further provides a host cell comprising a heterologous nucleic acid disclosed herein.
Array preparation
The transcript arrays described herein are designed and constructed using polynucleotides. In one embodiment, the nucleic acid elements are arranged to produce a single-transcriptome array, although the array may comprise elements corresponding to multiple transcriptomes where desired. A transcriptome may comprise a plurality of diseased-tissue transcripts from one disease or from multiple diseases. A disease-specific array comprises the transcripts transcribed in the transcriptome of a given disease.
For example, in colorectal cancer, these transcripts may be transcribed in a range of cell types found in the microenvironment of the colorectal tumor cells, and the cells may include, for example, stromal cells, epithelial cells, lymphocytes, endothelial cells, stem cells and the like. In another embodiment, through matrix interactions or the secretion of specific proteins, precancerous cells or cancerous tumor cells alter the expression of transcripts in their surrounding cells (such as the stroma, endothelium or lymphocytes found in the tumor microenvironment) and thereby give rise to transcripts characteristic of colorectal cancer, which are included on the disease-specific array. Moreover, when the disease-specific array is used as a tool to identify diagnostic, prognostic or predictive gene signatures, the actual signature may comprise transcripts derived from some or all of these individual cell populations.
The arrays provided herein can be used for any suitable purpose, such as, but not limited to, diagnosis, prognosis, drug therapy, drug screening and the like. For a given array, each nucleic acid element may be a complete sequence or a sequence split into fragments of different lengths. Not all fragments making up the complete sequence need be present on the array.
In one embodiment, tissue-specific nucleic acid elements representing transcripts and transcript fragments are immobilized at a plurality of physically separate sites on the array using nucleic acid immobilization or binding techniques known in the art. The fragments at the plurality of physically separate sites together make up the whole transcript or discrete portions of the transcript. The fragments may be complementary to contiguous portions or to non-contiguous portions of the transcript. Hybridization of nucleic acid molecules from the target sample to the fragments on the array indicates the presence of the target transcript in the sample. Hybridization, and its detection by methods routine in the art, are described in detail below.
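As a rough illustration of representing one transcript by fragments at several separate sites, the Python sketch below splits a sequence into fixed-length fragments. A step equal to the fragment length gives contiguous portions, and a larger step gives non-contiguous portions. The function name, the fixed fragment length and the toy sequence are assumptions made for illustration; the text does not prescribe any particular fragmentation scheme.

```python
def split_into_fragments(transcript, length, step):
    """Return (offset, fragment) pairs of exactly `length` bases.

    step == length -> contiguous, non-overlapping portions
    step >  length -> non-contiguous portions (gaps are left unrepresented)
    step <  length -> overlapping portions
    """
    if length <= 0 or step <= 0:
        raise ValueError("length and step must be positive")
    return [
        (start, transcript[start:start + length])
        for start in range(0, len(transcript) - length + 1, step)
    ]


# Toy 10-base sequence (hypothetical, not taken from any gene list).
toy_transcript = "ACGTACGTAC"
contiguous = split_into_fragments(toy_transcript, length=4, step=4)
non_contiguous = split_into_fragments(toy_transcript, length=4, step=6)
```

Each returned fragment would correspond to one physically separate site. Trailing bases that do not fill a whole fragment are simply dropped in this sketch, which is consistent with the statement that not all fragments of the complete sequence need be present.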
In one embodiment, a plurality of probes is used to distinguish a target sequence from other nucleotide sequences in the diseased tissue sample. In some embodiments, at least 2% of the target sequence is represented on the array by the combination of probes. In another embodiment, at least 5%, at least 10%, at least 20%, at least 30%, at least 40%, at least 50%, at least 60%, at least 70%, at least 80%, or at least 90% of the target sequence is represented on the array. Alternatively, where the sequences represent larger sequences or transcripts, at least 60% of the sequences of list of genes A to list of genes JJ are represented on the array by combinations of probes. In a further embodiment, at least 70%, at least 80%, or at least 90% of the transcripts of list of genes A to list of genes JJ are represented on the array by combinations of probes. Hybridization of nucleic acid fragments in the sample to those fragments on the array indicates the presence of the full transcript in the tissue sample.
In another embodiment, the nucleic acid element corresponding to a full transcript or a fragment of a full transcript is immobilized on the array at only one physically separate site, in a "spotted array" format. Multiple copies of a particular nucleic acid element may be attached to the array substrate at a separate site. Preferably, a "spotted array" of this type comprises one or more of the nucleic acid molecules newly identified herein.
As described above, the array preferably comprises one or more nucleic acid elements corresponding to the transcript-specific elements, or fragments thereof, provided in list of genes A-JJ. As described above, an array specific for a disease, such as a particular cancer, can be designed to contain all or a predetermined percentage of the transcriptome for that disease. For example, in one embodiment, the array comprises all, or a selected subset, of the nucleotide sequences in the lists of genes provided above that are associated with a specific disease such as colorectal cancer (list of genes A-H). In another embodiment, the array may comprise a transcriptome that includes, for example, all or a selected subset of the nucleotide sequences in the lists of genes provided above that are associated with a general type of disease such as cancer (list of genes A-V). In still other embodiments, the array may comprise a transcriptome that includes, for example, all or a selected subset of the nucleotide sequences in the lists of genes provided above that are associated with a particular type of organ and disease, such as the lists associated with hepatitis-associated liver tissue (list of genes W-CC) or with brain tissue associated with neurodegenerative disease (list of genes DD-JJ).
In other embodiments, the array comprises nucleic acid elements corresponding to at least 50% of the set of transcript-specific elements provided in list of genes A-JJ. In other embodiments, the array comprises nucleic acid elements corresponding to at least 60%, for example at least 70%, at least 80% or greater than 90%, of the set of transcript-specific elements provided in list of genes A-JJ. Hybridization of a target transcript-specific element from the diseased tissue sample to the corresponding nucleic acid element on the array indicates the presence of the target gene in the sample. Other nucleic acid elements, or fragments thereof, corresponding to other transcripts provided in list of genes A-JJ can be placed at discrete, physically separate sites on the array.
Those skilled in the art will appreciate that the nucleic acid elements on a given array are complementary to the transcript-specific sequences in a given target sample. Arrays comprising native sequences can also be designed to identify the presence of antisense molecules in the target sample. Endogenous antisense RNA transcripts are of interest because recent literature has implicated endogenous antisense in cancer and other diseases.
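The complementarity relationship described here can be sketched in a few lines of Python. The model below is deliberately idealized: it treats hybridization as exact reverse-complement matching over a DNA alphabet (for example cDNA), ignoring mismatches, melting temperature and hybridization conditions. The function names and the six-base toy sequence are hypothetical.

```python
# Complement table for a DNA alphabet; RNA (U) is not handled in this sketch.
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(_COMPLEMENT)[::-1]


def hybridizes(probe, target):
    """Idealized perfect-match model: the probe pairs with the target only if
    the probe is exactly the target's reverse complement."""
    return probe.upper() == reverse_complement(target).upper()


# Toy sequences (hypothetical).
sense = "ATGGCC"
antisense = reverse_complement(sense)

# A conventional probe is complementary to the sense transcript;
# a probe carrying the native (sense) sequence pairs with the antisense one.
probe_for_sense = reverse_complement(sense)
probe_for_antisense = sense
```

Under this simplification, an element carrying the native sequence pairs with an antisense transcript in the sample but not with the sense transcript itself, which is why such elements can report the presence of antisense molecules.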
In one embodiment, the array is an array of nucleic acid elements representing a diseased colorectal tissue transcriptome, a diseased lung tissue transcriptome or a diseased breast tissue transcriptome. Preferably, more than 75%, 80%, 90%, 95% or 98% of the total transcripts transcribed in the diseased colorectal, lung or breast tissue transcriptome, respectively, are present on such an array. In some embodiments, the remaining nucleic acid elements are control elements.
The arrays provided herein for use in the assays described herein are constructed by suitable techniques known in the art. See, for example, U.S. Patent Nos. 5,486,452; 5,830,645; 5,807,552; 5,800,992 and 5,445,934. In each array, an individual nucleic acid element may appear only once or may be replicated. The array may optionally also comprise control nucleic acid elements.
Any suitable substrate can be used as the solid phase to which the nucleic acid elements are immobilized or bound. For example, the substrate can be glass, plastic, metal, a metal-coated substrate, or a filter of any material. The surface of the substrate can have any suitable configuration. For example, the surface can be planar, or can have ridges or grooves that separate the nucleic acid elements immobilized on the substrate. In an alternative embodiment, the nucleic acids are attached to beads, which are individually identifiable. The nucleic acid elements are attached to the substrate in any suitable manner that makes them available for hybridization, including covalent or non-covalent binding.
In other embodiments, the polynucleotide or protein molecules of the transcriptome can be grouped on the array according to whether expression of the transcript correlates with susceptibility or resistance to an agent for a particular disease. Such grouping provides regions on the array in which a set of transcripts indicates whether an individual with a particular array profile will or will not respond to a particular therapeutic agent (see, e.g., Fig. 1).
Diseased tissue sample
Any suitable tissue or cell of interest can be used as the diseased tissue sample in the methods described herein. Those skilled in the art will appreciate that the term "diseased tissue sample" includes abnormal samples, samples suspected of being diseased, and normal samples analyzed as part of a routine screening test.
The diseased tissue sample is preferably processed to obtain one or more transcript-specific elements, which are then combined with the array to allow hybridization and detection of the transcript-specific elements bound to the array. As used herein, the term "transcript-specific element" includes any suitable nucleic acid derived from an RNA transcript in the sample, such as DNA or RNA. A nucleic acid derived from an RNA transcript can be cDNA reverse transcribed from mRNA, RNA transcribed from that cDNA, DNA amplified from that cDNA, RNA transcribed from the amplified DNA, and the like. When the purpose is to measure changes in gene copy number, genomic DNA is preferably used. Alternatively, when the expression level of one or more transcripts is to be detected, RNA or cDNA is preferably used. For example, for quantifying expression, the transcript-specific element can be any type of transcribed RNA molecule, such as messenger RNA (mRNA), alternatively spliced mRNA, ribosomal RNA (rRNA), transfer RNA (tRNA), and the wide range of other transcripts that are not translated into protein, such as small nuclear RNA (snRNA), as well as antisense molecules such as siRNA and microRNA. The transcript-specific element can also be a nucleic acid derived from RNA.
Depending on the purpose of the method, one of ordinary skill in the art will select suitable diseased target cells and tissues. For example, in methods of identifying transcripts associated with a particular pathological condition, any biological sample, cell, or tissue known to exhibit or express symptoms of the pathological condition can be used.
The arrays described herein are used to identify transcripts that are differentially induced in cancer. In this case, the target cells can be tumor cells, for example colon cancer cells or gastric cancer cells. The target cells are derived from any tissue source, including human and animal tissue, such as, but not limited to, freshly obtained samples, frozen samples, biopsy samples, body fluid samples, blood samples, preserved tissue such as paraffin-embedded fixed tissue samples (i.e., tissue blocks), or cell cultures.
For diagnosis, the diseased tissue test sample is preferably a biological sample from an individual suspected of being diseased. Ideally, the tissue sample corresponds to, or is applied to, an array comprising all or a substantial part of one or more transcriptomes from the same tissue. The term "substantial part" is defined herein as greater than about 50%, 75%, 80%, 90%, 95%, or 98% of the whole transcriptome. For example, for diagnosis, transcript-specific elements derived from a lung tissue sample are applied to an array comprising all or a substantial part of the whole transcriptome of diseased lung tissue.
The population of transcript-specific elements can be obtained from the diseased tissue or cells of interest by any suitable nucleic acid isolation or purification method known in the art. For example, commercially available kits for nucleic acid isolation, such as the QIAAMP Tissue Kit for DNA isolation from QIAGEN (Alameda, CA), can be used in the methods described herein. In addition, nucleic acid isolation and purification methods are described in Chapter 3 of LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY: HYBRIDIZATION WITH NUCLEIC ACID PROBES, PART I. THEORY AND NUCLEIC ACID PREPARATION, P. Tijssen, ed., Elsevier, N.Y. (1993).
Depending on the size of the sample and the isolation method, the transcript-specific elements obtained can be used with or without amplification. Suitable amplification methods include, but are not limited to, polymerase chain reaction (PCR) (Innis et al., PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, Academic Press, Inc., San Diego (1990)), ligase chain reaction (LCR) (see Wu and Wallace, Genomics, 4:560 (1989); Landegren et al., Science, 241:1077 (1988); and Barringer et al., Gene, 89:117 (1990)), transcription amplification (Kwoh et al., Proc. Natl. Acad. Sci. USA, 86:1173 (1989)), and self-sustained sequence replication (Guatelli et al., Proc. Natl. Acad. Sci. USA, 87:1874 (1990)). Details relating to quantitative PCR are provided in PCR PROTOCOLS: A GUIDE TO METHODS AND APPLICATIONS, Innis et al., Academic Press, Inc., N.Y. (1990).
In certain embodiments, it is only necessary to detect the presence or absence of a particular transcript-specific element. In such cases, detection of a hybridization signal indicates the presence of the transcript-specific element in the sample. In other embodiments, the expression of one or more transcript-specific elements in the sample is to be quantified. In this case, the concentration of the transcript-specific element in the sample is proportional to the detected hybridization signal. The skilled artisan will appreciate that the proportionality need not be strict (e.g., a doubling of the transcription rate causing a doubling of the mRNA transcript and a doubling of the hybridization signal). A more relaxed proportionality, for example one in which a 10-fold difference in target mRNA concentration results in a 5- to 15-fold difference in hybridization intensity, is acceptable. Where more precise quantification is required, suitable standards can be used to correct for variation introduced by sample preparation and hybridization.
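The correction against standards mentioned above can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each array carries one or more control elements whose expected signal is known; the function name `normalize_to_standard`, the element identifiers, and the target value are hypothetical and do not come from the specification.

```python
def normalize_to_standard(intensities, control_ids, expected_control_signal=1.0):
    """Rescale raw hybridization intensities so that the control elements
    average to a chosen target value (hypothetical scheme, for illustration)."""
    controls = [intensities[c] for c in control_ids]
    mean_control = sum(controls) / len(controls)
    if mean_control <= 0:
        raise ValueError("control signal must be positive")
    scale = expected_control_signal / mean_control
    # Apply the same scale factor to every element on the array.
    return {element: value * scale for element, value in intensities.items()}


# Example: two control spots and one gene-specific element (made-up values).
raw = {"ctrl1": 200.0, "ctrl2": 400.0, "geneX": 900.0}
normalized = normalize_to_standard(raw, ["ctrl1", "ctrl2"])
```

Scaling every array to the same control level is only one possible scheme; it makes intensities from separately prepared samples comparable before any ratios are computed.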
Hybridization
In the methods provided herein, transcript-specific elements from the diseased tissue sample are hybridized to the array under conditions of suitably selected stringency. The skilled artisan will readily know how to vary the stringency of the hybridization conditions to select those best suited to the sample. For example, using a non-stringent wash buffer (e.g., 6x SSPE, 0.01% Tween-20) and a stringent wash buffer (e.g., 100 mM MES, 0.1 M [Na+], 0.01% Tween-20), one of ordinary skill in the art can independently vary the number of washes (typically 0-20) and the wash temperature (typically 15-50 °C) to obtain optimal hybridization. Methods of optimizing hybridization conditions are well known to those skilled in the art (see LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed., Elsevier, N.Y. (1993)).
In one embodiment, hybridization is performed under low-stringency conditions, followed by successive washes under progressively more stringent conditions until the desired level of hybridization specificity is obtained, thereby eliminating mismatched hybrid duplexes. Hybridization specificity is evaluated by comparing hybridization to the gene-specific elements with hybridization to the various controls present.
Labeling and detection
Transcript-specific elements hybridized to the nucleic acid elements of the arrays provided herein are preferably detected by detecting one or more labels attached to the sample transcript-specific elements from the diseased tissue sample.
The label can be introduced before, during, or after hybridization by any suitable method known in the art for attaching labels to nucleic acids. Suitable methods include adding the label directly to the original transcript-specific elements of the sample (e.g., mRNA, polyA mRNA, cDNA), or adding it during or after amplification of the sample's transcript-specific elements, for example by using labeled primers or labeled nucleotides.
Labels suitable for the methods described herein include, but are not limited to, biotin for staining with a labeled streptavidin conjugate, magnetic beads (e.g., Dynabeads), fluorescent dyes (e.g., fluorescein, Texas red, rhodamine, green fluorescent protein, and the like), radiolabels (e.g., 3H, 125I, 35S, 14C, or 32P), enzymes (e.g., horseradish peroxidase, alkaline phosphatase, and other enzymes used in ELISA), and colorimetric labels such as colloidal gold and colored glass or plastic (e.g., polystyrene, polypropylene, latex) beads.
Depending on the label chosen, those skilled in the art can select a suitable method known in the art for detecting the label. For a detailed description of methods of labeling nucleic acids and detecting labeled hybridized nucleic acids, see LABORATORY TECHNIQUES IN BIOCHEMISTRY AND MOLECULAR BIOLOGY, Vol. 24: Hybridization With Nucleic Acid Probes, P. Tijssen, ed., Elsevier, N.Y. (1993).
Protein arrays
In other embodiments, protein arrays are designed and constructed. As used herein, the terms "protein" and "polypeptide" are used interchangeably. The tissue-specific elements in these arrays include proteins, peptides, antibodies, peptide nucleic acids, and the like. Antibodies raised against polypeptide molecules encoded by a diseased transcriptome can be immobilized at discrete sites of the array and bound to polypeptides that carry a detectable label and are specific for the antibody. Proteins isolated from a target sample can be contacted with the labeled array, and any displacement of labeled protein from the immobilized antibody is revealed by loss of the detectable label at the discrete site of the array. The pattern of protein displacement on the array may correlate with the response or non-response to a particular therapeutic agent of individuals expressing the array profile.
Alternatively, the protein array can comprise polypeptide molecules encoded by a diseased transcriptome. The polypeptide molecules can be attached at discrete sites of the transcriptome protein array and detected with antibodies isolated from individuals expressing the diseased transcriptome.
The antibodies can be polyclonal or, more preferably, monoclonal. An intact antibody, or a fragment thereof (e.g., Fab or F(ab')2), can be used. The term "labeled", with regard to a probe or antibody, is intended to encompass direct labeling of the probe or antibody by coupling (i.e., physically linking) a detectable substance to it, as well as indirect labeling of the probe or antibody by reaction with another reagent that is directly labeled. Examples of indirect labeling include detection of a primary antibody using a fluorescently labeled secondary antibody, and end-labeling of a DNA probe with biotin such that it can be detected with fluorescently labeled streptavidin. The term "biological sample" is intended to include tissues, cells, and biological fluids isolated from a subject, as well as tissues, cells, and fluids present within a subject. That is, the detection methods can be used to detect RNA, protein, and genomic DNA in a biological sample in vitro as well as in vivo. For example, in vitro techniques for detecting RNA include Northern hybridization and in situ hybridization. In vitro techniques for detecting protein include enzyme-linked immunosorbent assays (ELISAs), Western blots, immunoprecipitation, and immunofluorescence. In vitro techniques for detecting genomic DNA include DNA (Southern) hybridization. In vivo techniques for detecting protein include introducing a labeled antibody into the subject. For example, the antibody can be labeled with a radioactive marker whose presence and location in the subject can be detected by standard imaging techniques.
Kits
Provided herein are kits for detecting the presence of, or quantifying, transcript-specific elements in a cancerous diseased tissue sample. For example, a kit can comprise one or more arrays of transcriptomes from one or more diseased tissues. The molecules on the array can be the polynucleotide, polypeptide, or antibody molecules described herein. The kit optionally further comprises a detectable label, a labeled compound, or an agent capable of detecting expression of a gene product in a biological sample, together with reagents for labeling the sample and for effecting hybridization of the sample to complementary sequences on the array. The kit optionally further comprises means for determining the amount of transcript in the sample, such as colorimetric scales and equipment.
A kit can comprise more than one array, wherein each array corresponds to a tissue afflicted by a different disease and wherein each array comprises a plurality of transcriptomes corresponding to tissue afflicted by one disease. The compounds and agents can be packaged in suitable containers. The kit can further comprise instructions for using the kit to detect protein or nucleic acid.
Methods of use in predictive medicine
Provided herein are methods of using the arrays described above in the field of predictive medicine. This field includes diagnostic assays, prognostic assays, predictive assays, pharmacogenomics, and the monitoring of clinical trials for various diseases.
The terms "disease" and "disease state" include diseases that cause, or potentially cause, a change in the small-molecule profile of a cell, cellular compartment, or organelle of the afflicted organism. Such diseases can be divided into three main categories: neoplastic diseases, inflammatory diseases, and degenerative diseases. Examples of diseases include, but are not limited to, metabolic diseases (e.g., obesity, cachexia, diabetes, anorexia, etc.), cardiovascular diseases (e.g., atherosclerosis, ischemia/reperfusion, hypertension, myocardial infarction, restenosis, cardiomyopathy, arteritis, etc.), immunological disorders (e.g., chronic inflammatory diseases and disorders, such as Crohn's disease, inflammatory bowel disease, reactive arthritis, rheumatoid arthritis, osteoarthritis, including lymphatic disease, insulin-dependent diabetes, organ-specific autoimmunity, including multiple sclerosis, Hashimoto's thyroiditis and Graves' disease, contact dermatitis, psoriasis, graft rejection, graft versus host disease, sarcoidosis, atopic conditions, such as asthma and allergy, including allergic rhinitis, gastrointestinal allergies, including food allergies, eosinophilia, conjunctivitis, glomerulonephritis, susceptibility to certain pathogens, such as helminthic (e.g., leishmaniasis) and certain viral infections, including HIV, and bacterial infections, including tuberculosis and lepromatous leprosy, etc.), myopathies (e.g., polymyositis, muscular dystrophy, central core disease, centronuclear (myotubular) myopathy, myotonia congenita, nemaline myopathy, paramyotonia congenita, periodic paralysis, mitochondrial myopathies, etc.), neurological disorders (e.g., neuropathy, Alzheimer's disease, Parkinson's disease, Huntington's disease, amyotrophic lateral sclerosis, motor neuron disease, traumatic nerve injury, multiple sclerosis, acute disseminated encephalomyelitis, acute necrotizing hemorrhagic leukoencephalitis, dysmyelination diseases, mitochondrial diseases, migrainous disorders, bacterial infection, fungal infection, stroke, aging, dementia, peripheral nervous system diseases, and psychiatric disorders such as depression and schizophrenia, etc.), oncological disorders (e.g., leukemia, brain cancer, prostate cancer, liver cancer, ovarian cancer, stomach cancer, colorectal cancer, throat cancer, breast cancer, skin cancer, melanoma, lung cancer, sarcoma, cervical cancer, testicular cancer, bladder cancer, endocrine cancer, endometrial cancer, esophageal cancer, glioma, lymphoma, neuroblastoma, osteosarcoma, pancreatic cancer, pituitary cancer, kidney cancer, etc.), and ophthalmological diseases (e.g., retinitis pigmentosa and macular degeneration). The terms also include disorders caused by oxidative stress, hereditary cancer syndromes, and metabolic diseases, both known and unknown.
In general, the methods for use in predictive medicine are carried out as follows: transcript-specific elements from diseased target cells or tissue, or from cells or tissue suspected of having a pathological condition, are combined with an array described herein and incubated for a sufficient time under conditions that allow the transcript-specific elements to hybridize to the nucleic acid molecules of the array, and hybridization is then detected. Detection of hybridization indicates the presence of diseased tissue in the sample; alternatively, the pattern of transcript expression is analyzed and compared with a reference pattern of transcript-specific element expression from a reference sample to provide information about diagnosis, prognosis, drug screening, drug resistance, choice of treatment, and the like, as described in more detail below.
Diagnostic assays
Diagnostic assays using the arrays described herein are provided for determining protein and/or nucleic acid expression and activity in a biological sample (e.g., blood, serum, cells, tissue), thereby determining whether an individual is afflicted with a disease or disorder, shows signs of disease, or is at risk of developing a disease or disorder associated with aberrant protein or nucleic acid expression or activity. Early diagnosis facilitates treatment and increases the likelihood of successful treatment; it may even allow a physician to treat the individual prophylactically before the onset of symptoms of the disease or disorder.
The arrays described herein can be used to identify nucleic acid molecules that are differentially expressed under pathological conditions, such as diseased states of colorectal tissue, lung tissue, breast tissue, liver tissue, or brain tissue.
An exemplary method of detecting the presence or absence of an RNA transcript or gene product in a biological sample comprises obtaining a biological sample comprising nucleic acid elements from a test subject and contacting the biological sample with a compound or agent capable of detecting protein or nucleic acid, such that the presence in the biological sample of transcripts that hybridize to the arrays described herein is detected. The agent for detecting RNA or genomic DNA is preferably a labeled nucleic acid probe capable of hybridizing to RNA or genomic DNA from the sample. The nucleic acid probe can be, for example, a full-length nucleic acid or a portion thereof, such as an oligonucleotide of at least 11, 15, 30, 50, 100, 250, 500, 1,000, or more nucleotides in length that is sufficient to hybridize specifically to the RNA or genomic DNA under stringent conditions.
The biological sample is combined with the array to detect transcript-specific elements in the biological sample. In one embodiment, the biological sample comprises protein molecules from the test subject. Alternatively, the biological sample comprises nucleic acid elements from the test subject, such as RNA molecules or genomic DNA molecules. Preferred biological samples are biological fluids (e.g., serum), cell samples, or biopsy samples isolated from the subject by conventional means, such as needle biopsy.
The arrays can also be used to identify mutations in genes that give rise to the transcripts present in diseased tissue. Thus, the invention provides a method of identifying a disease or disorder associated with aberrant RNA expression or activity, in which a test sample is obtained from a subject and protein or nucleic acid (e.g., RNA, genomic DNA) is detected, wherein the presence of the protein or nucleic acid is diagnostic of a subject having, or being at risk of developing, a disease or disorder associated with aberrant gene expression or activity.
The diagnostic assays provide a method of identifying one or more transcript-specific elements in a sample that are associated with susceptibility to a pathological condition (e.g., an early-stage cancerous condition that is presymptomatic and cannot be detected by any other method), or a method of identifying the actual presence of a diseased state. If the hybridization or expression pattern of the sample is compared with a reference pattern of transcript-specific element expression from a non-diseased reference sample, differential expression between the target cells and the reference sample indicates that the corresponding transcript-specific elements are associated with the pathological condition. Likewise, if the expression pattern is compared with a reference pattern of transcript-specific element expression from a diseased reference sample of a particular pathological condition, substantial agreement between the hybridization or expression pattern and the reference pattern indicates the presence of the pathological condition, or of susceptibility to it, in the sample tissue or cells.
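One way to picture "substantial agreement" between a sample pattern and a diseased reference pattern is a correlation score over the shared transcripts. The following Python sketch is an illustrative assumption only: the specification does not prescribe a similarity measure, and the function names, the use of Pearson correlation, and the 0.9 threshold are all hypothetical.

```python
import math


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length numeric lists."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("pattern has no variation; correlation undefined")
    return cov / math.sqrt(var_x * var_y)


def matches_reference(sample, reference, threshold=0.9):
    """Return True if the sample pattern agrees with the reference pattern
    on the transcripts they share (threshold is an arbitrary example value)."""
    shared = sorted(set(sample) & set(reference))
    r = pearson([sample[t] for t in shared], [reference[t] for t in shared])
    return r >= threshold


# Example with made-up expression values keyed by hypothetical transcript IDs.
diseased_reference = {"t1": 2.0, "t2": 4.0, "t3": 6.0}
patient_sample = {"t1": 1.0, "t2": 2.1, "t3": 2.9}
is_match = matches_reference(patient_sample, diseased_reference)
```

Correlation ignores overall signal scale, which is convenient when arrays differ in total intensity; other distance measures could equally be substituted.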
A predetermined reference pattern can consist of the expression pattern across the full array or across a subset of nucleic acid molecules, for example a subset determined to have a particular association with a specific pathological condition. Such a new subset of nucleic acid molecules can be used to construct an array of nucleic acid elements associated with the specific pathological condition. Such new arrays form another aspect of the invention.
Differential expression can be qualitative or quantitative. For example, the expression difference relative to the reference sample can be up-regulation or down-regulation of expression of one or more transcript-specific elements of the target cells of the sample. The differential expression measured can be an increase in expression of one or more transcripts relative to expression of the corresponding transcript(s) in the non-diseased reference sample (or control), or an increase in expression of one or more transcripts together with a decrease in expression of one or more other transcripts, relative to the corresponding transcripts of the non-diseased reference sample (or control). The regulated expression pattern can therefore be indicative of a particular cell or tissue function.
In preferred embodiments, the RNA species or genes associated with the pathological condition hybridize to nucleic acid sequences complementary to one or more sequences from Gene Lists A-JJ. The pathological condition can be any disease state; for example, the pathological condition can be cancer. The arrays described herein can be used to distinguish types of cancer and subgroups of cancer associated with a given tissue (e.g., breast, colorectal, lung).
In one embodiment, expression of a transcript-specific element in the target cells is considered up-regulated or down-regulated if it is higher or lower than that of the corresponding element in the reference sample by 0.1-fold, 0.5-fold, 1-fold, 1.5-fold, 2-fold, 5-fold, 10-fold, or more. Of course, when assessing such differences, expression levels are determined using a correction factor, for example one based on the measured expression of a reference nucleic acid element known to be expressed in both the target cells and the reference sample. Any suitable non-diseased reference sample (or control) can be used. For example, the reference sample can be cells from the same tissue and/or organ and/or subject as the target cells, or the reference expression can be the average of the expression values of the gene element in a plurality of cells that do not have the pathology in question.
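The fold-change comparison with a correction factor can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the function name `classify_expression`, the choice of a single reference element, the symmetric threshold, and all numeric values are hypothetical, and all expression values are assumed to be positive.

```python
def classify_expression(target, reference, reference_element, fold_threshold=2.0):
    """Call each element 'up', 'down', or 'unchanged' relative to a
    non-diseased reference, after correcting with a shared reference element."""
    # Correction factor: bring the target's reference element to the
    # level measured in the reference sample.
    correction = reference[reference_element] / target[reference_element]
    calls = {}
    for element, ref_value in reference.items():
        if element == reference_element or element not in target:
            continue
        ratio = (target[element] * correction) / ref_value
        if ratio >= fold_threshold:
            calls[element] = "up"
        elif ratio <= 1.0 / fold_threshold:
            calls[element] = "down"
        else:
            calls[element] = "unchanged"
    return calls


# Example with made-up intensities; "REF" is a hypothetical reference element.
tumor = {"REF": 200.0, "A": 800.0, "B": 50.0, "C": 220.0}
normal = {"REF": 100.0, "A": 100.0, "B": 100.0, "C": 100.0}
calls = classify_expression(tumor, normal, "REF")
```

With these example values the correction factor is 0.5, so element A is called up (4-fold), B down (0.25-fold), and C unchanged (1.1-fold). Any of the thresholds listed in the text could be passed as `fold_threshold`.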
As described herein, array described herein can be estimated the expression of transcribing group of significant proportion in the specific diseased tissue, and therefore can be used for the evaluation of the differential expression pattern of a large amount of gene elements relevant with the particular pathologies situation.
Once the association of a transcript or gene with a pathological condition has been determined, the presence, copy number, or expression level of that transcript can be used in methods of diagnosing the presence of, or susceptibility to, that condition. This use represents a further independent aspect of the methods described herein.
Prognostic analysis
Also provided herein are prognostic assays for determining whether an individual diagnosed with a disease or disorder associated with aberrant protein or nucleic acid expression or activity will recover or relapse without initial medical intervention, or following initial medical intervention such as surgery.
The prognostic assays described herein can be used to determine whether overall survival is likely to be favorable or unfavorable without any treatment or after initial medical intervention, and thereby to decide whether further assays should be carried out to identify the most effective further treatment. For example, the assays can be used to determine whether a patient should receive surgical intervention alone or in combination with a cocktail of pharmaceutical, biological, or therapeutic agents administered before or after surgery. These agents are particularly useful for individuals with a poor prognosis who would not recover from the disease or disorder without treatment and medical intervention. In such embodiments, hybridization of the transcript-specific elements to the disease transcriptome array indicates the likelihood that the disease will progress or recur following surgery or chemotherapy, or in the absence of surgery or chemotherapy.
In preferred prognostic assays, the array is used to compare the hybridization pattern from a sample with hybridization patterns from known diseased tissues that responded negatively or positively to a particular treatment, such that the diseased tissue did or did not undergo recurrence of the disease, such as recurrence of cancer after remission.
Predictive assays
Predictive assays are provided for selecting therapeutic or prophylactic agents suited to treating the disease or disorder affecting a particular individual. Therapeutic agents include, but are not limited to, small-molecule compounds, agonists, antagonists, proteins (including peptides and antibodies or antibody fragments), peptidomimetics, nucleic acids, gene therapy vectors, radiotherapy, chemotherapy, and other candidate therapeutic agents.
The information obtained is then used to determine the response of the disease-associated tissue to drug treatment. These methods include determining the patient's response to a particular therapy after tumor resection or after metastatic recurrence of the tumor, and the response of the tumor to radiotherapy, post-operative radiotherapy, or chemotherapy.
In addition to screening candidate agents for the ability to modulate activity, the arrays described herein can be used to determine the mode of action of an agent used as a therapeutic agent.
Pharmacogenomic assays
The arrays described herein are also used to detect protein or nucleic acid expression or activity resulting from an individual's genotype, in order to determine the individual's ability to respond to agents and thereby select therapeutic or prophylactic agents (e.g., drugs) appropriate for that individual (referred to herein as "pharmacogenomics").
In this capacity, the arrays described herein can be used in prognostic or predictive assays to identify a patient's responsiveness or resistance to a particular drug therapy on the basis of a genetic profile. In such assays, historical data on patients' responses to drug treatment are correlated with the hybridization patterns of transcript-specific elements from diseased tissue samples of those patients. This information is then used to determine the response of future patients to the same drug treatment. These methods include determining prognosis after tumor resection or after metastatic recurrence of the tumor, and determining the response of the tumor to radiotherapy, post-operative radiotherapy, or chemotherapy.
Exemplary therapeutic treatments for which the transcriptome analysis provided herein can be used include, but are not limited to, anti-arthritic drug therapy, chemotherapeutic agents, therapeutic antibodies, therapeutic proteins or peptides, therapeutic nucleic acids, antipsychotics, antidepressants, anti-asthmatics, antiviral and antibacterial drugs, antihypertensives, cholesterol-lowering drugs, and antifungal drugs. The arrays can also be used to assess disease progression, disease aggressiveness, and the staging of tumor recurrence.
The arrays provided herein can also be used to determine the degree of an individual's adverse reaction to a particular therapeutic agent, allowing treatment doses to be titrated precisely over time and reducing adverse drug reactions. Different polymorphisms can cause increased or decreased metabolism of a particular therapeutic agent. If the activity of a conventional degrading enzyme is reduced because of a polymorphism, a standard dose may cause more adverse reactions than usual. Genetic polymorphisms in drug-metabolizing enzymes, transporters, receptors, and other drug targets are associated with inter-individual differences in the efficacy and toxicity of many drugs. For example, polymorphism in thiopurine methyltransferase (TPMT) alters degradation of the commonly prescribed agent 6-mercaptopurine (6-MP) (McLeod and Yu, 2003, Cancer Invest. 21(4):630-40). This genetic variation has significant clinical implications, because patients homozygous for mutations in the TPMT gene associated with loss of function experience extreme or fatal toxicity after administration of conventional doses of 6-MP. In this embodiment, the expression pattern of the sample is compared with predictive reference patterns of transcript expression from reference samples; substantial correspondence of the expression pattern to one or more of the predictive reference patterns indicates that the individual is likely to experience an adverse reaction to the treatment.
In preferred embodiments, a control sample of target cells or tissue that has not been contacted with the therapeutic agent is also combined with an array for comparison.
The arrays described herein can also be used to monitor clinical trials of new or existing therapies. In particular, the arrays are used to preselect or prescreen patients from a population of patients having a pathological condition, and the preselected or screened patients are administered the test therapeutic agent undergoing clinical trial, or another therapeutic agent, to treat the pathological condition, such that the patients respond optimally to the drug.
Drug discovery and research assays
The arrays provided herein can be used in drug discovery and research methods. For example, the arrays can be used to determine the response of one or more transcripts or genes of a transcriptome to test therapeutic agents, newly synthesized compounds, and other agents of interest. The agent can be one known to have therapeutic utility or can be a newly developed candidate therapeutic agent.
Thus, the arrays described herein can be used to screen large numbers of candidate agents that modulate target cell or tissue function. In accordance with the method, the hybridization pattern, on one or more arrays described herein, of a sample treated with the candidate therapeutic agent is compared with the hybridization pattern of an untreated control sample. Differences between the transcript-specific elements hybridized from the treated sample and from the control sample indicate the ability of the candidate agent to modulate target cell or tissue function.
The compositions and methods provided herein are described in more detail in the following examples. The examples are for illustrative purposes and are not intended to limit or define the invention in any way.
Examples
Example 1: Initial list of colorectal cancer transcriptome sequences
The initial colorectal cancer transcriptome array sequences, disclosed in European patent applications EP 04105479.2, EP 04105482.6, EP 04105483.4, EP 04105484.2, EP 04105507.0 and EP 04105485.9 and U.S. provisional patent applications 60/662,276 and 60/700,293, were obtained by the following method.
Materials and methods
Screening of public data
All publicly available expressed sequence tags (ESTs) were downloaded from the databases and converted to FASTA format, and all 921 libraries were concatenated into a single sequence file containing 272,686 individual ESTs. These ESTs were then screened with a combination of specific filters from the Paracel Filtering Package (PFP) (available at www.paracel.com) to ensure that undesirable sequence elements did not enter the assembly process. Settings were selected to mask low-complexity regions, vector sequences and repetitive sequences. Sequences containing contaminating E. coli sequence, mitochondrial DNA and ribosomal RNA were filtered out. After these screening steps, the low-quality terminal regions masked in the preceding stage, and any sequences consisting mainly of low-complexity repeats, were removed with the "trimjunk" algorithm (Paracel Filtering Package). Finally, sequences containing fewer than 100 good bases were filtered out.
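The final two steps above (trimming masked ends, then discarding sequences with fewer than 100 good bases) can be sketched as follows. This is not the Paracel Filtering Package or its "trimjunk" algorithm, whose internals are not described here; it is a simplified stand-in that assumes masked bases are written as `N` or `X`, and the function names are hypothetical.

```python
MASK_CHARS = "NXnx"  # assumption: masking replaces bases with N or X


def trim_masked_ends(seq, masked=MASK_CHARS):
    """Remove masked bases from both ends of a sequence."""
    return seq.strip(masked)


def good_base_count(seq, masked=MASK_CHARS):
    """Count bases that are not masked."""
    return sum(1 for base in seq if base not in masked)


def filter_ests(records, min_good=100):
    """Keep trimmed ESTs that still contain at least `min_good` good bases.

    `records` maps EST name -> sequence string.
    """
    kept = {}
    for name, seq in records.items():
        trimmed = trim_masked_ends(seq)
        if good_base_count(trimmed) >= min_good:
            kept[name] = trimmed
    return kept


if __name__ == "__main__":
    ests = {
        "est_ok": "NNN" + "ACGT" * 25 + "XX",   # 100 good bases after trimming
        "est_short": "ACGT" * 24 + "NNNN",      # only 96 good bases
    }
    print(sorted(filter_ests(ests)))
```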
Screening of in-house data
Screening of the in-house ESTs was performed on the "Phred" output files rather than on the raw FASTA sequence files. The "Phred" files contain quality information for the sequences, namely a statistical significance for each base call. This also allows the use of an additional filtering algorithm called "qualclean". Qualclean removes low-quality sequence from the beginning and end of a sequence file. The other filtering algorithms used were the same as those listed for the public data.
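Quality-based end trimming of this kind can be illustrated as below. The actual "qualclean" algorithm is not described in this document, so the sketch uses a simple assumed rule: bases are removed from each end until a base with Phred quality of at least `min_q` is reached. The default of 20 and the function names are hypothetical. The relation between a Phred score Q and the base-call error probability, P = 10^(-Q/10), is the standard definition.

```python
def phred_error_probability(q):
    """Standard Phred relation: error probability P = 10 ** (-Q / 10)."""
    return 10 ** (-q / 10)


def trim_low_quality_ends(seq, quals, min_q=20):
    """Trim bases below `min_q` from the start and end of a read.

    Returns the trimmed sequence and its quality values. Low-quality
    bases in the interior of the read are left untouched.
    """
    if len(seq) != len(quals):
        raise ValueError("sequence and quality lists must be equally long")
    start = 0
    while start < len(quals) and quals[start] < min_q:
        start += 1
    end = len(quals)
    while end > start and quals[end - 1] < min_q:
        end -= 1
    return seq[start:end], quals[start:end]


if __name__ == "__main__":
    read = "ACGTACGT"
    qualities = [5, 10, 30, 30, 8, 30, 12, 3]
    print(trim_low_quality_ends(read, qualities))
```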
Data assembly
The public and in-house data were assembled using the Paracel software "Paracel Transcript Assembler (PTA)" (see www.paracel.com) at a clustering threshold of 50. The resulting contigs were searched with BLAST against the GenBank NT database for annotation and to identify the orientation of each sequence. Where a contig was identified as being in the opposite orientation to the corresponding GenBank entry, the sequence was reverse complemented and both orientations were included in the final data set.
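The orientation step can be sketched in a few lines. The example assumes that the BLAST comparison has already produced a strand call (`"+"` or `"-"`) for each contig; the `_rc` naming suffix and the function `final_entries` are hypothetical conventions introduced only for illustration.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def final_entries(name, seq, strand):
    """Return the entries to place in the final data set for one contig.

    A contig on the opposite strand to its GenBank match ("-") is kept
    in both orientations; otherwise only the original is kept.
    """
    if strand == "-":
        return [(name, seq), (name + "_rc", reverse_complement(seq))]
    return [(name, seq)]


if __name__ == "__main__":
    print(final_entries("contig_1", "AAGC", "-"))
    print(final_entries("contig_2", "AAGC", "+"))
```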
Results
Assembly of colorectal-derived sequences from public databases
To identify sequences that may be expressed in colorectal tissue, the Cancer Genome Anatomy Project (CGAP) portal at the U.S. National Institutes of Health website (see cgap.nci.nih.gov) was searched for sequence information derived from colorectal tissue, colorectal cancer tissue or colorectal-derived cell lines. A list of all 921 EST libraries was identified with CGAP. The libraries themselves were then retrieved from the UniGene database. Collating the information into a single database gave a total of 272,686 individual sequences. The individual sequences were then assembled with the Paracel Transcript Assembler to give a total of 18,721 contigs and 41,023 singlet ESTs. The 18,721 contigs were then compared with the contigs generated by the sequencing project described below. This comparison showed only limited redundancy, with a final number of 16,350 contigs of public origin.
Identification of novel colorectal-expressed sequences
To identify other transcripts that may be expressed in colorectal tissue, whether normal or malignant, a cDNA library was generated from a pool of RNA from 80 normal and malignant colorectal tumor tissues. The RNA was reverse transcribed and directionally cloned into a cloning vector. The library was then transformed into bacteria and plated to produce single clones. A total of 50,000 clones were picked and sequenced to determine their identity. The 50,000 clones were then assembled to give a total of 10,396 unique sequences, comprising 4,129 contigs and 6,267 singlet ESTs. The sequence information from the 4,129 contigs and 6,267 singlet ESTs was then searched with BLAST against publicly available databases, including GenBank, to identify entirely novel sequences, and searched again against a database generated from all publicly available colon tissue libraries to identify sequences not previously reported as expressed in colorectal cancer. This analysis identified a total of 2,773 novel sequences not previously reported in GenBank as annotated genes or ESTs.
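The bookkeeping behind these counts is that every unique sequence produced by the assembly is either a contig (built from two or more reads) or a singlet (a read that joined no contig), so 4,129 contigs plus 6,267 singlets gives the 10,396 unique sequences reported. A small sketch, assuming the assembler output is available as a mapping from cluster identifier to member reads (a hypothetical representation, not the Paracel output format):

```python
def split_assembly(clusters):
    """Split assembly clusters into contigs (>= 2 reads) and singlets (1 read).

    `clusters` maps a cluster id to the list of read names it contains.
    """
    contigs = {cid: reads for cid, reads in clusters.items() if len(reads) >= 2}
    singlets = {cid: reads for cid, reads in clusters.items() if len(reads) == 1}
    return contigs, singlets


# Counts reported above for the in-house colorectal library.
REPORTED_CONTIGS = 4129
REPORTED_SINGLETS = 6267
REPORTED_UNIQUE = 10396

if __name__ == "__main__":
    toy = {"c1": ["r1", "r2", "r3"], "c2": ["r4"], "c3": ["r5", "r6"]}
    contigs, singlets = split_assembly(toy)
    print(len(contigs), len(singlets))
    print(REPORTED_CONTIGS + REPORTED_SINGLETS == REPORTED_UNIQUE)
```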
Example 2: Further identification of colorectal cancer sequences
Additional colorectal sequence information was obtained by identifying, on microarrays containing publicly available information, further transcripts expressed in colorectal tissue. These sequences supplement the initial transcriptome array sequence information and provide an array that more completely represents the colorectal cancer transcriptome.
Methods
RNA from 40 colorectal tissues (27 tumor and 13 normal) was labeled and hybridized to microarrays containing publicly available information. A list of transcripts was obtained from these arrays, consisting of those targets detected as present above background on at least one of the arrays (that is, transcripts identified as expressed in at least one colorectal sample).
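The "present above background on at least one array" rule can be sketched as follows. The example assumes one intensity per target per array and one background value per array; how presence was actually called on the arrays is not specified here, so the plain greater-than comparison and the name `expressed_targets` are illustrative assumptions.

```python
def expressed_targets(signals, background):
    """Return targets whose signal exceeds background on at least one array.

    `signals` maps target id -> list of intensities (one per array);
    `background` lists the background level of each array in the same order.
    """
    kept = []
    for target, values in signals.items():
        if len(values) != len(background):
            raise ValueError("one intensity per array is required")
        if any(v > b for v, b in zip(values, background)):
            kept.append(target)
    return sorted(kept)


if __name__ == "__main__":
    signals = {
        "target_1": [10.0, 50.0, 12.0],   # above background on array 2 only
        "target_2": [5.0, 6.0, 7.0],      # never above background
        "target_3": [30.0, 31.0, 29.0],   # above background on all arrays
    }
    print(expressed_targets(signals, background=[20.0, 20.0, 20.0]))
```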
Initial work using the GI or accession numbers associated with the probe sets on the chip showed some discrepancies between the target sequences and the full sequences of the annotated targets. It was therefore decided to search the public sequence databases with the actual target sequences, in order to retrieve from the public databases the sequences that best represent the targets empirically determined by the array experiments to be expressed in colorectal tissue.
The target sequences were then extracted from the full sequences and searched with BLAST against the provisional patent sequence list (that is, the transcripts identified from the in-house sequencing and public database assemblies). This yielded a list of 21,909 transcripts that do not appear in the sequence list of U.S. provisional patent application 60/662,276.
The whole of this sequence list was searched with BLAST against the public EST database (dbEST) under high-stringency conditions (covering 90% of the target). The sequences that matched dbEST were then retrieved from the public databases. A set of 16,377 sequences was successfully retrieved by this method.
The remaining 6,635 sequences were searched with BLAST against the RefSeq database. A set of 1,663 of the targets gave very strong matches to RefSeq, and these sequences were also retrieved from the public databases.
For the remaining 4,972 targets, the GI numbers were extracted and used to retrieve the corresponding sequences from the public databases.
These three sequence lists were concatenated into a single file and screened with in-house duplicate-sequence detection software. This gave a final list of 22,376 non-redundant sequences.
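Merging the three lists and removing duplicates can be illustrated as follows. The in-house duplicate detection software is not described in this document, so the sketch assumes the simplest possible criterion: two entries are duplicates when their sequences are identical ignoring case. The function name `merge_unique` is hypothetical.

```python
def merge_unique(*sequence_lists):
    """Concatenate lists of (name, sequence) and drop exact duplicates.

    The first occurrence of each sequence is kept, and the original
    order is preserved.
    """
    seen = set()
    merged = []
    for records in sequence_lists:
        for name, seq in records:
            key = seq.upper()
            if key not in seen:
                seen.add(key)
                merged.append((name, seq))
    return merged


if __name__ == "__main__":
    from_dbest = [("est_1", "ACGT"), ("est_2", "GGCC")]
    from_refseq = [("ref_1", "acgt")]          # duplicate of est_1
    from_gi = [("gi_1", "TTAA")]
    print(merge_unique(from_dbest, from_refseq, from_gi))
```

A production pipeline would more likely also collapse near-identical sequences and sequences contained in longer ones, which exact matching does not detect.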
Example 3: Antisense sequences of the colorectal cancer transcriptome
With growing interest in the scientific community in the role of endogenous antisense RNA transcripts, the colorectal cancer database was examined for the presence of antisense transcripts.
Methods
After assembly, the in-house and public contigs were searched with BLAST against the GenBank NT database for annotation and to identify the orientation of each sequence. Where a contig was identified as being in the opposite orientation to the corresponding GenBank entry, the sequence was reverse complemented and both orientations were included in the final data set. In this way, antisense transcripts were combined with the corresponding sense transcripts to form a gene list of 5,672 transcripts (Gene List H).
Example 4: List of lung cancer transcriptome sequences
The methods used to derive the lung cancer transcriptome array sequences set out in Gene Lists I to O were similar to those used to derive the colorectal cancer sequences.
These 55,626 lung cancer sequences are the result of an in-house assembly of publicly available lung EST libraries. They represent a unique collection of lung cancer-related data that has not been shown before. Some of these sequences are expressed in lung cancer and have not previously been identified as annotated genes.
Results
Assembly of lung-derived sequences from public databases
To identify sequences that may be expressed in lung tissue, the CGAP portal was searched for sequence information derived from lung tissue, lung tumor tissue and lung tumor-derived cell lines. A total list of 301 EST libraries was identified using the CGAP portal. The libraries themselves were then retrieved from the UniGene database. Collating the information into a single database gave a total of 471,630 individual sequences. The individual sequences were then assembled with the Paracel Transcript Assembler to give a total of 36,431 contigs and 19,195 singlet ESTs.
Identification of novel lung-expressed sequences
To identify other transcripts that may be expressed in lung tissue, whether normal or malignant, a cDNA library was constructed from a pool of RNA from more than 80 normal and malignant lung tumor tissues. The RNA was reverse transcribed and directionally cloned into a cloning vector. The library was then transformed into bacteria and cultured to produce individual clones. A total of 4,032 clones were picked and sequenced to determine their identity. Screening of the clones gave a total of 3,450 unique sequences, which were assembled to give 602 contigs and 1,589 singlet ESTs. The sequence information from the contigs and singlet ESTs was then searched with BLAST against publicly available databases, including GenBank, to identify entirely novel sequences, and searched again against a database generated from all publicly available lung tissue libraries to identify sequences not previously reported as expressed in lung cancer. This analysis identified a total of 24 novel sequences not previously reported in GenBank as annotated genes or ESTs.
Example 5: List of breast cancer transcriptome sequences
The methods used to derive the breast cancer transcriptome array sequences set out in Gene Lists P to V were similar to those used to obtain the colorectal cancer and lung cancer sequences.
These 87,059 breast cancer sequences are the result of an in-house assembly of publicly available breast EST libraries. They represent a unique collection of breast cancer-related data that has not been shown before. Some of these sequences are expressed in breast cancer and have not previously been identified as annotated genes.
Results
Assembly of breast-derived sequences from public databases
To identify sequences that may be expressed in breast tissue, the CGAP portal was searched for sequence information derived from breast tissue, breast tumor tissue and breast tumor-derived cell lines. A total list of 1,130 EST libraries was identified using the CGAP portal. The libraries themselves were then retrieved from the UniGene database. Collating the information into a single database gave a total of 288,854 individual sequences. The individual sequences were then assembled with the Paracel Transcript Assembler to give a total of 17,2911 contigs and 24,178 singlet ESTs.
Identification of novel breast-expressed sequences
To identify other transcripts that may be expressed in breast tissue, whether normal or malignant, a cDNA library was constructed from a pool of RNA from more than 120 normal and malignant breast tumor tissues. The RNA was reverse transcribed and directionally cloned into a cloning vector. The library was then transformed into bacteria and cultured to produce individual clones. A total of 157,260 clones were picked and sequenced to determine their identity. Screening of the clones gave a total of 127,306 unique sequences, which were assembled to give 14,489 contigs and 24,308 singlet ESTs. The sequence information from the contigs and singlet ESTs was then searched with BLAST against publicly available databases, including GenBank, to identify entirely novel sequences, and searched again against a database generated from all publicly available breast tissue libraries to identify sequences not previously reported as expressed in breast cancer. This analysis identified a total of 3,278 novel sequences not previously reported in GenBank as annotated genes or ESTs.
Example 6: List of transcriptome sequences derived from liver tissue associated with hepatitis
The methods used to derive the transcriptome array sequences of hepatitis-associated liver tissue set out in Gene Lists W to CC were similar to those used to derive the colorectal cancer and lung cancer sequences.
These 86,122 diseased liver tissue sequences are the result of an in-house assembly of publicly available liver EST libraries. They represent a unique collection of data relating to hepatitis-associated liver tissue that has not been shown before. Some of these sequences are expressed in hepatitis-associated liver tissue and have not previously been identified as annotated genes.
Results
Assembly of diseased liver sequences from public databases
To identify sequences that may be expressed in hepatitis-associated liver tissue, the CGAP portal was searched for sequence information derived from liver tissue, hepatitis-associated liver tissue and cell lines derived from hepatitis-associated liver tissue. A total list of 63 EST libraries was identified using the CGAP portal. The libraries themselves were then retrieved from the UniGene database. Collating the information into a single database gave a total of 326,079 individual sequences. The individual sequences were then assembled with the Paracel Transcript Assembler to give a total of 24,744 contigs and 37,503 singlet ESTs. The contigs were then compared with the contigs generated by the sequencing project described below, giving a final 24,744 contigs of public origin.
Identification of novel sequences expressed in hepatitis-associated liver tissue
To identify other transcripts that may be expressed in hepatitis-associated liver tissue, a cDNA library was constructed from a pool of RNA from more than 40 normal and diseased liver tissue samples. The RNA was reverse transcribed and directionally cloned into a vector. The library was then transformed into bacteria and cultured to produce individual clones. A total of 4,944 clones were picked and sequenced to determine their identity. Screening of the clones gave a total of 2,869 unique sequences, which were assembled to give 45 contigs and 2,300 singlet ESTs. The sequence information from the contigs and singlet ESTs was then searched with BLAST against publicly available databases (including the NCBI RefSeq collection) to identify entirely novel sequences, and searched again against a database generated from all publicly available liver tissue libraries to identify sequences not previously reported as expressed in hepatitis-associated liver tissue. This analysis identified a total of 13 novel sequences not previously reported in GenBank as annotated genes or ESTs.
Example 7: List of transcriptome sequences derived from brain tissue associated with neurodegeneration
The methods used to derive the transcriptome array sequences of neurodegenerative brain tissue set out in Gene Lists DD to JJ were similar to those used to derive the colorectal cancer and lung cancer sequences.
These 136,326 diseased brain tissue sequences are the result of an in-house assembly of publicly available brain EST libraries. They represent a unique collection of data relating to neurodegeneration-associated brain tissue that has not been shown before. Some of these sequences are expressed in neurodegeneration-associated brain tissue and have not previously been identified as annotated genes.
Results
Assembly of diseased brain tissue sequences from public databases
To identify sequences that may be expressed in neurodegeneration-associated brain tissue, the CGAP portal was searched for sequence information derived from brain tissue, neurodegeneration-associated brain tissue and cell lines derived from neurodegeneration-associated brain tissue. A total list of 674 EST libraries was identified using the public databases. The libraries themselves were then retrieved from the UniGene database. Collating the information into a single database gave a total of 656,559 individual sequences. The individual sequences were then assembled with the Paracel Transcript Assembler to give a total of 33,275 contigs and 65,022 singlet ESTs.
Identification of novel sequences expressed in neurodegeneration-associated brain tissue
To identify other transcripts that may be expressed in neurodegeneration-associated brain tissue, a cDNA library was constructed from a pool of RNA from more than 20 normal and diseased brain tissue samples. The RNA was reverse transcribed and directionally cloned into a vector. The library was then transformed into bacteria and plated to produce individual clones. A total of 7,200 clones were picked and sequenced to determine their identity. Screening of the clones gave a total of 3,115 sequences, which were assembled to give 346 contigs and 1,671 singlet ESTs. The sequence information from the contigs and singlet ESTs was then searched with BLAST against publicly available databases (including the NCBI RefSeq collection) to identify entirely novel sequences, and searched again against a database generated from all publicly available brain tissue libraries to identify sequences not previously reported as expressed in neurodegeneration-associated brain tissue. This analysis identified a total of 5 novel sequences not previously reported in GenBank as annotated genes or ESTs.
Example 8: Comparison of sequences from colorectal, prostate and breast tumors
Sequences from colorectal, prostate and breast tumors were compared to identify commonly expressed sequences. Figure 2 provides a chart of BLAST comparisons of the publicly available sequences representing all colon, prostate and breast tissue. This is a comparison of all sequences after assembly of the publicly available sequences, obtained as described above. The parameters used for the BLAST searches of these sequences were a cut-off E-value of 0.1 and a percentage identity of 90%. These standard cut-off values were derived from manual inspection and visualization of thousands of individual BLAST results. While allowing for a reasonable degree of difference between sequences, hits that meet these criteria can be clearly classified as "identical" hits. Hits that fail to meet the criteria are considered not to satisfy the requirements for the purposes of array design. For each result, two values are given. The "zero homology" result shows the number of sequences with no homology to the database searched with BLAST. The second value is defined as "no hit"; in this case the query has a percentage coverage of less than 50%, that is, less than 50% of the length of the query sequence is represented in the target sequence.
The zero-homology sequences are a subset of the "no hit" sequences. Subtracting the number of "no hit" sequences from the total number of sequences gives the number of sequences common to the two populations. All documents mentioned in this specification are incorporated herein by reference.
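The classification used in this comparison can be sketched as follows, using the stated cut-offs (E-value 0.1, 90% identity, 50% query coverage). The sketch makes two assumptions that are not spelled out in the text: a query whose hits all fail the E-value or identity cut-off is counted as "zero homology", and coverage is taken as the aligned length of the best qualifying hit divided by the query length. The `Hit` record and the function `summarize` are hypothetical names, not part of any BLAST tool.

```python
from dataclasses import dataclass


@dataclass
class Hit:
    query: str            # query sequence id
    evalue: float         # BLAST E-value of the hit
    identity: float       # percentage identity of the alignment
    aligned_length: int   # number of query bases in the alignment


def summarize(query_lengths, hits, max_evalue=0.1, min_identity=90.0,
              min_coverage=0.5):
    """Count zero-homology, no-hit and common sequences for one comparison."""
    best_coverage = {query: None for query in query_lengths}
    for hit in hits:
        if hit.evalue > max_evalue or hit.identity < min_identity:
            continue  # hit fails the cut-offs and is ignored
        coverage = hit.aligned_length / query_lengths[hit.query]
        current = best_coverage[hit.query]
        if current is None or coverage > current:
            best_coverage[hit.query] = coverage
    total = len(query_lengths)
    zero_homology = sum(1 for c in best_coverage.values() if c is None)
    no_hit = sum(1 for c in best_coverage.values()
                 if c is None or c < min_coverage)
    return {"total": total, "zero_homology": zero_homology,
            "no_hit": no_hit, "common": total - no_hit}


if __name__ == "__main__":
    lengths = {"q1": 200, "q2": 200, "q3": 200, "q4": 200}
    hits = [
        Hit("q1", 1e-50, 98.0, 180),  # good hit, 90% coverage
        Hit("q2", 1e-5, 95.0, 60),    # qualifying hit, only 30% coverage
        Hit("q3", 1e-5, 80.0, 190),   # identity below the cut-off
    ]
    print(summarize(lengths, hits))
```

Because every zero-homology query is also counted in `no_hit`, the result reproduces the subset relation described above, and `common` is the total minus the "no hit" count.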
Various modifications and variations of the described embodiments of the invention will be apparent to those skilled in the art without departing from the scope and spirit of the invention. Although the invention has been described in connection with specific preferred embodiments, it should be understood that the invention as claimed should not be unduly limited to such specific embodiments. Indeed, various modifications of the described modes for carrying out the invention that are obvious to those skilled in the art are intended to be covered by the present invention.
Claims (19)
1. An array comprising a transcriptome of diseased tissue.
2. The array of claim 1, wherein the diseased tissue comprises tissue affected by a neoplastic disease, an inflammatory disease or a degenerative disease.
3. The array of claim 1, wherein the diseased tissue comprises tissue affected by colorectal cancer, lung cancer or breast cancer.
4. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the diseased colorectal tissue sequences, each transcript being independently selected from the transcripts set out in Gene List B, Gene List C, Gene List D, Gene List E, Gene List F, Gene List G or Gene List H.
5. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the diseased lung tissue sequences, each transcript being independently selected from the transcripts set out in Gene List J, Gene List K, Gene List L, Gene List M, Gene List N or Gene List O.
6. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the diseased breast tissue sequences, each transcript being independently selected from the transcripts set out in Gene List Q, Gene List R, Gene List S, Gene List T, Gene List U or Gene List V.
7. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the diseased liver tissue sequences, each transcript being independently selected from the transcripts set out in Gene List X, Gene List Y, Gene List Z, Gene List AA, Gene List BB or Gene List CC.
8. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the diseased brain tissue sequences, each transcript being independently selected from the transcripts set out in Gene List EE, Gene List FF, Gene List GG, Gene List HH, Gene List II or Gene List JJ.
9. The array of any one of claims 1-3, wherein the transcriptome comprises one or more tissue-specific elements, each representing a transcript from the cancer tissue sequences, each transcript being independently selected from the transcripts set out in Gene List B, C, D, E, F, G, H, J, K, L, M, N, O, Q, R, S, T, U or V.
10. The array of any one of claims 4-9, wherein the transcriptome comprises tissue-specific elements representing 70% of the transcripts in at least one of said gene lists.
11. The array of claim 10, wherein the transcriptome comprises tissue-specific elements representing 70% of the transcripts in each of said gene lists.
12. The array of any one of claims 4-11, wherein the tissue-specific element representing the transcript is a nucleic acid molecule having a sequence complementary to the transcript.
13. The array of any one of claims 4-11, wherein the tissue-specific element representing the transcript is a polypeptide encoded by the transcript.
14. The array of any one of claims 4-11, wherein the tissue-specific element representing the transcript is an antibody specific for the polypeptide encoded by the transcript.
15. The array of any preceding claim, wherein the transcriptome comprises nucleic acid molecules of coding and non-coding transcripts derived from diseased tissue.
16. Use of the array of any preceding claim in a method of diagnosing a pathological condition in a patient, the method comprising:
a) contacting transcript-specific elements from a biological sample of the patient with the array; and
b) detecting binding of the transcript-specific elements to the array;
wherein detection of binding indicates a diagnosis of the pathological condition.
17. Use of the array of any preceding claim in a method of detecting whether a patient diagnosed as having a disease or disorder will recover or relapse after initial medical intervention.
18. Use of the array of any preceding claim in a method of detecting the responsiveness of a patient suffering from a pathological condition to a therapeutic agent for treating the pathological condition, the method comprising:
a) contacting transcript-specific elements from a biological sample of the patient with the array; and
b) detecting binding of the transcript-specific elements to the array;
wherein detection of binding indicates the responsiveness of the patient's pathological condition to treatment with the therapeutic agent.
19. The use according to claim 16, 17 or 18 when dependent on claim 12, wherein, in step b), the detection of binding is detection of hybridization.
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04105484.2 | 2004-11-03 | ||
EP04105507.0 | 2004-11-03 | ||
EP04105485.9 | 2004-11-03 | ||
EP04105479.2 | 2004-11-03 | ||
EP04105482.6 | 2004-11-03 | ||
EP04105483.4 | 2004-11-03 | ||
EP04105484 | 2004-11-03 | ||
US60/662,276 | 2005-03-14 | ||
US60/700.293 | 2005-07-18 |
Publications (1)
Publication Number | Publication Date |
---|---|
CN101115848A true CN101115848A (en) | 2008-01-30 |
Family
ID=39023452
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CNA2005800457745A Pending CN101115848A (en) | 2004-11-03 | 2005-11-03 | Transcriptome microarray technology and methods of using the same |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN101115848A (en) |
Patent Citations (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN1554025A (en) * | 2001-03-12 | 2004-12-08 | | Cell-based detection and differentiation of disease states |
CN1472338A (en) * | 2002-08-01 | 2004-02-04 | 深圳市君轩生物技术有限公司 | Tumour related gene testing method |
Non-Patent Citations (3)
Title |
---|
GROS F: "From the messenger RNA saga to the transcriptome era", COMPTES RENDUS BIOLOGIES * |
MULLIGAN K A et al.: "Application of microarray-based expression profiling in cancer research", APPLIED GENOMICS AND PROTEOMICS * |
VAN 'T VEER L et al.: "Gene expression profiling predicts clinical outcome of breast cancer", NATURE * |
Cited By (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102816853A (en) * | 2012-08-30 | 2012-12-12 | 山东百福基因科技有限公司 | Kinesiological related gene EPO (erythropoietin) fluorescent detection reagent kit and detection method |
CN103981269A (en) * | 2014-05-26 | 2014-08-13 | 中南大学 | Application method of long non-coding RNA CRYM-AS1 |
CN107326090A (en) * | 2017-08-23 | 2017-11-07 | 武汉大学 | Quantitative detecting method for the blood platelet LncRNA of Diagnosis of Non-Small Cell Lung |
CN112175949A (en) * | 2020-09-23 | 2021-01-05 | 山东大学第二医院 | Application of lncRNA RP11-394O4.6 in inhibiting the biological function of bladder cancer cells |
CN112175949B (en) * | 2020-09-23 | 2021-05-04 | 山东大学第二医院 | Application of lncRNA RP11-394O4.6 in inhibiting the biological function of bladder cancer cells |
CN115319733A (en) * | 2021-09-18 | 2022-11-11 | 阿马尔(上海)机器人有限公司 | An anti-pinch hand robot door system and its device |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
AU2005300688B2 (en) | Transcriptome microarray technology and methods of using the same | |
JP5237076B2 (en) | Diagnosis and prognosis of breast cancer patients | |
Sharma et al. | Early detection of breast cancer based on gene-expression patterns in peripheral blood cells | |
CN102346816B (en) | Gene expression profiles for identifying prognostic subclasses in nasopharyngeal carcinoma | |
US20120295800A1 (en) | Oligonucleotides for cancer diagnosis | |
KR20120065959A (en) | Markers for predicting gastric cancer prognostication and method for predicting gastric cancer prognostication using the same | |
JP2013516968A (en) | Gene expression platform for diagnosis | |
JP2006519591A (en) | Diagnosis and prognosis of breast cancer patients | |
CN1950701B (en) | Breast cancer prognostics | |
Nagata et al. | Transcriptional profiling in hepatoblastomas using high-density oligonucleotide DNA array | |
KR20100058420A (en) | A transcriptiomic biomarker of myocarditis | |
US20140206565A1 (en) | Esophageal Cancer Markers | |
CA2504403A1 (en) | Prognostic for hematological malignancy | |
CN101115848A (en) | Transcriptome microarray technology and methods of using the same | |
EP2004857B1 (en) | Breast cancer markers | |
KR100835296B1 (en) | Cancer predictive gene set selection method | |
CN110331207A (en) | Adenocarcinoma of lung biomarker and related application | |
US20150011411A1 (en) | Biomarkers of cancer | |
Urquidi et al. | Genomic signatures of breast cancer metastasis | |
Dixon | Identifying Genetic Differences among African American and Caucasian Triple Negative Breast Cancer Genotypes | |
JP2006166789A (en) | New diagnostic method for cancer | |
CN114752675B (en) | Molecular marker for screening, prognosis and immunotherapy evaluation of gastric cancer and application thereof | |
AU2012202562A1 (en) | Transcriptome microarray technology and methods of using the same | |
EP1797196B1 (en) | Detection of breast cancer | |
KR101244543B1 (en) | GENE MAKER SET FOR IDENTIFICATION OF EXPOSURE TO 17-β ESTRADIOL, MICROARRAY CHIP AND METHOD OF DETERMINATION USING THEREOF |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
REG | Reference to a national code | Ref country code: HK; Ref legal event code: DE; Ref document number: 1114643; Country of ref document: HK |
C02 | Deemed withdrawal of patent application after publication (patent law 2001) | ||
WD01 | Invention patent application deemed withdrawn after publication | Application publication date: 20080130 |
REG | Reference to a national code | Ref country code: HK; Ref legal event code: WD; Ref document number: 1114643; Country of ref document: HK |