EP1673623A1 - Methods and applications of biomarker profiles in the diagnosis and treatment of breast cancer - Google Patents
- Publication number
- EP1673623A1 (application EP04733320A)
- Authority
- EP
- European Patent Office
- Prior art keywords
- breast
- breast cancer
- biomolecules
- subjects
- cancer
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Withdrawn
Links
- 208000026310 Breast neoplasm Diseases 0.000 title claims abstract description 244
- 206010006187 Breast cancer Diseases 0.000 title claims abstract description 242
- 238000000034 method Methods 0.000 title claims abstract description 90
- 238000003745 diagnosis Methods 0.000 title claims description 12
- 239000000090 biomarker Substances 0.000 title description 23
- 238000011282 treatment Methods 0.000 title description 6
- 210000000481 breast Anatomy 0.000 claims abstract description 133
- 208000037265 diseases, disorders, signs and symptoms Diseases 0.000 claims abstract description 110
- 201000010099 disease Diseases 0.000 claims abstract description 106
- 230000003211 malignant effect Effects 0.000 claims abstract description 81
- 238000012360 testing method Methods 0.000 claims abstract description 64
- 108090000623 proteins and genes Proteins 0.000 claims abstract description 62
- 102000004169 proteins and genes Human genes 0.000 claims abstract description 62
- 230000003902 lesion Effects 0.000 claims abstract description 56
- 108090000765 processed proteins & peptides Proteins 0.000 claims abstract description 38
- 102000004196 processed proteins & peptides Human genes 0.000 claims abstract description 29
- 238000003748 differential diagnosis Methods 0.000 claims abstract description 27
- 229920001184 polypeptide Polymers 0.000 claims abstract description 24
- 238000001616 ion spectroscopy Methods 0.000 claims abstract description 19
- 239000000523 sample Substances 0.000 claims description 142
- 238000001514 detection method Methods 0.000 claims description 58
- 238000009739 binding Methods 0.000 claims description 48
- 239000003463 adsorbent Substances 0.000 claims description 47
- 210000002966 serum Anatomy 0.000 claims description 46
- 230000027455 binding Effects 0.000 claims description 45
- XSQUKJJJFZCRTK-UHFFFAOYSA-N Urea Chemical compound NC(N)=O XSQUKJJJFZCRTK-UHFFFAOYSA-N 0.000 claims description 38
- 239000012472 biological sample Substances 0.000 claims description 37
- 210000000582 semen Anatomy 0.000 claims description 27
- 238000004458 analytical method Methods 0.000 claims description 26
- UMGDCJDMYOKAJW-UHFFFAOYSA-N thiourea Chemical compound NC(N)=S UMGDCJDMYOKAJW-UHFFFAOYSA-N 0.000 claims description 24
- 210000001519 tissue Anatomy 0.000 claims description 24
- 238000004949 mass spectrometry Methods 0.000 claims description 23
- 125000001453 quaternary ammonium group Chemical group 0.000 claims description 19
- 210000002700 urine Anatomy 0.000 claims description 19
- 239000000872 buffer Substances 0.000 claims description 18
- 238000001574 biopsy Methods 0.000 claims description 17
- 210000004369 blood Anatomy 0.000 claims description 17
- 239000008280 blood Substances 0.000 claims description 17
- 210000003608 fece Anatomy 0.000 claims description 17
- 210000002381 plasma Anatomy 0.000 claims description 17
- 230000009870 specific binding Effects 0.000 claims description 16
- 238000005406 washing Methods 0.000 claims description 16
- 238000004925 denaturation Methods 0.000 claims description 15
- 230000036425 denaturation Effects 0.000 claims description 15
- 210000002751 lymph Anatomy 0.000 claims description 15
- 238000000672 surface-enhanced laser desorption--ionisation Methods 0.000 claims description 15
- 229920004890 Triton X-100 Polymers 0.000 claims description 14
- 239000013504 Triton X-100 Substances 0.000 claims description 14
- 239000012634 fragment Substances 0.000 claims description 14
- UMCMPZBLKLEWAF-BCTGSCMUSA-N 3-[(3-cholamidopropyl)dimethylammonio]propane-1-sulfonate Chemical compound C([C@H]1C[C@H]2O)[C@H](O)CC[C@]1(C)[C@@H]1[C@@H]2[C@@H]2CC[C@H]([C@@H](CCC(=O)NCCC[N+](C)(C)CCCS([O-])(=O)=O)C)[C@@]2(C)[C@@H](O)C1 UMCMPZBLKLEWAF-BCTGSCMUSA-N 0.000 claims description 13
- 239000004202 carbamide Substances 0.000 claims description 13
- 206010003445 Ascites Diseases 0.000 claims description 12
- 210000004908 prostatic fluid Anatomy 0.000 claims description 12
- 239000012148 binding buffer Substances 0.000 claims description 11
- 102000039446 nucleic acids Human genes 0.000 claims description 11
- 108020004707 nucleic acids Proteins 0.000 claims description 11
- 150000007523 nucleic acids Chemical class 0.000 claims description 11
- 239000000758 substrate Substances 0.000 claims description 11
- 125000003275 alpha amino acid group Chemical group 0.000 claims description 10
- 238000004587 chromatography analysis Methods 0.000 claims description 10
- 238000004811 liquid chromatography Methods 0.000 claims description 10
- 238000000816 matrix-assisted laser desorption--ionisation Methods 0.000 claims description 10
- 210000001175 cerebrospinal fluid Anatomy 0.000 claims description 9
- 235000013336 milk Nutrition 0.000 claims description 9
- 210000004080 milk Anatomy 0.000 claims description 9
- 239000008267 milk Substances 0.000 claims description 9
- 210000002445 nipple Anatomy 0.000 claims description 9
- 210000003296 saliva Anatomy 0.000 claims description 9
- 210000004243 sweat Anatomy 0.000 claims description 9
- 210000001138 tear Anatomy 0.000 claims description 9
- 238000007865 diluting Methods 0.000 claims description 8
- 238000002330 electrospray ionisation mass spectrometry Methods 0.000 claims description 7
- 238000004128 high performance liquid chromatography Methods 0.000 claims description 7
- 230000001131 transforming effect Effects 0.000 claims description 6
- -1 antibodies Proteins 0.000 claims description 5
- 238000000338 in vitro Methods 0.000 claims description 4
- 238000004885 tandem mass spectrometry Methods 0.000 claims description 4
- 238000004422 calculation algorithm Methods 0.000 claims description 3
- 150000007942 carboxylates Chemical group 0.000 claims description 3
- 238000005194 fractionation Methods 0.000 claims description 3
- 229910021645 metal ion Inorganic materials 0.000 claims description 3
- 125000000217 alkyl group Chemical group 0.000 claims description 2
- 125000003118 aryl group Chemical group 0.000 claims description 2
- HJMZMZRCABDKKV-UHFFFAOYSA-N carbonocyanidic acid Chemical group OC(=O)C#N HJMZMZRCABDKKV-UHFFFAOYSA-N 0.000 claims description 2
- 238000012512 characterization method Methods 0.000 abstract 1
- 206010009944 Colon cancer Diseases 0.000 description 63
- 208000001333 Colorectal Neoplasms Diseases 0.000 description 63
- 206010061902 Pancreatic neoplasm Diseases 0.000 description 58
- 208000015486 malignant pancreatic neoplasm Diseases 0.000 description 58
- 201000002528 pancreatic cancer Diseases 0.000 description 58
- 208000008443 pancreatic carcinoma Diseases 0.000 description 58
- 206010028980 Neoplasm Diseases 0.000 description 47
- 201000011510 cancer Diseases 0.000 description 33
- 238000003066 decision tree Methods 0.000 description 33
- 239000000243 solution Substances 0.000 description 33
- 150000002500 ions Chemical class 0.000 description 32
- 201000009030 Carcinoma Diseases 0.000 description 29
- 239000000463 material Substances 0.000 description 19
- 230000035945 sensitivity Effects 0.000 description 17
- QKNYBSVHEMOAJP-UHFFFAOYSA-N 2-amino-2-(hydroxymethyl)propane-1,3-diol;hydron;chloride Chemical compound Cl.OCC(N)(CO)CO QKNYBSVHEMOAJP-UHFFFAOYSA-N 0.000 description 15
- 108091033319 polynucleotide Proteins 0.000 description 14
- 239000002157 polynucleotide Substances 0.000 description 14
- 102000040430 polynucleotide Human genes 0.000 description 14
- VHJLVAABSRFDPM-QWWZWVQMSA-N dithiothreitol Chemical compound SC[C@@H](O)[C@H](O)CS VHJLVAABSRFDPM-QWWZWVQMSA-N 0.000 description 13
- 230000003993 interaction Effects 0.000 description 13
- 239000002773 nucleotide Substances 0.000 description 12
- 125000003729 nucleotide group Chemical group 0.000 description 12
- 208000037396 Intraductal Noninfiltrating Carcinoma Diseases 0.000 description 11
- 125000002091 cationic group Chemical group 0.000 description 11
- 208000028715 ductal breast carcinoma in situ Diseases 0.000 description 11
- 238000001228 spectrum Methods 0.000 description 11
- 238000003556 assay Methods 0.000 description 10
- 150000003431 steroids Chemical class 0.000 description 10
- WEVYAHXRMPXWCK-UHFFFAOYSA-N Acetonitrile Chemical compound CC#N WEVYAHXRMPXWCK-UHFFFAOYSA-N 0.000 description 9
- 108090000288 Glycoproteins Proteins 0.000 description 9
- 102000003886 Glycoproteins Human genes 0.000 description 9
- 108090001030 Lipoproteins Proteins 0.000 description 9
- 102000004895 Lipoproteins Human genes 0.000 description 9
- 102000004389 Ribonucleoproteins Human genes 0.000 description 9
- 108010081734 Ribonucleoproteins Proteins 0.000 description 9
- 235000001014 amino acid Nutrition 0.000 description 9
- 150000001413 amino acids Chemical class 0.000 description 9
- 238000005349 anion exchange Methods 0.000 description 9
- 238000013459 approach Methods 0.000 description 9
- 150000001720 carbohydrates Chemical class 0.000 description 9
- 235000014633 carbohydrates Nutrition 0.000 description 9
- 210000004027 cell Anatomy 0.000 description 9
- 238000003795 desorption Methods 0.000 description 9
- 235000014113 dietary fatty acids Nutrition 0.000 description 9
- 229930195729 fatty acid Natural products 0.000 description 9
- 239000000194 fatty acid Substances 0.000 description 9
- 150000004665 fatty acids Chemical class 0.000 description 9
- 150000002632 lipids Chemical class 0.000 description 9
- 235000000346 sugar Nutrition 0.000 description 9
- 150000008163 sugars Chemical class 0.000 description 9
- 239000003795 chemical substances by application Substances 0.000 description 8
- 150000001875 compounds Chemical class 0.000 description 8
- 238000011161 development Methods 0.000 description 8
- 230000018109 developmental process Effects 0.000 description 8
- 238000005516 engineering process Methods 0.000 description 8
- 239000011159 matrix material Substances 0.000 description 8
- 239000000126 substance Substances 0.000 description 8
- 238000002405 diagnostic procedure Methods 0.000 description 7
- 238000011534 incubation Methods 0.000 description 7
- 229910052751 metal Inorganic materials 0.000 description 7
- 239000002184 metal Substances 0.000 description 7
- 238000002203 pretreatment Methods 0.000 description 7
- 238000007637 random forest analysis Methods 0.000 description 7
- 108020004414 DNA Proteins 0.000 description 6
- 206010073099 Lobular breast carcinoma in situ Diseases 0.000 description 6
- PXHVJJICTQNCMI-UHFFFAOYSA-N Nickel Chemical compound [Ni] PXHVJJICTQNCMI-UHFFFAOYSA-N 0.000 description 6
- 239000003480 eluent Substances 0.000 description 6
- 238000001819 mass spectrum Methods 0.000 description 6
- 238000012549 training Methods 0.000 description 6
- 150000001450 anions Chemical class 0.000 description 5
- 239000007864 aqueous solution Substances 0.000 description 5
- 208000030270 breast disease Diseases 0.000 description 5
- 239000003183 carcinogenic agent Substances 0.000 description 5
- 238000010276 construction Methods 0.000 description 5
- 238000002790 cross-validation Methods 0.000 description 5
- 238000010790 dilution Methods 0.000 description 5
- 239000012895 dilution Substances 0.000 description 5
- 239000003550 marker Substances 0.000 description 5
- 238000005259 measurement Methods 0.000 description 5
- 230000000405 serological effect Effects 0.000 description 5
- 108091032973 (ribonucleotides)n+m Proteins 0.000 description 4
- 102000036365 BRCA1 Human genes 0.000 description 4
- 101150072950 BRCA1 gene Proteins 0.000 description 4
- 108010022366 Carcinoembryonic Antigen Proteins 0.000 description 4
- 102100025475 Carcinoembryonic antigen-related cell adhesion molecule 5 Human genes 0.000 description 4
- 101000971703 Homo sapiens Kinesin-like protein KIF1C Proteins 0.000 description 4
- 101000979579 Homo sapiens NK1 transcription factor-related protein 1 Proteins 0.000 description 4
- 102100021525 Kinesin-like protein KIF1C Human genes 0.000 description 4
- 206010027476 Metastases Diseases 0.000 description 4
- 231100000357 carcinogen Toxicity 0.000 description 4
- 231100000504 carcinogenesis Toxicity 0.000 description 4
- 208000035475 disorder Diseases 0.000 description 4
- 230000006870 function Effects 0.000 description 4
- 230000014509 gene expression Effects 0.000 description 4
- 230000002209 hydrophobic effect Effects 0.000 description 4
- 206010020718 hyperplasia Diseases 0.000 description 4
- 238000011065 in-situ storage Methods 0.000 description 4
- 230000002757 inflammatory effect Effects 0.000 description 4
- 201000003159 intraductal papilloma Diseases 0.000 description 4
- 208000030776 invasive breast carcinoma Diseases 0.000 description 4
- 238000009607 mammography Methods 0.000 description 4
- 239000000203 mixture Substances 0.000 description 4
- 230000002797 proteolythic effect Effects 0.000 description 4
- 150000003839 salts Chemical class 0.000 description 4
- PCMORTLOPMLEFB-ONEGZZNKSA-N sinapic acid Chemical compound COC1=CC(\C=C\C(O)=O)=CC(OC)=C1O PCMORTLOPMLEFB-ONEGZZNKSA-N 0.000 description 4
- PCMORTLOPMLEFB-UHFFFAOYSA-N sinapinic acid Natural products COC1=CC(C=CC(O)=O)=CC(OC)=C1O PCMORTLOPMLEFB-UHFFFAOYSA-N 0.000 description 4
- 238000004611 spectroscopical analysis Methods 0.000 description 4
- 238000002560 therapeutic procedure Methods 0.000 description 4
- 108700020463 BRCA1 Proteins 0.000 description 3
- 101150008921 Brca2 gene Proteins 0.000 description 3
- 102000004190 Enzymes Human genes 0.000 description 3
- 108090000790 Enzymes Proteins 0.000 description 3
- LFQSCWFLJHTTHZ-UHFFFAOYSA-N Ethanol Chemical compound CCO LFQSCWFLJHTTHZ-UHFFFAOYSA-N 0.000 description 3
- 206010061857 Fat necrosis Diseases 0.000 description 3
- 208000034826 Genetic Predisposition to Disease Diseases 0.000 description 3
- DTQVDTLACAAQTR-UHFFFAOYSA-N Trifluoroacetic acid Chemical compound OC(=O)C(F)(F)F DTQVDTLACAAQTR-UHFFFAOYSA-N 0.000 description 3
- 238000003491 array Methods 0.000 description 3
- 238000005341 cation exchange Methods 0.000 description 3
- 239000000470 constituent Substances 0.000 description 3
- 230000000875 corresponding effect Effects 0.000 description 3
- 230000007423 decrease Effects 0.000 description 3
- 230000003247 decreasing effect Effects 0.000 description 3
- 239000003599 detergent Substances 0.000 description 3
- 238000007598 dipping method Methods 0.000 description 3
- 229940088598 enzyme Drugs 0.000 description 3
- 238000011156 evaluation Methods 0.000 description 3
- 238000001502 gel electrophoresis Methods 0.000 description 3
- 230000036449 good health Effects 0.000 description 3
- 239000005556 hormone Substances 0.000 description 3
- 229940088597 hormone Drugs 0.000 description 3
- RAXXELZNTBOGNW-UHFFFAOYSA-N imidazole Natural products C1=CNC=N1 RAXXELZNTBOGNW-UHFFFAOYSA-N 0.000 description 3
- 238000003018 immunoassay Methods 0.000 description 3
- 239000012535 impurity Substances 0.000 description 3
- 238000001727 in vivo Methods 0.000 description 3
- 210000001165 lymph node Anatomy 0.000 description 3
- 230000009401 metastasis Effects 0.000 description 3
- 238000012986 modification Methods 0.000 description 3
- 230000004048 modification Effects 0.000 description 3
- 230000001613 neoplastic effect Effects 0.000 description 3
- 230000010309 neoplastic transformation Effects 0.000 description 3
- 229910052759 nickel Inorganic materials 0.000 description 3
- 230000001575 pathological effect Effects 0.000 description 3
- 230000008569 process Effects 0.000 description 3
- 238000004393 prognosis Methods 0.000 description 3
- 230000002062 proliferating effect Effects 0.000 description 3
- 230000005855 radiation Effects 0.000 description 3
- 201000008662 sclerosing adenosis of breast Diseases 0.000 description 3
- 238000000926 separation method Methods 0.000 description 3
- 239000007787 solid Substances 0.000 description 3
- 238000005507 spraying Methods 0.000 description 3
- 238000001356 surgical procedure Methods 0.000 description 3
- WXTMDXOMEHJXQO-UHFFFAOYSA-N 2,5-dihydroxybenzoic acid Chemical compound OC(=O)C1=CC(O)=CC=C1O WXTMDXOMEHJXQO-UHFFFAOYSA-N 0.000 description 2
- 102000052609 BRCA2 Human genes 0.000 description 2
- 108700020462 BRCA2 Proteins 0.000 description 2
- 241000283690 Bos taurus Species 0.000 description 2
- 206010055113 Breast cancer metastatic Diseases 0.000 description 2
- RYGMFSIKBFXOCR-UHFFFAOYSA-N Copper Chemical compound [Cu] RYGMFSIKBFXOCR-UHFFFAOYSA-N 0.000 description 2
- 238000009007 Diagnostic Kit Methods 0.000 description 2
- 208000007659 Fibroadenoma Diseases 0.000 description 2
- GYHNNYVSQQEPJS-UHFFFAOYSA-N Gallium Chemical compound [Ga] GYHNNYVSQQEPJS-UHFFFAOYSA-N 0.000 description 2
- 208000002628 Granulomatous mastitis Diseases 0.000 description 2
- 208000033640 Hereditary breast cancer Diseases 0.000 description 2
- 241000701044 Human gammaherpesvirus 4 Species 0.000 description 2
- 241000701806 Human papillomavirus Species 0.000 description 2
- 102000004195 Isomerases Human genes 0.000 description 2
- 108090000769 Isomerases Proteins 0.000 description 2
- 108700019961 Neoplasm Genes Proteins 0.000 description 2
- 102000048850 Neoplasm Genes Human genes 0.000 description 2
- 208000008589 Obesity Diseases 0.000 description 2
- 108091034117 Oligonucleotide Proteins 0.000 description 2
- 208000031816 Pathologic Dilatation Diseases 0.000 description 2
- 239000004696 Poly ether ether ketone Substances 0.000 description 2
- 241000288906 Primates Species 0.000 description 2
- FAPWRFPIFSIZLT-UHFFFAOYSA-M Sodium chloride Chemical compound [Na+].[Cl-] FAPWRFPIFSIZLT-UHFFFAOYSA-M 0.000 description 2
- 229910000831 Steel Inorganic materials 0.000 description 2
- 239000007983 Tris buffer Substances 0.000 description 2
- HCHKCACWOHOZIP-UHFFFAOYSA-N Zinc Chemical compound [Zn] HCHKCACWOHOZIP-UHFFFAOYSA-N 0.000 description 2
- 230000001154 acute effect Effects 0.000 description 2
- 208000009956 adenocarcinoma Diseases 0.000 description 2
- 238000001042 affinity chromatography Methods 0.000 description 2
- 238000007605 air drying Methods 0.000 description 2
- 230000004075 alteration Effects 0.000 description 2
- 238000003287 bathing Methods 0.000 description 2
- 230000004071 biological effect Effects 0.000 description 2
- 239000013060 biological fluid Substances 0.000 description 2
- 210000001185 bone marrow Anatomy 0.000 description 2
- 201000008275 breast carcinoma Diseases 0.000 description 2
- 201000005389 breast carcinoma in situ Diseases 0.000 description 2
- 238000004364 calculation method Methods 0.000 description 2
- 208000035269 cancer or benign tumor Diseases 0.000 description 2
- 230000001413 cellular effect Effects 0.000 description 2
- 239000002738 chelating agent Substances 0.000 description 2
- 238000006243 chemical reaction Methods 0.000 description 2
- 239000003153 chemical reaction reagent Substances 0.000 description 2
- 239000003638 chemical reducing agent Substances 0.000 description 2
- 230000001684 chronic effect Effects 0.000 description 2
- 150000001851 cinnamic acid derivatives Chemical class 0.000 description 2
- 239000000084 colloidal system Substances 0.000 description 2
- 229910052802 copper Inorganic materials 0.000 description 2
- 239000010949 copper Substances 0.000 description 2
- 238000007405 data analysis Methods 0.000 description 2
- 230000002950 deficient Effects 0.000 description 2
- 230000001419 dependent effect Effects 0.000 description 2
- 230000004069 differentiation Effects 0.000 description 2
- LOKCTEFSRHRXRJ-UHFFFAOYSA-I dipotassium trisodium dihydrogen phosphate hydrogen phosphate dichloride Chemical compound P(=O)(O)(O)[O-].[K+].P(=O)(O)([O-])[O-].[Na+].[Na+].[Cl-].[K+].[Cl-].[Na+] LOKCTEFSRHRXRJ-UHFFFAOYSA-I 0.000 description 2
- 238000010828 elution Methods 0.000 description 2
- 230000007613 environmental effect Effects 0.000 description 2
- 238000010265 fast atom bombardment Methods 0.000 description 2
- 229910052733 gallium Inorganic materials 0.000 description 2
- 231100000722 genetic damage Toxicity 0.000 description 2
- 230000000762 glandular Effects 0.000 description 2
- 239000011521 glass Substances 0.000 description 2
- 150000004676 glycans Chemical class 0.000 description 2
- 208000025581 hereditary breast carcinoma Diseases 0.000 description 2
- NOESYZHRGYRDHS-UHFFFAOYSA-N insulin Chemical compound N1C(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(NC(=O)CN)C(C)CC)CSSCC(C(NC(CO)C(=O)NC(CC(C)C)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CCC(N)=O)C(=O)NC(CC(C)C)C(=O)NC(CCC(O)=O)C(=O)NC(CC(N)=O)C(=O)NC(CC=2C=CC(O)=CC=2)C(=O)NC(CSSCC(NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2C=CC(O)=CC=2)NC(=O)C(CC(C)C)NC(=O)C(C)NC(=O)C(CCC(O)=O)NC(=O)C(C(C)C)NC(=O)C(CC(C)C)NC(=O)C(CC=2NC=NC=2)NC(=O)C(CO)NC(=O)CNC2=O)C(=O)NCC(=O)NC(CCC(O)=O)C(=O)NC(CCCNC(N)=N)C(=O)NCC(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC=CC=3)C(=O)NC(CC=3C=CC(O)=CC=3)C(=O)NC(C(C)O)C(=O)N3C(CCC3)C(=O)NC(CCCCN)C(=O)NC(C)C(O)=O)C(=O)NC(CC(N)=O)C(O)=O)=O)NC(=O)C(C(C)CC)NC(=O)C(CO)NC(=O)C(C(C)O)NC(=O)C1CSSCC2NC(=O)C(CC(C)C)NC(=O)C(NC(=O)C(CCC(N)=O)NC(=O)C(CC(N)=O)NC(=O)C(NC(=O)C(N)CC=1C=CC=CC=1)C(C)C)CC1=CN=CN1 NOESYZHRGYRDHS-UHFFFAOYSA-N 0.000 description 2
- 238000011835 investigation Methods 0.000 description 2
- 238000002372 labelling Methods 0.000 description 2
- 238000004989 laser desorption mass spectroscopy Methods 0.000 description 2
- 239000003446 ligand Substances 0.000 description 2
- 201000011059 lobular neoplasia Diseases 0.000 description 2
- 238000010801 machine learning Methods 0.000 description 2
- 230000009245 menopause Effects 0.000 description 2
- 108020004999 messenger RNA Proteins 0.000 description 2
- 230000000877 morphologic effect Effects 0.000 description 2
- 238000010606 normalization Methods 0.000 description 2
- 235000020824 obesity Nutrition 0.000 description 2
- 239000003960 organic solvent Substances 0.000 description 2
- 238000002559 palpation Methods 0.000 description 2
- 201000010198 papillary carcinoma Diseases 0.000 description 2
- 208000003154 papilloma Diseases 0.000 description 2
- 239000002245 particle Substances 0.000 description 2
- 239000002953 phosphate buffered saline Substances 0.000 description 2
- 108091005981 phosphorylated proteins Proteins 0.000 description 2
- 229920002530 polyetherether ketone Polymers 0.000 description 2
- 229920001282 polysaccharide Polymers 0.000 description 2
- 239000005017 polysaccharide Substances 0.000 description 2
- 238000000746 purification Methods 0.000 description 2
- 238000011160 research Methods 0.000 description 2
- 230000000717 retained effect Effects 0.000 description 2
- 238000012216 screening Methods 0.000 description 2
- 238000010187 selection method Methods 0.000 description 2
- 238000011896 sensitive detection Methods 0.000 description 2
- 238000001542 size-exclusion chromatography Methods 0.000 description 2
- 238000002791 soaking Methods 0.000 description 2
- 239000002904 solvent Substances 0.000 description 2
- 238000000638 solvent extraction Methods 0.000 description 2
- 238000011895 specific detection Methods 0.000 description 2
- 238000010561 standard procedure Methods 0.000 description 2
- 238000007619 statistical method Methods 0.000 description 2
- 239000010959 steel Substances 0.000 description 2
- 230000004083 survival effect Effects 0.000 description 2
- 229920002994 synthetic fiber Polymers 0.000 description 2
- LENZDBCJOHFCAS-UHFFFAOYSA-N tris Chemical compound OCC(N)(CO)CO LENZDBCJOHFCAS-UHFFFAOYSA-N 0.000 description 2
- 238000000539 two dimensional gel electrophoresis Methods 0.000 description 2
- 230000000007 visual effect Effects 0.000 description 2
- 229910052725 zinc Inorganic materials 0.000 description 2
- 239000011701 zinc Substances 0.000 description 2
- JKMHFZQWWAIEOD-UHFFFAOYSA-N 2-[4-(2-hydroxyethyl)piperazin-1-yl]ethanesulfonic acid Chemical compound OCC[NH+]1CCN(CCS([O-])(=O)=O)CC1 JKMHFZQWWAIEOD-UHFFFAOYSA-N 0.000 description 1
- 102100031126 6-phosphogluconolactonase Human genes 0.000 description 1
- 108010029731 6-phosphogluconolactonase Proteins 0.000 description 1
- 108010022752 Acetylcholinesterase Proteins 0.000 description 1
- 102000012440 Acetylcholinesterase Human genes 0.000 description 1
- ORILYTVJVMAKLC-UHFFFAOYSA-N Adamantane Natural products C1C(C2)CC3CC1CC2C3 ORILYTVJVMAKLC-UHFFFAOYSA-N 0.000 description 1
- 108010000239 Aequorin Proteins 0.000 description 1
- 102000007698 Alcohol dehydrogenase Human genes 0.000 description 1
- 108010021809 Alcohol dehydrogenase Proteins 0.000 description 1
- 102000002260 Alkaline Phosphatase Human genes 0.000 description 1
- 108020004774 Alkaline Phosphatase Proteins 0.000 description 1
- 108010024976 Asparaginase Proteins 0.000 description 1
- 102000015790 Asparaginase Human genes 0.000 description 1
- 238000012935 Averaging Methods 0.000 description 1
- 108700040618 BRCA1 Genes Proteins 0.000 description 1
- 108700010154 BRCA2 Genes Proteins 0.000 description 1
- 102400000748 Beta-endorphin Human genes 0.000 description 1
- 101800005049 Beta-endorphin Proteins 0.000 description 1
- 102100026189 Beta-galactosidase Human genes 0.000 description 1
- 102100035882 Catalase Human genes 0.000 description 1
- 108010053835 Catalase Proteins 0.000 description 1
- VYZAMTAEIAYCRO-UHFFFAOYSA-N Chromium Chemical compound [Cr] VYZAMTAEIAYCRO-UHFFFAOYSA-N 0.000 description 1
- 102100030497 Cytochrome c Human genes 0.000 description 1
- 108010075031 Cytochromes c Proteins 0.000 description 1
- IGXWBGJHJZYPQS-SSDOTTSWSA-N D-Luciferin Chemical compound OC(=O)[C@H]1CSC(C=2SC3=CC=C(O)C=C3N=2)=N1 IGXWBGJHJZYPQS-SSDOTTSWSA-N 0.000 description 1
- 238000000018 DNA microarray Methods 0.000 description 1
- CYCGRDQQIOGCKX-UHFFFAOYSA-N Dehydro-luciferin Natural products OC(=O)C1=CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 CYCGRDQQIOGCKX-UHFFFAOYSA-N 0.000 description 1
- 102400000242 Dynorphin A(1-17) Human genes 0.000 description 1
- 108010065372 Dynorphins Proteins 0.000 description 1
- 238000004435 EPR spectroscopy Methods 0.000 description 1
- 241000283073 Equus caballus Species 0.000 description 1
- 241001141491 Eumorpha elisa Species 0.000 description 1
- BJGNCJDXODQBOB-UHFFFAOYSA-N Fivefly Luciferin Natural products OC(=O)C1CSC(C=2SC3=CC(O)=CC=C3N=2)=N1 BJGNCJDXODQBOB-UHFFFAOYSA-N 0.000 description 1
- 238000005033 Fourier transform infrared spectroscopy Methods 0.000 description 1
- 108010073178 Glucan 1,4-alpha-Glucosidase Proteins 0.000 description 1
- 102100022624 Glucoamylase Human genes 0.000 description 1
- 108010015776 Glucose oxidase Proteins 0.000 description 1
- 239000004366 Glucose oxidase Substances 0.000 description 1
- 108010018962 Glucosephosphate Dehydrogenase Proteins 0.000 description 1
- 239000007995 HEPES buffer Substances 0.000 description 1
- 241000590002 Helicobacter pylori Species 0.000 description 1
- 241000700721 Hepatitis B virus Species 0.000 description 1
- 108010001336 Horseradish Peroxidase Proteins 0.000 description 1
- 102100023915 Insulin Human genes 0.000 description 1
- 108090001061 Insulin Proteins 0.000 description 1
- 108060001084 Luciferase Proteins 0.000 description 1
- 239000005089 Luciferase Substances 0.000 description 1
- DDWFXDSYGUXRAY-UHFFFAOYSA-N Luciferin Natural products CCc1c(C)c(CC2NC(=O)C(=C2C=C)C)[nH]c1Cc3[nH]c4C(=C5/NC(CC(=O)O)C(C)C5CC(=O)O)CC(=O)c4c3C DDWFXDSYGUXRAY-UHFFFAOYSA-N 0.000 description 1
- 102000013460 Malate Dehydrogenase Human genes 0.000 description 1
- 108010026217 Malate Dehydrogenase Proteins 0.000 description 1
- 108010059724 Micrococcal Nuclease Proteins 0.000 description 1
- 102100030856 Myoglobin Human genes 0.000 description 1
- 108010062374 Myoglobin Proteins 0.000 description 1
- 206010033128 Ovarian cancer Diseases 0.000 description 1
- 206010061535 Ovarian neoplasm Diseases 0.000 description 1
- 108091093037 Peptide nucleic acid Proteins 0.000 description 1
- 108010053210 Phycocyanin Proteins 0.000 description 1
- 108010026552 Proteome Proteins 0.000 description 1
- 108020004511 Recombinant DNA Proteins 0.000 description 1
- 102000006382 Ribonucleases Human genes 0.000 description 1
- 108010083644 Ribonucleases Proteins 0.000 description 1
- 229920002684 Sepharose Polymers 0.000 description 1
- 208000000453 Skin Neoplasms Diseases 0.000 description 1
- DBMJMQXJHONAFJ-UHFFFAOYSA-M Sodium laurylsulphate Chemical compound [Na+].CCCCCCCCCCCCOS([O-])(=O)=O DBMJMQXJHONAFJ-UHFFFAOYSA-M 0.000 description 1
- 210000001744 T-lymphocyte Anatomy 0.000 description 1
- 108010046334 Urease Proteins 0.000 description 1
- BZHJMEDXRYGGRV-UHFFFAOYSA-N Vinyl chloride Chemical compound ClC=C BZHJMEDXRYGGRV-UHFFFAOYSA-N 0.000 description 1
- 241000700605 Viruses Species 0.000 description 1
- JLCPHMBAVCMARE-UHFFFAOYSA-N [3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[3-[[3-[[3-[[3-[[3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-[[5-(2-amino-6-oxo-1H-purin-9-yl)-3-hydroxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxyoxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(5-methyl-2,4-dioxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(6-aminopurin-9-yl)oxolan-2-yl]methoxy-hydroxyphosphoryl]oxy-5-(4-amino-2-oxopyrimidin-1-yl)oxolan-2-yl]methyl [5-(6-aminopurin-9-yl)-2-(hydroxymethyl)oxolan-3-yl] hydrogen phosphate Polymers 
Cc1cn(C2CC(OP(O)(=O)OCC3OC(CC3OP(O)(=O)OCC3OC(CC3O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c3nc(N)[nH]c4=O)C(COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3COP(O)(=O)OC3CC(OC3CO)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3ccc(N)nc3=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cc(C)c(=O)[nH]c3=O)n3cc(C)c(=O)[nH]c3=O)n3ccc(N)nc3=O)n3cc(C)c(=O)[nH]c3=O)n3cnc4c3nc(N)[nH]c4=O)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)n3cnc4c(N)ncnc34)O2)c(=O)[nH]c1=O JLCPHMBAVCMARE-UHFFFAOYSA-N 0.000 description 1
- 230000001133 acceleration Effects 0.000 description 1
- 239000008351 acetate buffer Substances 0.000 description 1
- 229940022698 acetylcholinesterase Drugs 0.000 description 1
- DZBUGLKDJFMEHC-UHFFFAOYSA-N acridine Chemical class C1=CC=CC2=CC3=CC=CC=C3N=C21 DZBUGLKDJFMEHC-UHFFFAOYSA-N 0.000 description 1
- 230000009471 action Effects 0.000 description 1
- 230000016571 aggressive behavior Effects 0.000 description 1
- 239000002168 alkylating agent Substances 0.000 description 1
- 229940100198 alkylating agent Drugs 0.000 description 1
- AWUCVROLDVIAJX-UHFFFAOYSA-N alpha-glycerophosphate Natural products OCC(O)COP(O)(O)=O AWUCVROLDVIAJX-UHFFFAOYSA-N 0.000 description 1
- 150000001408 amides Chemical class 0.000 description 1
- 125000000539 amino acid group Chemical group 0.000 description 1
- 230000033115 angiogenesis Effects 0.000 description 1
- 230000002942 anti-growth Effects 0.000 description 1
- 230000000692 anti-sense effect Effects 0.000 description 1
- 108091007433 antigens Proteins 0.000 description 1
- 102000036639 antigens Human genes 0.000 description 1
- 238000003782 apoptosis assay Methods 0.000 description 1
- 230000006907 apoptotic process Effects 0.000 description 1
- 150000004982 aromatic amines Chemical class 0.000 description 1
- 229910052785 arsenic Inorganic materials 0.000 description 1
- RQNWIZPPADIBDY-UHFFFAOYSA-N arsenic atom Chemical compound [As] RQNWIZPPADIBDY-UHFFFAOYSA-N 0.000 description 1
- 239000010425 asbestos Substances 0.000 description 1
- 229960003272 asparaginase Drugs 0.000 description 1
- DCXYFEDJOCDNAF-UHFFFAOYSA-M asparaginate Chemical compound [O-]C(=O)C(N)CC(N)=O DCXYFEDJOCDNAF-UHFFFAOYSA-M 0.000 description 1
- 238000000376 autoradiography Methods 0.000 description 1
- 239000000987 azo dye Substances 0.000 description 1
- 239000011324 bead Substances 0.000 description 1
- 230000008901 benefit Effects 0.000 description 1
- 108010005774 beta-Galactosidase Proteins 0.000 description 1
- WOPZMFQRCBYPJU-NTXHZHDSSA-N beta-endorphin Chemical compound C([C@@H](C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](C)C(=O)N[C@@H](CC=1C=CC(O)=CC=1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CCCCN)C(=O)NCC(=O)N[C@@H](CCC(N)=O)C(O)=O)NC(=O)[C@H](CC(C)C)NC(=O)[C@@H](NC(=O)[C@@H](NC(=O)[C@H](CC(C)C)NC(=O)[C@H]1N(CCC1)C(=O)[C@@H](NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@H](CCCCN)NC(=O)[C@H](CCC(N)=O)NC(=O)[C@H](CO)NC(=O)[C@@H](NC(=O)[C@H](CCSC)NC(=O)[C@H](CC=1C=CC=CC=1)NC(=O)CNC(=O)CNC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)[C@@H](C)O)[C@@H](C)O)C(C)C)[C@@H](C)O)C1=CC=CC=C1 WOPZMFQRCBYPJU-NTXHZHDSSA-N 0.000 description 1
- 210000000069 breast epithelial cell Anatomy 0.000 description 1
- 230000000711 cancerogenic effect Effects 0.000 description 1
- PFKFTWBEEFSNDU-UHFFFAOYSA-N carbonyldiimidazole Chemical group C1=CN=CN1C(=O)N1C=CN=C1 PFKFTWBEEFSNDU-UHFFFAOYSA-N 0.000 description 1
- 230000000747 cardiac effect Effects 0.000 description 1
- 238000004113 cell culture Methods 0.000 description 1
- 230000036978 cell physiology Effects 0.000 description 1
- 230000030570 cellular localization Effects 0.000 description 1
- 230000008859 change Effects 0.000 description 1
- 229910052804 chromium Inorganic materials 0.000 description 1
- 239000011651 chromium Substances 0.000 description 1
- 239000003593 chromogenic compound Substances 0.000 description 1
- 230000002759 chromosomal effect Effects 0.000 description 1
- 238000005352 clarification Methods 0.000 description 1
- 238000004590 computer program Methods 0.000 description 1
- 238000007796 conventional method Methods 0.000 description 1
- 230000002596 correlated effect Effects 0.000 description 1
- 230000001186 cumulative effect Effects 0.000 description 1
- 230000034994 death Effects 0.000 description 1
- 238000012217 deletion Methods 0.000 description 1
- 230000037430 deletion Effects 0.000 description 1
- 238000013461 design Methods 0.000 description 1
- 235000005911 diet Nutrition 0.000 description 1
- 230000037213 diet Effects 0.000 description 1
- 230000029087 digestion Effects 0.000 description 1
- 239000003814 drug Substances 0.000 description 1
- 229940079593 drug Drugs 0.000 description 1
- JMNJYGMAUMANNW-FIXZTSJVSA-N dynorphin a Chemical compound C([C@@H](C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N[C@@H]([C@@H](C)CC)C(=O)N[C@@H](CCCNC(N)=N)C(=O)N1[C@@H](CCC1)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC(C)C)C(=O)N[C@@H](CCCCN)C(=O)N[C@@H](CC=1C2=CC=CC=C2NC=1)C(=O)N[C@@H](CC(O)=O)C(=O)N[C@@H](CC(N)=O)C(=O)N[C@@H](CCC(N)=O)C(O)=O)NC(=O)CNC(=O)CNC(=O)[C@@H](N)CC=1C=CC(O)=CC=1)C1=CC=CC=C1 JMNJYGMAUMANNW-FIXZTSJVSA-N 0.000 description 1
- 230000002255 enzymatic effect Effects 0.000 description 1
- 238000006911 enzymatic reaction Methods 0.000 description 1
- 210000002919 epithelial cell Anatomy 0.000 description 1
- 125000003700 epoxy group Chemical group 0.000 description 1
- DNJIEGIFACGWOD-UHFFFAOYSA-N ethyl mercaptane Natural products CCS DNJIEGIFACGWOD-UHFFFAOYSA-N 0.000 description 1
- 230000029142 excretion Effects 0.000 description 1
- 230000001747 exhibiting effect Effects 0.000 description 1
- 239000003925 fat Substances 0.000 description 1
- ZFKJVJIDPQDDFY-UHFFFAOYSA-N fluorescamine Chemical compound C12=CC=CC=C2C(=O)OC1(C1=O)OC=C1C1=CC=CC=C1 ZFKJVJIDPQDDFY-UHFFFAOYSA-N 0.000 description 1
- GNBHRKFJIUUOQI-UHFFFAOYSA-N fluorescein Chemical compound O1C(=O)C2=CC=CC=C2C21C1=CC=C(O)C=C1OC1=CC(O)=CC=C21 GNBHRKFJIUUOQI-UHFFFAOYSA-N 0.000 description 1
- 238000005227 gel permeation chromatography Methods 0.000 description 1
- 230000002068 genetic effect Effects 0.000 description 1
- 210000004602 germ cell Anatomy 0.000 description 1
- 229940116332 glucose oxidase Drugs 0.000 description 1
- 235000019420 glucose oxidase Nutrition 0.000 description 1
- 102000006602 glyceraldehyde-3-phosphate dehydrogenase Human genes 0.000 description 1
- 108020004445 glyceraldehyde-3-phosphate dehydrogenase Proteins 0.000 description 1
- 102000035122 glycosylated proteins Human genes 0.000 description 1
- 108091005608 glycosylated proteins Proteins 0.000 description 1
- 230000012010 growth Effects 0.000 description 1
- 229940037467 helicobacter pylori Drugs 0.000 description 1
- 208000021991 hereditary neoplastic syndrome Diseases 0.000 description 1
- 239000008240 homogeneous mixture Substances 0.000 description 1
- 238000002657 hormone replacement therapy Methods 0.000 description 1
- 238000002013 hydrophilic interaction chromatography Methods 0.000 description 1
- 238000004191 hydrophobic interaction chromatography Methods 0.000 description 1
- 230000002055 immunohistochemical effect Effects 0.000 description 1
- 230000006872 improvement Effects 0.000 description 1
- 229940125396 insulin Drugs 0.000 description 1
- 230000009545 invasion Effects 0.000 description 1
- 238000004255 ion exchange chromatography Methods 0.000 description 1
- 238000002955 isolation Methods 0.000 description 1
- 208000032839 leukemia Diseases 0.000 description 1
- 238000004020 luminiscence type Methods 0.000 description 1
- 230000005291 magnetic effect Effects 0.000 description 1
- 238000002595 magnetic resonance imaging Methods 0.000 description 1
- 230000036210 malignancy Effects 0.000 description 1
- 238000001840 matrix-assisted laser desorption--ionisation time-of-flight mass spectrometry Methods 0.000 description 1
- 230000007246 mechanism Effects 0.000 description 1
- 230000009247 menarche Effects 0.000 description 1
- 230000002175 menstrual effect Effects 0.000 description 1
- 150000002739 metals Chemical class 0.000 description 1
- 206010061289 metastatic neoplasm Diseases 0.000 description 1
- 230000000813 microbial effect Effects 0.000 description 1
- 238000012544 monitoring process Methods 0.000 description 1
- 230000035772 mutation Effects 0.000 description 1
- 150000004005 nitrosamines Chemical class 0.000 description 1
- 238000011275 oncology therapy Methods 0.000 description 1
- 210000000056 organ Anatomy 0.000 description 1
- 230000002018 overexpression Effects 0.000 description 1
- 238000000955 peptide mass fingerprinting Methods 0.000 description 1
- RXNXLAHQOVLMIE-UHFFFAOYSA-N phenyl 10-methylacridin-10-ium-9-carboxylate Chemical compound C12=CC=CC=C2[N+](C)=C2C=CC=CC2=C1C(=O)OC1=CC=CC=C1 RXNXLAHQOVLMIE-UHFFFAOYSA-N 0.000 description 1
- ZWLUXSQADUDCSB-UHFFFAOYSA-N phthalaldehyde Chemical compound O=CC1=CC=CC=C1C=O ZWLUXSQADUDCSB-UHFFFAOYSA-N 0.000 description 1
- 230000035790 physiological processes and functions Effects 0.000 description 1
- 125000005575 polycyclic aromatic hydrocarbon group Chemical group 0.000 description 1
- 238000002360 preparation method Methods 0.000 description 1
- 238000012545 processing Methods 0.000 description 1
- 230000005522 programmed cell death Effects 0.000 description 1
- 208000037821 progressive disease Diseases 0.000 description 1
- 230000000750 progressive effect Effects 0.000 description 1
- 238000009877 rendering Methods 0.000 description 1
- 230000003362 replicative effect Effects 0.000 description 1
- 230000001850 reproductive effect Effects 0.000 description 1
- 238000003757 reverse transcription PCR Methods 0.000 description 1
- 230000002441 reversible effect Effects 0.000 description 1
- PYWVYCXTNDRMGF-UHFFFAOYSA-N rhodamine B Chemical compound [Cl-].C=12C=CC(=[N+](CC)CC)C=C2OC2=CC(N(CC)CC)=CC=C2C=1C1=CC=CC=C1C(O)=O PYWVYCXTNDRMGF-UHFFFAOYSA-N 0.000 description 1
- 229910052895 riebeckite Inorganic materials 0.000 description 1
- 239000012488 sample solution Substances 0.000 description 1
- 239000012047 saturated solution Substances 0.000 description 1
- 201000000849 skin cancer Diseases 0.000 description 1
- 230000000391 smoking effect Effects 0.000 description 1
- AWUCVROLDVIAJX-GSVOUGTGSA-N sn-glycerol 3-phosphate Chemical compound OC[C@@H](O)COP(O)(O)=O AWUCVROLDVIAJX-GSVOUGTGSA-N 0.000 description 1
- 239000011780 sodium chloride Substances 0.000 description 1
- 241000894007 species Species 0.000 description 1
- 230000003595 spectral effect Effects 0.000 description 1
- 238000005309 stochastic process Methods 0.000 description 1
- 230000002459 sustained effect Effects 0.000 description 1
- 208000024891 symptom Diseases 0.000 description 1
- 208000011580 syndromic disease Diseases 0.000 description 1
- 230000001225 therapeutic effect Effects 0.000 description 1
- 230000009466 transformation Effects 0.000 description 1
- 238000000844 transformation Methods 0.000 description 1
- 230000009261 transgenic effect Effects 0.000 description 1
- 239000000107 tumor biomarker Substances 0.000 description 1
- 230000009452 underexpression Effects 0.000 description 1
- 230000003612 virological effect Effects 0.000 description 1
- 238000001262 western blot Methods 0.000 description 1
- DGVVWUTYPXICAM-UHFFFAOYSA-N β‐Mercaptoethanol Chemical compound OCCS DGVVWUTYPXICAM-UHFFFAOYSA-N 0.000 description 1
Classifications
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
- G01N33/574—Immunoassay; Biospecific binding assay; Materials therefor for cancer
- G01N33/57407—Specifically defined cancers
- G01N33/57438—Specifically defined cancers of liver, pancreas or kidney
-
- G—PHYSICS
- G01—MEASURING; TESTING
- G01N—INVESTIGATING OR ANALYSING MATERIALS BY DETERMINING THEIR CHEMICAL OR PHYSICAL PROPERTIES
- G01N33/00—Investigating or analysing materials by specific methods not covered by groups G01N1/00 - G01N31/00
- G01N33/48—Biological material, e.g. blood, urine; Haemocytometers
- G01N33/50—Chemical analysis of biological material, e.g. blood, urine; Testing involving biospecific ligand binding methods; Immunological testing
- G01N33/53—Immunoassay; Biospecific binding assay; Materials therefor
- G01N33/574—Immunoassay; Biospecific binding assay; Materials therefor for cancer
- G01N33/57407—Specifically defined cancers
- G01N33/57419—Specifically defined cancers of colon
Definitions
- the present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of breast cancer and/or non-malignant diseases of the breast.
- the biomolecules are characterised by mass profiles generated by contacting a test and/or biological sample with an anion exchange surface under specific binding conditions and detecting said biomolecules using gas phase ion spectrometry.
- the biomolecules used according to the invention are preferably proteins or polypeptides.
- preferred test and/or biological samples are blood serum samples and are of human origin.
- breast cancer a leading cause of death in women
- breast cancer still remains the most common cancer (other than skin cancer) among women.
- the number of new cases of breast cancer in women was estimated to be about 212,600.
- tumourigenesis While the exact mechanism of tumourigenesis for most breast cancers is largely unknown, research has shown that patients exposed to certain risk factors are more likely than others to develop some form of breast cancer.
- Some of the strongest risk factors include: an increase in age, wherein women over 60 are at the highest risk of developing a breast cancer; a familial and/or personal history of breast cancer; the reproductive and menstrual history of a woman including the age of menarche (<12 years of age), age at first child-bearing and menopause (>55 years of age); hormone replacement therapies, as well as genetic factors such as the breast cancer gene (BRCA) family.
- BRCA breast cancer gene
- BRCA1 and BRCA2 contribute to familial breast cancer in 5% to 10% of breast cancer cases.
- Germ-line mutations within these two loci are associated with a 50 to 85% lifetime risk of breast and/or ovarian cancer [Marcus et al. (1996) Hereditary Breast Cancer: Pathobiology,
- this diagnostic method must be used in parallel with other methods since the detected lesions may be benign, malignant, or too small to be detected by palpation alone.
- Mammography in contrast, is able to detect a breast tumour before it can be discovered by physical examination, but this diagnostic method is not without its own limitations. For example, mammography's predictive value depends on the observer's skill and the quality of the mammogram. In addition, 80 to 93% of suspicious mammograms are false positives, and 10 to 15% of women with breast cancer have false negative mammograms. Clearly, new diagnostic methods that offer a more sensitive and specific detection of early breast cancer are needed.
- stage determination has potential prognostic value and provides criteria for designing optimal therapy.
- the advantage of pathological staging of breast cancer over clinical staging is that it provides a more accurate prognosis of the disease, the disadvantage being that this method is invasive.
- clinical staging could become a more attractive approach if it were at least as accurate as pathological staging; it does not depend on an invasive procedure to obtain tissue for evaluation.
- markers could be mRNA or protein markers expressed by cells originating from the primary tumour in the breast but residing in blood, bone marrow or lymph nodes and could serve as sensitive indicators for tumour development and/or metastasis to these distal organs.
- specific protein antigens and mRNA, associated with breast epithelial cells have been detected by immunohistochemical techniques and RT-PCR, respectively, in bone marrow, lymph nodes and blood of breast cancer patients
- CEA carcinoembryonic antigen
- CA 15-3 suffers a similar fate since this marker can also be negative in a significant number of patients with progressive disease and, therefore, fails to predict metastasis.
- both CEA and CA 15-3 can be elevated in non-malignant, benign conditions giving rise to false positive results.
- tumour markers CA15.3 and CA27.29 only for the monitoring of therapeutic treatment in cases of advanced-stage breast cancer.
- new serological biomarkers that could be used individually or in combination with an existing modality for cost-effective screening of breast cancer are still urgently needed.
- biomarkers identified are neither grade-specific nor can they detect the disease at its earliest stages (stage I and II), and thereby would not allow for effective patient-specific diagnosis and/or treatment of the disease. Moreover, such serological biomarkers that can specifically differentiate between the presence of a given breast cancer and a non-malignant disease of the breast have not yet been identified.
- such a diagnostic method should be able to detect early-stage breast cancer, as well as distinguish between the later stages or grades of the disease.
- a diagnostic tool should be able to differentiate between breast cancer and a non-malignant disease of the breast.
- the present invention addresses this difficulty with the development of a non-invasive diagnostic tool for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast.
- the present invention relates to methods for the differential diagnosis of breast cancer and/or non-malignant diseases of the breast, by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
- the present invention provides a method for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast, in vitro, comprising obtaining a test sample from a subject, contacting test sample with a biologically active surface under specific binding conditions, allowing for biomolecules present within the test sample to bind to the biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said test sample, transforming data into a computer-readable form, and comparing said mass profile against a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancers, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
- the invention provides a database comprising mass profiles of biological samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast.
- the database is generated by obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, and subjects having non-malignant diseases of the breast, contacting said biological samples with a biologically active surface under specific binding conditions, allowing the biomolecules within the biological sample to bind to said biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said biological samples, transforming data into a computer-readable form, and applying a mathematical algorithm to classify the mass profiles as specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancer, and subjects having a non-malignant disease of the breast.
- the present invention provides biomolecules having a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ±
- biomolecules having said molecular masses are detected by contacting a test and/or biological sample with a biologically active surface comprising an adsorbent under specific binding conditions and further analysed by gas phase ion spectrometry.
- the adsorbent used is comprised of positively charged quaternary ammonium groups (anion exchange surface).
- the invention provides specific binding conditions for the detection of biomolecules within a sample.
- a sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then diluted again 1:10 in a binding buffer consisting of 0.1 M Tris-HCl and 0.02% Triton X-100 at pH 8.5 and 0 to 4°C.
- the treated sample is then contacted with a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubated for 120 minutes at 20 to 24°C, and the bound biomolecules are detected using gas phase ion spectrometry.
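The two dilution steps above compound. The following is a minimal sketch of that arithmetic, assuming "1:n" means one part sample in n parts total volume (the text does not state the convention); the helper name `overall_dilution` is hypothetical.

```python
from fractions import Fraction


def overall_dilution(*steps):
    """Combine successive 1:n dilution steps into one overall factor.

    Assumption: "1:n" means 1 part sample in n parts total volume.
    """
    total = Fraction(1)
    for n in steps:
        total *= Fraction(1, n)
    return total


# Steps taken from the text: 1:5 in denaturation buffer, then 1:10 in binding buffer.
final = overall_dilution(5, 10)
print(f"overall dilution 1:{final.denominator}")
```

Under that convention the serum is diluted 1:50 overall before it is applied to the surface.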
- the invention provides a method for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast comprising detecting of one or more differentially expressed biomolecules within a sample.
- This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
- binding molecules are antibodies specific for said polypeptides.
- the biomolecules related to the invention having a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da,
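Each listed mass carries its own tolerance window, so assigning an observed peak to a listed biomolecule reduces to an interval check. The sketch below illustrates this with a handful of entries copied from the list above; the function name `match_marker` and the idea of returning all matching nominal masses are illustrative assumptions, not the patent's procedure.

```python
# (nominal mass in Da, tolerance in Da): a few entries copied from the list above.
MARKERS = [
    (1506, 8),
    (1533, 8),
    (1623, 8),
    (1975, 10),
    (2017, 10),
    (4107, 21),
    (6655, 33),
]


def match_marker(observed_da, markers=MARKERS):
    """Return the nominal masses whose +/- tolerance window contains observed_da."""
    return [mass for mass, tol in markers if abs(observed_da - mass) <= tol]


print(match_marker(1510.0))  # falls inside the 1506 Da window
print(match_marker(1520.0))  # falls between the 1506 Da and 1533 Da windows
```

Returning a list rather than a single value keeps the sketch honest if two windows ever overlap.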
- the invention provides a method for the identification of biomolecules within a sample, provided that the biomolecules are proteins, polypeptides or fragments thereof, comprising: chromatography and fractionation, analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison.
- the method of chromatography is high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC).
- the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
- MALDI-TOF matrix-assisted laser desorption ionization/time of flight
- SELDI-TOF surface enhanced laser desorption ionisation/time of flight
- MS-MS mass spectrometry
- kits for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
- the test and/or biological samples are blood serum samples, and are isolated from subjects of mammalian origin, preferably of human origin.
- FIG. 1 Comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients with breast cancer (T1 and T2) and from patients suffering from non-malignant diseases of the breast (C1 and C2).
- A shows an overview in the mass range 11 - 20 kDa.
- B shows the boxed section of A. The mass signal at m/z 12,656.2 Da is highlighted. Its variable importance score ranks 2nd within the classifier.
- Figure 2A Development of out-of-bag error. During the training process of the final classifier, the out-of-bag error decreased to about 27%. The out-of-bag error is typically higher than the resulting test error as class assignment is only conducted on the basis of about 1/3 of the generated trees.
- Figure 2B Out-of-bag estimation of ROC curve for final classifier.
- the out-of-bag estimates of sensitivity and specificity presented in Table 3 are extrapolated into the entire range of sensitivity and specificity. This is done by varying the percentage of decision trees with vote "positive" necessary for assigning a case to class "positive".
- the diagonal represents the average random classifier, assigning cases randomly to class "positive" and "negative".
- the circle marks the pair of sensitivity and specificity of Table 3.
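The threshold sweep described for Figure 2B can be sketched as follows: for each required fraction of "positive" tree votes, cases are reassigned and sensitivity and specificity are recomputed. This is a minimal illustration on made-up numbers; `roc_points`, the toy vote fractions and the toy labels are all hypothetical and are not data from the patent.

```python
def roc_points(vote_fractions, labels, thresholds):
    """Sweep the fraction of 'positive' votes required to call a case positive.

    vote_fractions: per-case fraction of out-of-bag trees voting "positive".
    labels: True for truly positive cases, False otherwise.
    Returns a list of (threshold, sensitivity, specificity).
    Assumes both classes are present in labels.
    """
    points = []
    for t in thresholds:
        tp = fn = tn = fp = 0
        for frac, is_pos in zip(vote_fractions, labels):
            predicted_pos = frac >= t
            if is_pos and predicted_pos:
                tp += 1
            elif is_pos:
                fn += 1
            elif predicted_pos:
                fp += 1
            else:
                tn += 1
        points.append((t, tp / (tp + fn), tn / (tn + fp)))
    return points


# Hypothetical toy data, for illustration only.
fractions = [0.9, 0.7, 0.4, 0.6, 0.2, 0.1]
truth = [True, True, True, False, False, False]
for t, sens, spec in roc_points(fractions, truth, [0.0, 0.5, 1.01]):
    print(f"threshold={t:.2f} sensitivity={sens:.2f} specificity={spec:.2f}")
```

A threshold of 0 labels every case positive and a threshold above 1 labels every case negative, which gives the two end points of the curve.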
- Figure 3 Decision tree complexity.
- the histogram visualizes the distribution of decision tree complexity in the final random forest classifier.
- decision tree complexity is measured by the number of terminal nodes.
- FIG. 4 Voting distribution. The histogram shows how frequently trees of the final classifier vote for class "positive". For each case (patient), only the votes of those trees are collected for which the considered case is "out-of-bag". For each case, votes are normalized as follows: (number of votes for class "positive" - number of votes for class "negative") / (number of trees for which the considered case is "out-of-bag"). Dashed vertical lines correspond to quantiles at 0%, 25%, 50%, 75%, and 100%.
- Figure 5A - E Scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da.
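The vote normalization given for the voting distribution (FIG. 4) maps each case to a value between -1 and +1. A minimal sketch, assuming every out-of-bag tree casts exactly one vote, so that the number of out-of-bag trees equals the sum of both vote counts; the function name is hypothetical.

```python
def normalized_vote(votes_positive, votes_negative):
    """(positive votes - negative votes) / number of out-of-bag trees.

    Assumption: each out-of-bag tree votes either "positive" or "negative",
    so the number of out-of-bag trees is the sum of the two counts.
    """
    n_oob = votes_positive + votes_negative
    if n_oob == 0:
        raise ValueError("case was never out-of-bag")
    return (votes_positive - votes_negative) / n_oob


print(normalized_vote(30, 10))  # majority votes "positive"
print(normalized_vote(0, 40))   # unanimous "negative"
```

Values near +1 or -1 indicate near-unanimous trees, while values near 0 indicate a split vote.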
- the horizontal axis shows the raw relative signal intensity of the peaks in the examined serum samples.
- "raw" refers to the non-logarithmic and not additionally normalized intensities; see Figures 6 and 7 for further intensity transformations.
- T Tumour
- o C Healthy & diseased control patients' serum samples.
- FIG. 6A - E Scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da. The horizontal axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. For each mass, intensities were first shifted to entirely positive values and then normalized by dividing the intensity values by the average intensity of that mass. Finally, the base 2 logarithm was taken.
- zero logarithmic normalized relative intensity refers to mean peak cluster intensity
- logarithmic normalized relative intensities of +1 and -1 mean two-fold over- and under-expression relative to mean peak cluster intensity, respectively
- T Tumour
- o C Healthy & diseased control patients' serum samples.
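The transformation described for Figure 6 (shift to positive values, divide by the mean intensity of that mass, take the base 2 logarithm) can be sketched as below. The text does not say how the shift is chosen, so the rule used here (shift so the minimum becomes 1.0 whenever non-positive values occur) is an assumption, as is the function name.

```python
import math


def log2_normalized(intensities):
    """Shift to positive values, divide by the mean, then take log2.

    Assumed shift rule (not given in the text): if any value is <= 0,
    add an offset so that the smallest value becomes 1.0.
    """
    lowest = min(intensities)
    offset = (1.0 - lowest) if lowest <= 0 else 0.0
    shifted = [x + offset for x in intensities]
    mean = sum(shifted) / len(shifted)
    return [math.log2(x / mean) for x in shifted]


# Hypothetical intensities for one peak cluster; their mean is 2.0.
print(log2_normalized([1.0, 2.0, 4.0, 1.0]))
```

With these toy values the results are -1, 0, +1 and -1, matching the reading above: 0 is the mean peak cluster intensity, and +1 and -1 are two-fold above and below it.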
- FIG. 7A - E Additionally scaled scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da. As in Figure 3, the Y-axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. However, intensities were additionally (shifted and) scaled so that the intensities of each peak cluster cover the entire horizontal range. Thereby, the minimum and maximum intensities of all masses are aligned on the left and right edge of the plot, respectively. This allows better visualization of the extent of class overlap. T (Tumour): Breast cancer & DCIS patients' serum samples; o C (Control): Healthy & diseased control patients' serum samples.
- T Tumour
- C Healthy & diseased control patients' serum samples.
- biomolecule refers to a molecule produced by a cell or living organism. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, proteins, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
- "nucleotide" or "polynucleotide" refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof.
- DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense, or the antisense strand, to peptide polynucleotide sequences (i.e. peptide nucleic acids; PNAs), or to any DNA-like or RNA-like material.
- PNAs peptide nucleic acids
- fragment refers to a portion of a polypeptide (parent) sequence that comprises at least 10 consecutive amino acid residues and retains a biological activity and/or some functional characteristics of the parent polypeptide e.g. antigenicity or structural domain characteristics.
- "biological sample" and "test sample" refer to all biological fluids and excretions isolated from any given subject.
- samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
- binding refers to the binding reaction between a biomolecule and a specific "binding molecule".
- binding molecules that include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins).
- a binding reaction is considered to be specific when the interaction between said molecules is substantial.
- a binding reaction is considered substantial when the reaction that takes place between said molecules is at least two times the background.
- specific binding conditions refers to reaction conditions that permit the binding of said molecules such as pH, salt, detergent and other conditions known to those skilled in the art.
- reaction relates to the direct or indirect binding or alteration of biological activity of a biomolecule.
- the term "differential diagnosis” refers to a diagnostic decision between healthy and different disease states, including various stages of a specific disease.
- a subject is diagnosed as healthy or to be suffering from a specific disease, or a specific stage of a disease based on a set of hypotheses that allow for the distinction between healthy and one or more stages of the disease.
- the choice between healthy and one or more stages of disease depends on a significant difference between each hypothesis.
- a “differential diagnosis” may also refer to a diagnostic decision between one disease type as compared to another (e.g. breast cancer vs. a non-malignant disease of the breast).
- breast cancer refers to a malignant neoplastic lesion of the breast within a given subject, wherein the neoplasm is defined according to its type, stage and/or grade.
- the various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)].
- UICC: Union Internationale Contre Cancer
- AJC: American Joint Committee on Cancer
- breast cancer is also referred to as "mammary cancer" or "a carcinoma of the breast".
- breast cancer includes both in situ (non-invasive) and invasive breast cancers.
- in situ breast cancers include ductal and lobular carcinoma in situ (DCIS and LCIS, respectively); invasive breast cancers encompass infiltrating diseases such as invasive ductal, lobular and papillary carcinomas and medullary, colloid, and tubular carcinomas.
- DCIS and LCIS: ductal and lobular carcinoma in situ
- a non-malignant disease of the breast refers to a lesion of the breast that does not exhibit malignant neoplastic physiological, biochemical, and/or morphological properties known to those skilled in the art.
- such diseases include, but are not limited to, inflammatory and proliferative lesions, fibrocystic changes within mammary tissue as well as benign disorders of the breast.
- inflammatory lesions encompass acute, periductal and granulomatous mastitis, duct ectasia, fat necrosis, whereas proliferative lesions include epithelial hyperplasia (atypical ductal and lobular hyperplasia), sclerosing adenosis, and small duct papillomas.
- benign disorders of the glandular tissue mastopathy
- papillomas large duct, intraductal
- fibroadenomas are benign disorders of the glandular tissue.
- the term "healthy individual" refers to a subject possessing good health. Such a subject demonstrates an absence of any disease within the breast; preferably an absence of a non-malignant disease of the breast or breast cancer.
- precancerous lesion of the breast refers to a biological change within the breast such that it becomes susceptible to the development of a cancer. More specifically, a precancerous lesion of the breast is a preliminary stage of a breast cancer.
- causes of a precancerous lesion may include, but are not limited to, genetic predisposition and exposure to cancer-causing agents (carcinogens); such cancer-causing agents include agents that cause genetic damage and induce neoplastic transformation of a cell.
- neoplastic transformation of a cell refers to an alteration in normal cell physiology and includes, but is not limited to, self-sufficiency in growth signals, insensitivity to growth-inhibitory (anti-growth) signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis.
- the phrase "differentially present" refers to differences in the quantity of a biomolecule (of a particular apparent molecular mass) present in a sample from a subject as compared to a comparable sample.
- a biomolecule is present at an elevated level, a decreased level or absent in samples of subjects having breast cancer compared to samples of subjects who do not have a cancer of the breast. Therefore in the context of the invention, the term “differentially present biomolecule” refers to the quantity of the biomolecule (of a particular apparent molecular mass) present within a sample taken from a subject having a breast cancer or a non-malignant disease of the breast as compared to a comparable sample taken from a healthy subject.
- a biomolecule is differentially present between two samples if the quantity of said biomolecule in one sample is significantly different (defined statistically) from the quantity of said biomolecule in another sample.
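The text requires a statistically significant difference but does not name a test. As one possible illustration, the sketch below uses an exact two-sided permutation test on the difference of group means; the choice of test, the function name, and any significance cut-off applied to the result are assumptions rather than part of the described method.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation p-value for the difference in mean intensity.

    Every way of relabelling the pooled values into groups of the original
    sizes is enumerated, so this is only practical for small sample sizes.
    """
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        relabelled_a = [pooled[i] for i in chosen]
        relabelled_b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        # Small slack guards against floating-point noise in the comparison.
        if abs(mean(relabelled_a) - mean(relabelled_b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total
```

A small p-value indicates that a difference in quantity as large as the one observed would rarely arise from a random relabelling of the samples.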
- diagnostic assay can be used interchangeably with “diagnostic method” and refers to the detection of the presence or nature of a pathologic condition. Diagnostic assays differ in their sensitivity and specificity. Within the context of the invention the sensitivity of a diagnostic assay is defined as the percentage of diseased subjects who test positive for a breast cancer or a non-malignant disease of the breast; these subjects are considered “true positives”. Subjects who have either a breast cancer or a non-malignant disease of the breast but are not detected by the diagnostic assay are considered “false negatives”. Subjects who show no disease, whether a breast cancer or a non-malignant disease of the breast, and who test negative in the diagnostic assay are considered “true negatives”.
- the specificity of a diagnostic assay is defined as 1 minus the false positive rate, where the “false positive rate” is the proportion of subjects devoid of a non-malignant disease of the breast or a breast cancer who nevertheless test positive in said assay.
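The definitions of sensitivity and specificity given above can be computed directly from the four outcome counts. This is a minimal sketch; the function names and the example counts are hypothetical.

```python
def sensitivity(true_positives: int, false_negatives: int) -> float:
    """Fraction of diseased subjects who test positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives: int, false_positives: int) -> float:
    """One minus the false positive rate among disease-free subjects."""
    false_positive_rate = false_positives / (false_positives + true_negatives)
    return 1.0 - false_positive_rate
```

For example, 90 true positives with 10 false negatives give a sensitivity of 0.9, and 80 true negatives with 20 false positives give a specificity of 0.8.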
- adsorbent refers to any material that is capable of accumulating (binding) a given biomolecule.
- the adsorbent typically coats a biologically active surface and is composed of a single material or a plurality of different materials that are capable of binding a biomolecule.
- materials include, but are not limited to, anion exchange materials, cation exchange materials, metal chelators, polynucleotides, oligonucleotides, peptides, antibodies, etc.
- biologically active surface refers to any two- or three-dimensional extension of a material that biomolecules can bind to, or interact with, due to the specific biochemical properties of this material and those of the biomolecules.
- biochemical properties include, but are not limited to, ionic character (charge), hydrophobicity, or hydrophilicity.
- binding molecule refers to a molecule that displays an affinity for another molecule.
- such molecules may include, but are not limited to nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polypeptides, carbohydrates, lipids, and combinations thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins).
- binding molecules are antibodies.
- solution refers to a homogeneous mixture of two or more substances. Solutions may include, but are not limited to buffers, substrate solutions, elution solutions, wash solutions, detection solutions, standardisation solutions, chemical solutions, solvents, etc. Furthermore, other solutions known to those skilled in the art are also included herein.
- mass profile refers to a mass spectrum as a characteristic property of a given sample or a group of samples. Such a profile, when compared to the mass profile of a second sample or group of samples, will allow for the differentiation between the two samples.
- the mass profile is obtained by treating the biological sample as follows. The sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine and subsequently diluted 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5.
- the pre-treated sample is applied to a biologically active surface comprising positively charged quaternary ammonium groups (anion exchange surface) and incubated for 120 minutes.
- the biomolecules bound to the surface are analysed by gas phase ion spectrometry as described in another section. All but the dilution steps are performed at 20 to 24°C. Dilution steps are performed at 0 to 4°C.
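The two dilution steps of the pre-treatment compound multiplicatively. The sketch below works this out under the assumption that "1:5" and "1:10" each mean one part sample in that many parts of total volume; the text does not state which dilution convention is intended, and the function name is hypothetical.

```python
from fractions import Fraction


def overall_sample_fraction(*total_parts: int) -> Fraction:
    """Fraction of the original sample remaining after successive dilutions.

    Each argument is the total number of parts in one dilution step,
    e.g. 5 for a step read as "1 part sample in 5 parts total".
    """
    fraction = Fraction(1)
    for parts in total_parts:
        fraction *= Fraction(1, parts)
    return fraction


# Denaturation buffer step (1:5) followed by binding buffer step (1:10).
final_fraction = overall_sample_fraction(5, 10)
```

Under this reading the sample applied to the surface is a 1:50 dilution of the original material.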
- apparent molecular mass refers to the molecular mass value in Dalton (Da) of a biomolecule as it may appear in a given method of investigation, e.g. size exclusion chromatography, gel electrophoresis, or mass spectrometry.
- chromatography refers to any method of separating biomolecules within a given sample such that the original native state of a given biomolecule is retained. Separation of a biomolecule from other biomolecules within a given sample for the purpose of enrichment, purification and/or analysis, may be achieved by methods including, but not limited to, size exclusion chromatography, ion exchange chromatography, hydrophobic and hydrophilic interaction chromatography, metal affinity chromatography, wherein "metal” refers to metal ions (e.g. nickel, copper, gallium, or zinc) of all chemically possible valences, or ligand affinity chromatography wherein "ligand” refers to binding molecules, preferably proteins, antibodies, or DNA. Generally, chromatography uses biologically active surfaces as adsorbents to selectively accumulate certain biomolecules.
- mass spectrometry refers to a method comprising employing an ionization source to generate gas phase ions from a biological entity of a sample presented on a biologically active surface, and detecting the gas phase ions with a mass spectrometer.
- laser desorption mass spectrometry refers to a method comprising the use of a laser as an ionization source to generate gas phase ions from a biomolecule presented on a biologically active surface, and detecting the gas phase ions with a mass spectrometer.
- mass spectrometer refers to a gas phase ion spectrometer that includes an inlet system, an ionisation source, an ion optic assembly, a mass analyser, and a detector.
- the terms “detect”, “detection” or “detecting” refer to the identification of the presence, absence, or quantity of a biomolecule.
- EAM: energy absorbing molecule
- Cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid are frequently used as energy-absorbing molecules in laser desorption of biomolecules. See U.S. Pat. No. 5,719,060 (Hutchens & Yip) for a further description of energy absorbing molecules.
- training set refers to a subset of the respective entire available data set. This subset is typically randomly selected, and is solely used for the purpose of classifier construction.
- test set refers to a subset of the entire available data set consisting of those entries not included in the training set. Test data is applied to evaluate classifier performance.
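The partition into a randomly selected training set and a complementary test set can be sketched as follows. The 70% training fraction, the fixed seed, and the function name are assumptions made for illustration; the text does not state the proportions used.

```python
import random


def split_train_test(entries, train_fraction=0.7, seed=0):
    """Randomly partition `entries` into a training set and a test set.

    The test set consists of exactly those entries not placed in the
    training set, so the two subsets never overlap.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    shuffled = list(entries)
    rng.shuffle(shuffled)
    n_train = round(len(shuffled) * train_fraction)
    return shuffled[:n_train], shuffled[n_train:]
```

The training subset would be used only for classifier construction, and the held-out subset only for evaluating classifier performance.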
- decision tree refers to a flow-chart-like tree structure employed for classification. Decision trees consist of repeated splits of a data set into subsets. Each split consists of a simple rule applied to one variable, e.g., "if value of 'variable 1' larger than 'threshold 1' then go left else go right". Accordingly, the given feature space is partitioned into a set of rectangles with each rectangle assigned to one class.
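The splitting rule quoted above ("if value of 'variable 1' larger than 'threshold 1' then go left else go right") can be sketched with a small nested structure. The tree shown, its thresholds, and the class names are entirely hypothetical.

```python
def classify(node, sample):
    """Walk a decision tree until a leaf (a class label string) is reached.

    Internal nodes are dicts holding one variable and one threshold;
    values larger than the threshold go left, all others go right.
    """
    while isinstance(node, dict):
        go_left = sample[node["variable"]] > node["threshold"]
        node = node["left"] if go_left else node["right"]
    return node


# Hypothetical two-level tree over two variables.
example_tree = {
    "variable": "variable 1",
    "threshold": 5.0,
    "left": "class A",
    "right": {
        "variable": "variable 2",
        "threshold": 2.0,
        "left": "class B",
        "right": "class A",
    },
}
```

Because each rule tests a single variable against a single threshold, the regions assigned to each class are axis-aligned rectangles in the feature space.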
- ensemble and "ensemble classifier" can be used interchangeably and refer to a classifier that consists of many simpler elementary classifiers, e.g., an ensemble of decision trees is a classifier consisting of decision trees.
- the result of the ensemble classifier is obtained by combining all the results of its constituent classifiers, e.g., by majority voting that weights all constituent classifiers equally. Majority voting is especially reasonable in the case of bagging, where constituent classifiers are then naturally weighted by the frequency with which they are generated.
- competitor refers to a variable that can be used as an alternative splitting rule in a decision tree. Within the context of the invention, the competitor is the apparent molecular mass of a given biomolecule. In each step of decision tree construction, only the variable yielding the best data-splitting is selected. Competitors are non-selected variables with similar but lower performance than the selected variable. They point into the direction of alternative decision trees.
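Equal-weight majority voting over the constituent classifiers can be sketched in a few lines. The function name is hypothetical, and the handling of ties (the label seen first wins) is an assumption, since the text does not say how ties are resolved.

```python
from collections import Counter


def majority_vote(predictions):
    """Return the label predicted by the largest number of constituent classifiers.

    Every prediction counts once. On a tie, the label that appears first
    in `predictions` is returned (an arbitrary choice for this sketch).
    """
    if not predictions:
        raise ValueError("at least one prediction is required")
    return Counter(predictions).most_common(1)[0][0]
```

If the same tree is generated several times during bagging, it simply contributes several identical votes, which is the natural weighting mentioned above.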
- surrogate refers to a splitting rule that closely mimics the action of the primary split.
- a surrogate is a variable that can substitute for a selected decision tree variable, e.g. in the case of missing values. Not only must a good surrogate split the parent node into descendant nodes similar in size and composition to the primary descendant nodes, it must also match the primary split on the specific cases that go to the left child and right child nodes.
- peak and “signal” may be used interchangeably, and refer to any signal which is generated by a biomolecule when under investigation using a specific method, for example chromatography, mass spectrometry, or any type of spectroscopy like Ultraviolet/Visible Light (UV/Vis) spectroscopy, Fourier Transform Infrared (FTIR) spectroscopy, Electron Paramagnetic Resonance (EPR) spectroscopy, or Nuclear Magnetic Resonance (NMR) spectroscopy.
- UV/Vis: Ultraviolet/Visible Light
- FTIR: Fourier Transform Infrared
- EPR: Electron Paramagnetic Resonance
- NMR: Nuclear Magnetic Resonance
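How closely a surrogate matches the primary split on the specific cases sent left and right can be quantified as a case-by-case agreement rate. This is a sketch under assumptions: the agreement measure, the function names, and the example values are hypothetical and are not taken from the text.

```python
def goes_left(sample, variable, threshold):
    """Apply a single splitting rule: values above the threshold go left."""
    return sample[variable] > threshold


def split_agreement(samples, primary, surrogate):
    """Fraction of samples that the surrogate sends to the same child as the primary.

    `primary` and `surrogate` are (variable, threshold) pairs.
    """
    same = sum(
        goes_left(s, *primary) == goes_left(s, *surrogate) for s in samples
    )
    return same / len(samples)
```

An agreement of 1.0 means the surrogate reproduces the primary split exactly on the given cases, so it could stand in when the primary variable is missing.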
- peak and signal refer to the signal generated by a biomolecule of a certain molecular mass hitting the detector of a mass spectrometer, thus generating a signal intensity which correlates with the amount or concentration of said biomolecule of a given sample.
- a “peak” or “signal” is defined by two values: an apparent molecular mass value (m/z) and an intensity value generated as described.
- the mass value is an elemental characteristic of a biological entity, whereas the intensity value corresponds to a certain amount or concentration of a biological entity with the corresponding apparent molecular mass value, and thus “peak” and “signal” always refer to the properties of this biological entity.
- cluster refers to a signal or peak present in a certain set of mass spectra or mass profiles obtained from different samples belonging to two or more different groups (e.g. cancer and non-cancer). Within the set, signals belonging to a cluster can differ in their intensities, but not in the apparent molecular masses.
- variable refers to a cluster which is subjected to a statistical analysis aiming towards a classification of samples into two or more different sample groups (e.g. cancer and non-cancer) by using decision trees, wherein the sample feature relevant for classification is the intensity value of the variables in the analysed samples.
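Grouping peaks from several spectra into clusters of matching apparent molecular mass can be sketched as below. In practice measured masses agree only within some tolerance; the 0.5% relative tolerance, the greedy grouping strategy, and all names here are assumptions, not the procedure described in the text.

```python
def cluster_peaks(spectra, rel_tol=0.005):
    """Group peaks across samples whose masses agree within a relative tolerance.

    `spectra` maps a sample identifier to a list of (mass, intensity) peaks.
    Peaks are sorted by mass; a peak joins the current cluster while its mass
    stays within `rel_tol` of the cluster's lowest mass, otherwise it opens
    a new cluster.
    """
    peaks = sorted(
        (mass, intensity, sample_id)
        for sample_id, spectrum in spectra.items()
        for mass, intensity in spectrum
    )
    clusters = []
    for mass, intensity, sample_id in peaks:
        if clusters and mass <= clusters[-1]["anchor"] * (1 + rel_tol):
            clusters[-1]["members"].append((sample_id, mass, intensity))
        else:
            clusters.append(
                {"anchor": mass, "members": [(sample_id, mass, intensity)]}
            )
    return clusters
```

Within each resulting cluster the members share an apparent molecular mass while their intensities may differ between samples.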
- the present invention relates to methods for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
- a method for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast comprises: obtaining a test sample from a given subject, contacting said sample with an adsorbent present on a biologically active surface under specific binding conditions, allowing the biomolecules within the test sample to bind to said adsorbent, detecting one or more bound biomolecules using a detection method, wherein the detection method generates a mass profile of said sample, transforming mass profile data into a computer-readable form, and comparing the mass profile of said sample with a database containing mass profiles from comparable samples specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast.
- a comparison of mass profiles allows the medical practitioner to determine if a subject is healthy, has a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer or a non-malignant disease of the breast based on the presence, absence or quantity of specific biomolecules.
- a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da
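Each listed mass carries its own tolerance window, so deciding whether an observed peak corresponds to one of the listed biomolecules amounts to a range check. The sketch below uses only four of the listed mass and tolerance pairs for brevity; the function and variable names are hypothetical.

```python
# Nominal apparent mass (Da) -> tolerance (Da); a small subset of the listed values.
MARKER_WINDOWS = {1506: 8, 2053: 10, 4161: 21, 6454: 32}


def match_marker(observed_mass, windows=MARKER_WINDOWS):
    """Return the nominal mass whose window contains `observed_mass`, else None."""
    for nominal, tolerance in windows.items():
        if abs(observed_mass - nominal) <= tolerance:
            return nominal
    return None
```

An observed mass of 2060 Da falls inside the 2053 Da window, whereas 2064 Da falls outside it and matches none of the four entries.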
- Detection of a single or a combination of more than one biomolecule of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine.
- the denatured sample is then diluted 1:10 in a specific binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5), applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
- a specific binding buffer: 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5
- a biomolecule of the invention may include any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, glycosylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C.
- Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
- a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules or fragments thereof.
- a biomolecule having the apparent molecular mass of, e.g., about 8940 Da is present only in biological samples from patients having a metastasised breast cancer.
- Mass profiling of two test samples from different subjects, X and Y, reveals the presence of a biomolecule with the apparent molecular mass of about 8940 Da in a sample from test subject X, and the absence of said biomolecule in the test sample from subject Y.
- the medical practitioner is able to diagnose subject X as having a potential metastasised breast cancer and subject Y as not having a metastasised breast cancer.
- biomolecules having the apparent molecular mass of about 2053 Da, 4161 Da and 10682 Da are present in varying quantities in samples specific for precancerous lesions and "early" breast cancers.
- the biomolecule having the apparent molecular mass of 2053 Da is more abundant in samples specific for precancerous lesions of the breast than in those specific for "early" breast cancers.
- a biomolecule having an apparent molecular mass of 4161 Da is detected in samples from subjects having "early" breast cancers but not in those having a precancerous lesion, whereas the biomolecule having the molecular mass of 10682 Da is present in about the same quantity in both sample types.
- these biomolecules are not present in samples from healthy subjects; only those with apparent molecular masses of 14014 Da and 9377 Da are.
- Analysis of a test sample reveals the presence of biomolecules having the molecular mass of 10682 Da, 2053 Da and 4161 Da.
- Comparison of the quantity of the biomolecules within said sample reveals that the biomolecule with an apparent molecular mass of 2053 Da is present at lower levels than those found in samples from subjects having a precancerous lesion.
- the medical practitioner is able to diagnose the test subject as having an "early" breast cancer.
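The reasoning of this worked example can be restated as a toy rule set. It is an illustration of the example's logic only, not a clinical rule: the function name, the intensity scale, and the reference level for the 2053 Da biomolecule are hypothetical.

```python
def interpret_example(sample, precancerous_reference_2053):
    """Apply the example's logic to a mapping of apparent mass (Da) -> intensity.

    A biomolecule missing from `sample` is treated as absent (intensity 0).
    `precancerous_reference_2053` is the 2053 Da level assumed to be typical
    of samples from subjects with a precancerous lesion.
    """
    level_2053 = sample.get(2053, 0.0)
    level_4161 = sample.get(4161, 0.0)
    level_10682 = sample.get(10682, 0.0)

    all_present = level_2053 > 0 and level_4161 > 0 and level_10682 > 0
    if all_present and level_2053 < precancerous_reference_2053:
        return "early breast cancer"
    if level_2053 > 0 and level_10682 > 0 and level_4161 == 0:
        return "precancerous lesion"
    return "undetermined"
```

A sample containing all three biomolecules, with the 2053 Da biomolecule below the precancerous reference level, is read as "early" breast cancer, matching the conclusion drawn in the example.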
- an immunoassay can be used to determine the presence or absence of a biomolecule within a test sample of a subject.
- the presence or absence of a biomolecule within a sample can be detected using the various immunoassay methods known to those skilled in the art (e.g. ELISA, western blots). If a biomolecule is present in the test sample, it will form an antibody-marker complex with an antibody that specifically binds a biomolecule under suitable incubation conditions. The amount of an antibody-biomolecule complex can be determined by comparing to a standard.
- the invention provides a method for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast comprising: detecting one or more differentially expressed biomolecules within a sample.
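Determining an amount by comparison to a standard is commonly done by reading the measured signal off a standard curve. The sketch below uses piecewise linear interpolation between standards; the interpolation scheme, the function name, and the example values are assumptions, as the text does not describe how the comparison is made.

```python
def concentration_from_standards(signal, standards):
    """Estimate a concentration by linear interpolation on a standard curve.

    `standards` is a list of (known_concentration, measured_signal) pairs.
    Signals outside the range covered by the standards are rejected rather
    than extrapolated.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (conc_lo, sig_lo), (conc_hi, sig_hi) in zip(points, points[1:]):
        if sig_lo <= signal <= sig_hi:
            if sig_hi == sig_lo:
                return conc_lo
            slope = (conc_hi - conc_lo) / (sig_hi - sig_lo)
            return conc_lo + (signal - sig_lo) * slope
    raise ValueError("signal lies outside the range of the standards")
```

With standards at concentrations 0, 10 and 20 giving signals 0.0, 0.5 and 1.0, a signal of 0.75 interpolates to a concentration of 15.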
- This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer and/or a non-malignant disease of the breast.
- Binding molecules include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules.
- binding molecules are antibodies specific for biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ±
- a method for detecting the differential presence or absence of one or more biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da
- antibodies or fragments thereof may be utilised for the detection of a biomolecule in a biological sample comprising: applying a labelled antibody directed against a given biomolecule of the invention to said sample under conditions that favour an interaction between the labelled antibody and its corresponding protein.
- an antibody coupled to an enzyme is detected using a chromogenic substrate that is recognised and cleaved by the enzyme to produce a chemical moiety, which is readily detected using spectrometric, fluorimetric or visual means.
- Enzymes used for labelling include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase.
- Detection may also be accomplished by visual comparison of the extent of the enzymatic reaction of a substrate with that of similarly prepared standards.
- radiolabelled antibodies can be detected using a gamma or a scintillation counter, or they can be detected using autoradiography.
- fluorescently labelled antibodies are detected based on the level at which the attached compound fluoresces following exposure to a given wavelength. Fluorescent compounds typically used in antibody labelling include, but are not limited to, fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine.
- antibodies coupled to a chemi- or bioluminescent compound can be detected by determining the presence of luminescence.
- luminescent compounds include, but are not limited to, luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt, oxalate ester, luciferin, luciferase and aequorin.
- in vivo techniques for the detection of a biomolecule of the invention include introducing into a subject a labelled antibody directed against a given polypeptide or fragment thereof.
- the test sample used for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
- Preferably, test samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
- test samples used for the methods of the invention are isolated from subjects of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
- the methods of the invention for the differential diagnosis of healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasized breast cancer or subjects having a non-malignant disease of the breast described herein may be combined with other diagnostic methods to improve the outcome of the differential diagnosis.
- Other diagnostic methods are known to those skilled in the art.
- a database comprising mass profiles specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast is generated by contacting biological samples isolated from above-mentioned subjects with an adsorbent on a biologically active surface under specific binding conditions, allowing the biomolecules within said sample to bind said adsorbent, detecting one or more bound biomolecules using a detection method wherein the detection method generates a mass profile of said sample, transforming the mass profile data into a computer-readable form and applying a mathematical algorithm to classify the mass profile as specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer or subjects having a non-malignant disease of the breast.
- the classification of said mass profiles is performed using the "CART" decision tree approach [classification and regression trees; Breiman et al. (1984) Classification and regression trees. Wadsworth International, Belmont, California] and is known to those skilled in the art. Furthermore, bagging of classifiers is applied to overcome typical instabilities of forward variable selection procedures, thereby increasing overall classifier performance [Breiman L. (1996) Bagging Predictors. Machine learning 24: 123-140].
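Bagging (bootstrap aggregating) can be illustrated with a deliberately simplified stand-in for full CART trees: each constituent classifier below is a depth-one tree (a single threshold rule chosen by misclassification count rather than by the CART impurity criteria), fitted to a bootstrap resample and combined by majority vote. All names, the number of trees, and the seed are assumptions made for this sketch.

```python
import random
from collections import Counter


def fit_stump(rows, labels):
    """Fit a one-split tree: the (variable, threshold) with fewest training errors."""
    best = None
    for j in range(len(rows[0])):
        values = sorted({row[j] for row in rows})
        for low, high in zip(values, values[1:]):
            threshold = (low + high) / 2
            left = [y for row, y in zip(rows, labels) if row[j] <= threshold]
            right = [y for row, y in zip(rows, labels) if row[j] > threshold]
            left_label = Counter(left).most_common(1)[0][0]
            right_label = Counter(right).most_common(1)[0][0]
            errors = sum(y != left_label for y in left)
            errors += sum(y != right_label for y in right)
            if best is None or errors < best[0]:
                best = (errors, j, threshold, left_label, right_label)
    if best is None:
        # No split is possible (all rows identical): always predict the majority.
        majority = Counter(labels).most_common(1)[0][0]
        return (0, float("inf"), majority, majority)
    return best[1:]


def predict_stump(stump, row):
    j, threshold, left_label, right_label = stump
    return left_label if row[j] <= threshold else right_label


def fit_bagged(rows, labels, n_trees=25, seed=0):
    """Fit `n_trees` stumps, each on a bootstrap resample drawn with replacement."""
    rng = random.Random(seed)
    n = len(rows)
    ensemble = []
    for _ in range(n_trees):
        picks = [rng.randrange(n) for _ in range(n)]
        ensemble.append(
            fit_stump([rows[i] for i in picks], [labels[i] for i in picks])
        )
    return ensemble


def predict_bagged(ensemble, row):
    """Combine the constituent predictions by equal-weight majority vote."""
    votes = Counter(predict_stump(stump, row) for stump in ensemble)
    return votes.most_common(1)[0][0]
```

Because each stump sees a different resample, an unstable choice of variable or threshold in any single tree is averaged out by the vote, which is the stabilising effect attributed to bagging above.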
- one or more biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da
- biomolecules within a given sample are bound to an adsorbent on a biologically active surface under specific binding conditions, for example, the biomolecules within a given sample are applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated with 0.1 M Tris-HCl, 0.02% Triton X-100 at a pH of 8.5 to allow for specific binding. Biomolecules that bind to said biologically active surface under these conditions are negatively charged molecules.
- although biomolecules of the invention are bound to a cationic adsorbent comprising positively-charged quaternary ammonium groups, the biomolecules are capable of binding other types of adsorbents, as described in another section, using binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
- biological samples used to generate a database of mass profiles for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer or subjects having a non-malignant disease of the breast may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin.
- Preferably, biological samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
- the biological samples related to the invention are isolated from subjects considered to be healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or having a non-malignant disease of the breast.
- Said subjects are of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
- a subject of the invention that is said to have a precancerous lesion of the breast, displays preliminary stages of a cancer, wherein a cell and/or tissue has become susceptible to the development of a cancer as a result of either a genetic predisposition, exposure to a cancer-causing agent (carcinogen) or both.
- a cancer-causing agent carcinogen
- a genetic predisposition may include a predisposition for an autosomal dominant inherited cancer syndrome which is generally indicated by a strong family history of uncommon cancer and/or an association with a specific marker phenotype, a familial cancer (e.g. familial relapsing a non-malignant disease of the breast) wherein an evident clustering of cancer is observed but the role of inherited predisposition may not be clear, or an autosomal recessive syndrome characterised by chromosomal or DNA instability.
- Cancer-causing agents include agents that stimulate genetic damage and induce neoplastic transformation of a cell.
- Such agents fall into three categories: 1) chemical carcinogens such as alkylating agents, polycyclic aromatic hydrocarbons, aromatic amines, azo dyes, nitrosamines and amides, asbestos, vinyl chloride, chromium, nickel, arsenic, and naturally occurring carcinogens (e.g. aflatoxin B1); 2) radiation such as ultraviolet (UV) and ionising radiation including electromagnetic (e.g. x-rays, γ-rays) and particulate radiation (e.g. α and β particles, protons, neutrons); 3) viral and microbial carcinogens such as human papillomavirus (HPV), Epstein-Barr virus (EBV), hepatitis B virus (HBV), human T-cell leukaemia virus type 1 (HTLV-1), or Helicobacter pylori.
- HPV human Papillomavirus
- EBV Epstein-Barr virus
- HBV hepatitis B virus
- HTLV-1 human T-cell leukaemia virus type 1
- Helicobacter pylori a viral and microbial carcinogens
- environmental factors have also been implicated to play a role in the predisposition of breast cancer. Such factors are known to those skilled in the art and include, but are not limited to smoking, chronic alcohol intake, and the consumption of a high-energy diet rich in fats.
- breast cancer arises with greater frequency in patients with a chronic non-malignant disease of the breast
- cancers of the breast are also referred to as mammary cancers or carcinomas of the breast.
- Breast cancers of the invention include both in situ (non-invasive) and invasive breast cancers.
- in situ (non-invasive) breast cancers include ductal and lobular carcinoma in situ (DCIS and LCIS, respectively)
- invasive breast cancers encompass infiltrating diseases such as invasive ductal, lobular and papillary carcinomas (DCIS and LCIS) and medullar, colloid, and tubular carcinomas.
- breast cancers of the invention may also be of various stages, wherein the staging is based on the size of the primary lesion, its extent of spread to regional lymph nodes, and the presence or absence of blood-borne metastases (metastatic breast cancers).
- the various stages of a breast cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre Cancer (UICC) system or American Joint Committee on Cancer (AJC)].
- UICC Union Internationale Contre Cancer
- AJC American Joint Committee on Cancer
- different grades of said breast cancers wherein the grade of the cancer is based on the degree of differentiation of the epithelial cells within the lining of the breast and the number of mitoses as a correlation to a neoplasm's aggression.
- a subject said to have a non-malignant disease of the breast possesses a lesion of the breast that does not exhibit malignant neoplastic physiological, biochemical, and/or morphological properties known to those skilled in the art.
- diseases include, but are not limited to, inflammatory and proliferative lesions, fibrocystic changes within mammary tissue as well as benign disorders of the breast.
- inflammatory lesions encompass acute, periductal and granulomatous mastitis, duct ectasia, fat necrosis, whereas proliferative lesions include epithelial hyperplasia (atypical ductal and lobular hyperplasia), sclerosing adenosis, and small duct papillomas.
- benign disorders of the glandular tissue mastopathy
- papillomas large duct, intraductal
- fibroadenomas are benign disorders of the glandular tissue.
- Healthy individuals are those that possess good health, and demonstrate an absence of a breast cancer or a non-malignant disease of the breast.
- Biomolecules The differential expression of biomolecules in samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having metastasised breast cancer, and subjects having a non-malignant disease of the breast, allows for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject.
- Biomolecules are said to be specific for a particular clinical state (e.g. healthy, precancerous lesion of the breast, breast cancer, metastasised breast cancer, a non-malignant disease of the breast) when they are present at different levels within samples taken from subjects in one clinical state as compared to samples taken from subjects from other clinical states (e.g. in subjects with a precancerous lesion of the breast vs. in subjects with a metastasised breast cancer). Biomolecules may be present at elevated levels, at decreased levels, or altogether absent within a sample taken from a subject in a particular clinical state (e.g. healthy, precancerous lesion of the breast, breast cancer, metastasised breast cancer, a non-malignant disease of the breast).
- biomolecules A and B are found at elevated levels in samples isolated from healthy subjects as compared to samples isolated from subjects having a precancerous lesion of the breast, a breast cancer, a metastatic breast cancer or a non-malignant disease of the breast.
- biomolecules X, Y, Z are found at elevated levels and/or more frequently in samples isolated from subjects having a precancerous lesion of the breast as opposed to subjects in good health, having a breast cancer, a metastasised breast cancer or a non-malignant disease of the breast.
- Biomolecules A and B are said to be specific for healthy subjects, whereas biomolecules X, Y, Z are specific for subjects having a precancerous lesion of the breast.
- the differential presence of one or more biomolecules found in a test sample compared to samples from healthy subjects, subjects with a precancerous lesion of the breast, a breast cancer, a metastasized breast cancer, or a non-malignant disease of the breast, or the mere detection of one or more biomolecules in the test sample provides useful information regarding probability of whether a subject being tested has a precancerous lesion of the breast, a breast cancer, a metastasized breast cancer or a non-malignant disease of the breast.
- the probability that a subject being tested has a precancerous lesion of the breast, a breast cancer, a metastasized breast cancer or a non-malignant disease of the breast depends on whether the quantity of one or more biomolecules in a test sample taken from said subject is statistically significantly different from the quantity of one or more biomolecules in a biological sample taken from healthy subjects, subjects having a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer, or a non-malignant disease of the breast.
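The comparison of biomolecule quantities between groups can be illustrated with a minimal sketch. It assumes an exact two-sided permutation test on the difference of group means; the text does not specify which statistical test is used, and the function name and intensity values below are hypothetical.

```python
from itertools import combinations
from statistics import mean


def permutation_p_value(group_a, group_b):
    """Exact two-sided permutation test on the absolute difference of means."""
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    observed = abs(mean(group_a) - mean(group_b))
    indices = range(len(pooled))
    total = 0
    extreme = 0
    # Enumerate every way of relabelling the pooled values into two groups.
    for chosen in combinations(indices, n_a):
        chosen_set = set(chosen)
        a = [pooled[i] for i in chosen]
        b = [pooled[i] for i in indices if i not in chosen_set]
        total += 1
        if abs(mean(a) - mean(b)) >= observed - 1e-12:
            extreme += 1
    return extreme / total


# Hypothetical peak intensities (arbitrary units), not measured data.
healthy = [12.1, 11.8, 12.5, 12.0, 11.9]
cancer = [15.2, 14.9, 15.8, 15.1, 15.5]
p = permutation_p_value(healthy, cancer)
print(f"p = {p:.4f}")
```

With two fully separated groups of five values, only the original labelling and its mirror image are as extreme as the observation, so the p-value is 2 out of 252 possible relabellings.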
- a biomolecule of the invention may be any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C.
- Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
- a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules.
- the biomolecules of the invention can be detected based on specific sample pre-treatment conditions, the pH of binding conditions, the type of biologically active surface used for the detection of biomolecules within a given sample and their molecular mass. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine.
- the denatured sample is then diluted 1:10 in 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5, applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
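The two dilution steps combine multiplicatively. The sketch below assumes that "1:5" and "1:10" each mean one part of sample in that many parts of total volume; the text does not state the convention, and the function name is hypothetical.

```python
from fractions import Fraction


def overall_dilution(*steps):
    """Multiply successive dilutions, each given as (sample parts, total parts)."""
    factor = Fraction(1)
    for sample_parts, total_parts in steps:
        factor *= Fraction(sample_parts, total_parts)
    return factor


# Step 1: 1:5 in denaturation buffer; step 2: 1:10 in binding buffer.
final = overall_dilution((1, 5), (1, 10))
print(final)  # 1/50
```

Under this assumption the sample applied to the surface is diluted 50-fold relative to the original specimen.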
- although the biomolecules of the invention are detected using a cationic adsorbent (positively charged quaternary ammonium groups), as well as specific pre-treatment and binding conditions, the biomolecules are capable of binding other types of adsorbents, as described below, using alternative pre-treatment and binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
- the biomolecules of the invention include biomolecules having a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ±
- although the biomolecules were first identified in blood serum samples, their detection is not limited to said sample type.
- the biomolecules may also be detected in other sample types, such as blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract.
- samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
- because the biomolecules can be sufficiently characterized by their mass and biochemical characteristics such as the type of biologically active surface they bind to or the pH of binding conditions, it is not necessary to determine their identity in order to detect them in a sample. It should be noted that molecular mass and binding properties are characteristic properties of these biomolecules and not limitations on the means of detection or isolation. Furthermore, using the methods described herein, or other methods known in the art, the absolute identity of the markers can be determined. This is important when one wishes to develop and/or screen for specific binding molecules, or to develop an assay for the detection of said biomolecules using specific binding molecules.
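Characterising a biomolecule by its mass amounts to checking whether an observed peak falls inside one of the stated tolerance windows. The following minimal sketch uses the mass and tolerance pairs listed in the text; the function name and the observed values are hypothetical.

```python
# (nominal mass in Da, tolerance in Da) pairs as listed in the text.
MARKERS = [
    (1506, 8), (1533, 8), (1623, 8), (1975, 10), (2017, 10), (2053, 10),
    (2268, 11), (2607, 13), (3328, 17), (3508, 18), (3660, 18), (3951, 20),
    (4107, 21), (4161, 21), (4245, 21), (4295, 21), (4363, 22), (4476, 22),
    (4614, 23), (4725, 24), (4831, 24), (4874, 24), (4962, 25), (5115, 26),
    (5497, 27), (5655, 28), (5863, 29), (6454, 32), (6655, 33),
]


def match_peak(observed_mass, markers=MARKERS):
    """Return every marker whose tolerance window contains the observed mass."""
    return [(m, tol) for m, tol in markers if abs(observed_mass - m) <= tol]


print(match_peak(1510.2))  # [(1506, 8)]
print(match_peak(1700.0))  # []
print(match_peak(4852.0))  # [(4831, 24), (4874, 24)]
```

A list is returned rather than a single marker because some windows overlap, for example 4831 Da ± 24 Da and 4874 Da ± 24 Da.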
- biologically active surfaces include, but are not restricted to, surfaces that contain adsorbents such as quaternary ammonium groups (anion exchange surfaces), carboxylate groups (cation exchange surfaces), alkyl or aryl chains (hydrophobic interaction, reverse phase chemistry), groups such as nitrilotriacetic acid that immobilize metal ions such as nickel, gallium, copper, or zinc (metal affinity interaction), or biomolecules such as proteins, preferably antibodies, or nucleic acids, preferably protein binding sequences, covalently bound to the surface via carbonyldiimidazole moieties or epoxy groups (specific affinity interaction).
- adsorbents comprising anion exchange surfaces.
- These surfaces may be located on matrices like polysaccharides such as sepharose, e.g. anion exchange surfaces or hydrophobic interaction surfaces, or solid metals, e.g. antibodies coupled to magnetic beads. Surfaces may also include gold-plated surfaces such as those used for Biacore Sensor Chip technology. Other surfaces known to those skilled in the art are also included within the scope of the invention.
- Biomolecules like amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
- devices that use biologically active surfaces to selectively adsorb biomolecules may be chromatography columns for Fast Protein Liquid Chromatography (FPLC) and High Pressure Liquid Chromatography (HPLC), where the matrix, e.g. a polysaccharide, carrying the biologically active surface, is filled into vessels (usually referred to as "columns") made of glass, steel, or synthetic materials like polyetheretherketone (PEEK).
- FPLC Fast Protein Liquid Chromatography
- HPLC High Pressure Liquid Chromatography
- devices that use biologically active surfaces to selectively adsorb biomolecules may be metal strips carrying thin layers of the biologically active surface on one or more spots of the strip surface to be used as probes for gas phase ion spectrometry analysis, for example the SAX2 ProteinChip array (Ciphergen Biosystems, Inc.) for SELDI analysis.
- the mass profile of a sample may be generated using an array-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent present on a biologically active surface located on a solid platform ("array" or "probe"). After the biomolecules have bound to the adsorbent, they are detected using gas phase ion spectrometry. Biomolecules or other substances bound to the adsorbents on the probes can be analyzed using a gas phase ion spectrometer. This includes, e.g., mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. The quantity and characteristics of the biomolecule can be determined using gas phase ion spectrometry. Other substances in addition to the biomolecule of interest can also be detected by gas phase ion spectrometry.
- a mass spectrometer can be used to detect biomolecules on the probe.
- a probe with a biomolecule is introduced into an inlet system of the mass spectrometer.
- the biomolecule is then ionized by an ionization source, such as a laser, fast atom bombardment, or plasma.
- the generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions.
- the ionisation source that ionises the biomolecule is a laser.
- the ions exiting the mass analyzer are detected by an ion detector.
- the ion detector then translates information of the detected ions into mass-to-charge ratios. Detection of the presence of a biomolecule or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of a biomolecule bound to the probe.
- the mass profile of a sample may be generated using a liquid-chromatography (LC)-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent located in a vessel made of glass, steel, or synthetic material; known to those skilled in the art as a chromatography column.
- LC liquid-chromatography
- the biomolecules are eluted from the biologically active surface by washing the vessel with appropriate solutions known to those skilled in the art.
- solutions include, but are not limited to, buffers, e.g. Tris(hydroxymethyl)aminomethane hydrochloride (Tris-HCl), buffers containing salt, e.g. sodium chloride (NaCl), or organic solvents, e.g. acetonitrile.
- Biomolecule mass profiles are generated by application of the eluting biomolecules of the sample by direct connection via an electrospray device to a mass spectrometer (LC/ESI-MS).
- Conditions that promote binding of biomolecules to an adsorbent are known to those skilled in the art (reference) and ordinarily include parameters such as pH, the concentration of salt, organic solvent, or other competitors for binding of the biomolecule to the adsorbent.
- incubation temperatures are from 0 to 100°C, preferably from 4 to 60°C, and most preferably from 15 to 30°C.
- Varying additional parameters such as incubation time, the concentration of detergent, e.g., 3-[(3-Cholamidopropyl) dimethylammonio]-2-hydroxy-1-propanesulfonate (CHAPS), or reducing agents, e.g. dithiothreitol (DTT), are also known to those skilled in the art.
- the invention relates to methods for detecting differentially present biomolecules in a test sample and/or biological sample.
- any suitable method can be used to detect one or more of the biomolecules described herein.
- gas phase ion spectrometry can be used. This technique includes, e.g., laser desorption/ionization mass spectrometry.
- the test and/or biological sample is prepared prior to gas phase ion spectrometry, e.g., pre-fractionation, two-dimensional gel chromatography, high performance liquid chromatography, etc. to assist detection of said biomolecules.
- Detection of said biomolecules can also be achieved using methods other than gas phase ion spectrometry.
- immunoassays can be used to detect the biomolecules within a sample.
- the test and/or biological sample is prepared prior to contacting a biologically active surface and is in aqueous form.
- samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
- solid test and/or biological samples, such as excreta or biopsy samples can be solubilised in or admixed with an eluent using methods known to those skilled in the art such that said samples may be easily applied to a biologically active surface.
- Test and/or biological samples in the aqueous form can be further prepared using specific solutions for denaturation (pre-treatment) like sodium dodecyl sulphate (SDS), mercaptoethanol, urea, etc.
- a test and/or biological sample of the invention can be denatured prior to contacting a biologically active surface comprising quaternary ammonium groups by diluting said sample 1:5 with a buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT and
- the sample is contacted with a biologically active surface using any techniques including bathing, soaking, dipping, spraying, washing over, or pipetting, etc.
- a volume of sample containing from a few attomoles to 100 picomoles of a biomolecule in about 1 to 500 µl is sufficient for detecting binding of the biomolecule to the adsorbent.
- the pH value of the solvent in which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface.
- a sample is contacted with a biologically active surface under pH values between 0 and 14, preferably between about 4 and 10, more preferably between 4.5 and 9.0, and most preferably, at pH 8.5.
- the pH value depends on the type of adsorbent present on a biologically active surface and can be adjusted accordingly.
- the sample can contact the adsorbent present on a biologically active surface for a period of time sufficient to allow the marker to bind to the adsorbent.
- the sample and the biologically active surface are contacted for a period of between about 1 second and about 12 hours, preferably, between about 30 seconds and about 3 hours, and most preferably for 120 minutes.
- the temperature at which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface.
- the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
- a biologically active surface comprising quaternary ammonium groups (anion exchange surface) will bind the biomolecules described herein when the pH value is between 6.5 and 9.0.
- Optimal binding of the biomolecules of the present invention occurs at a pH of 8.5.
- a sample is contacted with said biologically active surface for 120 minutes at a temperature of 20 - 24 °C.
- during washing, unbound biomolecules are removed by methods known to those skilled in the art such as bathing, soaking, dipping, rinsing, spraying, or washing the biologically active surface with an eluent or a washing solution.
- a microfluidics process is preferably used when a washing solution such as an eluent is introduced to small spots of adsorbents on the biologically active surface.
- the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
- Washing solutions or eluents used to wash the unbound biomolecules from a biologically active surface include, but are not limited to, organic solutions and aqueous solutions such as buffers, wherein a buffer may contain detergents, salts, or reducing agents in appropriate concentrations known to those skilled in the art.
- Aqueous solutions are preferred for washing biologically active surfaces.
- Exemplary aqueous solutions include, but are not limited to, HEPES buffer, Tris buffer, phosphate buffered saline (PBS), and modifications thereof.
- the selection of a particular washing solution or an eluent is dependent on other experimental conditions (e.g., types of adsorbents used or biomolecules to be detected), and can be determined by those of skill in the art. For example, if a biologically active surface comprising a quaternary ammonium group as adsorbent (anion exchange surface) is used, then an aqueous solution, such as a Tris buffer, may be preferred. In another example, if a biologically active surface comprising a carboxylate group as adsorbent (cation exchange surface) is used, then an aqueous solution, such as an acetate buffer, may be preferred.
- an energy absorbing molecule e.g. in solution
- EAM energy absorbing molecule
- exemplary energy absorbing molecules include, but are not limited to, cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid.
- adsorbent-bound biomolecules are detected using gas phase ion spectrometry.
- the quantity and characteristics of a biomolecule can be determined using said method.
- said biomolecules can be analyzed using a gas phase ion spectrometer such as mass spectrometers, ion mobility spectrometers, or total ion current measuring devices.
- Other gas phase ion spectrometers known to those skilled in the art are also included.
- mass spectrometry can be used to detect biomolecules of a given sample present on a biologically active surface.
- methods include, but are not limited to, matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF), surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF), liquid chromatography coupled with MS, MS-MS, or ESI-MS.
- MALDI-TOF matrix-assisted laser desorption ionization/time-of-flight
- SELDI-TOF surface-enhanced laser desorption ionization time-of-flight
- ESI-MS liquid chromatography coupled with MS, MS-MS, or ESI-MS.
- the biomolecules present in a sample are detected using gas phase ion spectrometry, and more preferably, using mass spectrometry.
- mass spectrometry can be used.
- MALDI matrix-assisted laser desorption/ionization
- the sample is partially purified to obtain a fraction that essentially consists of a biomolecule by employing such separation methods as: two-dimensional gel electrophoresis (2D-gel) or high performance liquid chromatography (HPLC).
- SELDI surface-enhanced laser desorption/ionization mass spectrometry
- SELDI uses a substrate comprising adsorbents to capture biomolecules, which can then be directly desorbed and ionized from the substrate surface during mass spectrometry. Since the substrate surface in SELDI captures biomolecules, a sample need not be partially purified as in MALDI. However, depending on the complexity of a sample and the type of adsorbents used, it may be desirable to prepare a sample to reduce its complexity prior to SELDI analysis.
- biomolecules bound to a biologically active surface can be introduced into an inlet system of the mass spectrometer.
- the biomolecules are then ionized by an ionization source such as a laser, fast atom bombardment, or plasma.
- the generated ions are then collected by an ion optic assembly, and then a mass analyzer disperses the passing ions.
- the ions exiting the mass analyzer are detected by a detector and translated into mass-to-charge ratios. Detection of the presence of a biomolecule typically involves detection of its specific signal intensity, and reflects the quantity and character of said biomolecule.
- a laser desorption time-of-flight mass spectrometer is used with the probe of the present invention.
- biomolecules bound to a biologically active surface are introduced into an inlet system. Biomolecules are desorbed and ionized into the gas phase by a laser. The ions generated are then collected by an ion optic assembly. These ions are accelerated through a short high-voltage field and allowed to drift into a high vacuum chamber of a time-of-flight mass analyzer. At the far end of the high vacuum chamber, the accelerated ions collide with a detector surface at varying times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ionization and impact can be used to identify the presence or absence of molecules of a specific mass.
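The dependence of flight time on ion mass can be made concrete with the idealised relation t = L·sqrt(m / (2·z·e·U)). This is a minimal sketch: the 20 kV accelerating voltage and 1 m drift length are assumed values, not parameters from the text, and the model ignores the time spent in the acceleration region and any initial energy spread.

```python
import math

DALTON_KG = 1.66053906660e-27        # kilograms per dalton
ELEMENTARY_CHARGE = 1.602176634e-19  # coulombs


def flight_time(mass_da, charge=1, accel_voltage=20_000.0, drift_length=1.0):
    """Idealised drift time in seconds: t = L * sqrt(m / (2 z e U))."""
    mass_kg = mass_da * DALTON_KG
    energy_term = 2 * charge * ELEMENTARY_CHARGE * accel_voltage
    return drift_length * math.sqrt(mass_kg / energy_term)


def mass_from_time(time_s, charge=1, accel_voltage=20_000.0, drift_length=1.0):
    """Invert the relation above to recover the mass in Da from a drift time."""
    energy_term = 2 * charge * ELEMENTARY_CHARGE * accel_voltage
    return energy_term * (time_s / drift_length) ** 2 / DALTON_KG


for mass in (1506, 6655):
    t = flight_time(mass)
    print(f"{mass} Da -> {t * 1e6:.2f} microseconds")
```

Because the drift time scales with the square root of mass, heavier ions arrive later, and an arrival time can be converted back into a mass with `mass_from_time`.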
- detection of the biomolecules described herein can be enhanced using certain selectivity conditions (e.g., types of adsorbents used or washing solutions).
- the same or substantially the same selectivity conditions that were used to discover the biomolecules can be used in the methods for detecting a biomolecule in a sample.
- the computer program generally contains a readable medium that stores codes. Certain codes can be devoted to memory that includes the location of each feature on a biologically active surface, the identity of the adsorbent at that feature and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the biologically active surface defining certain selectivity characteristics (e.g. types of adsorbent and eluents used). The computer also contains codes that receive, as input, data on the strength of the signal at various molecular masses received from a particular addressable location on the biologically active surface. This data can indicate the number of biomolecules detected, as well as the strength of the signal and the determined molecular mass for each biomolecule detected.
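One way to picture the stored information is as a record per addressable feature, filtered by selectivity conditions. The sketch below is an assumption about how such data could be organised, not the program described in the text; the class name, field names and values are hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class Feature:
    """One addressable location on a biologically active surface."""
    location: str
    adsorbent: str
    eluent: str
    # Detected peaks as (molecular mass in Da, signal strength) pairs.
    peaks: list = field(default_factory=list)


def select_features(features, adsorbent=None, eluent=None):
    """Return the features matching the requested selectivity conditions."""
    return [
        f for f in features
        if (adsorbent is None or f.adsorbent == adsorbent)
        and (eluent is None or f.eluent == eluent)
    ]


# Hypothetical entries for illustration only.
features = [
    Feature("A1", "quaternary ammonium", "Tris buffer", [(1506.3, 14.2), (4107.9, 6.1)]),
    Feature("A2", "carboxylate", "acetate buffer", [(2017.4, 3.3)]),
]
anion = select_features(features, adsorbent="quaternary ammonium")
print([f.location for f in anion], len(anion[0].peaks))
```

The number of biomolecules detected at a location is then the length of its `peaks` list, and each entry carries the mass and signal strength.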
- Data analysis can include the steps of determining signal strength (e.g., height of peaks) of a biomolecule detected and removing "outliers" (data deviating from a predetermined statistical distribution).
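Outlier removal can be sketched as follows. The rule used here, a modified z-score based on the median absolute deviation (MAD) with a cut-off of 3.5, is an assumption; the text only refers to a predetermined statistical distribution. The replicate values are hypothetical.

```python
from statistics import median


def remove_outliers(values, threshold=3.5):
    """Drop values whose MAD-based modified z-score exceeds the threshold."""
    med = median(values)
    mad = median(abs(v - med) for v in values)
    if mad == 0:
        # All deviations are zero at the median; nothing can be flagged.
        return list(values)
    return [v for v in values if 0.6745 * abs(v - med) / mad <= threshold]


# Hypothetical replicate peak heights for one biomolecule.
heights = [10.2, 9.8, 10.5, 10.1, 9.9, 25.0]
print(remove_outliers(heights))  # [10.2, 9.8, 10.5, 10.1, 9.9]
```

A median-based rule is used in this sketch because a single extreme peak height would inflate a mean and standard deviation and could mask itself.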
- the observed peaks can be normalized, a process whereby the height of each peak relative to some reference is calculated.
- a reference can be background noise generated by instrument and chemicals (e.g., energy absorbing molecule), which is set as zero in the scale.
- the signal strength detected for each biomolecule can be displayed in the form of relative intensities in the scale desired (e.g., 100).
- a standard may be admixed with the sample so that a peak from the standard can be used as a reference to calculate relative intensities of the signals observed for each biomolecule or other biomolecules detected.
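The normalisation described above can be sketched as subtracting the background level and scaling each peak relative to a reference peak. The function name, argument names and raw heights are hypothetical, and clipping negative values to zero is an assumption.

```python
def normalise(peaks, noise_level, reference_height, scale=100.0):
    """Express peak heights relative to a reference peak after noise subtraction.

    peaks: mapping of molecular mass (Da) to raw peak height.
    noise_level: background signal that is set to zero on the scale.
    reference_height: raw height of the reference (standard) peak.
    """
    span = reference_height - noise_level
    if span <= 0:
        raise ValueError("reference peak must exceed the noise level")
    return {
        mass: max(height - noise_level, 0.0) / span * scale
        for mass, height in peaks.items()
    }


# Hypothetical raw heights; the 4107 Da peak plays the role of the standard.
raw = {1506: 6.0, 4107: 11.0, 6655: 3.5}
relative = normalise(raw, noise_level=1.0, reference_height=11.0)
print(relative)  # {1506: 50.0, 4107: 100.0, 6655: 25.0}
```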
- the computer can transform the resulting data into various formats for displaying.
- a standard spectral view can be displayed, wherein the view depicts the quantity of a biomolecule reaching the detector at each particular molecular mass.
- in a scatter plot view, only the peak height and mass information are retained from the spectrum view, yielding a cleaner image and enabling biomolecules with nearly identical molecular mass to be more visible.
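Reducing a spectrum to peak height and mass can be sketched with a simple local-maximum search. This is an illustrative assumption rather than the algorithm used by any particular instrument software; the spectrum values are hypothetical.

```python
def pick_peaks(masses, intensities, min_height=0.0):
    """Return (mass, height) for each local maximum above min_height."""
    peaks = []
    for i in range(1, len(intensities) - 1):
        here = intensities[i]
        rising = here > intensities[i - 1]
        not_followed_by_higher = here >= intensities[i + 1]
        if rising and not_followed_by_higher and here > min_height:
            peaks.append((masses[i], here))
    return peaks


# Hypothetical spectrum sampled at a few masses (Da).
masses = [1500, 1503, 1506, 1509, 1512, 1530, 1533, 1536]
signal = [0.2, 1.1, 4.8, 1.0, 0.3, 0.9, 3.1, 0.7]
print(pick_peaks(masses, signal, min_height=0.5))  # [(1506, 4.8), (1533, 3.1)]
```

The resulting list of pairs is what a scatter plot of peak height against mass would display.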
- Preferred biomolecules of the invention are biomolecules with an apparent molecular mass of about 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906
- the present invention comprises a method for the identification of these proteins, especially by obtaining their amino acid sequence.
- This method comprises the purification of said proteins from the complex biological sample (blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples) by fractionating said sample using techniques known by the one of ordinary skill in the art, most preferably protein chromatography (FPLC, HPLC).
- FPLC protein chromatography
- the biomolecules of the invention include those proteins with a molecular mass selected from 1506 Da ⁇ 8 Da, 1533 Da ⁇ 8 Da, 1623 Da ⁇ 8 Da, 1975 Da ⁇ 10 Da, 2017 Da ⁇ 10 Da, 2053 Da ⁇ 10 Da, 2268 Da ⁇ 11 Da, 2607 Da ⁇ 13 Da, 3328 Da ⁇ 17 Da, 3508 Da ⁇ 18 Da, 3660 Da ⁇ 18 Da, 3951 Da ⁇ 20 Da, 4107 Da ⁇ 21 Da, 4161 Da ⁇ 21 Da, 4245 Da ⁇ 21 Da, 4295 Da ⁇ 21 Da, 4363 Da ⁇ 22 Da, 4476 Da ⁇ 22 Da, 4614 Da ⁇ 23 Da, 4725 Da ⁇ 24 Da, 4831 Da ⁇ 24 Da, 4874 Da ⁇ 24 Da, 4962 Da ⁇ 25 Da, 5115 Da ⁇ 26 Da, 5497 Da ⁇ 27 Da, 5655 Da ⁇ 28 Da, 5863 Da ⁇ 29 Da, 6454 Da ⁇ 32 Da, 6655 Da ⁇ 33 Da, 6906
- the method comprises the analysis of the fractions for the presence and purity of said proteins by the method which was used to identify them as differentially expressed biomolecules, for example two-dimensional gel electrophoresis, SELDI mass spectrometry or MALDI mass spectrometry, but most preferably MALDI mass spectrometry.
- the method also comprises an analysis of the purified proteins aimed at revealing their amino acid sequence. This analysis may be performed using mass spectrometry techniques known to those skilled in the art.
- this analysis may be performed using peptide mass fingerprinting, revealing information about the specific peptide mass profile after proteolytic digestion of the investigated protein.
- this analysis may preferably be performed using post-source-decay (PSD) or ESI-MS, but most preferably ESI-MS, revealing mass information about all possible fragments of the investigated protein or proteolytic peptides thereof, leading to the amino acid sequence of the investigated protein or proteolytic peptide thereof.
- the information revealed by the aforementioned techniques can be used to feed world-wide-web search engines, such as MS Fit (Protein Prospector, http://prospector.ucsf.edu) for information obtained from peptide mass fingerprinting, MS Tag (Protein Prospector, http://prospector.ucsf.edu) for information obtained from PSD, or Mascot (www.matrixscience.com) for information obtained from MS/MS and peptide mass fingerprinting, for the alignment of the obtained results with data available in public protein sequence databases, such as SwissProt (http://us.expasy.org/sprot/), NCBI (http://www.ncbi.nlm.nih.gov/BLAST/), or EMBL (http://srs.embl-heidelberg.de:8000/srs5/), which leads to confident information about the identity of said proteins.
- This information may comprise, if available, the complete amino acid sequence, the calculated molecular mass, the structure, the enzymatic activity, the physiological function, and gene expression of the investigated proteins.
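The peptide mass fingerprinting step described above can be illustrated with a minimal Python sketch. It assumes trypsin as the protease (the text only says "proteolytic digestion"), uses standard monoisotopic residue masses, and ignores missed cleavages and modifications. The function names and the example sequence are hypothetical and are not taken from the patent.

```python
# Monoisotopic residue masses in Da (standard values).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056  # one water is added per free peptide


def tryptic_digest(sequence):
    """Cleave after K or R unless the next residue is P (assumed trypsin rule)."""
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        is_last = i == len(sequence) - 1
        if residue in "KR" and not is_last and sequence[i + 1] != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    peptides.append(sequence[start:])
    return peptides


def peptide_mass(peptide):
    """Neutral monoisotopic mass of an unmodified peptide in Da."""
    return sum(RESIDUE_MASS[r] for r in peptide) + WATER


def mass_fingerprint(sequence):
    """Sorted list of peptide masses: the profile a search engine would match."""
    return sorted(round(peptide_mass(p), 4) for p in tryptic_digest(sequence))


# Hypothetical six-residue sequence, used only to show the cleavage rule.
print(tryptic_digest("AKRPGK"))
print(mass_fingerprint("AKRPGK"))
```

A search tool compares such a list of observed peptide masses against the lists predicted for every database sequence; the sketch covers only the prediction side.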
- the invention provides kits using the methods of the invention as described in the section Diagnostics for the differential diagnosis of a breast cancer or a non-malignant disease of the breast, wherein the kits are used to detect the biomolecules of the present invention.
- the methods used to detect the biomolecules of the invention can also be used to determine whether a subject is at risk of developing a breast cancer or has developed a breast cancer. Such methods may also be employed in the form of a diagnostic kit comprising an antibody specific to a biomolecule of the invention or a biologically active surface described herein, which may be conveniently used, for example, in clinical settings to diagnose patients exhibiting symptoms or a family history of a non-steroid dependent cancer. Such diagnostic kits also include solutions and materials necessary for the detection of a biomolecule of the invention, and instructions to use the kit based on the above-mentioned methods.
- the biomolecules of the invention include those proteins with a molecular mass selected from 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906
- kits can be used to detect one or more of the differentially present biomolecules described above in a test sample of a subject.
- the kits of the invention have many applications.
- the kits can be used to determine whether a subject is healthy or has a precancerous lesion of the breast, a breast cancer, a metastasized breast cancer or a non-malignant disease of the breast.
- the kits can be used to identify compounds that modulate expression of said biomolecules.
- a kit comprises an adsorbent on a biologically active surface, wherein the adsorbent is suitable for binding one or more biomolecules of the invention, a denaturation solution for the pre-treatment of a sample, a binding solution, a washing solution or instructions for making a denaturation solution, binding solution, or washing solution, wherein the combination allows for the detection of a biomolecule using gas phase ion spectrometry.
- kits can be prepared from the materials described in other previously detailed sections (e.g., denaturation buffer, binding buffer, adsorbents, washing solutions, etc.).
- the kit may comprise a first substrate comprising an adsorbent thereon (e.g., a particle functionalized with an adsorbent) and a second substrate onto which the first substrate can be positioned to form a probe, which is removably insertable into a gas phase ion spectrometer.
- the kit may comprise a single substrate, which is in the form of a removably insertable probe with adsorbents on the substrate.
- a kit comprises a binding molecule that specifically binds to a biomolecule related to the invention, a detection reagent, appropriate solutions and instructions on how to use the kit.
- kits can be prepared from the materials described above, and other materials known to those skilled in the art.
- a binding molecule used within such a kit may include, but is not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules.
- a binding molecule used in said kit is an antibody.
- the kit may optionally further comprise a standard or control information, so that the test sample can be compared with the standard or control information to determine if the test amount of a marker detected in a sample is a diagnostic amount consistent with a diagnosis of a breast cancer or a non-malignant disease of the breast.
- each recorded measurement reading is accompanied by a margin of deviation.
- the margin of deviation is exclusively device-specific. That means it is caused by the type of analytical device used which is preferably a mass spectrometer.
- the accuracy of the recorded measurement reading is specified by a fixed percentage.
- each disclosed molecular mass represents the average of a range that deviates from that average by about ± 0.5 %.
- slight differences appear in the molecular mass values given for the same protein in parallel patent applications on cancer biomarkers. There are three reasons to be considered. First, each molecular mass results from the analysis of samples belonging to a different type of cancer.
- the origin of the sample, the cellular status, the environmental conditions of the gathered tissue etc. exert an influence on the measurements.
- the given molecular mass of the biomarkers represents the averaged value which is calculated from the data of numerous samples of each cancer type.
- measurement errors are also conceivable, for example due to the sample preparation.
- 11723 ± 58 (breast cancer) (xlix) 12504 ± 62 (epithelial cancer) and 12492 ± 62 (breast cancer) (l) 12669 ± 63 (epithelial cancer), 12619 ± 63 (colorectal cancer), 12648 ± 63 (pancreatic cancer) and 12656 ± 63 (breast cancer)
- ± 70 (breast cancer) (lv) 14206 ± 71 (pancreatic cancer) and 14082 ± 70 (breast cancer) (lvi) 14798 ± 74 (colorectal cancer), 14829 ± 74 (pancreatic cancer) and
- ± 88 (breast cancer) (lxv) 17890 ± 89 (colorectal cancer), 17932 ± 89 (pancreatic cancer) and 17961 ± 89 (breast cancer)
- each recorded measurement reading overlaps with the others within its margin of deviation.
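The margin of deviation and the overlap statement above can be checked with a short Python sketch. It assumes that two readings "overlap" when their mass ± margin intervals intersect. The listed whole-dalton margins appear to be derived from 0.5 % of the mass, but the rounding convention is not stated, so the sketch takes the margins as given in the text rather than recomputing them. Function names are hypothetical.

```python
def relative_margin(mass_da, fraction=0.005):
    """Unrounded margin of deviation: 0.5 % of the mass, in Da."""
    return mass_da * fraction


def overlaps(mass_a, margin_a, mass_b, margin_b):
    """True if [mass_a - margin_a, mass_a + margin_a] intersects the interval for b."""
    return (mass_a - margin_a) <= (mass_b + margin_b) and \
           (mass_b - margin_b) <= (mass_a + margin_a)


# Values taken from the comparison list above.
print(relative_margin(1506))             # about 7.53, listed as 8 Da
print(overlaps(12504, 62, 12492, 62))    # epithelial vs. breast cancer reading
print(overlaps(14206, 71, 14082, 70))    # pancreatic vs. breast cancer reading
```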
- Example 1 Sample collection. Serum samples were obtained from a total of 216 individuals: 147 samples from women suffering from a given disease of the breast, courtesy of the Department of Gynaecology and Obstetrics at the University of Heidelberg in Heidelberg, Germany; and 69 serum samples obtained from healthy patients, courtesy of both the "Deutsches Rotes Kreuz (DRK)" in Berlin, Germany, and the "GENICA study group" in Bonn, Germany.
- serum samples obtained from women suffering from a breast disease could be further subdivided based on the type of disease and the stage to which the disease has progressed, e.g. non-malignant disease, mastopathy, DCIS or breast cancer (Table 1).
- Serum samples were collected from the patients directly before surgery. At this time, a primary diagnosis was made based on standard techniques e.g. mammography, magnetic resonance imaging (MRI) and/or other means for the detection of diseases of the breast. In most cases the final diagnosis was confirmed by histological evaluation after surgery. In about 30% of the cases surgery was not possible due to the advanced stage of cancer.
- follow-up data for all breast cancer patients are currently being collected and will be available for later studies.
- ProteinChip Array analysis. ProteinChip Arrays of the SAX2-type (strong anion exchanger) were arranged into a bioprocessor (Ciphergen Biosystems, Inc.), a device that contains up to 12 ProteinChips and facilitates processing of the ProteinChips. The ProteinChips were pre-incubated in the bioprocessor with 200 µl binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) twice for 15 minutes. 10 µl of serum sample was diluted 1:5 in a buffer (7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% ampholine) and again diluted 1:10 in the binding buffer.
- the ProteinChips were placed in the ProteinChip Reader (ProteinChip Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at an average laser intensity of 215, with the detector sensitivity of 8. Sixty laser shots per average spectra were performed.
- 2 × 1 µl matrix solution (a saturated solution of sinapinic acid in 50% acetonitrile, 0.5% trifluoroacetic acid) was applied to the spot. The drop was allowed to air-dry for 10 min after each application of matrix solution.
- the ProteinChip was placed in the ProteinChip Reader (Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 210, with the detector sensitivity of 8. Sixty laser shots per average spectra were performed. Subsequently, Time-Of-Flight values were correlated to the molecular masses of the standard proteins, and calibration was performed according to the instrument manual.
- Figure 1 shows a comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients suffering from a non-malignant disease of the breast (C1 and C2) and from patients with breast cancer (T1 and T2).
- the following clusters were deleted because of defective peak recognition: m/z 1553.25, 9598.64, 14211.2, 16139.6, 17161.4, 18879.3, 22979.5, 23455.3, and 27570.4 Da.
- the clusters m/z 1508.07, 2020.59, 4303.89.3, and 4614.06, 18408.2, and 23174.5 Da were changed into 1506, 2017, 3660, 4295, 4611, 18430, and 23210 Da, respectively, because of defective cluster mass centring. So, in total, 82 signal clusters were received.
- the cluster information (containing sample ID and sample group, cluster mass values and cluster signal intensities for each spectrum) was transformed into an interchangeable data format (a .csv table) using the "Sample group statistics" function of the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1. In this format, the data was subjected to statistical analyses (see Examples 4 to 6).
- Classifiers with a binary target variable were constructed as follows. First, as a proof of principle, classifiers were constructed and evaluated by stratified 10-fold cross validation. The data set was partitioned into 10 approximately equal-sized subsets in which the two classes are represented in about the same proportion as in the overall data set. Then, 10 classifiers were constructed, each using only 9/10 of the data, by excluding one subset in turn. Classifier performance was determined on the excluded test data set. Thereby, each available case was employed 9 times for classifier construction, and once for classifier evaluation. Test results were collected to determine overall sensitivity and specificity. Second, a final classifier was constructed on the basis of all available cases. This classifier was evaluated by using out-of-bag error estimates, see below.
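The stratified 10-fold partition described above can be sketched in plain Python as follows. This is an illustration only: the function names are hypothetical, the labels reuse the sample counts from Example 1 (147 and 69) purely as an example, and the software actually used for partitioning is not named in the text.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign case indices to k folds so each class is spread evenly over the folds."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        for position, index in enumerate(indices):
            folds[position % k].append(index)  # deal cases out round-robin
    return folds


def train_test_splits(labels, k=10, seed=0):
    """Yield (train, test) index lists; each fold is held out exactly once."""
    folds = stratified_folds(labels, k, seed)
    for held_out in range(k):
        test = folds[held_out]
        train = [i for f, fold in enumerate(folds) if f != held_out for i in fold]
        yield train, test


# Illustrative labels: 147 diseased and 69 healthy cases, as in Example 1.
labels = ["disease"] * 147 + ["healthy"] * 69
for train, test in train_test_splits(labels):
    n_healthy = sum(labels[i] == "healthy" for i in test)
    print(len(train), len(test), n_healthy)
```

Each case lands in exactly one test fold and in nine training sets, which matches the "9 times for construction, once for evaluation" description.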
- Classifiers were constructed as decision tree ensembles to overcome typical instabilities of simple forward variable selection procedures such as single decision trees, thereby improving the overall classifier performance on independent test data, see e.g. Breiman L (1996) Bagging Predictors, Machine Learning, Vol. 24, No. 2, pp. 123-140.
- the results of the present invention were generated using the "random forest" approach, see the following references available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/:
- the generated random forest classifiers consisted of 1000 exploratory decision trees, i.e. maximally grown decision trees consisting of pure final nodes only.
- the high number of decision trees was used in order to (1) ensure best classification performance, i.e. a saturation of the test error on the lowest possible level, see Figure 2A and (3) to obtain a sound statistical basis for determining variable importance.
- Decision tree generation was based on bootstrap sub-samples resulting from 98 random selections of cases with replacement from each class, so that both classes were weighted equally. Nodes were split by applying the Gini splitting rule to random subsets consisting of 8 randomly selected variables (masses).
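The two sources of randomness described above, class-balanced bootstrap sampling and random variable subsets at each split, can be sketched as follows. The numbers 98 and 8 come from the text, and 82 is the number of signal clusters reported earlier. The function names and the label layout are hypothetical, and the tree-growing step itself is omitted.

```python
import random


def balanced_bootstrap(labels, per_class=98, rng=random):
    """Draw per_class case indices with replacement from each class."""
    sample = []
    for cls in sorted(set(labels)):
        members = [i for i, label in enumerate(labels) if label == cls]
        sample.extend(rng.choice(members) for _ in range(per_class))
    return sample


def candidate_variables(n_variables, subset_size=8, rng=random):
    """Pick the random subset of variables (masses) considered at one node split."""
    return rng.sample(range(n_variables), subset_size)


# Illustrative data layout: labels only, 82 mass variables.
rng = random.Random(42)
labels = ["positive"] * 147 + ["negative"] * 69
in_bag = balanced_bootstrap(labels, rng=rng)
out_of_bag = set(range(len(labels))) - set(in_bag)
print(len(in_bag), len(out_of_bag))
print(candidate_variables(82, rng=rng))
```

Cases never drawn for a given tree form that tree's out-of-bag set, which is what the later error estimates rely on.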
- Example 5 Classifier structure.
- the final classifier consists of 1000 decision trees, each decision tree consisting typically of about 25 terminal nodes, see Figure 3. For each variable, variable importance was determined as the total decrease in node impurity achieved by splits using this variable, averaged over all trees. The high number of trees ensures a sound statistical basis for variable importance. Node impurity was measured by the Gini index. Table 4 shows all variables ranked according to importance in the final random forest classifier.
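The variable importance measure described above (total Gini impurity decrease per variable, averaged over trees) can be sketched as follows. This is a simplified illustration: the decrease is weighted by node size within a single split only, whereas random forest implementations typically also weight by each node's share of the training sample. All names and the example split records are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini impurity of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(parent, left, right):
    """Impurity of the parent minus the size-weighted impurity of its children."""
    n = len(parent)
    children = (len(left) * gini(left) + len(right) * gini(right)) / n
    return gini(parent) - children


def variable_importance(split_records, n_trees):
    """Sum the decrease per variable over all splits, then average over trees."""
    totals = Counter()
    for variable, decrease in split_records:
        totals[variable] += decrease
    return {variable: total / n_trees for variable, total in totals.items()}


# A perfectly separating split removes all impurity from a balanced node.
parent = ["pos", "pos", "neg", "neg"]
print(gini(parent), gini_decrease(parent, ["pos", "pos"], ["neg", "neg"]))

# Hypothetical split records (mass in Da, decrease) collected from two trees.
print(variable_importance([(4295, 0.5), (4295, 0.25), (1506, 0.25)], n_trees=2))
```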
- the high classification performance of random forests is based on the high degree of independence of the underlying low-biased single decision trees. The high degree of independence is established by two stochastic processes: (1) bootstrapping introducing variations of the training data and (2) the random restriction to small variable subsets for each node splitting.
- the classification result of the final random forest classifier is determined by majority vote: each case is assigned to the class for which most single decision trees vote. The more decision trees vote for a given class, the higher the probability that the case actually belongs to that class.
- Figure 4 visualizes how normalized votes for class "positive" are distributed. Votes were determined by an out-of-bag approach to estimate the distribution of votes on independent test data. Vote normalization was performed as follows: (number of votes for class "positive" - number of votes for class "negative") / (number of trees for which the considered case is "out-of-bag"). Normalized votes range from -1 (all votes for class "negative") to +1 (all votes for class "positive"). Difficult-to-classify cases possess normalized votes around zero.
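The vote normalization above is a one-line formula. The sketch below assumes that every tree for which the case is out-of-bag casts exactly one vote, so the denominator equals the sum of both vote counts. The function name is hypothetical.

```python
def normalized_vote(votes_positive, votes_negative):
    """(positive - negative) / number of out-of-bag trees, ranging from -1 to +1."""
    oob_trees = votes_positive + votes_negative
    if oob_trees == 0:
        raise ValueError("case was never out-of-bag")
    return (votes_positive - votes_negative) / oob_trees


print(normalized_vote(300, 60))   # clearly "positive"
print(normalized_vote(180, 180))  # hard-to-classify case, value 0.0
```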
- Classification performance was estimated by two different methods: 1) 10-fold cross validation in the proof-of-principle framework and, 2) out-of-bag estimation for the final classifier.
- the confusion matrix obtained from cross-validation is presented in Table 2.
- Performance was estimated at 67.59 % specificity and 76.85 % sensitivity.
- the confusion matrix obtained for the final classifier from out-of-bag estimation is presented in Table 3. It yields slightly higher performance levels of 68.52 % specificity and 76.85 % sensitivity.
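Sensitivity and specificity follow directly from the four cells of a confusion matrix. The sketch below uses made-up counts, because Tables 2 and 3 are not reproduced in this text; the function names are hypothetical.

```python
def sensitivity(true_positive, false_negative):
    """Share of truly positive cases that the classifier labels positive."""
    return true_positive / (true_positive + false_negative)


def specificity(true_negative, false_positive):
    """Share of truly negative cases that the classifier labels negative."""
    return true_negative / (true_negative + false_positive)


# Hypothetical counts for illustration only, not the patent's data.
print(sensitivity(true_positive=30, false_negative=10))
print(specificity(true_negative=45, false_positive=15))
```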
- out-of-bag error is the proportion of misclassified cases in the entire data set.
- For the classification of each case (patient) only those trees are applied that were constructed independently of that case, i.e. for which the considered case was not in the bootstrap sub-sample used for training. Such cases are called "out-of-bag" cases.
- Table 3 states classifier performance on the basis of out-of-bag estimation and majority voting for the final classifier. A case is assigned to class "positive" if more than 50% of the decision trees vote for this class.
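Out-of-bag classification with the 50 % majority rule can be sketched as follows. The data layout (one vote and one in-bag flag per tree) and the function names are assumptions made for illustration. A case that was in-bag for every tree gets no prediction and is skipped in the error.

```python
def oob_prediction(tree_votes, in_bag_flags):
    """Majority vote over only those trees that did not train on this case."""
    oob_votes = [vote for vote, in_bag in zip(tree_votes, in_bag_flags) if not in_bag]
    if not oob_votes:
        return None  # case was in every bootstrap sample
    share_positive = sum(vote == "positive" for vote in oob_votes) / len(oob_votes)
    return "positive" if share_positive > 0.5 else "negative"


def oob_error(predictions, truths):
    """Proportion of misclassified cases among those with an OOB prediction."""
    scored = [(p, t) for p, t in zip(predictions, truths) if p is not None]
    return sum(p != t for p, t in scored) / len(scored)


# Four trees; the case was in the bootstrap sample of the last tree only.
votes = ["positive", "positive", "negative", "negative"]
in_bag = [False, False, False, True]
print(oob_prediction(votes, in_bag))
```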
- the ROC curve extrapolates the performance of the generated classifier to neighbouring sensitivity and specificity ranges, thereby visualizing the possible trade-off between sensitivity and specificity.
- the out-of-bag ROC curve estimation is a valid estimation for the ROC performance of the final classifier on unseen test data as the out-of-bag error was not used for classifier tuning. Instead, training parameters were chosen in accordance with Breiman L. (2001a) and in order to obtain reasonable statistics for variable importance, see Table 4. The obtained AUC value is 0.79.
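The AUC summarizes the ROC curve as the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case. A minimal sketch of that pairwise definition follows, assuming the scores are per-case out-of-bag vote shares. The scores shown are invented and the function name is hypothetical.

```python
def roc_auc(scores_positive, scores_negative):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    wins = 0.0
    for p in scores_positive:
        for n in scores_negative:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(scores_positive) * len(scores_negative))


# Invented vote shares for two positive and two negative cases.
print(roc_auc([0.9, 0.4], [0.5, 0.1]))
```

The pairwise loop is quadratic in the number of cases, which is acceptable for a data set of a few hundred samples.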
- SELDI surface enhanced laser desorption ionization
- MALDI-TOF matrix-assisted laser desorption ionization time of flight
- the biomarker profiles are able to correctly classify a patient as either healthy, having a non-malignant disease of the breast, or having DCIS (early stage cancer) or a breast cancer, with a high degree of sensitivity and specificity.
- classification performance was estimated by two different approaches: cross validation and out-of-bag. Both approaches yielded similar performance estimates, see Tables 2 and 3, respectively.
- the progressive success of classifier generation is shown in Figure 2A. From Figure 2A, it is evident that the out-of-bag error decreases with an increase in the number of decision trees.
- Classification performance can be extended to the entire range of sensitivity and specificity and visualized in ROC curve form, see Figure 2B.
- the classifiers are ensemble classifiers, i.e., they consist of many single decision trees of varying complexity.
- Figure 3 visualizes decision tree complexity by the number of nodes of each single decision tree. The importance of a single mass in an ensemble classifier is determined by summing its partitioning success.
- the present analysis applies the "random forest" approach, an extension of bagging.
- This approach, in addition to data set variations on the level of included cases ("bootstrapping"), restricts feature selection in each partitioning step to random feature subsets.
- the generated decision trees vary significantly and are more independent from each other. Accordingly, averaging over many decision trees yields a better overall classification performance.
- biomarker patterns for the development of a comprehensive diagnostic tool for breast cancer detection. Furthermore, such a diagnostic tool will provide the practising clinician with a basis on which to design a more personalised therapy program for a given patient, thereby improving the overall prognosis of the patient.
- Table 4 Variable importance.
- the table presents the variable importance for all masses, i.e. the total decrease in node impurity achieved by a variable during final classifier construction averaged over all trees. Masses are ranked according to their importance.
Abstract
The present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast. In particular the present invention provides methods for detecting biomolecules within a test sample as well as a database comprising mass profiles of biomolecules specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer or a metastasised breast cancer or subjects having a non-malignant disease of the breast. Furthermore, the present invention provides methods for the characterization of said biomolecules using gas phase ion spectrometry. In addition, the present invention provides methods for the identification of said biomolecules provided that they are proteins or polypeptides. The invention further provides kits for the differential diagnosis of a non-malignant disease of the breast or breast cancer.
Description
Methods and Applications of Biomarker Profiles in the Diagnosis and Treatment of Breast Cancer
The present invention provides biomolecules and the use of these biomolecules for the differential diagnosis of breast cancer and/or non-malignant diseases of the breast. In specific embodiments, the biomolecules are characterised by mass profiles generated by contacting a test and/or biological sample with an anion exchange surface under specific binding conditions and detecting said biomolecules using gas phase ion spectrometry. The biomolecules used according to the invention are preferably proteins or polypeptides. Furthermore, preferred test and/or biological samples are blood serum samples and are of human origin.
BACKGROUND TO THE INVENTION
The incidence of breast cancer, a leading cause of death in women, has been gradually increasing in the United States over the last thirty years. Despite improvements in the rates of screening and early detection, as well as advancements in cancer therapies and improved lifestyles, breast cancer still remains the most common cancer (other than skin cancer) among women. In 2003, the number of new cases of breast cancer in women was estimated to be about 212,600.
While the exact mechanism of tumourigenesis for most breast cancers is largely unknown, research has shown that patients exposed to certain risk factors are more likely than others to develop some form of breast cancer. Some of the strongest risk factors include: increasing age, with women over 60 at the highest risk of developing a breast cancer; a familial and/or personal history of breast cancer; the reproductive and menstrual history of a woman, including the age of menarche (<12 years of age), age at first child-bearing and menopause (>55 years of age); hormone replacement therapies; and genetic factors such as the breast cancer gene (BRCA) family. For example, research has shown that the tumour suppressor genes BRCA1 and BRCA2 contribute to familial breast cancer in 5% to 10% of breast cancer cases. Germ-line mutations within these two loci are associated with a 50 to 85% lifetime risk of breast and/or ovarian cancer [Marcus et al. (1996) Hereditary Breast Cancer: Pathobiology, Prognosis, and BRCA1 and BRCA2 Gene Linkage. Cancer 77:697-709; Casey, G. (1997) The BRCA1 and BRCA2 Breast Cancer Genes. Curr. Opin. Oncol. 9:88-93]. Whereas the cumulative lifetime risk of breast cancer for women who carry the mutant BRCA1 is predicted to be approximately 92%, the risk for the non-carrier majority is estimated to be approximately 10%. Other risk factors have also been linked to the development of breast cancers, such as obesity after menopause, alcohol intake, breast density and ethnicity. Although obesity and alcohol intake are associated with an increased risk, prospective studies have not yet shown that steering clear of these risk factors actually prevents the development of the disease.
Currently there are only a handful of treatments available for specific types of breast cancer. Despite scientific and medical advancements, such therapies provide no guarantee of success. In order for therapies to reach their maximum efficacy, early detection of malignancy, including the ability to differentiate between malignant and non-malignant disease, is required. In addition, a reliable assessment of the cancer's severity is also needed. For example, patients diagnosed with early breast cancer have a five-year relative survival rate of greater than 90%, as compared to a survival rate of about 20% for patients diagnosed with distantly metastasized breast cancers (American Cancer Society statistics). Currently, the best initial detection methods for early breast cancer are palpation of the breast (physical examination) and mammography. Although a physical examination of the breast may be a very good initial indicator, this diagnostic method must be used in parallel with other methods, since the detected lesions may be either benign or malignant, and other lesions may be too small to be detected by palpation alone. Mammography, in contrast, is able to detect a breast tumour before it can be discovered by physical examination, but this diagnostic method is not without its own limitations. For example, mammography's predictive value depends on the observer's skill and the quality of the mammogram. In addition, 80 to 93% of suspicious mammograms are false positives, and 10 to 15% of women with breast cancer have false negative mammograms. Clearly, new diagnostic methods that offer a more sensitive and specific detection of early breast cancer are needed.
Not only should such methods offer more sensitive and specific detection of early breast cancer, they should also be able to determine the stage to which the patient's disease has progressed; stage determination has potential prognostic value and provides criteria for designing optimal therapy. The advantage of pathological staging of breast cancer over clinical staging is that it provides a more accurate prognosis of the disease, the disadvantage being that this method is invasive. Conversely, clinical staging could become a more attractive approach if it were at least as accurate as pathological staging; it does not depend on an invasive procedure to obtain tissue for evaluation.
Early detection and staging of breast cancer could be improved by detecting new markers in serum or urine. Such markers could be mRNA or protein markers expressed by cells originating from the primary tumour in the breast but residing in blood, bone marrow or lymph nodes and could serve as sensitive indicators for tumour development and/or metastasis to these distal organs. For example, specific protein antigens and mRNA, associated with breast epithelial cells, have been detected by immunohistochemical techniques and RT-PCR, respectively, in bone marrow, lymph nodes and blood of breast cancer patients.
Currently, the serum tumour markers most commonly used for breast cancer detection are carcinoembryonic antigen (CEA) and CA 15-3. Limitations of CEA include the absence of elevated serum levels in about 40% of women with metastatic disease. CA 15-3 has a similar limitation, since this marker can also be negative in a significant number of patients with progressive disease and, therefore, fails to predict metastasis. Furthermore, both CEA and CA 15-3 can be elevated in non-malignant, benign conditions, giving rise to false positive results. These serum tumour markers evidently lack the sensitivity and specificity required to be effective in detecting early stage breast cancer in a large population, reaching performance levels of only 23% sensitivity and 69% specificity. In addition, the US Food and Drug Administration has approved the tumour markers CA 15-3 and CA 27.29 only for the monitoring of therapeutic treatment in cases of advanced-stage breast cancer. Clearly, new serological biomarkers that could be used individually or in combination with an existing modality for cost-effective screening of breast cancer are still urgently needed.
Currently, many groups are utilising proteomic technologies to comparatively analyse the differences in protein levels in breast cancer patients as compared to non-diseased subjects, in the hopes of discovering such new serological biomarkers. Formerly, the standard method of proteome analysis has been two dimensional (2D) gel electrophoresis, which is an invaluable tool for the separation and identification of biomarkers. This method is also an effective tool for the identification of aberrantly expressed proteins in a variety of tissue samples. Unfortunately, the analysis of data generated by 2D-gel electrophoresis is labour-intensive and requires large quantities of material for protein analysis, thereby rendering it impractical for routine clinical use.
A tool better suited to routine use became available with the introduction of SELDI (surface enhanced laser desorption ionization), a modification of MALDI-TOF (matrix-assisted laser desorption ionization/time of flight), a mass spectrometry technique that allows for the simultaneous analysis of multiple biomarkers within one sample. Small amounts of biomarkers can be directly bound to a biochip carrying spots with different types of chromatographic material, including those with hydrophobic, hydrophilic, cation-exchanging and anion-exchanging characteristics. This approach has proven to be very useful for identifying biomarkers and biomarker patterns (profiles) in various biological fluids (Ciphergen Inc.).
To date, specific serological biomarkers for the detection of breast cancers (patents WO0223200 and WO03058198 from Ciphergen) have been identified using the above-mentioned SELDI technology. Unfortunately, due to the nature of the sample testing, the biomarkers identified can only be used to diagnose a patient as having a breast cancer versus not having the disease at all. For example, whereas the test samples analysed in WO03058198 (Ciphergen) and WO0223200 (Ciphergen) were taken from patients with late-stage breast cancer (stages III and IV), the control samples were taken from patients with undetectable breast cancer. The biomarkers identified are neither grade-specific nor can they detect the disease at its earliest stages (stage I and II), and thereby would not allow for effective patient-specific diagnosis and/or treatment of the disease. Moreover, such serological biomarkers that can specifically differentiate between the presence of a given breast cancer and a non-malignant disease of the breast have not yet been identified.
Again, there is a critical need to develop a simple, non-invasive, reliable and inexpensive method for the effective detection of breast cancer at its early stages. Preferably, such a diagnostic method should be able to detect early-stage breast cancer, as well as distinguish between the later stages or grades of the disease. Furthermore, such a diagnostic tool should be able to differentiate between breast cancer and a non-malignant disease of the breast. With such valuable information, clinicians would be able to tailor patient therapies for optimum treatment of the disease.
The present invention addresses this difficulty with the development of a non-invasive diagnostic tool for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast.
SUMMARY OF THE INVENTION
The present invention relates to methods for the differential diagnosis of breast cancer and/or non-malignant diseases of the breast, by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
The present invention provides a method for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast, in vitro, comprising obtaining a test sample from a subject, contacting said test sample with a biologically active surface under specific binding conditions, allowing for biomolecules present within the test sample to bind to the biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said test sample, transforming data into a computer-readable form, and comparing said mass profile against a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancers, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
In one embodiment the invention provides a database comprising mass profiles of biological samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast.
Within the same embodiment the database is generated by obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, and subjects having non-malignant diseases of the breast, contacting said biological samples with a biologically active surface under specific binding conditions, allowing the biomolecules within the biological sample to bind to said biologically active surface, detecting one or more bound biomolecules using mass spectrometry thereby generating a mass profile of said biological samples, transforming data into a computer-readable form, and applying a mathematical algorithm to classify the mass profiles as specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancer, and subjects having a non-malignant disease of the breast.
In specific embodiments, the present invention provides biomolecules having a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da,
9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, and 28313 Da ± 142 Da. The biomolecules having said molecular masses are detected by contacting a test and/or biological sample with a biologically active surface comprising an adsorbent under specific binding conditions and are further analysed by gas phase ion spectrometry. Preferably, the adsorbent used comprises positively charged quaternary ammonium groups (anion exchange surface).
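As an illustration of how a measured peak might be compared with the listed masses and their tolerances, the following is a minimal sketch. The table holds only a handful of the mass/tolerance pairs given above. The function name `match_mass` and the example input are hypothetical and are not part of the disclosed method.

```python
# A few (mass, tolerance) pairs in Da, copied from the list above.
# The full list would be entered the same way.
LISTED_MASSES = [
    (1506, 8),
    (4107, 21),
    (8939, 45),
    (12656, 63),
    (28313, 142),
]


def match_mass(observed_da, listed=LISTED_MASSES):
    """Return the listed masses whose tolerance window contains observed_da."""
    return [mass for mass, tol in listed if abs(observed_da - mass) <= tol]


# Hypothetical observed peak, in Da.
matches = match_mass(12656.2)
```

An empty result means the observed mass falls outside every tolerance window in the table.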
In specific embodiments, the invention provides specific binding conditions for the detection of biomolecules within a sample. In preferred embodiments, a sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then diluted again 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C. The treated sample is then contacted with a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubated for 120 minutes at 20 to 24°C, and the bound biomolecules are detected using gas phase ion spectrometry.
In an alternative embodiment, the invention provides a method for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast comprising detecting one or more differentially expressed biomolecules within a sample. This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, and detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast. Preferably, binding molecules are antibodies specific for said polypeptides.
The biomolecules related to the invention have a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da, and may include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably said biomolecules are proteins, polypeptides, or fragments thereof.
In yet another embodiment, the invention provides a method for the identification of biomolecules within a sample, provided that the biomolecules are proteins, polypeptides or fragments thereof, comprising: chromatography and fractionation, analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison. Preferably the method of chromatography is high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC). Furthermore, the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
Furthermore, the invention provides kits for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast.
The test or biological samples used according to the invention may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, the test and/or biological samples are blood serum samples, and are isolated from subjects of mammalian origin, preferably of human origin.
DESCRIPTION OF FIGURES
Figure 1. Comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients with breast cancer (T1 and T2) and from patients suffering from non-malignant diseases of the breast (C1 and C2). A shows an overview in the mass range 11 - 20 kDa. B shows the boxed section of A. The mass signal at m/z 12,656.2 Da is highlighted. Its variable importance score ranks 2nd within the classifier.
Figure 2A. Development of out-of-bag error. During the training process of the final classifier, the out-of-bag error decreased to about 27%. The out-of-bag error is typically higher than the resulting test error as class assignment is only conducted on the basis of about 1/3 of the generated trees.
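The "about 1/3" figure can be checked with a short calculation. This is a sketch that assumes each tree is grown on a bootstrap sample of n cases drawn with replacement from the n available cases. The caption does not state the sampling scheme, so this is an assumption, and the function name is hypothetical.

```python
def out_of_bag_fraction(n_cases):
    """Probability that a given case is absent from one bootstrap sample.

    Each of the n_cases draws misses the case with probability 1 - 1/n_cases,
    so the case is out-of-bag for a tree with probability (1 - 1/n)**n.
    """
    return (1.0 - 1.0 / n_cases) ** n_cases


# For a moderately large data set the value approaches exp(-1), about 0.37.
fraction = out_of_bag_fraction(1000)
```

Under this assumption, each case is classified by roughly 37% of the trees when the out-of-bag error is estimated.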
Figure 2B. Out-of-bag estimation of ROC curve for final classifier. The out-of-bag estimates of sensitivity and specificity presented in Table 3 are extrapolated into the entire range of sensitivity and specificity. This is done by varying the percentage of decision trees with vote "positive" necessary for assigning a case to class "positive". The diagonal represents the average random classifier, assigning cases randomly to class "positive" and "negative". The circle marks the pair of sensitivity and specificity of Table 3.
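The threshold sweep described in this caption can be sketched as follows. The vote fractions, labels and thresholds are invented for illustration, the function name `roc_points` is hypothetical, and both classes are assumed to be present in the data.

```python
def roc_points(positive_vote_fractions, true_labels, thresholds):
    """Return (threshold, sensitivity, specificity) for each threshold.

    positive_vote_fractions: per case, the fraction of trees voting "positive".
    true_labels: per case, True for class "positive", False for "negative".
    A case is assigned to "positive" when its fraction reaches the threshold.
    """
    points = []
    for threshold in thresholds:
        tp = fn = tn = fp = 0
        for fraction, is_positive in zip(positive_vote_fractions, true_labels):
            predicted_positive = fraction >= threshold
            if is_positive and predicted_positive:
                tp += 1
            elif is_positive:
                fn += 1
            elif predicted_positive:
                fp += 1
            else:
                tn += 1
        points.append((threshold, tp / (tp + fn), tn / (tn + fp)))
    return points


# Invented example data: six cases, three per class.
fractions = [0.9, 0.7, 0.4, 0.6, 0.3, 0.1]
labels = [True, True, True, False, False, False]
curve = roc_points(fractions, labels, thresholds=[0.0, 0.35, 0.5, 1.0])
```

Lowering the threshold raises sensitivity at the cost of specificity, which traces out the curve from one corner of the plot to the other.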
Figure 3. Decision tree complexity. The histogram visualizes the distribution of decision tree complexity in the final random forest classifier. Here, decision tree complexity is measured by the number of terminal nodes.
Figure 4. Voting distribution. The histogram shows how frequently trees of the final classifier vote for class "positive". For each case (patient) only the votes of those trees are collected for which the considered case is "out-of-bag". For each case, votes are normalized as follows: (number of votes for class "positive" - number of votes for class "negative") / (number of trees for which the considered case is "out-of-bag"). Dashed vertical lines correspond to quantiles at 0%, 25%, 50%, 75%, and 100%.
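The normalization in this caption can be written out as a short sketch. It assumes that every tree for which the case is out-of-bag casts exactly one vote, so the number of such trees equals the sum of both vote counts. The function name and the vote counts are hypothetical.

```python
def normalized_vote(votes_positive, votes_negative):
    """Normalized vote for one case, ranging from -1 to +1.

    Assumes each out-of-bag tree votes exactly once, so the number of
    out-of-bag trees is votes_positive + votes_negative.
    """
    n_oob_trees = votes_positive + votes_negative
    if n_oob_trees == 0:
        raise ValueError("case was never out-of-bag")
    return (votes_positive - votes_negative) / n_oob_trees


# Invented example: 30 of 40 out-of-bag trees vote "positive".
score = normalized_vote(30, 10)
```

A value of +1 means all out-of-bag trees voted "positive", -1 means all voted "negative", and 0 means an even split.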
Figure 5A - E. Scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da. The horizontal axis shows the raw relative signal intensity of the peaks in the examined serum samples. Here, "raw" refers to the non-logarithmic and not additionally normalized intensities; see Figures 6 and 7 for further intensity transformations. □ T (Tumour): Breast cancer & DCIS patients' serum samples, o C (Control): Healthy & diseased control patients' serum samples.
Figure 6A - E. Scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da. The horizontal axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. For each mass, intensities were first shifted to entirely positive values and then normalized by dividing the intensity values by the average intensity of that mass. Finally, the base 2 logarithm was taken. Accordingly, zero logarithmic normalized relative intensity refers to mean peak cluster intensity, and logarithmic normalized relative intensities of +1 and -1 mean two-fold over- and under-expression relative to mean peak cluster intensity, respectively. □ T (Tumour): Breast cancer & DCIS patients' serum samples, o C (Control): Healthy & diseased control patients' serum samples.
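The transformation described in this caption can be sketched as below. The caption does not say how far the intensities were shifted, so the `offset` parameter (the value given to the lowest intensity when a shift is needed) is an assumption, as are the function name and the example intensities.

```python
import math


def log2_normalized(intensities, offset=1.0):
    """Shift to positive values, divide by the mean, take the base 2 logarithm.

    intensities: raw intensities of one peak cluster across samples.
    offset: assumed value assigned to the lowest intensity if a shift is
        needed; the source does not specify the size of the shift.
    """
    lowest = min(intensities)
    shift = (offset - lowest) if lowest <= 0 else 0.0
    shifted = [value + shift for value in intensities]
    mean = sum(shifted) / len(shifted)
    return [math.log2(value / mean) for value in shifted]


# Invented intensities with a mean of 4.0.
values = log2_normalized([2.0, 4.0, 8.0, 2.0])
```

In this invented example, the sample at the mean maps to 0, the sample at twice the mean maps to +1, and the samples at half the mean map to -1, matching the interpretation given in the caption.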
Figure 7A - E. Additionally scaled scatter plots of peak clusters belonging to differentially expressed proteins included in the classifier. Peak clusters are aligned along the vertical axis, e.g. M1516.00 denotes the peak cluster with characteristic mass 1516 Da. As in Figure 6, the horizontal axis shows the logarithmic normalized relative signal intensity of the peaks in the examined serum samples. However, intensities were additionally (shifted and) scaled so that the intensities of each peak cluster cover the entire horizontal range. Thereby, the minimum and maximum intensities of all masses are aligned on the left and right edge of the plot, respectively. This allows better visualization of the extent of class overlap. □ T (Tumour): Breast cancer & DCIS patients' serum samples, o C (Control): Healthy & diseased control patients' serum samples.
DESCRIPTION OF THE INVENTION
It is to be understood that the present invention is not limited to the particular materials and methods described or equipment, as these may vary. It is also to be understood that the terminology used herein is for the purpose of describing particular embodiments only, and is not intended to limit the scope of the present invention, which will be limited only by the appended claims.
It should be noted that as used herein and in the appended claims, the singular forms "a," "an," and "the" include plural reference unless the context clearly dictates otherwise. Thus, for example, a reference to "an antibody" is a reference to one or more antibodies and derivatives thereof known to those skilled in the art, and so forth.
Unless defined otherwise, all technical and scientific terms used herein have the same meanings as commonly understood by one of ordinary skill in the art. Although any materials and methods, or equipment comparable to those specifically described herein can be used to practice or test the present invention, the preferred equipment, materials and methods are described below. All publications mentioned herein are cited for the purpose of describing and disclosing protocols, reagents, and current state of the art technologies that might be used in connection with the invention. Nothing herein is to be construed as an admission that the invention is not entitled to precede such disclosure by virtue of prior invention.
Definitions
The term "biomolecule" refers to a molecule produced by a cell or living organism. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, proteins, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Furthermore, the terms "nucleotide" or "polynucleotide" refer to a nucleotide, oligonucleotide, polynucleotide, or any fragment thereof. These phrases also refer to DNA or RNA of genomic or synthetic origin which may be single-stranded or double-stranded and may represent the sense or the antisense strand, to peptide polynucleotide sequences (i.e. peptide nucleic acids; PNAs), or to any DNA-like or RNA-like material.
The term "fragment" refers to a portion of a polypeptide (parent) sequence that comprises at least 10 consecutive amino acid residues and retains a biological activity and/or some functional characteristics of the parent polypeptide e.g. antigenicity or structural domain characteristics.
The terms "biological sample" and "test sample" refer to all biological fluids and excretions isolated from any given subject. In the context of the invention such samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples.
The term "specific binding" refers to the binding reaction between a biomolecule and a specific "binding molecule". Related to the invention are binding molecules that include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins). Furthermore, a binding reaction is considered to be specific when the interaction between said molecules is substantial. In the context of the invention, a binding reaction is considered substantial when the reaction that takes place between said molecules is at least two times the background. Moreover, the term "specific binding conditions" refers to reaction conditions that permit the binding of said molecules such as pH, salt, detergent and other conditions known to those skilled in the art.
The term "interaction" relates to the direct or indirect binding or alteration of biological activity of a biomolecule.
The term "differential diagnosis" refers to a diagnostic decision between healthy and different disease states, including various stages of a specific disease. A subject is diagnosed as healthy or to be suffering from a specific disease, or a specific stage of a disease based on a set of hypotheses that allow for the distinction between healthy and one or more stages of the disease. The choice between healthy and one or more stages of disease depends on a significant difference between each hypothesis. Under the same principle, a "differential diagnosis" may also refer to a diagnostic decision between one disease type as compared to another (e.g. breast cancer vs. a non-malignant disease of the breast).
The term "breast cancer" refers to a malignant neoplastic lesion of the breast within a given subject, wherein the neoplasm is defined according to its type, stage and/or grade. The various stages of a cancer may be identified using staging systems known to those skilled in the art [e.g. Union Internationale Contre le Cancer (UICC) system or American Joint Committee on Cancer (AJCC)]. It is to be understood that the term "breast cancer" is also referred to as "mammary cancer" or "a carcinoma of the breast". Within the context of the invention, breast cancer includes both in situ (non-invasive) and invasive breast cancers. Whereas in situ (non-invasive) breast cancers include ductal and lobular carcinoma in situ (DCIS and LCIS, respectively), invasive breast cancers encompass infiltrating diseases such as invasive ductal, lobular and papillary carcinomas (DCIS and LCIS) and medullary, colloid, and tubular carcinomas.
The term "a non-malignant disease of the breast" refers to a lesion of the breast that does not exhibit malignant neoplastic physiological, biochemical, and/or morphological properties known to those skilled in the art. Such diseases include, but are not limited to, inflammatory and proliferative lesions, fibrocystic changes within mammary tissue as well as benign disorders of the breast. Within the context of the invention, inflammatory lesions encompass acute, periductal and granulomatous mastitis, duct ectasia, and fat necrosis, whereas proliferative lesions include epithelial hyperplasia (atypical ductal and lobular hyperplasia), sclerosing adenosis, and small duct papillomas. Also included in the invention are benign disorders of the glandular tissue (mastopathy), papillomas (large duct, intraductal), and fibroadenomas.
The term "healthy individual" refers to a subject possessing good health. Such a subject demonstrates an absence of any disease within the breast; preferably an absence of a non-malignant disease of the breast or breast cancer.
The term "precancerous lesion of the breast" refers to a biological change within the breast such that it becomes susceptible to the development of a cancer. More specifically, a precancerous lesion of the breast is a preliminary stage of a breast cancer. Causes of a precancerous lesion may include, but are not limited to, genetic predisposition and exposure to cancer-causing agents (carcinogens); such cancer-causing agents include agents that cause genetic damage and induce neoplastic transformation of a cell. Furthermore, the phrase "neoplastic transformation of a cell" refers to an alteration in normal cell physiology and includes, but is not limited to, self-sufficiency in growth signals, insensitivity to growth-inhibitory (anti-growth) signals, evasion of programmed cell death (apoptosis), limitless replicative potential, sustained angiogenesis, and tissue invasion and metastasis.
The phrase "differentially present" refers to differences in the quantity of a biomolecule (of a particular apparent molecular mass) present in a sample from a subject as compared to a comparable sample. For example, a biomolecule is present at an elevated level, a decreased level or absent in samples of subjects having breast cancer compared to samples of subjects who do not have a cancer of the breast. Therefore in the context of the invention, the term "differentially present biomolecule" refers to the quantity of the biomolecule (of a particular apparent molecular mass) present within a sample taken from a subject having a breast cancer or a non-malignant disease of the breast as compared to a comparable sample taken from a healthy subject. Within the context of the invention, a biomolecule is differentially present between two samples if the quantity of said biomolecule in one sample is significantly different (defined statistically) from the quantity of said biomolecule in another sample.
The term "diagnostic assay" can be used interchangeably with "diagnostic method" and refers to the detection of the presence or nature of a pathologic condition. Diagnostic assays differ in their sensitivity and specificity. Within the context of the invention the sensitivity of a diagnostic assay is defined as the percentage of diseased subjects who test positive for a breast cancer or a non-malignant disease of the breast, and are considered "true positives". Subjects having either a breast cancer or a non-malignant disease of the breast, but who are not detected by the diagnostic assay, are considered to be "false negatives". Subjects who show no disease, whether a breast cancer or a non-malignant disease of the breast, and who test negative in the diagnostic assay are considered to be "true negatives". Furthermore, the term "specificity" of a diagnostic assay, as used herein, is defined as 1 minus the false positive rate, where the "false positive rate" is defined as the proportion of those subjects devoid of a non-malignant disease of the breast or a breast cancer, but who test positive in said assay.
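The two definitions above translate directly into arithmetic on the four outcome counts. The following is a minimal sketch; the function names and the counts are hypothetical and do not represent results from the source.

```python
def sensitivity(true_positives, false_negatives):
    """Fraction of diseased subjects who test positive."""
    return true_positives / (true_positives + false_negatives)


def specificity(true_negatives, false_positives):
    """1 minus the false positive rate, as defined in the text."""
    false_positive_rate = false_positives / (false_positives + true_negatives)
    return 1.0 - false_positive_rate


# Invented counts: 50 diseased subjects and 50 disease-free subjects.
sens = sensitivity(true_positives=45, false_negatives=5)
spec = specificity(true_negatives=40, false_positives=10)
```

Multiplying either value by 100 gives the percentage form used in the definition of sensitivity.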
The term "adsorbent" refers to any material that is capable of accumulating (binding) a given biomolecule. The adsorbent typically coats a biologically active surface and is composed of a single material or a plurality of different materials that are capable of binding a biomolecule. Such materials include, but are not limited to, anion exchange materials, cation exchange materials, metal chelators, polynucleotides, oligonucleotides, peptides, antibodies, etc.
The term "biologically active surface" refers to any two- or three-dimensional extension of a material that biomolecules can bind to, or interact with, due to the specific biochemical properties of this material and those of the biomolecules. Such biochemical properties include, but are not limited to, ionic character (charge), hydrophobicity, or hydrophilicity.
The term "binding molecule" refers to a molecule that displays an affinity for another molecule. Within the context of the invention such molecules may include, but are not limited to, nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polypeptides, carbohydrates, lipids, and combinations thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins). Preferably, such binding molecules are antibodies.
The term "solution" refers to a homogeneous mixture of two or more substances. Solutions may include, but are not limited to buffers, substrate solutions, elution solutions, wash solutions, detection solutions, standardisation solutions, chemical solutions, solvents, etc. Furthermore, other solutions known to those skilled in the art are also included herein.
The term "mass profile" refers to a mass spectrum as a characteristic property of a given sample or a group of samples. Such a profile, when compared to the mass profile of a second sample or group of samples, will allow for the differentiation between the two samples. In the context of the invention, the mass profile is obtained by treating the biological sample as follows. The sample is diluted 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% ampholine and subsequently diluted 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5. The thus pre-treated sample is applied to a biologically active surface comprising positively charged quaternary ammonium groups (anion exchange surface) and incubated for 120 minutes. The biomolecules bound to the surface are analysed by gas phase ion spectrometry as described in another section. All but the dilution steps are performed at 20 to 24°C. Dilution steps are performed at 0 to 4°C.
The phrase "apparent molecular mass" refers to the molecular mass value in Dalton (Da) of a biomolecule as it may appear in a given method of investigation, e.g. size exclusion chromatography, gel electrophoresis, or mass spectrometry.
The term "chromatography" refers to any method of separating biomolecules within a given sample such that the original native state of a given biomolecule is retained. Separation of a biomolecule from other biomolecules within a given sample for the purpose of enrichment, purification and/or analysis, may be achieved by methods including, but not limited to, size exclusion chromatography, ion exchange chromatography, hydrophobic and hydrophilic interaction chromatography, metal affinity chromatography, wherein "metal" refers to metal ions (e.g. nickel, copper, gallium, or zinc) of all chemically possible valences, or ligand affinity chromatography wherein "ligand" refers to binding molecules, preferably proteins, antibodies, or DNA. Generally, chromatography uses biologically active surfaces as adsorbents to selectively accumulate certain biomolecules.
The term "mass spectrometry" refers to a method comprising employing an ionization source to generate gas phase ions from a biological entity of a sample presented on a biologically active surface, and detecting the gas phase ions with a mass spectrometer.
The phrase "laser desorption mass spectrometry" refers to a method comprising the use of a laser as an ionization source to generate gas phase ions from a biomolecule presented on a biologically active surface, and detecting the gas phase ions with a mass spectrometer.
The term "mass spectrometer" refers to a gas phase ion spectrometer that includes an inlet system, an ionisation source, an ion optic assembly, a mass analyser, and a detector.
Within the context of the invention, the terms "detect", "detection" or "detecting" refer to the identification of the presence, absence, or quantity of a biomolecule.
The term "energy absorbing molecule" or "EAM" refers to a molecule that absorbs energy from an energy source in a mass spectrometer thereby enabling desorption of a biomolecule from a biologically active surface. Cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid are frequently used as energy-absorbing molecules in laser desorption of biomolecules. See U.S. Pat. No. 5,719,060 (Hutchens & Yip) for a further description of energy absorbing molecules.
The term "training set" refers to a subset of the respective entire available data set. This subset is typically randomly selected, and is solely used for the purpose of classifier construction.
The term "test set" refers to a subset of the entire available data set consisting of those entries not included in the training set. Test data is applied to evaluate classifier performance.
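A random partition into training set and test set, as defined above, might look like the following sketch. The 70/30 proportion, the fixed seed and the function name are assumptions chosen for illustration; the definitions do not specify them.

```python
import random


def split_train_test(entries, train_fraction=0.7, seed=0):
    """Randomly split entries into a training set and a test set.

    train_fraction and seed are illustrative assumptions. The test set
    consists of exactly those entries not placed in the training set.
    """
    rng = random.Random(seed)
    shuffled = list(entries)
    rng.shuffle(shuffled)
    n_train = int(round(train_fraction * len(shuffled)))
    return shuffled[:n_train], shuffled[n_train:]


# Ten invented sample identifiers.
train_set, test_set = split_train_test(range(10))
```

Because the test entries never take part in classifier construction, performance measured on them is not biased by the training process.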
The term "decision tree" refers to a flow-chart-like tree structure employed for classification. Decision trees consist of repeated splits of a data set into subsets. Each split consists of a simple rule applied to one variable, e.g., "if value of 'variable 1' larger than 'threshold 1' then go left else go right". Accordingly, the given feature space is partitioned into a set of rectangles with each rectangle assigned to one class.
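The repeated one-variable splits described above can be illustrated with a small hand-written tree. The variable names (styled after the peak cluster labels used in the figures), the thresholds and the class assignments are invented for illustration and are not the classifier of the invention.

```python
# Each inner node tests one variable against one threshold; leaves carry a class.
# "left" is taken when the value is larger than the threshold, as in the rule above.
TREE = {
    "variable": "M12656",
    "threshold": 1.5,
    "left": {"label": "positive"},
    "right": {
        "variable": "M4107",
        "threshold": 0.8,
        "left": {"label": "negative"},
        "right": {"label": "positive"},
    },
}


def classify(sample, node=TREE):
    """Follow the splitting rules from the root until a leaf is reached."""
    while "label" not in node:
        if sample[node["variable"]] > node["threshold"]:
            node = node["left"]
        else:
            node = node["right"]
    return node["label"]


# Invented sample with two intensity values.
result = classify({"M12656": 1.0, "M4107": 1.0})
```

Each leaf corresponds to one rectangle of the feature space: here the three leaves cover the regions M12656 > 1.5, M12656 <= 1.5 with M4107 > 0.8, and M12656 <= 1.5 with M4107 <= 0.8.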
The terms "ensemble", "tree ensemble" or "ensemble classifier" can be used interchangeably and refer to a classifier that consists of many simpler elementary classifiers, e.g., an ensemble of decision trees is a classifier consisting of decision trees. The result of the ensemble classifier is obtained by combining all the results of its constituent classifiers, e.g., by majority voting that weights all constituent classifiers equally. Majority voting is especially reasonable in the case of bagging, where constituent classifiers are then naturally weighted by the frequency with which they are generated.
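Equal-weight majority voting over constituent classifiers can be sketched as follows. The three one-rule classifiers, their variable names and their thresholds are invented for illustration. Ties are resolved in favour of the class that was voted for first, which is an arbitrary choice of this sketch.

```python
from collections import Counter


def ensemble_predict(classifiers, sample):
    """Return the class that receives the most votes, each vote weighted equally."""
    votes = [classifier(sample) for classifier in classifiers]
    return Counter(votes).most_common(1)[0][0]


# Three invented single-split classifiers standing in for decision trees.
stumps = [
    lambda s: "positive" if s["M12656"] > 1.5 else "negative",
    lambda s: "positive" if s["M4107"] > 0.8 else "negative",
    lambda s: "positive" if s["M8939"] > 2.0 else "negative",
]

# Two of the three classifiers vote "positive" for this invented sample.
prediction = ensemble_predict(stumps, {"M12656": 2.0, "M4107": 1.0, "M8939": 0.0})
```

Using an odd number of constituent classifiers avoids ties in the two-class case.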
The term "competitor" refers to a variable that can be used as an alternative splitting rule in a decision tree. Within the context of the invention, the competitor is the apparent molecular mass of a given biomolecule. In each step of decision tree construction, only the variable yielding the best data-splitting is selected. Competitors are non-selected variables with similar but lower performance than the selected variable. They point into the direction of alternative decision trees.
The term "surrogate" refers to a splitting rule that closely mimics the action of the primary split. A surrogate is a variable that can substitute a selected decision tree variable, e.g. in the case of missing values. Not only must a good surrogate split the parent node into descendant nodes similar in size and composition to the primary descendant nodes, it must also match the primary split on the specific cases that go to the left child and right child nodes.
The terms "peak" and "signal" may be used interchangeably, and refer to any signal which is generated by a biomolecule when under investigation using a specific method, for example chromatography, mass spectrometry, or any type of spectroscopy like Ultraviolet/Visible Light (UV/Vis) spectroscopy, Fourier Transform Infrared (FTIR) spectroscopy, Electron Paramagnetic Resonance (EPR) spectroscopy, or Nuclear Magnetic Resonance (NMR) spectroscopy.
Within the context of the invention, the terms "peak" and "signal" refer to the signal generated by a biomolecule of a certain molecular mass hitting the detector of a mass spectrometer, thus generating a signal intensity which correlates with the amount or concentration of said biomolecule in a given sample. A "peak" or "signal" is defined by two values: an apparent molecular mass value (m/z) and an intensity value generated as described. The mass value is an elemental characteristic of a biological entity, whereas the intensity value corresponds to a certain amount or concentration of a biological entity with the corresponding apparent molecular mass value, and thus "peak" and "signal" always refer to the properties of this biological entity.
The term "cluster" refers to a signal or peak present in a certain set of mass spectra or mass profiles obtained from different samples belonging to two or more different groups (e.g. cancer and non-cancer). Within the set, signals belonging to a cluster can differ in their intensities, but not in the apparent molecular masses.
The term "variable" refers to a cluster which is subjected to a statistical analysis aiming towards a classification of samples into two or more different sample groups (e.g. cancer and non-cancer) by using decision trees, wherein the sample feature relevant for classification is the intensity value of the variables in the analysed samples.
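The relationship between peaks, clusters and variables defined above can be sketched as a small data structure. This is an illustrative sketch only: the class and function names, the sample identifiers, the numeric values, the use of a mass window to assign peaks to a cluster, and the convention of recording 0.0 for a sample without a matching peak are all assumptions, not part of the specification.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Peak:
    mz: float         # apparent molecular mass (m/z), in Da
    intensity: float  # signal intensity


def variable_intensities(spectra, cluster_mz, tolerance_da):
    """Return one intensity per sample for the cluster centred at cluster_mz.

    Samples with no peak inside the mass window get 0.0 (assumed convention).
    If several peaks fall inside the window, the most intense one is used.
    """
    values = {}
    for sample_id, peaks in spectra.items():
        in_window = [p.intensity for p in peaks
                     if abs(p.mz - cluster_mz) <= tolerance_da]
        values[sample_id] = max(in_window, default=0.0)
    return values


# Made-up mass profiles of three samples.
spectra = {
    "sample_A": [Peak(8938.2, 14.5), Peak(4160.1, 3.2)],
    "sample_B": [Peak(8941.0, 2.1)],
    "sample_C": [Peak(4162.3, 7.7)],
}
print(variable_intensities(spectra, cluster_mz=8939.0, tolerance_da=45.0))
```

The resulting per-sample intensities form one column of the table on which a decision tree would be trained.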
Detailed Description of the invention
a) Diagnostics
The present invention relates to methods for the differential diagnosis of breast cancer and/or a non-malignant disease of the breast by detecting one or more differentially expressed biomolecules within a test sample of a given subject, comparing results with samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast, wherein the comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or a non-malignant disease of the breast.
In one aspect of the invention, a method for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast comprises: obtaining a test sample from a given subject, contacting said sample with an adsorbent present on a biologically active surface under specific binding conditions, allowing the biomolecules within the test sample to bind to said adsorbent, detecting one or more bound biomolecules using a detection method, wherein the detection method generates a mass profile of said sample, transforming mass profile data into a computer-readable form, and comparing the mass profile of said sample with a database containing mass profiles from comparable samples specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast. A comparison of mass profiles allows the medical practitioner to determine whether a subject is healthy, has a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer or a non-malignant disease of the breast based on the presence, absence or quantity of specific biomolecules.
In more than one embodiment, a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da may be detected within a given sample. Detection of a single or a combination of more than one biomolecule of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine. The denatured sample is then diluted 1:10 in a specific binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5), applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface.
According to the invention, a biomolecule with the molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
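Each listed mass is given with a tolerance window, so assigning an observed peak to a listed biomolecule amounts to a range check. A minimal sketch under stated assumptions follows: only a small subset of the (mass, tolerance) pairs listed above is reproduced, and the function name and the observed values are hypothetical.

```python
# Subset of the (apparent molecular mass, tolerance) pairs listed above, in Da.
LISTED_MASSES = [
    (2053, 10),
    (4161, 21),
    (8939, 45),
    (10682, 53),
    (13776, 69),
    (13812, 69),
    (14014, 70),
]


def match_listed_mass(observed_da, listed=LISTED_MASSES):
    """Return every listed mass whose tolerance window contains observed_da."""
    return [mass for mass, tol in listed if abs(observed_da - mass) <= tol]


print(match_listed_mass(8940.0))    # inside the 8939 Da ± 45 Da window
print(match_listed_mass(13800.0))   # inside two overlapping windows
print(match_listed_mass(5000.0))    # inside none of the subset's windows
```

The function returns a list rather than a single value because the windows of neighbouring entries can overlap, as with 13776 Da ± 69 Da and 13812 Da ± 69 Da.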
A biomolecule of the invention may include any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, glycosylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules or fragments thereof.
The methods for detecting these biomolecules have many applications. For example, a single biomolecule or a combination of more than one biomolecule selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da can be measured to differentiate between healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having a metastasized breast cancer or subjects with a non-malignant disease of the breast, and thus are useful as an aid in the diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject. 
Alternatively, said biomolecules may be used to diagnose a subject as healthy.
For example, a biomolecule having an apparent molecular mass of about 8940 Da is present only in biological samples from patients having a metastasised breast cancer. Mass profiling of two test samples from different subjects, X and Y, reveals the presence of a biomolecule with the apparent molecular mass of about 8940 Da in a sample from test subject X, and the absence of said biomolecule in the test sample from subject Y. The medical practitioner is able to diagnose subject X as having a potential metastasised breast cancer and subject Y as not having a metastasised breast cancer. In yet another example, three biomolecules having apparent molecular masses of about 2053 Da, 4161 Da and 10682 Da are present in varying quantities in samples specific for precancerous lesions and "early" breast cancers. The biomolecule having the apparent molecular mass of 2053 Da is more abundant in samples specific for precancerous lesions of the breast than in samples specific for "early" breast cancers. A biomolecule having an apparent molecular mass of 4161 Da is detected in samples from subjects having "early" breast cancers but not in those having a precancerous lesion, whereas the biomolecule having the molecular mass of 10682 Da is present in about the same quantity in both sample types. Such biomolecules are not present in samples from healthy subjects, which contain only those having apparent molecular masses of 14014 Da and 9377 Da. Analysis of a test sample reveals the presence of biomolecules having the molecular mass of 10682 Da, 2053 Da and 4161 Da. Comparison of the quantity of the biomolecules within said sample reveals that the biomolecule with an apparent molecular mass of 2053 Da is present at lower levels than those found in samples from subjects having a precancerous lesion. The medical practitioner is able to diagnose the test subject as having an "early" breast cancer. These examples are solely used for the purpose of clarification and are not intended to limit the scope of this invention.
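The reasoning in these clarifying examples can be restated as a toy lookup. The sketch below is a simplification under stated assumptions: it considers only the presence or absence of the nominal masses named in the examples (the quantity comparison for the 2053 Da biomolecule is omitted), the order of the checks and the function name are hypothetical, and it is no more a diagnostic rule than the examples themselves.

```python
# Nominal masses (Da) that the example attributes to healthy samples only.
HEALTHY_ONLY = {14014, 9377}


def illustrative_call(present):
    """Map a set of detected nominal masses (Da) to the example's outcome."""
    present = set(present)
    if 8940 in present:
        return "potential metastasised breast cancer"
    if {4161, 10682} <= present:
        return '"early" breast cancer'
    if {2053, 10682} <= present:
        return "precancerous lesion of the breast"
    if present and present <= HEALTHY_ONLY:
        return "healthy"
    return "undetermined"


print(illustrative_call({8940}))                # subject X in the first example
print(illustrative_call({10682, 2053, 4161}))   # test sample in the second example
```

A classifier trained on real mass profiles, such as the decision tree approach described in the database section, would replace these hand-written checks.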
In another aspect of the invention, an immunoassay can be used to determine the presence or absence of a biomolecule within a test sample of a subject. The presence or absence of a biomolecule within a sample can be detected using the various immunoassay methods known to those skilled in the art (e.g. ELISA, western blots). If a biomolecule is present in the test sample, it will form an antibody-biomolecule complex with an antibody that specifically binds said biomolecule under suitable incubation conditions. The amount of an antibody-biomolecule complex can be determined by comparing to a standard.
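Comparison to a standard is commonly done by reading the measured signal off a standard curve. The sketch below assumes piecewise-linear interpolation between standards of known amount; the specification does not state how the comparison is made, and the function name, units and values are hypothetical.

```python
def interpolate_amount(signal, standards):
    """Estimate an amount from a signal by linear interpolation.

    standards: (known_amount, measured_signal) pairs whose signal is
    assumed to increase with amount.
    """
    points = sorted(standards, key=lambda pair: pair[1])
    for (a0, s0), (a1, s1) in zip(points, points[1:]):
        if s0 <= signal <= s1:
            if s1 == s0:  # two standards with identical signal
                return a0
            return a0 + (a1 - a0) * (signal - s0) / (s1 - s0)
    raise ValueError("signal lies outside the range of the standards")


# Made-up standards: (amount in arbitrary units, measured signal).
standards = [(0.0, 0.0), (10.0, 0.5), (20.0, 1.0)]
print(interpolate_amount(0.75, standards))
```

Signals outside the range of the standards raise an error rather than being extrapolated, since extrapolation beyond the measured standards is generally unreliable.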
Thus the invention provides a method for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast comprising: detecting one or more differentially expressed biomolecules within a sample. This method comprises obtaining a test sample from a subject, contacting said sample with a binding molecule specific for a differentially expressed polypeptide, detecting an interaction between the binding molecule and its specific polypeptide, wherein the detection of an interaction indicates the presence or absence of said polypeptide, thereby allowing for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer and/or a non-malignant disease of the breast. Binding molecules include, but are not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules. Preferably, binding molecules are antibodies specific for biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da.
In another aspect of the invention, a method for detecting the differential presence or absence of one or more biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da in a test sample of a subject involves contacting the test sample with a compound or agent capable of detecting said biomolecule such that the presence or absence of said biomolecule is directly and/or indirectly labelled. For example, a fluorescently labelled secondary antibody can be used to detect a primary antibody bound to its specific biomolecule. Furthermore, such detection methods can be used to detect a variety of biomolecules within a test sample both in vitro as well as in vivo.
For example, in vivo, antibodies or fragments thereof may be utilised for the detection of a biomolecule in a biological sample comprising: applying a labelled antibody directed against a given biomolecule of the invention to said sample under conditions that favour an interaction between the labelled antibody and its corresponding protein. Depending on the nature of the biological sample, it is possible to determine not only the presence of a biomolecule, but also its cellular distribution. For example, in a blood serum sample, only the serum levels of a given biomolecule can be detected, whereas its level of expression and cellular localisation can be detected in histological samples. It will be obvious to those skilled in the art that a wide variety of methods can be modified in order to achieve such detection.
For example, an antibody coupled to an enzyme is detected using a chromogenic substrate that is recognised and cleaved by the enzyme to produce a chemical moiety, which is readily detected using spectrometric, fluorimetric or visual means. Enzymes used for labelling include, but are not limited to, malate dehydrogenase, staphylococcal nuclease, delta-5-steroid isomerase, yeast alcohol dehydrogenase, alpha-glycerophosphate dehydrogenase, triose phosphate isomerase, horseradish peroxidase, alkaline phosphatase, asparaginase, glucose oxidase, beta-galactosidase, ribonuclease, urease, catalase, glucose-6-phosphate dehydrogenase, glucoamylase and acetylcholinesterase. Detection may also be accomplished by visual comparison of the extent of the enzymatic reaction of a substrate with that of similarly prepared standards. Alternatively, radiolabelled antibodies can be detected using a gamma or a scintillation counter, or they can be detected using autoradiography. In another example, fluorescently labelled antibodies are detected based on the level at which the attached compound fluoresces following exposure to a given wavelength. Fluorescent compounds typically used in antibody labelling include, but are not limited to, fluorescein isothiocyanate, rhodamine, phycoerythrin, phycocyanin, allophycocyanin, o-phthalaldehyde and fluorescamine. In yet another example, antibodies coupled to a chemi- or bioluminescent compound can be detected by determining the presence of luminescence. Such compounds include, but are not limited to, luminol, isoluminol, theromatic acridinium ester, imidazole, acridinium salt, oxalate ester, luciferin, luciferase and aequorin.
Furthermore, in vivo techniques for the detection of a biomolecule of the invention include introducing into a subject a labelled antibody directed against a given polypeptide or fragment thereof.
In more than one embodiment of the invention, the test sample used for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, test samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Furthermore, test samples used for the methods of the invention are isolated from subjects of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
In addition, the methods of the invention for the differential diagnosis of healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasized breast cancer or subjects having a non-malignant disease of the breast described herein may be combined with other diagnostic methods to improve the outcome of the differential diagnosis. Other diagnostic methods are known to those skilled in the art.
b) Database
In another aspect of the invention, a database comprising mass profiles specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer, or subjects having a non-malignant disease of the breast is generated by contacting biological samples isolated from above-mentioned subjects with an adsorbent on a biologically active surface under specific binding conditions, allowing the biomolecules within said sample to bind said adsorbent, detecting one or more bound biomolecules using a detection method wherein the detection method generates a mass profile of said sample, transforming the mass profile data into a computer-readable form and applying a mathematical algorithm to classify the mass profile as specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer or subjects having a non-malignant disease of the breast.
According to the invention, the classification of said mass profiles is performed using the "CART" decision tree approach [classification and regression trees; Breiman et al. (1984) Classification and regression trees. Wadsworth International, Belmont, California] and is known to those skilled in the art. Furthermore, bagging of classifiers is applied to overcome typical instabilities of forward variable selection procedures, thereby increasing overall classifier performance [Breiman L. (1996) Bagging Predictors. Machine learning 24: 123-140].
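The combination of tree-based classification and bagging can be illustrated with a deliberately small sketch. It is not the CART implementation referred to above: each constituent tree is reduced to a single threshold split chosen by Gini impurity, and the function names, the random seed, the number of trees and the training data are all hypothetical.

```python
import random
from collections import Counter


def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())


def fit_stump(X, y):
    """Fit a depth-one tree: pick the (variable, threshold) pair that
    minimises the weighted Gini impurity of the two child nodes."""
    best = None
    n = len(y)
    for j in range(len(X[0])):
        values = sorted({row[j] for row in X})
        for lo, hi in zip(values, values[1:]):
            thr = (lo + hi) / 2.0
            left = [y[i] for i in range(n) if X[i][j] <= thr]
            right = [y[i] for i in range(n) if X[i][j] > thr]
            score = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or score < best[0]:
                best = (score, j, thr,
                        Counter(left).most_common(1)[0][0],
                        Counter(right).most_common(1)[0][0])
    if best is None:  # no split possible: always predict the majority class
        label = Counter(y).most_common(1)[0][0]
        return (0, float("inf"), label, label)
    return best[1:]


def predict_stump(stump, row):
    j, thr, left_label, right_label = stump
    return left_label if row[j] <= thr else right_label


def fit_bagged(X, y, n_trees=25, seed=0):
    """Bagging: fit each stump on a bootstrap resample of the training set."""
    rng = random.Random(seed)
    n = len(y)
    ensemble = []
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]
        ensemble.append(fit_stump([X[i] for i in idx], [y[i] for i in idx]))
    return ensemble


def predict_bagged(ensemble, row):
    """Equal-weight majority vote over all constituent stumps."""
    votes = Counter(predict_stump(s, row) for s in ensemble)
    return votes.most_common(1)[0][0]


# Made-up intensities of two variables (clusters) in six training samples.
X = [[1.0, 5.0], [1.2, 4.8], [0.9, 5.3], [3.1, 5.1], [3.4, 4.9], [2.9, 5.2]]
y = ["non-cancer"] * 3 + ["cancer"] * 3
ensemble = fit_bagged(X, y)
print(predict_bagged(ensemble, [3.0, 5.0]))
```

Because every stump is fitted on a different bootstrap resample, the selected variable and threshold vary from tree to tree, and the vote averages out that instability.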
In more than one embodiment, one or more biomolecules selected from the group having an apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da may be detected within a given biological sample. Detection of said biomolecules of the invention is based on specific sample pre-treatment conditions, the pH of binding conditions, and the type of biologically active surface used for the detection of biomolecules.
Within the context of the invention, biomolecules within a given sample are bound to an adsorbent on a biologically active surface under specific binding conditions, for example, the biomolecules within a given sample are applied to a biologically active surface comprising positively-charged quaternary ammonium groups (cationic) and incubated with 0.1 M Tris-HCl, 0.02% Triton X-100 at a pH of 8.5 to allow for specific binding. Biomolecules that bind to said biologically active surface under these conditions are negatively charged molecules. It should be noted that although the biomolecules of the invention are bound to a cationic adsorbent comprising positively-charged quaternary ammonium groups, the biomolecules are capable of binding other types of adsorbents, as described in another section using binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
According to the invention, a biomolecule with the molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
In one embodiment of the invention, biological samples used to generate a database of mass profiles for healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having a metastasised breast cancer or subjects having a non-malignant disease of the breast, may be of blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract origin. Preferably, biological samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Furthermore, the biological samples related to the invention are isolated from subjects considered to be healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer or having a non-malignant disease of the breast. Said subjects are of mammalian origin, preferably of primate origin. Even more preferred are subjects of human origin.
A subject of the invention that is said to have a precancerous lesion of the breast displays preliminary stages of a cancer, wherein a cell and/or tissue has become susceptible to the development of a cancer as a result of either a genetic predisposition, exposure to a cancer-causing agent (carcinogen) or both.
A genetic predisposition may include a predisposition for an autosomal dominant inherited cancer syndrome which is generally indicated by a strong family history of uncommon cancer and/or an association with a specific marker phenotype, a familial cancer (e.g. familial relapsing a non-malignant disease of the breast) wherein an evident clustering of cancer is observed but the role of inherited predisposition may not be clear, or an autosomal recessive syndrome characterised by chromosomal or DNA instability. Cancer-causing agents include agents that stimulate genetic damage and induce neoplastic transformation of a cell. Such agents fall into three categories: 1) chemical carcinogens such as alkylating agents, polycyclic aromatic hydrocarbons, aromatic amines, azo dyes, nitrosamines and amides, asbestos, vinyl chloride, chromium, nickel, arsenic, and naturally occurring carcinogens (e.g. aflatoxin B1); 2) radiation such as ultraviolet (UV) and ionising radiation including electromagnetic (e.g. x-rays, γ-rays) and particulate radiation (e.g. α and β particles, protons, neutrons); 3) viral and microbial carcinogens such as human papillomavirus (HPV), Epstein-Barr virus (EBV), hepatitis B virus (HBV), human T-cell leukaemia virus type 1 (HTLV-1), or Helicobacter pylori. In addition, environmental factors have also been implicated in the predisposition to breast cancer. Such factors are known to those skilled in the art and include, but are not limited to, smoking, chronic alcohol intake, and the consumption of a high-energy diet rich in fats. Furthermore, breast cancer arises with greater frequency in patients with a chronic non-malignant disease of the breast.
Within the context of the invention, cancers of the breast are also referred to as mammary cancers or carcinomas of the breast. Breast cancers of the invention include both in situ (non-invasive) and invasive breast cancers. Whereas in situ (non-invasive) breast cancers include ductal and lobular carcinoma in situ (DCIS and LCIS, respectively), invasive breast cancers encompass infiltrating diseases such as invasive ductal, lobular and papillary carcinomas and medullary, colloid, and tubular carcinomas. Furthermore, breast cancers of the invention may also be of various stages, wherein the staging is based on the size of the primary lesion, its extent of spread to regional lymph nodes, and the presence or absence of blood-borne metastases (metastatic breast cancers). The various stages of a breast cancer may be identified using staging systems known to those skilled in the art [e.g. the Union Internationale Contre le Cancer (UICC) system or the American Joint Committee on Cancer (AJCC) system]. Also included are different grades of said breast cancers, wherein the grade of the cancer is based on the degree of differentiation of the epithelial cells within the lining of the breast and the number of mitoses as a correlate of a neoplasm's aggressiveness.
A subject said to have a non-malignant disease of the breast possesses a lesion of the breast that does not exhibit malignant neoplastic physiological, biochemical, and/or morphological properties known to those skilled in the art. Such diseases include, but are not limited to, inflammatory and proliferative lesions, fibrocystic changes within mammary tissue, as well as benign disorders of the breast. Within the context of the invention, inflammatory lesions encompass acute, periductal and granulomatous mastitis, duct ectasia, and fat necrosis, whereas proliferative lesions include epithelial hyperplasia (atypical ductal and lobular hyperplasia), sclerosing adenosis, and small duct papillomas. Also included in the invention are benign disorders of the glandular tissue (mastopathy), papillomas (large duct, intraductal), and fibroadenomas.
Healthy individuals, as related to certain embodiments of the invention, are those that possess good health, and demonstrate an absence of a breast cancer or a non-malignant disease of the breast.
c) Biomolecules
The differential expression of biomolecules in samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having a breast cancer, subjects having metastasised breast cancer, and subjects having a non-malignant disease of the breast, allows for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject.
Biomolecules are said to be specific for a particular clinical state (e.g. healthy, precancerous lesion of the breast, breast cancer, metastasised breast cancer, a non-malignant disease of the breast) when they are present at different levels within samples taken from subjects in one clinical state as compared to samples taken from subjects from other clinical states (e.g. in subjects with a precancerous lesion of the breast vs. in subjects with a metastasised breast cancer). Biomolecules may be present at elevated levels, at decreased levels, or altogether absent within a sample taken from a subject in a particular clinical state (e.g. healthy, precancerous lesion of the breast, breast cancer, metastasised breast cancer, a non-malignant disease of the breast). For example, biomolecules A and B are found at elevated levels in samples isolated from healthy subjects as compared to samples isolated from subjects having a precancerous lesion of the breast, a breast cancer, a metastatic breast cancer or a non-malignant disease of the breast, whereas biomolecules X, Y, Z are found at elevated levels and/or more frequently in samples isolated from subjects having a precancerous lesion of the breast as opposed to subjects in good health or subjects having a breast cancer, a metastasised breast cancer or a non-malignant disease of the breast. Biomolecules A and B are then said to be specific for healthy subjects, whereas biomolecules X, Y, Z are specific for subjects having a precancerous lesion of the breast.
Accordingly, the differential presence of one or more biomolecules found in a test sample compared to samples from healthy subjects, subjects with a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer, or a non-malignant disease of the breast, or the mere detection of one or more biomolecules in the test sample, provides useful information regarding the probability that a subject being tested has a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer or a non-malignant disease of the breast. This probability depends on whether the quantity of one or more biomolecules in a test sample taken from said subject is statistically significantly different from the quantity of one or more biomolecules in a biological sample taken from healthy subjects, subjects having a precancerous lesion of the breast, a breast cancer, a metastasised breast cancer, or a non-malignant disease of the breast.
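As an illustration of what "statistically significantly different" can mean in practice, the following is a minimal sketch of a two-sided permutation test on the peak intensities of one biomolecule in two groups of samples. The text does not prescribe a particular statistical test; the choice of a permutation test, the function name and the intensity values are assumptions made only for this example.

```python
import random
from statistics import mean


def permutation_p_value(group_a, group_b, n_permutations=10000, seed=0):
    """Two-sided permutation test on the difference of group means.

    group_a, group_b: peak intensities of one biomolecule measured in two
    clinical groups (hypothetical example data, not values from the text).
    """
    rng = random.Random(seed)
    observed = abs(mean(group_a) - mean(group_b))
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    hits = 0
    for _ in range(n_permutations):
        # Randomly reassign the pooled intensities to the two groups.
        rng.shuffle(pooled)
        diff = abs(mean(pooled[:n_a]) - mean(pooled[n_a:]))
        if diff >= observed:
            hits += 1
    # Add-one correction keeps the estimated p-value above zero.
    return (hits + 1) / (n_permutations + 1)


# Hypothetical intensities for a single mass peak in two groups.
healthy = [10, 11, 12, 13, 14, 15]
lesion = [1, 2, 3, 4, 5, 6]
p = permutation_p_value(healthy, lesion)
print(f"estimated p-value: {p:.4f}")
```

A small p-value suggests that the biomolecule is differentially present between the two groups; with many candidate masses, a correction for multiple testing would normally also be needed.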
A biomolecule of the invention may be any molecule that is produced by a cell or living organism, and may have any biochemical property (e.g. phosphorylated proteins, positively charged molecules, negatively charged molecules, hydrophobicity, hydrophilicity), but preferably biochemical properties that allow binding of the biomolecule to a biologically active surface comprising positively charged quaternary ammonium groups after denaturation in 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine and dilution in 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C followed by incubation on said biologically active surface for 120 minutes at 20 to 24°C. Such molecules include, but are not limited to, molecules comprising nucleotides, amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides (DNA or RNA), polypeptides, proteins, antibodies, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins). Preferably a biomolecule may be a nucleotide, polynucleotide, peptide, protein or fragments thereof. Even more preferred are peptide or protein biomolecules.
The biomolecules of the invention can be detected based on specific sample pre-treatment conditions, the pH of binding conditions, the type of biologically active surface used for the detection of biomolecules within a given sample, and their molecular mass. For example, prior to the detection of the biomolecules described herein, a given sample is pre-treated by diluting 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine. The denatured sample is then diluted 1:10 in 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5, applied to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups and incubated using specific buffer conditions (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) to allow for binding of said biomolecules to the above-mentioned biologically active surface. It should be noted that although the biomolecules of the invention are detected using a cationic adsorbent comprising positively charged quaternary ammonium groups, as well as specific pre-treatment and binding conditions, the biomolecules are capable of binding other types of adsorbents, as described below, using alternative pre-treatment and binding conditions known to those skilled in the art. Accordingly, some embodiments of the invention are not limited to the use of cationic adsorbents.
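The two dilution steps described above combine multiplicatively. The short sketch below works this out, assuming that "1:n" means one part of sample in n parts of total volume (the text does not state the convention, and under a "one part plus n parts" reading the numbers would differ). The function names and the 100 µl example volume are hypothetical.

```python
from fractions import Fraction


def overall_dilution(*steps):
    """Combine successive 1:n dilutions into a single dilution factor.

    Assumes "1:n" means one part sample in n parts total volume.
    """
    factor = Fraction(1)
    for n in steps:
        factor *= Fraction(1, n)
    return factor


def sample_volume_needed(final_volume_ul, *steps):
    """Volume of undiluted sample contained in the final diluted volume."""
    return final_volume_ul * overall_dilution(*steps)


# 1:5 in denaturation buffer, then 1:10 in binding buffer.
print(overall_dilution(5, 10))            # 1/50
print(sample_volume_needed(100, 5, 10))   # 2 (µl of original sample in 100 µl)
```

Under this assumption the sample applied to the surface is a 1:50 dilution of the original material.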
The biomolecules of the invention include biomolecules having a molecular mass selected from the group consisting of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da.
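To illustrate how a list of masses with tolerances can be used, the sketch below checks whether an observed peak mass falls inside any listed window. Only a short excerpt of the (mass, tolerance) pairs from the list above is included for brevity; the function name and the observed masses are hypothetical.

```python
# Excerpt of (mass in Da, tolerance in Da) pairs copied from the list above.
MARKER_MASSES = [
    (1506, 8),
    (1533, 8),
    (1623, 8),
    (1975, 10),
    (2017, 10),
    (28313, 142),
]


def match_markers(observed_mass, markers=MARKER_MASSES):
    """Return every listed (mass, tolerance) whose window contains observed_mass."""
    return [
        (mass, tol)
        for mass, tol in markers
        if abs(observed_mass - mass) <= tol
    ]


print(match_markers(1510))   # [(1506, 8)]
print(match_markers(1520))   # [] (between the 1506 and 1533 windows)
```

Because the tolerance windows of neighbouring masses can approach one another, the function returns a list rather than a single match.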
According to the invention, a biomolecule with the molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da is detected by diluting the biological sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, and 2% Ampholine, and then 1:10 in binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100 at pH 8.5 at 0 to 4°C, applying the thus treated sample to a biologically active surface comprising positively charged (cationic) quaternary ammonium groups (anion exchanging), incubating for 120 minutes at 20 to 24°C, and subjecting the bound biomolecules to gas phase ion spectrometry as described in another section.
Although said biomolecules were first identified in blood serum samples, their detection is not limited to said sample type. The biomolecules may also be detected in other sample types, such as blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract. Preferably, samples are of blood, blood serum, plasma, urine, excreta, prostatic fluid, biopsy, ascites, lymph or tissue extract origin. More preferred are blood, blood serum, plasma, urine, excreta, biopsy, lymph or tissue extract samples. Even more preferred are blood serum, urine, excreta or biopsy samples. Overall preferred are blood serum samples.
Since the biomolecules can be sufficiently characterized by their mass and biochemical characteristics such as the type of biologically active surface they bind to or the pH of binding conditions, it is not necessary to determine the identity of the biomolecules in order to be able to detect them in a sample. It should be noted that molecular mass and binding properties are characteristic properties of these biomolecules and not limitations on the means of detection or isolation. Furthermore, using the methods described herein, or other methods known in the art, the absolute identity of the markers can be determined. This is important when one wishes to develop and/or screen for specific binding molecules, or to develop an assay for the detection of said biomolecules using specific binding molecules.
d) Biologically Active Surfaces
In one embodiment of the invention, biologically active surfaces include, but are not restricted to, surfaces that contain adsorbents such as quaternary ammonium groups (anion exchange surfaces), carboxylate groups (cation exchange surfaces), alkyl or aryl chains (hydrophobic interaction, reverse phase chemistry), groups such as nitrilotriacetic acid that immobilize metal ions such as nickel, gallium, copper, or zinc (metal affinity interaction), or biomolecules such as proteins, preferably antibodies, or nucleic acids, preferably protein binding sequences, covalently bound to the surface via carbonyldiimidazole moieties or epoxy groups (specific affinity interaction). Preferred are adsorbents comprising anion exchange surfaces.
These surfaces may be located on matrices like polysaccharides such as sepharose, e.g. anion exchange surfaces or hydrophobic interaction surfaces, or solid metals, e.g. antibodies coupled to magnetic beads. Surfaces may also include gold-plated surfaces such as those used for Biacore Sensor Chip technology. Other surfaces known to those skilled in the art are also included within the scope of the invention.
Biologically active surfaces are able to adsorb biomolecules like amino acids, sugars, fatty acids, steroids, nucleic acids, polynucleotides, polypeptides, carbohydrates, lipids, and combinations thereof (e.g., glycoproteins, ribonucleoproteins, lipoproteins).
In another embodiment, devices that use biologically active surfaces to selectively adsorb biomolecules may be chromatography columns for Fast Protein Liquid Chromatography (FPLC) and High Pressure Liquid Chromatography (HPLC), where the matrix, e.g. a polysaccharide, carrying the biologically active surface, is filled into vessels (usually referred to as "columns") made of glass, steel, or synthetic materials like polyetheretherketone (PEEK).
In yet another embodiment, devices that use biologically active surfaces to selectively adsorb biomolecules may be metal strips carrying thin layers of the biologically active surface on one or more spots of the strip surface to be used as probes for gas phase ion spectrometry analysis, for example the SAX2 ProteinChip array (Ciphergen Biosystems, Inc.) for SELDI analysis.
e) Mass Profiling
In one embodiment, the mass profile of a sample may be generated using an array-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent present on a biologically active surface located on a solid platform ("array" or "probe"). After the biomolecules have bound to the adsorbent, they are detected using gas phase ion spectrometry. Biomolecules or other substances bound to the adsorbents on the probes can be analyzed using a gas phase ion spectrometer. This includes, e.g., mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. The quantity and characteristics of the biomolecule can be determined using gas phase ion spectrometry. Other substances in addition to the biomolecule of interest can also be detected by gas phase ion spectrometry.
In one embodiment, a mass spectrometer can be used to detect biomolecules on the probe. In a typical mass spectrometer, a probe with a biomolecule is introduced into an inlet system of the mass spectrometer. The biomolecule is then ionized by an ionization source, such as a laser, fast atom bombardment, or plasma. The generated ions are collected by an ion optic assembly, and then a mass analyzer disperses and analyzes the passing ions. Within the scope of this invention, the ionisation source that ionises the biomolecule is a laser.
The ions exiting the mass analyzer are detected by an ion detector. The ion detector then translates information of the detected ions into mass-to-charge ratios. Detection of the presence of a biomolecule or other substances will typically involve detection of signal intensity. This, in turn, can reflect the quantity and character of a biomolecule bound to the probe.
In another embodiment, the mass profile of a sample may be generated using a liquid-chromatography (LC)-based assay in which the biomolecules of a given sample are bound by biochemical or affinity interactions to an adsorbent located in a vessel made of glass, steel, or synthetic material, known to those skilled in the art as a chromatography column. The biomolecules are eluted from the biologically active surface by washing the vessel with appropriate solutions known to those skilled in the art. Such solutions include, but are not limited to, buffers, e.g. Tris(hydroxymethyl)aminomethane hydrochloride (Tris-HCl), buffers containing salt, e.g. sodium chloride (NaCl), or organic solvents, e.g. acetonitrile. Biomolecule mass profiles are generated by application of the eluting biomolecules of the sample by direct connection via an electrospray device to a mass spectrometer (LC/ESI-MS).
Conditions that promote binding of biomolecules to an adsorbent are known to those skilled in the art (reference) and ordinarily include parameters such as pH, the concentration of salt, organic solvent, or other competitors for binding of the biomolecule to the adsorbent. Within the scope of the invention, incubation temperatures range from 0 to 100°C, preferably from 4 to 60°C, and most preferably from 15 to 30°C. Varying additional parameters, such as incubation time, the concentration of detergent, e.g., 3-[(3-Cholamidopropyl)dimethylammonio]-2-hydroxy-1-propanesulfonate (CHAPS), or reducing agents, e.g. dithiothreitol (DTT), are also known to those skilled in the art. Various degrees of binding can be accomplished by combining the above stated conditions as needed, and will be readily apparent to those skilled in the art.
f) Methods for detecting biomolecules within a sample
In yet another aspect, the invention relates to methods for detecting differentially present biomolecules in a test sample and/or biological sample. Within the context of the invention, any suitable method can be used to detect one or more of the biomolecules described herein. For example, gas phase ion spectrometry can be used. This technique includes, e.g., laser desorption/ionization mass spectrometry. Preferably, the test and/or biological sample is prepared prior to gas phase ion spectrometry, e.g., by pre-fractionation, two-dimensional gel electrophoresis, high performance liquid chromatography, etc., to assist detection of said biomolecules. Detection of said biomolecules can also be achieved using methods other than gas phase ion spectrometry. For example, immunoassays can be used to detect the biomolecules within a sample.
In one embodiment, the test and/or biological sample is prepared prior to contacting a biologically active surface and is in aqueous form. Examples of said samples include, but are not limited to, blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples. Furthermore, solid test and/or biological samples, such as excreta or biopsy samples, can be solubilised in or admixed with an eluent using methods known to those skilled in the art such that said samples may be easily applied to a biologically active surface. Test and/or biological samples in the aqueous form can be further prepared using specific solutions for denaturation (pre-treatment) like sodium dodecyl sulphate (SDS), mercaptoethanol, urea, etc. For example, a test and/or biological sample of the invention can be denatured prior to contacting a biologically active surface comprising quaternary ammonium groups by diluting said sample 1:5 with a buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT and 2% Ampholine.
The sample is contacted with a biologically active surface using any technique, including bathing, soaking, dipping, spraying, washing over, or pipetting. Generally, a volume of sample containing from a few attomoles to 100 picomoles of a biomolecule in about 1 to 500 μl is sufficient for detecting binding of the biomolecule to the adsorbent.
The pH value of the solvent in which the sample contacts the biologically active surface is a function of the specific sample and the selected biologically active surface. Typically, a sample is contacted with a biologically active surface under pH values between 0 and 14, preferably between about 4 and 10, more preferably between 4.5 and 9.0, and most preferably, at pH 8.5. The pH value depends on the type of adsorbent present on a biologically active surface and can be adjusted accordingly.
The sample can contact the adsorbent present on a biologically active surface for a period of time sufficient to allow the marker to bind to the adsorbent. Typically, the sample and the biologically active surface are contacted for a period of between about 1 second and about 12 hours, preferably, between about 30 seconds and about 3 hours, and most preferably for 120 minutes.
The temperature at which the sample contacts the biologically active surface (incubation temperature) is a function of the specific sample and the selected biologically active surface. Typically, the sample is contacted with the biologically active surface at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
For example, a biologically active surface comprising quaternary ammonium groups (anion exchange surface) will bind the biomolecules described herein when the pH value is between 6.5 and 9.0. Optimal binding of the biomolecules of the present invention occurs at a pH of 8.5. Furthermore, a sample is contacted with said biologically active surface for 120 minutes at a temperature of 20 to 24°C.
After a sample or sample solution has been contacted with a biologically active surface, it is preferred to remove any unbound biomolecules so that only the bound biomolecules remain on the biologically active surface. Unbound biomolecules are removed by methods known to those skilled in the art such as bathing, soaking, dipping, rinsing, spraying, or washing the biologically active surface with an eluent or a washing solution. A microfluidics process is preferably used when a washing solution such as an eluent is introduced to small spots of adsorbents on the biologically active surface. Typically, the washing solution can be at a temperature of between 0 and 100°C, preferably between 4 and 37°C, and most preferably between 20 and 24°C.
Washing solutions or eluents used to wash the unbound biomolecules from a biologically active surface include, but are not limited to, organic solutions and aqueous solutions such as buffers, wherein a buffer may contain detergents, salts, or reducing agents in appropriate concentrations known to those skilled in the art.
Aqueous solutions are preferred for washing biologically active surfaces. Exemplary aqueous solutions include, but are not limited to, HEPES buffer, Tris buffer, phosphate buffered saline (PBS), and modifications thereof. The selection of a particular washing solution or an eluent is dependent on other experimental conditions (e.g., types of adsorbents used or biomolecules to be detected), and can be determined by those of skill in the art. For example, if a biologically active surface comprising a quaternary ammonium group as adsorbent (anion exchange surface) is used, then an aqueous solution, such as a Tris buffer, may be preferred. In another example, if a biologically active surface comprising a carboxylate group as adsorbent (cation exchange surface) is used, then an aqueous solution, such as an acetate buffer, may be preferred.
Optionally, an energy absorbing molecule (EAM), e.g. in solution, can be applied to biomolecules or other substances bound on the biologically active surface by spraying, pipetting or dipping. Applying an EAM can be done after unbound materials are washed off of the biologically active surface. Exemplary energy absorbing molecules include, but are not limited to, cinnamic acid derivatives, sinapinic acid and dihydroxybenzoic acid.
Once the biologically active surface is free of any unbound biomolecules, adsorbent-bound biomolecules are detected using gas phase ion spectrometry. The quantity and characteristics of a biomolecule can be determined using said method. Furthermore, said biomolecules can be analyzed using a gas phase ion spectrometer such as mass spectrometers, ion mobility spectrometers, or total ion current measuring devices. Other gas phase ion spectrometers known to those skilled in the art are also included.
In one embodiment, mass spectrometry can be used to detect biomolecules of a given sample present on a biologically active surface. Such methods include, but are not limited to, matrix-assisted laser desorption ionization/time-of-flight (MALDI-TOF), surface-enhanced laser desorption ionization time-of-flight (SELDI-TOF), liquid chromatography coupled with MS, MS-MS, or ESI-MS. Typically, biomolecules are analysed by introducing a biologically active surface containing said biomolecules into the instrument and ionizing said biomolecules to generate ions, which are then collected and analysed.
In a preferred embodiment, the biomolecules present in a sample are detected using gas phase ion spectrometry, and more preferably, using mass spectrometry. In one embodiment, matrix-assisted laser desorption/ionization ("MALDI") mass spectrometry can be used. In MALDI, the sample is partially purified to obtain a fraction that essentially consists of a biomolecule by employing such separation methods as two-dimensional gel electrophoresis (2D-gel) or high performance liquid chromatography (HPLC).
In another embodiment, surface-enhanced laser desorption/ionization mass spectrometry ("SELDI") can be used. SELDI uses a substrate comprising adsorbents to capture biomolecules, which can then be directly desorbed and ionized from the substrate surface during mass spectrometry. Since the substrate surface in SELDI captures biomolecules, a sample need not be partially purified as in MALDI. However, depending on the complexity of a sample and the type of adsorbents used, it may be desirable to prepare a sample to reduce its complexity prior to SELDI analysis.
For example, biomolecules bound to a biologically active surface can be introduced into an inlet system of the mass spectrometer. The biomolecules are then ionized by an ionization source such as a laser, fast atom bombardment, or plasma. The generated ions are then collected by an ion optic assembly, and then a mass analyzer disperses the passing ions. The ions exiting the mass analyzer are detected by a detector and translated into mass-to-charge ratios. Detection of the presence of a biomolecule typically involves detection of its specific signal intensity, and reflects the quantity and character of said biomolecule.
In a preferred embodiment, a laser desorption time-of-flight mass spectrometer is used with the probe of the present invention. In laser desorption mass spectrometry, biomolecules bound to a biologically active surface are introduced into an inlet system. Biomolecules are desorbed and ionized into the gas phase by a laser. The ions generated are then collected by an ion optic assembly. These ions are accelerated through a short high-voltage field and allowed to drift into a high vacuum chamber of a time-of-flight mass analyzer. At the far end of the high vacuum chamber, the accelerated ions collide with a detector surface at varying times. Since the time-of-flight is a function of the mass of the ions, the elapsed time between ionization and impact can be used to identify the presence or absence of molecules of a specific mass.
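The relationship between flight time and mass can be sketched for an idealised linear time-of-flight analyzer, in which an ion of mass m and charge z·e is accelerated through a potential U and then drifts a length L at constant velocity, giving t = L · sqrt(m / (2·z·e·U)). This simplified model ignores the time spent in the acceleration region and any initial energy spread; the accelerating voltage and drift length below are hypothetical values chosen for illustration, not parameters taken from the text.

```python
import math

E_CHARGE = 1.602176634e-19   # elementary charge in coulombs
DALTON = 1.66053906660e-27   # one dalton in kilograms


def flight_time(mass_da, charge=1, voltage=20000.0, drift_length=1.0):
    """Drift time in seconds for an ion in an idealised linear TOF analyzer.

    voltage (V) and drift_length (m) are assumed instrument parameters.
    """
    mass_kg = mass_da * DALTON
    velocity = math.sqrt(2 * charge * E_CHARGE * voltage / mass_kg)
    return drift_length / velocity


def mass_from_time(time_s, charge=1, voltage=20000.0, drift_length=1.0):
    """Invert the relation above: recover the mass in Da from the drift time."""
    velocity = drift_length / time_s
    return 2 * charge * E_CHARGE * voltage / velocity**2 / DALTON


for mass in (1506, 8939, 28313):
    print(f"{mass} Da -> {flight_time(mass) * 1e6:.2f} microseconds")
```

Heavier ions arrive later, and the flight time grows with the square root of the mass-to-charge ratio, which is why the elapsed time can be converted back into a mass.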
The detection of biomolecules described herein can be enhanced using certain selectivity conditions (e.g., types of adsorbents used or washing solutions). In a preferred embodiment, the same or substantially the same selectivity conditions that were used to discover the biomolecules can be used in the methods for detecting a biomolecule in a sample.
Combinations of the laser desorption time-of-flight mass spectrometer with other components described herein, in the assembly of a mass spectrometer that employs various means of desorption, acceleration, detection, measurement of time, etc., are known to those skilled in the art.
Data generated by desorption and detection of markers can be analyzed with the use of a programmable digital computer. The computer program generally contains a readable medium that stores codes. Certain codes can be devoted to memory that includes the location of each feature on a biologically active surface, the identity of the adsorbent at that feature, and the elution conditions used to wash the adsorbent. Using this information, the program can then identify the set of features on the biologically active surface defining certain selectivity characteristics (e.g. types of adsorbent and eluents used). The computer also contains code that receives as input data on the strength of the signal at various molecular masses received from a particular addressable location on the biologically active surface. These data can indicate the number of biomolecules detected, as well as the strength of the signal and the determined molecular mass for each biomolecule detected.
Data analysis can include the steps of determining signal strength (e.g., height of peaks) of a biomolecule detected and removing "outliers" (data deviating from a predetermined statistical distribution). For example, the observed peaks can be normalized, a process whereby the height of each peak relative to some reference is calculated. For example, a reference can be background noise generated by instrument and chemicals (e.g., energy absorbing molecule), which is set as zero in the scale. Then the signal strength detected for each biomolecule can be displayed in the form of relative intensities in the scale desired (e.g., 100). Alternatively, a standard may be admixed with the sample so that a peak from the standard can be used as a reference to calculate relative intensities of the signals observed for each biomolecule or other biomolecules detected.
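The normalization described above can be sketched as follows: the background level is subtracted so that noise sits at zero, and the remaining peak heights are rescaled relative to a reference, either the tallest peak or the peak of an added standard. The function name, the peak heights, the baseline value and the 5000 Da standard are hypothetical and serve only to illustrate the arithmetic.

```python
def normalize_peaks(peaks, baseline, scale=100.0, reference_mass=None):
    """Rescale raw peak heights to relative intensities.

    peaks: dict mapping mass (Da) to raw peak height.
    baseline: background noise level, set as zero on the new scale.
    reference_mass: mass of an added standard to use as the reference;
        if None, the tallest baseline-corrected peak is used.
    """
    # Set the background to zero; peaks below it are clipped to zero.
    corrected = {m: max(h - baseline, 0.0) for m, h in peaks.items()}
    if reference_mass is not None:
        reference = corrected[reference_mass]
    else:
        reference = max(corrected.values())
    if reference <= 0:
        raise ValueError("reference peak does not rise above the baseline")
    return {m: scale * h / reference for m, h in corrected.items()}


raw = {4107: 12.0, 5000: 27.0, 8939: 52.0}   # hypothetical raw heights
print(normalize_peaks(raw, baseline=2.0))
# {4107: 20.0, 5000: 50.0, 8939: 100.0}
print(normalize_peaks(raw, baseline=2.0, reference_mass=5000))
# {4107: 40.0, 5000: 100.0, 8939: 200.0}
```

Using an added standard as the reference makes intensities comparable between spectra, whereas scaling to the tallest peak only expresses intensities relative to each spectrum's own maximum.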
The computer can transform the resulting data into various formats for displaying. In one format, referred to as "spectrum view", a standard spectral view can be displayed, wherein the view depicts the quantity of a biomolecule reaching the detector at each particular molecular mass. In another format, referred to as "scatter plot" only the peak height and mass information are retained from the spectrum view, yielding a cleaner image and enabling biomolecules with nearly identical molecular mass to be more visible.
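A minimal sketch of how a "scatter plot" style representation might be derived from a full spectrum is given below: only local maxima above a chosen height are kept, each as a (mass, height) pair. The text does not specify a peak-picking rule, so the local-maximum criterion, the function name and the example values are assumptions made for illustration.

```python
def peaks_from_spectrum(masses, intensities, min_height=0.0):
    """Reduce a spectrum to (mass, height) pairs at its local maxima.

    A point counts as a peak if it is higher than its left neighbour, at
    least as high as its right neighbour, and not below min_height.
    """
    peaks = []
    for i in range(1, len(intensities) - 1):
        height = intensities[i]
        if (
            height > intensities[i - 1]
            and height >= intensities[i + 1]
            and height >= min_height
        ):
            peaks.append((masses[i], height))
    return peaks


# Hypothetical spectrum sampled at seven mass values.
masses = [1500, 1503, 1506, 1509, 1512, 1515, 1518]
intensities = [1, 3, 9, 4, 2, 6, 1]
print(peaks_from_spectrum(masses, intensities))
# [(1506, 9), (1515, 6)]
```

Raising `min_height` suppresses small peaks, which is one simple way to keep the resulting view uncluttered.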
Using any of the above display formats, it can be readily determined from the signal display whether a biomolecule having a particular molecular mass is detected from a sample. Preferred biomolecules of the invention are biomolecules with an apparent molecular mass of about 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da. Moreover, from the strength of signal, the amount of a biomolecule bound on the biologically active surface can be determined.
g) Identification of proteins
In the event that the biomolecules of the invention are proteins, the present invention comprises a method for the identification of these proteins, especially by obtaining their amino acid sequence. This method comprises the purification of said proteins from the complex biological sample (blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, tears, saliva, sweat, ascites, cerebrospinal fluid, milk, lymph, or tissue extract samples) by fractionating said sample using techniques known to one of ordinary skill in the art, most preferably protein chromatography (FPLC, HPLC).
The biomolecules of the invention include those proteins with a molecular mass selected from 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da,
5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da± 142 Da.
Furthermore, the method comprises the analysis of the fractions for the presence and purity of said proteins by the method which was used to identify them as differentially expressed biomolecules, for example two-dimensional gel electrophoresis, SELDI mass spectrometry or MALDI mass spectrometry, but most preferably MALDI mass spectrometry. The method also comprises an analysis of the purified proteins aimed at revealing their amino acid sequence. This analysis may be performed using mass spectrometry techniques known to those skilled in the art.
In one embodiment, this analysis may be performed using peptide mass fingerprinting, revealing information about the specific peptide mass profile after proteolytic digestion of the investigated protein.
In another embodiment, this analysis may preferably be performed using post-source decay (PSD) or ESI-MS, but most preferably ESI-MS, revealing mass information about all possible fragments of the investigated protein or proteolytic peptides thereof, leading to the amino acid sequence of the investigated protein or proteolytic peptide thereof.
The information revealed by the aforementioned techniques can be submitted to World Wide Web search engines, such as MS-Fit (Protein Prospector, http://prospector.ucsf.edu) for information obtained from peptide mass fingerprinting, MS-Tag (Protein Prospector, http://prospector.ucsf.edu) for information obtained from PSD, or Mascot (www.matrixscience.com) for information obtained from MS/MS and peptide mass fingerprinting, for the alignment of the obtained results with data available in public protein sequence databases, such as SwissProt (http://us.expasy.org/sprot/), NCBI (http://www.ncbi.nlm.nih.gov/BLAST/) and EMBL (http://srs.embl-heidelberg.de:8000/srs5/), which leads to reliable information about the identity of said proteins.
This information may comprise, if available, the complete amino acid sequence, the calculated molecular mass, the structure, the enzymatic activity, the physiological function, and gene expression of the investigated proteins.
h) Kits
In yet another aspect, the invention provides kits using the methods of the invention as described in the section Diagnostics for the differential diagnosis of a breast cancer or a non-malignant disease of the breast, wherein the kits are used to detect the biomolecules of the present invention.
The methods used to detect the biomolecules of the invention can also be used to determine whether a subject is at risk of developing a breast cancer or has developed a breast cancer. Such methods may also be employed in the form of a diagnostic kit comprising an antibody specific to a biomolecule of the invention or a biologically active surface described herein, which may be conveniently used, for example, in clinical settings to diagnose patients exhibiting symptoms or a family history of a non-steroid dependent cancer. Such diagnostic kits also include solutions and materials necessary for the detection of a biomolecule of the invention, and instructions to use the kit based on the above-mentioned methods.
The biomolecules of the invention include those proteins with a molecular mass selected from 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da,
18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da.
For example, the kits can be used to detect one or more of the differentially present biomolecules described above in a test sample of a subject. The kits of the invention have many applications. For example, the kits can be used to determine whether a subject is healthy or has a precancerous lesion of the breast, a breast cancer, a metastasized breast cancer or a non-malignant disease of the breast, thus aiding the diagnosis of a breast cancer and/or a non-malignant disease of the breast. In another example, the kits can be used to identify compounds that modulate expression of said biomolecules.
In one embodiment, a kit comprises an adsorbent on a biologically active surface, wherein the adsorbent is suitable for binding one or more biomolecules of the invention, a denaturation solution for the pre-treatment of a sample, a binding solution, a washing solution or instructions for making a denaturation solution, binding solution, or washing solution, wherein the combination allows for the detection of a biomolecule using gas phase ion spectrometry. Such kits can be prepared from the materials described in other previously detailed sections (e.g., denaturation buffer, binding buffer, adsorbents, washing solutions, etc.).
In some embodiments, the kit may comprise a first substrate comprising an adsorbent thereon (e.g., a particle functionalized with an adsorbent) and a second substrate onto which the first substrate can be positioned to form a probe, which is removably insertable into a gas phase ion spectrometer. In other embodiments, the kit may comprise a single substrate, which is in the form of a removably insertable probe with adsorbents on the substrate.
In another embodiment, a kit comprises a binding molecule that specifically binds to a biomolecule related to the invention, a detection reagent, appropriate solutions and instructions on how to use the kit. Such kits can be prepared from the materials described above, and other materials known to those skilled in the art. A binding molecule used within such a kit may include, but is not limited to, proteins, peptides, nucleotides, nucleic acids, hormones, amino acids, sugars, fatty acids, steroids, polynucleotides, carbohydrates, lipids, or a combination thereof (e.g. glycoproteins, ribonucleoproteins, lipoproteins), compounds or synthetic molecules. Preferably, a binding molecule used in said kit is an antibody.
In either embodiment, the kit may optionally further comprise a standard or control information, so that the test sample can be compared with the standard or control information to determine whether the test amount of a marker detected in a sample is a diagnostic amount consistent with a diagnosis of a breast cancer or a non-malignant disease of the breast.
Each recorded measurement reading is accompanied by a margin of deviation. This statistical imprecision is well known to those skilled in the art. Within the scope of the present invention, the margin of deviation is exclusively device-specific; that is, it is caused by the type of analytical device used, which is preferably a mass spectrometer. The accuracy of the recorded measurement reading is specified by a fixed percentage. Within the meaning of the present invention, each disclosed molecular mass represents the averaged value of a range that deviates from that averaged value by about ± 0.5 %. Furthermore, slight differences appear in the molecular mass value reported for the same protein in parallel patent applications disclosing cancer biomarkers. There are three reasons to be considered. First, each molecular mass results from the analysis of samples belonging to a different type of cancer. The origin of the sample, the cellular status, the environmental conditions of the gathered tissue, etc. influence the measurements. Secondly, the given molecular mass of each biomarker represents the averaged value calculated from the data of numerous samples of each cancer type. Thirdly, measuring errors are also conceivable, for example due to sample preparation.
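As an illustration only, the fixed-percentage margin can be reproduced with a short calculation. The sketch below assumes that the margin is 0.5 % of the mass, rounded to the nearest whole dalton; the function name and the rounding rule are assumptions and are not part of the disclosure.

```python
def margin_of_deviation(mass_da: int, per_mille: int = 5) -> int:
    """Return the +/- margin in Da: 0.5 % of the mass, rounded half up (assumed rule)."""
    # Integer arithmetic: mass * 5 / 1000, rounded half up, without floating point.
    return (mass_da * per_mille + 500) // 1000


for mass in (1506, 4107, 28313):
    print(f"{mass} Da ± {margin_of_deviation(mass)} Da")
```

Under this assumed rule the three masses above receive the margins stated for them in the lists (8, 21 and 142 Da). A few listed margins differ from the rule by 1 to 2 Da, so the rule should be read as an approximation.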
The above statements are further illustrated by examples, which should not be construed as limiting with regard to the type of disease, the number of given molecular masses or in any other way. The following molecular masses [Da] of biomolecules are regarded as equivalent:
(i) 1516 ± 8 (epithelial cancer) and 1506 ± 8 (breast cancer)
(ii) 1535 ± 8 (epithelial cancer) and 1533 ± 8 (breast cancer)
(iii) 1624 ± 8 (pancreatic cancer) and 1623 ± 8 (breast cancer)
(iv) 2020 ± 10 (epithelial cancer), 2020 ± 10 (colorectal cancer), 2020 ± 10 (pancreatic cancer) and 2017 ± 10 (breast cancer)
(v) 2050 ± 10 (epithelial cancer), 2049 ± 10 (colorectal cancer) and 2053 ± 10 (breast cancer)
(vi) 2270 ± 11 (colorectal cancer), 2271 ± 11 (pancreatic cancer) and 2268 ± 11 (breast cancer)
(vii) 3326 ± 17 (colorectal cancer) and 3328 ± 17 (breast cancer)
(viii) 3946 ± 20 (epithelial cancer), 3946 ± 20 (colorectal cancer), 3951 ± 20 (pancreatic cancer) and 3951 ± 20 (breast cancer)
(ix) 4104 ± 21 (epithelial cancer), 4103 ± 21 (colorectal cancer), 4108 ± 20 (pancreatic cancer) and 4107 ± 21 (breast cancer)
(x) 4151 ± 21 (epithelial cancer) and 4161 ± 21 (breast cancer)
(xi) 4242 ± 21 (colorectal cancer), 4249 ± 21 (pancreatic cancer) and 4245 ± 21 (breast cancer)
(xii) 4298 ± 21 (epithelial cancer), 4295 ± 21 (colorectal cancer), 4307 ± 21 (pancreatic cancer) and 4295 ± 21 (breast cancer)
(xiii) 4360 ± 22 (epithelial cancer), 4359 ± 22 (colorectal cancer), 4364 ± 22 (pancreatic cancer) and 4363 ± 22 (breast cancer)
(xiv) 4477 ± 22 (epithelial cancer), 4476 ± 22 (colorectal cancer), 4480 ± 22 (pancreatic cancer) and 4476 ± 22 (breast cancer)
(xv) 4607 ± 23 (colorectal cancer), 4614 ± 23 (pancreatic cancer) and 4614 ± 23 (breast cancer)
(xvi) 4719 ± 24 (colorectal cancer), 4725 ± 24 (pancreatic cancer) and 4725 ± 24 (breast cancer)
(xvii) 4830 ± 24 (colorectal cancer), 4836 ± 24 (pancreatic cancer) and 4831 ± 24 (breast cancer)
(xviii) 4867 ± 24 (epithelial cancer), 4865 ± 24 (colorectal cancer), 4875 ± 24 (pancreatic cancer) and 4874 ± 24 (breast cancer)
(xix) 4958 ± 25 (epithelial cancer), 4963 ± 25 (colorectal cancer), 4969 ± 25 (pancreatic cancer) and 4962 ± 25 (breast cancer)
(xx) 5112 ± 26 (colorectal cancer), 5119 ± 26 (pancreatic cancer) and 5119 ± 26 (breast cancer)
(xxi) 5491 ± 27 (epithelial cancer), 5493 ± 27 (colorectal cancer), 5497 ± 27 (pancreatic cancer) and 5497 ± 27 (breast cancer)
(xxii) 5650 ± 28 (epithelial cancer), 5648 ± 28 (colorectal cancer), 5657 ± 28 (pancreatic cancer) and 5655 ± 28 (breast cancer)
(xxiii) 5854 ± 29 (colorectal cancer), 5857 ± 29 (pancreatic cancer) and 5863 ± 29 (breast cancer)
(xxiv) 6449 ± 32 (epithelial cancer), 6446 ± 32 (colorectal cancer), 6458 ± 32 (pancreatic cancer) and 6454 ± 32 (breast cancer)
(xxv) 6644 ± 33 (colorectal cancer) and 6655 ± 33 (breast cancer)
(xxvi) 6897 ± 35 (colorectal cancer), 6908 ± 35 (pancreatic cancer) and 6906 ± 35 (breast cancer)
(xxvii) 7001 ± 35 (epithelial cancer), 6999 ± 35 (colorectal cancer), 7013 ± 35 (pancreatic cancer) and 7012 ± 35 (breast cancer)
(xxviii) 7575 ± 38 (colorectal cancer) and 7591 ± 38 (breast cancer)
(xxix) 7969 ± 40 (epithelial cancer), 8001 ± 40 (pancreatic cancer) and 7998 ± 40 (breast cancer)
(xxx) 8232 ± 41 (epithelial cancer), 8215 ± 41 (colorectal cancer), 8237 ± 41 (pancreatic cancer) and 8230 ± 41 (breast cancer)
(xxxi) 8474 ± 42 (colorectal cancer), 8494 ± 42 (pancreatic cancer) and 8487 ± 42 (breast cancer)
(xxxii) 8574 ± 43 (colorectal cancer), 8596 ± 43 (pancreatic cancer) and 8589 ± 43 (breast cancer)
(xxxiii) 8711 ± 44 (epithelial cancer), 8702 ± 44 (colorectal cancer), 8717 ± 44 (pancreatic cancer) and 8717 ± 44 (breast cancer)
(xxxiv) 8780 ± 44 (colorectal cancer), 8794 ± 44 (pancreatic cancer) and 8792 ± 44 (breast cancer)
(xxxv) 8922 ± 45 (colorectal cancer), 8942 ± 45 (pancreatic cancer) and 8939 ± 45 (breast cancer)
(xxxvi) 9143 ± 46 (colorectal cancer), 9163 ± 46 (pancreatic cancer) and 9160 ± 46 (breast cancer)
(xxxvii) 9201 ± 46 (colorectal cancer), 9220 ± 46 (pancreatic cancer) and 9221 ± 46 (breast cancer)
(xxxviii) 9359 ± 47 (colorectal cancer), 9382 ± 47 (pancreatic cancer) and 9377 ± 47 (breast cancer)
(xxxix) 9425 ± 47 (colorectal cancer), 9443 ± 47 (pancreatic cancer) and 9446 ± 47 (breast cancer)
(xl) 9641 ± 48 (colorectal cancer), 9652 ± 48 (pancreatic cancer) and 9661 ± 48 (breast cancer)
(xli) 9718 ± 49 (colorectal cancer), 9741 ± 49 (pancreatic cancer) and 9737 ± 49 (breast cancer)
(xlii) 9930 ± 50 (colorectal cancer) and 9955 ± 50 (breast cancer)
(xliii) 10215 ± 51 (colorectal cancer), 10233 ± 51 (pancreatic cancer) and 10232 ± 51 (breast cancer)
(xliv) 10440 ± 52 (colorectal cancer), 10455 ± 52 (pancreatic cancer) and 10464 ± 52 (breast cancer)
(xlv) 10665 ± 53 (epithelial cancer), 10748 ± 54 (pancreatic cancer) and 10682 ± 53 (breast cancer)
(xlvi) 11464 ± 57 (colorectal cancer), 11488 ± 57 (pancreatic cancer) and 11414 ± 57 (breast cancer)
(xlvii) 11547 ± 58 (colorectal cancer), 11558 ± 58 (pancreatic cancer) and 11567 ± 58 (breast cancer)
(xlviii) 11693 ± 58 (colorectal cancer), 11713 ± 58 (pancreatic cancer) and 11723 ± 58 (breast cancer)
(xlix) 12504 ± 62 (epithelial cancer) and 12492 ± 62 (breast cancer)
(l) 12669 ± 63 (epithelial cancer), 12619 ± 63 (colorectal cancer), 12648 ± 63 (pancreatic cancer) and 12656 ± 63 (breast cancer)
(li) 13632 ± 68 (colorectal cancer) and 13652 ± 68 (breast cancer)
(lii) 13784 ± 69 (colorectal cancer), 13800 ± 69 (pancreatic cancer) and 13776 ± 69 (breast cancer)
(liii) 13824 ± 69 (pancreatic cancer) and 13812 ± 69 (breast cancer)
(liv) 13989 ± 70 (epithelial cancer), 13983 ± 70 (colorectal cancer) and 14014 ± 70 (breast cancer)
(lv) 14206 ± 71 (pancreatic cancer) and 14082 ± 70 (breast cancer)
(lvi) 14798 ± 74 (colorectal cancer), 14829 ± 74 (pancreatic cancer) and 14821 ± 74 (breast cancer)
(lvii) 15140 ± 76 (colorectal cancer), 15168 ± 76 (pancreatic cancer) and 15160 ± 76 (breast cancer)
(lviii) 15350 ± 77 (colorectal cancer), 15378 ± 77 (pancreatic cancer) and 15367 ± 77 (breast cancer)
(lix) 15879 ± 79 (colorectal cancer), 15858 ± 79 (pancreatic cancer) and 15909 ± 79 (breast cancer)
(lx) 15959 ± 80 (epithelial cancer), 15957 ± 80 (colorectal cancer), 15984 ± 80 (pancreatic cancer) and 15975 ± 80 (breast cancer)
(lxi) 16164 ± 81 (epithelial cancer), 16164 ± 81 (colorectal cancer), 16200 ± 81 (pancreatic cancer) and 16202 ± 81 (breast cancer)
(lxii) 17279 ± 86 (epithelial cancer), 17263 ± 86 (colorectal cancer) and 17288 ± 86 (breast cancer)
(lxiii) 17406 ± 87 (epithelial cancer), 17397 ± 87 (colorectal cancer), 17426 ± 87 (pancreatic cancer) and 17416 ± 87 (breast cancer)
(lxiv) 17630 ± 88 (epithelial cancer), 17617 ± 88 (colorectal cancer) and 17638 ± 88 (breast cancer)
(lxv) 17890 ± 89 (colorectal cancer), 17932 ± 89 (pancreatic cancer) and 17961 ± 89 (breast cancer)
(lxvi) 18133 ± 91 (epithelial cancer), 18115 ± 91 (colorectal cancer), 18153 ± 91 (pancreatic cancer) and 18146 ± 91 (breast cancer)
(lxvii) 17890 ± 89 (colorectal cancer), 17932 ± 89 (pancreatic cancer) and 17961 ± 90 (breast cancer)
(lxviii) 18647 ± 93 (pancreatic cancer) and 18656 ± 93 (breast cancer)
(lxix) 22338 ± 112 (colorectal cancer) and 22383 ± 112 (breast cancer)
(lxx) 22466 ± 113 (colorectal cancer) and 22496 ± 113 (breast cancer)
(lxxi) 22676 ± 114 (colorectal cancer) and 22710 ± 114 (breast cancer)
(lxxii) 23166 ± 116 (pancreatic cancer) and 23218 ± 116 (breast cancer)
(lxxiii) 28055 ± 140 (colorectal cancer), 28009 ± 140 (pancreatic cancer) and 28119 ± 141 (breast cancer)
(lxxiv) 28259 ± 141 (colorectal cancer), 28124 ± 141 (pancreatic cancer) and 28313 ± 142 (breast cancer)
In all examples, each recorded measurement reading overlaps with the others within its margin of deviation.
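The overlap statement can be checked mechanically. The following minimal sketch assumes that two readings "overlap" when their intervals [mass - margin, mass + margin] intersect; the function names are illustrative and this interpretation is an assumption.

```python
from itertools import combinations


def overlaps(a, b):
    """True if the intervals mass +/- margin of two readings intersect."""
    (mass_a, dev_a), (mass_b, dev_b) = a, b
    return abs(mass_a - mass_b) <= dev_a + dev_b


def all_overlap(readings):
    """True if every pair of (mass, margin) readings overlaps."""
    return all(overlaps(a, b) for a, b in combinations(readings, 2))


# Example (iv) from the list above: three readings at 2020 ± 10 and one at 2017 ± 10.
example_iv = [(2020, 10), (2020, 10), (2020, 10), (2017, 10)]
print(all_overlap(example_iv))
```

Two distinct biomolecules such as 1506 ± 8 and 1533 ± 8 do not overlap under this test, which is why they are listed as separate masses.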
A further calculation of averaged values, which incorporates the matching molecular masses of each type of cancer, is known to those skilled in the art. By applying the formulas on which error calculation by means of weights (weighted average) is based, the following generalized results are obtained for the aforementioned examples:
(i) 1511 ± 8
(ii) 1534 ± 8
(iii) 1624 ± 8
(iv) 2019 ± 10
(v) 2051 ± 10
(vi) 2270 ± 11
(vii) 3327 ± 17
(viii) 3949 ± 20
(ix) 4106 ± 21
(x) 4155 ± 21
(xi) 4245 ± 21
(xii) 4299 ± 21
(xiii) 4362 ± 22
(xiv) 4477 ± 22
(xv) 4612 ± 23
(xvi) 4723 ± 24
(xvii) 4832 ± 24
(xviii) 4870 ± 24
(xix) 4963 ± 25
(xx) 5115 ± 26
(xxi) 5495 ± 27
(xxii) 5653 ± 28
(xxiii) 5858 ± 29
(xxiv) 6452 ± 32
(xxv) 6650 ± 33
(xxvi) 6904 ± 35
(xxvii) 7006 ± 35
(xxviii) 7583 ± 38
(xxix) 7989 ± 40
(xxx) 8229 ± 41
(xxxi) 8485 ± 42
(xxxii) 8586 ± 43
(xxxiii) 8712 ± 44
(xxxiv) 8789 ± 44
(xxxv) 8934 ± 45
(xxxvi) 9155 ± 46
(xxxvii) 9214 ± 46
(xxxviii) 9373 ± 47
(xxxix) 9438 ± 47
(xl) 9651 ± 48
(xli) 9732 ± 49
(xlii) 9943 ± 50
(xliii) 10227 ± 51
(xliv) 10453 ± 52
(xlv) 10698 ± 53
(xlvi) 11455 ± 57
(xlvii) 11557 ± 58
(xlviii) 11710 ± 59
(xlix) 12498 ± 62
(l) 12648 ± 63
(li) 13642 ± 68
(lii) 13787 ± 69
(liii) 13818 ± 69
(liv) 13995 ± 70
(lv) 14144 ± 71
(lvi) 14816 ± 74
(lvii) 15156 ± 76
(lviii) 15365 ± 77
(lix) 15882 ± 78
(lx) 15969 ± 80
(lxi) 16183 ± 81
(lxii) 17277 ± 86
(lxiii) 17411 ± 87
(lxiv) 17628 ± 88
(lxv) 17928 ± 90
(lxvi) 18137 ± 91
(lxvii) 18415 ± 92
(lxviii) 18652 ± 93
(lxix) 22361 ± 112
(lxx) 22481 ± 113
(lxxi) 22693 ± 114
(lxxii) 23192 ± 116
(lxxiii) 28061 ± 140
(lxxiv) 28232 ± 141
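A weighted average of this kind may be sketched as follows. The sketch assumes inverse-variance weighting, in which each reading is weighted by 1/margin²; this is one common form of weighted averaging, and the text does not state which formula was actually applied. The function name is illustrative.

```python
def weighted_mean(readings):
    """Inverse-variance weighted mean of (mass, margin) readings (assumed weighting)."""
    weights = [1.0 / margin ** 2 for _, margin in readings]
    total_weight = sum(weights)
    return sum(w * mass for w, (mass, _) in zip(weights, readings)) / total_weight


# Example (i): equal margins, so the result is the plain arithmetic mean.
print(weighted_mean([(1516, 8), (1506, 8)]))

# Example (xlv): the reading with the larger margin receives slightly less weight.
print(round(weighted_mean([(10665, 53), (10748, 54), (10682, 53)])))
```

Because the margins within one example are nearly identical, the weighted mean differs from the simple mean by only a fraction of a dalton in these cases.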
The present invention is further illustrated by the following examples, which should not be construed as limiting in any way. The contents of all cited references (including literature references, issued patents, published patent applications), as cited throughout this application, are hereby expressly incorporated by reference. The practice of the present invention will employ, unless otherwise indicated, conventional techniques of cell biology, cell culture, molecular biology, transgenic biology, microbiology, recombinant DNA, and immunology, which are known to those skilled in the art. Such techniques are explained fully in the literature.
Examples.
Example 1. Sample collection. Serum samples were obtained from a total of 216 individuals: 147 samples from women suffering from a given disease of the breast, courtesy of the Department of Gynaecology and Obstetrics at the University of Heidelberg in Heidelberg, Germany; and 69 serum samples obtained from healthy patients, courtesy of both the "Deutsches Rotes Kreuz (DRK)" in Berlin, Germany, and the "GENICA study group" in Bonn, Germany.
In addition, serum samples obtained from women suffering from a breast disease could be further subdivided based on the type of disease and the stage to which the disease had progressed, e.g. non-malignant disease, mastopathy, DCIS or breast cancer (Table 1). Serum samples were collected from the patients directly before surgery. At this time, a primary diagnosis was made based on standard techniques, e.g. mammography, magnetic resonance imaging (MRI) and/or other means for the detection of diseases of the breast. In most cases the final diagnosis was confirmed by histological evaluation after surgery. In about 30% of the cases surgery was not possible due to the advanced stage of cancer. Follow-up data for all breast cancer patients are currently being collected and will be available for later studies.
Example 2. ProteinChip Array analysis.
ProteinChip Arrays of the SAX2-type (strong anion exchanger) were arranged into a bioprocessor (Ciphergen Biosystems, Inc.), a device that contains up to 12 ProteinChips and facilitates processing of the ProteinChips. The ProteinChips were pre-incubated in the bioprocessor with 200 μl binding buffer (0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5) twice for 15 minutes each. 10 μl of serum sample was diluted 1:5 in a buffer (7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% ampholine) and again diluted 1:10 in the binding buffer. Then, 300 μl of this mixture (equivalent to 6 μl original serum sample) were directly applied to the spots of the SAX2 ProteinChips. Between dilution steps, and prior to the application to the spots, the sample was kept on ice (at 0 °C). After incubation for 120 minutes at 20 to 24 °C the chips were incubated with 200 μl binding buffer, before 2 x 0.5 μl EAM solution (20 mg/ml sinapinic acid in 50% acetonitrile and 0.5% trifluoroacetic acid) was applied to the spots. After air-drying for 10 min, the ProteinChips were placed in the ProteinChip Reader (ProteinChip Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at an average laser intensity of 215, with a detector sensitivity of 8. Sixty laser shots were collected per averaged spectrum.
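The stated serum equivalent follows from the two dilution steps. The short sketch below assumes that "1:5" and "1:10" mean one part sample in five and ten parts total volume, respectively; the function name is illustrative.

```python
from fractions import Fraction


def serum_equivalent(applied_ul, dilutions):
    """Volume of original serum contained in the applied volume after serial dilution."""
    factor = Fraction(1)
    for total_parts in dilutions:
        # Each step keeps 1 part of sample in `total_parts` parts of mixture.
        factor *= Fraction(1, total_parts)
    return applied_ul * factor


# 1:5 followed by 1:10 gives an overall 1:50 dilution; 300 μl applied per spot.
print(serum_equivalent(300, [5, 10]))
```

With these assumptions, 300 μl of the 1:50 mixture corresponds to 6 μl of original serum, in agreement with the figure given above.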
Calibration of mass accuracy was performed by using the following mixture of mass standard calibrant proteins: Dynorphin A (porcine, 209 - 225, 2147.50 Da), Beta-endorphin (human, 61 - 91, 3465.00 Da), Insulin (bovine, 5733.58 Da), and Cytochrome c (bovine, 12230.90 Da) at a concentration of 1.21 pmol/μl, and Myoglobin (equine cardiac, 16951.50 Da) at a concentration of 5.16 pmol/μl. 0.5 μl of this mixture was applied to a single spot of an H4 ProteinChip array. After air-drying of the drop, 2 x 1 μl matrix solution (a saturated solution of sinapinic acid in 50% acetonitrile and 0.5% trifluoroacetic acid) was applied to the spot. The drop was allowed to air-dry for 10 min after each application of matrix solution.
The ProteinChip was placed in the ProteinChip Reader (Biology System II, Ciphergen Biosystems, Inc.) and time-of-flight spectra were generated by laser shots collected in the positive mode at laser intensity 210, with a detector sensitivity of 8. Sixty laser shots were collected per averaged spectrum. Subsequently, time-of-flight values were correlated to the molecular masses of the standard proteins, and calibration was performed according to the instrument manual.
Example 3. Peak detection and data analysis.
The analysis of the data was performed by automatic peak detection and alignment using the operating software of the ProteinChip Biology System II, the ProteinChip Software Version 3.1 (Ciphergen Biosystems, Inc.). Figure 1 shows a comparison of protein mass spectra detected using the above-mentioned SAX2 ProteinChip arrays for samples isolated from patients suffering from a non-malignant disease of the breast (C1 and C2) and from patients with breast cancer (T1 and T2).
The m/z values of all mass spectra selected for the analysis ranged between 1500 Da and 30000 Da, wherein smaller masses were not used since artefacts with the "Energy Absorbing Molecule, EAM" ("Matrix") could not be excluded, and higher masses were not detected under the chosen experimental conditions. First, baseline subtraction was conducted for each mass spectrum followed by external calibration with the calibration equation generated on Oct 01, 2003 (most of the spectra were recorded under this calibration), and subsequent internal calibration using the mass signal at 6655.0 Da, which is present in all spectra with a signal-to-noise ratio of at least 5. Then, normalisation of the spectra according to the intensity of the total ion current in the range from 1500 to 50000 Da was performed. Finally, automatic peak detection was applied as previously described by Adam et al., using the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1 (Ciphergen Biosystems, Inc.). The following settings were chosen for peak detection by "Biomarker Wizard": a) auto-detect peaks to cluster, b) first pass signal/noise = 5, c) minimum peak threshold: 2% of all spectra (peak present in at least 5 of 15 DCIS samples), d) deletion of user-detected peaks below threshold, e) cluster mass window: ± 0.3% of mass, f) second pass signal/noise = 2. Using these settings, 91 signal clusters were identified. The following clusters were deleted because of defective peak recognition: m/z 1553.25, 9598.64, 14211.2, 16139.6, 17161.4, 18879.3, 22979.5, 23455.3, and 27570.4 Da. The clusters m/z 1508.07, 2020.59, 4303.89.3, and 4614.06, 18408.2, and 23174.5 Da were changed into 1506, 2017, 3660, 4295, 4611, 18430, and 23210 Da, respectively, because of defective cluster mass centring. Thus, in total, 82 signal clusters were obtained.
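Two of the processing steps described above, normalisation to total ion current and grouping of peaks within a ± 0.3 % mass window, can be illustrated with a simplified sketch. This is not the algorithm of the "Biomarker Wizard" tool, whose internals are not described here; the function names, the greedy clustering strategy and the example peak masses are hypothetical.

```python
def tic_normalise(intensities):
    """Scale intensities so that they sum to 1 (total ion current normalisation)."""
    total = sum(intensities)
    if total == 0:
        raise ValueError("spectrum has zero total ion current")
    return [value / total for value in intensities]


def cluster_peaks(masses, window=0.003):
    """Greedily group sorted peak masses lying within +/- window of the cluster centre."""
    clusters = []
    for mass in sorted(masses):
        if clusters:
            centre = sum(clusters[-1]) / len(clusters[-1])
            if abs(mass - centre) <= window * centre:
                clusters[-1].append(mass)
                continue
        clusters.append([mass])
    return clusters


# Hypothetical peak masses from several spectra (illustrative values only).
print(cluster_peaks([6655.0, 6650.0, 6906.0, 6660.0]))
print(tic_normalise([1.0, 1.0, 2.0]))
```

In this toy input the three peaks near 6655 Da fall into one cluster, while the peak at 6906 Da starts a new one because it lies far outside the 0.3 % window.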
The cluster information (containing sample ID and sample group, cluster mass values and cluster signal intensities for each spectrum) was transformed into an interchangeable data format (a .csv table) using the "Sample group statistics" function of the "Biomarker Wizard" tool of the ProteinChip Software Version 3.1. In this format, the data was subjected to statistical analyses (see Examples 4 to 6).
Example 4. Classifier Construction.
Classifiers with a binary target variable (cancer versus non-cancer) were constructed as follows. First, as a proof of principle, classifiers were constructed and evaluated by stratified 10-fold cross validation. The data set was partitioned into 10 approximately equal-sized subsets in which the two classes are represented in about the same proportion as in the overall data set. Then, 10 classifiers were constructed, each using only 9/10 of the data, by excluding one subset in turn. Classifier performance was determined on the excluded test data set. Thereby, each available case was employed 9 times for classifier construction, and once for classifier evaluation. Test results were collected to determine overall sensitivity and specificity. Second, a final classifier was constructed on the basis of all available cases. This classifier was evaluated by using out-of-bag error estimates, see below.
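The partitioning scheme can be sketched in a few lines. The following is a minimal illustration of stratified 10-fold splitting, assuming class labels coded as 0 and 1; the function names, the random seed and the example class sizes are hypothetical and do not reproduce the software actually used.

```python
import random
from collections import defaultdict


def stratified_folds(labels, k=10, seed=0):
    """Assign case indices to k folds so each fold mirrors the overall class proportions."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for index, label in enumerate(labels):
        by_class[label].append(index)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal the shuffled cases of each class round-robin across the folds.
        for position, index in enumerate(indices):
            folds[position % k].append(index)
    return folds


def train_test_splits(folds):
    """Yield (train, test) index lists; each fold serves as the test set exactly once."""
    for i, test in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, test


labels = [1] * 60 + [0] * 40  # hypothetical class sizes
for train, test in train_test_splits(stratified_folds(labels)):
    print(len(train), len(test), sum(labels[i] for i in test))
```

Each case appears in exactly one test fold and in the training data of the other nine splits, matching the description above.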
Classifiers were constructed as decision tree ensembles to overcome typical instabilities of simple forward variable selection procedures such as single decision trees, thereby improving the overall classifier performance on independent test data, see e.g. Breiman L. (1996) Bagging Predictors, Machine Learning, Vol. 24, No. 2, pp. 123-140. The results of the present invention were generated using the "random forest" approach, see the following references available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/:
Breiman L. (2001a) Random forests. Machine Learning, 45(1):5-32, available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/
Breiman L. (2001b) Wald Lecture I: Machine Learning, available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/
Breiman L. (2001c) Wald Lecture II: Looking Inside the Black Box, available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/
Breiman L. (2003) Manual - Setting Up, Using, and Understanding Random Forests V4.0, available at ftp://ftp.stat.berkeley.edu/pub/users/breiman/
The generated random forest classifiers consisted of 1000 exploratory decision trees, i.e. maximally grown decision trees consisting of pure final nodes only. The high number of decision trees was used in order (1) to ensure best classification performance, i.e. a saturation of the test error on the lowest possible level, see Figure 2A, and (2) to obtain a sound statistical basis for determining variable importance. Decision tree generation was based on bootstrap sub-samples resulting from 98 random selections of cases with replacement from each class, so that both classes were weighted equally. Nodes were split by applying the Gini splitting rule to random subsets consisting of 8 randomly selected variables (masses).
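The two sources of randomness described above, the class-balanced bootstrap and the random variable subset per node, can be sketched as follows. The function names and the example class sizes are hypothetical; only the figures 98 (draws per class), 8 (variables per split) and 82 (signal clusters) are taken from the text.

```python
import random


def balanced_bootstrap(labels, n_per_class=98, rng=None):
    """Draw n_per_class cases with replacement from each class; return (in_bag, out_of_bag)."""
    rng = rng or random.Random()
    in_bag = []
    for cls in sorted(set(labels)):
        members = [i for i, label in enumerate(labels) if label == cls]
        in_bag.extend(rng.choice(members) for _ in range(n_per_class))
    # Cases never drawn are "out-of-bag" for the tree grown on this sample.
    out_of_bag = sorted(set(range(len(labels))) - set(in_bag))
    return in_bag, out_of_bag


def random_variable_subset(n_variables, subset_size=8, rng=None):
    """Pick the candidate variables (masses) considered for one node split."""
    rng = rng or random.Random()
    return rng.sample(range(n_variables), subset_size)


labels = [1] * 120 + [0] * 96  # hypothetical class sizes
in_bag, out_of_bag = balanced_bootstrap(labels, rng=random.Random(42))
print(len(in_bag), len(out_of_bag))
print(random_variable_subset(82, rng=random.Random(42)))
```

Drawing the same number of cases from each class gives both classes equal weight in every tree, regardless of how many cases each class contributes to the full data set.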
Example 5. Classifier structure.
The final classifier consists of 1000 decision trees, each decision tree consisting typically of about 25 terminal nodes, see Figure 3. For each variable, variable importance was determined as the total decrease in node impurity achieved by splits using this variable, averaged over all trees. The high number of trees ensures a sound statistical basis for variable importance. Node impurity was measured by the Gini index. Table 4 shows all variables ranked according to importance in the final random forest classifier.
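The impurity measure underlying this importance ranking can be illustrated with a short sketch. It computes the Gini index of a node and the decrease achieved by one split, with child nodes weighted by their share of cases; the function names are illustrative, and summing these decreases per variable and averaging over trees is assumed to correspond to the importance measure described above.

```python
def gini(labels):
    """Gini index of a node: 1 minus the sum of squared class proportions."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(cls) / n) ** 2 for cls in set(labels))


def impurity_decrease(parent, left, right):
    """Decrease in Gini impurity when `parent` is split into `left` and `right`."""
    n = len(parent)
    return gini(parent) - (len(left) / n) * gini(left) - (len(right) / n) * gini(right)


# A perfectly separating split of a balanced node removes all impurity.
print(gini([1, 1, 0, 0]))
print(impurity_decrease([1, 1, 0, 0], [1, 1], [0, 0]))
```

A pure node has a Gini index of zero, which is why maximally grown trees with pure final nodes cannot be split further.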
The high classification performance of random forests is based on the high degree of independence of the underlying low-biased single decision trees. The high degree of independence is established by two stochastic processes: (1) bootstrapping introducing variations of the training data and (2) the random restriction to small variable subsets for each node splitting.
The classification result of the final random forest classifier is determined by majority vote: each case is assigned to the class for which most single decision trees vote. The more decision trees vote for a given class, the higher the probability that the case actually belongs to that class. Figure 4 visualizes how normalized votes for class "positive" are distributed. Votes were determined by an out-of-bag approach to estimate the distribution of votes on independent test data. Vote normalization was performed as follows: (number of votes for class "positive" - number of votes for class "negative") / (number of trees for which the considered case is "out-of-bag"). Normalized votes range from -1 (all votes for class "negative") to +1 (all votes for class "positive"). Cases that are difficult to classify have normalized votes around zero. Cases with an especially clear classification result are those with a high absolute value of the normalized vote. In Figure 4, dashed vertical lines correspond to quantiles at 0%, 25%, 50%, 75%, and 100%, thereby illustrating which values of normalized votes are typical for clear (e.g. below the 25% quantile and above the 75% quantile) and non-clear voting results (between the 25% and 75% quantiles).
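The vote normalization can be written out directly. The sketch below assumes that every tree for which a case is out-of-bag casts exactly one vote, so that the number of such trees equals the sum of positive and negative votes; the function names and vote counts are illustrative.

```python
def normalized_vote(votes_positive, votes_negative):
    """(positive votes - negative votes) / number of out-of-bag trees, in [-1, +1]."""
    n_oob_trees = votes_positive + votes_negative
    if n_oob_trees == 0:
        raise ValueError("case was never out-of-bag")
    return (votes_positive - votes_negative) / n_oob_trees


def majority_class(votes_positive, votes_negative):
    """Assign the class for which most decision trees vote."""
    return "positive" if votes_positive > votes_negative else "negative"


# Hypothetical case that was out-of-bag for 400 of the 1000 trees.
print(normalized_vote(300, 100), majority_class(300, 100))
```

A value near zero indicates a case on which the trees disagree, while values near -1 or +1 indicate a clear result.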
Example 6. Classification performance.
Classification performance was estimated by two different methods: (1) 10-fold cross-validation in the proof-of-principle framework and (2) out-of-bag estimation for the final classifier. The confusion matrix obtained from cross-validation is presented in Table 2; it corresponds to 67.59% specificity and 76.85% sensitivity. The confusion matrix obtained for the final classifier from out-of-bag estimation is presented in Table 3. It yields a slightly higher specificity of 68.52% at the same sensitivity of 76.85%.
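The quoted percentages follow directly from the counts in Tables 2 and 3, as the following short check illustrates. The function and variable names are chosen here for illustration only.

```python
def sensitivity_specificity(tn, fp, fn, tp):
    """Return (sensitivity, specificity) from confusion matrix counts."""
    sensitivity = tp / (tp + fn)   # share of actual positives predicted positive
    specificity = tn / (tn + fp)   # share of actual negatives predicted negative
    return sensitivity, specificity


# Counts from Table 2 (cross-validation) and Table 3 (out-of-bag estimation).
cv_sens, cv_spec = sensitivity_specificity(tn=73, fp=35, fn=25, tp=83)
oob_sens, oob_spec = sensitivity_specificity(tn=74, fp=34, fn=25, tp=83)

print(f"cross-validation: sensitivity {cv_sens:.2%}, specificity {cv_spec:.2%}")
print(f"out-of-bag:       sensitivity {oob_sens:.2%}, specificity {oob_spec:.2%}")
```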
For the final classifier, progression and success of learning are visualized in Figure 2A. The out-of-bag error is the proportion of misclassified cases in the entire data set. For the classification of each case (patient) only those trees are applied that were constructed independently of that case, i.e. for which the considered case was not in the bootstrap sub-sample used for training. Such cases are called "out-of-bag" cases.
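The out-of-bag error described above can be sketched as follows. The data layout (one vote and one in-bag flag per tree for each case) and all names are hypothetical, and cases that were in-bag for every tree are skipped, which is an assumption of this sketch.

```python
def oob_prediction(tree_votes, in_bag_flags):
    """Majority vote using only trees for which the case was out-of-bag."""
    oob_votes = [vote for vote, in_bag in zip(tree_votes, in_bag_flags) if not in_bag]
    if not oob_votes:
        return None  # case was in the bootstrap sample of every tree
    n_positive = sum(1 for vote in oob_votes if vote == "positive")
    # "positive" requires more than 50% of the out-of-bag votes.
    return "positive" if n_positive > len(oob_votes) / 2 else "negative"


def oob_error(cases):
    """Proportion of misclassified cases.

    cases: list of (true_label, tree_votes, in_bag_flags) tuples.
    """
    scored = []
    for true_label, tree_votes, in_bag_flags in cases:
        predicted = oob_prediction(tree_votes, in_bag_flags)
        if predicted is not None:
            scored.append(true_label != predicted)
    return sum(scored) / len(scored)


# Hypothetical toy data: four cases, three trees each.
toy_cases = [
    ("positive", ["positive", "negative", "positive"], [False, False, False]),
    ("negative", ["positive", "negative", "negative"], [True, False, False]),
    ("positive", ["positive", "negative", "negative"], [True, False, False]),
    ("negative", ["positive", "positive", "negative"], [False, False, True]),
]
toy_error = oob_error(toy_cases)
```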
Table 3 states classifier performance on the basis of out-of-bag estimation and majority voting for the final classifier. A case is assigned to class "positive" if more than 50% of the decision trees vote for this class. By varying the 50% threshold from 0% to 100%, we obtain an out-of-bag estimation of the ROC curve of the final classifier, see Figure 2B. The ROC curve extrapolates the performance of the generated classifier to neighbouring sensitivity and specificity ranges, thereby visualizing the possible trade-off between sensitivity and specificity. The out-of-bag ROC curve estimation is a valid estimation for the ROC performance of the final classifier on unseen test data, as the out-of-bag error was not used for classifier tuning. Instead, training parameters were chosen in accordance with Breiman L. (2001a) and in order to obtain reasonable statistics for variable importance, see Table 4. The obtained AUC value is 0.79.
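The threshold sweep that produces the ROC curve, and the area under it, can be sketched as follows. The scores stand for the fraction of out-of-bag trees voting "positive" for each case; the trapezoidal rule for the AUC, the function names, and the toy data are assumptions of this sketch and are not taken from the analysis.

```python
def roc_points(scores, labels):
    """Sweep the decision threshold and collect (1 - specificity, sensitivity).

    scores: fraction of out-of-bag trees voting "positive" for each case.
    labels: true class of each case, "positive" or "negative".
    """
    n_pos = sum(1 for label in labels if label == "positive")
    n_neg = len(labels) - n_pos
    points = {(0.0, 0.0), (1.0, 1.0)}  # thresholds at 100% and below 0%
    for threshold in set(scores):
        tp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == "positive")
        fp = sum(1 for s, y in zip(scores, labels) if s > threshold and y == "negative")
        points.add((fp / n_neg, tp / n_pos))
    return sorted(points)


def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area


# Hypothetical toy data: one negative case outscores one positive case.
toy_scores = [0.9, 0.4, 0.6, 0.1]
toy_labels = ["positive", "positive", "negative", "negative"]
toy_auc = auc(roc_points(toy_scores, toy_labels))
```

In the toy data three of the four positive/negative pairs are ranked correctly, which corresponds to an area of 0.75.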
Summary
Currently, many groups are utilising proteomic technologies to comparatively analyse the differences in protein levels in diseased vs. non-diseased patients in the hope of discovering serological biomarkers that will aid in disease diagnosis. One such technology currently being employed is surface enhanced laser desorption ionization (SELDI), a modification of matrix-assisted laser desorption ionization time of flight (MALDI-TOF). This technology is a mass spectrometry technique that allows for the simultaneous analysis of multiple biomarkers within a biological sample.
This technology, when coupled with decision tree ensembles of varying complexities, can lead to the identification of biomarker patterns (classifiers) which correctly classify a patient as healthy or having a given disease. In the context of this invention, the biomarker profiles (biomolecule molecular masses) listed in Table 4 are able to correctly classify a patient as either healthy, having a non-malignant disease of the breast, or having DCIS (early stage cancer) or a breast cancer, with a high degree of sensitivity and specificity. The higher the sensitivity and specificity of a biomarker pattern, the more likely it is to determine a patient's diagnosis with a high degree of accuracy.
Herein, classification performance was estimated by two different approaches: cross-validation and out-of-bag estimation. Both approaches yielded similar performance estimates, see Tables 2 and 3, respectively. The progressive success of classifier generation is shown in Figure 2A. From Figure 2A, it is evident that the out-of-bag error decreases with an increase in the number of decision trees. Classification performance can be extended to the entire range of sensitivity and specificity and visualized in ROC curve form, see Figure 2B. The classifiers are ensemble classifiers, i.e., they consist of many single decision trees of varying complexity. Figure 3 visualizes decision tree complexity by the number of nodes of each single decision tree. The importance of a single mass in an ensemble classifier is determined by summing its partitioning success. This yields a ranked list of masses shown in Table 4. For some patients, out-of-bag voting is clear, e.g. all trees vote for the same class, while for other patients the decision is close, e.g. 51% of trees vote for class "negative" and 49% for class "positive". The entire gradual distribution of voting results is shown in Figure 4.
The present analysis applies the "random forest" approach, an extension of bagging. This approach, in addition to data set variations on the level of included cases ("bootstrapping"), restricts feature selection in each partitioning step to random feature subsets. Thereby, the generated decision trees vary significantly and are more independent from each other. Accordingly, averaging over many decision trees yields a better overall classification performance.
Based on this information, one can employ the biomarker patterns for the development of a comprehensive diagnostic tool for breast cancer detection. Furthermore, such a diagnostic tool will provide the practising clinician with a basis on which to design a more personalised therapy program for a given patient, thereby improving the overall prognosis of the patient.
Table 1. Distribution of serum samples from patients with a given breast disease.
Disease | Number of samples |
---|---|
Non-malignant: mastopathy | 25 |
Non-malignant: other* | 14 |
Malignant: DCIS | 15 |
Malignant: T1 | 37 |
Malignant: T2 | 38 |
Malignant: T3 | 8 |
Malignant: T4 | 10 |
*fat necrosis, sclerosing adenosis, fibroadenoma, and small duct and intraductal papillomas
Table 2. Confusion matrix by cross-validation using the respective test datasets.
Actual class | Predicted negative | Predicted positive |
---|---|---|
negative | 73 | 35 |
positive | 25 | 83 |
Table 3. Confusion matrix by out-of-bag estimation for final classifier.
Actual class | Predicted negative | Predicted positive |
---|---|---|
negative | 74 | 34 |
positive | 25 | 83 |
Table 4. Variable importance. The table presents the variable importance for all masses, i.e. the total decrease in node impurity achieved by a variable during final classifier construction averaged over all trees. Masses are ranked according to their importance.
Mass | Importance | Mass | Importance | Mass | Importance |
---|---|---|---|---|---|
M 12656.2 | 6.49 | M 14082.5 | 1.22 | M 4476.36 | 0.89 |
M 11414.5 | 5.42 | M 10464.2 | 1.14 | M 9661.47 | 0.88 |
M 15909.3 | 4.41 | M 6454.34 | 1.13 | M 22495.9 | 0.88 |
M 15366.8 | 3.46 | M 11723.3 | 1.12 | M 2017.00 | 0.88 |
M 14820.7 | 3.19 | M 4363.33 | 1.11 | M 2267.52 | 0.87 |
M 15159.6 | 3.1 | M 3327.77 | 1.07 | M 28312.7 | 0.86 |
M 1533.37 | 2.89 | M 23217.8 | 1.06 | M 7011.50 | 0.86 |
M 4962.25 | 2.53 | M 17637.6 | 1.05 | M 8230.23 | 0.85 |
M 7590.86 | 2.43 | M 4831.37 | 1.05 | M 4614.05 | 0.85 |
M 2607.17 | 2.34 | M 17961.4 | 1.04 | M 4295.00 | 0.85 |
M 1506.00 | 2.22 | M 9445.67 | 1.04 | M 8589.09 | 0.84 |
M 3507.81 | 1.76 | M 4161.22 | 1.03 | M 8791.50 | 0.82 |
M 2052.78 | 1.76 | M 10232.1 | 1.02 | M 4724.85 | 0.8 |
M 16202.0 | 1.64 | M 3950.65 | 0.99 | M 9954.84 | 0.8 |
M 6654.56 | 1.56 | M 17503.5 | 0.98 | M 22709.5 | 0.78 |
M 15975.3 | 1.45 | M 18145.8 | 0.97 | M 9376.93 | 0.76 |
M 5114.84 | 1.44 | M 1974.74 | 0.97 | M 8939.44 | 0.75 |
M 8717.36 | 1.44 | M 13811.8 | 0.96 | M 4873.64 | 0.71 |
M 4107.32 | 1.39 | M 17416.3 | 0.96 | M 28118.6 | 0.68 |
M 7998.40 | 1.36 | M 12491.8 | 0.95 | M 14013.7 | 0.68 |
M 22383.2 | 1.32 | M 8486.84 | 0.94 | M 18430.0 | 0.67 |
M 5654.92 | 1.31 | M 11566.7 | 0.94 | M 6905.53 | 0.66 |
M 5863.17 | 1.3 | M 13776.5 | 0.93 | M 13651.7 | 0.62 |
M 3660.00 | 1.29 | M 10681.8 | 0.92 | M 18656.4 | 0.61 |
M 9736.95 | 1.29 | M 17287.5 | 0.92 | M 9220.92 | 0.53 |
M 9160.26 | 1.28 | M 4244.87 | 0.89 | | |
M 5497.21 | 1.27 | M 1623.22 | 0.89 | | |
Claims
1. A method for the differential diagnosis of a breast cancer and/or a non-malignant disease of the breast, in vitro, comprising: a) obtaining a test sample from a subject, b) contacting said test sample with a biologically active surface under specific binding conditions, c) allowing the biomolecules within the test sample to bind to said biologically active surface, d) detecting bound biomolecules using a detection method, wherein the detection method generates a mass profile of said test sample, e) transforming the mass profile into a computer readable form, and f) comparing the mass profile of e) with a database containing mass profiles specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancer, or subjects having a non-malignant disease of the breast, wherein said comparison allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer and/or a non-malignant disease of the breast.
2. The method of claim 1, wherein the database is generated by a) obtaining biological samples from healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancer, and subjects having non-malignant disease of the breast, b) contacting said biological samples with a biologically active surface under specific binding conditions, c) allowing the biomolecules within the biological samples to bind to said biologically active surface, d) detecting bound biomolecules using a detection method, wherein the detection method generates mass profiles of said biological samples, e) transforming the mass profiles into a computer-readable form, f) applying a mathematical algorithm to classify the mass profiles in e) as specific for healthy subjects, subjects having a precancerous lesion of the breast, subjects having breast cancer, subjects having metastasised breast cancer, and subjects having non-malignant disease of the breast.
3. The method of claim 1, wherein the biomolecules are characterized by: a) diluting a sample 1:5 in a denaturation buffer consisting of 7 M urea, 2 M thiourea, 4% CHAPS, 1% DTT, 2% Ampholine, at 0° to 4°C, b) further diluting said sample 1:10 with a binding buffer consisting of 0.1 M Tris-HCl, 0.02% Triton X-100, pH 8.5 at 0° to 4°C, c) contacting the sample with a biologically active surface comprising positively charged quaternary ammonium groups, d) incubating the treated sample with said biologically active surface for 120 minutes under temperatures between 20 and 24°C at pH 8.5, and e) analysing the bound biomolecules by gas phase ion spectrometry.
4. The method of claim 1, wherein the detection method is mass spectrometry.
5. The method of claim 4 wherein the method of mass spectrometry is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
6. The method of claim 1, wherein the biologically active surface comprises an adsorbent selected from the group of quaternary ammonium groups, carboxylate groups, groups with alkyl or aryl chains, groups such as nitriloacetic acid that immobilize metal ions, or proteins, antibodies, or nucleic acids.
7. The method of claim 1, wherein the mass profiles comprise a panel of one or more differentially expressed biomolecules.
8. The method of claim 7, wherein the biomolecules are selected from a group having the apparent molecular mass of 1506 Da ± 8 Da, 1533 Da ± 8 Da, 1623 Da ± 8 Da, 1975 Da ± 10 Da, 2017 Da ± 10 Da, 2053 Da ± 10 Da, 2268 Da ± 11 Da, 2607 Da ± 13 Da, 3328 Da ± 17 Da, 3508 Da ± 18 Da, 3660 Da ± 18 Da, 3951 Da ± 20 Da, 4107 Da ± 21 Da, 4161 Da ± 21 Da, 4245 Da ± 21 Da, 4295 Da ± 21 Da, 4363 Da ± 22 Da, 4476 Da ± 22 Da, 4614 Da ± 23 Da, 4725 Da ± 24 Da, 4831 Da ± 24 Da, 4874 Da ± 24 Da, 4962 Da ± 25 Da, 5115 Da ± 26 Da, 5497 Da ± 27 Da, 5655 Da ± 28 Da, 5863 Da ± 29 Da, 6454 Da ± 32 Da, 6655 Da ± 33 Da, 6906 Da ± 35 Da, 7012 Da ± 35 Da, 7591 Da ± 38 Da, 7998 Da ± 40 Da, 8230 Da ± 41 Da, 8487 Da ± 42 Da, 8589 Da ± 43 Da, 8717 Da ± 44 Da, 8792 Da ± 44 Da, 8939 Da ± 45 Da, 9160 Da ± 46 Da, 9221 Da ± 46 Da, 9377 Da ± 47 Da, 9446 Da ± 47 Da, 9661 Da ± 48 Da, 9737 Da ± 49 Da, 9955 Da ± 50 Da, 10232 Da ± 51 Da, 10464 Da ± 52 Da, 10682 Da ± 53 Da, 11414 Da ± 57 Da, 11567 Da ± 58 Da, 11723 Da ± 59 Da, 12492 Da ± 62 Da, 12656 Da ± 63 Da, 13652 Da ± 68 Da, 13776 Da ± 69 Da, 13812 Da ± 69 Da, 14014 Da ± 70 Da, 14082 Da ± 70 Da, 14821 Da ± 74 Da, 15160 Da ± 76 Da, 15367 Da ± 77 Da, 15909 Da ± 78 Da, 15975 Da ± 80 Da, 16202 Da ± 81 Da, 17288 Da ± 86 Da, 17416 Da ± 87 Da, 17504 Da ± 88 Da, 17638 Da ± 88 Da, 17961 Da ± 90 Da, 18146 Da ± 91 Da, 18430 Da ± 92 Da, 18656 Da ± 93 Da, 22383 Da ± 112 Da, 22496 Da ± 113 Da, 22710 Da ± 114 Da, 23218 Da ± 116 Da, 28119 Da ± 141 Da, or 28313 Da ± 142 Da.
9. A method for the identification of differentially expressed biomolecules wherein the biomolecules of any of claims 1-8 are proteins, comprising: a) chromatography and fractionation, b) analysis of fractions for the presence of said differentially expressed proteins and/or fragments thereof, using a biologically active surface, c) further analysis using mass spectrometry to obtain amino acid sequences encoding said proteins and/or fragments thereof, and d) searching amino acid sequence databases of known proteins to identify said differentially expressed proteins by amino acid sequence comparison.
10. The method of claim 9, wherein the method of chromatography is selected from high performance liquid chromatography (HPLC) or fast protein liquid chromatography (FPLC).
11. The method of claim 9, wherein the mass spectrometry used is selected from the group of matrix-assisted laser desorption ionization/time of flight (MALDI-TOF), surface enhanced laser desorption ionisation/time of flight (SELDI-TOF), liquid chromatography, MS-MS, or ESI-MS.
12. A method for the differential diagnosis of a breast cancer and/or a malignant disease of the breast, in vitro, comprising detection of one or more differentially expressed biomolecules wherein the biomolecules are polypeptides, comprising: a) obtaining a test sample from a subject, b) contacting said sample with a binding molecule specific for a differentially expressed polypeptide identified in claims 9-11, c) detecting the presence or absence of said polypeptide(s), wherein the presence or absence of said polypeptide(s) allows for the differential diagnosis of a subject as healthy, having a precancerous lesion of the breast, having a breast cancer, having a metastasised breast cancer and/or a non-malignant disease of the breast.
13. The method of any one of claims 1-12, wherein the test sample is a blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract sample.
14. The method of any one of claims 1-12, wherein the biological sample is a blood, blood serum, plasma, nipple aspirate, urine, semen, seminal fluid, seminal plasma, prostatic fluid, excreta, tears, saliva, sweat, biopsy, ascites, cerebrospinal fluid, milk, lymph, or tissue extract sample.
15. The method of any one of claims 1-12, wherein the subject is of mammalian origin.
16. The method of claim 15, wherein the subject is of human origin.
17. A kit for the diagnosis of a breast cancer and/or a non-malignant disease of the breast within a subject using the method of any one of claims 1-11 and 13-16 comprising a denaturation solution, a binding solution, a washing solution, a biologically active surface comprising an adsorbent, and instructions to use the kit.
18. A kit for the diagnosis of a breast cancer or a non-malignant disease of the breast within a subject using the method of any one of claims 12-16 comprising a solution, binding molecule, detection substrate, and instructions to use the kit.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP04733320A EP1673623A1 (en) | 2003-05-15 | 2004-05-17 | Methods and applications of biomarker profiles in the diagnosis and treatment of breast cancer |
Applications Claiming Priority (9)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
EP03090141 | 2003-05-15 | ||
US47277203P | 2003-05-23 | 2003-05-23 | |
EP03090153A EP1477803A1 (en) | 2003-05-15 | 2003-05-23 | Serum protein profiling for the diagnosis of epithelial cancers |
US52558303P | 2003-11-24 | 2003-11-24 | |
EP03090401 | 2003-11-24 | ||
EP03090460 | 2003-12-30 | ||
US53419704P | 2004-01-02 | 2004-01-02 | |
PCT/EP2004/005292 WO2004102188A1 (en) | 2003-05-15 | 2004-05-17 | Methods and applications of biomarker profiles in the diagnosis and treatment of breast cancer |
EP04733320A EP1673623A1 (en) | 2003-05-15 | 2004-05-17 | Methods and applications of biomarker profiles in the diagnosis and treatment of breast cancer |
Publications (1)
Publication Number | Publication Date |
---|---|
EP1673623A1 (en) | 2006-06-28 |
Family
ID=56290563
Family Applications (2)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP04733338A Withdrawn EP1629278A1 (en) | 2003-05-15 | 2004-05-17 | Biomarkers for the differential diagnosis of pancreatitis and pancreatic cancer |
EP04733320A Withdrawn EP1673623A1 (en) | 2003-05-15 | 2004-05-17 | Methods and applications of biomarker profiles in the diagnosis and treatment of breast cancer |
Family Applications Before (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
EP04733338A Withdrawn EP1629278A1 (en) | 2003-05-15 | 2004-05-17 | Biomarkers for the differential diagnosis of pancreatitis and pancreatic cancer |
Country Status (4)
Country | Link |
---|---|
EP (2) | EP1629278A1 (en) |
AU (2) | AU2004239416A1 (en) |
CA (2) | CA2525740A1 (en) |
WO (2) | WO2004102189A1 (en) |
Families Citing this family (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2010151731A1 (en) * | 2009-06-26 | 2010-12-29 | University Of Utah Research Foundation | Materials and methods for the identification of drug-resistant cancers and treatment of same |
CA2797941C (en) * | 2009-10-01 | 2015-01-27 | Phenomenome Discoveries Inc. | Serum-based biomarkers of pancreatic cancer and uses thereof for disease detection and diagnosis |
EP2686437A4 (en) * | 2011-03-18 | 2015-09-30 | Fox Chase Cancer Ct | MUCINE 5B, A SPECIFIC BIOMARKER FOR PANCREATIC LIQUID CYSTS, USEFUL IN ESTABLISHING A PRECISE DIAGNOSIS OF MUCINIC KYSTOS, AND OTHER MARKERS FOR THE DETECTION OF PANCREATIC CANCER |
GB201501930D0 (en) | 2015-02-05 | 2015-03-25 | Univ London Queen Mary | Biomarkers for pancreatic cancer |
US20240069025A1 (en) * | 2021-03-23 | 2024-02-29 | Kashiv Biosciences, Llc | Method for size based evaluation of pancreatic protein mixture |
Family Cites Families (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
AU2001288921A1 (en) * | 2000-09-11 | 2002-03-26 | Ciphergen Biosystems, Inc. | Human breast cancer biomarkers |
US6855554B2 (en) * | 2001-09-21 | 2005-02-15 | Board Of Regents, The University Of Texas Systems | Methods and compositions for detection of breast cancer |
-
2004
- 2004-05-17 AU AU2004239416A patent/AU2004239416A1/en not_active Abandoned
- 2004-05-17 WO PCT/EP2004/005293 patent/WO2004102189A1/en not_active Application Discontinuation
- 2004-05-17 EP EP04733338A patent/EP1629278A1/en not_active Withdrawn
- 2004-05-17 WO PCT/EP2004/005292 patent/WO2004102188A1/en not_active Application Discontinuation
- 2004-05-17 CA CA002525740A patent/CA2525740A1/en not_active Abandoned
- 2004-05-17 EP EP04733320A patent/EP1673623A1/en not_active Withdrawn
- 2004-05-17 AU AU2004239418A patent/AU2004239418A1/en not_active Abandoned
- 2004-05-17 CA CA002525725A patent/CA2525725A1/en not_active Abandoned
Non-Patent Citations (1)
Title |
---|
See references of WO2004102188A1 * |
Also Published As
Publication number | Publication date |
---|---|
EP1629278A1 (en) | 2006-03-01 |
WO2004102189A1 (en) | 2004-11-25 |
AU2004239416A1 (en) | 2004-11-25 |
CA2525740A1 (en) | 2004-11-25 |
AU2004239418A1 (en) | 2004-11-25 |
WO2004102188A1 (en) | 2004-11-25 |
CA2525725A1 (en) | 2004-11-25 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PUAI | Public reference made under article 153(3) epc to a published international application that has entered the european phase |
Free format text: ORIGINAL CODE: 0009012 |
|
17P | Request for examination filed |
Effective date: 20051214 |
|
AK | Designated contracting states |
Kind code of ref document: A1 Designated state(s): AT BE BG CH CY CZ DE DK EE ES FI FR GB GR HU IE IT LI LU MC NL PL PT RO SE SI SK TR |
|
RAP1 | Party data changed (applicant data changed or rights of an application transferred) |
Owner name: MIRACULINS INC. |
|
17Q | First examination report despatched |
Effective date: 20060714 |
|
DAX | Request for extension of the european patent (deleted) | ||
STAA | Information on the status of an ep patent application or granted ep patent |
Free format text: STATUS: THE APPLICATION IS DEEMED TO BE WITHDRAWN |
|
18D | Application deemed to be withdrawn |
Effective date: 20061129 |