WO2025034580A1 - Phase methylation-based markers for tissue and cell-type-specific identification and monitoring - Google Patents
Phase methylation-based markers for tissue and cell-type-specific identification and monitoring
- Publication number
- WO2025034580A1 (PCT/US2024/040801)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- cell
- captured
- libraries
- tissue
- samples
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12N—MICROORGANISMS OR ENZYMES; COMPOSITIONS THEREOF; PROPAGATING, PRESERVING, OR MAINTAINING MICROORGANISMS; MUTATION OR GENETIC ENGINEERING; CULTURE MEDIA
- C12N15/00—Mutation or genetic engineering; DNA or RNA concerning genetic engineering, vectors, e.g. plasmids, or their isolation, preparation or purification; Use of hosts therefor
- C12N15/09—Recombinant DNA-technology
- C12N15/10—Processes for the isolation, preparation or purification of DNA or RNA
- C12N15/1034—Isolating an individual clone by screening libraries
- C12N15/1093—General methods of preparing gene libraries, not provided for in other subgroups
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6869—Methods for sequencing
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6881—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for tissue or cell typing, e.g. human leukocyte antigen [HLA] probes
-
- G—PHYSICS
- G16—INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR SPECIFIC APPLICATION FIELDS
- G16B—BIOINFORMATICS, i.e. INFORMATION AND COMMUNICATION TECHNOLOGY [ICT] SPECIALLY ADAPTED FOR GENETIC OR PROTEIN-RELATED DATA PROCESSING IN COMPUTATIONAL MOLECULAR BIOLOGY
- G16B20/00—ICT specially adapted for functional genomics or proteomics, e.g. genotype-phenotype associations
- G16B20/20—Allele or variant detection, e.g. single nucleotide polymorphism [SNP] detection
Definitions
- This application relates generally to compositions, assays, methods, kits and apparatuses for identifying and using biomarkers based on the methylation levels of certain target genomic regions.
- Liquid biopsies offer a way to identify the loss of specific cells or tissues by observing their unique methylation patterns. Methylation, a type of epigenetic modification, affects gene expression and activates cell-specific pathways. Therefore, different cells or tissues exhibit unique methylation profiles that mirror the activation of various genes or pathways. For example, Type 1 Diabetes Mellitus is caused by the loss of beta cells in the pancreatic islets; in beta cells the INS gene is active and its promoter is therefore demethylated, unlike in other tissues or cells.
- uPMPs: ultra-specific phased methylation patterns
- cfDNA: cell-free DNA
- this disclosure provides a method for identifying one or more ultra-specific phased methylation patterns (uPMPs) for a tissue or cell-type of interest, comprising: (a) obtaining a set of cell-free DNA (cfDNA) samples from each subject in a first group of subjects; (b) obtaining a set of genomic DNA samples from the tissue or cell-type of interest from each subject in the first group of subjects and/or a second group of subjects; (c) providing conditions capable of converting unmethylated cytosines to uracils in nucleic acid molecules in the set of cfDNA samples and the set of genomic DNA samples to generate a set of converted cfDNA samples and a set of converted genomic DNA samples; (d) selecting a set of target genomic regions; (e) capturing the set of target genomic regions from the set of converted cfDNA samples and the set of converted genomic DNA samples to generate a set of captured cfDNA libraries and a set of captured genomic DNA libraries; (f) subjecting the set of captured cfDNA samples
- PMPs: phased methylation patterns
- this disclosure describes the application of multiple sets of uPMPs.
- Each set comprises markers that individually possess high specificity, though not necessarily high sensitivity. However, when used in combination, these sets of uPMPs maximize both sensitivity and specificity.
- these sets of uPMPs can be utilized for early detection and subsequent monitoring of degenerative conditions. In a separate embodiment, these sets of uPMPs can also be used for the early detection and monitoring of autoimmune diseases.
- Figure 1 provides illustrative schematics for phased methylation, read phased methylation, and phased methylation patterns.
- Top shows a hypothetical DNA molecule that features five CpG methylation sites, with each site possessing a non-homogeneous methylation status.
- the methylation status at each site is represented as "C" for a methylated cytosine and "c" for an unmethylated cytosine (originally denoted in upper or lower case, respectively, on the molecule).
- the read phased methylation is a compilation of the methylation status of each CpG site, and is denoted as '10100', where '1' indicates a methylated site and '0' represents an unmethylated site.
- This Read Phased Methylation can further be decomposed into ten subsets (combinations) of size three, each representing a Phased Methylation Pattern (PMP), as shown in the accompanying figure.
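The decomposition described above can be sketched in a few lines of Python. This is only an illustration, not the applicants' implementation: the function name `phased_methylation_patterns` and the binary string encoding (1 for methylated, 0 for unmethylated) are assumptions made for the example. Each PMP keeps the index of every CpG site it uses, so patterns that skip intermediate sites stay distinct.

```python
from itertools import combinations

def phased_methylation_patterns(read_pattern, size=3):
    """Enumerate every PMP of `size` CpG sites from one read phased methylation.

    `read_pattern` is a string of '1' (methylated) and '0' (unmethylated),
    one character per CpG site on the read. Each PMP is a tuple of
    (site_index, status) pairs. Hypothetical helper, for illustration only.
    """
    sites = list(enumerate(read_pattern))
    return list(combinations(sites, size))

# A five-site read yields C(5, 3) = 10 patterns of size three.
pmps = phased_methylation_patterns("10100")
print(len(pmps))   # 10
print(pmps[0])     # ((0, '1'), (1, '0'), (2, '1'))
```

Keeping the site index alongside the status is a design choice: without it, two patterns with the same statuses but drawn from different CpG sites would be indistinguishable.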
- FIG. 2 shows the identification of common and unique PMPs across two individual molecules. Consider two molecules that cover the same five CpG sites and share an identical methylation status at four of them. Together, these two molecules generate sixteen distinct PMPs of size three. Of these, six PMPs are unique to the first molecule (Molecule A), six are unique to the second molecule (Molecule B), and four PMPs are shared between the two molecules.
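The counts in this example can be checked with a short sketch. The two binary strings below are hypothetical molecules chosen only so that they differ at a single CpG site; the helper name `pmp_set` is likewise an assumption made for the illustration.

```python
from itertools import combinations

def pmp_set(read_pattern, size=3):
    """Return the set of size-`size` PMPs as tuples of (site_index, status)."""
    return set(combinations(enumerate(read_pattern), size))

# Hypothetical molecules over the same five CpG sites, differing at one site.
molecule_a = "10100"
molecule_b = "10110"

pmps_a = pmp_set(molecule_a)
pmps_b = pmp_set(molecule_b)

shared = pmps_a & pmps_b   # patterns that avoid the differing site
only_a = pmps_a - pmps_b   # patterns that include the differing site (A's status)
only_b = pmps_b - pmps_a   # patterns that include the differing site (B's status)

print(len(pmps_a | pmps_b), len(only_a), len(only_b), len(shared))  # 16 6 6 4
```

The split follows from counting: C(4, 3) = 4 patterns avoid the differing site and are shared, while C(4, 2) = 6 patterns per molecule include it and are therefore unique.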
- FIG. 3 shows a workflow to identify ultra-specific Phased Methylation Patterns (uPMPs).
- Panel (A) shows a sample processing schematic.
- Panel (B) shows an exemplary uPMP identification process.
- Early onset T1DM is associated with unique phased methylation patterns. These patterns are absent in cfDNA samples even at high sequencing depths of 1,500x, but are found in pancreatic islet samples with a relative frequency above 20%. These distinct methylation patterns are determined by the unique methylation statuses of the CpG sites they encompass.
- Figure 5 showcases islet uPMPs within the Insulin gene and its flanking regions, each comprising 3 CpG sites. These were found in at least one islet sample, but not in cfDNA.
- the plot represents a 10 kb genomic region on chr11:2153660-2164221 (depicted on the x-axis, coordinates on assembly GRCh38), which harbors 599 total CpG sites (shown as light grey dots).
- 69 uPMPs were identified across the two strands.
- Of the identified uPMPs, 51 were located in the region downstream of the INS gene, three in the promoter region, and the remainder within the INS gene.
- Figure 6 depicts how ultra-specific phased methylation patterns can be used to interrogate samples and discriminate between tissue types.
- Panel (A) illustrates the selection of biomarkers based on non-overlapping uPMPs present in the samples of an Islet Panel.
- Panel (B) depicts the main findings of the Islet Panel.
- Panel (C) shows that the distribution of biomarker homology across the interrogated genes is bimodal.
- Panel (D) depicts the cumulative function of 1,300 highly specific, low-sensitivity markers, which achieve a combined sensitivity that significantly exceeds that of a single ultra-specific 100% sensitive marker.
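To illustrate how many low-sensitivity markers can accumulate into a high combined sensitivity, the sketch below computes the probability that at least one marker is detected. It rests on two assumptions that do not come from the disclosure: the markers are treated as statistically independent, and the per-marker sensitivity of 0.002 is a purely hypothetical value chosen for the example.

```python
import math

def combined_sensitivity(sensitivities):
    """Probability that at least one marker fires, ASSUMING independent markers.

    Computed as 1 - prod(1 - s_i). Independence is an assumption of this
    sketch, not a statement made in the source text.
    """
    miss_all = math.prod(1.0 - s for s in sensitivities)
    return 1.0 - miss_all

# Hypothetical panel: 1,300 markers, each detected in only 0.2% of cases.
per_marker = [0.002] * 1300
combined = combined_sensitivity(per_marker)
print(f"combined sensitivity: {combined:.3f}")
```

Under these assumptions the combined sensitivity grows quickly with panel size even though each marker alone is rarely detected; correlated markers would accumulate more slowly.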
- Figure 7 presents examples of ultra-specific phased methylation patterns of size 3 (uPMPs) for genes INS, PDX1, and TECPR1. Grey shadows represent reads per PMP for each sample.
- the methylation pattern, shown on the y-axis, is represented in binary: 0 for an unmethylated CpG site and 1 for a methylated CpG site. Genomic coordinates are based on GRCh38. The figure summarizes findings for 25 cfDNA samples and 21 islet samples.
- any and all combinations of the members that make up that grouping of alternatives are specifically envisioned. For example, if an item is selected from a group consisting of A, B, C, and D, the inventors specifically envision each alternative individually (e.g., A alone, B alone, etc.), as well as combinations such as A, B, and D; A and C; B and C; etc.
- the term “and/or” when used in a list of two or more items means any one of the listed items by itself or in combination with any one or more of the other listed items.
- the expression “A and/or B” is intended to mean either or both of A and B - i.e., A alone, B alone, or A and B in combination.
- the expression “A, B and/or C” is intended to mean A alone, B alone, C alone, A and B in combination, A and C in combination, B and C in combination, or A, B, and C in combination.
- the term “substantially”, when used to modify a quality, generally allows a certain degree of variation without that quality being lost.
- degree of variation can be less than 0.1%, about 0.1%, about 0.2%, about 0.3%, about 0.4%, about 0.5%, about 0.6%, about 0.7%, about 0.8%, about 0.9%, about 1%, between 1-2%, between 2-3%, between 3-4%, between 4-5%, or greater than 5% or 10%.
- phased methylation refers to the methylation status of different CpG sites that co-occur on the same single-stranded DNA molecule.
- read phased methylation refers to the methylation status of different CpG sites as found in the same NGS read.
- phased methylation patterns refers to combinations of methylation status of at least 3 CpG sites identified within a given read phased methylation.
- PMPs are typically characterized by a list of CpG coordinates, and their methylation status as found in the read phased methylation. PMPs may skip certain methylation sites in a given read phased methylation.
- a PMP refers to methylation status of 3 CpG sites in a given read phased methylation, where some or all of the 3 CpG sites are intercalated by one or more CpG sites not considered as part of the PMP.
- PMPs are strand-dependent. This means that a phased methylation pattern observed on the sense strand of the original double-stranded DNA molecule may not necessarily have a corresponding pattern on the antisense strand of the same original double-stranded DNA molecule.
- uPMPs refers to PMPs that are not detected in any cell-free DNA (cfDNA) sample from any donor, but are detected in at least one genomic DNA sample from a tissue or cell-type of interest sourced from at least one donor.
- uPMPs are detected only in some, but not all, genomic DNA samples from a tissue or cell-type of interest, such as below 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70% or 80% of the samples.
- a uPMP comprises selected methylation sites intercalated by at least 1, at least 2 or at least 3, but below 10 unselected methylation sites.
- uPMPs in Type 1 Diabetes Mellitus are PMPs that are not detected in any cell-free DNA (cfDNA) sample (from a donor), but are detected in at least one pancreatic islet sample (from the same donor).
- not detected refers to the lack of any read phased methylation carrying a specific PMP, provided that the sequencing depth is at least 1,500x for cfDNA samples (background).
- detected refers to the presence of at least one read phased methylation carrying a specific PMP, provided that the sequencing depth is at least 150x.
- uPMP panel refers to a set of between 2 and 10,000 uPMPs. This panel, which is specific to at least one tissue or cell type, provides a level of sensitivity that exceeds that of an individual uPMP for at least one tissue or cell type. In a typical uPMP panel, each individual uPMP has a specificity close to 100%, and a sensitivity above 0% but below 99%. Collectively, the resulting uPMP panel has a specificity equal to that of the individual markers (i.e., close to 100%), but a sensitivity greater than 70%.
- the term “subject” generally refers to an entity or a medium that has testable or detectable genetic information.
- a subject can be a person, individual, or patient.
- a subject can be a vertebrate, such as, for example, a mammal.
- Non-limiting examples of mammals include humans, simians, farm animals, sport animals, rodents, and pets.
- the subject can be a person that has a condition, disorder or disease or is suspected of having such condition, disorder or disease.
- the subject may be displaying a symptom(s) indicative of a health or physiological state or condition of the subject, such as a cancer, autoimmune condition, or other disease, disorder, or condition of the subject.
- the subject can be asymptomatic with respect to such health or physiological state or condition.
- normal or “healthy”, as used herein, generally refers to a cell, tissue, plasma, blood, biological sample, or subject not having, or suspected to have, a condition, disorder or disease.
- tissue corresponds to any cells. Different types of tissue may correspond to different types of cells (e.g., liver, lung, pancreas or blood), but also may correspond to tissue from different organisms (mother vs. fetus) or to healthy cells vs. tumor cells.
- a “biological sample” refers to any sample that is taken from a subject (e.g., a human, such as a pregnant woman, a person with cancer, or a person suspected of having cancer, an organ transplant recipient, or a subject suspected of having a disease process involving an organ (e.g.
- the biological sample can be a bodily fluid, such as blood, plasma, serum, urine, vaginal fluid, uterine or vaginal flushing fluids, pleural fluid, ascitic fluid, cerebrospinal fluid, saliva, sweat, tears, sputum, bronchoalveolar lavage fluid, etc. Stool samples can also be used.
- sample generally refers to a biological sample obtained from or derived from one or more subjects.
- Biological samples may be cell-free biological samples or substantially cell-free biological samples, or may be processed or fractionated to produce cell-free biological samples.
- cell-free biological samples may include cell-free ribonucleic acid (cfRNA), cell-free deoxyribonucleic acid (cfDNA), cell-free fetal DNA (cffDNA), plasma, serum, urine, saliva, amniotic fluid, and derivatives thereof.
- Cell-free biological samples may be obtained or derived from subjects using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, or a cell-free DNA collection tube.
- Cell-free biological samples may be derived from whole blood samples by fractionation.
- Biological samples or derivatives thereof may contain cells.
- a biological sample may be a blood sample or a derivative thereof (e.g., blood collected by a collection tube or blood drops).
- cell-free nucleic acid refers to any extracellular nucleic acid that is not attached to a cell.
- a cell-free nucleic acid can be a nucleic acid circulating in blood.
- a cell-free nucleic acid can be a nucleic acid in other bodily fluid disclosed herein, e.g., urine.
- a cell-free nucleic acid can be a deoxyribonucleic acid (“DNA”), e.g., genomic DNA, mitochondrial DNA, or a fragment thereof.
- a cell-free nucleic acid can be a ribonucleic acid (“RNA”), e.g., mRNA, short-interfering RNA (siRNA), microRNA (miRNA), circulating RNA (cRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), small nucleolar RNA (snoRNA), Piwi-interacting RNA (piRNA), long non-coding RNA (long ncRNA), or a fragment thereof.
- a cell-free nucleic acid is a DNA/RNA hybrid.
- a cell-free nucleic acid can be double-stranded, single-stranded, or a hybrid thereof.
- a cell-free nucleic acid can be released into bodily fluid through secretion or cell death processes, e.g., cellular necrosis and apoptosis.
- a cell-free nucleic acid can comprise one or more epigenetic modifications.
- a cell-free nucleic acid can be acetylated, methylated, ubiquitylated, phosphorylated, sumoylated, ribosylated, and/or citrullinated.
- a cell-free nucleic acid can be methylated cell-free DNA.
- bisulfite treatment refers to the treatment of DNA with bisulfite or a salt thereof, such as sodium bisulfite (NaHSO3).
- Bisulfite reacts readily with the 5,6-double bond of cytosine, but poorly with methylated cytosine.
- Cytosine reacts with the bisulfite ion to form a sulfonated cytosine reaction intermediate which is susceptible to deamination, giving rise to a sulfonated uracil.
- the sulfonate group can be removed under alkaline conditions, resulting in the formation of uracil.
- Uracil is recognized as a thymine by polymerases and amplification will result in an adenine-thymine base pair instead of a cytosine-guanine base pair.
- genomic region generally refers to identified regions of nucleic acid that are identified by their location in the chromosome.
- the genomic regions are referred to by a gene name and encompass coding and non-coding regions associated with that physical region of nucleic acid.
- a gene comprises coding regions (exons), non-coding regions (introns), transcriptional control or other regulatory regions, and promoters.
- the genomic region may incorporate an intron or exon or an intron/exon boundary within a named gene.
- a “site” corresponds to a single site, which may be a single base position or a group of correlated base positions, e.g., a CpG site.
- CpG sites are regions of DNA where a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along its 5' → 3' direction. CpG sites occur with high frequency in genomic regions called CpG islands.
- CpG islands generally refers to a contiguous region of genomic DNA that satisfies the criteria of (1) having a frequency of CpG dinucleotides corresponding to an “Observed/Expected Ratio” greater than about 0.6, and (2) having a “GC Content” greater than about 0.5.
- CpG islands may be between about 0.2 to about 3 kilobases (kb) in length having a high frequency of CpG sites.
- CpG islands may be found at or near promoters of about 40% of mammalian genes. CpG islands may also be found outside of mammalian genes.
- CpG islands are found in exons, introns, promoters, enhancers, inhibitors, and transcriptional regulatory elements. CpG islands may tend to occur upstream of so-called “housekeeping genes”. CpG islands may have a CpG dinucleotide content of at least about 60% of what would be statistically expected. The occurrence of CpG islands at or upstream of the 5' end of genes may reflect a role in the regulation of transcription.
- hypermethylation generally refers to the average methylation state corresponding to an increased presence of 5-mC at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mC found at corresponding CpG dinucleotides within a normal control DNA sample.
- a uPMP exhibits hypermethylation.
- hypomethylation generally refers to the average methylation state corresponding to a decreased presence of 5-mC at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mC found at corresponding CpG dinucleotides within a normal control DNA sample.
- a uPMP exhibits hypomethylation.
- methylation state or “methylation status”, as used herein, generally refers to the presence or absence of 5-methylcytosine (“5-mC”) at one or a plurality of CpG dinucleotides within a DNA sequence.
- Methylation states at one or more particular palindromic CpG methylation sites (each having two CpG dinucleotide sequences) within a DNA sequence include “unmethylated,” “fully-methylated”, and “hemi-methylated.”
- methylated cytosine generally refers to any methylated forms of the nucleic acid base cytosine that contains a methyl or hydroxymethyl functional group at the 5 position.
- Methylated cytosines may be regulators of gene transcription in genomic DNA. This term may include 5-methylcytosine and 5-hydroxymethylcytosine.
- methylation assay refers to any assay for determining the methylation state of one or more CpG dinucleotide sequences within a sequence of DNA.
- methylation converted or “converted” nucleic acid, as used herein, generally refers to nucleic acid, such as for example DNA, that has undergone a process used to convert the DNA for methylation sequencing.
- conversion processes include reagent-based (such as bisulfite) conversion, enzymatic conversion, or combination conversion (such as TAPS conversion) where unmethylated cytosines are converted into uracil prior to PCR amplification or sequencing.
- the conversion process may be used in methyl sequencing methods to distinguish between methylated and unmethylated cytosine bases.
- Exemplary enzymes for enzymatically converting cytosine include DNA methyltransferase, Uracil-DNA glycosylase (UDG), and G/T mismatch-specific endonuclease (GTSCE).
- sequencing depth refers to the number of times a specific nucleotide in a target region of a genome is read or sequenced. It is a measure of how extensively a particular DNA or RNA molecule is sampled during the sequencing process. Sequencing depth is commonly expressed as “X coverage” or “X-fold coverage,” where X represents the average number of times a given nucleotide is sequenced. For example, if a genome has a sequencing depth of 30X, it means that, on average, each nucleotide in the genome has been sequenced 30 times.
- the present disclosure provides methods that use a panel of uPMPs useful for the analysis of methylation within a region or gene.
- Other aspects provide novel uses of the region, gene, and the gene product as well as methods, assays, and kits directed to detecting, differentiating, and distinguishing certain condition or disorders, e.g., autoimmune diseases.
- One particular use is the early detection of autoimmune conditions.
- the present disclosure describes processes and methods for the selection of individual ultra-specific phased methylation patterns (uPMPs).
- This selection process involves the analysis and comparison of PMPs across multiple samples.
- samples represent tissues or cell types of interest from multiple subjects and heterogeneous mixtures such as cell-free DNA (cfDNA) samples from multiple healthy donors.
- the present disclosure describes methods comprising (a) obtaining a set of single-stranded target nucleic acids wherein unmethylated cytosines are converted into uracil moieties, (b) obtaining a uPMP panel comprising primers for at least 2, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450 or 500 uPMPs, and (c) applying the uPMP panel to the set of single-stranded target nucleic acids to assess the presence or absence of nucleic acid molecules originating from a tissue or cell-type of interest.
- the uPMP panel comprises one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PA
- compositions which may comprise a set of single-stranded targets wherein unmethylated cytosines are converted into uracil moieties; a set of forward primers, each flanking the 5'-end of at least one uPMP; a set of reverse primers, each flanking the 3'-end of at least one uPMP; and enzymes and buffer for PCR amplification.
- the set of forward primers and the set of reverse primers flank one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PA
- Example 1 Identification of a uPMP panel associated with early onset of Type 1 Diabetes Mellitus (T1DM)
- Figure 1 explains the concept behind phased methylation, read phased methylation, and phased methylation patterns (PMP).
- a targeted hybridization-capture methylation panel was devised. This targeted panel included a set of probes covering approximately 3Mb across 80 regions of interest. Each region was defined as a full gene, plus or minus 3kb, flanking the upstream/downstream coordinates of the gene's start and end points.
- genes to be targeted were selected based on publicly available methylation and proteomic data. Initially, genes expressed in a tissue of interest at a rate four times greater than in other tissues, or in a group of 2-5 tissues compared to any other tissue, were chosen. After initial gene selection, publicly available methylation data was used to identify and prioritize genes exhibiting higher differential methylation at CpG sites.
- Capture probes and reagents were ordered from Twist Bioscience (San Francisco, USA). Approximately 25,000 sets of probes were synthesized and delivered as a pool, each set comprising eight probes of 120nt in length: two probes for each parental strand, either fully or partially converted.
- cfDNA was extracted from plasma using a SPRI-based kit (Beckman Coulter, USA), following the manufacturer's instructions, and stored at -20°C until use. Islet DNA was extracted using a column-based kit (NEB, USA), fragmented to a median length of approximately 220bp using sonication (Covaris, USA), and stored at -20°C until use. Libraries were prepared using EM-seq (NEB, USA), following the manufacturer's instructions. Briefly, the DNA fragments were end-repaired and ligated with sequencing adapters. After purification, the samples underwent oxidation of 5-methylcytosines and 5-hydroxymethylcytosines, followed by a second purification.
- the samples were then treated with the APOBEC enzyme to convert unmethylated cytosine into uracil. After another round of purification, the fragments were amplified using NGS indexed primers. The product was purified, normalized, and used as input in the capture reaction. Finally, the eluate from the capture reaction was amplified, purified, and sequenced at 1,500x and 150x sequencing depths for cfDNA and islet samples, respectively.
- NGS data from each sample were aligned to the reference genome using commonly used methylation aligners (Meth-BWA). Subsequently, a custom software tool was used to determine the phased methylation patterns. Finally, specific phased methylation patterns of the islets, as compared to the cfDNA, were identified in agreement with our definitions (Figure 1).
- Example 2 Validation of candidate PMPs
- the uPMPs identified in Example 1 were validated at higher sequencing depths. For this, a subset of the initial hybridization capture panel was used, comprising a set of probes targeting approximately 300 Kb across 80 regions of interest. Capture probes and reagents were ordered from Twist Bioscience (San Francisco, USA). cfDNA samples were processed as previously described and sequenced at 30,000x. Islets from 19 samples were combined, and alpha cells and beta cells were separated using FACS. Next, the DNA from the different cell types was extracted and processed as above and sequenced at 150x.
- Validated beta-cell markers were identified as uPMPs not observed in any of the cfDNA samples at 30,000x sequencing depth or alpha cells at 150x sequencing depth, but were observed in beta-cells at 150x sequencing separated from the islet samples.
- Validated alpha-cell markers were identified as uPMPs that were not observed in any of the cfDNA samples at 30,000x sequencing depth or beta-cells at 150x sequencing depth, but were observed in alpha-cells at 150x sequencing separated from the islet samples.
- Example 3 Amplicons-based targeted uPMP panel.
- a targeted amplicon-based panel for T1DM uPMPs was devised comprising at least 20 beta-cell uPMP, and 10 alpha-cells or pancreas distressed uPMP.
- Example 4 Ultra-specific phased methylation patterns for interrogation of samples and discrimination between tissue types.
- biomarkers were defined as regions of fixed length (e.g., 120 nt) that do not overlap, characterized by a uPMP of size 3 (comprising three CpG sites) that is 100% specific and highly sensitive. As shown in Figure 6A, this methylation pattern was present in most of the 21 samples of an islet panel. The islet panel targeted 80 genes across 3Mb, covering 60 thousand CpG sites (30k on each strand); and queried 5,754 potential biomarkers. Of these biomarkers, 1,701 were not represented (0 reads) in 25 cfDNA samples, and 1,333 were found only in islets (see Figure 6B). The distribution of biomarker homology across the genes was also interrogated.
- Example 5 Ultra-specific phased methylation patterns for genes INS, PDX1, and TECPR1.
- Ultra-specific phased methylation patterns of size 3 for genes INS, PDX1, and TECPR1 were identified and summarized for 25 cfDNA samples and 21 islet samples.
- the identified methylation patterns were represented using binary code: “0” for an unmethylated CpG site and “1” for a methylated CpG site.
- the INS gene was shown to harbor a uPMP at coordinates chr11: 2157011, 2157037, 2157072, showing binary pattern “100”, absent in cfDNA samples but present in 66% of islet samples (see Figure 7, top left and top right panels).
- PDX1 at chr13: 27926732, 27926745, 27926759 showed the binary pattern “101”, absent in cfDNA samples, but found in 90% of islet samples.
- the binary pattern “010” for TECPR1 at chr7: 98217694, 98217700, 98217712 was found in all islet samples but none of the cfDNA samples.
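As an illustration of how the three patterns above could be checked against a single read, the following minimal Python sketch tests whether a read carries a given uPMP while ignoring any unselected CpG sites that lie between the selected ones. The data layout (`UPMPS`, a dictionary of per-position methylation calls) and the function name `read_carries` are assumptions made for this example, not part of the disclosure, and the intervening CpG position in the example read is hypothetical. Strand is not modeled here, although PMPs are strand-dependent.

```python
# Coordinates and binary patterns are those listed in Example 5 (GRCh38).
# The dictionary layout itself is an assumption made for this sketch.
UPMPS = {
    "INS": ("chr11", (2157011, 2157037, 2157072), "100"),
    "PDX1": ("chr13", (27926732, 27926745, 27926759), "101"),
    "TECPR1": ("chr7", (98217694, 98217700, 98217712), "010"),
}


def read_carries(upmp, read_chrom, read_calls):
    """Return True if a read carries the given uPMP.

    read_calls maps CpG position -> 1 (methylated) or 0 (unmethylated)
    for every CpG covered by the read. CpG sites that are not part of
    the uPMP are ignored, so the pattern may skip intervening sites.
    """
    chrom, coords, pattern = upmp
    if chrom != read_chrom:
        return False
    # Every selected CpG must be covered and must match the expected status.
    return all(read_calls.get(pos) == int(bit) for pos, bit in zip(coords, pattern))


# Hypothetical read on chr11; position 2157020 is an invented intervening CpG.
example_read = {2157011: 1, 2157020: 1, 2157037: 0, 2157072: 0}
carries_ins = read_carries(UPMPS["INS"], "chr11", example_read)
carries_pdx1 = read_carries(UPMPS["PDX1"], "chr11", example_read)
print(carries_ins, carries_pdx1)
```

A read that does not cover all three selected positions is treated as not carrying the pattern, which is one possible convention among several.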
Abstract
Methods and compositions are disclosed for identifying one or more ultra-specific phased methylation patterns (uPMPs) for a tissue or cell-type of interest. Also disclosed and described are uPMPs for certain tissues or cell-types of interest and the use of uPMPs, e.g., in a panel, for diagnostic purposes.
Description
PHASED METHYLATION-BASED MARKERS FOR TISSUE AND CELL-TYPE-SPECIFIC IDENTIFICATION AND MONITORING
CROSS-REFERENCE TO RELATED APPLICATIONS
[001] This application claims the benefit of U.S. Provisional Application No. 63/517,813, filed August 4, 2023, which is hereby incorporated by reference in its entirety.
FIELD
[002] This application relates generally to compositions, assays, methods, kits and apparatuses for identifying and using biomarkers based on the methylation levels of certain target genomic regions.
BACKGROUND
[003] Liquid biopsies offer a way to identify the loss of specific cells or tissues by observing their unique methylation patterns. Methylation, a type of epigenetic modification, affects gene expression and activates cell-specific pathways. Therefore, different cells or tissues exhibit unique methylation profiles that mirror the activation of various genes or pathways. For example, in Type 1 Diabetes Mellitus, which is caused by the loss of beta cells in the pancreatic islets, the INS gene is active in beta cells and thus its promoter is demethylated there, unlike in other tissues or cells.
[004] Many researchers concentrate on either individual methylation loci or patterns of 3-9 phased methylation sites on the same strand, which consist of CpG sites that are consistently either methylated or unmethylated in the cell/tissue of interest but not in the rest of tissues or background (e.g., cfDNA). Methylation markers with better sensitivity and specificity are desired. A process is described here for identifying such markers by identifying phased methylation patterns with breadth and depth.
SUMMARY
[005] This disclosure presents methodologies for identifying ultra-specific phased methylation patterns (uPMPs), which are unique to certain tissues or cell types. Although not every uPMP described in this application needs to be present in each specific tissue or cell type, any detected uPMP can serve as an indicator of specific tissue or cellular distress. This is particularly relevant when abnormal cell death rates are observed, especially when these patterns are found in cell-free DNA (cfDNA).
[006] In another aspect, this disclosure provides a method for identifying one or more ultra-specific phased methylation patterns (uPMPs) for a tissue or cell-type of interest, comprising: (a) obtaining a set of cell-free DNA (cfDNA) samples from each subject in a first group of subjects; (b) obtaining a set of genomic DNA samples from the tissue or cell-type of interest from each subject in the first group of subjects and/or a second group of subjects; (c) providing conditions capable of converting unmethylated cytosines to uracils in nucleic acid molecules in the set of cfDNA samples and the set of genomic DNA samples to generate a set of converted cfDNA samples and a set of converted genomic DNA samples; (d) selecting a set of target genomic regions; (e) capturing the set of target genomic regions from the set of converted cfDNA samples and the set of converted genomic DNA samples to generate a set of captured cfDNA libraries and a set of captured genomic DNA libraries; (f) subjecting the set of captured cfDNA libraries to sequencing at a depth that is at least 0.1X, 0.5X, 1X, 10X, 100X, 1,000X, 10,000X or 100,000X; (g) subjecting the set of captured genomic DNA libraries to sequencing at a depth that is at least 0.1X, 0.5X, 1X, 10X, 100X, 1,000X, 10,000X or 100,000X; (h) determining phased methylation patterns (PMPs) in the set of captured cfDNA libraries and the set of captured genomic DNA libraries; and (i) identifying one or more uPMPs for the tissue or cell-type of interest, wherein the one or more uPMPs are detected in at least one library of the set of captured genomic DNA libraries, but not in any library of the set of captured cfDNA libraries.
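Step (i) amounts to a set comparison once the PMPs of each library are known from step (h). The sketch below is one possible way to express it in Python, assuming each library has already been reduced to a set of PMP identifiers. The function names, the string identifiers, and the example libraries are hypothetical, and the sequencing-depth requirements of steps (f) and (g) are assumed to have been checked beforehand.

```python
def identify_upmps(cfdna_libraries, tissue_libraries):
    """Return PMPs seen in at least one tissue library and in no cfDNA library.

    Each library is a set of hashable PMP identifiers.
    """
    background = set().union(*cfdna_libraries)
    candidates = set().union(*tissue_libraries)
    return candidates - background


def per_marker_sensitivity(upmps, tissue_libraries):
    """Fraction of tissue libraries in which each uPMP was detected."""
    n = len(tissue_libraries)
    return {p: sum(p in lib for lib in tissue_libraries) / n for p in upmps}


# Hypothetical identifiers; a real identifier would encode chromosome,
# strand, CpG coordinates, and the binary methylation pattern.
cfdna = [{"p1", "p2"}, {"p2"}]
islets = [{"p1", "p3"}, {"p3", "p4"}, {"p2"}]

upmps = identify_upmps(cfdna, islets)
sensitivity = per_marker_sensitivity(upmps, islets)
print(sorted(upmps), sensitivity)
```

Because a uPMP only needs to appear in at least one tissue library, the per-marker sensitivity can be low, which is why the markers are later combined into panels.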
[007] In another aspect, this disclosure describes the application of multiple sets of uPMPs. Each set comprises markers that individually possess high specificity, though not necessarily high sensitivity. However, when used in combination, these sets of uPMPs maximize both sensitivity and specificity.
[008] In one embodiment, these sets of uPMPs can be utilized for early detection and subsequent monitoring of degenerative conditions. In a separate embodiment, these sets of uPMPs can also be used for the early detection and monitoring of autoimmune diseases.
BRIEF DESCRIPTION OF THE DRAWINGS
[009] The following drawings and provided descriptions illustrate the apparent features, advantages, and uses of the invention(s). The incorporated drawings and descriptions included herein serve to identify specifications that will further explain the concept of the invention(s) and allow the production of art that allows a trained professional to make and use the invention(s). The drawings are not illustrated to scale.
[0010] Figure 1 provides illustrative schematics for phased methylation, read phased methylation, and phased methylation patterns. Top shows a hypothetical DNA molecule that features five CpG methylation sites, with each site possessing a non-homogeneous methylation status. In the sequencing read, the methylation status at each site is represented as “C” for a methylated cytosine and “c” for an unmethylated cytosine (originally denoted in upper or lower case, respectively, on the molecule). Thus, the read phased methylation is a compilation of the methylation status of each CpG site, and is denoted as '10100', where '1' indicates a methylated site and '0' represents an unmethylated site. This Read Phased Methylation can further be decomposed into ten subsets (combinations) of size three, each representing a Phased Methylation Pattern (PMP), as shown in the accompanying figure.
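The decomposition described for Figure 1 can be reproduced with a few lines of Python. This is a minimal sketch, assuming the five-site read phased methylation is '10100' and that sites are labeled by their 0-based index within the read rather than by genomic coordinate; the function name `decompose_read` is hypothetical.

```python
from itertools import combinations


def decompose_read(read_pattern, size=3):
    """List every PMP of `size` CpG sites contained in one read.

    Each PMP is a tuple of (site_index, status) pairs, so two PMPs with
    the same binary string but different sites remain distinct.
    """
    return list(combinations(enumerate(read_pattern), size))


pmps = decompose_read("10100")
for pmp in pmps:
    sites = tuple(index for index, _ in pmp)
    status = "".join(bit for _, bit in pmp)
    print(sites, status)
```

Five sites taken three at a time give C(5,3) = 10 combinations, which matches the ten PMPs stated above. Combinations that skip a site, such as sites (0, 2, 4), are included.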
[0011] Figure 2 shows the identification of common and unique PMPs across two individual molecules. Consider two molecules, each featuring the same CpG sites, sharing identical methylation status at four out of five sites. As a result, these two molecules generate sixteen PMPs of size three. Of these, six PMPs are unique to the first molecule (Molecule A), six are unique to the second molecule (Molecule B), and four PMPs are shared between the two molecules.
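The counts given for Figure 2 follow from simple combinatorics: of the ten size-three PMPs per molecule, the C(4,2) = 6 that include the differing site are unique to each molecule, and the C(4,3) = 4 that exclude it are shared. The sketch below checks this with two hypothetical molecules that differ only at the fourth site; the specific patterns and names are assumptions made for the example.

```python
from itertools import combinations


def pmp_set(read_pattern, size=3):
    """Set of PMPs, each a tuple of (site_index, status) pairs."""
    return set(combinations(enumerate(read_pattern), size))


# Hypothetical molecules with identical status at four of five CpG sites.
molecule_a = "10100"
molecule_b = "10110"

pmps_a = pmp_set(molecule_a)
pmps_b = pmp_set(molecule_b)

only_a = pmps_a - pmps_b
only_b = pmps_b - pmps_a
shared = pmps_a & pmps_b
print(len(only_a), len(only_b), len(shared), len(pmps_a | pmps_b))
```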
[0012] Figure 3 shows a workflow to identify ultra-specific Phase Methylation Patterns (uPMPs). Panel (A) shows a sample processing schematic. Panel (B) shows an exemplary uPMP identification process. Early onset T1DM is associated with unique phase methylation patterns. These patterns are absent in cfDNA samples even at high sequencing depths of 1,500x, but are found in pancreatic islet samples with relative frequency >20%. These distinct methylation patterns are determined by unique methylation statuses of the CpG sites they encompass.
[0013] Figure 4 depicts an islet uPMP found in KIRREL2 for CpG positions 35861365, 35861379, 35861452 on Chromosome 19 (antisense strand). As shown in panel (A), out of the 8 possible patterns of size 3, the PMP 1,1,1 (hypermethylated) is found in islets but not in cfDNA, at sequencing depths of 150x and 1,500x, respectively. Panel (B) shows a detailed breakdown of the methylation patterns. In 25 cfDNA samples, each independently processed and sequenced, 5,572 total reads were found mapping to chr19: 35861365-35861452. No reads contained the patterns 110 and 111. In 23 islet samples, a total of 537 reads were found mapping to the same coordinates. Of these, 322 carried the pattern 111 and only 3 reads carried the pattern 110. While the first pattern is found in all samples, the second is found in only one sample. Both patterns are thus identified as uPMPs, as both have a specificity of ~100% and are found in at least one islet sample.
[0014] Figure 5 showcases islet uPMPs within the Insulin gene and its flanking regions, each comprising 3 CpG sites. These were found in at least one islet sample, but not in cfDNA. The plot represents a 10kb genomic region on chr11:2153660-2164221 (depicted on the x-axis, coordinates on assembly GRCh38), which harbors 599 total CpG sites (shown as light grey dots). In total, 69 uPMPs were identified across the two strands. Among these, 51 were located in the downstream region of the INS gene, three in the promoter region, and the remaining within the INS gene. At least two of these, located in the UTR region, overlap with the CpG sites (on relative positions chr11: 2160805 and 2160808) previously used by Harold et al. (PNAS 108, no. 47 (2011): 19018-23) to detect beta-cell death in cfDNA using qPCR. While all 69 uPMPs exhibit 100% specificity, the sensitivity of each marker ranges from 4% (1/21 positive samples) to 66% (14/21).
[0015] Figure 6 depicts how ultra-specific phased methylation patterns can be used to interrogate samples and discriminate between tissue types. Panel (A) illustrates the selection of biomarkers based on non-overlapping uPMP present in the samples of an Islet Panel. Panel (B) depicts the Islet Panel main findings. Panel (C) depicts the distribution of biomarker homology across the genes interrogated as bimodal. Panel (D) depicts the cumulative function of 1,300 highly specific low sensitivity markers to achieve a combined sensitivity that significantly exceeds that of a single ultra-specific 100% sensitive marker.
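The idea behind panel (D), that many low-sensitivity markers can together reach a high combined sensitivity, can be illustrated with a short calculation. The sketch below assumes that markers are detected independently of one another, so that the panel misses a sample only if every marker misses it. This independence assumption and the example sensitivities are illustrative choices, not values or a formula taken from the disclosure.

```python
import math


def combined_sensitivity(sensitivities):
    """Probability that at least one marker fires, assuming independence.

    Computed as 1 minus the product of the per-marker miss probabilities.
    """
    miss_probability = math.prod(1.0 - s for s in sensitivities)
    return 1.0 - miss_probability


# Hypothetical panel: 30 markers, each detected in 5% of positive samples.
panel = [0.05] * 30
panel_sensitivity = combined_sensitivity(panel)
print(f"{panel_sensitivity:.3f}")
```

In practice, markers located in the same gene or region are likely to be correlated, so the true combined sensitivity of a panel would be lower than this independent-marker estimate.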
[0016] Figure 7 presents examples of ultra-specific phased methylation patterns of size 3 (uPMPs) for genes INS, PDX1, and TECPR1. Grey shadows represent reads per PMP for each sample. The methylation pattern, shown on the y-axis, is represented in binary: 0 for an unmethylated CpG site and 1 for a methylated CpG site. Genomic coordinates are based on GRCh38. The figure summarizes findings for 25 cfDNA samples and 21 islet samples.
DETAILED DESCRIPTION
[0017] Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this disclosure belongs.
[0018] All publications, patents, and patent applications mentioned in this specification are herein incorporated by reference to the same extent as if each individual publication, patent, or patent application was specifically and individually indicated to be incorporated by reference in their entirety.
[0019] As used herein and in the appended claims, the singular forms “a,” “an,” and “the” include plural referents unless the context clearly dictates otherwise. By way of example, “an element” means at least one element and can include more than one element.
[0020] Where a range of values is provided, it is understood that each intervening value, between the upper and lower limit of that range and any other stated or intervening value in that stated range is encompassed within the disclosure. The upper and lower limits of these smaller ranges may independently be included in the smaller ranges, and are also encompassed within the disclosure, subject to any specifically excluded limit in the stated range. Where the stated range includes one or both of the limits, ranges excluding either or both of those included limits are also included in the disclosure.
[0021] When a grouping of alternatives is presented, any and all combinations of the members that make up that grouping of alternatives is specifically envisioned. For example, if an item is selected from a group consisting of A, B, C, and D, the inventors specifically envision each alternative individually (e.g., A alone, B alone, etc.), as well as combinations such as A, B, and D; A and C; B and C; etc.
[0022] The term “and/or” when used in a list of two or more items means any one of the listed items by itself or in combination with any one or more of the other listed items. For example, the expression “A and/or B” is intended to mean either or both of A and B - i.e., A alone, B alone, or A and B in combination. The expression “A, B and/or C” is intended to mean A alone, B alone, C alone, A and B in combination, A and C in combination, B and C in combination, or A, B, and C in combination.
General Definitions
[0023] As used herein, the term “substantially”, when used to modify a quality, generally allows a certain degree of variation without that quality being lost. For example, in certain aspects such degree of variation can be less than 0.1%, about 0.1%, about 0.2%, about 0.3%, about 0.4%, about 0.5%, about 0.6%, about 0.7%, about 0.8%, about 0.9%, about 1%, between 1-2%, between 2-3%, between 3-4%, between 4-5%, or greater than 5% or 10%.
[0024] The term “about”, “around” or “approximately”, when modifying the quantity (e.g., mg) of a substance or composition, or the value of a parameter characterizing a step in a method, or the like, refers to variation in the numerical quantity that can occur, for example, through typical measuring, handling, and sampling procedures involved in the preparation, characterization and/or use of the substance or composition; through an inadvertent error in these procedures; through differences in the manufacture, source, or purity of the ingredients employed to make or use the compositions or carry out the procedures; and the like. In certain aspects, “about” can mean a variation of ± 0.1%, ± 0.5%, ± 1%, ± 2%, ± 3%, ± 4%, ± 5%, ± 6%, ± 7%, ± 8%, ± 9% or ± 10%.
[0025] As used herein, the term “phased methylation” refers to the methylation status of different CpG sites that co-occur on the same single-stranded DNA molecule.
[0026] As used herein, the term “read phased methylation” refers to the methylation status of different CpG sites as found in the same NGS read.
[0027] As used herein, the term “phased methylation patterns” or “PMPs” refers to combinations of methylation status of at least 3 CpG sites identified within a given read phased methylation.
PMPs are typically characterized by a list of CpG coordinates, and their methylation status as found in the read phased methylation. PMPs may skip certain methylation sites in a given read phased methylation. In an aspect, a PMP refers to methylation status of 3 CpG sites in a given read phased methylation, where some or all of the 3 CpG sites are intercalated by one or more CpG sites not considered as part of the PMP. PMPs are strand-dependent. This means that a phased methylation pattern observed on the sense strand of the original double-stranded DNA molecule may not necessarily have a corresponding pattern on the antisense strand of the same original double-stranded DNA molecule.
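By way of illustration only, the enumeration of PMPs from a single read phased methylation can be sketched as follows. The input layout (a list of coordinate/status pairs), the function name enumerate_pmps and the optional max_skipped limit are assumptions introduced for this example and are not part of the disclosure; the coordinates are hypothetical.

```python
from itertools import combinations

def enumerate_pmps(read_calls, size=3, max_skipped=None):
    """List the PMPs of a given size carried by one read.

    read_calls: list of (cpg_coordinate, status) pairs sorted by coordinate,
                with status 1 for methylated and 0 for unmethylated.
    max_skipped: optional limit on the number of CpG sites skipped between
                 two consecutive selected sites (None means no limit).
    """
    pmps = []
    for idx in combinations(range(len(read_calls)), size):
        if max_skipped is not None:
            # Number of unselected CpG sites between consecutive selected sites.
            gaps = [b - a - 1 for a, b in zip(idx, idx[1:])]
            if any(g > max_skipped for g in gaps):
                continue
        coords = tuple(read_calls[i][0] for i in idx)
        pattern = "".join(str(read_calls[i][1]) for i in idx)
        pmps.append((coords, pattern))
    return pmps

# Hypothetical read covering four CpG sites.
example_read = [(100, 1), (110, 0), (125, 1), (140, 1)]
for coords, pattern in enumerate_pmps(example_read):
    print(coords, pattern)
```

Because PMPs are strand-dependent, a practical implementation would likely key each pattern on the strand as well as on the coordinates; that detail is omitted here for brevity.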
[0028] As used herein, the term “ultra-specific phased methylation patterns” or “uPMPs” refers to PMPs that are not detected in any cell-free DNA (cfDNA) sample from any donor, but are detected in at least one genomic DNA sample from a tissue or cell-type of interest sourced from at least one donor. In some aspects, uPMPs are detected only in some, but not all, genomic DNA samples from a tissue or cell-type of interest, such as below 5%, 10%, 20%, 30%, 40%, 50%, 60%, 70% or 80% of the samples. In some aspects, a uPMP comprises selected methylation sites intercalated by at least 1, at least 2 or at least 3, but below 10 unselected methylation sites.
[0029] In one embodiment and as an example, uPMPs in Type 1 Diabetes Mellitus (T1DM) are PMPs that are not detected in any cell-free DNA (cfDNA) sample (from a donor), but are detected in at least one pancreatic islet sample (from the same donor). In this context, "not detected" refers to the lack of any read phased methylation carrying a specific PMP, provided that the sequencing depth is at least 1,500x for cfDNA samples (background). Conversely, "detected" refers to the presence of at least one read phased methylation carrying a specific PMP, provided that the sequencing depth is at least 150x.
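The detection rule described above can be expressed, as a minimal sketch, in the following form. The per-sample dictionary layout and the function name is_upmp are assumptions made for this example. The sketch also assumes that a cfDNA sample sequenced below the minimum depth cannot support a "not detected" call, so the candidate is rejected in that case.

```python
def is_upmp(cfdna_samples, tissue_samples,
            min_cfdna_depth=1500, min_tissue_depth=150):
    """Decide whether one PMP qualifies as a uPMP.

    Each sample is a dict with two keys (hypothetical layout):
      "depth"          - sequencing depth at the PMP locus
      "reads_with_pmp" - number of reads carrying the PMP
    """
    # "Not detected": every background sample is deep enough and has no read.
    background_clean = bool(cfdna_samples) and all(
        s["depth"] >= min_cfdna_depth and s["reads_with_pmp"] == 0
        for s in cfdna_samples
    )
    # "Detected": at least one adequately sequenced tissue sample has a read.
    detected = any(
        s["depth"] >= min_tissue_depth and s["reads_with_pmp"] >= 1
        for s in tissue_samples
    )
    return background_clean and detected
```

A single supporting read in any cfDNA sample is sufficient to reject the candidate, which reflects the strict background requirement stated in the definition.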
[0030] As used herein, the term “ultra-specific phased methylation pattern panel” or “uPMP panel” refers to a set of between 2 and 10,000 uPMPs. This panel, which is specific to at least one tissue or cell type, provides a level of sensitivity that exceeds that of an individual uPMP for at least one tissue or cell type. In a typical uPMP panel, each individual uPMP has a specificity close to 100%, and a sensitivity above 0% but below 99%. Collectively, the resulting uPMP panel has a specificity equal to that of the individual markers (i.e., close to 100%), but a sensitivity greater than 70%.
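To illustrate how low-sensitivity markers may add up to a sensitive panel, the following sketch computes a combined sensitivity as one minus the product of the individual miss rates. This formula assumes that markers are detected independently of one another, which is an assumption of this example and is not stated in the disclosure; the sensitivity values used are hypothetical.

```python
import math

def combined_sensitivity(sensitivities):
    """Probability that at least one marker is detected,
    assuming independent detection events (illustrative assumption)."""
    miss_probability = math.prod(1.0 - s for s in sensitivities)
    return 1.0 - miss_probability

# Thirty hypothetical markers, each detected in only 5% of positive samples.
panel = [0.05] * 30
print(f"combined sensitivity: {combined_sensitivity(panel):.3f}")
```

Under this assumption, thirty markers of 5% sensitivity each already exceed a combined sensitivity of 70%. Real markers located in the same region are likely correlated, so the actual gain would be smaller.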
[0031] As used herein, the term “subject” generally refers to an entity or a medium that has testable or detectable genetic information. A subject can be a person, individual, or patient. A subject can be a vertebrate, such as, for example, a mammal. Non-limiting examples of mammals include humans, simians, farm animals, sport animals, rodents, and pets. The subject can be a person that has a condition, disorder or disease or is suspected of having such condition, disorder or disease.
The subject may be displaying a symptom(s) indicative of a health or physiological state or condition of the subject, such as a cancer, autoimmune condition, or other disease, disorder, or condition of the subject. As an alternative, the subject can be asymptomatic with respect to such health or physiological state or condition.
[0032] The term “normal” or “healthy”, as used herein, generally refers to a cell, tissue, plasma, blood, biological sample, or subject not having, and not suspected of having, a condition, disorder or disease.
[0033] As used herein, a “tissue” corresponds to any cells. Different types of tissue may correspond to different types of cells (e.g., liver, lung, pancreas or blood), but also may correspond to tissue from different organisms (mother vs. fetus) or to healthy cells vs. tumor cells. A “biological sample” refers to any sample that is taken from a subject (e.g., a human, such as a pregnant woman, a person with cancer, or a person suspected of having cancer, an organ transplant recipient, or a subject suspected of having a disease process involving an organ (e.g., the pancreas in diabetes, the heart in myocardial infarction, or the brain in stroke)) and contains one or more nucleic acid molecule(s) of interest. The biological sample can be a bodily fluid, such as blood, plasma, serum, urine, vaginal fluid, uterine or vaginal flushing fluids, pleural fluid, ascitic fluid, cerebrospinal fluid, saliva, sweat, tears, sputum, bronchoalveolar lavage fluid, etc. Stool samples can also be used.
[0034] As used herein, the term “sample” generally refers to a biological sample obtained from or derived from one or more subjects. Biological samples may be cell-free biological samples or substantially cell-free biological samples, or may be processed or fractionated to produce cell-free biological samples. For example, cell-free biological samples may include cell-free ribonucleic acid (cfRNA), cell-free deoxyribonucleic acid (cfDNA), cell-free fetal DNA (cffDNA), plasma, serum, urine, saliva, amniotic fluid, and derivatives thereof. Cell-free biological samples may be obtained or derived from subjects using an ethylenediaminetetraacetic acid (EDTA) collection tube, a cell-free RNA collection tube, or a cell-free DNA collection tube. Cell-free biological samples may be derived from whole blood samples by fractionation. Biological samples or derivatives thereof may contain cells. For example, a biological sample may be a blood sample or a derivative thereof (e.g., blood collected by a collection tube or blood drops).
[0035] As used herein, the term “cell-free nucleic acid” refers to any extracellular nucleic acid that is not attached to a cell. A cell-free nucleic acid can be a nucleic acid circulating in blood. Alternatively, a cell-free nucleic acid can be a nucleic acid in other bodily fluid disclosed herein, e.g., urine. A cell-free nucleic acid can be a deoxyribonucleic acid (“DNA”), e.g., genomic DNA, mitochondrial DNA, or a fragment thereof. A cell-free nucleic acid can be a ribonucleic acid (“RNA”), e.g., mRNA, short-interfering RNA (siRNA), microRNA (miRNA), circulating RNA (cRNA), transfer RNA (tRNA), ribosomal RNA (rRNA), small nucleolar RNA (snoRNA), Piwi-interacting RNA (piRNA), long non-coding RNA (long ncRNA), or a fragment thereof. In some cases, a cell-free nucleic acid is a DNA/RNA hybrid. A cell-free nucleic acid can be double-stranded, single-stranded, or a hybrid thereof. A cell-free nucleic acid can be released into bodily fluid through secretion or cell death processes, e.g., cellular necrosis and apoptosis.
[0036] A cell-free nucleic acid can comprise one or more epigenetic modifications. For example, a cell-free nucleic acid can be acetylated, methylated, ubiquitylated, phosphorylated, sumoylated, ribosylated, and/or citrullinated. For example, a cell-free nucleic acid can be methylated cell-free DNA.
[0037] As used herein, the term “bisulfite treatment” refers to the treatment of DNA with bisulfite or a salt thereof, such as sodium bisulfite (NaHSO3). Bisulfite reacts readily with the 5,6-double bond of cytosine, but poorly with methylated cytosine. Cytosine reacts with the bisulfite ion to form a sulfonated cytosine reaction intermediate which is susceptible to deamination, giving rise to a sulfonated uracil. The sulfonate group can be removed under alkaline conditions, resulting in the formation of uracil. Uracil is recognized as a thymine by polymerases and amplification will result in an adenine-thymine base pair instead of a cytosine-guanine base pair.
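The net effect of conversion followed by amplification on a single strand can be simulated with the short sketch below, in which every unmethylated cytosine is read as thymine while methylated cytosines are left unchanged. The function name convert_read and the use of 0-based positions are assumptions made for this example, and the sequence is hypothetical.

```python
def convert_read(seq, methylated_positions):
    """Simulate conversion plus amplification on one strand.

    seq: DNA sequence of the strand (5' to 3').
    methylated_positions: 0-based indices of cytosines that are methylated
                          and therefore protected from conversion.
    """
    methylated = set(methylated_positions)
    converted = []
    for i, base in enumerate(seq.upper()):
        if base == "C" and i not in methylated:
            converted.append("T")  # unmethylated C -> U, read as T
        else:
            converted.append(base)
    return "".join(converted)

# Hypothetical strand with a methylated cytosine at index 1.
print(convert_read("ACGTCCGA", methylated_positions=[1]))
```

Comparing the converted read with the reference at each CpG position then yields the binary methylation status used in the patterns: a retained C indicates methylation and a T indicates no methylation.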
[0038] The term “genomic region”, as used herein, generally refers to identified regions of nucleic acid that are identified by their location in the chromosome. In some examples, the genomic regions are referred to by a gene name and encompass coding and non-coding regions associated with that physical region of nucleic acid. As used herein, a gene comprises coding regions (exons), non-coding regions (introns), transcriptional control or other regulatory regions, and promoters. In another example, the genomic region may incorporate an intron or exon or an intron/exon boundary within a named gene.
[0039] As used herein, a “site” corresponds to a single site, which may be a single base position or a group of correlated base positions, e.g., a CpG site.
[0040] As used herein, the term “CpG sites” refers to regions of DNA where a cytosine nucleotide is followed by a guanine nucleotide in the linear sequence of bases along its 5' → 3' direction. CpG sites occur with high frequency in genomic regions called CpG islands.
[0041] The term “CpG islands”, as used herein, generally refers to a contiguous region of genomic DNA that satisfies the criteria of (1) having a frequency of CpG dinucleotides corresponding to an “Observed/Expected Ratio” greater than about 0.6, and (2) having a “GC Content” greater than about 0.5. CpG islands may be between about 0.2 to about 3 kilobases (kb) in length having a high frequency of CpG sites. CpG islands may be found at or near promoters of about 40% of mammalian genes. CpG islands may also be found outside of mammalian genes. In some examples, CpG islands are found in exons, introns, promoters, enhancers, inhibitors, and transcriptional regulatory elements. CpG islands may tend to occur upstream of so-called “housekeeping genes”. CpG islands may have a CpG dinucleotide content of at least about 60% of what would be statistically expected. The occurrence of CpG islands at or upstream of the 5' end of genes may reflect a role in the regulation of transcription.
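The two numerical criteria above can be evaluated on a sequence as sketched below. The observed/expected ratio is computed here with the commonly used formula (number of CpG dinucleotides multiplied by the sequence length, divided by the product of the C count and the G count); the disclosure does not spell out a formula, so this choice, together with the function names, is an assumption of this example. The length criterion is not checked.

```python
def cpg_island_metrics(seq):
    """Return (gc_content, observed_expected_ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c_count = seq.count("C")
    g_count = seq.count("G")
    cpg_count = seq.count("CG")
    gc_content = (c_count + g_count) / n if n else 0.0
    # Expected CpG count is (C * G) / N, so observed/expected = CpG * N / (C * G).
    if c_count and g_count:
        obs_exp = (cpg_count * n) / (c_count * g_count)
    else:
        obs_exp = 0.0
    return gc_content, obs_exp

def meets_island_criteria(seq, min_obs_exp=0.6, min_gc=0.5):
    """Check the two thresholds given in the definition above."""
    gc_content, obs_exp = cpg_island_metrics(seq)
    return obs_exp > min_obs_exp and gc_content > min_gc
```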
[0042] The term “hypermethylation”, as used herein, generally refers to the average methylation state corresponding to an increased presence of 5-mC at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mC found at corresponding CpG dinucleotides within a normal control DNA sample. In an aspect, a uPMP exhibits hypermethylation.
[0043] The term “hypomethylation”, as used herein, generally refers to the average methylation state corresponding to a decreased presence of 5-mC at one or a plurality of CpG dinucleotides within a DNA sequence of a test DNA sample, relative to the amount of 5-mC found at corresponding CpG dinucleotides within a normal control DNA sample. In an aspect, a uPMP exhibits hypomethylation.
[0044] The term “methylation state” or “methylation status”, as used herein, generally refers to the presence or absence of 5-methylcytosine (“5-mC”) at one or a plurality of CpG dinucleotides within a DNA sequence. Methylation states at one or more particular palindromic CpG methylation sites (each having two CpG dinucleotide sequences) within a DNA sequence include “unmethylated,” “fully-methylated”, and “hemi-methylated.”
[0045] The term “methylated cytosine”, as used herein, generally refers to any methylated forms of the nucleic acid base cytosine that contains a methyl or hydroxymethyl functional group at the 5 position. Methylated cytosines may be regulators of gene transcription in genomic DNA. This term may include 5-methylcytosine and 5-hydroxymethylcytosine.
[0046] The term “methylation assay” refers to any assay for determining the methylation state of one or more CpG dinucleotide sequences within a sequence of DNA.
[0047] The term “methylation converted” or “converted” nucleic acid, as used herein, generally refers to nucleic acid, such as for example DNA, that has undergone a process used to convert the DNA for methylation sequencing. Examples of conversion processes include reagent-based (such as bisulfite) conversion, enzymatic conversion, or combination conversion (such as TAPS conversion) where unmethylated cytosines are converted into uracil prior to PCR amplification or sequencing. The conversion process may be used in methyl sequencing methods to distinguish between methylated and unmethylated cytosine bases. Exemplary enzymes for enzymatically converting cytosine include DNA methyltransferase, Uracil-DNA glycosylase (UDG), and G/T mismatch-specific endonuclease (GTSCE).
[0048] As used herein, in next-generation sequencing (NGS), sequencing depth refers to the number of times a specific nucleotide in a target region of a genome is read or sequenced. It is a measure of how extensively a particular DNA or RNA molecule is sampled during the sequencing process. Sequencing depth is commonly expressed as "X coverage" or "X-fold coverage," where X represents the average number of times a given nucleotide is sequenced. For example, if a genome has a sequencing depth of 30X, it means that, on average, each nucleotide in the genome has been sequenced 30 times.
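As a worked illustration of this definition, the average depth over a target region can be computed as the total number of sequenced bases overlapping the region divided by the region length. The half-open interval convention and the function name mean_depth are assumptions of this sketch.

```python
def mean_depth(reads, region_start, region_end):
    """Average depth over a region.

    reads: list of (start, end) alignment intervals, half-open.
    region_start, region_end: half-open target region on the same contig.
    """
    region_length = region_end - region_start
    if region_length <= 0:
        raise ValueError("region must have positive length")
    covered_bases = 0
    for start, end in reads:
        # Count only the part of the read that falls inside the region.
        overlap = min(end, region_end) - max(start, region_start)
        if overlap > 0:
            covered_bases += overlap
    return covered_bases / region_length

# Thirty hypothetical reads, each spanning a 100-base region entirely.
print(mean_depth([(0, 100)] * 30, 0, 100))
```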
[0049] In an aspect, the present disclosure provides methods that use a panel of uPMPs useful for the analysis of methylation within a region or gene. Other aspects provide novel uses of the region, gene, and the gene product as well as methods, assays, and kits directed to detecting, differentiating, and distinguishing certain conditions or disorders, e.g., autoimmune diseases. One particular use is the early detection of autoimmune conditions.
[0050] In another aspect, the present disclosure describes processes and methods for the selection of individual ultra-specific phased methylation patterns (uPMPs). This selection process involves the analysis and comparison of PMPs across multiple samples. These samples represent tissues or cell types of interest from multiple subjects and heterogeneous mixtures such as cell-free DNA (cfDNA) samples from multiple healthy donors.
[0051] In a further aspect of the invention, the present disclosure describes methods comprising (a) obtaining a set of single-stranded target nucleic acids wherein unmethylated cytosines are converted into uracil moieties, (b) obtaining a uPMP panel comprising primers for at least 2, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450 or 500 uPMPs, and (c) applying the uPMP panel to the set of single-stranded target nucleic acids to assess the presence or absence of nucleic acid molecules originating from a tissue or cell-type of interest.
[0052] In some embodiments of the invention of the present disclosure, the uPMP panel comprises one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PAK3, MGA, PNLPR1, PLA2G1B, AQP12A, IL23R, CUZD1, TEX11, CELX1, AQP12B, GPR119, SLC38A5, PRSS2, PRSS1, AMY2A, CTRC and SPINK1.
[0053] In other aspects of the invention, the present disclosure describes compositions which may comprise a set of single-stranded targets wherein unmethylated cytosines are converted into uracil moieties; a set of forward primers, each flanking the 5'-end of at least one uPMP; a set of reverse primers, each flanking the 3'-end of at least one uPMP; and enzymes and buffer for PCR amplification.
[0054] In some embodiments, the set of forward primers and the set of reverse primers flank one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PAK3, MGA, PNLPR1, PLA2G1B, AQP12A, IL23R, CUZD1, TEX11, CELX1, AQP12B, GPR119, SLC38A5, PRSS2, PRSS1, AMY2A, CTRC and SPINK1. In some such embodiments, the set of forward primers and/or the set of reverse primers have a set size of at least or at most 2, 5, 10, 15, 20, 30, 50, 100, 150, 250, 300, 450, or 500.
[0055] It will be readily apparent to those skilled in the art that other suitable modifications and adaptations of the devices, systems and methods described herein may be made using suitable equivalents without departing from the scope of the aspects disclosed herein. Having now described certain aspects in detail, the same will be more clearly understood by reference to the following example, which is included for purposes of illustration only and is not intended to be limiting. All patents, patent applications, and references described herein are incorporated by reference in their entirety for all purposes.
EXAMPLES
Example 1: Identification of a uPMP panel associated with early onset of Type 1 Diabetes Mellitus (T1DM)
[0056] A set of ultra-specific phased methylation patterns (uPMP) associated with early onset T1DM were identified through deep sequencing of healthy donors' cfDNA (N=23) and DNA from islet and beta cells derived from different donors (N=19). Figure 1 explains the concept behind phased methylation, read phased methylation, and phased methylation patterns (PMP).
[0057] A targeted hybridization-capture methylation panel was devised. This targeted panel included a set of probes covering approximately 3 Mb across 80 regions of interest. Each region was defined as a full gene, plus or minus 3 kb, flanking the upstream/downstream coordinates of the gene's start and end points.
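The region definition above can be sketched as a simple padding of gene coordinates. The function names, the tuple layout and the example coordinates are hypothetical and are used only for illustration.

```python
def pad_region(chrom, gene_start, gene_end, flank=3000):
    """Extend a gene interval by a fixed flank on both sides,
    clamping the start at the beginning of the chromosome."""
    return (chrom, max(0, gene_start - flank), gene_end + flank)

def total_target_size(regions):
    """Sum of region lengths (assumes the regions do not overlap)."""
    return sum(end - start for _, start, end in regions)

# Hypothetical gene coordinates.
regions = [pad_region("chr11", 5000, 9000), pad_region("chr1", 1000, 2000)]
print(regions, total_target_size(regions))
```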
[0058] The list of genes to be targeted was selected based on publicly available methylation and proteomic data. Initially, genes expressed in a tissue of interest at a rate four times greater than in other tissues, or in a group of 2-5 tissues compared to any other tissue, were chosen. After initial gene selection, publicly available methylation data was used to identify and prioritize genes exhibiting higher differential methylation at CpG sites.
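A minimal sketch of the first selection step is given below. It interprets "four times greater" as an expression level at least four times the highest level observed in any other tissue, which is an assumption of this example; the nested-dictionary layout, the function name select_genes and the gene labels are likewise hypothetical. The variant based on a group of 2 to 5 tissues is not covered.

```python
def select_genes(expression, tissue, fold=4.0):
    """Select genes enriched in one tissue.

    expression: {gene: {tissue_name: expression_level}}
    tissue: name of the tissue of interest
    fold: minimum ratio over every other tissue
    """
    selected = []
    for gene, levels in expression.items():
        target = levels.get(tissue, 0.0)
        others = [v for name, v in levels.items() if name != tissue]
        if others and target > 0 and all(target >= fold * v for v in others):
            selected.append(gene)
    return selected

# Hypothetical expression table.
table = {
    "GENE_A": {"islet": 40.0, "liver": 5.0, "blood": 2.0},
    "GENE_B": {"islet": 10.0, "liver": 6.0, "blood": 1.0},
}
print(select_genes(table, "islet"))
```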
[0059] Capture probes and reagents were ordered from Twist Bioscience (San Francisco, USA). Approximately 25,000 sets of probes were synthesized and delivered as a pool, each set comprising eight probes of 120nt in length: two probes for each parental strand, either fully or partially converted.
[0060] cfDNA was extracted from plasma using a SPRI-based kit (Beckman Coulter, USA), following the manufacturer's instructions, and stored at -20°C until use. Islet DNA was extracted using a column-based kit (NEB, USA), fragmented to a median length of approximately 220 bp using sonication (Covaris, USA), and stored at -20°C until use. Libraries were prepared using EM-Seq (NEB, USA), following the manufacturer's instructions. Briefly, the DNA fragments were end-repaired and ligated with sequencing adapters. After purification, the samples underwent oxidation of 5-methylcytosines and 5-hydroxymethylcytosines, followed by a second purification. The samples were then treated with the APOBEC enzyme to convert unmethylated cytosine into uracil. After another round of purification, the fragments were amplified using NGS indexed primers. The product was purified, normalized, and used as input in the capture reaction. Finally, the eluate from the capture reaction was amplified, purified, and sequenced at 1,500x and 150x sequencing depths for cfDNA and islet samples, respectively.
[0061] NGS data from each sample were aligned to the reference genome using commonly used methylation aligners (Meth-BWA). Subsequently, a custom software tool was used to determine the phased methylation patterns. Finally, specific phased methylation patterns of the islets, as compared to the cfDNA, were identified in agreement with our definitions (Figure 1).
[0062] A total of approximately 1,000,000 candidate markers, spanning across 87 kb, were identified by comparing methylation patterns from islets and cfDNA samples (Figure 3). An exemplary islet uPMP is found around the KIRREL2 gene (Figure 4). Further exemplary islet uPMPs were identified around the INS gene (Figure 5).
Example 2: Validation of candidate PMPs
[0063] The uPMPs identified in Example 1 were validated at higher sequencing depths. For this, a subset of the initial hybridization capture panel was used, comprising a set of probes targeting approximately 300 Kb across 80 regions of interest. Capture probes and reagents were ordered from Twist Bioscience (San Francisco, USA). cfDNA samples were processed as previously described and sequenced at 30,000x. Islets from 19 samples were combined, and alpha cells and beta cells were separated using FACS. Next, the DNA from the different cell types was extracted and processed as above and sequenced at 150x.
[0064] Validated beta-cell markers were identified as uPMPs not observed in any of the cfDNA samples at 30,000x sequencing depth or alpha cells at 150x sequencing depth, but were observed in beta-cells at 150x sequencing separated from the islet samples.
[0065] Validated alpha-cell markers were identified as uPMPs that were not observed in any of the cfDNA samples at 30,000x sequencing depth or beta-cells at 150x sequencing depth, but were observed in alpha-cells at 150x sequencing separated from the islet samples.
[0066] Validated pancreas distress markers were identified as uPMPs that were not observed in any of the cfDNA samples at 30,000x sequencing depth or beta-cells at 150x sequencing depth but were observed in islet samples or in alpha-cells.
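The three validation rules above can be summarized in the following sketch, which assigns zero or more labels to a candidate based on where it was observed. The boolean inputs and the function name classify_marker are assumptions of this example. Because the rules as written are not mutually exclusive, a candidate may receive more than one label.

```python
def classify_marker(in_cfdna, in_alpha, in_beta, in_islet):
    """Return the set of validation labels for one candidate uPMP.

    in_cfdna: observed in any cfDNA sample at 30,000x
    in_alpha: observed in alpha cells at 150x
    in_beta:  observed in beta cells at 150x
    in_islet: observed in islet samples
    """
    labels = set()
    if in_cfdna:
        return labels  # any background signal disqualifies the candidate
    if in_beta and not in_alpha:
        labels.add("beta-cell")
    if in_alpha and not in_beta:
        labels.add("alpha-cell")
    if not in_beta and (in_islet or in_alpha):
        labels.add("pancreas distress")
    return labels
```

For example, a candidate seen only in alpha cells satisfies both the alpha-cell rule and the pancreas distress rule.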
Example 3: Amplicon-based targeted uPMP panel
[0067] A targeted amplicon-based panel for T1DM uPMPs was devised comprising at least 20 beta-cell uPMPs and 10 alpha-cell or pancreas distress uPMPs.
Example 4: Ultra-specific phased methylation patterns for interrogation of samples and discrimination between tissue types.
[0068] Without being bound by theory and for exemplary purposes, biomarkers were defined as regions of fixed length (e.g., 120 nt) that do not overlap, characterized by a uPMP of size 3 (comprising three CpG sites) that is 100% specific and highly sensitive. As shown in Figure 6A, this methylation pattern was present in most of the 21 samples of an islet panel. The islet panel targeted 80 genes across 3 Mb, covering 60 thousand CpG sites (30 thousand on each strand), and queried 5,754 potential biomarkers. Of these biomarkers, 1,701 were not represented (0 reads) in 25 cfDNA samples, and 1,333 were found only in islets (see Figure 6B). The distribution of biomarker homology across the genes was also interrogated. Examining the best-sensitivity uPMPs at the gene level indicated that most genes showed a bias towards heterogeneous uPMPs. The observed distribution appears bimodal, where genes either have both heterogeneous and homogeneous patterns or only heterogeneous patterns (see Figure 6C). Highly specific, low-sensitivity markers can be combined to create a highly sensitive panel. The cumulative function of 1,300 biomarkers demonstrates that, despite low individual sensitivity, a large population of highly specific markers, when considered together, can achieve a combined sensitivity more than 200 times greater than that of a single ultra-specific 100% sensitive marker (see Figure 6D).
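One simple way to picture how markers accumulate is an empirical coverage curve: markers are added one at a time and, after each addition, the fraction of positive samples in which at least one marker was detected is recorded. This is an illustrative construction only and is not asserted to be the cumulative function plotted in Figure 6D; the data layout and the function name cumulative_sensitivity are assumptions of this sketch.

```python
def cumulative_sensitivity(marker_hits, n_samples):
    """Fraction of samples covered as markers are added in order.

    marker_hits: list of sets, one per marker, holding the indices of
                 the samples in which that marker was detected.
    n_samples: total number of positive samples.
    """
    covered = set()
    curve = []
    for hits in marker_hits:
        covered |= hits
        curve.append(len(covered) / n_samples)
    return curve

# Four hypothetical markers observed across four hypothetical samples.
print(cumulative_sensitivity([{0}, {1, 2}, {2}, {3}], n_samples=4))
```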
Example 5: Ultra-specific phased methylation patterns for genes INS, PDX1, and TECPR1.
[0069] Ultra-specific phased methylation patterns of size 3 (uPMPs) for genes INS, PDX1, and TECPR1 were identified and summarized for 25 cfDNA samples and 21 islet samples. The identified methylation patterns were represented using binary code: “0” for an unmethylated CpG site and “1” for a methylated CpG site.
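The binary representation described above can be generated programmatically, as in the following sketch. The function names are assumptions of this example. For a pattern of size 3 there are 2^3 = 8 possible patterns.

```python
def all_patterns(size=3):
    """All binary methylation patterns of a given size, from all-0 to all-1."""
    return [format(i, f"0{size}b") for i in range(2 ** size)]

def encode_pattern(statuses):
    """Encode per-CpG methylation calls (truthy = methylated) as a string."""
    return "".join("1" if s else "0" for s in statuses)

print(all_patterns(3))
print(encode_pattern([True, False, False]))
```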
[0070] The INS gene was shown to harbor a uPMP at coordinates chr11: 2157011, 2157037, 2157072, showing binary pattern “100”, absent in cfDNA samples but present in 66% of islet samples (see Figure 7, top left and top right panels).
[0071] Similarly, PDX1 at chr13: 27926732, 27926745, 27926759 showed the binary pattern “101”, absent in cfDNA samples, but found in 90% of islet samples. The binary pattern “010” for TECPR1 at chr7: 98217694, 98217700, 98217712 was found in all islet samples but none of the cfDNA samples.
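The percentages reported for each marker follow from per-sample detection calls. As a sketch, sensitivity is the fraction of tissue samples in which the pattern is detected, and specificity is the fraction of cfDNA samples in which it is not detected. The function name and the list-of-booleans layout are assumptions of this example; the counts used below (14 of 21 islet samples, 0 of 25 cfDNA samples) are chosen to mirror the sample sizes and the upper sensitivity figure mentioned in this disclosure.

```python
def marker_performance(tissue_detected, cfdna_detected):
    """Return (sensitivity, specificity) for one marker.

    tissue_detected: one boolean per tissue sample (True = pattern detected)
    cfdna_detected:  one boolean per cfDNA sample (True = pattern detected)
    """
    sensitivity = sum(tissue_detected) / len(tissue_detected)
    specificity = 1 - sum(cfdna_detected) / len(cfdna_detected)
    return sensitivity, specificity

islet_calls = [True] * 14 + [False] * 7   # detected in 14 of 21 islet samples
cfdna_calls = [False] * 25                # detected in none of 25 cfDNA samples
sens, spec = marker_performance(islet_calls, cfdna_calls)
print(f"sensitivity={sens:.1%}, specificity={spec:.0%}")
```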
Claims
1. A method for identifying one or more ultra-specific phased methylation patterns (uPMPs) for a tissue or cell-type of interest, comprising: a. obtaining a set of cell-free DNA (cfDNA) samples from each subject in a first group of subjects; b. obtaining a set of genomic DNA samples from the tissue or cell-type of interest from each subject in the first group of subjects and/or a second group of subjects; c. providing conditions capable of converting unmethylated cytosines to uracils in nucleic acid molecules of the set of cfDNA samples and the set of genomic DNA samples to generate a set of converted cfDNA samples and a set of converted genomic DNA samples; d. selecting a set of target genomic regions; e. capturing the set of target genomic regions from the set of converted cfDNA samples and the set of converted genomic DNA samples to generate a set of captured cfDNA libraries and a set of captured genomic DNA libraries; f. subjecting the set of captured cfDNA libraries to sequencing at a depth that is at least 0.1X, 0.5X, 1X, 10X, 100X, 1,000X, 10,000X or 100,000X; g. subjecting the set of captured genomic DNA libraries to sequencing at a depth that is at least 0.1X, 0.5X, 1X, 10X, 100X, 1,000X, 10,000X or 100,000X; h. determining phased methylation patterns (PMPs) in the set of captured cfDNA libraries and the set of captured genomic DNA libraries; and i. identifying one or more uPMPs for the tissue or cell-type of interest, wherein the one or more uPMPs are detected in at least one library of the set of captured genomic DNA libraries, but not in any library of the set of captured cfDNA libraries.
2. The method of claim 1, wherein the sequencing depth for the set of captured cfDNA libraries is at least 2, 3, 4, 5, 7, 9, 10, 15, 20, 25, 30, 40, 50, 100, 500, or 1000 times the sequencing depth for the set of captured genomic DNA libraries.
3. The method of claim 1 or 2, wherein the set of target genomic regions each comprises a genic region or portion thereof, of a gene differentially expressed in the tissue or cell-type of interest relative to (i) at least one other tissue or cell-type or (ii) all other tissues or cell-types, optionally the genic region is selected from the group consisting of an upstream regulatory element, a downstream regulatory element, a promoter and an enhancer.
4. The method of claim 3, wherein the gene is expressed in (i) the tissue or cell-type of interest or (ii) a group of 2 to 5 tissues or cell-types comprising the tissue or cell-type of interest, in each of cases (i) and (ii), at a rate 2, 3, 4, 5, 6, 7, 8, 9 or 10 times greater than in other tissues or cell-types.
5. The method of claim 3 or 4, wherein the genic region or portion thereof is enriched for CpG islands.
6. The method of claim 3 or 4, wherein the genic region or portion thereof exhibits higher differential methylation at one or more CpG sites in the tissue or cell-type of interest compared to (i) at least one other tissue or cell-type or (ii) all other tissues or cell-types.
7. The method of any of claims 1 to 4, wherein the first group of subjects and/or a second group of subjects are healthy individuals.
8. The method of any of claims 1 to 7, wherein the set of captured cfDNA libraries and the set of captured genomic DNA libraries are each sequenced in pooled format.
9. The method of any of claims 1 to 7, wherein the set of captured cfDNA libraries and the set of captured genomic DNA libraries are each barcoded, pooled and sequenced in a single sequencing run.
10. The method of any of claims 1 to 9, wherein the sequencing depth for the set of captured cfDNA libraries is at least 500x, 1000x, 1500x, 2000x, 3000x, 4000x, 5000x, 10000x, 20000x or 50000x coverage.
11. The method of any of claims 1 to 9, wherein the sequencing depth for the set of captured cfDNA libraries is at most 500x, 1000x, 1500x, 2000x, 3000x, 4000x, 5000x, 10000x, 20000x or 50000x coverage.
12. The method of any of claims 1 to 11, wherein the sequencing depth for the set of captured genomic DNA libraries is at most 50x, 100x, 150x, 200x, 300x, 400x, 500x, 1000x, 2000x or 5000x coverage.
13. The method of any of claims 1 to 11, wherein the sequencing depth for the set of captured genomic DNA libraries is at least 50x, 100x, 150x, 200x, 300x, 400x, 500x, 1000x, 2000x or 5000x coverage.
14. The method of any of the preceding claims, wherein the cell-free DNA (cfDNA) samples are obtained from a biological sample selected from the group consisting of a bodily fluid, blood, plasma, serum, urine, vaginal fluid, uterine or vaginal flushing fluids, pleural fluid, ascitic fluid, cerebrospinal fluid, saliva, sweat, tears, sputum, and bronchoalveolar lavage fluid.
15. The method of any of the preceding claims, wherein the tissue or cell-type of interest is selected from the group consisting of: pancreatic islet, alpha cells, beta cells, delta cells, gamma cells, epsilon cells, pancreatic polypeptide cells, ductal cells, and acinar cells.
16. The method of any of the preceding claims, wherein the conversion of unmethylated cytosines to uracils is via an enzymatic or chemical method.
17. The method of any of the preceding claims, further comprising: a. selecting a set of at least 2, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450 or 500 uPMPs to construct a uPMP panel for assessing in a cfDNA sample the presence or absence of cfDNA originating from the tissue or cell-type of interest.
18. A composition comprising: a. a set of single-stranded targets wherein unmethylated cytosines are converted into uracil moieties; b. a set of forward primers, each flanking the 5'-end of at least one uPMP; c. a set of reverse primers, each flanking the 3'-end of at least one uPMP; and d. enzymes and buffer for PCR amplification.
19. The composition of claim 18, wherein the set of forward primers and/or the set of reverse primers have a set size of 2, 5, 10, 15, 20, 30, 50, 100, 150, 250, 300, 450, or 500.
20. A composition comprising: a. a set of single-stranded targets wherein unmethylated cytosines are converted into uracil; and b. a set of capture probes, each being a reverse complement to at least one uPMP.
21. A method for identifying one or more ultra-specific phased methylation patterns (uPMPs) for a tissue or cell-type of interest, comprising: a. obtaining a set of cell-free nucleic acid (cfNA) samples from each subject in a first group of subjects; b. obtaining a set of nucleic acid (NA) samples from the tissue or cell-type of interest from each subject in the first group of subjects and/or a second group of subjects; c. selecting a set of target nucleic acid sequence regions; d. capturing the set of target nucleic acid sequence regions from the set of cfNA samples and the set of NA samples to generate a set of captured cfNA libraries and a set of captured NA libraries; e. subjecting the set of captured cfNA libraries and the set of captured NA libraries to methylation sequencing to different sequencing depths, wherein the sequencing depth for the set of captured cfNA libraries is at least 2, 3, 4, 5, 7, 9, 10, 15, 20, 25, 30, 40, 50, 100, 500, or 1000 times the sequencing depth for the set of captured NA libraries; f. determining phased methylation patterns (PMPs) in the set of captured cfNA libraries and the set of captured NA libraries; and g. identifying one or more uPMPs for the tissue or cell-type of interest, wherein the one or more uPMPs are detected in at least one library of the set of captured NA libraries, but not in any library of the set of captured cfNA libraries.
22. A method comprising: (a) obtaining a set of single-stranded target nucleic acids wherein unmethylated cytosines are converted into uracil moieties, (b) obtaining a uPMP panel comprising primers for at least 2, 5, 10, 15, 20, 30, 50, 100, 150, 200, 250, 300, 350, 400, 450 or 500 uPMPs, and (c) applying the uPMP panel to the set of single-stranded target nucleic acids to assess the presence or absence of nucleic acid molecules originating from a tissue or cell-type of interest.
23. The method of claim 22, wherein the uPMP panel comprises one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PAK3, MGA, PNLPR1, PLA2G1B, AQP12A, IL23R, CUZD1, TEX11, CELX1, AQP12B, GPR119, SLC38A5, PRSS2, PRSS1, AMY2A, CTRC and SPINK1.
24. The composition of claim 18, wherein the set of forward primers and the set of reverse primers flank one or more uPMPs associated with at least 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 12, 15, 20, 25 or 30 genes or loci selected from the group consisting of MX1, TECPR1, PDK1, NR5A2, ONCUT1, CCNI1, INS1, IGF2, NPHS1, PGGHG, OTAM, SLC30A2, IFTM1, LBX1, ALDH1L2, CC2D1B, KCNK6, SLC1A2, SEL1L, PDE1A, GAPL1, LC16M2, CTRL, SH3GL2B, PY, TIME1, RBPS, RPXL, CEL, ELA2B, PKD1, LEFTY1, ATSPERB, FFAR1, PRDX4, TLRB1, CYB2, SLC30A8, SCYN, GTR2, CELB, OTAM, PNLIP, REGIA, AMY2A, CTRC, CTRB2, PAK3, MGA, PNLPR1, PLA2G1B, AQP12A, IL23R, CUZD1, TEX11, CELX1, AQP12B, GPR119, SLC38A5, PRSS2, PRSS1, AMY2A, CTRC and SPINK1.
25. The composition of claim 18, wherein the set of forward primers and/or the set of reverse primers have a set size of at least or at most 2, 5, 10, 15, 20, 30, 50, 100, 150, 250, 300, 450, or 500.
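The selection logic recited in claims 1 and 2 (a pattern qualifies as a uPMP when it is seen in at least one tissue-derived library and in no cfDNA library, with the cfDNA libraries sequenced more deeply) can be illustrated with a minimal Python sketch. Everything here is an assumption made for illustration and is not taken from the application: the representation of a PMP as a `(region, pattern)` pair, the `M`/`U` per-CpG encoding, the `min_cpgs` filter, the function names, and the placeholder region names.

```python
from typing import Iterable, List, Set, Tuple

# Hypothetical representation: a phased methylation pattern (PMP) is a
# (region, pattern) pair, where pattern holds one character per CpG on a
# single read: "M" = methylated, "U" = unmethylated.
PMP = Tuple[str, str]


def call_pmps(reads: Iterable[PMP], min_cpgs: int = 3) -> Set[PMP]:
    """Collect the distinct patterns in one library.

    Reads covering fewer than `min_cpgs` CpG sites are dropped; this
    threshold is an illustrative assumption, not a claimed value.
    """
    return {(region, pattern) for region, pattern in reads
            if len(pattern) >= min_cpgs}


def depth_ratio_ok(cfdna_depth: float, genomic_depth: float,
                   min_fold: float = 2.0) -> bool:
    """Check that cfDNA depth is at least `min_fold` times genomic depth."""
    return genomic_depth > 0 and cfdna_depth >= min_fold * genomic_depth


def identify_upmps(genomic_libs: List[Set[PMP]],
                   cfdna_libs: List[Set[PMP]]) -> Set[PMP]:
    """Return patterns seen in >= 1 genomic library and in no cfDNA library."""
    seen_in_tissue = set().union(*genomic_libs)
    seen_in_cfdna = set().union(*cfdna_libs)
    return seen_in_tissue - seen_in_cfdna


if __name__ == "__main__":
    # Toy libraries with placeholder region names.
    genomic = [
        call_pmps([("region_A", "UUUU"), ("region_A", "MMMM")]),
        call_pmps([("region_A", "UUUU"), ("region_B", "MU")]),
    ]
    cfdna = [
        call_pmps([("region_A", "MMMM")]),
        call_pmps([("region_A", "MMUM")]),
    ]
    if depth_ratio_ok(cfdna_depth=5000, genomic_depth=500, min_fold=10):
        print(sorted(identify_upmps(genomic, cfdna)))
```

Pooling the cfDNA libraries into a single background set mirrors the "not in any library" condition: one observation in any cfDNA library is enough to disqualify a pattern, which is why deeper cfDNA sequencing makes the filter stricter.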
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202363517813P | 2023-08-04 | 2023-08-04 | |
US63/517,813 | 2023-08-04 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2025034580A1 (en) | 2025-02-13 |
Family
ID=94535087
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2024/040801 WO2025034580A1 (en) | 2023-08-04 | 2024-08-02 | Phase methylation-based markers for tissue and cell-type-specific identification and monitoring |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2025034580A1 (en) |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20170121767A1 (en) * | 2014-04-14 | 2017-05-04 | Yissum Research Development Company Of The Hebrew University Of Jerusalem Ltd. | Method and kit for determining the tissue or cell origin of dna |
US20190106737A1 (en) * | 2017-09-20 | 2019-04-11 | University Of Utah Research Foundation | Size-Selection of Cell-Free DNA for Increasing Family Size During Next-Generation Sequencing |
US20230170052A1 (en) * | 2020-04-30 | 2023-06-01 | The University Of Sydney | Method and system for processing genomic data |
- 2024-08-02 WO PCT/US2024/040801 patent/WO2025034580A1/en unknown
Similar Documents
Publication | Publication Date | Title |
---|---|---|
KR102028375B1 (en) | Systems and methods to detect rare mutations and copy number variation | |
EP2885427B1 (en) | Colorectal cancer methylation marker | |
CN106687600B (en) | Methods for methylation analysis | |
AU2010206163B2 (en) | Determination of the degree of DNA methylation | |
JP2018533953A (en) | Detection of fetal chromosomal aneuploidy using DNA regions that are differentially methylated between fetuses and pregnant women | |
EP2773771B1 (en) | Methods for the detection, visualization and high resolution physical mapping of genomic rearrangements in breast and ovarian cancer genes and loci brca1 and brca2 using genomic morse code in conjunction with molecular combing | |
KR20230005927A (en) | Tumor Detection Reagents and Kits | |
CN114667355B (en) | Methods for detecting colorectal cancer | |
KR102409747B1 (en) | Composition for predicting or diagnosing obesity using methylation level of SNX20 gene and method for providing information therefore | |
US20220364173A1 (en) | Methods and systems for detection of nucleic acid modifications | |
US20220170112A1 (en) | Methods and compositions for lung cancer detection | |
WO2025034580A1 (en) | Phase methylation-based markers for tissue and cell-type-specific identification and monitoring | |
JP7447155B2 (en) | Method for detecting methylation of SDC2 gene | |
US20100311609A1 (en) | Methylation Profile of Neuroinflammatory Demyelinating Diseases | |
KR102085669B1 (en) | Method for providing information of prediction and diagnosis of small vessel occlusion using methylation level of CYP26C1 gene and composition therefor | |
KR102085667B1 (en) | Method for providing information of prediction and diagnosis of small vessel occlusion using methylation level of GPR160 gene and composition therefor | |
KR102652504B1 (en) | Epigenetic methylation markers for predicting metabolic syndrome, and kits using the same | |
US11827940B2 (en) | Method for detecting cancer using 5-hydroxymethylcytosine (5-hmC) | |
EP3669002A1 (en) | Mcc as epigenetic marker for the identification of immune cells, in particular basophil granulocytes | |
KR102421788B1 (en) | Method for providing information of prediction and diagnosis of small vessel occlusion using methylation level of CUEDC2 gene and composition therefor | |
JP5608553B2 (en) | Kit and method for determining whether or not unmethylated cytosine conversion treatment has been properly performed, and method for analyzing methylated DNA using the kit and method | |
KR20240104309A (en) | Method for detection of lung cancer using lung cancer-specific methylation marker gene | |
KR20240104310A (en) | Method for detection of lung cancer using lung cancer-specific methylation marker gene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the EPO has been informed by WIPO that EP was designated in this application | Ref document number: 24852627; Country of ref document: EP; Kind code of ref document: A1 |