WO2022082199A1 - Method for detecting amyotrophic lateral sclerosis - Google Patents
- Publication number
- WO2022082199A1 (PCT/US2021/071865)
- Authority
- WO
- WIPO (PCT)
- Prior art keywords
- als
- mutations
- subject
- genes
- lateral sclerosis
- Prior art date
Classifications
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q1/00—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
- C12Q1/68—Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
- C12Q1/6876—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
- C12Q1/6883—Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
-
- C—CHEMISTRY; METALLURGY
- C12—BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
- C12Q—MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
- C12Q2600/00—Oligonucleotides characterized by their use
- C12Q2600/156—Polymorphic or mutational markers
Definitions
- the present invention relates to methods for detecting amyotrophic lateral sclerosis.
- Each method comprises sequencing 16 target genes or 23 target genomic loci from a biological sample of a subject, and identifying one or more mutations, such as single nucleotide polymorphisms or insertions/deletions, if present, in the 16 target genes or 23 target genomic loci.
- ALS Amyotrophic lateral sclerosis
- ALS cases can be grouped into two categories: familial ALS (fALS), where the patient has a genetically related family member who is also affected, and sporadic ALS (sALS), where the patient has no family history of ALS [9]. Historically, 5-10% of cases are fALS, and the other 90-95% of cases are sALS. In the past ten years, the C9ORF72 hexanucleotide repeat expansion has been identified as the most prevalent genomic mutation found in the ALS disease population [13]. C9ORF72 repeat expansions can be found in up to 34% of fALS and 5% of sALS cases.
- FIG. 1 Distribution of SNPs present only in the 338-case ALS sample. rs767982303 and rs760890146 (SNP IDs) are each found in greater than 25% of the ALS population. The dots represent the percentage of the ALS sample for each selected SNP with the 68.5% Confidence Level (CL) Clopper-Pearson interval on the true binomial proportion. The grey area represents the range of the possible percentage in the healthy population, with a 95% CL Clopper-Pearson interval.
- CL Confidence Level
- FIG. 2 SNPs that are not mutated in the control sample. The number of ALS cases out of the overall 338-patient cohort, percent of total ALS cases with the 99% CL Clopper-Pearson interval, and p-value.
- FIG. 3 Distribution of mutated genes found only in the 338 ALS sample. Dots represent percentage of the ALS group for each selected gene. The grey area represents an upper-bound on the potential false-positive percentage in the healthy population. This upper bound is set via the 99% CL Clopper-Pearson interval on the binomial proportion. MIR7155 mutations are detected in 51% of the ALS cohort.
- FIG. 4 16 genes that are not mutated in the control sample. The number of ALS cases out of the 338-patient cohort, number of unique SNPs, percent of total ALS cases, and p-value with the 99% CL Clopper-Pearson interval are shown, respectively.
- FIGs. 5A-5B Classifier Analysis using candidate ALS-only mutated genes. Selecting patients with three or more of the 16 candidate genes mutated yields a false-positive rate less than 0.1% and a false-negative rate less than 59% at 99% CL. 52% of the ALS cases have at least three of the 16 candidate genes mutated.
- (5B) The percentage of ALS cases with at least the given number of genes mutated from the candidate list (light). The maximum false positive rate at 99% CL (dark).
- FIG. 6 Distribution of candidate ALS-only mutated genes and probability of having ALS based on number of mutations. The distribution of the number of genes out of the top 22 candidates found in each of the 713 ALS cases is shown in grey. The probability of having ALS and the probability of not having ALS are also shown.
DETAILED DESCRIPTION OF THE INVENTION
- locus is a specific, fixed position on a chromosome where a particular gene or genetic marker is located.
- a “single nucleotide polymorphism” is a germline substitution of a single nucleotide at a specific position in the genome. For example, at a specific base position in the human genome, the G nucleotide may appear in most individuals, but in a minority of individuals, the position is occupied by an A. This means that there is a SNP at this specific position, and the two possible nucleotide variations - G or A - are the alleles for this specific position.
- the present invention identifies a set of mutations in genomic-coding regions that are present in ALS patients but not in healthy control samples.
- the present invention provides methods to detect and diagnose ALS before clinical and pathological onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention.
- the inventors compute and analyze large datasets of genomes of over 1,500 ALS disease patients and healthy controls.
- the inventors unravel mutations such as single nucleotide polymorphisms (SNPs) and Indels (insertions and deletions) in gene-coding and inter-genic regions that are associated with ALS disease diagnosis and always absent in healthy control patients.
- SNPs single nucleotide polymorphisms
- Indels insertions and deletions
- the inventors have analyzed next-generation genomic sequencing data from two cohorts of ALS patients and healthy controls from the Answer ALS Consortium. In doing so, the inventors discovered mutations in protein-coding genes that have not previously been associated with ALS.
- the present invention is directed to methods for detecting amyotrophic lateral sclerosis in a subject by detecting one or more mutations in specific genes or gene loci.
- the inventors have discovered that 16 target genes of the human genome, MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, are important for detecting ALS.
- the invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 16 target genes selected from the group consisting of: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4.
- the method comprising the steps of: (a) sequencing 16 target genes from a biological sample of a human subject, wherein the target genes are MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, (b) comparing each of the DNA sequences of the 16 target genes with its corresponding normal genes, (c) identifying one or more mutations such as SNPs, if present, in each of the DNA sequences of the 16 target genes, and (d) detecting amyotrophic lateral sclerosis in the subject if at least one of the 16 target genes has one or more mutations. With at least one target gene found mutated, 67% to 80%, at 99% CL (C-P), of ALS can be detected, with at least one target
- ALS is detected in the subject if at least two of the 16 target genes have one or more mutations. With at least two target genes mutated, 50% to 64%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.9% at 99% CL.
- ALS is detected in the subject if at least three of the 16 target genes have one or more mutations. With at least three target genes mutated, 45% to 59%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.09% at 99% CL.
- the DNA is first extracted from a biological sample of a human subject.
- the biological sample is blood (such as peripheral whole blood), a tissue sample (such as a fibroblast (skin) biopsy or a mucosal sample), or any cell derived from the human subject.
- Methods for extracting DNA from a biological sample are well known to a person skilled in the art. For example, see the protocols for extracting DNA from blood in the Thermo Fisher product sheet, catalog CS11040.
- the DNA extracted from the biological sample of the human subject is then subjected to target-specific amplification and target-specific sequencing to sequence each of the 16 target genes: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4.
- Whole genome sequencing, a genomic technique for sequencing the entire genome, including all the protein-coding regions of genes, is not performed in this method.
- each of the DNA sequences of the 16 specific target genes is compared with its corresponding reference gene sequence.
- Targeted gene data are processed through an automated pipeline to perform read alignment and mutation analysis including variants such as SNPs, indels and substitutions in either introns, exons or both.
- paired-end 150 bp reads are aligned to the GRCh38 human reference using the Burrows-Wheeler Aligner (BWA-MEM) and processed using the GATK best-practices workflow, which includes marking of duplicate reads with Picard tools, local realignment around indels, and base quality score recalibration (BQSR) via the Genome Analysis Toolkit (GATK).
- BWA-MEM Burrows-Wheeler Aligner
- GATK Genome Analysis Toolkit
- in step (c), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 16 target genes.
- Variant discovery is a two-step process. HaplotypeCaller is run on each sample separately in gVCF mode (GATK v3.5). This produces an intermediate file format called gVCF (genomic VCF). For projects with a large number of samples, gVCFs are combined in batches into merged gVCFs. The gVCFs are then run through a joint genotyping step (GATK v3.5) to produce a multi-sample VCF. Variant filtration is performed using Variant Quality Score Recalibration (VQSR), which identifies annotation profiles of variants that are likely to be real and assigns a score (VQSLOD) to each variant.
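As a rough illustration of how the multi-sample VCF produced by joint genotyping might be read downstream, the sketch below lists, for each record, the samples that carry a non-reference allele, optionally keeping only records whose VQSLOD value in the INFO column meets a cutoff. The function names, the `min_vqslod` cutoff, and the toy records are assumptions made for illustration; they are not part of the pipeline described here.

```python
from typing import Dict, Iterable, List, Optional, Tuple


def parse_info(info: str) -> Dict[str, str]:
    """Split a VCF INFO column into key -> value (flag entries map to '')."""
    out: Dict[str, str] = {}
    for item in info.split(";"):
        if not item or item == ".":
            continue
        key, _, value = item.partition("=")
        out[key] = value
    return out


def carriers_from_vcf(
    lines: Iterable[str], min_vqslod: Optional[float] = None
) -> Dict[Tuple[str, int], List[str]]:
    """Map (chrom, pos) -> names of samples with a non-reference genotype.

    min_vqslod is a hypothetical cutoff; records below it (or lacking a
    VQSLOD annotation) are skipped when the cutoff is given.
    """
    carriers: Dict[Tuple[str, int], List[str]] = {}
    samples: List[str] = []
    for raw in lines:
        line = raw.rstrip("\n")
        if not line or line.startswith("##"):
            continue  # skip blank lines and meta-information lines
        fields = line.split("\t")
        if line.startswith("#CHROM"):
            samples = fields[9:]  # sample names follow the FORMAT column
            continue
        chrom, pos, info, fmt = fields[0], int(fields[1]), fields[7], fields[8]
        if min_vqslod is not None:
            score = parse_info(info).get("VQSLOD")
            if score is None or float(score) < min_vqslod:
                continue
        gt_index = fmt.split(":").index("GT")
        hits = []
        for name, call in zip(samples, fields[9:]):
            alleles = call.split(":")[gt_index].replace("|", "/").split("/")
            if any(a not in ("0", ".") for a in alleles):
                hits.append(name)
        carriers[(chrom, pos)] = hits
    return carriers


# Toy records (invented for illustration only).
TOY_VCF = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tS1\tS2",
    "chr1\t100\t.\tG\tA\t50\tPASS\tVQSLOD=5.2\tGT:DP\t0/1:30\t0/0:28",
    "chr1\t200\t.\tC\tT\t50\tPASS\tVQSLOD=-3.0\tGT:DP\t1/1:25\t0/1:31",
]
```

Calling `carriers_from_vcf(TOY_VCF, min_vqslod=0.0)` keeps only the first toy record; omitting the cutoff keeps both.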
- VQSR Variant Quality Score Recalibration
- Variant effects annotation is performed using SnpEff (PMID: 22728672), bcftools (http://github.com/samtools/bcftools) and in-house software.
- Other functional annotations include variant frequencies in different populations from 1000 Genomes project (PMID:20981092), Exome Aggregation Consortium - ExAC(http://biorxiv.org/content/early/2015/10/30/030338), dbSNP147 (PMID: 11125122); cross-species conservation scores from PhyloP (PMID: 15965027), Genomic Evolutionary Rate Profiling (GERP; PMID: 21152010), PhastCons (PMID: 21278375); functional prediction scores from Polyphen2 (PMID: 20354512) and SIFT (PMID: 19561590); Clinvar(http://www.ncbi.
- Variant discovery is described, for example, in the following references: “A framework for variation discovery and genotyping using next-generation DNA sequencing data” DePristo M, et al., 2011 NATURE GENETICS 43:491-498; and “From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline.” Van der Auwera G, et al., Curr Protoc Bioinformatics. 2013; 43: 1-33.
- in step (d), ALS is detected in the subject if at least one of the 16 target genes has one or more mutations, preferably if at least two of the 16 target genes have one or more mutations, and more preferably if at least three of the 16 target genes have one or more mutations.
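The threshold rule in step (d) can be sketched as a simple count of how many of the 16 target genes carry at least one mutation. The gene names below are copied from the list above; the function names and the `min_genes` parameter are hypothetical and are shown only to make the rule concrete.

```python
from typing import FrozenSet, Iterable

# The 16 target genes as listed in the description.
TARGET_GENES: FrozenSet[str] = frozenset({
    "MIR7155", "NPM1P49", "RP11-20B24.3", "HNRNPA1P44", "OXR1", "H2AFZP1",
    "TAB3P1", "RPL5P35", "ZNF92P2", "CIR1P3", "GNAI2", "CCDC42",
    "RP11-370110.6", "ADIPOR1P1", "KIAA1841", "AC008074.4",
})


def count_mutated_targets(
    mutated_genes: Iterable[str], targets: FrozenSet[str] = TARGET_GENES
) -> int:
    """Count how many target genes appear among a subject's mutated genes."""
    return len(set(mutated_genes) & targets)


def meets_threshold(
    mutated_genes: Iterable[str],
    min_genes: int = 1,
    targets: FrozenSet[str] = TARGET_GENES,
) -> bool:
    """Apply the rule: at least min_genes target genes are mutated.

    min_genes = 1, 2, or 3 corresponds to the basic, preferred, and more
    preferred thresholds stated in step (d).
    """
    return count_mutated_targets(mutated_genes, targets) >= min_genes
```

For example, a subject whose mutated genes include OXR1, GNAI2, and MIR7155 satisfies `min_genes=3`, while a subject with only OXR1 satisfies `min_genes=1` but not the stricter thresholds.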
- the present invention provides a method to detect and diagnose ALS before clinical- and pathological-onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention.
- ALS is a devastating neurodegenerative disorder, with no cures or genetic diagnostics.
- the present method detects 45%-59% of the ALS-only population, at 99% CL, with the 16 target genomic signatures, when at least 3 of the 16 target genes contain a mutation.
- the present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention.
- this application shows that the detection of single mutations can identify up to 59% of the ALS population with genes that are never found mutated in the healthy control sample.
- This application illustrates two novel mutations in gene-coding regions of the genome that are never present in the healthy group yet are found in over 25% of the ALS cohort.
- the inventors have discovered that 22 target genes of the human genome, AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, which do not mutate in a normal subject, are important for detecting ALS.
- the invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 22 target genes selected from the group consisting of: AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, and detecting amyotrophic lateral sclerosis in the subject if any of the 22 genes has one or more mutations.
- the invention also provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 23 genomic loci selected from the group consisting of: chr1:25854953 (chromosome 1 at nucleotide position 25854953), chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12
- the method comprises the steps of: (a) amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953, chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571,
- Whole genome sequencing, a genomic technique for sequencing the entire genome, including all the protein-coding regions of genes, is not performed in this method.
- in step (a) of the method, the DNA is first extracted from a biological sample of a human subject, as described in the first method.
- the DNA extracted from the biological sample of the human subject is then subjected to target-specific amplification to amplify the 23 loci of the 22 genes.
- Table 1 shows the 22 genes that frequently have at least one mutation in ALS patients and the position of each mutation in terms of nucleotide position on a chromosome.
- Gene TXNP1-INPP5F has two mutated loci, chr10:119712877 and chr10:119712899, in ALS patients.
- PCR polymerase chain reaction
- the forward and reverse primers are designed to be 30-400 bases away from the target site, e.g., 40-250 bases, 40-200 bases, 40-150 bases, or 40-100 bases.
- Table 1 illustrates one design of the forward primer and reverse primer for each of the 23 target loci.
- the primer design shown in Table 1 is an example, and the present invention is not limited to such specific primer sequences.
- the two loci chr10:119712877 and chr10:119712899 of gene TXNP1-INPP5F are only 22 bases apart from each other, and therefore one set of forward and reverse primers can conveniently amplify both loci.
- in step (b), the amplified DNA is purified and sequenced according to methods known to a person skilled in the art.
- DNA purification is a step that removes everything that is not the amplicon from the PCR product; this includes unused primers, nucleotides, enzymes, and other impurities.
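The idea that nearby loci can share one primer pair, with primers placed a fixed distance range from the target, can be sketched as follows. The grouping threshold `max_gap`, the function names, and the choice of the 40-100 base offset range as defaults are assumptions for illustration; the actual primer designs are those given in Table 1.

```python
from typing import List, Tuple

Locus = Tuple[str, int]  # (chromosome, 1-based nucleotide position)


def parse_locus(text: str) -> Locus:
    """Parse a string such as 'chr10:119712877' into ('chr10', 119712877)."""
    chrom, _, pos = text.replace(" ", "").partition(":")
    return chrom, int(pos)


def group_loci(loci: List[str], max_gap: int = 50) -> List[List[Locus]]:
    """Group loci on the same chromosome that lie within max_gap bases of
    the previous locus, so each group could share one primer pair.
    max_gap is a hypothetical threshold, not a value from the description."""
    groups: List[List[Locus]] = []
    for chrom, pos in sorted(parse_locus(item) for item in loci):
        last = groups[-1][-1] if groups else None
        if last and last[0] == chrom and pos - last[1] <= max_gap:
            groups[-1].append((chrom, pos))
        else:
            groups.append([(chrom, pos)])
    return groups


def primer_windows(
    group: List[Locus], min_offset: int = 40, max_offset: int = 100
) -> Tuple[Tuple[int, int], Tuple[int, int]]:
    """Return coordinate ranges in which the forward and reverse primers
    could sit, min_offset to max_offset bases outside the outermost loci."""
    first, last = group[0][1], group[-1][1]
    forward = (first - max_offset, first - min_offset)
    reverse = (last + min_offset, last + max_offset)
    return forward, reverse
```

With the two TXNP1-INPP5F loci as input, `group_loci` places both positions in a single group, and `primer_windows` returns one forward and one reverse window flanking the pair.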
- Sequencing includes library preparation and the act of DNA sequencing itself, done by a sequencing system. Library preparation typically consists of fragmenting the DNA sample and adding sequencing adapters to the fragments that are needed for the sequencing step (next generation sequencing). The act of sequencing itself includes reading the nucleotides in the DNA sample and saving them sequentially into a digital file.
- the specific protocol for DNA purification and DNA sequencing may differ depending on a number of factors, including the method used for DNA amplification and the sequencing system used.
- the amplified DNA can be purified and sequenced by using the QIAquick PCR Purification Kit for DNA purification, following the QIAquick® Spin Handbook protocol; the TruSeq DNA LT kit (see the product sheet of TruSeq® DNA Library Prep Kits, Illumina) for library preparation, following the TruSeq® DNA Sample Preparation Guide available on Illumina's website; and the Illumina MiSeq system for sequencing (see the MiSeq™ System specification sheet, Illumina).
- in step (c), the amplified DNA sequences of (b) are analyzed and compared with the corresponding DNA sequences of the normal genomic loci. See the description in the first method.
- in step (d), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 23 target loci. See the description in the first method.
- in step (e), ALS is detected in the subject if at least one of the 23 target loci has a mutation, preferably if at least two of the 23 target loci have mutations, and more preferably if at least three of the 23 target loci have mutations.
- the present method detects over 30% of the ALS-only population, at 99% CL, with the 23 target genomic signatures, when at least 1 of the 23 target loci contains a mutation.
- the present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention.
- the inventors show that at least two genes must be mutated in the list of 23 top candidates to achieve 35.7-44.9% accuracy at detection of ALS.
- the following examples further illustrate the present invention. These examples are intended merely to be illustrative of the present invention and are not to be construed as being limiting.
- Clopper-Pearson Interval: Bounds are set on the true fraction of either population that has a given feature (or features). The number of people within a sample who are positive for the feature-of-interest follows a binomial distribution. Clopper-Pearson intervals on the binomial proportion are calculated for the true population proportions [11].
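A minimal sketch of the Clopper-Pearson interval, using only the Python standard library, is given below. It finds each bound by bisection on the binomial tail probability rather than through the beta-distribution quantiles that statistical packages typically use; this is a substitution made here to keep the example self-contained, and the function names are hypothetical. It assumes a two-sided interval that splits the error probability equally between the two tails.

```python
from math import comb
from typing import Callable, Tuple


def binom_cdf(k: int, n: int, p: float) -> float:
    """P(X <= k) for X ~ Binomial(n, p); intended for modest n (hundreds)."""
    return sum(comb(n, i) * p**i * (1.0 - p) ** (n - i) for i in range(k + 1))


def _bisect(f: Callable[[float], float], iters: int = 100) -> float:
    """Root of a function that decreases from positive to negative on [0, 1]."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) > 0.0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(k: int, n: int, cl: float = 0.99) -> Tuple[float, float]:
    """Two-sided Clopper-Pearson interval for k positives among n subjects."""
    if n <= 0 or not 0 <= k <= n:
        raise ValueError("require n > 0 and 0 <= k <= n")
    tail = (1.0 - cl) / 2.0
    # Lower bound: the p at which P(X >= k) equals tail (0 when k == 0).
    lower = 0.0 if k == 0 else _bisect(
        lambda p: tail - (1.0 - binom_cdf(k - 1, n, p))
    )
    # Upper bound: the p at which P(X <= k) equals tail (1 when k == n).
    upper = 1.0 if k == n else _bisect(lambda p: binom_cdf(k, n, p) - tail)
    return lower, upper
```

For instance, `clopper_pearson(0, 53, cl=0.99)` returns a lower bound of 0 and an upper bound of roughly 0.095 under the two-sided assumption stated above.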
- Fisher's Exact Test The probability (p-value) of the null hypothesis, that a mutation is present in the ALS population in the same proportion as in the control population 12 . This test statistic is ideal for this study because it is the exact probability that the two proportions are equal and can still be calculated in a reasonable amount of time due to the sample size limits. T -tests are approximations of this probability which converge to the exact value in the limit of large sample size.
- Row- wise Conditional Percentage To quantify how often a pair of our top 16 genes is mutated in the same patient, we calculated a conditional probability considering the independent probabilities of each mutation occurring on its own. For every possible ordered pairing of two genes (240 combinations), we counted the number of cases which have both gene mutations and divided by the total number of cases where the first gene was present. This metric is visually represented as a matrix, with each row and column representing a particular mutated gene from the set of 16, and each element representing the conditional probability of the column and row mutation happening in the same patient, adjusted for the baseline prevalence of the row mutation. The probability is converted into a percentage and can provide insights into how often two gene mutations co-occur in each patient.
- Answer ALS Data were provided by the Answer ALS consortium.
- C9ORF72 hexanucleotide-repeats are the most prevalent ALS mutation known to date, effecting 5-10% of all cases, and up to 34% of familial (fALS) 13 .
- fALS familial
- rs767982303 and rs760890146 were each found in 25% of the total ALS population yet are absent in controls (FIG2. 1 and 2). rs767982303 (located on the 0XR1 gene) and rs760890146 (located on the NPM1P49 gene) both lead to an acceptor variant mutation. Other top SNPs-of-interest and their significance are illustrated.
Abstract
The inventors have identified 23 genomic loci and established that a majority of amyotrophic lateral sclerosis (ALS) patients have mutations in at least one of the target loci. The present invention is directed to a method for detecting ALS in a subject, comprising the steps of: (a) amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953, chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571, chr15:74926032, chr17:2562894, chr17:40390624, and chr22:41330858, (b) purifying and sequencing the amplified DNA; (c) analyzing each of the amplified DNA sequences and comparing with its corresponding DNA sequence of the normal genomic loci, (d) identifying one or more mutations, if present, at the 23 chromosome positions, and (e) detecting ALS in the subject if the 23 chromosome positions have one or more mutations.
Description
METHOD FOR DETECTING AMYOTROPHIC LATERAL SCLEROSIS
REFERENCE TO SEQUENCE LISTING, TABLE OR COMPUTER PROGRAM
The Sequence Listing is concurrently submitted herewith with the specification as an ASCII formatted text file via EFS-Web with a file name of Sequence Listing.txt with a creation date of October 13, 2021, and a size of 8.16 kilobytes. The Sequence Listing filed via EFS-Web is part of the specification and is hereby incorporated in its entirety by reference herein.
FIELD OF THE INVENTION
The present invention relates to methods for detecting amyotrophic lateral sclerosis. Each method comprises sequencing 16 target genes or 23 target genomic loci from a biological sample of a subject, and identifying one or more mutations such as single nucleotide polymorphisms or insertions/deletions, if present, in the 16 target genes or 23 target genomic loci.
BACKGROUND
Amyotrophic lateral sclerosis (ALS) is clinically characterized by the loss and degeneration of motor neurons1. Patients typically live 2-5 years after diagnosis2. To date, there remains no cure or genetic test that can pre-determine ALS in a majority of patients3. On a genetic level, prior evidence supports the notion that ALS is a multifactorial disorder, with many genes and cell types influencing the disease4-7. Currently, there are over 50 known ALS-linked genes. However, these known associated genes are found mutated in less than 10 percent of the total ALS population, with most patients displaying a sporadic etiology8,9.
ALS cases can be grouped into two categories: familial ALS (fALS), where the patient has a genetically related family member also affected, and sporadic ALS (sALS), where the patient has no family history of ALS9. Historically, 5-10% of cases are fALS, and the other 90-95% of cases are sALS. In the past ten years, the C9ORF72 hexanucleotide repeat expansion has been identified as the most prevalent genomic mutation found in the ALS disease population13. C9ORF72 repeat expansions can be found in up to 34% of fALS and 5% of sALS cases. Strong associations between C9ORF72 and the pathologically related neurodegenerative disorder Frontotemporal Dementia (FTD) have also been shown17.
There is a need to develop a method to detect ALS early-on and elucidate the pathogenesis of both familial and sporadic ALS.
BRIEF DESCRIPTION OF THE DRAWINGS AND TABLES
FIG. 1: Distribution of SNPs present only in the 338-patient ALS sample. rs767982303 and rs760890146 (SNP IDs) are each in greater than 25% of the ALS population. The dots represent the percentage of the ALS sample for each selected SNP with the 68.5% Confidence Level (CL) Clopper-Pearson interval on the true binomial proportion. The grey area represents the range of the possible percentage in the healthy population, with a 95% CL Clopper-Pearson interval.
FIG. 2: SNPs that are not mutated in the control sample. The number of ALS cases out of the overall 338-patient cohort, percent of total ALS cases with the 99% CL Clopper-Pearson interval, and p-value.
FIG. 3: Distribution of mutated genes found only in the 338 ALS sample. Dots represent percentage of the ALS group for each selected gene. The grey area represents an upper-bound on the potential false-positive percentage in the healthy population. This upper bound is set via the 99% CL Clopper-Pearson interval on the binomial proportion. MIR7155 mutations are detected in 51% of the ALS cohort.
FIG. 4: 16 genes that are not mutated in the control sample. The number of ALS cases out of the 338-patient cohort, number of unique SNPs, percent of total ALS cases, and p-value with the 99% CL Clopper-Pearson interval are shown, respectively.
FIGs. 5A-5B: Classifier Analysis using candidate ALS-only mutated genes. Selecting patients with three or more genes mutated of the 16 candidate genes yields a false-positive rate less than 0.1% and false-negative rate less than 59% at 99% CL. 52% of the ALS cases have at least three of the 16 candidate genes mutated. (5A) The distribution of the number of genes out of the top 16 candidates found in each of the 338 ALS cases. (5B) The percentage of ALS cases with at least the given number of genes mutated from the candidate list (light). The maximum false positive rate at 99% CL (dark).
FIG. 6: Distribution of candidate ALS-only mutated genes and probability of having ALS based on number of mutations. The distribution of the number of genes out of the top 22 candidates found in each of the 713 ALS cases is shown in grey. The probability of having ALS and the probability of not having ALS are shown.
DETAILED DESCRIPTION OF THE INVENTION
Definitions
A “locus” is a specific, fixed position on a chromosome where a particular gene or genetic marker is located.
A “single nucleotide polymorphism” (SNP) is a germline substitution of a single nucleotide at a specific position in the genome. For example, at a specific base position in the human genome, the G nucleotide may appear in most individuals, but in a minority of individuals, the position is occupied by an A. This means that there is a SNP at this specific position, and the two possible nucleotide variations - G or A - are the alleles for this specific position.
Amyotrophic Lateral Sclerosis (ALS), a multifactorial neurodegenerative disorder, is widely characterized by the degeneration of motor neurons and neuro-inflammation. Currently, no cures or genetic tests are known that can diagnose and classify all forms of ALS.
The present invention identifies a set of mutations in genomic-coding regions that are present in ALS patients but not in healthy control samples. The present invention provides methods to detect and diagnose ALS before clinical and pathological onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention.
The inventors compute and analyze large datasets of genomes of over 1,500 ALS disease patients and healthy controls. The inventors unravel mutations such as single nucleotide polymorphisms (SNPs) and Indels (insertions and deletions) in gene-coding and inter-genic regions that are associated with ALS disease diagnosis and always absent in healthy control patients.
To identify novel genetic associations with ALS, the inventors have analyzed next-generation genomic sequencing data from two cohorts of ALS and healthy controls from the Answer ALS Consortium. In doing so, the inventors discover mutations in protein-coding genes that have not been associated with ALS previously.
This application provides evidence that the detection of selected mutations such as SNPs and indels with genetic sequencing can be correlated with the pathobiology of ALS in a significant percentage of cases. These genetic biomarkers can be used as an early ALS disease diagnostic tool with a rapid and non-invasive technique.
The present invention is directed to methods for detecting amyotrophic lateral sclerosis in a subject by detecting one or more mutations in specific genes or gene loci.
The First Aspect
In the first aspect of the invention, the inventors have discovered that 16 target genes of the human genome, MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, are important for detecting ALS.
The invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 16 target genes selected from the group consisting of: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4.
In one embodiment, the method comprises the steps of: (a) sequencing 16 target genes from a biological sample of a human subject, wherein the target genes are MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, (b) comparing each of the DNA sequences of the 16 target genes with its corresponding normal genes, (c) identifying one or more mutations such as SNPs, if present, in each of the DNA sequences of the 16 target genes, and (d) detecting amyotrophic lateral sclerosis in the subject if at least one of the 16 target genes has one or more mutations. With at least one target gene found mutated, 67% to 80% of ALS cases can be detected at 99% confidence level (Clopper-Pearson), with a false positive rate less than 9.5% at 99% confidence level (Clopper-Pearson).
In a preferred method, ALS is detected in the subject if at least two of the 16 target genes have one or more mutations. With at least two target genes mutated, 50% to 64%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.9% at 99% CL.
In a further preferred method, ALS is detected in the subject if at least three of the 16 target genes have one or more mutations. With at least three target genes mutated, 45% to 59%, at 99% CL, of ALS can be detected, with a false positive rate less than 0.09% at 99% CL.
In step (a) of the method, the DNA is first extracted from a biological sample of a human subject. For example, the biological sample is blood (such as peripheral whole blood), a tissue sample (such as a fibroblast (skin) biopsy or a mucosal sample), or any cell derived from the human subject. Methods for extracting DNA from a biological sample are well known to a person skilled in the art. For example, see the protocols for extracting DNA from blood in Thermo Fisher product sheet catalog CS11040.
The DNA extracted from the biological sample of the human subject is then subjected to target-specific amplification and target-specific sequencing to sequence each of the 16 target genes: MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4. Whole genome sequencing (WGS), which is a genomic technique for sequencing the entire genome of an organism, is not performed in this method.
The procedures of target-specific sequencing of DNA libraries from human genomic DNA extracted from a biological sample are known to a person skilled in the art. A range of sequencing platforms can be used, such as PacBio Sequencing (Rhoads & Au, Genomics, Proteomics and Bioinformatics, 13(5), 278-289, 2015), Oxford Nanopore (Jain, et al, Genome Biology, 17(1), 1-11, 2016), Illumina, or 10x Genomics (Zheng et al., Nature Biotechnology, 34(3), 303-311, 2016). For example, see Product Instruction Sheet of NextSeq™ 550Dx Instrument (Illumina).
In step (b), each of the DNA sequences of the 16 specific target genes is compared with its corresponding reference gene sequence. Targeted gene data are processed through an automated pipeline to perform read alignment and mutation analysis including variants such as SNPs, indels and substitutions in either introns, exons or both. In one embodiment, paired-end 150bp reads are aligned to the GRCh38 human reference using the Burrows-Wheeler Aligner (BWA-MEM) and processed using the GATK best-practices workflow that includes marking of duplicate reads by the use of Picard tools, local realignment around indels, and base quality score recalibration (BQSR) via Genome Analysis Toolkit (GATK). See “The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data”, McKenna A, et al, 2010 GENOME RESEARCH 20: 1297-303.
In step (c), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 16 target genes.
Variant discovery is a two-step process. HaplotypeCaller is run on each sample separately in gVCF mode (GATK v3.5). This produces an intermediate file format called gVCF (genomic VCF). For projects with a large number of samples, gVCFs are combined by batches into merged gVCFs. gVCFs are then run through a joint genotyping step (GATK v3.5) to produce a multi-sample VCF. Variant filtration is performed using Variant Quality Score Recalibration (VQSR), which identifies annotation profiles of variants that are likely to be real, and assigns a score (VQSLOD) to each variant. Variant effects annotation is performed using SnpEff (PMID: 22728672), bcftools (http://github.com/samtools/bcftools) and in-house software. Other functional annotations include variant frequencies in different populations from 1000 Genomes project (PMID: 20981092), Exome Aggregation Consortium - ExAC (http://biorxiv.org/content/early/2015/10/30/030338), dbSNP147 (PMID: 11125122); cross-species conservation scores from PhyloP (PMID: 15965027), Genomic Evolutionary Rate Profiling (GERP; PMID: 21152010), PhastCons (PMID: 21278375); functional prediction scores from Polyphen2 (PMID: 20354512) and SIFT (PMID: 19561590); Clinvar (http://www.ncbi.nlm.nih.gov/clinvar/); regulatory annotations from ENCODE (PMID: 15499007) and Regulome (PMID: 22955989). Variants and annotations are exported to tabular formats for ease of downstream analysis. Additional filtration based on functional annotation is applied to extract variants with predicted effects on protein coding.
Variant discovery, for example, is described in the following references: “A framework for variation discovery and genotyping using next-generation DNA sequencing data” DePristo M, et al, 2011 NATURE GENETICS 43:491-498; and “From FastQ data to high-confidence variant calls: the genome analysis toolkit best practices pipeline.” Van der Auwera G, et al., Curr Protoc Bioinformatics. 2013; 43: 1-33.
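The two-step variant discovery described above can be sketched as a pair of command builders. This is a minimal illustration that assumes a GATK 3.x jar invoked with its `-T` tool selector; the jar path, reference file name, and sample file names are hypothetical placeholders, and the commands are only assembled as argument lists, not executed.

```python
from typing import List

GATK_JAR = "GenomeAnalysisTK.jar"  # hypothetical path to a GATK 3.x jar
REFERENCE = "GRCh38.fa"            # hypothetical GRCh38 reference FASTA


def haplotypecaller_cmd(bam: str, gvcf: str) -> List[str]:
    """Step 1: per-sample calling in gVCF mode."""
    return ["java", "-jar", GATK_JAR, "-T", "HaplotypeCaller",
            "-R", REFERENCE, "-I", bam,
            "--emitRefConfidence", "GVCF", "-o", gvcf]


def genotype_gvcfs_cmd(gvcfs: List[str], out_vcf: str) -> List[str]:
    """Step 2: joint genotyping of all gVCFs into one multi-sample VCF."""
    cmd = ["java", "-jar", GATK_JAR, "-T", "GenotypeGVCFs", "-R", REFERENCE]
    for gvcf in gvcfs:
        cmd += ["--variant", gvcf]
    return cmd + ["-o", out_vcf]


if __name__ == "__main__":
    samples = ["subject1", "subject2"]  # hypothetical sample names
    for s in samples:
        print(" ".join(haplotypecaller_cmd(f"{s}.bam", f"{s}.g.vcf")))
    print(" ".join(genotype_gvcfs_cmd([f"{s}.g.vcf" for s in samples], "joint.vcf")))
```

Each printed line could be passed to a job scheduler or to `subprocess.run`; VQSR and annotation would follow as separate steps.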
In step (d), ALS is detected in the subject, if at least one of the 16 target genes has one or more mutations, preferably at least two of the 16 target genes have one or more mutations, and more preferably at least three of the 16 target genes have one or more mutations.
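The decision rule of step (d) can be sketched as a simple count over the 16 target genes. This is an illustrative sketch only: the input format (a mapping from gene name to the list of variants called in that gene) and the function names are assumptions, not part of the described pipeline.

```python
from typing import Dict, List

# The 16 target genes as listed in the text.
TARGET_GENES = (
    "MIR7155", "NPM1P49", "RP11-20B24.3", "HNRNPA1P44", "OXR1", "H2AFZP1",
    "TAB3P1", "RPL5P35", "ZNF92P2", "CIR1P3", "GNAI2", "CCDC42",
    "RP11-370110.6", "ADIPOR1P1", "KIAA1841", "AC008074.4",
)


def count_mutated_target_genes(variants_by_gene: Dict[str, List[str]]) -> int:
    """Number of target genes carrying at least one called mutation."""
    return sum(1 for gene in TARGET_GENES if variants_by_gene.get(gene))


def meets_threshold(variants_by_gene: Dict[str, List[str]], min_genes: int = 1) -> bool:
    """True if at least `min_genes` target genes are mutated (1, 2 or 3 in the text)."""
    return count_mutated_target_genes(variants_by_gene) >= min_genes


if __name__ == "__main__":
    # Hypothetical subject: variants in two target genes and one non-target gene.
    subject = {"OXR1": ["rs767982303"], "NPM1P49": ["rs760890146"], "TP53": ["x"]}
    print(count_mutated_target_genes(subject))
    print(meets_threshold(subject, min_genes=3))
```

Raising `min_genes` trades sensitivity for a lower false positive rate, which mirrors the one, two, and three gene variants of the method.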
The present invention provides a method to detect and diagnose ALS before clinical- and pathological-onset, which is imperative to prolonging patient lifespan, understanding the pathobiology, and designing therapies for early intervention. ALS is a devastating neurodegenerative disorder, with no cures or genetic diagnostics. The present method detects 45%-59% of the ALS-only population, at 99% CL, with the 16 target genomic signatures, when at least 3 of the 16 target genes contain a mutation. The present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention.
On the gene level, this application shows that the detection of single mutations can identify up to 59% of the ALS population with genes that are never found mutated in the healthy control sample. This application illustrates two novel mutations in gene-coding regions of the genome that are never present in the healthy group yet are found in over 25% of the ALS cohort.
The Second Aspect
In a second aspect of the invention, the inventors have discovered that 22 target genes of the human genome, AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, which do not mutate in a normal subject, are important for detecting ALS.
The invention provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 22 target genes selected from the group consisting of: AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B, and detecting amyotrophic lateral sclerosis in the subject if the 22 genes have one or more mutations.
The invention also provides a method for detecting ALS in a subject, comprising obtaining a biological sample from a subject, and from the sample, detecting one or more mutations in 23 genomic loci selected from the group consisting of: chr1:25854953 (chromosome 1 at nucleotide position 25854953), chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571, chr15:74926032, chr17:2562894, chr17:40390624, and chr22:41330858, and detecting amyotrophic lateral sclerosis in the subject if the 23 chromosome positions have one or more mutations.
In one embodiment, the method comprises the steps of: (a) amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953, chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571, chr15:74926032, chr17:2562894, chr17:40390624, and chr22:41330858, (b) purifying and sequencing the amplified DNA; (c) analyzing each of the amplified DNA sequences and comparing with its corresponding DNA sequence of the normal genomic loci, (d) identifying one or more mutations, if present, at the 23 chromosome positions, and (e) detecting amyotrophic lateral sclerosis in the subject if the 23 chromosome positions have one or more mutations such as SNPs or Indels.
Whole genome sequencing, which is a genomic technique for sequencing the entire genome of an organism, is not performed in this method.
In step (a) of the method, the DNA is first extracted from a biological sample of a human subject, as described in the first method.
The DNA extracted from the biological sample of the human subject is then subjected to target-specific amplification to amplify the 23 loci of the 22 genes.
Table 1 shows the 22 genes that frequently have at least one mutation in ALS patients and the position of the mutation in terms of nucleotide position on a chromosome. Gene TXNP1-INPP5F has two mutated loci, chr10:119712877 and chr10:119712899, in ALS patients.
One way of amplifying these loci (that is used in general due to it being less time consuming and effort demanding) is by amplifying all the different regions at the same time in one polymerase chain reaction (PCR), which is called "multiplexing PCR". Multiplexing PCR can be done using more than one sample DNA at a time (multi-template), or it can be done using just one sample DNA per reaction (single template).
Single template multiplexing PCR is explained below, but both multi-template and single template techniques can be applied to the genes. Besides the classic requirements for a PCR, known to a person skilled in the art, there are a few points to consider when designing primers for multiplexing PCR. These preliminary steps are: (a) confirming that the melting temperatures (Tm) of all 23 sets of primers are similar, with no set having a Tm that differs by more than 5°C from any other set; (b) evaluating for competitive binding, since each primer should anneal to only one specific region in the DNA sample; and (c) checking for primer dimers, which is important because the large number of primer sets makes primer dimer formation more probable. More information on multiplex PCR can be found in "Multiplex polymerase chain reaction: a practical approach" Markoulatos P. et al. 2002 Clinical Laboratory Analysis 16(1):47-51. Different software programs are available for primer design, each with its own advantages and disadvantages; for example, Jingwen et al. (2020) classified and reviewed some of the free programs available in "Classification and review of free PCR primer design software" Jingwen et al. 2020 Bioinformatics 36:22-23, which can help make primer design simpler and more precise. The forward primer and reverse primer can be designed according to the methods described above or other methods known to a person skilled in the art to amplify the DNA comprising a target site. In general, the forward and reverse primers are designed to be 30-400 bases away from the target site, e.g., 40-250 bases, 40-200 bases, 40-150 bases, or 40-100 bases. Table 1 illustrates one design of the forward primer and reverse primer for each of the 23 target loci. The primer design shown in Table 1 is an example, and the present invention is not limited to such specific primer sequences. The two loci chr10:119712877 and chr10:119712899 of Gene TXNP1-INPP5F are only 22 bases apart from each other, and therefore one set of forward and reverse primers can conveniently amplify both loci.
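The preliminary primer checks (a) and (c) above can be illustrated with a short sketch. It assumes the Wallace rule, Tm = 2(A+T) + 4(G+C), as a rough Tm estimate; this is a simplification chosen for illustration and is not the method used to design the Table 1 primers. The primer sequences, function names, and the 4-base 3' overlap used to flag dimer risk are all hypothetical.

```python
from itertools import combinations
from typing import Dict, List, Tuple

COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def wallace_tm(primer: str) -> float:
    """Rough Tm estimate in degrees C (Wallace rule; assumption for illustration)."""
    p = primer.upper()
    return 2.0 * (p.count("A") + p.count("T")) + 4.0 * (p.count("G") + p.count("C"))


def tm_spread_ok(primer_sets: Dict[str, Tuple[str, str]], max_spread: float = 5.0) -> bool:
    """Check (a): no primer Tm differs from any other by more than max_spread."""
    tms = [wallace_tm(p) for pair in primer_sets.values() for p in pair]
    return max(tms) - min(tms) <= max_spread


def reverse_complement(seq: str) -> str:
    return "".join(COMPLEMENT[b] for b in reversed(seq.upper()))


def three_prime_dimer_risk(a: str, b: str, n: int = 4) -> bool:
    """Check (c): the last n bases of a can pair antiparallel with the last n of b."""
    return a.upper()[-n:] == reverse_complement(b[-n:])


def dimer_pairs(primers: List[str], n: int = 4) -> List[Tuple[str, str]]:
    """All primer pairs flagged by the simple 3'-end complementarity test."""
    return [(a, b) for a, b in combinations(primers, 2) if three_prime_dimer_risk(a, b, n)]


if __name__ == "__main__":
    # Hypothetical primer sets keyed by locus; not the Table 1 sequences.
    sets = {"chr1:25854953": ("ACGTACGTACGTACGTACGT", "AAGGCCTTAAGGCCTTAAGG")}
    print(tm_spread_ok(sets))
```

Check (b), competitive binding, would require aligning each primer against the reference genome and is outside the scope of this sketch.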
* shows the number of bases between the forward primer and the target locus (before) and the number of bases between the target locus and the reverse primer (after)
In step (b), the amplified DNA is purified and sequenced according to methods known to a person skilled in the art. DNA purification is a step that removes everything that is not the amplicon from the PCR product, including unused primers, nucleotides, enzymes, and other impurities. Sequencing includes library preparation and the act of DNA sequencing itself, done by a sequencing system. Library preparation typically consists of fragmenting the DNA sample and adding sequencing adapters to the fragments that are needed for the sequencing step (next generation sequencing). The act of sequencing itself includes reading the nucleotides in the DNA sample and saving them sequentially into a digital file. The specific protocol for DNA purification and DNA sequencing may differ depending on a number of factors, including the method used for DNA amplification and the sequencing system used. There are different well-known methods for purifying and sequencing the amplified DNA. For example, the amplified DNA can be purified and sequenced by using the QIAquick PCR Purification Kit for DNA purification, following the QIAquick® Spin Handbook protocol; the TruSeq DNA LT kit (see product sheet of TruSeq DNA Library Prep Kits®, Illumina) for library preparation, following the protocol available at Illumina's website, TruSeq® DNA Sample Preparation Guide; and sequencing done by the Illumina MiSeq system (see MiSeq™ System specification sheet, Illumina).
In step (c), each of the amplified DNA sequences of (b) is analyzed and compared with its corresponding DNA sequence of the normal genomic loci. See description in the first method.
In step (d), single nucleotide variant analysis is performed to identify one or more mutations, if present, in each of the DNA sequences of the 23 target loci. See description in the first method.
In step (e), ALS is detected in the subject if at least one of the 23 target loci has a mutation, preferably at least two of the 23 target loci have mutations, and more preferably at least three of the 23 target loci have mutations.
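Steps (d) and (e) can be sketched as a lookup of called variants against the 23 target positions. The record format (chromosome, position, reference allele, alternate allele) and the function names are assumptions made for illustration; real input would typically come from a VCF produced by the pipeline described in the first method.

```python
from typing import Iterable, Set, Tuple

# The 23 target loci as listed in the text.
TARGET_LOCI = [
    "chr1:25854953", "chr1:3624870", "chr3:158557839", "chr3:185543848",
    "chr3:186923875", "chr4:17685198", "chr4:180358067", "chr5:53655366",
    "chr5:82813472", "chr5:94666955", "chr7:5338617", "chr8:62196626",
    "chr9:71428255", "chr9:89866631", "chr9:130224292", "chr10:119712877",
    "chr10:119712899", "chr12:82295320", "chr15:25687571", "chr15:74926032",
    "chr17:2562894", "chr17:40390624", "chr22:41330858",
]


def parse_locus(locus: str) -> Tuple[str, int]:
    """Split 'chr1:25854953' into ('chr1', 25854953)."""
    chrom, pos = locus.split(":")
    return chrom, int(pos)


TARGET_SET: Set[Tuple[str, int]] = {parse_locus(l) for l in TARGET_LOCI}

Variant = Tuple[str, int, str, str]  # (chrom, pos, ref, alt); hypothetical record


def mutated_target_loci(variants: Iterable[Variant]) -> Set[Tuple[str, int]]:
    """Target loci at which the sample differs from the reference."""
    return {(c, p) for c, p, ref, alt in variants
            if (c, p) in TARGET_SET and ref != alt}


def meets_locus_threshold(variants: Iterable[Variant], min_loci: int = 1) -> bool:
    """True if at least `min_loci` of the 23 target loci are mutated."""
    return len(mutated_target_loci(variants)) >= min_loci


if __name__ == "__main__":
    calls = [("chr10", 119712877, "G", "A"), ("chr10", 119712899, "C", "T"),
             ("chr2", 100, "A", "T")]  # hypothetical calls
    print(sorted(mutated_target_loci(calls)))
```

Note that the two chr10 loci count as two separate positions here even though they fall in the same gene and the same amplicon.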
The present method detects over 30% of the ALS-only population, at 99% CL, with the 23 target genomic signatures, when at least 1 of the 23 target genes contains a mutation. The present method provides use of genetic screening in early ALS diagnosis and therapeutic intervention. In the inter-genic and genic analyses, the inventors show that at least two genes must be mutated in the list of 23 top candidates to achieve 35.7-44.9% accuracy in detecting ALS.
The following examples further illustrate the present invention. These examples are intended merely to be illustrative of the present invention and are not to be construed as being limiting.
EXAMPLES
Example 1. Frequency of SNP Associations in 338 ALS patients
Methods
Statistical Methods
Clopper-Pearson Interval: Bounds are set on the true fractions of either population with a given feature(s). The number of people within a sample that are positive for the feature-of-interest will have a binomial distribution. Clopper-Pearson intervals on the binomial proportion are calculated for true population proportions11.
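A Clopper-Pearson interval can be computed by inverting the binomial tail probabilities. The sketch below does this with plain bisection in standard-library Python; it assumes a two-sided interval that splits 1 - CL equally between the tails, and the function names are illustrative. Under that assumption, a feature seen in 0 of 53 controls has a 99% CL upper bound of roughly 9.5%, which is consistent with the single-gene false positive bound quoted in the first aspect.

```python
import math


def _binom_pmf(k: int, n: int, p: float) -> float:
    """Binomial probability P(X = k) computed in log space for stability."""
    if p <= 0.0:
        return 1.0 if k == 0 else 0.0
    if p >= 1.0:
        return 1.0 if k == n else 0.0
    log_pmf = (math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)
               + k * math.log(p) + (n - k) * math.log1p(-p))
    return math.exp(log_pmf)


def _upper_tail(x: int, n: int, p: float) -> float:
    """P(X >= x); increasing in p."""
    return sum(_binom_pmf(k, n, p) for k in range(x, n + 1))


def _solve_increasing(f, target: float, iters: int = 100) -> float:
    """Bisection for f(p) = target on [0, 1], with f increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2.0
        if f(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2.0


def clopper_pearson(x: int, n: int, cl: float = 0.99):
    """Two-sided exact interval for x positives out of n at confidence level cl."""
    alpha = 1.0 - cl
    # Lower bound: P(X >= x | p) = alpha / 2.
    lower = 0.0 if x == 0 else _solve_increasing(
        lambda p: _upper_tail(x, n, p), alpha / 2.0)
    # Upper bound: P(X <= x | p) = alpha / 2, i.e. P(X >= x + 1 | p) = 1 - alpha / 2.
    upper = 1.0 if x == n else _solve_increasing(
        lambda p: _upper_tail(x + 1, n, p), 1.0 - alpha / 2.0)
    return lower, upper


if __name__ == "__main__":
    print(clopper_pearson(0, 53, cl=0.99))
```

For x = 0 the upper bound has the closed form 1 - (alpha/2)^(1/n), which provides a convenient check on the numerical solver.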
Fisher's Exact Test: This test gives the probability (p-value) of observing a difference at least as extreme as the one seen, under the null hypothesis that a mutation is present in the ALS population in the same proportion as in the control population12. This test statistic is well suited to this study because the probability is computed exactly rather than approximated and, given the sample sizes, can still be calculated in a reasonable amount of time. T-tests are approximations of this probability which converge to the exact value in the limit of large sample size.
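Fisher's exact test on a 2x2 table can be written directly from the hypergeometric distribution. The sketch below is a minimal standard-library version of the two-sided test, in which the p-value sums the probabilities of all tables with the same margins that are no more likely than the observed one; the function name and the example counts are hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact p-value for the table [[a, b], [c, d]].

    Rows could be ALS / control and columns mutated / not mutated.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2
    k_min, k_max = max(0, col1 - row2), min(row1, col1)

    def weight(k: int) -> int:
        # Unnormalised hypergeometric probability of k in the top-left cell.
        return comb(row1, k) * comb(row2, col1 - k)

    observed = weight(a)
    # Integer arithmetic keeps the "no more likely than observed" test exact.
    extreme = sum(weight(k) for k in range(k_min, k_max + 1) if weight(k) <= observed)
    return extreme / comb(n, col1)


if __name__ == "__main__":
    # Hypothetical counts: 40 of 338 ALS cases and 0 of 53 controls carry a variant.
    print(fisher_exact_two_sided(40, 298, 0, 53))
```

Because the weights are exact integers, no tolerance is needed when comparing table probabilities, which avoids a common source of off-by-one-table errors in floating-point implementations.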
Row-wise Conditional Percentage: To quantify how often a pair of our top 16 genes is mutated in the same patient, we calculated a conditional probability considering the independent probabilities of each mutation occurring on its own. For every possible ordered pairing of two genes (240 combinations), we counted the number of cases which have both gene mutations and divided by the total number of cases where the first gene was present. This metric is visually represented as a matrix, with each row and column representing a particular mutated gene from the set of 16, and each element representing the conditional probability of the column and row mutation happening in the same patient, adjusted for the baseline prevalence of the row mutation. The probability is converted into a percentage and can provide insights into how often two gene mutations co-occur in each patient.
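The row-wise conditional percentage can be sketched as follows. The input format (a mapping from patient ID to the set of mutated genes) and the function name are assumptions for illustration. Each element is the percentage of patients carrying the row gene who also carry the column gene; with 16 genes, the 240 ordered pairs are the off-diagonal elements of the 16 x 16 matrix.

```python
from typing import Dict, List, Set


def rowwise_conditional_percentage(
    patients: Dict[str, Set[str]], genes: List[str]
) -> List[List[float]]:
    """matrix[i][j] = 100 * (# patients with genes i and j) / (# patients with gene i)."""
    matrix: List[List[float]] = []
    for row_gene in genes:
        carriers = [mutated for mutated in patients.values() if row_gene in mutated]
        row: List[float] = []
        for col_gene in genes:
            if not carriers:
                row.append(0.0)  # row gene never mutated: no conditional defined
                continue
            both = sum(1 for mutated in carriers if col_gene in mutated)
            row.append(100.0 * both / len(carriers))
        matrix.append(row)
    return matrix


if __name__ == "__main__":
    # Hypothetical toy cohort with three genes.
    cohort = {"p1": {"A", "B"}, "p2": {"A"}, "p3": {"B", "C"}}
    for gene, row in zip("ABC", rowwise_conditional_percentage(cohort, ["A", "B", "C"])):
        print(gene, row)
```

The matrix is generally not symmetric, because the denominator is the number of carriers of the row gene rather than of the column gene.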
Genetic Data Acquisition
Answer ALS Data were provided by the Answer ALS consortium.
Patient Cohorts: The entire Answer ALS consortium genetic database was used, for a total of 338 ALS patients and 53 healthy control samples. The numbers of patients who tested positive for mutations in the known ALS-linked genes C9ORF72, SOD1, and TDP43 were 30, 6, and 1, out of 130, 27, and 14 tested ALS patients, respectively. Our statistical models were implemented with careful consideration of these numbers.
Results:
The results are shown in FIGs. 1-5.
SNPs in the Coding-Genome Found Only in ALS patients.
Identification of patients in the ALS population who carry SNPs that are not present in the healthy control population gives valuable insight into disease pathogenesis and genetic diagnostics. After sorting through the entire cohort of ALS patients and the healthy control sample, we detected 100,143 SNPs in coding genes that were found only in the ALS sample.
C9ORF72 hexanucleotide repeats are the most prevalent ALS mutation known to date, affecting 5-10% of all cases and up to 34% of familial ALS (fALS) cases (ref. 13). Of the SNPs detected in the ALS-only sample, we focused on the mutations present in more than 12% of the sample to determine candidate genes with a higher association than previously accepted. We found 21 candidate SNPs that reached this level of significance, each with a p-value of less than 1.5 x 10^-3, all of which exceed the reported population frequency of C9ORF72 (FIGs. 1 and 2).
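The selection step described above (keep variants that never appear in controls and that exceed a frequency threshold in cases) might look like the sketch below. The input layout, function name, and variant labels are hypothetical; the 12% default simply mirrors the threshold stated in the text.

```python
from collections import Counter


def als_only_candidates(case_variants, control_variants, min_fraction=0.12):
    """Return {variant: fraction of cases} for variants absent from every
    control sample and present in more than `min_fraction` of cases.

    Both inputs map a sample ID to the set of variant IDs called in that
    sample (hypothetical layout).
    """
    seen_in_controls = set()
    for variants in control_variants.values():
        seen_in_controls |= variants

    counts = Counter(v for variants in case_variants.values() for v in variants)
    n_cases = len(case_variants)
    return {
        v: count / n_cases
        for v, count in counts.items()
        if v not in seen_in_controls and count / n_cases > min_fraction
    }


# Toy cohort with placeholder variant labels.
cases = {f"case{i}": set() for i in range(10)}
for i in range(3):
    cases[f"case{i}"].add("var_A")   # 30% of cases, never in controls
cases["case9"].add("var_B")          # 10% of cases, below the threshold
for i in range(5):
    cases[f"case{i}"].add("var_C")   # frequent, but also seen in a control
controls = {"ctrl0": {"var_C"}, "ctrl1": set()}
print(als_only_candidates(cases, controls))
```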
Of the 21 SNP candidates, rs767982303 and rs760890146 were each found in 25% of the total ALS population yet are absent in controls (FIGs. 1 and 2). rs767982303 (located in the OXR1 gene) and rs760890146 (located in the NPM1P49 gene) both lead to an acceptor variant mutation. Other top SNPs of interest and their significance are illustrated.
Frequency of Candidate SNP Associations in ALS patients
To determine if there are associations between the candidate SNPs, we generated a row-wise conditional percentage heatmap. We found that roughly half of the ALS population containing one of the top 21 candidate SNPs also shared at least one other mutation. Mutations, rs62220927 and rs62220926, in the protein-coding gene TSPEAR, were both found in the same ALS patients, suggesting a dual-mutation linkage at those genomic sites.
Genomic Mutations in Genes of the ALS-only Population.
We identified mutations at the gene level that are present in the ALS population and not in healthy controls. We sorted and analyzed candidate genes based on the presence of at least one SNP in the gene, rather than on individual coding-region SNPs. This approach would allow us to identify gene-modifiers involved in ALS pathology.
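A gene-level roll-up of this kind can be sketched as follows: a gene counts as mutated in a patient if at least one of that patient's variants maps to it. The variant-to-gene mapping, the data layout, and the names are assumptions for illustration; in practice the mapping would come from an annotation step that the text does not detail here.

```python
from collections import Counter


def mutated_genes_per_patient(patient_variants, variant_to_gene):
    """Collapse each patient's variants to the set of genes they fall in.

    Variants without an entry in `variant_to_gene` are ignored.
    """
    return {
        pid: {variant_to_gene[v] for v in variants if v in variant_to_gene}
        for pid, variants in patient_variants.items()
    }


def gene_frequencies(patient_genes):
    """Fraction of patients with at least one variant in each gene."""
    counts = Counter(g for genes in patient_genes.values() for g in genes)
    n = len(patient_genes)
    return {gene: count / n for gene, count in counts.items()}


# Toy data with placeholder labels.
variants = {"p1": {"v1", "v2"}, "p2": {"v3"}, "p3": set()}
mapping = {"v1": "GENE_A", "v2": "GENE_A", "v3": "GENE_B"}
per_patient = mutated_genes_per_patient(variants, mapping)
print(gene_frequencies(per_patient))
```

Two variants in the same gene count once per patient, which is what distinguishes this gene-level view from the per-SNP analysis.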
We found 16 individual genes that were each mutated in over 12% of the ALS cases (FIG. 4). One gene, the miRNA gene MIR7155, was mutated in over 50% of the population (FIGs. 3 and 4). OXR1, which was identified in the SNP mutation list for its association with over 25% of the ALS-only population, also appeared in our candidate gene list, with every mutation producing either a missense mutation or an acceptor variant. This suggests that mutations in any genomic region of MIR7155, OXR1, and the other top candidates are associated with ALS in a large proportion of patients.
Associations of multiple mutations in the same population and biological pathway could provide insight into the pathology of ALS. We generated lists of interacting protein-coding genes and determined the frequency of mutation in those associated genes (ref. 14). For our top candidates, we found that genes in the same family, such as those encoding proteins that interact with OXR1 or RP11, were mutated in the control group at a rate similar to that observed in the ALS population. This suggests that the association is specific to our candidate genes and does not extend to their functionally associated protein-coding genes.
Frequency of Candidate Gene Associations in ALS patients
To determine if there are associations between the top 16 candidate genes and the top ALS-linked genes C9ORF72, SOD1, and TDP43, we generated a row-wise conditional percentage heatmap. In the ALS-only patient population, mutations in the genes TDP43, RP11, CIR1P3, and H2AFZP1 always co-occurred with a MIR7155 mutation. Interestingly, only about half of the patients with C9ORF72, CCDC42, or SOD1 mutations also carried a MIR7155 mutation, which suggests that MIR7155 represents a novel ALS-correlated gene.
Clinical Evaluation of ALS patients
Multiple clinical phenotypes can be associated with the onset of ALS, including age and area of clinical disease initiation (ref. 15). We compared the overall age of onset for the entire ALS population and the patients we can diagnose with our classifier analysis. There are no statistically significant differences detected in the age of onset for our diagnosable patients as compared to the overall ALS population.
Next, we compared the three different areas of disease onset, “axial”, “bulbar”, and “limb”, and each potential correlation to our candidate genetic mutations. We found no significant difference between the clinical appearance of disease onset in our top gene candidate patients and the symptoms displayed in the overall ALS population.
Classifier Analysis for the Top Candidate Genes in the ALS-only Population
Diagnostic testing based on novel gene sequence identification could serve as an early disease detection tool. To determine if our top 16 gene candidates, singly or in combination, could be used as a statistical tool to associate and identify the ALS-only population, we designed a classifier analysis. Evaluation of the top 16 candidates showed that a majority of the ALS sample had at least 3 of our 16 genes mutated, with a peak at 7 genes (FIGs. 5A and 5B). We propose a simple classifier that requires at least three of the 16 genes to be mutated. A conservative upper limit on the rate in the healthy population of having a mutation in each of these top 16 genes is estimated to be less than 10% (at 99% CL) using the Clopper-Pearson interval, since none of these genes was found mutated in the 53 control samples (ref. 11).
The proportion of healthy subjects with N independently mutated genes of our 16 will therefore be less than 0.1 to the Nth power (0.1^N) at 99% CL. Consequently, this classifier, which requires at least three of our 16 genes to be mutated, has a false-positive rate of less than 0.1% (1/1000), meaning the specificity is greater than 99.9% at 99% CL. The sensitivity of this classifier is 52% ± 7% at 99% CL, identifying just over half of the ALS sample.
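The "at least three of 16" rule and the false-positive arithmetic above can be sketched as follows. The gene names are copied as they appear in claim 6; the function names are hypothetical, and the bound reproduces the text's reasoning, which treats the per-gene mutations as independent.

```python
# Panel of 16 target genes, written exactly as listed in claim 6.
PANEL = {
    "MIR7155", "NPM1P49", "RP11-20B24.3", "HNRNPA1P44", "OXR1", "H2AFZP1",
    "TAB3P1", "RPL5P35", "ZNF92P2", "CIR1P3", "GNAI2", "CCDC42",
    "RP11-370110.6", "ADIPOR1P1", "KIAA1841", "AC008074.4",
}


def classify(mutated_genes, panel=PANEL, min_hits=3):
    """Return True if at least `min_hits` panel genes are mutated."""
    return len(set(mutated_genes) & set(panel)) >= min_hits


def false_positive_bound(per_gene_upper_bound, min_hits):
    """Bound on a healthy subject carrying `min_hits` specific mutated genes,
    assuming independence between genes (the assumption stated in the text)."""
    return per_gene_upper_bound ** min_hits


print(classify({"MIR7155", "OXR1", "GNAI2"}))   # three panel genes mutated
print(false_positive_bound(0.1, 3))              # 0.1 cubed, i.e. about 1/1000
print(177 / 338)                                 # observed sensitivity, about 52%
```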
Summary
Our data of Example 1 demonstrates SNPs in coding-regions or entire genes that are associated in a majority of the ALS population. In this clinical and biomedical trial, the Answer ALS consortium utilized the latest next-generation sequencing technology and annotation with the highest quality control and protocols to allow us to perform unbiased genetic analyses on protein-coding genes and other genomic areas of interest. We are the first to report on this novel genomic database using these statistical and computational methods.
We designed an analysis focused on identification of SNPs in the coding genome. This model allowed us to detect individual SNPs in 27% of the ALS cohort. When using a gene-level approach, we robustly identify a majority (>50%) of the ALS population at 79% CL using a one-sided Clopper-Pearson interval. Other known ALS-linked genes each account for less than 10% of the overall ALS population.
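One way to read the one-sided statement above is to ask for the highest confidence level at which the exact lower bound still exceeds 50%. The sketch below computes that level from a binomial tail probability. It assumes the underlying count is 177 of 338 (the figure given later in this Summary); the text does not say which count the 79% statement was based on, and the function names are hypothetical.

```python
from math import comb


def binom_sf(k, n, p):
    """P(X >= k) for X ~ Binomial(n, p)."""
    return sum(comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k, n + 1))


def majority_confidence(x, n, p0=0.5):
    """Largest one-sided confidence level at which the Clopper-Pearson lower
    bound for x positives out of n is still above p0.

    The one-sided lower bound at level 1 - alpha exceeds p0 exactly when
    P(X >= x | p0) < alpha, so the limiting level is 1 - P(X >= x | p0).
    """
    return 1 - binom_sf(x, n, p0)


# Assumed count: 177 of 338 ALS cases flagged by the gene-level approach.
print(majority_confidence(177, 338))
```

Under that assumption the result comes out close to the 79% quoted in the text.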
We have identified OXR1 as a gene mutated in 27% of the ALS group. OXR1 is an essential member of the antioxidant defense mechanisms in the cell.
In our gene-level analyses, we have discovered that the microRNA, MIR7155, is mutated in 52% of the ALS test-sample.
We found 52% of ALS cases (177 out of 338) with at least three of 16 target genes mutated, with a false-positive rate of less than 0.1%, at 99% confidence level (CL) (see methods on the Clopper-Pearson interval). This establishes that a majority of our ALS patients have mutations in at least 3 of our 16 target genes.
Example 2. Frequency of Candidate SNP Associations in 713 ALS patients
Methods
Statistical Methods
Statistical Methods are the same as described in Example 1.
Genetic Data Acquisition
Answer ALS Data were provided by the Answer ALS consortium and Alzheimer’s Disease Neuroimaging Initiative.
Patient Cohorts: A total of 713 ALS patients and 93 control samples were used from the ALS consortium genetic database. A total of 818 healthy control samples were used from the Alzheimer’s Disease Neuroimaging Initiative genetic dataset.
Results:
SNPs in the Coding-Genome Found Only in ALS patients.
Identification of patients in the ALS population that contain SNPs that are not present in the healthy control population renders valuable insight into disease pathogenesis and genetic diagnostics. After sorting through the entire cohort of ALS patients and the healthy control samples, we found that we could detect 44,156,401 variants present only in the ALS population.
C9ORF72 hexanucleotide repeats are the most prevalent ALS mutation known to date, affecting 5-10% of all cases and up to 34% of familial ALS (fALS) cases. Of the variants detected in the ALS-only samples, we focused on the variants present in more than 22% of the ALS population to determine candidate genes with a higher association than previously accepted. We found 23 variants in 22 different genes that reached this level of significance, each with a p-value of less than 2.2 x 10^-16, all of which exceed the reported population frequency of C9ORF72 (Table 2). Top variants of interest and their significance are shown. Table 2 shows the 22 genes that are not mutated in the control sample. For each gene, the table gives the gene name, the number of ALS cases out of the 713-patient cohort, the percentage of total ALS cases with the 99% CL Clopper-Pearson interval, and the p-value.
Table 2.
Table 3 shows the sensitivity and specificity of combined loci in detecting ALS. The sensitivity and specificity are shown for each number of mutated loci.
Genomic Mutations in Genes of the ALS-only Population.
We identified variants at the gene level that were present in the ALS population and absent in healthy controls. We sorted and analyzed candidate genes based on the presence of at least one variant rather than on individual variants. This approach would allow us to identify gene-modifiers involved in ALS pathology. We found 22 individual genes that were each mutated in over 21% of the ALS cases (Table 2). One gene, NDUFS4, was mutated in over 30% of the ALS population. The high proportion of ALS patients with variants in the 22 identified genes suggests that mutations in these genes are associated with ALS in a large proportion of patients.
Clinical Evaluation of ALS patients
Multiple clinical phenotypes can be associated with the onset of ALS, including age and area of clinical disease initiation15. We compared the overall age of onset for the entire ALS population and the patients we can diagnose with our classifier analysis. There are no statistically significant differences detected in the age of onset for our diagnosable patients compared to the overall ALS population.
Next, we compared the three different areas of disease onset, “axial,” “bulbar,” and “limb,” and each potential correlation to our candidate genetic mutations. We found no significant difference between the clinical appearance of disease onset in our top gene candidate patients and the symptoms displayed in the overall ALS population.
Classifier Analysis for the Top Candidate Genes in the ALS-only Population
Diagnostic testing based on novel gene sequence identification could serve as an early disease detection tool. We designed a classifier analysis to determine if our top 22 gene candidates, singly or in combination, could be used as a statistical tool to associate and identify the ALS-only population. Evaluation of the top 22 candidates showed that a majority of the ALS samples had at least one or two of our 23 specific loci mutated, with peaks at 17-20 loci and 22-23 loci, respectively. Our results show that the sensitivity of detecting ALS was 58.62% ± 4.8% when at least one of the 23 loci was mutated, and 40.4% ± 4.7% at 99% CL when at least two of the 23 loci were mutated, with a specificity of 100% because none of the 22 genes or 23 loci was mutated in the normal population.
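The sensitivity and specificity figures above follow from a simple threshold on the number of mutated loci per subject. A minimal sketch is shown below; the input layout (one count of mutated loci per subject), the function name, and the toy numbers are assumptions for illustration and are not the study data.

```python
def sensitivity_specificity(case_hits, control_hits, min_hits):
    """Sensitivity and specificity of the rule 'positive if at least
    `min_hits` of the panel loci are mutated'.

    case_hits / control_hits: number of mutated panel loci for each ALS
    case and each control subject, respectively.
    """
    true_positives = sum(1 for hits in case_hits if hits >= min_hits)
    true_negatives = sum(1 for hits in control_hits if hits < min_hits)
    return true_positives / len(case_hits), true_negatives / len(control_hits)


# Toy counts only.
toy_cases = [0, 0, 1, 2, 3, 5]
toy_controls = [0, 0, 0, 0]
for threshold in (1, 2):
    print(threshold, sensitivity_specificity(toy_cases, toy_controls, threshold))
```

Raising the threshold can only lower or preserve sensitivity, which matches the drop reported in the text when moving from at least one to at least two mutated loci; specificity stays at 100% as long as no control carries any panel mutation.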
Our results also show that the probability of detecting ALS increases as the number of positive test results for the 23 loci increases (Table 3).
FIG. 6 illustrates the distribution of candidate ALS-only mutated genes and the probability of having ALS or not having ALS based on the number of positive or negative results for the mutations. The distribution of the number of variants found, out of the 23 genomic loci, in the 713 ALS cases is shown in grey. The diamond plus represents the probability of having ALS, which increases with the number of positive variants. Further, the star represents the probability of not having ALS, which decreases with the number of positive variants.
References
1 Tandan R., B. W. G. Amyotrophic Lateral Sclerosis: Part 1. Clinical Features, Pathology, and Ethical Issues in Management. Annals of Neurology 18, 271-280 (1985).
2 Petrov, D., Mansfield, C., Moussy, A. & Hermine, O. ALS Clinical Trials Review: 20 Years of Failure. Are We Any Closer to Registering a New Treatment? Front Aging Neurosci 9, 68, doi: 10.3389/fnagi.2017.00068 (2017).
3 Zinman, L. & Cudkowicz, M. Emerging targets and treatments in amyotrophic lateral sclerosis. The Lancet Neurology 10, 481-490, doi: 10.1016/S1474-4422(11)70024-2 (2011).
4 Eisen, A. Amyotrophic Lateral Sclerosis is a Multifactorial Disease. Muscle & Nerve 18, 741-752 (1995).
5 Miller, S. J. Astrocyte Heterogeneity in the Adult Central Nervous System. Front Cell Neurosci 12, 401, doi: 10.3389/fncel.2018.00401 (2018).
6 Miller, S. J., Glatzer, J. C., Hsieh, Y. C. & Rothstein, J. D. Cortical astroglia undergo transcriptomic dysregulation in the G93A SOD1 ALS mouse model. J Neurogenet 32, 322-335, doi: 10.1080/01677063.2018.1513508 (2018).
7 Miller, S. J., Zhang, P. W., Glatzer, J. & Rothstein, J. D. Astroglial transcriptome dysregulation in early disease of an ALS mutant SOD1 mouse model. J Neurogenet 31, 37-48, doi: 10.1080/01677063.2016.1260128 (2017).
8 Boylan, K. Familial Amyotrophic Lateral Sclerosis. Neurol Clin 33, 807-830, doi: 10.1016/j.ncl.2015.07.001 (2015).
9 Mejzini, R. et al. ALS Genetics, Mechanisms, and Therapeutics: Where Are We Now? Front Neurosci 13, 1310, doi: 10.3389/fnins.2019.01310 (2019).
10 Desvignes, J. P. et al. VarAFT: a variant annotation and filtration system for human next generation sequencing data. Nucleic Acids Res 46, W545-W553, doi: 10.1093/nar/gky471 (2018).
11 Clopper, C. J. & Pearson, E. S. The use of confidence or fiducial limits illustrated in the case of the binomial. Biometrika 26, 404-413 (1934).
12 Fisher, R. A. On the interpretation of χ2 from contingency tables, and the calculation of P. Journal of the Royal Statistical Society 85, 87-94 (1922).
13 Renton, A. E. et al. A hexanucleotide repeat expansion in C9ORF72 is the cause of chromosome 9p21-linked ALS-FTD. Neuron 72, 257-268, doi: 10.1016/j.neuron.2011.09.010 (2011).
14 Szklarczyk, D. et al. STRING v11: protein-protein association networks with increased coverage, supporting functional discovery in genome-wide experimental datasets. Nucleic Acids Res 47, D607-D613, doi: 10.1093/nar/gky1131 (2019).
15 Ravits, J., Paul, P., Jorg C. Focality of upper and lower motor neuron degeneration at the clinical onset of ALS. Neurology 68, 1571-1575 (2007).
16 Rothstein, J. D., Svendsen, C. N., Cudkowicz, M., Berry, J., Maragakis, N., Sherman, A., Sareen, D., Finkbeiner, S., Fraenkel, E. Answer ALS: A clinical and comprehensive multi-omics signature for ALS employing induced pluripotent stem cell derived motor neurons from 1000 sporadic and familial ALS patients nationwide. Annals of Neurology 80, S243-S243 (2016).
17 Zhang, K. et al. The C9orf72 repeat expansion disrupts nucleocytoplasmic transport. Nature 525, 56-61, doi: 10.1038/nature14973 (2015).
18 Renton, A. E., Chio, A. & Traynor, B. J. State of play in amyotrophic lateral sclerosis genetics. Nat Neurosci 17, 17-23, doi: 10.1038/nn.3584 (2014).
19 Felbecker, A. et al. Four familial ALS pedigrees discordant for two SOD1 mutations: are all SOD1 mutations pathogenic? J Neurol Neurosurg Psychiatry 81, 572-577, doi: 10.1136/jnnp.2009.192310 (2010).
20 Liu, K. X. et al. Neuron-specific antioxidant OXR1 extends survival of a mouse model of amyotrophic lateral sclerosis. Brain 138, 1167-1181, doi: 10.1093/brain/awv039 (2015).
21 Tabula Muris, C. et al. Single-cell transcriptomics of 20 mouse organs creates a Tabula Muris. Nature 562, 367-372, doi: 10.1038/s41586-018-0590-4 (2018).
22 Molasy, M., Walczak, A., Szaflik, J., Szaflik, J. P. & Majsterek, I. MicroRNAs in glaucoma and neurodegenerative diseases. J Hum Genet 62, 105-112, doi: 10.1038/jhg.2016.91 (2017).
Claims
1. A method for detecting amyotrophic lateral sclerosis in a subject, comprising the step of: amplifying DNA extracted from a biological sample of a subject by target-specific polymerase chain reaction to amplify specific genomic loci comprising 23 specific chromosome positions of chr1:25854953 (chromosome 1 at nucleotide position 25854953), chr1:3624870, chr3:158557839, chr3:185543848, chr3:186923875, chr4:17685198, chr4:180358067, chr5:53655366, chr5:82813472, chr5:94666955, chr7:5338617, chr8:62196626, chr9:71428255, chr9:89866631, chr9:130224292, chr10:119712877, chr10:119712899, chr12:82295320, chr15:25687571, chr15:74926032, chr17:2562894, chr17:40390624, and chr22:41330858, purifying and sequencing the amplified DNA; analyzing each of the amplified DNA sequences and comparing with its corresponding DNA sequence of the normal genomic loci, identifying one or more mutations, if present, at the 23 chromosome positions, and detecting amyotrophic lateral sclerosis in the subject if one or more mutations are present in the 23 chromosome positions.
2. The method according to Claim 1, wherein amyotrophic lateral sclerosis is detected in the subject if two or more mutations are present in the 23 chromosome positions.
3. The method according to Claim 1, wherein the one or more mutations include single nucleotide polymorphisms or insertions and deletions.
4. The method according to Claim 1, wherein the 23 chromosome positions are located in 22 genes selected from the group consisting of: AL033528.3, THRAP3, AC106707.1, LIPH, AC007690.1, FAM184B, AC096747.1-NDUFB5P1, NDUFS4, RPL5P16-AC008885.1, SLF1, TNRC18, AC023095.1, TRPM3, AL161629.1, NCS1, TXNP1-INPP5F, CCDC59, ATP10A, COX5A, RN7SL33P, TOP2A, and ZC3H7B.
5. The method according to Claim 1, wherein the biological sample is blood, a tissue sample, or a cell.
6. A method for detecting amyotrophic lateral sclerosis in a subject, comprising the step of: sequencing 16 target genes from a biological sample of a subject, wherein the specific genes are MIR7155, NPM1P49, RP11-20B24.3, HNRNPA1P44, OXR1, H2AFZP1, TAB3P1, RPL5P35, ZNF92P2, CIR1P3, GNAI2, CCDC42, RP11-370110.6, ADIPOR1P1, KIAA1841, and AC008074.4, comparing each of the DNA sequences of the 16 target genes with its corresponding normal genes, identifying one or more mutations, if present, in each of the DNA sequences of the 16 target genes, and detecting amyotrophic lateral sclerosis in the subject if at least one of the 16 target genes has one or more mutations.
7. The method according to Claim 6, wherein amyotrophic lateral sclerosis is detected in the subject if at least two of the 16 target genes have one or more mutations.
8. The method according to Claim 6, wherein amyotrophic lateral sclerosis is detected in the subject if at least three of the 16 target genes have one or more mutations.
9. The method according to Claim 6, wherein the biological sample is blood, a tissue sample, or a cell.
Applications Claiming Priority (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
US202063093049P | 2020-10-16 | 2020-10-16 | |
US63/093,049 | 2020-10-16 |
Publications (1)
Publication Number | Publication Date |
---|---|
WO2022082199A1 (en) | 2022-04-21 |
Family
ID=81209431
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
PCT/US2021/071865 WO2022082199A1 (en) | 2020-10-16 | 2021-10-14 | Method for detecting amyotrophic lateral sclerosis |
Country Status (1)
Country | Link |
---|---|
WO (1) | WO2022082199A1 (en) |
Cited By (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
WO2024077282A3 (en) * | 2022-10-07 | 2024-06-13 | Neu Bio, Inc. | Biomarkers for the diagnosis of amyotrophic lateral sclerosis |
Citations (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US20040137450A1 (en) * | 2001-04-16 | 2004-07-15 | Hadano Shinji | Als2 gene and amyotrophic lateral sclerosis type 2 |
WO2013041577A1 (en) * | 2011-09-20 | 2013-03-28 | Vib Vzw | Methods for the diagnosis of amyotrophic lateral sclerosis and frontotemporal lobar degeneration |
US20130109589A1 (en) * | 2006-11-30 | 2013-05-02 | Translational Genomics Research Institute | Single nucleotide polymorphisms associated with amyotrophic lateral sclerosis |
US20160177389A1 (en) * | 2008-07-22 | 2016-06-23 | The General Hospital Corporation D/B/A Massachusetts General Hospital | Fus/tls-based compounds and methods for diagnosis, treatment and prevention of amyotrophic lateral sclerosis and related motor neuron diseases |
US20160338328A1 (en) * | 2009-08-25 | 2016-11-24 | Hiroshima University | Animal model and cell model developing amyotrophic lateral sclerosis |
Non-Patent Citations (2)
Title |
---|
CALINI DANIELA, CORRADO LUCIA, DEL BO ROBERTO, GAGLIARDI STELLA, PENSATO VIVIANA, VERDE FEDERICO, CORTI STEFANIA, MAZZINI LETIZIA,: "Analysis of hnRNPA1, A2/B1, and A3 genes in patients with amyotrophic lateral sclerosis", NEUROBIOLOGY OF AGING, vol. 34, no. 11, November 2013 (2013-11-01) - 2 July 2013 (2013-07-02), pages 1 - 4, XP028691762, DOI: 10.1016/j.neurobiolaging.2013.05.025 * |
LIU KEVIN X., EDWARDS BENJAMIN, LEE SHEENA, FINELLI MATTÉA J., DAVIES BEN, DAVIES KAY E., OLIVER PETER L.: "Neuron-specific antioxidant OXR1 extends survival of a mouse model of amyotrophic lateral sclerosis", BRAIN, vol. 138, no. 5, May 2015 (2015-05-01), pages 1167 - 1168, XP055933313, DOI: 10.1093/brain/awv039 * |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
 | 121 | Ep: the epo has been informed by wipo that ep was designated in this application | Ref document number: 21881316; Country of ref document: EP; Kind code of ref document: A1 |
 | NENP | Non-entry into the national phase | Ref country code: DE |
 | 122 | Ep: pct application non-entry in european phase | Ref document number: 21881316; Country of ref document: EP; Kind code of ref document: A1 |