
WO2014166303A2 - Use of multiomic signature to predict diabetes - Google Patents

Info

Publication number
WO2014166303A2
Authority
WO
WIPO (PCT)
Prior art keywords
chr1
associated genes
genes
mir
levels
Prior art date
Application number
PCT/CN2014/000408
Other languages
French (fr)
Other versions
WO2014166303A3 (en)
Inventor
Juliana Chung-Ngor CHAN
Ronald Ching-Wan MA
Heung Man Lee
Xin Liu
Zhi Bao GAO
Jun Wang
Original Assignee
The Chinese University Of Hong Kong
Bgi-Hong Kong Co., Ltd.
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by The Chinese University Of Hong Kong, Bgi-Hong Kong Co., Ltd.
Priority to CN201480031292.3A (patent CN105431552B)
Publication of WO2014166303A2
Publication of WO2014166303A3

Classifications

    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q1/00 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions
    • C12Q1/68 Measuring or testing processes involving enzymes, nucleic acids or microorganisms; Compositions therefor; Processes of preparing such compositions involving nucleic acids
    • C12Q1/6876 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes
    • C12Q1/6883 Nucleic acid products used in the analysis of nucleic acids, e.g. primers or probes for diseases caused by alterations of genetic material
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/154 Methylation markers
    • C CHEMISTRY; METALLURGY
    • C12 BIOCHEMISTRY; BEER; SPIRITS; WINE; VINEGAR; MICROBIOLOGY; ENZYMOLOGY; MUTATION OR GENETIC ENGINEERING
    • C12Q MEASURING OR TESTING PROCESSES INVOLVING ENZYMES, NUCLEIC ACIDS OR MICROORGANISMS; COMPOSITIONS OR TEST PAPERS THEREFOR; PROCESSES OF PREPARING SUCH COMPOSITIONS; CONDITION-RESPONSIVE CONTROL IN MICROBIOLOGICAL OR ENZYMOLOGICAL PROCESSES
    • C12Q2600/00 Oligonucleotides characterized by their use
    • C12Q2600/158 Expression markers

Definitions

  • Type 2 diabetes is a complex disease due to interaction between genetic and environmental factors.
  • Large scale family-based linkage analysis [3-6] and genome-wide association studies (GWAS) in case-control cohorts have identified multiple chromosomal regions and genetic variants associated with T2D [7-9], including genetic variants specific for the Chinese [10].
  • Most of the GWAS-T2D loci or nearby genes are implicated in β-cell development, structure and function. Despite their replication in multiple populations, these SNPs have low odds ratios (1.5-2) and explain less than 10% of the heritability of T2D.
  • Multiomic data, including genomic, transcriptomic, or methylomic data, can be obtained from individuals and compared between individuals.
  • Differences in these data can then be integrated and correlated with diabetes phenotypes.
  • the individuals are related, such as in a father-mother-daughter trio.
  • the integration or correlation is performed using network analysis. Multiomic signatures, including genes not previously associated with diabetes, are identified for predicting the occurrence of diabetes in individuals and developing treatments for diabetes.
  • a method for detecting the presence or increased risk of developing T2D in a subject includes these steps: (a) measuring, in a tissue sample obtained from the subject, expression or DNA methylation levels of one or more T2D-associated genes; (b) comparing the expression or DNA methylation levels of the one or more T2D-associated genes with levels from the same genes in a standard control; and (c) detecting the presence or increased risk of developing T2D when the expression or DNA methylation levels of the one or more T2D-associated genes are higher or lower than the levels in the standard control.
  • the T2D-associated genes are selected from the group consisting of LRRC7, TXNDC12, TGFBR3, AJAP1, AGRN, APOB, EIF4E3, SLC35A4, PACRG, COL15A1, NR4A3, E2F8, PCDH17, PLEKHH3, SCARF2, CUEDC2, KDM2B, PDCD1, miR-16-1, and miR-16-2.
  • step (a) includes measuring expression levels of the one or more T2D-associated genes, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, EIF4E3, CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
  • Measuring expression levels can include measuring levels of RNA transcripts of the one or more T2D-associated genes.
  • measuring expression levels can include measuring amounts of proteins or polypeptides resulting from translation of RNA transcripts of the one or more T2D-associated genes.
  • step (c) can include detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, and EIF4E3.
  • step (c) can include detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
  • step (a) includes measuring DNA methylation levels of the one or more T2D-associated genes and the T2D-associated genes are selected from the group consisting of PDCD1, EIF4E3, PACRG, and KDM2B.
  • step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of PDCD1 and EIF4E3.
  • the regions of the T2D-associated genes can be promoter regions.
  • step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is higher than the level in the standard control.
  • step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of EIF4E3, PACRG, and KDM2B.
  • step (c) includes detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is lower than the level in the standard control.
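The comparison in steps (b) and (c) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `flag_genes` and the dictionary layout are hypothetical, the gene directions are copied from the expression-level embodiments above, and a plain greater-than or less-than test stands in for whatever statistical comparison an actual assay would use.

```python
from typing import Dict, List

# Direction of change listed in the text for expression levels relative to
# the standard control. The data layout is an assumption for illustration.
EXPRESSION_DIRECTION: Dict[str, str] = {
    "KDM2B": "higher", "TXNDC12": "higher", "EIF4E3": "higher",
    "CUEDC2": "lower", "PDCD1": "lower", "PACRG": "lower",
    "miR-16-1": "lower", "miR-16-2": "lower",
}


def flag_genes(sample: Dict[str, float], control: Dict[str, float]) -> List[str]:
    """Return genes whose sample level deviates from the control level
    in the direction listed for that gene."""
    flagged = []
    for gene, direction in EXPRESSION_DIRECTION.items():
        if gene not in sample or gene not in control:
            continue  # gene was not measured in both sample and control
        if direction == "higher" and sample[gene] > control[gene]:
            flagged.append(gene)
        elif direction == "lower" and sample[gene] < control[gene]:
            flagged.append(gene)
    return flagged
```

A gene that moves in the opposite direction to the one listed (for example, PACRG above its control level) is not flagged by this sketch.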
  • a method for detecting the presence or increased risk of developing T2D in a subject.
  • the method includes these steps: (a) determining, in a tissue sample obtained from the subject, a single-nucleotide genotype of the subject at a genomic locus, wherein the genomic locus is given by a SNP ID number provided in Table 9, 10, or 11; and (b) detecting the presence or increased risk of developing T2D in the subject if the single-nucleotide genotype matches a reference genotype associated with T2D.
  • the genomic locus is given by a SNP ID number provided in Table 11 and selected from the group consisting of rs1052637, rs748767, rs9998519, rs1534938, rs17497819, rs11597439, rs117510629, rs11065587,
  • the reference genotype associated with T2D for each genomic locus can be provided in a column A1 of Table 11.
  • the genomic locus is given by the SNP ID number rs9998519 and the reference genotype is C.
  • the genomic locus is given by the SNP ID number rs1534938 and the reference genotype is T.
  • the genomic locus is given by a SNP ID number provided in Table 9 and the reference genotype for the genomic locus is provided in the "risk genotype” column of Table 9. In some embodiments, the genomic locus is given by a SNP ID number provided in Table 10 and the reference genotype for the genomic locus is provided in the "risk allele” column of Table 10.
  • steps (a) and (b) of the method are performed for a plurality of genomic loci.
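Steps (a) and (b), repeated over several loci, amount to a lookup against a table of reference genotypes. The sketch below is a minimal illustration, not the claimed method: it includes only the two reference genotypes stated in the text (Tables 9 to 11 are not reproduced here), the function name `matching_loci` is hypothetical, and treating "matches" as "carries the reference allele" is an assumption.

```python
from typing import Dict, List

# Reference genotypes stated in the text; other loci would come from
# Tables 9-11, which are not available in this excerpt.
REFERENCE_ALLELE: Dict[str, str] = {"rs9998519": "C", "rs1534938": "T"}


def matching_loci(genotypes: Dict[str, str]) -> List[str]:
    """Return SNP IDs at which the subject carries the reference allele.

    `genotypes` maps a SNP ID to the subject's called alleles, e.g. "CT".
    Loci without a known reference allele are ignored.
    """
    return sorted(
        snp
        for snp, alleles in genotypes.items()
        if snp in REFERENCE_ALLELE and REFERENCE_ALLELE[snp] in alleles.upper()
    )
```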
  • the sample can be a blood or saliva sample.
  • the subject from whom the sample is obtained can be of Asian descent.
  • the subject is Chinese. In some cases, the subject is Han Chinese.
  • the subject can also have a family history of T2D but no diagnosis for T2D (i.e., the subject has not been diagnosed with T2D).
  • step (a) includes an amplification reaction.
  • the amplification reaction can be a polymerase chain reaction (PCR).
  • a method of identifying genetic markers of T2D includes these steps: obtaining a multiomic data set from one or more T2D-positive subjects, wherein each T2D-positive subject has been diagnosed with T2D; obtaining a multiomic data set from one or more T2D-negative subjects, wherein each T2D-negative subject has not been diagnosed with T2D; identifying differences between the multiomic data sets obtained from the one or more T2D-positive subjects and the one or more T2D-negative subjects; and identifying one or more genetic markers based on the differences.
  • the T2D-positive subjects are related to the T2D-negative subjects.
  • the T2D-positive subjects and T2D-negative subjects can together include a family trio, and the family trio can include a father, a mother, and an offspring.
  • one of the father and the mother is a T2D-positive subject
  • the other of the father and the mother is a T2D-negative subject.
  • the father is a T2D-negative subject and the mother and the offspring are T2D-positive subjects.
  • the offspring is a daughter of the father and the mother.
  • each multiomic data set includes at least two of the following: DNA sequencing data, RNA sequencing data, and RNA expression level data.
  • the DNA sequencing data can include genomic sequencing data or methylomic sequencing data.
  • the RNA sequencing data can include transcriptomic sequencing data or microRNA sequencing data.
  • the RNA expression level data can include mRNA expression level data or microRNA expression level data.
  • differences between the multiomic data sets from the one or more T2D-positive subjects and the one or more T2D-negative subjects are identified by comparing the multiomic data sets pair-wise.
  • the one or more genetic markers are identified using network analysis.
  • the one or more genetic markers can include connected genes.
  • the genetic markers co-segregate in the multiomic data sets obtained from the T2D-positive subjects or the T2D-negative subjects.
  • the genetic markers include differentially expressed genes, differentially expressed microRNAs, differentially methylated regions, or single-nucleotide polymorphisms.
  • the genetic markers can include a differentially expressed gene that contains or overlaps with a differentially methylated region.
  • the differentially methylated region occurs within or overlaps with the promoter of the gene.
  • the differentially methylated region occurs within or overlaps with the body of the gene.
  • Embodiments of the method can also include validating the genetic markers by comparing gene expression or methylation in T2D-positive and T2D-negative cohorts.
  • Validating can include performing quantitative PCR.
  • a method can be carried out in whole or in part using a computer system. For example, comparisons between or among gene expression levels, DNA methylation levels, or SNP genotypes can be performed computationally, as can preparation of multiomic data sets and multiomic or network analysis.
  • a computer product including a tangible computer readable medium is provided, and the medium stores a plurality of instructions that when executed control a computer system to perform a method as substantially described herein.
  • a system is provided that includes one or more processors configured to perform a method as
  • Figure 1 Summary of the study design. DNA from the PBMC of the trio will be subjected to whole genome sequencing and whole genome bisulfite sequencing. Whole genome sequencing identified the SNPs and indels from the trio. Bisulfite sequencing identified the DNA methylation sites. The RNA samples were subjected to RNA sequencing and microRNA sequencing to identify the gene expression profile and the microRNA expression pattern.
  • Figure 2 Summary of the multiomic data analysis of DNA methylation and gene expression from the trio. Gene loci of interest were chosen for validation in DNA methylation and gene expression. Co-segregated SNPs in the gene loci of interest will be checked against the results of the Chinese GWAS meta-analysis to identify target SNPs for validation in an independent case-control cohort.
  • Figure 3 A computer system for use with some embodiments of the invention.
  • FIG. 4 Summary of differential methylation regions (DMR) in the genome. Whole genome methylation analysis showed that there are over 1500 DMRs in each member of the trio. 186 DMRs co-segregated only in the father-mother (FM) and father-daughter (FD) pairs. These 186 DMRs overlap with 32 genes.
  • DMR differential methylation regions
  • FIG. 5 Summary of gene expression and microRNA expression in the trio. The Venn diagrams show the number of expressed genes in the family trio (top left) and the pairwise comparison of expressed genes (top right). 413 genes were expressed differentially in both the FM and FD comparisons but not in the MD comparison. Expression of microRNA is summarized at the bottom. Over 500 microRNAs were expressed in each member of the trio (left) and 67 showed differential expression in only the FM and FD pairs (right).
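The co-segregation filter described for FIG. 4 and FIG. 5 (features that differ in the father-mother and father-daughter comparisons but not in the mother-daughter comparison) is a set operation. The following sketch assumes each pairwise comparison has already been reduced to a set of feature identifiers; the function name and the identifiers in the usage example are hypothetical.

```python
from typing import Set


def cosegregating(fm: Set[str], fd: Set[str], md: Set[str]) -> Set[str]:
    """Return features that differ in both the father-mother (FM) and
    father-daughter (FD) comparisons but not in the mother-daughter (MD)
    comparison."""
    return (fm & fd) - md
```

For example, with `fm = {"g1", "g2", "g3"}`, `fd = {"g2", "g3", "g4"}` and `md = {"g3"}`, only `g2` passes the filter. In a trio where the father is T2D-negative and the mother and daughter are T2D-positive, such features track with disease status.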
  • FIG. 6 Network analysis to discover genes associated with T2D.
  • Network analysis constructed a gene network composed of thousands of genes.
  • the sub-network that contains the known T2D GWAS genes (left) comprises 141 genes; the 15 most connected genes are shown in the table at right.
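One common way to rank "most connected" genes in a network is by degree, that is, the number of distinct neighbours of each gene. The text does not say which connectivity measure was used, so degree is an assumption here, and the function name and gene labels are hypothetical.

```python
from collections import Counter
from typing import Iterable, List, Tuple


def most_connected(edges: Iterable[Tuple[str, str]], top_n: int) -> List[Tuple[str, int]]:
    """Rank genes by degree in an undirected gene network.

    Duplicate edges and self-loops are ignored. Ties are broken
    alphabetically so the result is deterministic.
    """
    unique_edges = {frozenset(edge) for edge in edges if edge[0] != edge[1]}
    degree: Counter = Counter()
    for edge in unique_edges:
        for gene in edge:
            degree[gene] += 1
    return sorted(degree.items(), key=lambda item: (-item[1], item[0]))[:top_n]
```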
  • FIG. 7 Discovery of T2D genes by multiomic approach. Whole genome bisulfite sequencing and RNA sequencing were applied to a family trio with T2D to reveal DNA methylation and gene expression change in PBMC associated with T2D.
  • the CIRCOS diagram in centre summarizes the differentially methylated regions and differentially expressed genes. Key changes in both DNA methylation and gene expression in 5 gene regions were successfully validated in a 65 vs. 65 case control cohort.
  • T2D type 2 diabetes
  • Type 2 diabetes may be caused by a combination of lifestyle and genetic factors. Diabetes can be caused by distinct clinical entities such as endocrine disorders (e.g., Cushing's syndrome) and chronic pancreatitis. However, the majority of people with diabetes have risk factors including but not limited to obesity, hypertension, high blood cholesterol, metabolic syndrome (high triglyceride, low HDL-C, high blood glucose, high blood pressure, large waist), which may share common metabolic pathways, further amplified by aging, energy dense diets (e.g., high-fat and high
  • T2D glucose
  • drugs e.g., beta blockers, steroids
  • Having relatives (especially first degree) with T2D substantially increases the risk of developing T2D.
  • Symptoms of T2D often include polyuria (frequent urination), polydipsia (increased thirst), polyphagia (increased hunger), fatigue, and weight loss.
  • the abnormal neurohormonal and metabolic milieu characterized by hyperglycemia, dyslipidemia and low grade inflammation can trigger a cascade of signaling pathways, which can lead to cell death and dysregulated cell growth, giving rise to multiple morbidities including heart disease, strokes, limb amputation, visual loss, kidney failure, cancers, and cognitive impairment.
  • cardiovascular disease refers to a broad class of diseases that involve the heart or blood vessels (arteries and veins) and affect the cardiovascular system, such as conditions related to atherosclerosis (arterial disease). These include, but are not limited to, stroke, coronary heart disease and peripheral vascular disease.
  • Known risk factors for cardiovascular diseases include unhealthy eating, lack of exercise, obesity, suboptimally managed diabetes, abnormal blood lipids, high blood pressure, excessive consumption of alcohol, use of tobacco, as well as genetic background.
  • a BMI of 20 to 25 kg/m² is considered optimal weight; a BMI lower than 20 kg/m² suggests the person is underweight whereas a BMI above 25 kg/m² may indicate the person is overweight; a BMI above 30 kg/m² suggests the person is obese; and a BMI over 40 kg/m² indicates the person to be morbidly obese.
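The BMI bands above translate directly into a small classifier. BMI is body weight in kilograms divided by the square of height in metres; the function names below are hypothetical, and the cut-offs are the general ones quoted in the preceding definition rather than population-specific values.

```python
def bmi(weight_kg: float, height_m: float) -> float:
    """Body mass index in kg/m^2."""
    return weight_kg / height_m ** 2


def bmi_category(value: float) -> str:
    """Map a BMI value to the bands given in the text."""
    if value > 40:
        return "morbidly obese"
    if value > 30:
        return "obese"
    if value > 25:
        return "overweight"
    if value >= 20:
        return "optimal"
    return "underweight"
```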
  • Asians have more body fat for the same degree of BMI and waist circumference.
  • normal weight and obesity in Asians are defined as <23 kg/m² and >25
  • biological sample includes any section of tissue or bodily fluid taken from a test subject such as a biopsy and autopsy sample, and frozen section taken for histologic purposes, or processed forms of any of such samples.
  • Biological samples include blood and blood fractions or products (e.g., serum, plasma, platelets, white blood cells, red blood cells, and the like), sputum or saliva, lymph and tongue tissue, cultured cells, e.g., primary cultures, explants, and transformed cells, stool, urine, stomach biopsy tissue etc.
  • a biological sample is typically obtained from a eukaryotic organism, which may be a mammal, may be a primate and may be a human subject.
  • biopsy refers to the process of removing a tissue sample for diagnostic or prognostic evaluation, and to the tissue specimen itself. Any biopsy technique known in the art can be applied to the methods of the present invention. The biopsy technique applied will depend on the tissue type to be evaluated (e.g., tongue, colon, prostate, kidney, bladder, lymph node, liver, bone marrow, blood cell, stomach tissue, etc.) among other factors. Representative biopsy techniques include, but are not limited to, excisional biopsy, incisional biopsy, needle biopsy, surgical biopsy, and bone marrow biopsy and may comprise endoscopy such as colonoscopy. A wide range of biopsy techniques are well known to those skilled in the art who will choose between them and implement them with minimal experimentation.
  • isolated nucleic acid molecule means a nucleic acid molecule that is separated from other nucleic acid molecules that are usually associated with the isolated nucleic acid molecule.
  • an "isolated" nucleic acid molecule includes, without limitation, a nucleic acid molecule that is free of nucleotide sequences that naturally flank one or both ends of the nucleic acid in the genome of the organism from which the isolated nucleic acid is derived (e.g., a cDNA or genomic DNA fragment produced by a polymerase chain reaction or restriction endonuclease digestion).
  • an isolated nucleic acid molecule is generally introduced into a vector (e.g., a cloning vector or an expression vector) for convenience of manipulation or to generate a fusion nucleic acid molecule.
  • an isolated nucleic acid molecule can include an engineered nucleic acid molecule such as a recombinant or a synthetic nucleic acid molecule.
  • nucleic acid or “polynucleotide” refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides.
  • nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, single nucleotide polymorphisms (SNPs), and complementary sequences as well as the sequence explicitly indicated.
  • degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and
  • nucleic acid is used
  • gene means the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region (leader and trailer) involved in the transcription and/or translation of the gene product and the regulation of the transcription and/or translation, as well as intervening sequences (introns) between individual coding segments (exons).
  • “polypeptide,” “peptide,” and “protein” refer to polymers of amino acid residues, including amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as naturally occurring amino acid polymers and non-naturally occurring amino acid polymers.
  • the terms encompass amino acid chains of any length, including full-length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds.
  • amino acid refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids.
  • Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine.
  • amino acid analogs refers to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., a carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid.
  • amino acid mimetics refer to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that functions in a manner similar to a naturally occurring amino acid.
  • Amino acids may include those having non-naturally occurring D-chirality, as disclosed in WO01/12654, which may improve the stability (e.g., half-life), bioavailability, and other characteristics of a polypeptide comprising one or more of such D-amino acids. In some cases, one or more, and potentially all of the amino acids of a therapeutic polypeptide have D-chirality.
  • Amino acids may be referred to herein by either the commonly known three letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical
  • the specified binding agent e.g. , an antibody
  • Specific binding of an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein or a protein but not its similar "sister" proteins.
  • immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein or in a particular form.
  • solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988) for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity).
  • a specific or selective binding reaction will be at least twice background signal or noise and more typically more than 10 to 100 times background.
  • polynucleotide hybridization method refers to a method for detecting the presence and/or quantity of a pre-determined polynucleotide sequence based on its ability to form Watson-Crick base-pairing, under appropriate hybridization conditions, with a polynucleotide probe of a known sequence.
  • Primer refers to oligonucleotides that can be used in an amplification method, such as a polymerase chain reaction (PCR), to amplify a nucleotide sequence based on the polynucleotide sequence corresponding to a gene of interest or a portion thereof. Typically, at least one of the PCR primers for amplification of a polynucleotide sequence is sequence-specific for that polynucleotide sequence.
  • the exact length of the primer will depend upon many factors, including temperature, source of the primer, and the method used.
  • the oligonucleotide primer typically contains at least 10, or 15, or 20, or 25 or more nucleotides, although it may contain fewer nucleotides or more nucleotides. The factors involved in determining the appropriate length of primer are readily known to one of ordinary skill in the art.
  • primer pair means a pair of primers that hybridize to opposite strands of a target DNA molecule or to regions of the target DNA which flank a nucleotide sequence to be amplified.
  • primer site means the area of the target DNA or other nucleic acid to which a primer hybridizes.
  • a “label,” “detectable label,” or “detectable moiety” is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means.
  • useful labels include ³²P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins that can be made detectable, e.g., by incorporating a radioactive component into the peptide or used to detect antibodies specifically reactive with the peptide.
  • a detectable label is attached to a probe or a molecule with defined binding characteristics (e.g., a polypeptide with a known binding specificity or a polynucleotide), so as to allow the presence of the probe (and therefore its binding target) to be readily detectable.
  • the amount of a polynucleotide of interest or a polypeptide of interest present in a sample may be expressed in absolute terms, i.e., the total quantity of the polynucleotide or polypeptide in the sample, or in relative terms, i.e., the concentration of the polynucleotide or polypeptide in the sample.
  • an effective amount of a cholesterol lowering drug or a blood glucose lowering drug is the amount of said drug to achieve a decreased level of cholesterol or blood glucose, respectively, in a patient who has been given the drug for therapeutic purposes.
  • An amount adequate to accomplish this is defined as the “therapeutically effective dose.”
  • the dosing range varies with the nature of the therapeutic agent being administered and other factors such as the route of
  • expression level refers to the expression level of a particular gene or genomic region in a sample obtained from an individual. Expression can be represented by the amount of transcribed RNA or translated protein or polypeptide associated with the gene or genomic region, or by a biological activity (e.g. enzymatic activity) resulting from expression of the gene or genomic region. Expression levels can be measured as desired, in absolute or relative terms, and can reflect the context of the measurement or the conditions under which the sample was obtained.
  • DNA methylation level refers to the extent to which a gene or genomic region is methylated in a sample obtained from an individual. A gene can be fully or partially methylated, and the pattern of methylation can be random, uniform, or specific to portions of the gene (for example, the promoter or body). Moreover, the pattern and extent of methylation of a gene can vary, for example between chromosomes in the same cell, tissues of the same individual, or different individuals.
  • measuring a DNA methylation level in a sample can provide a detailed methylation pattern and can reflect the context in which the sample was obtained.
  • the measured DNA methylation level can be used to determine whether a genomic region is differentially methylated, for example between T2D-positive and T2D-negative individuals.
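As a rough illustration of how a methylation level might be quantified and compared, the sketch below computes the fraction of methylated reads over a region from bisulfite sequencing counts and tests whether two levels differ by a minimum amount. The text does not specify a formula or threshold, so the read-fraction definition, the function names, and the default `min_delta` of 0.2 are all assumptions.

```python
def methylation_level(methylated_reads: int, unmethylated_reads: int) -> float:
    """Fraction of reads supporting methylation over a region (0.0 to 1.0)."""
    total = methylated_reads + unmethylated_reads
    if total == 0:
        raise ValueError("region has no read coverage")
    return methylated_reads / total


def is_differentially_methylated(level_a: float, level_b: float,
                                 min_delta: float = 0.2) -> bool:
    """True when two methylation levels differ by at least `min_delta`.

    `min_delta` is a hypothetical cut-off chosen for illustration only.
    """
    return abs(level_a - level_b) >= min_delta
```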
  • Standard control refers to a sample suitable for the use of a method of the present invention, in order to quantitatively determine the level of expression (e.g., abundance of RNA transcripts or gene products) or DNA methylation in a test sample for one or more genomic regions of interest (for example, a gene or genomic locus).
  • the standard control contains a known level or levels of expression or DNA methylation for the genomic region(s) of interest, such that the levels closely reflect those of an average healthy individual not suffering from T2D and not at an increased risk of later developing T2D.
  • the standard control may be derived from one or more healthy individuals.
  • Higher or lower than levels in a standard control refers to differences between the level of expression or DNA methylation in a test sample as compared with corresponding levels in a standard control, for the same genomic region of interest.
  • a higher level is preferably at least 2-fold, more preferably at least 5-fold, and most preferably at least 10-fold higher than the level in the standard control.
  • a lower level is preferably less than 50%, more preferably less than 20%, and most preferably less than 10% of the level in the standard control.
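The preferred thresholds above (2-, 5-, and 10-fold for higher levels; below 50%, 20%, and 10% for lower levels) can be expressed as a ratio test. This is a sketch under assumptions: the function name, the returned tier numbers (1 = preferred, 2 = more preferred, 3 = most preferred), and the label "within" for ratios that meet neither threshold are hypothetical.

```python
from typing import Tuple


def classify_level(sample: float, control: float) -> Tuple[str, int]:
    """Classify a sample level against a standard control level.

    Returns (direction, tier), where tier 1-3 corresponds to the
    preferred, more preferred, and most preferred thresholds in the
    text, and ("within", 0) means no threshold was met.
    """
    if control <= 0:
        raise ValueError("control level must be positive")
    ratio = sample / control
    if ratio >= 10:
        return ("higher", 3)
    if ratio >= 5:
        return ("higher", 2)
    if ratio >= 2:
        return ("higher", 1)
    if ratio < 0.1:
        return ("lower", 3)
    if ratio < 0.2:
        return ("lower", 2)
    if ratio < 0.5:
        return ("lower", 1)
    return ("within", 0)
```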
  • subject or “subject in need of treatment,” as used herein, includes individuals who seek medical attention due to risk of, or actual suffering from T2D, cancer, or cardiovascular disease. Subjects also include individuals currently undergoing therapy that seek manipulation of the therapeutic regimen. Subjects or individuals in need of treatment include those that demonstrate symptoms of T2D or cardiovascular disease, or are at risk of suffering from T2D or cardiovascular disease or related symptoms.
  • a subject in need of treatment includes individuals with a genetic predisposition or family history for T2D, cancer, or cardiovascular disease, those who have suffered relevant symptoms in the past, those who have been exposed to a triggering substance or event, as well as those suffering from chronic or acute symptoms of the condition.
  • a “subject in need of treatment” may be at any age of life.
  • the term "related,” as it pertains to individuals, is used herein to refer to individuals who have common ancestry. For example, a child is related to its parents, grandparents, and siblings, and moreover all of these individuals are related to each other. Relatedness between individuals can be established through knowledge of family history, with genetic tests, or otherwise. Any number of generations can be considered when determining whether two or more people have common ancestry and are thus related, although this number can be limited if desired to provide a more stringent test.
  • a "multiomic data set” refers to data resulting from sequencing DNA and/or RNA obtained from a tissue sample of an individual.
  • a multiomic data set can contain one or more of the following: all or a substantial part of the genomic DNA sequence of the individual; sequences or copy numbers of sites of interest within the genome; methylation patterns of genomic DNA; information about the accessibility of genomic DNA sequences to transcription; sequences of RNA transcripts, including coding and/or non-coding RNAs; and levels or relative abundances of RNA transcripts.
  • the multiomic data set contains sequencing data for the full genome, transcriptome, or methylome of the tissue sample.
  • the DNA and RNA can be obtained from any kind of tissue, and the multiomic data set can have components that a skilled artisan would expect to be specific to the tissue (e.g. the presence of RNA transcripts for specific genes) or largely the same for all tissues of the individual (e.g. sequences of genomic DNA).
  • Multiomic data sets from different individuals e.g. individuals with different diabetes phenotypes
  • polymorphisms, insertions, deletions, differences in gene copy numbers), differences in genomic DNA structure (e.g. methylation, binding of DNA to histone proteins or
  • differences in gene expression e.g. relative abundances of RNA transcripts or non-coding RNA.
  • a "multiomic signature" refers to a set of genetic markers, identified using multiomic data sets and the methods disclosed herein, that characterizes or is possessed by an individual or group of individuals.
  • a multiomic signature can be shared by individuals who are related to each other (for example, one being the offspring of another) or who have a common phenotype (for example, being diagnosed with T2D).
  • a multiomic signature can include any number of identified genes, regions, or loci within the genome, as well as information about how the structure (e.g., sequence or methylation state) or expression (e.g., transcription levels) of these genes, regions, or loci differs between individuals who possess and do not possess the multiomic signature.
  • Network analysis refers to a method of integrating multiomic data sets from two or more individuals.
  • the individuals are related.
  • a gene can be represented as a vector made up of multiple data types from the multiomic data set, and connections between genes can be identified by correlating differences in the vectors between individuals.
  • a set of connected genes with characteristics e.g. expression level, DNA methylation level
  • T2D type 2 diabetes
  • DNA and RNA samples from peripheral blood mononuclear cells from a family trio consisting of T2D affected mother and daughter and non-T2D father were used for high throughput DNA sequencing, bisulfite sequencing, RNA sequencing and small RNA sequencing to obtain a complete profile on DNA methylation and gene expression. Pairwise comparisons (FM, father-mother, FD, father-daughter and MD, mother-daughter) were used to find differentially methylated and expressed genes. Network analysis was used to integrate these changes to discover key genes associated with T2D. Changes in DNA methylation and gene expression were validated in 65 controls and 65 T2D cases using Sequenom Epityper and qRT-PCR.
  • nucleic acids sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences.
  • kb kilobases
  • bp base pairs
  • proteins sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.
  • Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by
  • oligonucleotides are purified using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange high performance liquid chromatography (HPLC) as described in Pearson and Reanier, J. Chrom. 255:137-149 (1983).
  • a biological sample is obtained from a person to be tested or assessed for risk of developing type 2 diabetes or associated cardiovascular or renal disease using a method of the present invention. Collection of a tissue or fluid sample from an individual is performed in accordance with the standard protocol laboratories, hospitals or clinics generally follow, such as during a biopsy, blood drawing, saliva collection, or oral swab. An appropriate amount of sample is collected and may be stored according to standard procedures prior to further preparation.
  • [0066] The analysis of genomic DNA found in a subject's sample according to the present invention may be performed using essentially any tissue or bodily fluid, so long as genomic DNA is expected to be present in such sample. The methods for preparing tissue or fluid samples for nucleic acid extraction are well known among those of skill in the art. For example, a subject's epithelial tissue sample should be first treated to disrupt the cellular membrane so as to release nucleic acids contained within the cells.
  • the Chinese family trio consisted of an affected mother (age: 69, age of diagnosis: 61) and daughter (age 48, age of diagnosis: 38) pair and an unaffected father (age 79). They were selected from the Hong Kong Family Diabetes Study which recruited over 180 families using an index patient with familial T2D diagnosed before the age of 40 [16]. Peripheral blood samples were obtained from the family trio for isolation of PBMC using Ficoll-Paque stepwise gradient centrifugation. The isolated PBMC were divided for DNA and RNA extraction.
  • RNA contamination should be eliminated to avoid interference with DNA analysis.
  • other components such as proteins and lipids may be removed from the biological sample prior to further analysis of the genomic DNA.
  • DNA/RNA is then subjected to sequence-based analysis, such that the genomic sequence of one or more of the pertinent genes, or one or more of its transcripts, found in a test subject may be determined and then compared with a standard sequence to detect any possible sequence variation.
  • An amplification reaction is optional prior to the sequence analysis.
  • a variety of polynucleotide amplification methods are well established and frequently used in research. For instance, the general methods of polymerase chain reaction (PCR) for polynucleotide sequence amplification are well known in the art and are thus not described in detail herein.
  • PCR reagents and protocols are also available from commercial vendors, such as Roche Molecular Systems.
  • Although PCR amplification is typically used in practicing the present invention, one of skill in the art will recognize that amplification of the relevant genomic sequence may be accomplished by any known method, such as the ligase chain reaction (LCR), transcription-mediated amplification, and self-sustained sequence replication or nucleic acid sequence-based amplification (NASBA), each of which provides sufficient amplification.
  • NASBA nucleic acid sequence-based amplification
  • DNA sequencing methods routinely practiced in research laboratories, either manual or automated, can be used for practicing the present invention. Additional means suitable for determining the polynucleotide sequence of a genomic DNA for practicing the methods of the present invention include but are not limited to mass spectrometry, primer extension, polynucleotide hybridization, real-time PCR, melting curve analysis, high resolution melting analysis, heteroduplex analysis,
  • Bisulfite sequencing was carried out to map the position of all methylcytosines in the genome.
  • the DNA was fragmented and treated with bisulfite using a standard protocol. DNA fragments in the size range of 320 to 380 bp were gel purified for sequencing. All procedures were in accordance with the manufacturer's instructions. Converted DNA was subjected to 50 bp paired-end Illumina GA sequencing using a Solexa sequencer. All raw data were processed by the Illumina Pipeline v1.3.1.
  • the cleaned reads generated were aligned to the reference human genome hg18. Because DNA methylation is strand specific, the two strands of the reference human genome DNA were modified separately in silico to convert all 'C' to 'T', generating a combined 6 Gbp target genome to allow alignment after bisulfite conversion.
  • the reads were also transformed using the following criteria: (1) observed 'C' in the forward reads was replaced by 'T'; (2) observed 'G' in the reverse reads was converted to 'A'.
  • the transformed reads were then aligned to the modified target genome using SOAP2 aligner[17].
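The in silico conversion steps described above can be sketched as follows. This is a minimal illustration and not the pipeline actually used; the function names `convert_reference` and `transform_read` are hypothetical, and the reverse strand is assumed to be represented as the reverse complement of the forward strand before conversion.

```python
def convert_reference(seq):
    """Return the two in silico converted strands of a reference sequence."""
    seq = seq.upper()
    # Forward strand: every C becomes T.
    forward = seq.replace("C", "T")
    # Reverse strand (assumed representation): reverse complement, then C -> T.
    complement = str.maketrans("ACGT", "TGCA")
    reverse = seq.translate(complement)[::-1].replace("C", "T")
    return forward, reverse


def transform_read(read, is_forward):
    """Apply the read transformation: C -> T (forward) or G -> A (reverse)."""
    read = read.upper()
    return read.replace("C", "T") if is_forward else read.replace("G", "A")
```

Keeping both converted strands side by side is what doubles the target size, consistent with the combined 6 Gbp target genome mentioned above.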
  • methylation levels were compared pairwise between two of the three samples to identify differentially methylated regions (DMR) using the following criteria. Firstly, we identified 'seed' regions containing at least 5 CpG sites with at least 2-fold change in methylation level (P < 0.05, Fisher's exact test). We only included 'seed' regions with 20% or more CpG methylation in any one of the 3 subjects, assuming that CpG sites with less than 20% methylation had no biological significance.
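The 'seed' criteria above can be illustrated with a short sketch. It assumes, purely for illustration, that methylated and total read counts are pooled over the CpG sites of a candidate region for each of the two samples being compared; the text does not state how counts were aggregated. The function names are hypothetical, and the two-sided Fisher's exact test is implemented directly from the hypergeometric distribution.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total = comb(row1 + row2, col1)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Probability of every table sharing the observed margins.
    probs = {x: comb(row1, x) * comb(row2, col1 - x) / total
             for x in range(lo, hi + 1)}
    # Sum the tables that are no more likely than the observed one.
    cutoff = probs[a] * (1 + 1e-9)
    return min(1.0, sum(p for p in probs.values() if p <= cutoff))


def is_seed_region(n_cpg, meth_a, total_a, meth_b, total_b, trio_levels):
    """Check the seed criteria for one region in one pairwise comparison.

    meth_*/total_* are pooled methylated and total read counts (assumed positive
    totals); trio_levels holds the region's methylation level in all 3 subjects.
    """
    if n_cpg < 5 or max(trio_levels) < 0.20:
        return False
    low, high = sorted((meth_a / total_a, meth_b / total_b))
    if high < 2 * low:  # require at least a 2-fold change
        return False
    p = fisher_exact_two_sided(meth_a, total_a - meth_a, meth_b, total_b - meth_b)
    return p < 0.05
```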
  • RNA extracted from each sample was enriched by oligo-dT to obtain the polyA+ fraction for sequencing.
  • the polyA+ mRNA was then fragmented and converted to cDNA by reverse transcription.
  • DNA fragments were size selected for 75 bp paired-end sequencing by Illumina Genome Analyzer II using standard procedures. All raw data were processed by the Illumina Pipeline v1.3.1. The generated reads were then mapped to the human genome hg18 using SOAP2 [17], following methods established at BGI [21, 22].
  • RNA was fractionated by gel electrophoresis to isolate the small RNA portion corresponding to the size of 18 to 30 nucleotides for sequencing with Solexa Genome Analyzer II using standard procedures. Reads were filtered out if they had (1) more than 4 bases with sequencing quality lower than 10 or more than 6 bases with sequencing quality lower than 13; (2) a contaminated 5' adaptor; (3) no 3' adaptor; (4) no insert tag; (5) polyA or polyN; or (6) a length shorter than 18 nt. The remaining 'clean reads' were then mapped back to the genome by SOAP2 [17] to get the positions of small RNA tags along chromosomes.
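The read-filtering criteria above can be sketched as a single predicate. The adaptor and insert checks are passed in as flags because the text does not describe how they were detected, and the polyA/polyN test shown here (a read made only of 'A', or containing any 'N') is an assumption. The function name `is_clean_read` is hypothetical.

```python
def is_clean_read(seq, quals, has_3p_adaptor=True,
                  five_prime_contaminated=False, has_insert=True):
    """Return True if a small RNA read passes all six filtering criteria."""
    seq = seq.upper()
    # (1) more than 4 bases below quality 10, or more than 6 below quality 13
    if sum(q < 10 for q in quals) > 4 or sum(q < 13 for q in quals) > 6:
        return False
    # (2)-(4) adaptor and insert checks, supplied by the caller
    if five_prime_contaminated or not has_3p_adaptor or not has_insert:
        return False
    # (5) polyA or polyN (assumed definition, see lead-in)
    if set(seq) == {"A"} or "N" in seq:
        return False
    # (6) reads shorter than 18 nt are discarded
    return len(seq) >= 18
```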
  • RNA tags were also aligned by blastn (blastall -p blastn -F F -e 0.01) to the precursors and mature miRNA in the miRBase 15.0 to obtain the miRNA count for each miRNA expressed. Then the remaining small RNA tags were further aligned to the Rfam (by blastn, blastall -p blastn -F F -e 0.01), mRNA and piRNA from NCBI database to identify other small RNA species and small RNA fragments from degraded mRNA (exons and introns).
  • MicroCosm_Targets on the World Wide Web at ebi.ac.uk/enright-srv/microcosm/htdocs/targets/v5/
  • TargetScanHuman on the World Wide Web at targetscan.org/vert_60/
  • microRNA.org on the World Wide Web at microrna.org/microrna/getDownloads.do
  • TarBase_V5.0_human on the World Wide Web at diana.cslab.ece.ntua.gr/tarbase/tarbase_download.php.
  • Only target genes identified by at least 3 out of the 4 databases were included for analysis. After discarding genes with too many missing elements (>2), the M/F and D/F networks included 3,180 and 3,091 genes, respectively.
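The "at least 3 out of 4 databases" rule above amounts to a simple vote count. The sketch below is illustrative only; the function name `consensus_targets` and the gene identifiers in the usage example are hypothetical placeholders, not results.

```python
from collections import Counter


def consensus_targets(predictions, min_support=3):
    """Keep target genes predicted by at least `min_support` databases.

    predictions maps a database name to an iterable of predicted target genes.
    """
    counts = Counter(gene for genes in predictions.values() for gene in set(genes))
    return {gene for gene, n in counts.items() if n >= min_support}
```

For example, with four hypothetical prediction sets in which GENE_A appears in all four, GENE_B in three, and GENE_C in two, only GENE_A and GENE_B would be retained.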
  • RV-coefficient [24] which is a generalization of Pearson correlation coefficient, as a measure of similarity of genes.
  • RV-coefficients were computed for all pairs of genes. From a network perspective, genes were taken as nodes, and edges were added between genes if they were significantly correlated. The scale-free property is regarded as an indicator of robustness, which is one of the most prominent characteristics of biological networks [25].
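For readers unfamiliar with the RV-coefficient, the following sketch computes it for two data tables that share the same observations (rows), using the standard definition RV = tr(XX'YY') / sqrt(tr((XX')^2) tr((YY')^2)) on column-centered data. How each gene's multiomic measurements were arranged into such tables is not specified in the text, so the inputs here are an assumption, and `rv_coefficient` is a hypothetical name.

```python
def rv_coefficient(x, y):
    """RV-coefficient between two tables given as lists of rows (same row count)."""
    def centered_gram(m):
        n, p = len(m), len(m[0])
        means = [sum(row[j] for row in m) / n for j in range(p)]
        c = [[row[j] - means[j] for j in range(p)] for row in m]
        # Gram matrix of the centered rows (n x n).
        return [[sum(a * b for a, b in zip(c[i], c[k])) for k in range(n)]
                for i in range(n)]

    def trace_product(a, b):
        n = len(a)
        return sum(a[i][j] * b[j][i] for i in range(n) for j in range(n))

    gx, gy = centered_gram(x), centered_gram(y)
    denom = (trace_product(gx, gx) * trace_product(gy, gy)) ** 0.5
    if denom == 0:
        raise ValueError("at least one table has no variance")
    return trace_product(gx, gy) / denom
```

When each table has a single column, this value equals the squared Pearson correlation, which is the sense in which the RV-coefficient generalizes it.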
  • T2D relevant network the common parts of M/F and D/F networks
  • p ⁇ 0.035 RV-coefficient
  • RNA transcriptome analysis and microRNA expression were validated by real-time quantitative PCR (qRT-PCR) in 65 controls and 65 T2D cases.
  • First strand cDNA was synthesized by the High-Capacity cDNA Reverse Transcription Kit (ABI Biosystems) using 0.5 μg total RNA as template.
  • the SYBR Green method was used for qRT-PCR using an ABI 7900HT Real-time PCR machine and SYBR® Premix Ex Taq™ (Perfect Real Time) from Takara.
  • the expression level of each tested gene was measured by real-time qPCR for 40 cycles.
  • the expression levels of mRNA were normalized to the expression level of β-actin.
  • the expression levels of microRNA were normalized to the expression level of U6 small RNA.
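The text states only that expression was normalized to a reference (β-actin for mRNA, U6 small RNA for microRNA) and does not give the formula. One common convention, shown here strictly as an assumption, is the ΔCt method with an assumed amplification efficiency of 100%; the function name `relative_expression` is hypothetical.

```python
def relative_expression(ct_target, ct_reference):
    """Relative expression by the delta-Ct method (assumes 100% efficiency)."""
    delta_ct = ct_target - ct_reference
    # Each cycle of difference corresponds to a 2-fold difference in template.
    return 2.0 ** -delta_ct
```

A target that crosses threshold 5 cycles later than its reference would thus be reported at 1/32 of the reference level.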
  • DNA methylation changes were validated by the Sequenom MassARRAY Compact System (Sequenom, Inc.) using the EpiTYPER assay. Genomic sequences from regions of interest were used to design the primers for the assay. DNA samples were extracted from the PBMCs of 65 controls and 65 T2D cases. For each sample, 1 μg of DNA was treated with sodium bisulfite using the EZ DNA Methylation™ kit (Zymo Research) according to the manufacturer's procedure. The treated DNA was PCR amplified and tested for DNA methylation at individual CpG dinucleotides in the genomic regions of interest using
  • a physician may arrange for regular monitoring of various symptoms of type 2 diabetes or diabetic cardiovascular and renal diseases in a subject who has been deemed by the method of the present invention to have an elevated risk of developing type 2 diabetes.
  • the physician may also prescribe both pharmacological and non-pharmacological treatments such as lifestyle modification (e.g., reduce body weight by 5%, high fiber diet, walking for at least 150 minutes weekly) and medicines known to reduce risk of onset of diabetes (e.g., metformin, alpha glucosidase inhibitors, lipase inhibitors) to a subject who has been deemed by the method of the present invention to have an elevated risk of developing type 2 diabetes.
  • the attending physician may prescribe medications to control risk factors such as high levels of blood cholesterol and triglycerides (e.g., statins and fibrates) and to reduce angiotensin II activity (e.g., angiotensin-converting enzyme inhibitor (ACEI) and angiotensin II receptor blocker (ARB)), as well as place the subject under regular testing and monitoring of coronary artery condition and kidney function.
  • ACEI Angiotensin converting enzyme inhibitor
  • ARB angiotensin II receptor blocker
  • Multiomic signatures obtained using the methods described herein can also be used to develop and screen new therapies for diabetes and its comorbidities.
  • the present invention provides compositions and kits for practicing the methods described herein to detect possible genomic sequence variation of certain gene(s) and the transcripts thereof in a subject, which can be used for various purposes such as detecting or diagnosing the presence of type 2 diabetes and diabetic cardiovascular or renal disease in a subject, determining the risk of developing type 2 diabetes and diabetic cardiovascular or renal disease in a subject, and guiding the treatment plan for these conditions in the subject.
  • Kits for carrying out assays for determining the nucleotide sequence of a relevant genomic sequence typically include at least one oligonucleotide useful for specific hybridization with a predetermined segment of a pertinent genomic sequence.
  • this oligonucleotide is labeled with a detectable moiety.
  • the oligonucleotide specifically hybridizes with the standard sequence only but not with any of the variant sequences. In other cases, the oligonucleotide specifically hybridizes with one particular version of the variant sequence but not with other versions, nor with the standard sequence.
  • kits may include at least two oligonucleotide primers that can be used in the amplification of at least one segment of a pertinent genomic sequence or transcripts thereof by PCR.
  • at least one of the oligonucleotide primers is designed to anneal only to the standard sequence or only to a particular version of the variant sequences.
  • the kits of this invention may provide instruction manuals (e.g., internet-based decision support tools) to guide users in analyzing test samples and assessing the presence or future risk of type 2 diabetes and diabetic cardiovascular or renal disease in a test subject.
  • any of the computer systems mentioned herein may utilize any suitable number of subsystems. Examples of such subsystems are shown in Figure 3 in computer apparatus 300.
  • a computer system includes a single computer apparatus, where the subsystems can be the components of the computer apparatus.
  • a computer system can include multiple computer apparatuses, each being a subsystem, with internal components.
  • Peripherals and input/output (I/O) devices, which couple to I/O controller 371, can be connected to the computer system by any number of means known in the art, such as serial port 377.
  • serial port 377 or external interface 381 (e.g. Ethernet, Wi-Fi, etc.) can be used to connect computer system 300 to a wide area network such as the Internet, a mouse input device, or a scanner.
  • system bus 375 allows the central processor 373 to communicate with each subsystem and to control the execution of instructions from system memory 372 or the storage device(s) 379 (e.g., a fixed disk, such as a hard drive or optical disk), as well as the exchange of information between subsystems.
  • system memory 372 and/or the storage device(s) 379 may embody a computer readable medium. Any of the data mentioned herein can be output from one component to another component and can be output to the user.
  • a computer system can include a plurality of the same components or subsystems, e.g., connected together by external interface 381 or by an internal interface.
  • computer systems, subsystem, or apparatuses can communicate over a network.
  • one computer can be considered a client and another computer a server, where each can be part of a same computer system.
  • a client and a server can each include multiple systems, subsystems, or components.
  • any of the embodiments of the present invention can be implemented in the form of control logic using hardware (e.g. an application specific integrated circuit or field programmable gate array) and/or using computer software with a generally programmable processor in a modular or integrated manner.
  • a processor includes a multi-core processor on a same integrated chip, or multiple processing units on a single circuit board or networked.
  • Any of the software components or functions described in this application may be implemented as software code to be executed by a processor using any suitable computer language such as, for example, Java, C++ or Perl using, for example, conventional or object- oriented techniques.
  • the software code may be stored as a series of instructions or commands on a computer readable medium for storage and/or transmission; suitable media include random access memory (RAM), a read only memory (ROM), a magnetic medium such as a hard-drive or a floppy disk, or an optical medium such as a compact disk (CD) or DVD (digital versatile disk), flash memory, and the like.
  • RAM random access memory
  • ROM read only memory
  • the computer readable medium may be any combination of such storage or transmission devices.
  • Such programs may also be encoded and transmitted using carrier signals adapted for transmission via wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet.
  • a computer readable medium according to an embodiment of the present invention may be created using a data signal encoded with such programs.
  • Computer readable media encoded with the program code may be packaged with a compatible device or provided separately from other devices (e.g., via Internet download). Any such computer readable medium may reside on or within a single computer product (e.g. a hard drive, a CD, or an entire computer system), and may be present on or within different computer products within a system or network.
  • a computer system may include a monitor, printer, or other suitable display for providing any of the results mentioned herein to a user.
  • any of the methods described herein may be totally or partially performed with a computer system including one or more processors, which can be configured to perform the steps.
  • embodiments can be directed to computer systems configured to perform the steps of any of the methods described herein, potentially with different components performing a respective step or a respective group of steps.
  • steps of methods herein can be performed at a same time or in a different order. Additionally, portions of these steps may be used with portions of other steps from other methods. Also, all or portions of a step may be optional. Additionally, any of the steps of any of the methods can be performed with modules, circuits, or other means for performing these steps.
  • the present inventors used whole genome bisulfite sequencing to obtain a single-base resolution DNA methylome of human PBMC of the trio. Over 140 Gbp of DNA sequences were obtained from each member of the trio, with about 30x effective coverage of the human genome (Table 1). Mapped DNA sequences were used to calculate the degree of mC coverage at each position of C in the human genome. About 95% of the C in all autosomes was covered. Most of the DNA methylation was found in CG sites, with a minority being non-CG methylation. Differences in CG DNA methylation were identified by pairwise comparisons in the trio.
  • the inventors identified 2686, 2040 and 1514 DMRs in the corresponding FM, FD and MD pairwise comparisons (Figure 4).
  • DMR possibly related to T2D
  • 32 genes were identified (Table 3) with promoter regions overlapping with the DMR.
  • the inventors calculated the mC methylation coverage in the 2 kb promoter regions and found that the DNA mC methylation in these promoter regions showed a general trend in agreement with the DMRs identified.
  • RNA sequencing was performed on the RNA samples from PBMC. Most of the sequence reads were mapped to the annotated human reference genes successfully (Table 1). The mapped sequences represented expression of 16469, 16732 and 16916 annotated gene loci from the father, mother and daughter respectively, with the majority of transcripts expressed in all 3 subjects. Using 2-fold difference as the criterion in pairwise comparisons, 1097, 848 and 548 genes were differentially expressed in the FM, FD and MD pairs respectively. In the Venn diagram, 413 genes (listed in Table 4) showed differential expression in the FD and FM pairs but not in the MD comparison (Figure 5). Of these, 15 were located on the Y chromosome. In the remaining 398 genes, 200 showed decreased expression in the FM and FD pairs and 198 showed increased expression.
  • MicroRNA expression profile
  • RNA is a key regulator of many cellular pathways. Sequencing of the small RNA from the trio identified small RNA from different categories including scRNA, snoRNA, microRNA, snRNA, srpRNA, piRNA and parts of other RNA components of the cell, including tRNA, rRNA, repeat sequences as well as exon and intron sequences. Among the RNA sequence reads, microRNA sequences accounted for 70% of the total sequence reads. From these aligned RNA sequence reads, 616, 581 and 591 microRNAs were identified for the father, mother and daughter respectively. Among these expressed microRNAs, 540 were expressed in all 3 members. Pair wise comparisons showed that 67 microRNAs (listed in Table 5) were differentially expressed only in the FM and FD pairs but not the MD pair (Figure 5).
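The selection used for both genes and microRNAs (differentially expressed in the FM and FD comparisons but not in the MD comparison) can be expressed with set operations. The sketch below assumes expression levels are held in dictionaries keyed by gene or microRNA name, applies the 2-fold criterion stated for genes, and treats a change from zero to any positive level as differential; these choices and the function names are illustrative assumptions.

```python
def differential(levels_a, levels_b, fold=2.0):
    """Names whose levels differ by at least `fold` between two individuals."""
    hits = set()
    for name in levels_a.keys() & levels_b.keys():
        lo, hi = sorted((levels_a[name], levels_b[name]))
        if hi > 0 and hi >= fold * lo:
            hits.add(name)
    return hits


def candidate_features(father, mother, daughter, fold=2.0):
    """Features differential in FM and FD but not in MD."""
    fm = differential(father, mother, fold)
    fd = differential(father, daughter, fold)
    md = differential(mother, daughter, fold)
    return (fm & fd) - md
```

The subtraction of the MD set is what restricts the result to features on which the affected mother and daughter agree with each other while both differ from the unaffected father.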
  • the inventors used network analysis to integrate these multiomic features to identify key regulatory genes and their pathways.
  • Two networks were constructed using pairwise comparison data from FM and FD pairs.
  • the common components of the two networks were used as the putative T2D gene network. From this network a subnetwork containing a maximum of 17 of the known T2D GWAS genes was constructed.
  • This T2D related network contains 141 genes located at the shortest connection with these 17 T2D GWAS genes (Figure 6).
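One way to picture a subnetwork built from "shortest connections" between known genes is to collect the nodes lying on breadth-first shortest paths between every pair of seed genes and then rank nodes by how many connections they have inside that subnetwork. The text does not spell out the procedure actually used, so the sketch below is an assumption throughout, with hypothetical function names and an undirected graph stored as a dictionary of neighbour sets.

```python
from collections import deque


def shortest_path(graph, start, goal):
    """Breadth-first search; returns one shortest path as a list, or None."""
    previous = {start: None}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        if node == goal:
            path = []
            while node is not None:
                path.append(node)
                node = previous[node]
            return path[::-1]
        for neighbour in sorted(graph.get(node, ())):
            if neighbour not in previous:
                previous[neighbour] = node
                queue.append(neighbour)
    return None


def seed_subnetwork(graph, seeds):
    """Nodes on a shortest path between any two seed genes."""
    nodes, seeds = set(), sorted(seeds)
    for i, a in enumerate(seeds):
        for b in seeds[i + 1:]:
            path = shortest_path(graph, a, b)
            if path:
                nodes.update(path)
    return nodes


def hubs(graph, nodes, top=15):
    """Rank subnetwork nodes by their number of neighbours inside the subnetwork."""
    degree = {n: len(set(graph.get(n, ())) & nodes) for n in nodes}
    return sorted(degree, key=lambda n: (-degree[n], n))[:top]
```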
  • On this T2D gene network, the 15 genes listed in Table 6 are the hub genes that are mostly used for connection. Among the 32 genes (listed in Table 3) with promoter regions overlapping the DMRs, PDCD1 (Programmed cell death 1) is the only one that overlapped with the 413 genes showing differential expression. PDCD1 showed increased DNA methylation with decreased gene expression. This pattern of change can be validated in 65 controls versus 65 T2D cases (Figure 7; Table 7, Table 8) using quantitative PCR (qRT-PCR) and Sequenom EpiTYPER DNA methylation assays. The expression and DNA methylation of two additional genes discovered by network analysis, EIF4E3 and PACRG, were also validated. KDM2B is a chromatin modifying enzyme showing differential methylation (DMR) in the trio.
  • DMR differential methylation
  • KDM2B is also a regulatory gene for the expression of the T2D GWAS gene CDKN2A/B. Both the expression and DNA methylation of KDM2B and CDKN2B showed significant changes in the 65 vs 65 case-control cohort (Figure 7).
  • Discovery of T2D SNPs from multiomic and network analysis
  • RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings. Cell Res, 2012.
  • Table 1 Summary of the sequencing results. Over 90G of whole genome sequencing and over 140G of bisulfite sequencing data were generated for each member of the trio. All sequencing data generated from whole genome sequencing, bisulfite sequencing, RNA sequencing and microRNA sequencing were subjected to a QC procedure to remove low quality reads and then mapped to human genome build 36 (hg18).
  • dir4 38187472 3 1S7 ⁇ 20 248 -l.OS -l.OS down
  • chr4 55000809 55001457 648 1.43 1.53 up
  • dir4 10W 10433 104410723 240 1.30 1.30 up
  • chr4 147773215 147773660 445 1.22 1.14 _J»IL_
  • dii-4 105S055NS 105S05474 341 1.04 1.34 up
  • chr4 182668242 182668877 635 1.20 1.02 MP
  • dir4 18641300' ) 1S0414272 003 1.74 1.12 up
  • chr4 186762687 186763063 376 1.12 1.09
  • dirO ⁇ ⁇ 009 ⁇ 7 ⁇ 01 4 " 300 1.00 1.00 Up
  • chr6 131613039 131613256 217 1.06 1.24 _J » IL_
  • dirO 150758034 150 " 5%3K 704 -1.00 -1.00 down
  • chr6 158815623 158815945 322 1.27 1.07 up
  • dirO 164715221 104 1 027 400 1.09 1.30 up
  • chr7 1945557 1945918 361 1.18 0.95
  • dirK 710271 IS 71027572 454 1.00 1.00 up
  • chr10 75404348 75404613 265 -1.40 -1.12 down
  • dirlO ⁇ 03( 359 S03670S2 323 1.00 1.12 p
  • chr10 80430872 80431071 199 1.04 1.08 _J » IL_
  • fhr 10 05232010 303 1.51 2.12 up
  • chrlO 104184393 201 ⁇ A3 1.13 3 ⁇ 4>
  • child 1141301ft5 1141 0530 374 1.52 1.52 up
  • chrl 6 119483997 119484297 300 1.00 1.14
  • chrl 4S004011 4S004553 542 1.00 1.Oft up
  • chrl 4 22842253 22842456 203 -T O -1.00 down
  • chrl 4 2311152ft 23111774 24S 1.0 " 1.04 up
  • chr14 24020632 24021014 382 1.03 1.03 up
  • chrl 4 315200N1 31530353 372 -1.3- -1.37 down
  • chr14 49619439 49619992 553 -1.04 -1.22 down
  • chrl 4 5032Sft09 5032NS3S 220 -1.05 -1.05 down
  • chrl 4 51580080 515S0451 371 no L06 up
  • chrl 4 ft5473 l > 1 ft54742S5 334 -1.00 -1.05 down
  • chrl 4 Sft4 " 02ftO Sft4 " 0540 2 SO 1.32 1. ⁇ up
  • chrl 4 90014202
  • Expression levels are expressed as RPKM (reads per kilobase per million mapped reads); methylation levels are expressed as the proportion of mC in the region as 0.0 being no methylation and 1.0 being 100% methylated.
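The two units defined in this note can be made concrete with a short sketch. RPKM is computed from its standard definition (reads per kilobase of transcript per million mapped reads), and the methylation level is taken as the fraction of methylated cytosine calls among all cytosine calls in the region; the function names and argument layout are hypothetical.

```python
def rpkm(reads_on_gene, gene_length_bp, total_mapped_reads):
    """Reads per kilobase per million mapped reads."""
    return reads_on_gene * 1e9 / (gene_length_bp * total_mapped_reads)


def methylation_level(methylated_calls, total_calls):
    """Proportion of mC in a region: 0.0 is unmethylated, 1.0 is fully methylated."""
    if total_calls <= 0:
        raise ValueError("region has no cytosine calls")
    return methylated_calls / total_calls
```

For instance, 500 reads on a 2,000 bp gene in a library of 10 million mapped reads corresponds to an RPKM of 25.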
  • 259286 / is: RHi -q34 0.-2 o.ir 0.14 Down
  • Expression levels are expressed as RPKM (reads per kilobase per million mapped reads).
  • Table 5. FM and FD differentially expressed miRNA in the family trio
  • Expression levels are expressed as RPKM (reads per kilobase per million mapped reads); methylation levels are expressed as the proportion of m C in the region as 0.0 being no methylation and 1.0 being 100% methylated.
  • CDKN2B GWAS-T2D gene and KDM2B target gene 43% increase Yes
  • rRX 1 ⁇ / ) . ⁇ 2 ⁇ > tar cl gene 27".. decrease No
  • TXNDC12 Genes from network analysis 18% increase
  • EIF4E3 Genes from network analysis 38% increase No
  • miR-Ifi-l top signal from microRNA expression In"., decrease No
  • iiiiR ⁇ 62 top signal from microRNA expression 30% decrease No
  • CDKN2B chr9:22001091-22001093 increase Yes
  • CDKN2B chr9:22001207 increase Yes
  • KP ⁇ 12H chr12:120504330-120504336 decrease
  • Table 9. SNPs co-segregated in the mother-daughter pair of the trio and correlated
  • Table 11 Summary of the genotyping results. Co-segregated mother-daughter SNPs from focus gene regions showing nominally significant P values in the Chinese GWAS meta-analysis were genotyped independently in 1144 Chinese T2D cases and 421 controls.


Abstract

Disclosed herein are methods of using genomic markers to predict the occurrence of diabetes in human subjects, and methods of using multiomic data to identify such markers.

Description

USE OF MULTIOMIC SIGNATURE TO PREDICT DIABETES
CROSS-REFERENCE TO RELATED APPLICATION
[0001] This application claims priority to U.S. Provisional Patent Application No. 61/811,639, filed April 12, 2013, the contents of which are incorporated herein by reference in their entirety for all purposes.
BACKGROUND OF THE INVENTION
[0002] There is a pandemic of type 2 diabetes (T2D) and obesity affecting an increasingly young population, especially in Asia [1, 2]. Type 2 diabetes is a complex disease arising from interaction between genetic and environmental factors. Large-scale family-based linkage analyses [3-6] and genome-wide association studies (GWAS) in case-control cohorts have identified multiple chromosomal regions and genetic variants associated with T2D [7-9], including genetic variants specific for the Chinese [10]. Most of the GWAS-T2D loci or nearby genes are implicated in β cell development, structure and function. Despite their replication in multiple populations, these SNPs have low odds ratios (1.5-2) and explain less than 10% of the heritability of T2D. Recent studies suggest that epigenetics also plays a role in T2D. Using the FAIRE technique, the GWAS SNP rs7903146 in the intron of TCF7L2 shows strong association with islet-specific open chromatin. Heterozygous carriers of rs7903146 showed marked allelic imbalance: the T allele is more open in islet chromatin and shows higher enhancer activity in functional luciferase activity assays [11]. Similarly, the risk-conferring A allele of the intronic SNP rs8050136 of FTO, another T2D GWAS gene, is associated with increased DNA methylation in a 7.7 kb region harboring known conserved non-coding elements with regulatory activities [12]. Using a genome-wide approach, researchers have recently identified differentially methylated CpG sites in the first intron of FTO associated with T2D [13], supporting the importance of DNA methylation in T2D.
[0003] Recent advances in the next generation sequencing have enabled identification of all DNA methylation sites by whole genome bisulfite sequencing [14, 15]. In a proof-of-concept study to explain the missing heritability of T2D, we isolated DNA and RNA from peripheral blood mononuclear cells (PBMC) collected from a Chinese trio family consisting of an affected mother and daughter using the unaffected father as control. We applied whole genome re-sequencing, bisulfite sequencing and RNA sequencing to obtain the genomes, methylomes, transcriptomes and microRNA expression profile in this trio family (Figure 1). We examined the correlations of methylome, transcriptome, microRNA expression and SNPs using pairwise comparisons to identify genes with the most consistent features. We also used an integrated network analysis incorporating genes with differential expression and methylation as well as being targets of differentially expressed microRNAs to discover the most connected genes. These discoveries were replicated in GWAS meta-analysis datasets [10] and independent case control cohort, accompanied by gene expression and DNA methylation assays in PBMC from 65 controls and 65 T2D cases (Figure 2). Using this multipronged approach, we discovered novel T2D associated genes with multiomic features implicated in chromatin modification and protein metabolism pathways in T2D.
[0004] Because of the enormous social and economic impact of type 2 diabetes and associated complications such as cardiovascular and renal diseases, there exist clear and immediate needs to develop new and effective means for accurate diagnosis or early assessment a patient's risk of developing these diseases in the future, such that early intervention may be performed to minimize the harmful effects associated with these diseases and/or the risk of developing the diseases. The present invention fulfills this and other related needs.
BRIEF SUMMARY OF THE INVENTION
[0005] Provided herein are methods, procedures, and systems for identifying and using genetic markers for diabetes. Multiomic data, including genomic, transcriptomic, or methylomic data, can be obtained from individuals and compared between individuals. Differences in these data, such as single-nucleotide polymorphisms, different patterns of DNA methylation, or different patterns of gene expression, can then be integrated and correlated with diabetes phenotypes. In some embodiments, the individuals are related, such as in a father-mother-daughter trio. In some embodiments, the integration or correlation is performed using network analysis. Multiomic signatures, including genes not previously associated with diabetes, are identified for predicting the occurrence of diabetes in individuals and developing treatments for diabetes.
[0006] In the first aspect, a method is provided for detecting the presence or increased risk of developing T2D in a subject. The method includes these steps: (a) measuring, in a tissue sample obtained from the subject, expression or DNA methylation levels of one or more T2D-associated genes; (b) comparing the expression or DNA methylation levels of the one or more T2D-associated genes with levels from the same genes in a standard control; and (c) detecting the presence or increased risk of developing T2D when the expression or DNA methylation levels of the one or more T2D-associated genes are higher or lower than the levels in the standard control. The T2D-associated genes are selected from the group consisting of LRRC7, TXNDC12, TGFBR3, AJAP1, AGRN, APOB, EIF4E3, SLC35A4, PACRG, COL15A1, NR4A3, E2F8, PCDH17, PLEKHH3, SCARF2, CUEDC2, KDM2B, PDCD1, miR-16-1, and miR-16-2.
[0007] In some embodiments of this method, step (a) includes measuring expression levels of the one or more T2D-associated genes, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, EIF4E3, CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2. Measuring expression levels can include measuring levels of RNA transcripts of the one or more T2D-associated genes. Alternatively or in addition, measuring expression levels can include measuring amounts of proteins or polypeptides resulting from translation of RNA transcripts of the one or more T2D-associated genes. In these embodiments, step (c) can include detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, and EIF4E3. Alternatively, step (c) can include detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
[0008] In other embodiments of the method, step (a) includes measuring DNA methylation levels of the one or more T2D-associated genes and the T2D-associated genes are selected from the group consisting of PDCD1, EIF4E3, PACRG, and KDM2B. In these embodiments, step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of PDCD1 and EIF4E3. In particular, the regions of the T2D-associated genes can be promoter regions. In other words, step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is higher than the level in the standard control. Alternatively, step (c) can include detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of EIF4E3, PACRG, and KDM2B. In some cases, step (c) includes detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is lower than the level in the standard control.
[0009] In the second aspect, a method is provided for detecting the presence or increased risk of developing T2D in a subject. The method includes these steps: (a) determining, in a tissue sample obtained from the subject, a single-nucleotide genotype of the subject at a genomic locus, wherein the genomic locus is given by a SNP ID number provided in Table 9, 10, or 11; and (b) detecting the presence or increased risk of developing T2D in the subject if the single-nucleotide genotype matches a reference genotype associated with T2D.
[0010] In some embodiments of the method, the genomic locus is given by a SNP ID number provided in Table 11 and selected from the group consisting of rs1052637, rs748767, rs9998519, rs1534938, rs17497819, rs11597439, rs117510629, rs11065587, chr12:120395303, rs11065588, rs3743599, rs3743598, rs734409, rs3818744, rs17114359, rs220305, rs2839467, and rs93136. In these embodiments, the reference genotype associated with T2D for each genomic locus can be provided in column A1 of Table 11. In one such embodiment, the genomic locus is given by the SNP ID number rs9998519 and the reference genotype is C. In another such embodiment, the genomic locus is given by the SNP ID number rs1534938 and the reference genotype is T.
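The genotype comparison of steps (a) and (b) of the second aspect can be illustrated with a minimal sketch. It assumes, purely for illustration, that a subject's genotype call at each locus is stored as a two-letter string of alleles and that "matching" means carrying at least one copy of the reference allele; neither assumption is stated in this disclosure. Only the two loci whose reference genotypes are given in the text are included, with SNP IDs written in standard dbSNP form (rs followed by digits). The function name and data layout are hypothetical.

```python
# Reference (risk) alleles for the two loci whose genotypes are stated in the
# text. Any other locus would need its allele taken from Table 9, 10, or 11,
# which are not reproduced here.
REFERENCE_ALLELES = {
    "rs9998519": "C",
    "rs1534938": "T",
}


def matching_loci(subject_genotypes, reference_alleles=REFERENCE_ALLELES):
    """Return the loci at which the subject carries the reference allele.

    subject_genotypes maps a SNP ID to the subject's called alleles,
    written as a string such as "CT" (an illustrative encoding only).
    Loci without a call for the subject are skipped.
    """
    matches = []
    for snp_id, risk_allele in reference_alleles.items():
        genotype = subject_genotypes.get(snp_id)
        if genotype is not None and risk_allele in genotype.upper():
            matches.append(snp_id)
    return matches


# Made-up genotype calls for a hypothetical subject.
subject = {"rs9998519": "CT", "rs1534938": "GG"}
print(matching_loci(subject))
```

Whether one matching locus, several, or a weighted combination should count as "increased risk" is left open here, since the text describes both single-locus and multi-locus embodiments.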
[0011] In some embodiments of the method, the genomic locus is given by a SNP ID number provided in Table 9 and the reference genotype for the genomic locus is provided in the "risk genotype" column of Table 9. In some embodiments, the genomic locus is given by a SNP ID number provided in Table 10 and the reference genotype for the genomic locus is provided in the "risk allele" column of Table 10.
[0012] In some embodiments, steps (a) and (b) of the method are performed for a plurality of genomic loci.
[0013] In the methods of the first aspect or the second aspect, the sample can be a blood or saliva sample. The subject from whom the sample is obtained can be of Asian descent. In some cases, the subject is Chinese. In some cases, the subject is Han Chinese. The subject can also have a family history of T2D but no diagnosis of T2D (i.e., the subject has not been diagnosed with T2D).
[0014] In some embodiments of the methods of the first and second aspects, step (a) includes an amplification reaction. The amplification reaction can be a polymerase chain reaction (PCR).
[0015] In the third aspect, a method of identifying genetic markers of T2D is provided. The method includes these steps: obtaining a multiomic data set from one or more T2D-positive subjects, wherein each T2D-positive subject has been diagnosed with T2D; obtaining a multiomic data set from one or more T2D-negative subjects, wherein each T2D-negative subject has not been diagnosed with T2D; identifying differences between the multiomic data sets obtained from the one or more T2D-positive subjects and the one or more T2D-negative subjects; and identifying one or more genetic markers based on the differences.
[0016] In some embodiments of this method, the T2D-positive subjects are related to the T2D-negative subjects. The T2D-positive subjects and T2D-negative subjects can together include a family trio, and the family trio can include a father, a mother, and an offspring. In some cases, one of the father and the mother is a T2D-positive subject, and the other of the father and the mother is a T2D-negative subject. In some cases, the father is a T2D-negative subject and the mother and the offspring are T2D-positive subjects. In one such case, the offspring is a daughter of the father and the mother.
[0017] In some embodiments of the method, each multiomic data set includes at least two of the following: DNA sequencing data, RNA sequencing data, and RNA expression level data. The DNA sequencing data can include genomic sequencing data or methylomic sequencing data. The RNA sequencing data can include transcriptomic sequencing data or microRNA sequencing data. The RNA expression level data can include mRNA expression level data or microRNA expression level data.
[0018] In some embodiments of the method, differences between the multiomic data sets from the one or more T2D-positive subjects and the one or more T2D-negative subjects are identified by comparing the multiomic data sets pair-wise. In some embodiments, the one or more genetic markers are identified using network analysis. The one or more genetic markers can include connected genes. In some embodiments, the genetic markers co-segregate in the multiomic data sets obtained from the T2D-positive subjects or the T2D-negative subjects.
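The pair-wise comparison and co-segregation described above can be sketched for the family trio in which the father is T2D-negative and the mother and daughter are T2D-positive. The sketch assumes that each feature (a gene, methylated region, or microRNA) has already been scored as different or not different in each of the three pairs; how that scoring is done is not shown. A feature is kept when it differs between the father and each affected relative but not between the two affected relatives. The function name, the pair labels as dictionary keys, and the example features are all hypothetical.

```python
def cosegregating_features(differs):
    """Keep features that differ in the FM and FD pairs but not in MD.

    differs maps a feature name to a dict of booleans keyed by pair:
    "FM" (father vs. mother), "FD" (father vs. daughter) and
    "MD" (mother vs. daughter). True means the feature differs
    between the two members of that pair.
    """
    return sorted(
        name
        for name, pairs in differs.items()
        if pairs["FM"] and pairs["FD"] and not pairs["MD"]
    )


# Made-up comparison results for three placeholder features.
example = {
    "feature_A": {"FM": True, "FD": True, "MD": False},
    "feature_B": {"FM": True, "FD": False, "MD": True},
    "feature_C": {"FM": True, "FD": True, "MD": True},
}
print(cosegregating_features(example))
```

Only feature_A passes the filter in this example, because it separates the unaffected father from both affected relatives while the mother and daughter agree.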
[0019] In some embodiments of the method, the genetic markers include differentially expressed genes, differentially expressed microRNAs, differentially methylated regions, or single-nucleotide polymorphisms. In these embodiments, the genetic markers can include a differentially expressed gene that contains or overlaps with a differentially methylated region. In some cases, the differentially methylated region occurs within or overlaps with the promoter of the gene. In some cases, the differentially methylated region occurs within or overlaps with the body of the gene.
[0020] Embodiments of the method can also include validating the genetic markers by comparing gene expression or methylation in T2D-positive and T2D-negative cohorts. Validating can include performing quantitative PCR.
[0021] In any aspect of the present invention, a method can be carried out in whole or in part using a computer system. For example, comparisons between or among gene expression levels, DNA methylation levels, or SNP genotypes can be performed computationally, as can preparation of multiomic data sets and multiomic or network analysis. In some embodiments, a computer product including a tangible computer readable medium is provided, and the medium stores a plurality of instructions that when executed control a computer system to perform a method as substantially described herein. In some embodiments, a system is provided that includes one or more processors configured to perform a method as substantially described herein.
BRIEF DESCRIPTION OF THE DRAWINGS
[0022] Figure 1: Summary of the study design. DNA from the PBMC of the trio was subjected to whole genome sequencing and whole genome bisulfite sequencing. Whole genome sequencing identified the SNPs and indels from the trio. Bisulfite sequencing identified the DNA methylation sites. The RNA samples were subjected to RNA sequencing and microRNA sequencing to identify the gene expression profile and the microRNA expression pattern.
[0023] Figure 2: Summary of the multiomic data analysis of DNA methylation and gene expression from the trio. Gene loci of interest were chosen for validation of DNA methylation and gene expression. Co-segregated SNPs in the gene loci of interest were checked against the results of the Chinese GWAS meta-analysis to identify target SNPs for validation in an independent case-control cohort.
[0024] Figure 3: A computer system for use with some embodiments of the invention.
[0025] Figure 4: Summary of differentially methylated regions (DMRs) in the genome. Whole genome methylation analysis showed that there are over 1500 DMRs in each member of the trio. 186 DMRs co-segregated only in the father-mother (FM) and father-daughter (FD) pairs. These 186 DMRs overlap with 32 genes.
[0026] Figure 5: Summary of gene expression and microRNA expression in the trio. Venn diagrams show the number of expressed genes in the family trio (top left) and the pairwise comparison of expressed genes (top right). 413 genes were differentially expressed in both the FM and FD comparisons but not in the mother-daughter (MD) comparison. MicroRNA expression is summarized at the bottom. Over 500 microRNAs were expressed in each member of the trio (left), and 67 showed differential expression in only the FM and FD pairs (right).
[0027] Figure 6: Network analysis to discover genes associated with T2D. Network analysis constructed a gene network composed of thousands of genes. The sub-network containing the known T2D GWAS genes (left) contains 141 genes, with the 15 most connected genes shown in the table at right.
[0028] Figure 7: Discovery of T2D genes by a multiomic approach. Whole genome bisulfite sequencing and RNA sequencing were applied to a family trio with T2D to reveal DNA methylation and gene expression changes in PBMC associated with T2D. The CIRCOS diagram in the centre summarizes the differentially methylated regions and differentially expressed genes. Key changes in both DNA methylation and gene expression in 5 gene regions were successfully validated in a 65 vs. 65 case-control cohort.
DEFINITIONS
[0029] The term "type 2 diabetes" (T2D) refers to a metabolic disorder that is characterized by high blood glucose in the context of varying combinations of insulin resistance and insulin deficiency. Type 2 diabetes may be caused by a combination of lifestyle and genetic factors. Diabetes can be caused by distinct clinical entities such as endocrine disorders (e.g., Cushing's syndrome) and chronic pancreatitis. However, the majority of people with diabetes have risk factors including but not limited to obesity, hypertension, high blood cholesterol, and metabolic syndrome (high triglyceride, low HDL-C, high blood glucose, high blood pressure, large waist), which may share common metabolic pathways, further amplified by aging, energy-dense diets (e.g., high-fat and high-glucose), sedentary lifestyle and use of certain drugs (e.g., beta blockers, steroids). On the other hand, having relatives (especially first degree) with T2D substantially increases the risk of developing T2D. Symptoms of T2D often include polyuria (frequent urination), polydipsia (increased thirst), polyphagia (increased hunger), fatigue, and weight loss. The abnormal neurohormonal and metabolic milieu characterized by hyperglycemia, dyslipidemia and low-grade inflammation can trigger a cascade of signaling pathways, which can lead to cell death and dysregulated cell growth, giving rise to multiple morbidities including heart disease, strokes, limb amputation, visual loss, kidney failure, cancers, and cognitive impairment.
[0030] The term "cardiovascular disease" refers to a broad class of diseases that involve the heart or blood vessels (arteries and veins) and affect the cardiovascular system, such as conditions related to atherosclerosis (arterial disease). These include, but are not limited to, stroke, coronary heart disease and peripheral vascular disease. Known risk factors for cardiovascular diseases include unhealthy eating, lack of exercise, obesity, suboptimally managed diabetes, abnormal blood lipids, high blood pressure, excessive consumption of alcohol, use of tobacco, as well as genetic background.
[0031] As used herein, the term "body mass index" or "BMI" refers to a number calculated from a person's weight and height to reflect the "fatness" or "thinness" of a person. More specifically, BMI = mass (kg) / (height (m))² or mass (lb) × 703 / (height (in))². Typically, in Caucasian populations, a BMI of 20 to 25 kg/m² is considered optimal weight; a BMI lower than 20 kg/m² suggests the person is underweight whereas a BMI above 25 kg/m² may indicate the person is overweight; a BMI above 30 kg/m² suggests the person is obese; and a BMI over 40 kg/m² indicates the person to be morbidly obese. Compared to Caucasians, Asians have more body fat for the same degree of BMI and waist circumference. Thus, normal weight and obesity in Asians are defined as <23 kg/m² and >25 kg/m² respectively. While high BMI may predict risk for diabetes or prediabetes, people with low BMI, which correlates with beta cell function, are also at high risk, especially if these subjects develop central obesity, which tends to be associated with insulin resistance or reduced insulin sensitivity.
[0032] In this disclosure, the term "biological sample" or "sample" includes any section of tissue or bodily fluid taken from a test subject such as a biopsy and autopsy sample, and frozen section taken for histologic purposes, or processed forms of any of such samples. Biological samples include blood and blood fractions or products (e.g., serum, plasma, platelets, white blood cells, red blood cells, and the like), sputum or saliva, lymph and tongue tissue, cultured cells, e.g., primary cultures, explants, and transformed cells, stool, urine, stomach biopsy tissue etc. A biological sample is typically obtained from a eukaryotic organism, which may be a mammal, may be a primate and may be a human subject.
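The BMI formula and cut-offs given in the definition of "body mass index" can be expressed as a short sketch. The function names are hypothetical. The Caucasian categories follow the thresholds in the text; for Asians the text defines only normal weight (<23 kg/m²) and obesity (>25 kg/m²), so values between 23 and 25 are reported as unclassified rather than assigned a label the text does not supply.

```python
def bmi_metric(mass_kg, height_m):
    """BMI from mass in kilograms and height in metres."""
    return mass_kg / height_m ** 2


def bmi_imperial(mass_lb, height_in):
    """BMI from mass in pounds and height in inches (factor 703)."""
    return mass_lb * 703 / height_in ** 2


def classify_bmi(bmi, asian=False):
    """Map a BMI value (kg/m^2) to the categories named in the text."""
    if asian:
        # Only two Asian cut-offs are stated: <23 normal weight, >25 obese.
        if bmi < 23:
            return "normal weight"
        if bmi > 25:
            return "obese"
        return "unclassified in the text"
    if bmi > 40:
        return "morbidly obese"
    if bmi > 30:
        return "obese"
    if bmi > 25:
        return "overweight"
    if bmi < 20:
        return "underweight"
    return "optimal weight"


value = bmi_metric(70, 1.75)  # made-up example: 70 kg, 1.75 m
print(round(value, 2), classify_bmi(value), classify_bmi(value, asian=True))
```

A BMI of 27 kg/m², for example, is classed as overweight under the Caucasian thresholds but as obese under the Asian thresholds, which is the distinction the definition draws.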
[0033] In this disclosure, the term "biopsy" refers to the process of removing a tissue sample for diagnostic or prognostic evaluation, and to the tissue specimen itself. Any biopsy technique known in the art can be applied to the methods of the present invention. The biopsy technique applied will depend on the tissue type to be evaluated (e.g., tongue, colon, prostate, kidney, bladder, lymph node, liver, bone marrow, blood cell, stomach tissue, etc.) among other factors. Representative biopsy techniques include, but are not limited to, excisional biopsy, incisional biopsy, needle biopsy, surgical biopsy, and bone marrow biopsy and may comprise endoscopy such as colonoscopy. A wide range of biopsy techniques are well known to those skilled in the art who will choose between them and implement them with minimal experimentation.
[0034] In this disclosure, the term "isolated" nucleic acid molecule means a nucleic acid molecule that is separated from other nucleic acid molecules that are usually associated with the isolated nucleic acid molecule. Thus, an "isolated" nucleic acid molecule includes, without limitation, a nucleic acid molecule that is free of nucleotide sequences that naturally flank one or both ends of the nucleic acid in the genome of the organism from which the isolated nucleic acid is derived (e.g., a cDNA or genomic DNA fragment produced by a polymerase chain reaction or restriction endonuclease digestion). Such an isolated nucleic acid molecule is generally introduced into a vector (e.g., a cloning vector or an expression vector) for convenience of manipulation or to generate a fusion nucleic acid molecule. In addition, an isolated nucleic acid molecule can include an engineered nucleic acid molecule such as a recombinant or a synthetic nucleic acid molecule. A nucleic acid molecule existing among hundreds to millions of other nucleic acid molecules within, for example, a nucleic acid library (e.g., a cDNA or genomic library) or a gel (e.g., agarose or polyacrylamide) containing restriction-digested genomic DNA, is not an "isolated" nucleic acid.
[0035] The term "nucleic acid" or "polynucleotide" refers to deoxyribonucleic acids (DNA) or ribonucleic acids (RNA) and polymers thereof in either single- or double-stranded form. Unless specifically limited, the term encompasses nucleic acids containing known analogs of natural nucleotides that have similar binding properties as the reference nucleic acid and are metabolized in a manner similar to naturally occurring nucleotides. Unless otherwise indicated, a particular nucleic acid sequence also implicitly encompasses conservatively modified variants thereof (e.g., degenerate codon substitutions), alleles, orthologs, single nucleotide polymorphisms (SNPs), and complementary sequences as well as the sequence explicitly indicated. Specifically, degenerate codon substitutions may be achieved by generating sequences in which the third position of one or more selected (or all) codons is substituted with mixed-base and/or deoxyinosine residues (Batzer et al., Nucleic Acid Res. 19:5081 (1991); Ohtsuka et al., J. Biol. Chem. 260:2605-2608 (1985); and Rossolini et al., Mol. Cell. Probes 8:91-98 (1994)). The term nucleic acid is used interchangeably with gene, cDNA, and mRNA encoded by a gene.
[0036] The term "gene" means the segment of DNA involved in producing a polypeptide chain; it includes regions preceding and following the coding region (leader and trailer) involved in the transcription and/or translation of the gene product and the regulation of the transcription and/or translation, as well as intervening sequences (introns) between individual coding segments (exons).
[0037] ADAT1, AGRN, AJAP1, AKAP7, ANKRD29, APOB, ASBIS, C17orf64, C17orf81, C21orf56, C6orf138, CAMK1D, CBLN4, CCDC155, CCDC93, CDKN2B, CEMP1, CITED2, CLDN6, COL15A1, CUED2, CUEDC2, DDX18, E2F8, EIF4E3, FLT1, FMOD, GTF2A2, HEATR2, KCNK17, KDM2B, LHX8, LRRC7, MED7, miR-16-1, miR-16-2, NR4A3, PACRG, PCDH17, PDCD1, PDGFRL, PLEKHH3, PPP4R1L, PRR14, PTPN7, PTPRD, RABGGTB, RPS16, SCARF2, SLC22A20, SLC35A4, TARS, TGFBR3, TNFRSF12A, TXNDC12, U2AF1, UMODL1, WFS1, and XBP1, as used herein, refer to the genes or proposed open reading frames (including their variants and mutants) and their polynucleotide transcripts as provided in publicly accessible sequence databases of the human genome. In some contexts, these terms may also be used to refer to the polypeptides encoded by these genes or open reading frames.
[0038] In this application, the terms "polypeptide," "peptide," and "protein" are used interchangeably herein to refer to a polymer of amino acid residues. The terms apply to amino acid polymers in which one or more amino acid residue is an artificial chemical mimetic of a corresponding naturally occurring amino acid, as well as to naturally occurring amino acid polymers and non-naturally occurring amino acid polymers. As used herein, the terms encompass amino acid chains of any length, including full-length proteins (i.e., antigens), wherein the amino acid residues are linked by covalent peptide bonds.
[0039] The term "amino acid" refers to naturally occurring and synthetic amino acids, as well as amino acid analogs and amino acid mimetics that function in a manner similar to the naturally occurring amino acids. Naturally occurring amino acids are those encoded by the genetic code, as well as those amino acids that are later modified, e.g., hydroxyproline, γ-carboxyglutamate, and O-phosphoserine. For the purposes of this application, amino acid analogs refer to compounds that have the same basic chemical structure as a naturally occurring amino acid, i.e., a carbon that is bound to a hydrogen, a carboxyl group, an amino group, and an R group, e.g., homoserine, norleucine, methionine sulfoxide, methionine methyl sulfonium. Such analogs have modified R groups (e.g., norleucine) or modified peptide backbones, but retain the same basic chemical structure as a naturally occurring amino acid. For the purposes of this application, amino acid mimetics refer to chemical compounds that have a structure that is different from the general chemical structure of an amino acid, but that function in a manner similar to a naturally occurring amino acid.
[0040] Amino acids may include those having non-naturally occurring D-chirality, as disclosed in WO01/12654, which may improve the stability (e.g., half-life), bioavailability, and other characteristics of a polypeptide comprising one or more of such D-amino acids. In some cases, one or more, and potentially all of the amino acids of a therapeutic polypeptide have D-chirality.
[0041] Amino acids may be referred to herein by either the commonly known three-letter symbols or by the one-letter symbols recommended by the IUPAC-IUB Biochemical Nomenclature Commission. Nucleotides, likewise, may be referred to by their commonly accepted single-letter codes.
[0042] The phrase "specifically binds," when used in the context of describing a binding relationship of a particular molecule to a protein or peptide, refers to a binding reaction that is determinative of the presence of the protein in a heterogeneous population of proteins and other biologics. Thus, under designated binding assay conditions, the specified binding agent (e.g., an antibody) binds to a particular protein at least two times the background and does not substantially bind in a significant amount to other proteins present in the sample. Specific binding of an antibody under such conditions may require an antibody that is selected for its specificity for a particular protein or a protein but not its similar "sister" proteins. A variety of immunoassay formats may be used to select antibodies specifically immunoreactive with a particular protein or in a particular form. For example, solid-phase ELISA immunoassays are routinely used to select antibodies specifically immunoreactive with a protein (see, e.g., Harlow & Lane, Antibodies, A Laboratory Manual (1988) for a description of immunoassay formats and conditions that can be used to determine specific immunoreactivity). Typically a specific or selective binding reaction will be at least twice background signal or noise and more typically more than 10 to 100 times background. On the other hand, the term "specifically bind" when used in the context of referring to a polynucleotide sequence forming a double-stranded complex with another polynucleotide sequence describes "polynucleotide hybridization" based on Watson-Crick base-pairing, as provided in the definition for the term "polynucleotide hybridization method."
[0043] A "polynucleotide hybridization method" as used herein refers to a method for detecting the presence and/or quantity of a pre-determined polynucleotide sequence based on its ability to form Watson-Crick base-pairing, under appropriate hybridization conditions, with a polynucleotide probe of a known sequence. Examples of such hybridization methods include Southern blot, Northern blot, and in situ hybridization.
[0044] "Primers" as used herein refer to oligonucleotides that can be used in an amplification method, such as a polymerase chain reaction (PCR), to amplify a nucleotide sequence based on the polynucleotide sequence corresponding to a gene of interest or a portion thereof. Typically, at least one of the PCR primers for amplification of a polynucleotide sequence is sequence-specific for that polynucleotide sequence. The exact length of the primer will depend upon many factors, including temperature, source of the primer, and the method used. For example, for diagnostic and prognostic applications, depending on the complexity of the target sequence, the oligonucleotide primer typically contains at least 10, or 15, or 20, or 25 or more nucleotides, although it may contain fewer nucleotides or more nucleotides. The factors involved in determining the appropriate length of primer are readily known to one of ordinary skill in the art. In this disclosure, the term "primer pair" means a pair of primers that hybridize to opposite strands of a target DNA molecule or to regions of the target DNA which flank a nucleotide sequence to be amplified. In this disclosure, the term "primer site" means the area of the target DNA or other nucleic acid to which a primer hybridizes.
[0045] A "label," "detectable label," or "detectable moiety" is a composition detectable by spectroscopic, photochemical, biochemical, immunochemical, chemical, or other physical means. For example, useful labels include 32P, fluorescent dyes, electron-dense reagents, enzymes (e.g., as commonly used in an ELISA), biotin, digoxigenin, or haptens and proteins that can be made detectable, e.g., by incorporating a radioactive component into the peptide or used to detect antibodies specifically reactive with the peptide. Typically a detectable label is attached to a probe or a molecule with defined binding characteristics (e.g., a polypeptide with a known binding specificity or a polynucleotide), so as to allow the presence of the probe (and therefore its binding target) to be readily detectable.
[0046] The term "amount" as used in this application refers to the quantity of a polynucleotide of interest or a polypeptide of interest present in a sample. Such quantity may be expressed in absolute terms, i.e., the total quantity of the polynucleotide or polypeptide in the sample, or in relative terms, i.e., the concentration of the polynucleotide or polypeptide in the sample.
[0047] The term "effective amount" as used herein refers to an amount of a given substance that is sufficient in quantity to produce a desired effect. For example, an effective amount of a cholesterol lowering drug or a blood glucose lowering drug is the amount of said drug to achieve a decreased level of cholesterol or blood glucose, respectively, in a patient who has been given the drug for therapeutic purposes. An amount adequate to accomplish this is defined as the "therapeutically effective dose." The dosing range varies with the nature of the therapeutic agent being administered and other factors such as the route of administration and the severity of a patient's condition.
[0048] The term "expression level" refers to the expression level of a particular gene or genomic region in a sample obtained from an individual. Expression can be represented by the amount of transcribed RNA or translated protein or polypeptide associated with the gene or genomic region, or by a biological activity (e.g., enzymatic activity) resulting from expression of the gene or genomic region. Expression levels can be measured as desired, in absolute or relative terms, and can reflect the context of the measurement or the conditions under which the sample was obtained.
[0049] The term "DNA methylation level" refers to the extent to which a gene or genomic region is methylated in a sample obtained from an individual. A gene can be fully or partially methylated, and the pattern of methylation can be random, uniform, or specific to portions of the gene (for example, the promoter or body). Moreover, the pattern and extent of methylation of a gene can vary, for example between chromosomes in the same cell, tissues of the same individual, or different individuals. Thus, measuring a DNA methylation level in a sample can provide a detailed methylation pattern and can reflect the context in which the sample was obtained. The measured DNA methylation level can be used to determine whether a genomic region is differentially methylated, for example between T2D-positive and T2D-negative individuals.
[0050] "Standard control" as used herein refers to a sample suitable for the use of a method of the present invention, in order to quantitatively determine the level of expression (e.g., abundance of RNA transcripts or gene products) or DNA methylation in a test sample for one or more genomic regions of interest (for example, a gene or genomic locus). The standard control contains a known level or levels of expression or DNA methylation for the genomic region(s) of interest, such that the levels closely reflect those of an average healthy individual not suffering from T2D and not at an increased risk of later developing T2D. The standard control may be derived from one or more healthy individuals.
[0051] "Higher or lower than levels in a standard control" as used herein refers to differences between the level of expression or DNA methylation in a test sample as compared with corresponding levels in a standard control, for the same genomic region of interest. A higher level is preferably at least 2-fold, more preferably at least 5-fold, and most preferably at least 10-fold higher than the level in the standard control. Similarly, a lower level is preferably less than 50%, more preferably less than 20%, and most preferably less than 10% of the level in the standard control.
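The thresholds in the preceding definition can be illustrated with a minimal Python sketch. The function name classify_level and the tier labels are hypothetical and chosen only for illustration; "at least N-fold higher" is read here as a test-to-control ratio of N or more, which is an assumption about the intended arithmetic.

```python
def classify_level(test_level: float, control_level: float):
    """Compare a test-sample level with the standard control for the same region.

    Returns (direction, tier), where tier follows the preferred / more preferred /
    most preferred thresholds of paragraph [0051]. Labels are illustrative only.
    """
    if control_level <= 0:
        raise ValueError("control level must be positive")
    ratio = test_level / control_level

    # "Higher": at least 2-fold, 5-fold, or 10-fold relative to the control.
    for fold, tier in ((10, "most preferred"), (5, "more preferred"), (2, "preferred")):
        if ratio >= fold:
            return ("higher", tier)

    # "Lower": less than 10%, 20%, or 50% of the control level.
    for fraction, tier in ((0.10, "most preferred"), (0.20, "more preferred"), (0.50, "preferred")):
        if ratio < fraction:
            return ("lower", tier)

    return ("unchanged", None)
```

For example, a test level three times the control level falls in the "preferred" higher tier, whereas a level equal to 40% of the control falls in the "preferred" lower tier.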
[0052] The term "subject" or "subject in need of treatment," as used herein, includes individuals who seek medical attention due to risk of, or actual suffering from T2D, cancer, or cardiovascular disease. Subjects also include individuals currently undergoing therapy that seek manipulation of the therapeutic regimen. Subjects or individuals in need of treatment include those that demonstrate symptoms of T2D or cardiovascular disease, or are at risk of suffering from T2D or cardiovascular disease or related symptoms. For example, a subject in need of treatment includes individuals with a genetic predisposition or family history for T2D, cancer, or cardiovascular disease, those who have suffered relevant symptoms in the past, those who have been exposed to a triggering substance or event, as well as those suffering from chronic or acute symptoms of the condition. A "subject in need of treatment" may be at any age of life.
[0053] The term "related," as it pertains to individuals, is used herein to refer to individuals who have common ancestry. For example, a child is related to its parents, grandparents, and siblings, and moreover all of these individuals are related to each other. Relatedness between individuals can be established through knowledge of family history, with genetic tests, or otherwise. Any number of generations can be considered when determining whether two or more people have common ancestry and are thus related, although this number can be limited if desired to provide a more stringent test.
[0054] As the term is used herein, a "multiomic data set" refers to data resulting from sequencing DNA and/or RNA obtained from a tissue sample of an individual. Without limitation, a multiomic data set can contain one or more of the following: all or a substantial part of the genomic DNA sequence of the individual; sequences or copy numbers of sites of interest within the genome; methylation patterns of genomic DNA; information about the accessibility of genomic DNA sequences to transcription; sequences of RNA transcripts, including coding and/or non-coding RNAs; and levels or relative abundances of RNA transcripts. In some embodiments, the multiomic data set contains sequencing data for the full genome, transcriptome, or methylome of the tissue sample. The DNA and RNA can be obtained from any kind of tissue, and the multiomic data set can have components that a skilled artisan would expect to be specific to the tissue (e.g. the presence of RNA transcripts for specific genes) or largely the same for all tissues of the individual (e.g. sequences of genomic DNA). Multiomic data sets from different individuals (e.g. individuals with different diabetes phenotypes) can be compared to identify genetic differences between the individuals, such as differences in genomic DNA sequence (e.g. single-nucleotide polymorphisms, insertions, deletions, differences in gene copy numbers), differences in genomic DNA structure (e.g. methylation, binding of DNA to histone proteins or transcription factors), and differences in gene expression (e.g. relative abundances of RNA transcripts or non-coding RNA).
[0055] As the term is used herein, a "multiomic signature" refers to a set of genetic markers, identified using multiomic data sets and the methods disclosed herein, that characterizes or is possessed by an individual or group of individuals. A multiomic signature can be shared by individuals who are related to each other (for example, one being the offspring of another) or who have a common phenotype (for example, being diagnosed with T2D). A multiomic signature can include any number of identified genes, regions, or loci within the genome, as well as information about how the structure (e.g., sequence or methylation state) or expression (e.g., transcription levels) of these genes, regions, or loci differs between individuals who possess and do not possess the multiomic signature.
[0056] "Network analysis," as the term is used herein, refers to a method of integrating multiomic data sets from two or more individuals. In some embodiments, the individuals are related. For each individual, a gene can be represented as a vector made up of multiple data types from the multiomic data set, and connections between genes can be identified by correlating differences in the vectors between individuals. A set of connected genes with characteristics (e.g. expression level, DNA methylation level) that correlate with the presence or absence of a particular phenotype (e.g., T2D) can then be identified.
DETAILED DESCRIPTION OF THE INVENTION
I. Introduction
[0057] Large scale studies like GWAS found multiple genes and SNPs associated with type 2 diabetes (T2D) but with limited effect size. We used a multiomic approach on a family trio to discover T2D associated genes.
[0058] DNA and RNA samples from peripheral blood mononuclear cells from a family trio consisting of a T2D-affected mother and daughter and a non-T2D father were used for high throughput DNA sequencing, bisulfite sequencing, RNA sequencing and small RNA sequencing to obtain a complete profile of DNA methylation and gene expression. Pairwise comparisons (FM, father-mother; FD, father-daughter; and MD, mother-daughter) were used to find differentially methylated and expressed genes. Network analysis was used to integrate these changes to discover key genes associated with T2D. Changes in DNA methylation and gene expression were validated in 65 controls and 65 T2D cases using Sequenom EpiTYPER and qRT-PCR.
[0059] Bisulfite sequencing and RNA sequencing yielded high coverage and depth data for analysis. There were 413 differentially expressed genes and 186 differentially methylated regions (DMR) that showed cosegregation in the FD and FM pairs but not in the MD pair. The 186 DMR overlapped with the promoters of 32 genes. Using the results from changes in gene and microRNA expression and DNA methylation, multiomic network analysis highlighted the 15 most connected genes. Among these focus genes with correlated methylation/expression patterns, several showed association with T2D in a GWAS meta-analysis of 677 Chinese T2D cases and 955 controls. DNA methylation and gene expression changes in these genes were validated in 65 controls and 65 T2D cases. The confirmed genes suggested changes in chromatin modification and protein metabolism pathways in T2D.
[0060] Apart from emphasizing the importance of epigenetics and protein metabolism in T2D, this is a proof of concept study for using a multiomic approach to discover pivotal genes in complex diseases such as T2D.
II. General Methodology
[0061] Practicing this invention utilizes routine techniques in the field of molecular biology. Basic texts disclosing the general methods of use in this invention include Sambrook and Russell, Molecular Cloning, A Laboratory Manual (3rd ed. 2001); Kriegler, Gene Transfer and Expression: A Laboratory Manual (1990); and Current Protocols in Molecular Biology (Ausubel et al., eds., 1994).
[0062] For nucleic acids, sizes are given in either kilobases (kb) or base pairs (bp). These are estimates derived from agarose or acrylamide gel electrophoresis, from sequenced nucleic acids, or from published DNA sequences. For proteins, sizes are given in kilodaltons (kDa) or amino acid residue numbers. Protein sizes are estimated from gel electrophoresis, from sequenced proteins, from derived amino acid sequences, or from published protein sequences.
[0063] Oligonucleotides that are not commercially available can be chemically synthesized, e.g., according to the solid phase phosphoramidite triester method first described by Beaucage and Caruthers, Tetrahedron Lett. 22:1859-1862 (1981), using an automated synthesizer, as described in Van Devanter et al., Nucleic Acids Res. 12:6159-6168 (1984). Purification of oligonucleotides is performed using any art-recognized strategy, e.g., native acrylamide gel electrophoresis or anion-exchange high performance liquid chromatography (HPLC) as described in Pearson and Regnier, J. Chrom. 255:137-149 (1983).
[0064] Standard next-generation DNA sequencing methods, including but not limited to those of Illumina, Ion Torrent, and Pacific Biosciences, are well known in the art.
III. Multiomic Analysis
A. Subjects and sample preparation
[0065] A biological sample is obtained from a person to be tested or assessed for risk of developing type 2 diabetes or associated cardiovascular or renal disease using a method of the present invention. Collection of a tissue or fluid sample from an individual is performed in accordance with the standard protocol laboratories, hospitals or clinics generally follow, such as during a biopsy, blood drawing, saliva collection, or oral swab. An appropriate amount of sample is collected and may be stored according to standard procedures prior to further preparation.
[0066] The analysis of genomic DNA found in a subject's sample according to the present invention may be performed using essentially any tissue or bodily fluid, so long as genomic DNA is expected to be present in such sample. The methods for preparing tissue or fluid samples for nucleic acid extraction are well known among those of skill in the art. For example, a subject's epithelial tissue sample should be first treated to disrupt the cellular membrane so as to release nucleic acids contained within the cells.
[0067] The Chinese family trio consisted of an affected mother (age: 69, age of diagnosis: 61) and daughter (age 48, age of diagnosis: 38) pair and an unaffected father (age 79). They were selected from the Hong Kong Family Diabetes Study which recruited over 180 families using an index patient with familial T2D diagnosed before the age of 40 [16]. Peripheral blood samples were obtained from the family trio for isolation of PBMC using Ficoll-Paque stepwise gradient centrifugation. The isolated PBMC were divided for DNA and RNA extraction.
B. Whole genome sequencing and data analysis
1. DNA Extraction and Treatment
[0068] Methods for extracting DNA from a biological sample are well known and routinely practiced in the art of molecular biology, see, e.g., Sambrook and Russell, supra. RNA contamination should be eliminated to avoid interference with DNA analysis. Optionally, other components (such as proteins and lipids) may be removed from the biological sample prior to further analysis of the genomic DNA.
2. Optional Amplification and Sequence Analysis
[0069] Following the desired processing of DNA/RNA in a biological sample, the DNA/RNA is then subjected to sequence-based analysis, such that the genomic sequence of one or more of the pertinent genes, or one or more of its transcripts, found in a test subject may be determined and then compared with a standard sequence to detect any possible sequence variation. An amplification reaction is optional prior to the sequence analysis. A variety of polynucleotide amplification methods are well established and frequently used in research. For instance, the general methods of polymerase chain reaction (PCR) for polynucleotide sequence amplification are well known in the art and are thus not described in detail herein. For a review of PCR methods, protocols, and principles in designing primers, see, e.g., Innis, et al., PCR Protocols: A Guide to Methods and Applications, Academic Press, Inc. N.Y., 1990. PCR reagents and protocols are also available from commercial vendors, such as Roche Molecular Systems.
[0070] Although PCR amplification is typically used in practicing the present invention, one of skill in the art will recognize that amplification of the relevant genomic sequence may be accomplished by any known method, such as the ligase chain reaction (LCR), transcription-mediated amplification, and self-sustained sequence replication or nucleic acid sequence-based amplification (NASBA), each of which provides sufficient amplification.
[0071] Techniques for polynucleotide sequence determination are also well established and widely practiced in the relevant research field. For instance, the basic principles and general techniques for polynucleotide sequencing are described in various research reports and treatises on molecular biology and recombinant genetics, such as Wallace et al., supra; Sambrook and Russell, supra, and Ausubel et al., supra. DNA sequencing methods routinely practiced in research laboratories, either manual or automated, can be used for practicing the present invention. Additional means suitable for determining the polynucleotide sequence of a genomic DNA for practicing the methods of the present invention include but are not limited to mass spectrometry, primer extension, polynucleotide hybridization, real-time PCR, melting curve analysis, high resolution melting analysis, heteroduplex analysis, pyrosequencing, and electrophoresis.
[0072] The DNA samples from the trio were subjected to 75 bp paired-end sequencing using a Solexa sequencer from Illumina. All procedures were performed in accordance with the manufacturer's specifications. All GA sequencing raw data were processed by the Illumina Pipeline v1.3.1. After removal of low-quality reads and adapter sequences, the filtered reads were aligned to the reference human genome hg18 using the SOAP2 aligner developed by BGI [17]. The results were filtered and sorted. Variations in the sequence were identified by SOAPsnp v1.03 [18]. Small indels in the sequence were identified by the BWA aligner [19]. The identified variants were annotated and analyzed by SIFT (on the World Wide Web at sift-dna.org/) [20] to identify their functional implications.
C. Whole genome bisulfite sequencing and data analysis
[0073] Bisulfite sequencing was carried out to map the position of all methylcytosines in the genome. The DNA was fragmented and treated with bisulfite using a standard protocol. DNA fragments in the size range of 320 to 380 bp were gel purified for sequencing. All procedures were in accordance with the manufacturer's instructions. Converted DNA was subjected to 50 bp paired-end Illumina GA sequencing using a Solexa sequencer. All raw data were processed by the Illumina Pipeline v1.3.1.
[0074] The cleaned reads generated were aligned to the reference human genome hg18. Because DNA methylation is strand specific, the two strands of the reference human genome DNA were modified separately in silico to convert all 'C' to 'T', generating a combined 6 Gbp target genome to allow alignment after bisulfite conversion. The reads were also transformed using the following criteria: (1) observed 'C' in the forward reads was replaced by 'T'; (2) observed 'G' in the reverse reads was converted to 'A'. The transformed reads were then aligned to the modified target genome using the SOAP2 aligner [17].
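The in silico conversion described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the SOAP2-based pipeline itself: the function names are hypothetical, the reverse strand is taken to be the reverse complement of the supplied reference sequence, and only the bases A, C, G, T and N are handled.

```python
# Complement table for the bases handled in this sketch (assumption: A/C/G/T/N only).
_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def convert_reference(reference: str):
    """Convert all 'C' to 'T' on each strand separately.

    Returns (converted_forward, converted_reverse); together the two converted
    strands form the combined target that reads are aligned against.
    """
    forward = reference.upper()
    reverse = reverse_complement(forward)
    return forward.replace("C", "T"), reverse.replace("C", "T")


def convert_read(read: str, is_forward: bool) -> str:
    """Transform a read before alignment: C->T for forward reads, G->A for reverse reads."""
    read = read.upper()
    return read.replace("C", "T") if is_forward else read.replace("G", "A")
```

The original, untransformed read sequence would need to be kept alongside the transformed one, since the unconverted 'C' and 'G' positions are restored after alignment to call methylated cytosines.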
[0075] All the reads mapped to a unique location with minimum mismatches and clear strand information were defined as uniquely matched reads and used to determine the methylated C. According to the SOAP alignment results, the unconverted C and G from the original read sequences were used to substitute the sequence in the corresponding positions of the aligned reads. Based on the aligned reads with all positions of the methylated C, the two strands of the DNA were transformed back to the original sequence with the methylated C marked. Potential CpG methylation sites were called when the 'C' positions were covered with cytosines in reads on the same strand or guanines from those on the opposite strand. Bases with quality scores lower than 14 were filtered out to exclude bases that were caused by sequencing errors.
D. Identification of differentially methylated regions (DMR)
[0076] The methylation levels were compared pairwise between two of the three samples to identify differentially methylated regions (DMR) using the following criteria. Firstly, we identified 'seed' regions containing at least 5 CpG sites with at least 2-fold change in methylation level (P<0.05, Fisher's exact test). We only included 'seed' regions with 20% or more CpG methylation in any one of the 3 subjects, assuming that CpG sites with less than 20% methylation had no biological significance. Next, we extended the 'seed' region on both sides by adding the adjacent CpG sites until one of the following conditions was met: (1) the distance between two consecutive CpG sites was longer than 200 bp; (2) the average methylation fold change was less than 2; (3) both samples had less than 20% CpG methylation; (4) the P value from the Chi-square test exceeded 0.01. The identified DMR were further filtered by the following criteria: (1) the statistical false discovery rate (FDR) was less than 0.05; (2) the DMR must be covered by 20 reads or more; (3) the CpG sites in the DMR must be covered by 10 reads or more. Differentially methylated regions fulfilling all these criteria were considered to be the authentic DMR.
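The 'seed' criteria above can be illustrated with a short Python sketch. Several points are assumptions made only for illustration: methylated and unmethylated read counts are pooled over the CpG sites of a candidate region, the 20% rule is checked against the two samples being compared (the text refers to any one of the 3 subjects), and the two-sided Fisher's exact test is implemented directly from the hypergeometric distribution. The function names are hypothetical; region extension and FDR filtering are not shown.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test P value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x: int) -> float:
        # Hypergeometric probability of observing x in the top-left cell.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest, highest = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as extreme as the observed one.
    p_value = sum(
        p for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)


def is_seed_region(n_cpg: int, meth_a: int, total_a: int, meth_b: int, total_b: int) -> bool:
    """Apply the seed criteria to pooled read counts from two samples (a and b)."""
    if n_cpg < 5 or total_a == 0 or total_b == 0:
        return False
    level_a, level_b = meth_a / total_a, meth_b / total_b
    if max(level_a, level_b) < 0.20:          # at least 20% methylation in one sample
        return False
    low, high = sorted((level_a, level_b))
    if low > 0 and high / low < 2:            # at least 2-fold change (infinite if low == 0)
        return False
    p_value = fisher_exact_two_sided(meth_a, total_a - meth_a, meth_b, total_b - meth_b)
    return p_value < 0.05
```

For instance, a region of 6 CpG sites with 40 of 50 reads methylated in one sample and 10 of 50 in the other passes all three checks, whereas the same counts spread over only 4 CpG sites do not.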
E. mRNA Transcriptome sequencing and data analysis
[0077] Total RNA extracted from each sample was enriched by oligo-dT to obtain the polyA+ fraction for sequencing. The polyA+ mRNA was then fragmented and converted to cDNA by reverse transcription. After ligation of the 5' and 3' sequencing adaptors to the cDNA, DNA fragments were size selected for 75 bp paired-end sequencing by Illumina Genome Analyzer II using standard procedures. All raw data were processed by the Illumina Pipeline v1.3.1. The generated reads were then mapped to the human genome hg18 using SOAP2 [17], following methods established by BGI [21, 22].
F. Identification of differentially expressed genes
[0078] All reads were mapped to the regions of known annotated genes, and expression levels were expressed as reads per kilobase per million mapped reads (RPKM). The expression levels were compared pairwise between two of the three samples based on fold changes and the P value of the difference using the method of Audic and Claverie [23]. Briefly, a difference in gene expression was considered significant if (1) fold change>2; (2) P value≤0.001; (3) FDR≤0.001 and (4) expression level > 10th percentile of all expressed transcripts in at least one of the 2 samples.
G. Small RNA sequencing and data analysis
[0079] Total RNA was fractionated by gel electrophoresis to isolate the small RNA portion corresponding to the size of 18 to 30 nucleotides for sequencing with Solexa Genome Analyzer II using standard procedures. Reads were filtered out if (1) they had >4 bases with sequencing quality lower than 10 or >6 bases with sequencing quality lower than 13; (2) the 5' adaptor was contaminated; (3) they lacked the 3' adaptor; (4) they lacked an insert tag; (5) they contained polyA or polyN; (6) they were shorter than 18 nt. The remaining 'clean reads' were then mapped back to the genome by SOAP2 [17] to get the positions of small RNA tags along chromosomes. All the small RNA tags were also aligned by blastn (blastall -p blastn -F F -e 0.01) to the precursors and mature miRNA in miRBase 15.0 to obtain the miRNA count for each miRNA expressed. Then the remaining small RNA tags were further aligned to Rfam (by blastn, blastall -p blastn -F F -e 0.01), mRNA and piRNA from the NCBI database to identify other small RNA species and small RNA fragments from degraded mRNA (exons and introns).
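The six read filters listed above can be sketched as a single predicate in Python. The record layout is hypothetical: adaptor status and the polyA/polyN call are assumed to be determined upstream and supplied as flags, because the text does not specify how they are detected, and quality values are taken to be per-base integer scores.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class SmallRnaRead:
    """Hypothetical record for one small RNA read after adaptor trimming."""
    insert: str                            # sequence between the adaptors ("" if none)
    qualities: List[int]                   # per-base sequencing quality scores
    five_prime_adaptor_contaminated: bool  # criterion (2), determined upstream
    has_three_prime_adaptor: bool          # criterion (3), determined upstream
    is_poly_a_or_n: bool                   # criterion (5), determined upstream


def fails_quality(qualities: List[int]) -> bool:
    """Criterion (1): >4 bases below quality 10, or >6 bases below quality 13."""
    below_10 = sum(q < 10 for q in qualities)
    below_13 = sum(q < 13 for q in qualities)
    return below_10 > 4 or below_13 > 6


def is_clean_read(read: SmallRnaRead, min_length: int = 18) -> bool:
    """Return True if the read survives all six filters."""
    if fails_quality(read.qualities):
        return False
    if read.five_prime_adaptor_contaminated or not read.has_three_prime_adaptor:
        return False
    if not read.insert or read.is_poly_a_or_n:   # criteria (4) and (5)
        return False
    return len(read.insert) >= min_length        # criterion (6)
```

Reads for which is_clean_read returns True correspond to the 'clean reads' that are carried forward to genome mapping and miRNA annotation.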
H. Identification of differentially expressed micro RNA
[0080] To identify the differential expression of miRNA, the number of tags mapping to a known miRNA was considered as the expression level of the corresponding miRNA. The counts of each expressed miRNA were normalized to the total number of clean reads for comparison. The expression levels were compared pairwise between two of the three samples based on fold changes and the P value using a published method [23].
I. Multiomic network analysis
[0081] We performed candidate gene prioritization of T2D through network analysis by integration of multiomic data. The difference networks of mother vs. father (M/F network) and daughter vs. father (D/F network) were constructed separately. The common parts of these two networks were then taken as the M/F and D/F co-segregated network, or T2D relevant network. Characteristics of each gene could be represented by a vector (6 elements) corresponding to the multiomic data. These 6 elements are log2 ratios of gene expression, methylation level of gene body, methylation level of promoter region, tDMR in gene body, tDMR in promoter region and being the target genes of the miRNA showing the most significant difference in gene expression. We used four databases to predict miRNA target genes: (1) MicroCosm_Targets (on the World Wide Web at ebi.ac.uk/enright-srv/microcosm/htdocs/targets/v5/), (2) TargetScanHuman (on the World Wide Web at targetscan.org/vert_60/), (3) microRNA.org (on the World Wide Web at microrna.org/microrna/getDownloads.do), and (4) TarBase_V5.0_human (on the World Wide Web at diana.cslab.ece.ntua.gr/tarbase/tarbase_download.php). Only target genes identified by at least 3 out of the 4 databases were included for analysis. After discarding genes with too many missing elements (>2), the M/F and D/F networks included 3,180 and 3,091 genes, respectively.
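Two of the filtering rules in this paragraph, the requirement that a miRNA target be predicted by at least 3 of the 4 databases and the removal of genes with more than 2 missing vector elements, can be sketched in Python as follows. The function names and the use of None to mark a missing element are assumptions for illustration, and any gene identifiers passed in are placeholders rather than results from the study.

```python
from collections import Counter
from typing import Dict, Optional, Sequence, Set


def consensus_targets(predictions: Dict[str, Set[str]], min_support: int = 3) -> Set[str]:
    """Keep target genes predicted by at least `min_support` of the supplied databases.

    `predictions` maps a database name to the set of genes it predicts as targets.
    """
    support = Counter(gene for genes in predictions.values() for gene in set(genes))
    return {gene for gene, count in support.items() if count >= min_support}


def keep_gene(vector: Sequence[Optional[float]], max_missing: int = 2) -> bool:
    """Retain a gene unless its 6-element multiomic vector has too many missing values."""
    if len(vector) != 6:
        raise ValueError("expected a 6-element multiomic vector")
    missing = sum(value is None for value in vector)
    return missing <= max_missing
```

A gene predicted by three of the four databases is retained by consensus_targets, while a gene whose vector has three missing elements is dropped by keep_gene.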
[0082] We used a modified RV-coefficient [24], which is a generalization of the Pearson correlation coefficient, as a measure of similarity of genes. Herein, RV-coefficients were computed for all pairs of genes. From a network perspective, genes were taken as nodes, and edges were added between genes if they were significantly correlated. A scale-free topology is associated with robustness, which is one of the most prominent characteristics of biological networks [25]. To make the T2D relevant network (the common parts of M/F and D/F networks) a scale-free network, we chose p<0.035 (RV-coefficient) as the threshold to simplify the relationships between genes in the M/F and D/F networks. Finally, the T2D relevant network included 2,375 nodes and 121,368 edges.
[0083] With respect to network topology, betweenness centrality (BC) [26] and closeness centrality (CC) [27] hubs are crucial for maintaining network architecture and functionality [28]. Also, there were 17 known T2D GWAS genes in the T2D relevant network. Hubs located on the shortest paths between known T2D GWAS genes were therefore selected for further study [25].
J. Gene expression validated by qRT-PCR
[0084] The gene expression level changes found in the RNA transcriptome analysis and microRNA expression were validated by real-time quantitative PCR (qRT-PCR) in 65 controls and 65 T2D cases. First strand cDNA was synthesized by the High-Capacity cDNA Reverse Transcription Kit (ABI Biosystems) using 0.5 μg total RNA as template. The SYBR Green method was used for qRT-PCR using an ABI 7900HT Real-time PCR machine and SYBR® Premix Ex Taq™ (Perfect Real Time) from Takara. The expression levels of each tested gene were measured by real-time qPCR for 40 cycles. The expression levels of mRNA were normalized to the expression level of β-actin. The expression levels of microRNA were normalized to the expression level of U6 small RNA.
K. DNA Methylation validated by Sequenom MassARRAY
[0085] DNA methylation changes were validated by the Sequenom MassARRAY Compact System (Sequenom, Inc.) using the EpiTYPER assay. Genomic sequences from regions of interest were used to design the primers for the assay. DNA samples were extracted from the PBMCs of 65 controls and 65 T2D cases. For each sample, 1 μg of DNA was treated with sodium bisulfite using the EZ DNA Methylation™ kit (Zymo Research) according to the manufacturer's procedure. The treated DNA was PCR amplified and tested for DNA methylation of individual CpG dinucleotides at the genomic regions of interest using EpiTYPER (Sequenom, Inc.) according to the manufacturer's standard protocols.
IV. Therapeutic and Preventive Measures
[0086] By illustrating the correlation between genomic sequence or expression variation and the presence or heightened risk of developing type 2 diabetes or cardiovascular and renal diseases among related subjects, especially those fitting certain profiles, such as those of Asian descent, in particular Han Chinese, the present inventors have provided a valuable tool for clinicians to determine, often in combination with other information and diagnostic or predictive or screening test results, how a subject having certain genomic characteristics should be monitored and/or treated for type 2 diabetes and diabetic cardiovascular or renal disease such that the symptoms of these conditions may be prevented, eliminated, ameliorated, reduced in severity and/or frequency, or delayed in their onset. For example, a physician may arrange for regular monitoring of various symptoms of type 2 diabetes or diabetic cardiovascular and renal diseases in a subject who has been deemed by the method of the present invention to have an elevated risk of developing type 2 diabetes. The physician may also prescribe both pharmacological and non-pharmacological treatments such as lifestyle modification (e.g., reduce body weight by 5%, high fiber diet, walking for at least 150 minutes weekly) and medicines known to reduce risk of onset of diabetes (e.g., metformin, alpha glucosidase inhibitors, lipase inhibitors) to a subject who has been deemed by the method of the present invention to have an elevated risk of developing type 2 diabetes. For a subject who has been deemed by the method of the present invention to suffer from or be at risk of developing diabetic cardiovascular or renal disease, the attending physician may prescribe medications to control risk factors such as high levels of blood cholesterol and triglycerides (e.g., statins and fibrates) and reduce angiotensin II activity (e.g., angiotensin converting enzyme inhibitor (ACEI) and angiotensin II receptor blocker (ARB)), as well as place the subject under regular testing and monitoring of coronary artery condition and kidney function.
[0087] Multiomic signatures obtained using the methods described herein can also be used to develop and screen new therapies for diabetes and its comorbidities.
V. Kits and Devices
[0088] The present invention provides compositions and kits for practicing the methods described herein to detect possible genomic sequence variation of certain gene(s) and the transcripts thereof in a subject, which can be used for various purposes such as detecting or diagnosing the presence of type 2 diabetes and diabetic cardiovascular or renal disease in a subject, determining the risk of developing type 2 diabetes and diabetic cardiovascular or renal disease in a subject, and guiding the treatment plan for these conditions in the subject.
[0089] Kits for carrying out assays for determining the nucleotide sequence of a relevant genomic sequence typically include at least one oligonucleotide useful for specific hybridization with a predetermined segment of a pertinent genomic sequence. Optionally, this oligonucleotide is labeled with a detectable moiety. In some cases, the oligonucleotide specifically hybridizes with the standard sequence only but not with any of the variant sequences. In other cases, the oligonucleotide specifically hybridizes with one particular version of the variant sequence but not with other versions, nor with the standard sequence.
[0090] In some cases, the kits may include at least two oligonucleotide primers that can be used in the amplification of at least one segment of a pertinent genomic sequence or transcripts thereof by PCR. In some examples, at least one of the oligonucleotide primers is designed to anneal only to the standard sequence or only to a particular version of the variant sequences.
[0091] In addition, the kits of this invention may provide instruction manuals (e.g., internet-based decision support tools) to guide users in analyzing test samples and assessing the presence or future risk of type 2 diabetes and diabetic cardiovascular or renal disease in a test subject.
VI. Computer Systems
[0092] Any of the computer systems mentioned herein may utilize any suitable number of subsystems. Examples of such subsystems are shown in Figure 3 in computer apparatus 300. In some embodiments, a computer system includes a single computer apparatus, where the subsystems can be the components of the computer apparatus. In other embodiments, a computer system can include multiple computer apparatuses, each being a subsystem, with internal components.
[0093] The subsystems shown in Figure 3 are interconnected via a system bus 375. Additional subsystems such as a printer 374, keyboard 378, storage device(s) 379, monitor 376, which is coupled to display adapter 382, and others are shown. Peripherals and input/output (I/O) devices, which couple to I/O controller 371, can be connected to the computer system by any number of means known in the art, such as serial port 377. For example, serial port 377 or external interface 381 (e.g. Ethernet, Wi-Fi, etc.) can be used to connect computer system 300 to a wide area network such as the Internet, a mouse input device, or a scanner. The interconnection via system bus 375 allows the central processor 373 to communicate with each subsystem and to control the execution of instructions from system memory 372 or the storage device(s) 379 (e.g., a fixed disk, such as a hard drive or optical disk), as well as the exchange of information between subsystems. The system memory 372 and/or the storage device(s) 379 may embody a computer readable medium. Any of the data mentioned herein can be output from one component to another component and can be output to the user.
[0094] A computer system can include a plurality of the same components or subsystems, e.g., connected together by external interface 381 or by an internal interface. In some embodiments, computer systems, subsystem, or apparatuses can communicate over a network. In such instances, one computer can be considered a client and another computer a server, where each can be part of a same computer system. A client and a server can each include multiple systems, subsystems, or components.
[0095] It should be understood that any of the embodiments of the present invention can be implemented in the form of control logic using hardware (e.g. an application specific integrated circuit or field programmable gate array) and/or using computer software with a generally programmable processor in a modular or integrated manner. As used herein, a processor includes a multi-core processor on a same integrated chip, or multiple processing units on a single circuit board or networked. Based on the disclosure and teachings provided herein, a person of ordinary skill in the art will know and appreciate other ways and/or methods to implement embodiments of the present invention using hardware and a combination of hardware and software.
[0096] Any of the software components or functions described in this application may be implemented as software code to be executed by a processor using any suitable computer language such as, for example, Java, C++ or Perl using, for example, conventional or object-oriented techniques. The software code may be stored as a series of instructions or commands on a computer readable medium for storage and/or transmission. Suitable media include random access memory (RAM), a read only memory (ROM), a magnetic medium such as a hard-drive or a floppy disk, or an optical medium such as a compact disk (CD) or DVD (digital versatile disk), flash memory, and the like. The computer readable medium may be any combination of such storage or transmission devices.
[0097] Such programs may also be encoded and transmitted using carrier signals adapted for transmission via wired, optical, and/or wireless networks conforming to a variety of protocols, including the Internet. As such, a computer readable medium according to an embodiment of the present invention may be created using a data signal encoded with such programs. Computer readable media encoded with the program code may be packaged with a compatible device or provided separately from other devices (e.g., via Internet download). Any such computer readable medium may reside on or within a single computer product (e.g. a hard drive, a CD, or an entire computer system), and may be present on or within different computer products within a system or network. A computer system may include a monitor, printer, or other suitable display for providing any of the results mentioned herein to a user.
[0098] Any of the methods described herein may be totally or partially performed with a computer system including one or more processors, which can be configured to perform the steps. Thus, embodiments can be directed to computer systems configured to perform the steps of any of the methods described herein, potentially with different components performing a respective step or a respective group of steps. Although presented as numbered steps, steps of methods herein can be performed at a same time or in a different order. Additionally, portions of these steps may be used with portions of other steps from other methods. Also, all or portions of a step may be optional. Additionally, any of the steps of any of the methods can be performed with modules, circuits, or other means for performing these steps.
EXAMPLE
[0099] The following example is provided by way of illustration only and not by way of limitation. Those of skill in the art will readily recognize a variety of non-critical parameters that could be changed or modified to yield essentially the same or similar results.
Methylome of the trio and differential methylation regions (DMRs) in the genome
[0100] The present inventors used whole genome bisulfite sequencing to obtain a single-base resolution DNA methylome of human PBMC from the trio. Over 140 Gbp of DNA sequence was obtained from each member of the trio, with about 30x effective coverage of the human genome (Table 1). Mapped DNA sequences were used to calculate the degree of mC coverage at each C position in the human genome. About 95% of the C in all autosomes was covered. Most of the DNA methylation was found at CG sites, with a minority being non-CG methylation. Differences in CG DNA methylation were identified by pairwise comparisons in the trio. Using a stringent seed expansion method, the inventors identified 2686, 2040 and 1514 DMRs in the corresponding FM, FD and MD pairwise comparisons (Figure 4). To identify DMRs possibly related to T2D, the inventors selected 186 DMRs present in both the FM and FD comparisons but not in the MD comparison (Table 2). If 50% or more of the DMR sequence overlapped with the 2 kb promoter region of a gene, the gene was considered to be linked to the DMR. From these 186 DMRs, 32 genes with promoter regions overlapping a DMR were identified (Table 3). The inventors calculated the mC methylation coverage in the 2 kb promoter regions and found that DNA mC methylation in these promoter regions showed a general trend in agreement with the DMRs identified.
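The DMR selection and promoter-linking rules described above can be illustrated with a minimal Python sketch. It is not the inventors' pipeline. It assumes that DMRs from different pairwise comparisons can be matched by identical coordinates (a simplification) and that the 2 kb promoter is the region immediately upstream of the transcription start site (the source does not define the promoter boundaries). All names (`Interval`, `promoter_region`, `overlap_fraction`, `is_linked`, `candidate_dmrs`) are hypothetical.

```python
from typing import NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int  # 0-based, inclusive
    end: int    # exclusive


def promoter_region(chrom: str, tss: int, strand: str, size: int = 2000) -> Interval:
    """Assumed promoter: `size` bp immediately upstream of the TSS."""
    if strand == "+":
        return Interval(chrom, max(tss - size, 0), tss)
    return Interval(chrom, tss, tss + size)


def overlap_fraction(dmr: Interval, promoter: Interval) -> float:
    """Fraction of the DMR length that lies inside the promoter."""
    if dmr.chrom != promoter.chrom or dmr.end <= dmr.start:
        return 0.0
    shared = min(dmr.end, promoter.end) - max(dmr.start, promoter.start)
    return max(shared, 0) / (dmr.end - dmr.start)


def is_linked(dmr: Interval, promoter: Interval, min_fraction: float = 0.5) -> bool:
    """A gene is linked to a DMR when >= 50% of the DMR overlaps its promoter."""
    return overlap_fraction(dmr, promoter) >= min_fraction


def candidate_dmrs(fm, fd, md):
    """DMRs present in both the FM and FD comparisons but absent from MD."""
    return (set(fm) & set(fd)) - set(md)


# Hypothetical coordinates, for illustration only.
promoter = promoter_region("chr1", tss=1200, strand="+")
print(is_linked(Interval("chr1", 1000, 1400), promoter))  # half of the DMR overlaps
print(is_linked(Interval("chr1", 1100, 1500), promoter))  # a quarter overlaps
```

In practice, DMRs called in different comparisons rarely share exact boundaries, so a real implementation would likely match them by coordinate overlap rather than by set identity.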
Transcriptome by RNA-Seq and differentially expressed genes
[0101] In order to ascertain the functional significance of changes in DNA methylation, RNA sequencing was performed on the RNA samples from PBMC. Most of the sequence reads were mapped successfully to the annotated human reference genes (Table 1). The mapped sequences represented expression of 16469, 16732 and 16916 annotated gene loci from the father, mother and daughter respectively, with the majority of transcripts expressed in all 3 subjects. Using a 2-fold difference as the criterion in pairwise comparisons, 1097, 848 and 548 genes were differentially expressed in the FM, FD and MD pairs respectively. In the Venn diagram, 413 genes (listed in Table 4) showed differential expression in the FD and FM pairs but not in the MD comparison (Figure 5). Of these, 15 were located on the Y chromosome. Of the remaining 398 genes, 200 showed decreased expression in the FM and FD pairs and 198 showed increased expression.
MicroRNA expression profile
[0102] Small RNA, especially microRNA, is a key regulator of many cellular pathways. Sequencing of the small RNA from the trio identified small RNA from different categories including scRNA, snoRNA, microRNA, snRNA, srpRNA, piRNA and parts of other RNA components of the cell, including tRNA, rRNA, repeat sequences as well as exon and intron sequences. Among the RNA sequence reads, microRNA sequences accounted for 70% of the total sequence reads. From these aligned RNA sequence reads, 616, 581 and 591 microRNAs were identified for the father, mother and daughter respectively. Among these expressed microRNAs, 540 were expressed in all 3 members. Pairwise comparisons showed that 67 microRNAs (listed in Table 5) were differentially expressed in the FM and FD pairs but not in the MD pair (Figure 5).
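The same selection logic is applied to genes and to microRNAs: a feature is kept when it differs in the FM and FD pairs but not in the MD pair. The following Python sketch illustrates it using the 2-fold criterion stated for the gene comparisons. It is a simplified illustration, not the inventors' method: the source does not say how zero expression values were handled or whether a significance test was applied as well, so the zero handling below is an assumption, and the function names are hypothetical. The IL1RL1 and MS4A1 values are taken from Table 4; `GENE_X` is an invented control entry.

```python
def is_differential(a: float, b: float, fold: float = 2.0) -> bool:
    """True when the two expression values differ by at least `fold`.

    Assumption: two zero values are treated as not differential.
    """
    if a == 0 and b == 0:
        return False
    return a >= fold * b or b >= fold * a


def differential_features(x: dict, y: dict, fold: float = 2.0) -> set:
    """Features (genes or microRNAs) differing by >= `fold` between two samples."""
    return {g for g in x.keys() & y.keys() if is_differential(x[g], y[g], fold)}


def fm_fd_not_md(father: dict, mother: dict, daughter: dict, fold: float = 2.0) -> set:
    """Features differential in the FM and FD pairs but not in the MD pair."""
    fm = differential_features(father, mother, fold)
    fd = differential_features(father, daughter, fold)
    md = differential_features(mother, daughter, fold)
    return (fm & fd) - md


# IL1RL1 and MS4A1 values come from Table 4; GENE_X is hypothetical.
father = {"IL1RL1": 1.48, "MS4A1": 41.30, "GENE_X": 5.0}
mother = {"IL1RL1": 0.38, "MS4A1": 129.24, "GENE_X": 5.5}
daughter = {"IL1RL1": 0.59, "MS4A1": 105.34, "GENE_X": 4.8}

print(sorted(fm_fd_not_md(father, mother, daughter)))  # ['IL1RL1', 'MS4A1']
```

Comparing `a >= fold * b` instead of dividing avoids division by zero when one sample shows no expression.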
Multiomic features and network analysis
[0103] The inventors used network analysis to integrate these multiomic features and identify key regulatory genes and their pathways. Using information on DNA methylation, gene expression, and targeting by differentially expressed microRNA, two networks were constructed from the pairwise comparison data of the FM and FD pairs. The common components of the two networks were used as the putative T2D gene network. From this network, a subnetwork containing a maximum of 17 of the known T2D GWAS genes was constructed. This T2D-related network contains 141 genes located on the shortest connections with these 17 T2D GWAS genes (Figure 6). In this T2D gene network, 15 genes (listed in Table 6) are the hub genes most used for connection. Among the 32 genes (listed in Table 3) with promoter regions overlapping the DMRs, PDCD1 (Programmed cell death 1) is the only one that overlaps with the 413 genes showing differential expression. PDCD1 showed increased DNA methylation with decreased gene expression. This pattern of change was validated in 65 controls versus 65 T2D cases (Figure 7; Table 7, Table 8) using quantitative PCR (qRT-PCR) and Sequenom EpiTYPER DNA methylation assays. The expression and DNA methylation of two additional genes discovered by network analysis, EIF4E3 and PACRG, were also validated. KDM2B is a chromatin modifying enzyme showing differential methylation (DMR) in the trio. KDM2B is also a regulatory gene for the expression of the T2D GWAS gene CDKN2A/B. Both the expression and DNA methylation of KDM2B and CDKN2B showed significant changes in the 65 vs 65 case control cohort (Figure 7).
Discovery of T2D SNPs from multiomic and network analysis
[0104] Network analysis and multiomic analysis discovered genes associated with T2D. Of the millions of SNPs called in the trio, many co-segregated in the mother-daughter pair. Such SNPs were correlated with gene expression (Table 9) and DNA methylation (Table 10) data, and SNP genotypes associated with a T2D risk were identified. The co-segregated SNPs in the focus gene regions were checked against the results for the 4 million SNPs in the Chinese GWAS meta-analysis, which consisted of 955 controls and 677 T2D cases [10]. Many of them showed significant association with T2D (Table 2) and were chosen for an additional round of independent genotyping in 421 controls and 1144 T2D cases. Two SNPs, rs9998519 from the known T2D GWAS gene WFS1 and rs1534938 from PACRG, a gene involved in cytoskeleton and protein folding, showed significant association with T2D (Table 11).
Conclusion
[0105] By using a multiomic approach to ascertain the correlations of different features in a trio family, coupled with network analysis, the present inventors were able to discover novel loci for T2D implicated in chromatin modification, protein folding and intracellular trafficking. Given the increasing affordability of high throughput sequencing, advances in computational analysis as well as availability of multiple cohorts, GWAS datasets and bioinformatics tools, our integrated experiments have provided a template to use DNA and RNA extracted from PBMC to discover novel loci for complex diseases such as T2D.
[0106] All patents, patent applications, and other publications, including GenBank Accession Numbers, cited in this application are incorporated by reference in the entirety for all purposes.
ADDITIONAL REFERENCES
1. Chan, J.C., V. Malik, W. Jia, et al., Diabetes in Asia: epidemiology, risk factors, and pathophysiology. JAMA, 2009. 301(20): p. 2129-40.
2. Ramachandran, A., R.C. Ma, and C. Snehalatha, Diabetes in Asia. Lancet, 2010. 375(9712): p. 408-18.
3. McCarthy, M.I., Growing evidence for diabetes susceptibility genes from genome scan data. Curr Diab Rep, 2003. 3(2): p. 159-67.
4. Ng, M.C., W.Y. So, N.J. Cox, et al., Genome-wide scan for type 2 diabetes loci in Hong Kong Chinese and confirmation of a susceptibility locus on chromosome 1q21-q25. Diabetes, 2004. 53(6): p. 1609-13.
5. Ng, M.C., W.Y. So, V.K. Lam, et al., Genome-wide scan for metabolic syndrome and related quantitative traits in Hong Kong Chinese and confirmation of a susceptibility locus on chromosome 1q21-q25. Diabetes, 2004. 53(10): p. 2676-83.
6. Prokopenko, I., E. Zeggini, R.L. Hanson, et al., Linkage disequilibrium mapping of the replicated type 2 diabetes linkage signal on chromosome 1q. Diabetes, 2009. 58(7): p. 1704-9.
7. Saxena, R., B.F. Voight, V. Lyssenko, et al., Genome-wide association analysis identifies loci for type 2 diabetes and triglyceride levels. Science, 2007. 316(5829): p. 1331-6.
8. Voight, B.F., L.J. Scott, V. Steinthorsdottir, et al., Twelve type 2 diabetes susceptibility loci identified through large-scale association analysis. Nat Genet, 2010. 42(7): p. 579-89.
9. Zeggini, E., M.N. Weedon, C.M. Lindgren, et al., Replication of genome-wide association signals in UK samples reveals risk loci for type 2 diabetes. Science, 2007. 316(5829): p. 1336-41.
10. Ma, R.C., C. Hu, C.H. Tam, et al., Genome-wide association study in a Chinese population identifies a susceptibility locus for type 2 diabetes at 7q32 near PAX4. Diabetologia, 2013.
11. Gaulton, K. J., T. Nammo, L. Pasquali, et al., A map of open chromatin in human pancreatic islets. Nat Genet, 2010. 42(3): p. 255-9.
12. Bell, C.G., S. Finer, C.M. Lindgren, et al., Integrated genetic and epigenetic analysis identifies haplotype-specific methylation in the FTO type 2 diabetes and obesity susceptibility locus. PLoS One, 2010. 5(11): p. e14040.
13. Toperoff, G., D. Aran, J.D. Kark, et al., Genome-wide survey reveals predisposing diabetes type 2-related DNA methylation variations in human peripheral blood. Hum Mol Genet, 2012. 21(2): p. 371-83.
14. Li, Y., J. Zhu, G. Tian, et al., The DNA methylome of human peripheral blood mononuclear cells. PLoS Biol, 2010. 8(11): p. e1000533.
15. Lister, R., M. Pelizzola, R.H. Dowen, et al., Human DNA methylomes at base resolution show widespread epigenomic differences. Nature, 2009. 462(7271): p. 315-22.
16. Li, J.K., M.C. Ng, W.Y. So, et al., Phenotypic and genetic clustering of diabetes and metabolic syndrome in Chinese families with type 2 diabetes mellitus. Diabetes Metab Res Rev, 2006. 22(1): p. 46-52.
17. Li, R., C. Yu, Y. Li, et al., SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics, 2009. 25(15): p. 1966-7.
18. Li, R., Y. Li, X. Fang, H. Yang, J. Wang, and K. Kristiansen, SNP detection for massively parallel whole-genome resequencing. Genome Res, 2009. 19(6): p. 1124-32.
19. Li, H. and R. Durbin, Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics, 2009. 25(14): p. 1754-60.
20. Ng, P.C. and S. Henikoff, SIFT: Predicting amino acid changes that affect protein function. Nucleic Acids Res, 2003. 31(13): p. 3812-4.
21. Peng, Z., Y. Cheng, B.C. Tan, et al., Comprehensive analysis of RNA-Seq data reveals extensive RNA editing in a human transcriptome. Nat Biotechnol, 2012. 30(3): p. 253-60.
22. Ren, S., Z. Peng, J.H. Mao, et al., RNA-seq analysis of prostate cancer in the Chinese population identifies recurrent gene fusions, cancer-associated long noncoding RNAs and aberrant alternative splicings. Cell Res, 2012.
23. Audic, S. and J.M. Claverie, The significance of digital gene expression profiles. Genome Res, 1997. 7(10): p. 986-95.
24. Smilde, A.K., H.A. Kiers, S. Bijlsma, C.M. Rubingh, and M.J. van Erk, Matrix correlations for high-dimensional data: the modified RV-coefficient. Bioinformatics, 2009. 25(3): p. 401-5.
25. Wu, X. and S. Li, Cancer gene prediction using a network approach, in Cancer Systems Biology, E. Wang, Editor. 2010, CRC Press: Montreal, p. 191-212.
26. Brandes, U., A Faster Algorithm for Betweenness Centrality. Journal of mathematical sociology, 2001. 25(2): p. 163-177.
27. Sabidussi, G., The centrality of a graph. Psychometrika, 1966. 31(4): p. 581-603.
28. Langfelder, P. and S. Horvath, WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics, 2008. 9: p. 559.
TABLES
Table 1 Sequencing summary of the trio.
                                        Father                Mother                Daughter
                                        Bases (G)  Depth*1    Bases (G)  Depth*1    Bases (G)  Depth*1
Genome sequencing     Raw data          96.7       32x        91.9       31x        101.2      34x
                      Filtered data     [values not legible]
Figure imgf000034_0001
Methylome sequencing  Raw data          152.4      51.3x      140.5      47.3x      141.8      47.7x
                      Filtered data*1   142 4 [remaining values not legible]
                      Mapped data*2     96.5       32.5x      77.5       26.1x      88.9       29.9x
MicroRNA sequencing   Raw data          17418802              19351315              15739336
                      Filtered data     [values not legible]
                      Mapped data*2     11643536              13370576              10318337
*1. Calculation was based on a genome size of 2.97 Gb. For transcriptome data, the total coding sequence length was set to 500 Mb.
*2. Adaptor-contaminated, low-quality, and duplicated reads were filtered.
Table 1 Summary of the sequencing results. Over 90G of whole genome sequencing and over 140G of bisulfite sequencing data were generated for each member of the trio. All sequencing data generated from whole genome sequencing, bisulfite sequencing, RNA sequencing and microRNA sequencing were subjected to a QC procedure to remove low quality reads and then mapped to human genome build 36 (hg18).
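Footnote *1 implies that sequencing depth is the number of sequenced bases divided by the genome size. The short Python sketch below reproduces the raw methylome depths in Table 1 from the reported base counts. The function name `depth` is hypothetical; the 2.97 Gb genome size and the base counts come from the table and its footnote.

```python
GENOME_SIZE_GB = 2.97  # genome size stated in footnote *1


def depth(bases_gb: float, genome_size_gb: float = GENOME_SIZE_GB) -> float:
    """Fold coverage: sequenced bases divided by genome size (both in Gb)."""
    return bases_gb / genome_size_gb


# Raw methylome sequencing yields (Gb) from Table 1.
raw_methylome = {"father": 152.4, "mother": 140.5, "daughter": 141.8}
for member, bases in raw_methylome.items():
    print(f"{member}: {depth(bases):.1f}x")
# father: 51.3x
# mother: 47.3x
# daughter: 47.7x
```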
Table 2 List of the FM and FD common DMRs identified from the family trio
Figure imgf000035_0001
chrl 2 24 0 2532"02 320 -1.05 -l.r down dirl "51S 7 7 14040 533 -1.0S -l.OS down chrl 17297219 17297455 236 1.08 Ϊ.Ϊ9 i'P dirl 141224S0 14123204 224 1.13 1.22 ΐφ chrl 62524841 62525284 443 -1.13 -1.04 down dirl 14?4;, 45- 14-437055 54S 1.00 1.ON up chrl 164078594 164078806 212 3.36 2.86 up dirl 2064(14649 200404484 285 1.14 1.20 up chrl 215153254 215153703 449 1.10 1.24 up dirl 22 -303S 2275^3523 4S5 1.08 1.00 lip chrl 246166806 246167654 848 1.43 1.86
dir2 45S65S 45S434 2-0 1.07 1.07 up chr2 7782836 7783104 268 1.05 1.21 i'P dir2 211210SO 21121324 244 1.05 1.03 up chr2 37085602 37086100 498 1.24 1.31 up dir2 4O5S720O 405S741S 212 1.58 1.00 lip chr2 70555952 70556161 209 1.14 1.04 _J»IL_ dir2 10 - S647 105^54021 9-4 1.13 1.21 up chr2 106570233 106570535 302 no Π6 i'P dir2 112350S10 1123 1207 347 2.22 1.50 up chr2 1300055 IS 130000303 S4^ -1.32 -1.15 down dir2 14«i-lftll0 144710545 542 1.26 1.04 up chr2 167174820 167176108 1288 1.08 1.66 _J»IL_ dir2 230 19S09 2303200-0 201 -1.32 -1.32 down chr2 241493598 241493802 204 1.03 1.06 up dir3 278180 2782020 214 1.00 1.20 up chr3 044S5053 (,t)4K5404 ^l l.OS 1.04 up dir3 "2474541 724801 SI 540 1.45 1.32 up chr3 141322442 141322837 395 -1.58 -1.87 down dir3 lc)-S(|i)4 S 147S04S4S 390 1.40 2.07 up chr4 9629836 9630267 431 U)6 ΐ- 1
dir4 38187472 3 1S7~20 248 -l.OS -l.OS down chr4 55000809 55001457 648 1.43 1.53 up dir4 10W 10433 104410723 240 1.30 1.30 up chr4 147773215 147773660 445 1.22 1.14 _J»IL_ dii-4 105S055NS 105S05474 341 1.04 1.34 up chr4 182668242 182668877 635 1.20 1.02 MP dir4 18641300') 1S0414272 003 1.74 1.12 up chr4 186762687 186763063 376 1.12 1.09
din 1155445 11 50^5 230 1.00 1.05 Up chr5 3156310 3156774 464 1.05 1.05
din 111S55S05 111S50244 384 2.14 1.72 up chr5 111903540 111903937 397 2.04 1.69 MP din 1 484 15054401 075 1.44 1.53 up chr5 180018484 180018784 300 2.06 2.22
dirO 250SS4S 25042P 314 1.00 1.20 Up chr6 33980643 33981169 526 -1.65 -1.87 down dirO 30l942l)S 30104010 12 1.15 1.00 up chr6 37817757 .-"SlSl^l 434 ill 1.44 u dirO 420130(14 42013252 248 1.00 1.32 up chr6 48144802 48145295 493 1.00 1.38
dirO ΐ)Γ009Ν7 Γ01 4" 300 1.00 1.00 Up chr6 131613039 131613256 217 1.06 1.24 _J»IL_ dirO 150758034 150"5%3K 704 -1.00 -1.00 down chr6 158815623 158815945 322 1.27 1.07 up dirO 164715221 104 1 027 400 1.09 1.30 up chr7 1945557 1945918 361 1.18 0.95
dir" 2020322 2026543 221 1.05 1.10 Up chr7 2026336 2026566 230 1.05 1.05
du-7 3153139 315351- 378 1.04 1.20 up chr7 5083914 5084118 204 1.13 1.00 up dii-7 ~002342 7602573 2 1 1.2S 1.44 up chr7 27143176 27143579 203 1.21 1.05
dir" 21>1S4327 201S46S3 350 -1.27 -1.05 down chr7 38315887 38316125 238 2.42 2.00 up du-7 520535" 52053S3S 201 -1.21 -1.40 down
Figure imgf000036_0001
chr8 1443332 1443655 323 -1.10 -1.10 down dirS 1 "553 11 300 1.32 1.20 up chr8 54158684 54 59076 392 -1.31 - 1.00 down
L'lirS 99092W7 09093223 310 1.0- 1.20 up chr8 103681814 103682114 300 1.05 1.0
dirS 1037430W 103744 73 N3 1.04 1.3N ltP dirK 1 4"WS1 1 4S0012S -I.-4 -1.04 down dirS 1 "0%242 13"0%405 253 -1.32 -1.15 down chr8 142624198 142624654 456 1.00 1.07 _J L_
L'lirS 143N00504 143N0723O 0"5 1.00 1.12 up
Figure imgf000036_0002
d <) 152K^K20 152S4025 205 1.05 1.05 up dirl> 12420S 107 12420S040 833 -1.5" -1.12 down chr9 135390134 135390330 196 1.06 1.05 _J L_ dlilO 1401 Γ7 1401 1 354 1.10 1.14 up chrlO 1675605 1675846 241 1.07 1.07 _J&L_ dirK) 1704~% 1705002 200 -1.03 -1.05 down chrlO 3586413 3586747 314 -1.15 -1.15 down dirlO 235 17S1 23532257 4"0 0.93 1.00 up chrlO 23532652 23532908 256 1.00 1.11 i'P dirlO 3304^030 33647903 20- 1.05 1.05 i'P dirlO 45415837 45416085 248 1.66 1.03
dirK) 710271 IS 71027572 454 1.00 1.00 up chrlO 75404348 75404613 265 -1.40 -1.12 down dirlO Ν03( 359 S03670S2 323 1.00 1.12 p chrlO 80430872 80431071 199 1.04 1.08 _J»IL_ fhr 10 05232010 303 1.51 2.12 up chrlO 104184393 201 \A3 1.13 ¾> child 1141301ft5 1141 0530 374 1.52 1.52 up chrl 6 119483997 119484297 300 1.00 1.14
chrlO 12ft33ft3ft7 12ft33ft503 22ft 1.10 1.15 Up chrlO 129679019 129679273 254 1.05 1.05 _J»JL_ chrlO 1 1ft4502S 554 -1.00 -1.1ft down chrll 6054942 6055284 342 -1.02 -1.47 down chrl 1 102501ft1) 10250444 275 l.OS 1.25 up chrll 65403818 65404312 494 1.24 1.05
chrl 1 110ft ft4SO 110ft ftS07 31S 1.24 1.13 Up chrll 112618724 112619284 560 -1.58 -1.30 down chrl 2 40SftO23 40S7]ft7 244 4.Sft 5.00 up chrl2 19827764 19828029 265 1.02 1.05 up chrl 2 52ftftl lSS 52ftftl47S 200 1.07 1.00 up chrl2 74071092 74072128 1036 2.37 1.74
chrl 2 0 1ft2302 0 1ft2710 31S 1.11 1.14 Up chrl 2 99634239 99634611 372 0.96 1.04 up chrl 2 120204003 12020034ft 343 1.14 1.14 up chrl3 18847052 18847261 209 1.29 1.10
chrl 4S004011 4S004553 542 1.00 1.Oft up chrl 4 22842253 22842456 203 -T O -1.00 down chrl 4 2311152ft 23111774 24S 1.0" 1.04 up chrl 4 24020632 24021014 382 1.03 1.03 up chrl 4 315200N1 31530353 372 -1.3- -1.37 down chrl 4 49619439 49619992 553 -1.04 -1.22 down chrl 4 5032Sft09 5032NS3S 220 -1.05 -1.05 down chrl 4 51580080 515S0451 371 no L06 up chrl 4 ft5473l> 1 ft54742S5 334 -1.00 -1.05 down chrl 4 Sft4"02ftO Sft4"0540 2 SO 1.32 1.Γ up chrl 4 90014202 00014533 331 1.14 1.26 up chrl 4 105066551 105066963 352 1.12 1.22 _JJIL_ chrl 4 105600101 105ft004S() 0 -1.2ft -1.50 down
49278173 402"S541 368 1.28 L34 up chrl 5 S112S03~ N11202Sft 340 -1.07 -1.02 down chrl K400ftO"ft S400"10ft 220 l.OS l.OS up chrl S "37 140 S7375442 302 1.20 1.00 p chrl 5 92669570 92669911 341 1.36 1.11 ,_ chrl 2523224 2523520 305 1.ft5 1.04 up chrl 6 3008436 3008773 337 i 0 L09 _J&L_ chrl ft 40S5S71 4()Sft2Sft 415 1.14 1.14 up chrl 6 31078350 31078815 465 1.00 1.66 up chrl ft 32034405 3203471 21S -1.13 -1.00 down chrl 6 32034713 32035184 471 -1.81 -1.58 down chrl ft 73S-OS22 73S71050 23" 1.10 1.07 up chrl 6 84876755 84877006 251 1.09 1.13
chrl? (>205ftlO ft20503ft 1.13 l .oo up chrl? 6496186 6497030 844 1.15 1.20 up chrl" 700541ft 700 7 1 3ft5 -1.30 -2.S1 down
[Table 2, continued: the remaining rows are not legible in the source.]
Figure imgf000038_0001
Figure imgf000038_0002
Figure imgf000038_0003
Table 3 List of genes with promoters overlapped with the DMRs
Gene ID  Symbol  Cytoband  Strand  DMR Methylation  Father Expression  Mother Expression  Daughter Expression  Father Promoter Methylation  Mother Promoter Methylation  Daughter Promoter Methylation
58"6 nun , am I p 1 lip 30.82 3.1 32.83 0.219 0.256 0.257
43170
LHX8 lp31.1 + Up 0.00 0.00 0.00 0.137 0.213 0.272 7
233 1 i-uon l q32 i 'p 0.01 0.00 0.01 0.613 0.632 0.62 1
5778 PTPN7 iq32. i - l'p 17.04 19.47 15.41 0.097 0.158 6.153 <S .won 2p24-p23 Up 0.00 0.00 0.00 0.4S6 0.611 0.632
5 1 33 PDCD1 2q37 - Down 0.45 0Ϊ90 1.31 0455 0.334 0.345
6N9" TARS 5pl .2 l'p 22.30 24.91 21.77 0.135 0.Γ7 0. 187
9443 MED 7 5q333 - Up 1 1.72 1023 10.48 0.287 0320 0355
44221
(.pi 2.3 I ' 0.00 0.00 0.00 o.oo o.oo 0,00
9465 AKAP7 6q23 + l'p 709 6.64 7J4 0.506 0.570 0.567
I03"0 ( m:i)2 <iq23.3 l'p 3S.74 5 .-(. 4'.78 0.025 0.034 0.039
54919 HEATR2 7p22.3 + Up 3.50 4.02 4.38 0.323 0.445 0.383
515- PlX il-RI. .Sp22-p2l.3 Up 0.00 0.00 0.04 0.17S 0.241 0.273
79754 ASB13 0pl5J - Up 4.30 .20 Ϊ93 0.124 0.148 6.172
"9004 ( i:/x 'J IOq24.32 lllllillllii|| p I .S9 17.24 I6.S6 0.2SS 0.356 0334
44004
SLC22A20 l lql3.1 + Up 0.04 0.05 0.06 0.454 0.488 0.525 4
Κ4(Γ8 KD U2H I2q24.31 Up 14.16 15.79 14.54 0.125 0.164 0.202
Figure imgf000039_0001
TNFRSF12
51330 16pl3.3 + Up 1.35 1.13 0.83 0.404 0.445 0.450
A
'5201
mWiSiiii ( ΈΜΡ1 I6 l 3.3 1111111 I 'p 2.3') 4.06 4.22 0.012 0.016 0.020
23587 ( Tur/ l ΓρΙ3.Ι + Down 12.15 y.so (l.l 15 0.(Γ3 0.059
I 4--
(Τυ 4 I7q23.2 IIPIIII I )o\\ n ().()() 0.06 0.00 (1.22(1 0.15^ 0.169
14746
ANKRD29 18qll.2 - Up 0.07 0.05 0.14 0.339 0.395 0.381 3
021" RPSI |yq 13.1 I'P W)."5 1631.30 I545.5S 0.303 0.366 0362
14787
CCDC155 19ql3.33 + Up 0.03 0.00 0.02 0.502 0.556 0.650 2
I406S 20ql .l3-
( '/!/. -f O.dO 0.00 O.dO 0.203 0.26S 0.265 l 3.33
Figure imgf000040_0001
Expression levels are given as RPKM (reads per kilobase per million mapped reads); methylation levels are given as the proportion of mC in the region, with 0.0 being no methylation and 1.0 being 100% methylation.
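The two units used in Table 3 can be made concrete with a brief Python sketch. The RPKM function follows the standard definition of reads per kilobase per million mapped reads. The methylation function assumes the proportion is computed as methylated cytosine calls over all cytosine calls in the region, which the source does not spell out. Function names and the example numbers are hypothetical.

```python
def rpkm(read_count: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase of transcript per million mapped reads."""
    return read_count * 1e9 / (gene_length_bp * total_mapped_reads)


def methylation_level(methylated_calls: int, total_calls: int) -> float:
    """Proportion of mC in a region (0.0 = unmethylated, 1.0 = fully methylated).

    Assumption: the level is methylated C calls over all C calls in the region.
    """
    if total_calls == 0:
        return 0.0
    return methylated_calls / total_calls


# Hypothetical example: 500 reads on a 2 kb gene with 10 million mapped reads.
print(rpkm(500, 2000, 10_000_000))  # 25.0
print(methylation_level(30, 100))   # 0.3
```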
Table 4 FM and FD differentially expressed genes in the family trio
Gene ID  Symbol  Cytoband  Father expression  Mother expression  Daughter expression  Change
777 f u \ in. 1^2 -^ 1 0.2S 0.06 II.OS Down
Figure imgf000041_0001
(.11')- h'i >h'( lq21 1.10 2.89 4.2- up o " \( l 1 U|23 4.00 1.81 l.o- Down
8991 si 11 \/ /7 lq21.3 (1 ~- 1.84 Ί.Ί4 Up
')f,if, Λ(, 1 'fi. i3 -.S') 21.32 36.1 Up
')(.-: S/)f { lpter-p22.3 (l.id 11.-4 o.')2 up
1 158 rn/kiiri l 33 2<).')(, 4.')s Down
1 M I4 kll 2( 1 p34.1 ιι.-') 11.» 11. is Down
23127 (,/ i:*n: li|2- X il.f." o.')2 Down
:--')() ( ( IH l>> K|22 1.79 11.- 1 0.-2 Down
2718 nis( : K|42.1 o.3o U.l'l 0.12 Down
Figure imgf000041_0002
--')24 CI or/ 1 1 pi 3.2 ').')(, v')- 4.15 Down
79368 1 ( Λ72 U|21 3.-4 14.25 11.(.2 Up
-')(,»' 1 Ml 1 p34.1 I). \A 1.13 1.20 Up
-')f, (i in \/>* lp. , 11.34 1.40 1.24 Up
-')-(.: Cl or/115 K|41 II.-4 2.IK. 1.97 Up
8341 1 ( Λ7 ? U|21 1.19 4.53 lllllillllliiliiilili Up
S4MI') ( Λ'( )( ( // lp36.13 8.91 2 S4 22.75 Up 4S24 ICRI 1 U| 3 AM 14.')2 IO. Up
11(.4')(. / 1 \ii:>> 1 k|2- (.').» 2').2~ iil.-- Down
I ~ 541 CimjI73 lp 1.1 0.17 0.0- O.IK. Down
2-o4i- s/c.(.1/ \ If * 1 p31.1 II.') ί 1.99 1.93 Up
2"1()1 l 36J 1 (Ι.')ί, 3.12 lllllillllliiliiilili Up i V)4s i l// I //»" / lp35.1 (l.dS o. , 0.22 Down
348378 / i\ii?<n 1 p32.3 :. ') 5.-0 4.88 Up
}-4')4(. Clor/l 7 lp36.22 2. ι- o.-(. (l.dll Down -')() l(,l \ 1 p'fi. i3 0.18 0.4 ί Ο.ίΛ Up
388698 ικ,: lq21.3 II. ι4 ii.l(. 0.14 Down
785 U Mil -XI 1.1.3 0.40 0.4- Down
Figure imgf000041_0003
14')(. ( i\ \ i: 2p 12-p 11.1 Ο.2ο o.ol o.ol Down
15')3 ( )/':-·!/ 2i|33-i|i r (-.9- 18.67 17.74 Up l(.-4 /)/ s 11.114 il.(.- 0.-4 Up
249s / III! 31' 2p23.3 ".4- ! .(.- 1.0(1 Down
2817 (,l'( 1 -Ί^-'Ρ" 11.114 0.20 o.2(. Up s- H,i nr: "4 14D o.U, 0.11 Down
3577 II »' 1 -4.(.(. Oi 17.27 Down -') // v f i 182.14 -3.91 1').0- Down
4DS4 MXD1 2pl3-pl2 (.4.SS i|.')o 2').-- Down
5133 Ι Η /)/ 2l|l".l 0.4- O.'XI Ι Ι Up
5212 1 // 2p22.2 ().(,- 0.21 o.lo Down
"κ-ο // /Λ·: 2 l 15.23 6.41 lllllillllliiliiilili! Down
9173 IL1RL1 2ql2 1.48 0.38 0.59 Down j ^ ΙΚΙΊΙ! 2q21.2 U.II5 D.ill 0.01 Down
-')-S(, ( lll'l 0.21 il.(.- II.-- Up
(> 2p23.1 0.53 0.1 0.14 Down lllT 1 ι 2q21.1 10.14 24.(0 i.4s Up
S0 6 \\ \i ι 2q3- • 1.42 0.94 0.')2 Up
84913 Ι/Ο/ 2pl 1.2 ί 0.4- 0.4- Up
9 1 4 2q24.: <).3(. o.io 0.12 Down
1 :>)(.4:
Figure imgf000042_0002
2p25.1 s.24 3.13 i.(.') Down ΐϊιιΐι. ( :,>πίκΐ 2 l6.1 <).-<i 1.72 1.52 Up
130271 /'/ / kirn: 2p21 0.15 II. id 0.4 i Up
44090- l.O HO'HI? 2q21.1 .111) 0.35 0.21 Down
( -s4 1 ι ι 2q21.1 4.S2 14.57 13.51 Up
'Ml u|l 3-q:i • 1.24 o.-S 0.-2 Up
1359 n|2l-q2- 27.23 -.4' 10.81 Down n \ isuij .^1 . 2.91 1.12 1.32 Down
:i)4_ 1 rii/;i n|2l-q2i 1.33 o.i4 ll.-ll Down
3568 II 5Λ' 1 i 6- 4 3. 1.21 1.34 Down
Figure imgf000042_0003
-4-0 (,Λ' 1 l//)/f |l i.i| oi ().')- ii.s- Down
"-ill A7 ! 1/2" 3q21.3 II.W 0.21 0.18 Down s4:» lll'l i I 1 |2') (l.')S 0.41 ii.4s Down
I14.SS4 osnri iii 3p22.3 1.35 4.M 4.01 Up
132014 II 1 "Λ7 3p25.3 II.HI O.ii 0.-2 Up l-l«lf ¾ u|29 II.Kd 1.K 2.32 Up
152189 f Ml ΙΛ 3p22.3 II.» 1.80 2.IK. Up
Figure imgf000042_0004
339855 k > |22.2 II." o.l') o.l') Down
344-S- /\l \Ml 3p23 11.65 1.6(, 2.00 Up
Figure imgf000042_0005
SI OP 4pl(. 1.37 8.32 4.41 Up
7222 in re j 4q2~ 11.1.3 0.119 0.09 Down
S4~il 4qi-.l il.li. Ί 0.02 Down
11107 /7ί/ΜΛ~ 4q2--q26 Il.fi«) 0.1 II.IIS Down
:-^(K, ///'(. /)S 4q22.i 1.58 0.4') o.l,- Down
~')633 / 1 / / 4q2S.l 0.51 O.K. 0.20 Down
Figure imgf000042_0006
i oo // 4pl3 0.18 II.-- ii.4s Up
1879 /.7.7 / -q34 il.4(. l.-i) 1.21 Up
4131 i/ in/; -q 1 i 11.42 0.14 o.l(. Down
5159 ΙΊΗ,Ι Ki; q33.1 v-2 1.43 1.04 Down
(,-(,') i( mi 11.44 0.14 Ί Down
10 Γ HIM J -q35.3 2.71 0.00 0.00 Down
11346 s) \l'() n.ii, 1.20 1.25 Up
231 8 \·//.7'.·ϊ 11.14 0.44 0. 1 Up
-(,|(l(l i'( ni /(,/;(> -qi| 11.24 Ί Down
-Mill PCMICH! -q31 11.1 o.o: 0.03 Down
56105 pc ni /a in 5q31 0.12 0.03 0.01 Down -d|0(i f'( PI HI III) -q 1 0.3(. 0.03 0.02 Down
134285 TV1HV1171 -qi v: 11.24 ().')- 1.67 Up l-i-^) ni \!<J -qi^-' 0.12 0.')') l.ol Up
383 \Rdl |2i 1.03 i.i4 i.llll Up
13011 |21 -^22 D.S3 o.3s 0.34 Down
3117 in 1 n \i 6p21.3 -').(,(> 31.48 18.24 Down
3127 in \ nun ti Z 1. i :iiii.x3 4^.')5 i').2S Down
3139 III 1 / 6p21.3 1.26 i.~i i.(l') Up
464(> 1/}( 6q 1 3.3D 1.33 O.s- Down
-Id') / M'l'i i.T !.(.') 1.88 Down
5190 I'l \r> 6p l.l 3.4( -.4 Up
-->)(, I'll'Rk 11.41 il.') 1 l.oo Up w.4s son: nq2 .3 301.34 1 0.45 1 ').49 Down si4~ nisi II /:/;( 6p22.1 19.07 Mi') 9.21 Down ios ( /> 6p2 i 1.4') ΐ,ι -.4') Down
')S-(, hi 1 \t)il>) 6p22.3-p22.2 11.4s o.lo 0.14 Down
102 1 in {\: u 12.3 D.3(. 0.-4 0.94 Up
10279 I'RssU, 6p21 0.11 o.i- 0.4- Up
CRlsl'3 i» i:.3 2.4( 6.-0 .- Up
2o--- R(, r |2-.i I)/) 1 0.40 0.2') Down
ι v/ RSI ;/ 6p21.1 2.17 o.'3 l.os Down
55510 /)/)\ is oql -ql i 1.83 ill 0.-2 Down
80129 ( f ,rnr (u[25.1 2.')4 1.34 l.ii Down Ι Γ iK (,:/■■ 6p21.3 n.-s . i" id Down lli>3(.9 .s/.r ) is 6p21 1.79 0.51 0.-11 Down
221481 C6or/HI 6p21.31 0.31 (l.')S Up
221687 R\I is: 6[t O.IKi 0.31 0.2') Up
221711 s)( r:i (.|i24.: .') 1 0.18 0.2 i Down
285852 IRI. Ml 1 6p21.1 • il 0.25 o.i- Down i4ol4(. si( |2i.i ii."- 0.15 o.i2 Down
165 1/ ΒΙΊ 7pl3 1.30 i.ii 2.(4 Up
358 \ori 7pl4 ill ii.ns 11.11- Down H41 LP it 11 ~q34 1.93 4.43 4.03 Up
IIO\ I/O 7pl5.2 2.21 0.81 0.4') Down i6')6 II diss 7p2l.l 1.78 0.-- 0.-1 Down
4S4(. \OSS 7q36 0.20 o.-s 0.4 i Up
8972 \H, WI -\|34 17.73 8.23 -.32 Down
10135 \ 1 I//'/ -q .i 21 ').(.') (>2.(.s 68.21 Down
51 51 /\i ir 7qlL l 17.11 (1. 4 6.-0 Down
(.41H1 IRR( 1 -qil.i 5.51 2.12 2.71 Down
~')6S9 S/l t/'l 7q21.12 1.03 14.-(. 14.4- Down
91584 I'l \ \ 1 / 1.13 o.-i o.-i Down
')' Ϊ HH 'iSI ?; o.is 0.00 O.Od Down
221981 Ills/)- 1 7p21.3 0.18 o.ol O.IKi Down
259286 / is: RHi -q34 0.-2 o.ir 0.14 Down
:s(,( i(, C7 or/53 7q31.1 5.77 1.87 2.i4 Down
401431 iO( mi i -qi6.1 0.40 1.01 .N4 Up
4412-1) 1 ) it ii; 7ql 1.23 1.01 3.78 2.4- Up mm MM: ΚΗΊοοι:κ>'>: "q34 0.5S 0.11 0.10 Down
1807 /)/') s Sq22 0.4') 0.02 0.02 Down di R i: 8p l.3 0.24 0.64 0.52 Up
4s-o \Ol Sq24.l 1.80 O il.(.- Down
S-94 I M R.SI IliC 8p22~p21 54.45 21.09 1 .(,(. Down
9639 ARHGEFIO 8p23 LOO 9.53 7.84 Up 51312 Λ ί - .51.¾ 8p21.2 51.11 10.78 21.34 Down
-')(.(.!) rri'IRW 8p23.1 1 .00 -.20 -.4i Down HIII- 1 >< H A 5 8p21.2 22.H8 0.42 10.-0 Down
Si(,4S / 1 1 8p23-p22 0.2 i 1.38 1.07 Up s-453 /S7')/ 8q22.1 1.73 O.S1 0.-- Down
^ - 1 (id \ Ι/'Λ'// i.o4 0.-4 io.:o Up
114X22 A7//'\/ Si[24.'5 0.34 1.14 1.18 Up
137075 f / )\2> 8p23.1 1.32 II.-- 0.-0 Down
2ο3ο.-4 f/>( A.' 8^24.3 1.01 :.-o 2.10 Up
:-4"s Hor/46 Hq 13.1 11. 1.24 O.S4 Up
:st.o46 \AAY. 8p23.1 0.00 0.10 0.12 Up
286122 (¾r/i/ 0.11 o.so o.-i Up
216 //./>/// 1/ 9q21.13 32.00 Ί().:Ί 72.78 Up
-0(l4 OR Ml 4^-4^ 1.01 4s i.:o Up r.oi RI \: p24.1 1.26 ;.: :.-- Up
-old II k 0p21 o.lo 0.00 0.00 Down
~43(. itniR 24 0.01 o.2(, o.:4 Down
8013 \ Λ' / 1 ί 0.2- 0.00 0.11 Down
'«Ϊ1 I// A o i ;.: 0 - .iO o.:o Down
10811 \(>\ 1/ qi4.i 0.71 1.53 1.50 Up
11 M 4 ί «Λ„·7- 0 34 0.55 1.17 i.:3 Up
2 i i4 Kl A A 1045 Op 13.3 II. ill 0.00 0.0- Down
-r>f«54 M'lH 1 34.i ii.ftS 2.57 1.0s Up
-S400 /\i if,: Mii.: o.ol O.IKi 0.11 Up
->M}~ ( \ l \ il'J OphVl 4.33 0.51 o.(-: Down Γ, Ι /7H /)//(,: ')p24.2 (I.- i o.:s o.i: Down
SOKOI IJ/W?/ 4 .I I 0.6 i 1.42 1.34 Up l4H li IRI'\H> 04: 1. n 3.31 1.18 1.28 Down
158228 ( <,„■■/ Op 1.3.1 0.14 o.s- 0. 1 Up
: .^4^ Vor/150 0p23 ii.li. 0.00 o.ol Down
387328 /\i {::» 0.:: o.ol 0.00 Down
4di -(.: K \l 1 qi4.i ().-(, 0.15 O.Od Down
414332 K \ IH ,34.3 O..SO 3.81 3.28 Up
4414-s \A' IA7' qi4.i (I.- i o.:o 0.20 Down
6-44<i6 k<,iir: 9p 12 o.:o 1.01 ·)..-: Up
Figure imgf000044_0001
150 \DR i: 1 Kil|:4-l|2(. 11. i 0.11 0.12 Down
1305 ί < >/ /.* ! / 104: l. o o.:i 0.14 Down
5328 /V H 104:4 ().-(, o.:4 0.24 Down
-654 111 R 11 104:<.. 0.54 1.53 1.01 Up
-0~0 Rl 1 10 11.2 11.:- 0.10 o.lo Down sf.44 IkRH s I0pi5- l4 31.32 13.12 8.08 Down
Figure imgf000044_0002
iiS-0(, 1 Op 12.33 i.41 (1 - 1.27 Down ί < :n:n 104:4.1 0.00 0.40 0.41 Down
Figure imgf000044_0003
931 MS4A1 llql2 41.30 129.24 105.34 Up Hql2.1 4(..i: l(..( :o.o: Down
:ooo
Figure imgf000045_0001
1 lq22.3 11.11: 0.- o.:~ Up
///;/ / 11 pi 5.5 0.60 0.00 Down
Figure imgf000045_0002
5553 /'/»'(.: 1 lql2 n.:s 1.11 0.00 Up ^ 0 /7/W 1/· l l l 3.3 3 i>fi M.-T 73.73 Up
(,')4_ TCNl 1 lql l-ql2 18.11 5.10 <,.o- Down
I I /I ΙΚ|24.: -.XS vf.9 19- Down
27087 n , iii 1 lq25 v(.4 3.10 Down
-4 1) /mi ii( n ί lpl 5.1 24.15 v:<> -.S6 Down
S4^i4 NUDT22 1 lql 3.1 4/ : 0.S4 9.85 Up
S4(.4 mm: l lql v- l .Xf, o.:9 8.88 Down
116071 inn : 1 l l 3.1 0.-0 1.02 I.Oh Up
117195 \IR(,1'1<\< Hpl5.1 0.09 o.f.4 0.4: Up
120071 GYLTLIB 1 lpl 1.2 0.-- l.oo 1.91 Up
14-19 v <;«/( / t lpl 5.5 :.fi4 o.V, D.-O Down ,s44i) l\(W 1 lpl 5.5 3.31 7.82 -.00 Up
341208 /// /·/// / llq21 o.os o.ol 0.00 Down
41 if f \: 12q 12 0.12 0.-0 n.-> Up
94 in mi 12ql l-ql4 0.21 0.-4 .-f, Up
12411 f 1/Α/Λ7 i:q:4.i 17.85 s.^o ~ Down
1 (.34 in A 12q21.33 0.-0 o.lo 0.0- Down nn\n 12pl 1 0.71 4.21 ~4 Up
1X40 ηι\ι 1 : :4.1 ¾ o.-> :.r.o 2.35 Up
3741 ki \ i* 12pl3 o.^4 11.11- 0.04 Down
4060 1 ( \j i:q:i.3-q:: 0.50 Π.ΙΓ. 0.17 Down
8082 SS/'\ 12p 11.2 o.2(. o.s4 1.05 Up s4V SfM : 12ql3.13 0.09 0 ¾ 0.4S Up
8843 (,1'KliK)/; ll M.^I 14.19 5.19 4.11 Down
0N9I \l Ikl i:q:3.3 2.X* o.- 0.- Down
11247
Figure imgf000045_0003
12q 1 .3 0.00 o.:: 0.:- Up
:>4u. 12 l DM <).-> .i.S Up
/ ////// // 12pl2.1 o.:<. 1.10 o.~- Up 4i.«)s f i/'s: i:qi4.i 0.-4 o.i: o.: Down s4-:- s/'s/: 12p 13.31 o.:~ :.-o 1.40 Up
121355 (.7.S/-7 12ql3.13 14.:9 >.o¾ Down
:^4:o 1111 <>\ 12p 13.2 2.51 1.15 0.-0 Down
"S442 (,1'KlW t 12q24.31 10.04 ' r 2.77 Down 287569 NCRNA00173 i:q:4.:: (..4- 2.23 2.82 Down
η,πι 1 \ 4 o.:> 0.(1: 0.0- Down
10257 l/.'f f / 1 - 7.17 14- v:o Down
ID56: (>// w/ 13 l4.^ :.»- 6.04 -.4- Up
Figure imgf000045_0004
1 (.')■) an 11 14ql l.2-ql3 o.<>: 1.86 2.27 Up
9787 nn.ir? i4q::.3 or: 0.:: DM Down
(Wii>(,: 14ql 1.2 00- 0 -- 0 -4 Up
51016 1 Wll^ i U i i.: 2.18 4.-6 5.17 Up
I (, l/\ i4 ^:.: 0.00 0.17 0.17 Up
->>s:o f il /'JRH i4 ^:.i: o.:o Ο.ί,- Up
Figure imgf000046_0001
(.44MI ki ill :^ l-q2V. 0.31 0 -0 0 so Up
I4-X64 inn w l-i|26.1 2.58 6.-s Up
:4( ~~ s/7 s/7 1\|I' 0.4s 0.00 0.02 Down
OiO ( IV<> f6pl 1.2 3.16 10.42 I0.3- Up
1632 IH 1 1 p 1 .3 o: (.22 (. -2 Up
4-o2 \ii: 1 fu| 1 Ϊ 12.21 24.SS 2S.-0 Up
4' 1 i \ llll 1 1 p 13.3 0.88 2.14 2.10 Up
8912 ( H \ llll li.pl 3.3 0.06 0.24 O.K. Up
10101 \i i;r: 1 p 13.3 7.88 (. s4 Up 1(10- winun: n. i . 0.-- 1.62 2.2" Up
51327 IIISI' 16p 11.2 1.44 4.0- v~ Up
-45-0 \i( in: l6q2V. 0.-6 0.31 0.21 Down
-4~(.s
Figure imgf000046_0002
1(H|22.2 0.0: 0.12 0.11 Up
58189 /// /)( / 1 i|24. i 0.20 0.00 0.00 Down
(.»:: ( llll 1 p 13.3 0.1,0 1.40 1.4. Up o41 so /)/·/ r.i I(H|22.1 1.42 0.1.2 0.- Down
(,4^(. \1\11·: 1 p 13.3 20.:o 8.75 0 S(, Down
(,-ooo 1 niri-i h. l . 1.56 3.38 4.62 Up
Figure imgf000046_0003
84331 1 1 1 1 p 13.3 o.so 2.12 2.22 Up
S4~0h (,/·/: I6ql2.l 0.20 O.h- 0.-0 Up
114084 11 11: 1 p 13.3 2.22 5.14 4.4(. Up i i6o: afmr/75 16pl3.13 0.40 1.14 I.oo Up i:vo4 \R\ II 16q22.1 0.24 1.00 14(. Up
2224s- (il'RT 16q21 11.21 -.(16 3.55 Down
2(.o4:o /'A'SS 1 p 13.3 o.oi o.io 0.2- Down
24- in)\i:i>: 17p!3 0.40 0.20 0.21 Down
Figure imgf000046_0004
0-4 ( Π'ΊΙ,' 17.28 ^(..-4 i(,.-(, Up
000 (Ί Λ 1 ~i.|21. 0.-5 0.34 o.i- Down
2-s4 (, 11 hi Tq24 1.04 4.01 4.0- Up
3050 /.(, i/.WH' I "q25 5.60 12.05 U.83 Up
-020 r:i<\ 17p 13.3 (..11 23.01 17.52 Up
0- 1 sic in |-q2l-q22 1.4- 3.0- Vos Up
S(,V) !()( 17q21 1.61 0.-- D Down
0-20 i - m \ I7pl 1.2 o.2(. 0.1.0 l.os Up
10610 iM, 1/ \ if : 17q25.1 4.11 1.50 1.08 Down
I -50 (,Λ' w I7pll.2 4.03 lii.o- 12.oi Up
2^40- / \ / A's/ / W 17p 11.2 0.0: 3.87 v2- Up
-165- Kisni 17pH.2 O..SO 0.3S 0.33 Down
57332 ( n \ s |- 2-.i o.-(. !.(.(. 1.01 Up
79148 \i\ir:s 17q21.1 o.>- 1.59 O.OS Up
91107 TRIM47 17q25 0.48 1.55 1.08 Up Ι1.ιθ2(. I'lCD 17q21.31 i).2(. «i.5(i 0.54 Up i(.:4(.(. I'llOSl'IIOI 17q21.32 13.53 4.~- 4.29 Down u r. ax ΙΛ:ΛΙ: I7pl 1.2 o.r 3.>(. -.09 Up
:s4il4_ ( ( in 11 in 17p 11.2 II.2H 0.51 0.S2 Up sosw, ISM .{ 18qll D.II ll.lll) 0.01 Down
Figure imgf000047_0001
828 ( l/'S 19pl3.3 2 -- (..(.1 5.51 up
Til t iro 19pi 0.17 1.10 ii.-o Up
( ir>) 1 19ql3.2 17.19 51.42 49.-- Up
KIS4 (/. u -MM i Mi : 13.20 v41 -.-() Down
:Μ / / in: 19ql3.1 -.i i S.di 9.-4 Down ιΜ« hi kl 19 I . 11.14 1.46 0.9- Up
411- 1 CYP4F3 19pl3.2 11.67 4.9- v09 Down
(.141 Λ77 19 i 233.91 491.10 rr 1.41 Up
Figure imgf000047_0002
I004- s//:/){ 1 19pl3.3 2.(.i -.10 Up
I0.-9- IR IPP( in 19 I V4 1.70 4.111 1.91 Up
Figure imgf000047_0003
-(.T 1 ( 1 l( l \ll<> 19q 13.31 11.4 ^ 1.03 1.02 Up
-~4o9 i'wiu: 19q 13.32 ii.ii: 11.23 0.12 Up
59285 ( K \r,r. V \ 1 4 1.25 I), , 11.4s Down s4 (. iiknir I pl .3 1.67 3.9- 4.99 Up s4o-s 1 Ι/Λ'.' 19pl3.1 25.77 9.78 10.77 Down
89858 sn;u.( i: l')q!3.4 11.3" 2.28 1.30 Up
90011 kin an ι |9M| 42 2.18 1.03 o.ol Down i4s::< ( P>„n2 |9p|.V, 2.61 -.30 -.4" Up
1 (,:>)(, i /\l hi 19q 13.41 II.4H 2.1.9 2.25 Up
284415 I si Ml 1 q 1 .42 I..S4 19.57 10.68 Up
V429I
Figure imgf000047_0004
19pl3.3 4.17 9.13 10.89 Up
4006(.S I'K Sl 1 I pl .3 1.47 <).>- 0.40 Down
44o-Oi PI l\ 19pl3.3 1.73 0.-4 o.i- Down
(.4-191 I9pl3.3 H.S4 0.35 0.31 Down
///// .'.1 19q 13.31 (1 -- 1.12 1.22 Up
"293 9 I'lIM |9p|.V, H.X4 0.32 0.28 Down 128569 IVor/71 19pl3.3 H.S2 2.09 2.01 Up -1 nri 20q 11.23 3.14 9.14 S.09 Up
3787 k( \ l 2Hql 0.21 0.04 H ill Down
20q 1 .2-
'91 I 1 !Ml^ 11.114 o.l- 0.16
l ι.Ί Up
-2(.o /'/.· 20ql I 4o.ii S4 i.-O Down
(..-90 ll'l :ilql2 19.20 4.49 3.85 Down
-()-(, limn 20p 11.2 7.21 2.9 i 2.91 Down
9~-l S\/7/ 2()pi 11.94 Hl :.oi Up
11065 1 ni 20ql3.12 1.57 ii.(.- o.-ii Down
20p 11.22-
229S1 1.75 0.5- 0.39
pl 1.1 Down
54923 LIMEl 20ql3.3 431 13J3 11.48 Up 20p!3 8.28 17.55 1.03 Up
H550H S( Λ72 20pl3 0.17 o.ol 0.02 Down
2dq|1.12- llnl-4 rtl U IRt D.56 0.12 0.2- M 1 ----- - Down
Ι4ιΠ1 Ron: 2 p 12.1 11.-4 1.01 0.-2 Up
140x21 ROMOI 20q 11.22 M.K) 17.61 18-86 Up
3 1257 SI MOII'I 20 H.2 2.14 1.04 11.- 1 Down
1 91 ( Ol fill — 1 1 — ^ 0.1" 1.39 1.15 Up
.3772 k( \ II 2K|22.2 2H.I.2 (..(.- 7.16 Down
1021- Olll,: 2Iq22.ll 0.22 0.04 0.02 Down
-4oH RBM11 21ql 1 (l.hll 2.11 1.23 Up
\CR\ waiw 2lq22.11 2.X4 0.19 0. 1 Down
-in, ISI'O 22ql.3..31 I4.H- -0.-2 31.65 Up
4282 Mil 22q 11.23 4.N2 10.-') 1 - .66 Up
PKDREJ 22q 13.31 o.ll ii.li. 0.14 Down
25787 IKi( R') 22ql 1.21 O.di, 0.40 0.2" Up
25812 i'o\ii:ii ir 22ql 1.22 0.02 0.18 0.21 up -441 ( I ( Λ _ 22ql 1.2 0.04 0.14 0.17 Up
VPREB3 22ql 1.2.3 2.24 5.77 9.51 Up
566ί,ί, 22q 13.33 O.X9 0.40 0.40 Down
Figure imgf000048_0001
I'lo k 111 Xp22.32 0.-- (1.(1- 0.01 Down
5358 VI \.i \q23 ().-- o.l- 0.11 Down
"-Hi MSI \ql 1.2 0.00 ') i 1- 1(1 Up
8277 Ik/I 1 \q2S -.-4 2.19 1.61 Down
9185 Ri rs: \p22.2 4.-(. 2.17 2.04 Down ii.m k( \l 11 \q22.1 0.50 o.l - o.lo Down
->)')S i roi in \q 1.2 0.11 0.31 01- Up
1 91 9 IX, kk Xpl 1.22 1.(W O.OS 0.01 Down
2-(._l4
Figure imgf000048_0002
Xp22.12 o.ol 11.-11 0.-2 Up
14-404 1 Xp21.1 0.16 O.-ll Down
441531 /'(, 1 Ml \q 11 (l.hll 014 022 Down
(.1 2 RI'.SD I Ypl 1.3 2-6. , 0.00 0.00 Down
-4114 1 1) ^ l 1 >).21 (Ι.(Γ, 0.02 Down
-544 Ypl 1.3 5.13 0.00 0.00 Down
8284 ki i^n ^ l 1 20.1.4 000 000 Down
8287 I .si'v) Yql 1.2 1 49 0.00 0.00 Down
Figure imgf000048_0003
/ \isni) Yql 1.221 4.-') 000 000 Down
\ ,\ n Yq 11.221 o.lo 0.00 0.00 Down
--4111 XCRXA00IH5 ^ l 1 222 0.41 0.00 0.00 Down
645')- 111)1? Yql 1.1 3.03 o.ol 0.01 Down s4i.i.l CYarfLW ^ l 1.222 1(I/)S 000 000 Down
140012 Rrsn: l 1.2 0.2- 0.00 0.00 Down
24(.12(. CYor/15.4 ^ l 1.222 2o/)(. 0.00 0.00 Down
2 54 ncoRi: \>.]\\.::2 2.SO 0.00 0.00 Down
Expression levels are expressed as RPKM (reads per kilobase per million mapped reads).
Table 5. FM and FD differentially expressed miRNA in the family trio
miRNA  Father Expression Level  Mother Expression Level  Daughter Expression Level  Family
h>a-miR-l 2541.5S 90S.99 S45.41 mir-I hsa-miR-106a 39.0 4.33" 8.00 mir-17 lisa-miR-lOftb 321.17 125.37 154.4^ mir-17 hsa-miR-1308 9^56 L72 L52
lisa-mi -1 4 S.13 1.42 2.59 mir-134 hsa-miR-144 15.78 3.26 ft.r mir-144 hsa-iniR-14Sh 451.19 143.53 121.S5 mir-148 hsa-miR-15a 1324.80 160.62 200.72 mir-15 lisa-miR-15h isr.79 lftft.07 193.ft3 mir- 1 hsa-miR-16 365438" 706 7" 929708 mir-15 l i-miR-1 3ft 1.21 66.51 105.47 mir-17 hsa-miR-181a 5050.16 1168.10 1392.48 mir-181 h-ia-miR-lSlb 193.90 243.40 mir-181 hsa-miR-181c S52ft 2 M 30748 mir-181 h-,a-miR-lSlil ISft.3l 40.35 4S.ft2 mir-1 1 hsa-miR-18a 32 4 7.77 S ft mir-17 lisa-niiR-lSh ft.1 2.20 1.91 mir-17 hsa-miR-190b 7.11 1.96 1.91 mir-190 hsa-miR-193a-3p 25.ft9 11.33 7.24 mir-1 3 hsa-miR-194 1160 3.74 6Λ0 mir-194 l a-niiR-1 " 30.61 S.3 7.1ft mir-197 hsa-miR-19b 174.76 34.06 51.59 mir-19 h>a-miR-20a 114.7N 30.02 52.5S mir-17 hsa-miR-20b 12716 2.37 335 mir-17 lisa-miR-210 ft.15 0.S9 l.ftO mir-21 hsa-miR-22 448.25 107.45 156.75 mir-22 lisa-niiR-223 7S0ft.4ft 24Sft.4S lS42.ft9 mir-223 hsa-miR-23a 4941.25 1489.57 1899.16 mir-23 lisa-miR-25 333S. S . S42.00 755.11 mir-25 hsa-miR-29b 144(λ80 295.18 528.17 mir-29 h>a-niiR-29 2ftft0.25 ft24.ftft 9ft9.39 mir-29 hsa-miR-30b 201.27 44.20 51.21 mir-30 a-miR-32 31.50 7.42 9.22 mir-32 hsa-miR-324-3p 7^58 1-1 122 mir-324 l i-miR-324-5p 28.35 5.70 10.90 mir-324 hsa-miR-338-3p 99.27 16.73 13.03 mir-338 l i-miR-3ftl-5p 93.N7 P.74 22.5ft mir-3 1 hsa-miR-362-5p 10.32 4.21 4.50 mir-362
1isa-miR-3ft5 30.0ft 4.09 3.43 mir-365 hsa-miR-374b 403.50 73.57 84.89 mir-374 h.>a-miR-3~ftb 5.88 0.S9 1.30 mir-368 hsa-miR-382 22.\4 7.06 9.37 mir-154 livi-mil<-424 2 Id.10 36.41 45.04 mir-322 hsa-miR-425 196.90 42.Ϊ9 44.35 mir-425
Figure imgf000050_0001
hsa-miR-505 27.60 9.32 8.46 mir-505 l i-miR-532-3p 2h.N -.36 a.w mir-1 8 hsa-miR-556-3p 7 4 2.79 1.98 mir-556
11 "-ill 1111 Ι\-. . -f-.^ - J 4. .i
hsa-miR-576-5p 37.51 4.45 5.41 mir-576
Jisa- i f 1"> IT - ■ _» mir-3f>
Figure imgf000050_0002
hsa-miR-629 40Λ0 8·90 7·62
hsa-miR-660 63.81 26.70 26.44 mir-1 8 hsa-miR-664 57.39 10.15 14.78 mir-664
Figure imgf000050_0003
hsa-miR-873 3047 2.49 L45 mir-873 lisa-miR-°3 "50.5" 25 .S9 341.7S mir-17 hsa-miR-96 4^58 1.01 1.14 mir-96
Figure imgf000050_0004
hsa-miR-122 12.78 36.85 58.52 mir-122 ii>a-iniK- 1 _w 1. > 1 OS! nitr- 1 z hsa-miR-125a-5p 6578 205 0 24080 mir-125 l a-miR-185 41N9.05 10S5".66 12WJI.1- mir-185
Table 6. Expression and promoter methylation of the 15 network genes in the family trio
Symbol  Gene ID  Cytoband  Father Expression  Mother Expression  Daughter Expression  Father Promoter Methylation  Mother Promoter Methylation  Daughter Promoter Methylation
LRRC7  57554  1p31.1  0.11  0.04  0.05  0.724  0.696  0.756
TXNDC12  51060  1p32.3  56.08  53.37  49.20  0.005  0.002  0.006
TGFBR3  7049  1p33-p32  23.63  18.15  13.47  0.241  0.225  0.281
AJAP1  55966  1p36.32  0.05  0.02  0.03  0.097  0.106  0.116
AGRN  375790  1p36.33  0.18  0.43  0.68  0.423  0.391  0.435
APOB  338  2p24-p23  0.00  0.00  0.00  0.486  0.611  0.632
EIF4E3  317649  3p14  23.18  14.98  19.15  0.076  0.098  0.113
SLC35A4  113829  5q31.3  22.40  25.92  24.38  0.138  0.166  0.151
PACRG  135138  6q26  0.17  0.07  0.13  0.637  0.635  0.715
COL15A1  1306  9q21-q22  0.12  0.04  0.07  0.394  0.466  0.480
NR4A3  8013  9q22  0.27  0.09  0.11  0.004  0.006  0.005
E2F8  79733  11p15.1  0.30  0.11  0.22  0.095  0.118  0.127
PCDH17  27253  13q21.1  0.00  0.00  0.00  0.059  0.083  0.101
PLEKHH3  79990  17q21.2  0.04  0.10  0.06  0.027  0.026  0.034
SCARF2  91179  22q11.21  0.08  0.11  0.14  0.314  0.323  0.321
Expression levels are expressed as RPKM (reads per kilobase per million mapped reads); methylation levels are expressed as the proportion of methylated cytosine (mC) in the region, with 0.0 indicating no methylation and 1.0 indicating 100% methylation.
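The two units named in the note above can be illustrated with a short sketch. It assumes that RPKM follows its standard definition (reads mapped to a gene, scaled by gene length in kilobases and by millions of mapped reads) and that the methylation level of a region is the number of methylated cytosine calls divided by all cytosine calls in that region. The function names, argument names, and read counts are hypothetical and are not taken from the source.

```python
def rpkm(gene_reads: int, gene_length_bp: int, total_mapped_reads: int) -> float:
    """Reads per kilobase per million mapped reads (standard definition)."""
    if gene_length_bp <= 0 or total_mapped_reads <= 0:
        raise ValueError("gene length and total mapped reads must be positive")
    # reads / (length in kb) / (mapped reads in millions)
    # is the same as reads * 1e9 / (length in bp * mapped reads)
    return gene_reads * 1e9 / (gene_length_bp * total_mapped_reads)


def methylation_level(methylated_calls: int, unmethylated_calls: int) -> float:
    """Proportion of mC in a region: 0.0 = unmethylated, 1.0 = fully methylated.

    Assumption: the proportion is computed from pooled cytosine calls in the
    region; the source does not state how calls were aggregated.
    """
    total = methylated_calls + unmethylated_calls
    if total == 0:
        raise ValueError("no cytosine calls in region")
    return methylated_calls / total


if __name__ == "__main__":
    # Hypothetical counts chosen only to show the arithmetic.
    print(rpkm(gene_reads=500, gene_length_bp=2000, total_mapped_reads=10_000_000))
    print(methylation_level(methylated_calls=30, unmethylated_calls=70))
```

Because RPKM is normalized by both gene length and sequencing depth, values in the father, mother and daughter columns can be compared with each other even if the three libraries were sequenced to different depths.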
Table 7. List of genes showing changed expression in type 2 diabetes
Gene  Reason  Change in type 2 diabetes  Known to associate with type 2 diabetes
PTPRD  GWAS-T2D gene with > 2 fold expression change  40% decrease  Yes
KCNK17  GWAS-T2D gene with > 2 fold expression change  44% decrease  Yes
L IK ·:  Genes from multiomic analysis  I ",, decrease  No
KDM2B  Genes from multiomic analysis  22% increase
/>/)( V)/  Genes from multiomic analysis  22% decrease
CDKN2B  GWAS-T2D gene and KDM2B target gene  43% increase  Yes
rRX 1  KDM2B target gene  27% decrease  No
TXNDC12  Genes from network analysis  18% increase
r.H Rd  Genes from network analysis  20% decrease  No
EIF4E3  Genes from network analysis  38% increase  No
miR-Ifi-l  Top signal from microRNA expression  In"., decrease  No
iiiiR Ϊ62  Top signal from microRNA expression  30% decrease  No
Table 8. List of genes showing changed DNA methylation in type 2 diabetes
Gene  Region  Change in type 2 diabetes  Known to associate with type 2 diabetes
PDCD1  chr2:242450^26  increase  No
PDCD1  chr2:242450767:242450772  increase  No
PDCD1  chr2:242450KN9  increase  No
EIF4E3  chr3:71884942  decrease
EIF4E3  chr3:"lSS 111  decrease
EIF4E3  chr3:71885141  increase  No
EIF4E3  chr3:71NNo72  decrease  No
EIF4E3  chr3:71886895  decrease  No
PACRG  chr6:lft3522"32  decrease
PACRG  chr6:163593238  decrease  No
CDKN2B  chr9:220010~5-22001 OS ]  increase  Yes
CDKN2B  chr9:22001091-22001093  increase  Yes
CDKN2B  chr9:22001115  increase  Yes
CDKN2B  chr9:22001207  increase  Yes
KDM2B  chr12:120504330-120504336  decrease
Table 9. SNPs co-segregated in the mother-daughter pair of the trio and correlated
Figure imgf000053_0001
rs318486  chr6  2835282  A  G  Allele 2  G  SERPINB9
rs115640156  chr6  30262178  T  C  Allele 1  T  TRIM26
rs3748177  chr9  116162023  C  T  Allele 1  C  AKNA
rs117888214  chr7  141275138  T  C  Allele 2  C  CLEC5A
rs2016465  chr8  82067952  A  G  Allele 1  A  PAG1
rs3826100  chr16  12571771  A  C  Allele 1  A  LOC92017
rs10609  chr16  12574577  A  G  Allele 1  A  LOC92017
rs7722287  chr5  1116998  G  A  Allele 1  G  SLC12A7
Table 10. SNPs co-segregated in the mother-daughter pair of the trio and correlated with DNA methylation data.
Figure imgf000055_0001
chrl4 56230691 chrl4 56230594 rs74052638 G A chrl4 67053237 chrl4 67053238 rs7160944 G A chrl4 71536910 chrl4 71536910 C T chrl4 71536927 chrl4 71536910 C T chrl4 76084828 chrl4 76084813 rs28431939 A G chrl4 77468644 chrl4 77468644 rsl l l59309 C T chrl4 81257445 chrl4 81257461 rs8015680 T C chrl4 83315655 chrl4 83315655 rsl7119051 c T chrl4 92138268 chrl4 92138174 rsl 1160078 c T chrl4 92230309 chrl4 92230310 rsl 1849325 G A chrl4 94989664 chrl4 94989664 rs7151678 C T chrl4 95103019 chrl4 95103019 rs8010459 C T chrl4 100572054 chrl4 100572021 rsl0134016 T c chrl4 101373044 chrl4 101373045 rs76697013 G A chrl4 104083010 chrl4 104083021 rs58381709 C T chrl4 104083019 chrl4 104083021 rs58381709 C T chrl4 104088425 chrl4 104088426 rsl0136519 G A chrl4 105797731 chrl4 105797717 rs61995548 C A chr8 819360 chr8 819361 rs6993875 G A chr8 858517 chr8 858518 G A chr8 1762233 chr8 1762233 rsl7756439 C T chr8 1767227 chr8 1767228 rsl l 8103924 G A chr8 1774286 chr8 1774287 rsl 17299306 G A chr8 2077144 chr8 2077045 rs67769214 G A chr8 3186094 chr8 3186095 rsl 549313 G A chr8 3186508 chr8 3186509 rsl348266 G C chr8 3907095 chr8 3907096 rsl3253871 G A chr8 4247951 chr8 4247952 G A chr8 4613571 chr8 4613553 rsl 2550274 C T chr8 4920875 chr8 4920869 rs56685768 A C chr8 8297855 chr8 8297834 rs2945874 C G chr8 8372826 chr8 8372802 rsl7152705 G A chr8 11653758 chr8 11653758 rs78334561 C T chr8 12958132 chr8 12958133 rs2466263 G A chr8 17791118 chr8 17791118 rs2106036 C T chr8 17817761 chr8 17817740 rsl3278127 G T chr8 18311233 chr8 18311199 rs7818999 G A chr8 22140465 chr8 22140448 rsl 1547656 C T chr8 22775344 chr8 22775344 rs6557597 C T chr8 22885562 chr8 22885563 rs56276231 G A chr8 24149079 chr8 24149079 rsl2549111 C T chr8 26896960 chr8 26896960 rs56135074 C T chr8 30517476 chr8 30517467 rs2979513 G C chr8 34595954 chr8 34595954 rs72640996 C T chr8 38882873 chr8 38882873 rs2056170 C T chr8 41713189 chr8 41713201 rs77677893 A T chr8 41713228 chr8 41713201 rs77677893 A T chr8 
42405298 chr8 42405298 C T chr8 52367026 chr8 52367027 rsl2542002 G A chr8 52368233 chr8 52368233 rs7012256 C T chr8 56019859 chr8 56019859 rs78116834 C T chr8 56691043 chr8 56691043 rs28577254 C T chr8 59295417 chr8 59295418 rsl 1777995 G A chr8 62654016 chr8 62654016 rsl2674869 C T chr8 71717564 chr8 71717565 rsl 17937985 G A chr8 90271980 chr8 90271981 G A chr8 99057256 chr8 99057257 rsl 17598487 G A chr8 101125154 chr8 101125155 rsl 15353454 G A chr8 102172873 chr8 102172873 rs35655532 C T chr8 110280811 chr8 110280811 rs4734206 C T chr8 117711533 chr8 117711534 rsl2056883 G A chr8 117998854 chr8 117998855 rs2921738 G C chr8 120047115 chr8 120047115 C T chr8 121033918 chr8 121033906 rs4871815 A G chr8 125171600 chr8 125171600 rsl l7193397 C T chr8 134570089 chr8 134570090 rsl 1776026 G A chr8 138647406 chr8 138647407 rs55711768 G A chr8 142945696 chr8 142945696 C T chr8 144471411 chr8 144471411 rsl 1780593 C T chrl 1876343 chrl 1876361 rs2748976 C A chrl 2006537 chrl 2006537 rs74697444 C T chrl 2247745 chrl 2247746 rs72642149 G C chrl 4185556 chrl 4185557 rs4654516 G c chrl 4471999 chrl 4471883 rsl0915549 G A chrl 4857229 chrl 4857229 rs58851741 C T chrl 5225907 chrl 5225910 rs536995 T c chrl 9261200 chrl 9261200 rs9442515 c T chrl 10487930 chrl 10487930 rs61192982 c T chrl 12679477 chrl 12679470 rs7547122 G A chrl 16891271 chrl 16891185 rs676167 A G chrl 16891281 chrl 16891185 rs676167 A G chrl 16891294 chrl 16891185 rs676167 A G chrl 16891349 chrl 16891271 rsl 13256629 C G chrl 18103606 chrl 18103607 G A chrl 18348072 chrl 18348072 rs9660184 C T chrl 18348109 chrl 18348110 rs2027530 G A chrl 18353027 chrl 18352934 rs9662824 A G chrl 18414685 chrl 18414686 rs55715295 G A chrl 19553119 chrl 19553119 rsl2751502 C T chrl 21233995 chrl 21233889 rs6426659 G A chrl 25251065 chrl 25251066 rs435820 G A chrl 25398467 chrl 25398468 rs742393 G A chrl 30031499 chrl 30031500 rs74356443 G A chrl 30662721 chrl 30662721 rs7516250 C T chrl 31894771 chrl 31894772 G A chrl 
33476662 chrl 33476663 rs79084040 G A chrl 34135103 chrl 34135104 rsl 11677570 G T chrl 37557298 chrl 37557299 rsl571789 G A chrl 38500761 chrl 38500761 rs77814891 C T chrl 44753510 chrl 44753510 rsl6832024 C T chrl 47832521 chrl 47832510 G T chrl 48895565 chrl 48895565 rs60162645 C T chrl 53366518 chrl 53366401 rs899976 G A chrl 53384559 chrl 53384552 rsl 679934 G A chrl 54260864 chrl 54260864 C T chrl 55194862 chrl 55194862 rsl 1206486 C T chrl 55246903 chrl 55246913 rs6682884 A C chrl 56062110 chrl 56062111 rs4129669 G A chrl 59734223 chrl 59734153 rs667634 A T chrl 62442182 chrl 62442182 rsl0889297 C T chrl 66473368 chrl 66473368 rs74873498 C T chrl 70661698 chrl 70661699 G A chrl 70689135 chrl 70689136 rs77759767 G C chrl 71179278 chrl 71179279 rs475468 G A chrl 77548809 chrl 77548809 rs2799550 C T chrl 78940451 chrl 78940439 rs75417281 G C chrl 79035617 chrl 79035618 rsl0873997 G A chrl 90697666 chrl 90697666 rs56661730 C G chrl 90717979 chrl 90717979 rsl l7868251 C T chrl 91377181 chrl 91377182 rs347003 G A chrl 97570651 chrl 97570651 rs4950027 C T chrl 101774939 chrl 101774939 rsl343363 C T chrl 105508681 chrl 105508678 rsl2029517 A G chrl 107024800 chrl 107024803 rs41346254 T C chrl 112156332 chrl 112156333 rs60237243 G A chrl 115646879 chrl 115646880 rs7542555 G A chrl 116889025 chrl 116889025 rsl2044773 C A chrl 119314192 chrl 119314193 rsl0458418 G A chrl 120141016 chrl 120140994 rsl l63545 C A chrl 153007502 chrl 153007488 rsl2404994 A C chrl 159762077 chrl 159762101 rs72633678 A T chrl 160199671 chrl 160199672 rs2499846 G A chrl 160713532 chrl 160713533 rs469970 G A chrl 166035874 chrl 166035874 rs6695245 C T chrl 169805739 chrl 169805740 rs72717840 G A chrl 177548350 chrl 177548351 rs7543894 G C chrl 178131379 chrl 178131379 rs555755 C T chrl 179601261 chrl 179601268 rs4651102 A G chrl 180005330 chrl 180005331 rs771329 G C chrl 185420925 chrl 185420926 rs6425080 G C chrl 185790902 chrl 185790903 rsl339082 G A chrl 186708283 chrl 186708284 rs 10912405 
G T chrl 201062471 chrl 201062472 rs41264010 G A chrl 201910656 chrl 201910657 rs 11240720 G A chrl 205852774 chrl 205852775 rs7519119 G A chrl 206144266 chrl 206144266 rs656801 C T chrl 220582366 chrl 220582367 G A chrl 226029787 chrl 226029776 rs6697054 C T chrl 226031764 chrl 226031764 rsl l5617436 C T chrl 226782545 chrl 226782546 rsl0916338 G A chrl 228328438 chrl 228328438 rs77945324 C T chrl 230075732 chrl 230075733 rs9432028 G A chrl 231875624 chrl 231875624 rs3843251 C T chrl 232596175 chrl 232596175 rs2175594 C T chrl 233092225 chrl 233092226 rs4920161 G A chrl 233102933 chrl 233102933 rs67435549 C T chrl 235358771 chrl 235358771 rs2490381 C T chrl 237274246 chrl 237274246 rsl0925799 C T chrl 240625579 chrl 240625579 rs79799056 C T chrl 240919734 chrl 240919735 rs35575216 G A chrl 241156158 chrl 241156159 rs2491852 G A chrl 242849516 chrl 242849517 rs2806619 G A chrl 242987340 chrl 242987341 G A chrl 243791931 chrl 243791932 rs79726121 G A chrl 243926698 chrl 243926699 rs884987 G C chrl 243926832 chrl 243926828 rs2008766 C A chrl 244912807 chrl 244912807 rs72764608 C T chrl 7 3607498 chrl 7 3607499 rs220475 G A chrl 7 4253789 chrl 7 4253789 C T chrl 7 4401910 chrl 7 4401911 rs76953250 G A chrl 7 4877622 chrl7 4877622 rs2304445 C T chrl 7 5040598 chrl7 5040598 rs34765532 C T chrl 7 6059147 chrl7 6059147 rs78870150 C T chrl 7 6554098 chrl7 6554098 rsl6956192 C T chrl 7 7861378 chrl7 7861359 rsl l655691 G T chrl 7 8215726 chr!7 8215726 rs2909439 C T chrl7 9558164 chrl7 9558165 rs408537 G A chrl7 12053514 chrl7 12053514 rs7219462 C T chrl7 13330112 chrl7 13330112 rsl l l375519 C T chrl7 17280694 chrl7 17280581 rs78267649 A G chrl7 21138951 chrl7 21138936 rs2363227 G T chrl7 21140135 chrl7 21140108 rs5025581 C T chrl7 21148099 chrl7 21148094 rs57988116 G A chrl7 21863440 chrl7 21863440 rs 12602066 C T chrl7 24440749 chrl7 24440750 rs4965415 G A chrl7 24447352 chrl7 24447352 rs869718 C T chrl7 30341705 chrl7 30341705 C T chrl7 30578496 chrl7 30578496 rsl 
1870733 C T chrl7 42421364 chrl7 42421266 rs3851798 G A chrl7 45523136 chrl7 45523137 rs77510864 G A chrl7 56920243 chrl7 56920266 rs59159776 C G chrl7 61754846 chrl7 61754846 rs67828405 C T chrl7 61755223 chrl7 61755223 rsl 2938072 C T chrl7 62365467 chrl7 62365455 rs62072186 G T chrl7 64303834 chrl7 64303835 rs4968953 G A chrl7 67130990 chrl7 67130991 rs72841221 G A chrl7 67578077 chrl7 67578078 rsl 14978079 G A chrl7 71102412 chrl7 71102412 rs2305524 C T chrl7 71120706 chrl7 71120707 rs820142 G A chrl7 71955651 chrl7 71955652 rs59035745 G A chrl7 71965707 chrl7 71965707 rs4789300 C T chrl7 72424575 chrl7 72424548 rsl2603183 G A chrl7 73018709 chrl7 73018710 rs2574863 G A chrl7 74553146 chrl7 74553146 rs78111110 C T chrl7 74738187 chrl7 74738171 rs9905981 C T chrl7 77264962 chrl7 77264949 G T chr21 20860496 chr21 20860501 rs6517976 T G chr21 22123150 chr21 22123151 rsl3046513 G A chr21 29036225 chr21 29036225 rs2832026 C T chr21 29371294 chr21 29371295 rs2853831 G C chr21 30167731 chr21 30167731 rs2832468 C T chr21 30733956 chr21 30733962 rs4816363 C G chr21 30756832 chr21 30756833 rs2000513 G A chr21 30886767 chr21 30886768 rs79025761 G A chr21 31741783 chr21 31741784 rs567943 G A chr21 36040302 chr21 36040303 rs60334620 G A chr21 38934537 chr21 38934540 rs461679 T C chr21 39876954 chr21 39876955 rsl 1700626 G A chr21 39942471 chr21 39942471 rs596618 C T chr21 40580199 chr21 40580208 rs!2626576 C G chr21 41833869 chr21 41833869 rs55788142 C T chr21 42245907 chr21 42245902 rsl3046500 G A chr21 43989429 chr21 43989430 rsl 17369905 G A chr21 45095875 chr21 45095880 rs235314 C T chr21 45108566 chr21 45108538 rs4818738 T C chr21 46301118 chr21 46301129 rs915808 A c chrl6 5558903 chrl6 5558905 rsl 1647221 C T chrl6 6777187 chrl6 6777188 rs62016111 G c chrl6 6805163 chrl6 6805163 rsl 15094998 C T chrl6 7201743 chrl6 7201744 rsl 17447259 G A chrl6 7441130 chrl6 7441131 G A chrl6 8544969 chrl6 8544969 rs76834968 C T chrl6 11134805 chrl6 11134806 rs75810024 G A chrl6 
11188593 chrl6 11188593 rs78252181 C T chrl6 12395276 chrl6 12395277 rs8044113 G A chrl6 12549828 chrl6 12549829 G A chrl6 12682060 chrl6 12682064 rsl 651006 G C chrl6 13032468 chrl6 13032451 rs4780494 A G chrl6 13032470 chrl6 13032451 rs4780494 A G chrl6 19259863 chrl6 19259863 rs4782231 C T chrl6 26434655 chrl6 26434656 rs62032475 G T chrl6 26563774 chrl6 26563775 rs2008983 G A chrl6 29206910 chrl6 29206910 rsl3337460 C T chrl6 32545547 chrl6 32545548 rs28686921 G A chrl6 47587188 chrl6 47587179 rsl420589 C T chrl6 48177209 chrl6 48177210 rsl 1642742 G C chrl6 48863864 chrl6 48863857 rs7193434 G T chrl6 52757714 chrl6 52757715 rsl2447727 G A chrl6 54735180 chrl6 54735181 rs8063856 G A chrl6 55649907 chrl6 55649907 rsl 627144 C T chrl6 57989553 chrl6 57989553 rsl0775342 C T chrl6 58762853 chrl6 58762854 rs55950120 G T chrl6 62287580 chrl6 62287593 rs2319434 T c chrl6 64472868 chrl6 64472868 rs77659387 C T chrl6 64472874 chrl6 64472868 rs77659387 c T chrl6 64690121 chrl6 64690121 rs8053639 c T chrl6 67361336 chrl6 67361345 rs28628339 A G chrl6 76692144 chrl6 76692145 rs79416778 G T chrl6 76970453 chrl6 76970453 rs72796059 C T chrl6 76971805 chrl6 76971805 rs72796067 C T chrl6 76977015 chrl6 76977016 rs72796083 G A chrl6 77436189 chrl6 77436189 C T chrl6 77493230 chrl6 77493231 rsl 1864605 G A chr!6 77696228 chr!6 77696228 rs4888926 C T chrl6 78621364 chrl6 78621246 rsl3333795 A G chrl6 80758906 chrl6 80758906 rs4889478 C T chrl6 80777559 chrl6 80777553 rs7192638 T C chrl6 81202293 chrl6 81202175 rs35666337 A G chrl6 81624421 chrl6 81624422 rs35444894 G A chrl6 82451098 chrl6 82451099 rs76032180 G A chrl6 84585884 chrl6 84585885 rs6540244 G A chrl6 84861643 chrl6 84861643 rs62053617 C T chrl6 85276016 chrl6 85276016 rs62042107 C T chrl6 85302663 chrl6 85302663 rs300009 C T chrl6 86430971 chrl6 86430972 rs33967518 G A chrl6 86653704 chrl6 86653704 rs 12919293 C T chrl6 87092532 chrl6 87092533 rs12447946 G A chrl6 87293669 chrl6 87293669 rs55780854 C T chrl6 87828265 
chrl6 87828265 rs56135239 C T chrl8 1057456 chrl8 1057457 rs72859353 G C chrl8 3692208 chrl8 3692208 rsl 13277345 C T chrl8 5008786 chrl8 5008786 rs8095458 C T chrl8 5878948 chrl8 5878948 C T chrl8 5893626 chrl8 5893626 rs4797235 C T chrl8 8436223 chrl8 8436220 rs7242589 C A chrl8 9766279 chrl8 9766279 rsl2966230 C T chrl8 12997346 chrl8 12997346 rs9957933 C A chrl8 21097568 chrl8 21097569 G A chrl8 24043267 chrl8 24043267 rsl988880 C T chrl8 26275522 chrl8 26275523 rs7236082 G A chrl8 42059645 chrl8 42059645 C T chrl8 48464643 chrl8 48464632 rs8090718 A G chrl8 52865095 chrl8 52865096 rs61747862 G A chrl8 58755881 chrl8 58755881 C T chrl8 60543074 chrl8 60543074 rsl 2963425 C T chrl8 61940561 chrl8 61940561 rs34407336 C T chrl8 61947536 chrl8 61947536 rsl 1873967 C A chrl8 61949363 chrl8 61949349 rs62087667 G T chrl8 66867150 chrl8 66867151 rs75472619 G A chrl8 66956359 chrl8 66956360 rs74513032 G A chrl8 71875809 chrl8 71875810 rs73972485 G A chrl8 71899497 chrl8 71899497 rs76341020 C T chrl8 72832706 chrl8 72832697 rs73968250 G C chrl8 72888539 chrl8 72888539 rsl 12900430 C T chrl8 73054287 chrl8 73054273 rs35828995 G A chrl8 73054272 chrl8 73054273 rs35828995 G A chrl8 73500430 chrl8 73500430 rs!83598 C G chrl8 74258992 chrl8 74258974 G T chrl 8 74919181 chrl 8 74919182 rsl2709636 G A chrl 8 75413796 chrl8 75413797 rs694732 G A chr3 7954065 chr3 7954066 G A chr3 8700097 chr3 8700097 rs34685587 C T chr3 10524847 chr3 10524848 rsl 13222553 G A chr3 13697642 chr3 13697636 rs4383497 A G chr3 14358255 chr3 14358256 rs73022213 G C chr3 16295376 chr3 16295265 rsl 11386922 C G chr3 18959026 chr3 18959027 rs61617737 G A chr3 23311984 chr3 23311972 rs903518 A G chr3 32192770 chr3 32192770 rs9838910 C T chr3 46577602 chr3 46577602 rs4683244 C T chr3 52840632 chr3 52840633 rs28552648 G A chr3 52970615 chr3 52970615 rs58515094 C T chr3 53946115 chr3 53946083 rsl 18025032 T C chr3 54166572 chr3 54166496 rs77060447 T G chr3 68306206 chr3 68306207 rs7618868 G A chr3 68322855 
chr3 68322856 rs4459906 G A chr3 72003526 chr3 72003526 rs35644218 C T chr3 72337579 chr3 72337580 rs9790178 G A chr3 72604836 chr3 72604837 G A chr3 85023617 chr3 85023617 rs452496 C T chr3 101540791 chr3 101540792 rsl 66747 G A chr3 106498094 chr3 106498095 rs9842100 G A chr3 116644167 chr3 116644183 rsl353909 G A chr3 120003662 chr3 120003663 G A chr3 122660374 chr3 122660374 rs3843345 C T chr3 127410450 chr3 127410451 rs9822040 G A chr3 128973743 chr3 128973744 rs9852837 G A chr3 129196683 chr3 129196658 rs9289323 A C chr3 131634167 chr3 131634167 rs62281866 C T chr3 132323485 chr3 132323485 rsl 964083 C T chr3 140834494 chr3 140834494 rs73869367 c T chr3 146105317 chr3 146105318 G A chr3 152612802 chr3 152612803 rs6440740 G A chr3 154959094 chr3 154959090 rs641042 C T chr3 174750478 chr3 174750478 rs766004 C T chr3 178435565 chr3 178435565 rs6806734 C T chr3 178681500 chr3 178681408 rs7356095 G A chr3 185837799 chr3 185837799 rs4074989 C T chr3 185870867 chr3 185870867 rs4234609 C T chr3 193640185 chr3 193640186 rs7651615 G A chr3 194995299 chr3 194995197 rs472356 A G chr3 195360371 chr3 195360371 rs9869342 C T chrl2 70927 chrl2 70929 rs7135055 C T chrl2 116462 chrl2 116388 rs73036139 C T chrl2 249032 chrl2 249032 rs555512 c T chrl2 1282063 chrl2 1282064 rs56362023 G T chrl2 2478743 chrl2 2478744 rsl 004207 G A chrl2 5303065 chrl2 5303066 rs4766356 G A chrl2 5559867 chrl2 5559868 rsl 1063763 G A chrl2 14930055 chrl2 14930055 rsl800801 C T chrl2 19693387 chrl2 19693387 C T chrl2 25018486 chrl2 25018487 rs74070658 G A chrl2 29923484 chrl2 29923485 rsl0843532 G T chrl2 30921284 chrl2 30921285 rs33234 G A chrl2 31848937 chrl2 31848937 rsl 16894325 C T chrl2 42254508 chrl2 42254503 rs61925207 G A chrl2 50568327 chrl2 50568327 rs697634 C T chrl2 51093627 chrl2 51093628 rs4761877 G A chrl2 51108231 chrl2 51108259 rs664354 A G chrl2 51149480 chrl2 51149480 rs412533 C T chrl2 56968846 chrl2 56968847 rsl 177774 G A chrl2 60035695 chrl2 60035696 rs68067046 G T chrl2 
71393626 chrl2 71393627 rsl0879481 G A chrl2 73099856 chrl2 73099857 G A chrl2 80315758 chrl2 80315758 rsl 1114846 C T chrl2 80568208 chrl2 80568208 rs4594075 C T chrl2 89540305 chrl2 89540305 rs75954279 C T chrl2 95759743 chrl2 95759639 rs7138120 C A chrl2 96768088 chrl2 96768089 rsl2820199 G A chrl2 103634684 chrl2 103634684 rs4514507 C T chrl2 110094145 chrl2 110094146 rs3809296 G A chrl2 112664492 chrl2 112664493 rsl6943171 G T chrl2 113423935 chrl2 113423916 rsl 6944407 A T chrl2 113468744 chrl2 113468748 rs838314 T G chrl2 115473595 chrl2 115473595 rs6490076 C T chrl2 120651478 chrl2 120651478 rs2720034 c T chrl2 121194490 chrl2 121194490 rsl 2321299 c T chrl2 123230886 chrl2 123230886 rsl2308891 c T chrl2 123456184 chrl2 123456184 rsl0846670 c T chrl2 123743705 chrl2 123743705 rsl2579740 c T chrl2 124053056 chrl2 124053048 rsl 1057974 G A chrl2 125376903 chrl2 125376903 rsl 0847126 C T chrl2 125895218 chrl2 125895219 rsl2316600 G A chrl2 128837216 chrl2 128837216 rsl 0734971 C T chrl2 129058376 chrl2 129058377 rs7135364 G A chrl2 129189268 chrl2 129189268 rs!0848013 C T chrl2 129196987 chrl2 129196988 rsl0773744 G A chrl2 129530149 chrl2 129530149 rsl 1060934 C T chrl2 131393512 chrl2 131393494 rs28524950 G A chrl5 19347770 chrl5 19347779 rsl l4113680 G A chrl5 19347788 chrl5 19347779 rsl l4113680 G A chrl5 19928411 chrl5 19928411 rsl533331 C T chrl5 22981746 chrl5 22981747 rs72546347 G A chrl5 29579000 chrl5 29579001 rsl6956797 G A chrl5 29579055 chrl5 29579056 rs72724993 G T chrl5 29605530 chrl5 29605530 rs28445355 C T chrl5 31109770 chrl5 31109770 rsl 2903271 C T chrl5 43809295 chrl5 43809296 rs58712480 G A chrl5 47273450 chrl5 47273437 G A chrl5 47783679 chrl5 47783679 C T chrl5 53889353 chrl5 53889354 rs62046599 G A chrl5 56479439 chrl5 56479440 rs35128881 G T chrl5 59882298 chrl5 59882298 rsl l631588 C T chrl5 65935370 chrl5 65935371 rs35319451 G A chrl5 67728784 chrl5 67728784 rs8038339 C T chrl5 70306881 chrl5 70306882 rs8192364 G A chrl5 76326475 
chrl5 76326476 rsl6969656 G A chrl5 76636973 chrl5 76636969 rs7173512 C T chrl5 77362540 chrl5 77362541 rs62025009 G A chrl5 79427835 chrl5 79427836 rs7170663 G A chrl5 85910576 chrl5 85910577 rs79388227 G A chrl5 90372479 chrl5 90372480 rsl2905838 G C chrl5 90375023 chrl5 90375023 rs59913727 C T chrl5 90731411 chrl5 90731297 rs8035799 G A chrl5 93814026 chrl5 93814027 rsl2899367 G A chrl5 98636288 chrl5 98636288 rs62036241 C T chr2 151488 chr2 151489 rs300738 G A chr2 1473379 chr2 1473380 rsl0196514 G A chr2 2747587 chr2 2747588 rs4853772 G A chr2 3113856 chr2 3113856 rs4854193 C T chr2 3946516 chr2 3946516 rs6755953 C T chr2 6716348 chr2 6716348 rs752383 C T chr2 7077063 chr2 7077063 rs309308 C T chr2 10435580 chr2 10435581 rsl6856174 G A chr2 11771475 chr2 11771463 rsl0182021 G A chr2 15531272 chr2 15531166 rsl 1893303 G A chr2 15751472 chr2 15751473 rsl 123382 G A chr2 15754409 chr2 15754410 rs6738170 G A chr2 23241714 chr2 23241715 rsl 3022502 G A chr2 27681773 chr2 27681774 G A chr2 33565423 chr2 33565423 rsl7013131 C T chr2 47668916 chr2 47668917 rs59567488 G A chr2 51232478 chr2 51232479 rs77003468 G A chr2 51517516 chr2 51517516 rs7568061 C T chr2 52093631 chr2 52093632 rs4971778 G A chr2 72107780 chr2 72107780 rs4852838 C T chr2 76092016 chr2 76092017 rsl 822108 G A chr2 83606423 chr2 83606423 rsl0210196 C T chr2 86941714 chr2 86941711 rs6713248 C T chr2 99640992 chr2 99640992 rsl 17984084 C T chr2 100603111 chr2 100603112 rsl 879121 G A chr2 100888208 chr2 100888209 rs2117714 G A chr2 101594445 chr2 101594445 rsl0177367 C T chr2 104695098 chr2 104695098 rs4605392 C T chr2 109613796 chr2 109613798 rsl2614336 C T chr2 113099154 chr2 113099155 rs6758898 G A chr2 113871048 chr2 113870956 rsl2711769 A G chr2 114622863 chr2 114622857 rs56036335 A G chr2 118794607 chr2 118794593 rs73947787 A T chr2 130091295 chr2 130091295 C T chr2 134452809 chr2 134452810 rsl6829877 G A chr2 134985305 chr2 134985306 rs60802178 G A chr2 137450730 chr2 137450712 rs7587475 C T 
chr2 137488476 chr2 137488477 rsl347033 G A chr2 150314358 chr2 150314359 rsl526287 G A chr2 152477087 chr2 152477088 rs6733800 G A chr2 153481719 chr2 153481620 rs61026505 A G chr2 153920617 chr2 153920611 rsl 1689374 A G chr2 161612945 chr2 161612946 rsl 97262 G A chr2 161887438 chr2 161887443 rs57090160 G A chr2 163748770 chr2 163748771 rsl6848125 G A chr2 168866218 chr2 168866219 rs2466561 G A chr2 192299354 chr2 192299355 rs6746000 G C chr2 192739040 chr2 192739045 rs61025859 C T chr2 195387582 chr2 195387582 rs79898226 C T chr2 202548778 chr2 202548774 rs62195522 G A chr2 206530884 chr2 206530778 rsl2618329 T G chr2 215155192 chr2 215155083 rsl0169894 C T chr2 216856675 chr2 216856662 rs876771 c A chr2 217161987 chr2 217161988 rsl l690415 G A chr2 220895890 chr2 220895890 rsl454551 C T chr2 225723983 chr2 225723984 rsl2622774 G A chr2 227124322 chr2 227124322 rs76885413 C T chr2 227198508 chr2 227198509 rs!273824 G A chr2 231520204 chr2 231520205 G A chr2 232603576 chr2 232603577 rsl l902551 G A chr2 233744810 chr2 233744810 rs6741388 C T chr2 234373308 chr2 234373309 rsl 1563069 G A chr2 234373367 chr2 234373368 rsl3018934 G A chr2 235455956 chr2 235455952 rsl 0204640 A T chr2 236848832 chr2 236848833 rs6723343 G A chr2 237240258 chr2 237240258 rsl 2475884 C T chr2 239695925 chr2 239695926 rs3828204 G A chr2 241034429 chr2 241034430 rsl l8151125 G A chr2 242462182 chr2 242462194 rs79274870 T C chrl3 18526068 chrl3 18526077 rs56025905 T c chrl3 21448613 chrl3 21448614 rs9509950 G A chrl3 22243115 chrl3 22243116 rs78511793 G A chrl3 25437276 chrl3 25437276 rs77152232 C T chrl3 26303628 chrl3 26303605 rs4771032 G A chrl3 28473658 chrl3 28473659 G T chrl3 28524447 chrl3 28524444 rs9508239 A C chrl3 29509188 chrl3 29509187 rsl409511 G A chrl3 36283882 chrl3 36283882 rs9547676 C T chrl3 36996489 chrl3 36996490 rs9576291 G C chrl3 42891561 chrl3 42891561 rs9567186 C T chrl3 47780343 chrl3 47780343 rsl981435 C T chrl3 49917042 chrl3 49917042 C T chrl3 52421953 chrl3 
52421953 rs2018259 C T chrl3 56788341 chrl3 56788341 rs9316919 C T chrl3 63842896 chrl3 63842897 rs9571183 G A chrl3 65312735 chrl3 65312735 rs9592432 C T chrl3 65637622 chrl3 65637616 rs9540671 G c chrl3 72165298 chrl3 72165298 rs9543078 C T chrl3 73640651 chrl3 73640651 rsl2867877 C T chrl3 76273700 chrl3 76273701 rs60752038 G A chrl3 92949923 chrl3 92949924 rs9301886 G A chrl3 96051366 chrl3 96051367 rs625346 G A chrl3 98105958 chrl3 98105958 rsl7574896 C T chrl3 98107013 chrl3 98107013 rsl7471453 C T chrl3 100346369 chrl3 100346370 rs9518200 G A chrl3 100394275 chrl3 100394282 rs7984634 A G chrl3 101613809 chrl3 101613809 C T chrl3 102120704 chrl3 102120596 rs603929 T c chrl3 102626843 chrl3 102626844 rsl730643 G A chrl3 105448094 chrl3 105448094 rs9558661 C T chrl3 108545748 chrl3 108545749 rs9521172 G T chr!3 109584225 chrl3 109584225 rs75949503 C T chrl3 110388043 chrl3 110388029 rs8000674 A G chrl3 111953742 chrl3 111953717 rs7993873 G A chrl3 112052245 chrl3 112052246 rs61960761 G A chrl3 112055146 chrl3 112055138 rs9604304 A G chrl3 112145361 chrl3 112145362 rs9577369 G A chrl3 112575967 chrl3 112575968 rsl320525 G A chrl3 113838817 chrl3 113838686 rs9525226 T G chrl3 113964995 chrl3 113964995 rsl l6889516 C T chrlO 1864736 chrlO 1864736 rs35325935 c T chrlO 1902802 chrlO 1902803 rs4242735 G A chrlO 2333859 chrlO 2333859 rs78333966 C T chrlO 2566950 chrlO 2566950 rsl0903799 C T chrlO 4053985 chrlO 4053979 rsl2769283 G T chrlO 6839258 chrlO 6839260 G A chrlO 7967963 chrlO 7967963 rs76522141 C T chrlO 11317630 chrlO 11317619 rsl2098246 A G chrlO 12649163 chrlO 12649163 rs2768358 C T chrlO 13777977 chrlO 13777978 rs3750885 G A chrlO 16589590 chrlO 16589591 rs60165268 G A chrlO 16755791 chrlO 16755768 rs61842277 A G chrlO 18776515 chrlO 18776505 rs75580483 T C chrlO 23776550 chrlO 23776551 rsl l013492 G A chrlO 23849845 chrlO 23849846 rs4570482 G A chrlO 27052102 chrlO 27052102 rsl l015250 C T chrlO 27866042 chrlO 27866042 rsl l015857 C T chrlO 28288545 chrlO 
28288546 rsl 15092513 G A chrlO 28615002 chrlO 28615002 rs74741916 C T chrlO 34483336 chrlO 34483337 rs674388 G C chrlO 35030767 chrlO 35030768 rs3002035 G A chrlO 43551002 chrlO 43551010 rsl335894 G T chrlO 43932157 chrlO 43932157 rsl 2268442 C T chrlO 44366697 chrlO 44366678 rs2152635 T c chrlO 44409926 chrlO 44409943 rs34118524 T c chrlO 44496487 chrlO 44496488 rsl2358198 G c chrlO 56056854 chrlO 56056864 rsl 1004420 T c chrlO 57232141 chrlO 57232142 rs77220716 G A chrlO 62185895 chrlO 62185897 rs7077201 G T chrlO 63787013 chrlO 63786997 A G chrlO 65318842 chrlO 65318842 rsl916450 C T chrlO 67256157 chrlO 67256158 rs7477944 G A chrlO 72032085 chrlO 72032086 rs3758562 G A chrlO 72187843 chrlO 72187836 rsl 0999516 G A chrlO 76502636 chrlO 76502636 rs56126796 C T chrlO 77773728 chrlO 77773738 rsl 1001736 C T chrlO 79553308 chrlO 79553308 C T chrlO 80595460 chrlO 80595460 C T chrlO 80702560 chrlO 80702471 rsl0824737 G A chrlO 82331162 chrlO 82331145 rsl l l 86151 T G chrlO 83582387 chrlO 83582387 rsl l l91455 c T chrlO 85263317 chrlO 85263318 rsl590415 G A chrlO 85266299 chrlO 85266300 rs4471365 G A chrlO 85371798 chrlO 85371799 rsl2414182 G A chrlO 88086065 chrlO 88086066 rsl0788490 G A chrlO 91312074 chrlO 91312076 rs7083585 T C chrlO 105328811 chrlO 105328812 rsl l 815048 G A chrlO 110022712 chrlO 110022713 rsl 17322960 G A chrlO 112181902 chrlO 112181903 rs7915044 G A chrlO 112560572 chrlO 112560565 rsl0885050 G A chrlO 115419136 chrlO 115419136 rsl 1816040 C T chrlO 119220533 chrlO 119220534 rs242969 G A chrlO 123880758 chrlO 123880758 rs35909191 C T chrlO 124407375 chrlO 124407375 rs3013180 C T chrlO 124571836 chrlO 124571836 rs76231035 C T chrlO 128705880 chrlO 128705880 rs73384457 C T chrlO 129083365 chrlO 129083366 G A chrlO 129173444 chrlO 129173445 rs74961347 G A chrlO 131395403 chrlO 131395393 rs57758055 T C chrlO 131403271 chrlO 131403272 rs78815056 G A chrlO 132428171 chrlO 132428172 G C chrlO 132913126 chrlO 132913126 rs74817440 C T chrlO 133430053 
chrlO 133430054 G A chrlO 134737876 chrlO 134737876 rs4074092 C T chrlO 135075471 chrlO 135075450 rs7906765 G T chrlO 135075473 chrlO 135075450 rs7906765 G T chrlO 135192304 chrlO 135192305 rsl536828 G c chrlO 135192633 chrlO 135192634 rs8192770 G A chr7 180344 chr7 180344 rsl0253011 C T chr7 189332 chr7 189332 rs6962041 C T chr7 227597 chr7 227597 rsl 2540429 C T chr7 848384 chr7 848384 rs6415241 C T chr7 2910843 chr7 2910713 rsl713918 A c chr7 3694519 chr7 3694422 rs7778346 A c chr7 4107798 chr7 4107798 rsl7134377 C T chr7 4994057 chr7 4994057 rsl 11957975 C T chr7 5136058 chr7 5136058 rs6972792 c T chr7 24085207 chr7 24085208 G A chr7 29391730 chr7 29391714 rs6462135 G T chr7 30897832 chr7 30897832 rs!203181 C T chr7 31104620 chr7 31104621 rs2299908 G A chr7 33577181 chr7 33577181 rs2256309 C T chr7 36093161 chr7 36093161 rs61675108 C T chr7 36825146 chr7 36825147 rs73690052 G A chr7 37431894 chr7 37431897 rsl731993 G A chr7 41295198 chr7 41295198 C T chr7 45157714 chr7 45157715 rsl 0228907 G A chr7 45516100 chr7 45516101 rsl7172585 G A chr7 46136039 chr7 46136044 rsl0255580 C T chr7 51922317 chr7 51922318 rsl 1238145 G A chr7 54479686 chr7 54479687 rsl 17604640 G A chr7 55686196 chr7 55686196 rsl 0244823 C T chr7 61542452 chr7 61542408 rs59188871 G A chr7 67026359 chr7 67026345 rs6460413 C T chr7 67612235 chr7 67612203 rs78742511 G A chr7 69847235 chr7 69847221 rs79808556 C T chr7 71515973 chr7 71515974 rsl2540434 G C chr7 75495655 chr7 75495655 rsl 894812 C T chr7 82744111 chr7 82744111 rs74406218 C T chr7 88933402 chr7 88933402 rs42477 C T chr7 90892275 chr7 90892276 rs713317 G A chr7 91949974 chr7 91949974 rs4727277 C T chr7 97449132 chr7 97449132 rsl 0240723 C T chr7 98000896 chr7 98000897 rs817918 G A chr7 99165093 chr7 99165094 rs2140128 G A chr7 101072837 chr7 101072838 rsl 1975626 G A chr7 104627725 chr7 104627726 G A chr7 107198953 chr7 107198952 rs41668 G A chr7 109979249 chr7 109979249 rsl2537289 C T chr7 116677136 chr7 116677235 rsl0953839 A C chr7 
117541782 chr7 117541783 rsl0263991 G A chr7 122427336 chr7 122427337 rs9770283 G A chr7 123943094 chr7 123943095 rsl7132872 G C chr7 129659119 chr7 129659129 rsl 3229387 T C chr7 129711001 chr7 129711001 rs2178158 C T chr7 131788343 chr7 131788344 rsl0954376 G A chr7 136552370 chr7 136552283 rs832973 C T chr7 138230550 chr7 138230550 rs74700697 C T chr7 148139033 chr7 148139034 G A chr7 149862242 chr7 149862242 rsl 1769828 C T chr7 150672875 chr7 150672875 rs2975200 C T chr7 150733814 chr7 150733800 rs4726024 G A chr7 151060577 chr7 151060578 rs62478184 G A chr7 152059611 chr7 152059612 rs75986755 G A chr7 154590740 chr7 154590741 rs 10242760 G A chr7 154877055 chr7 154877056 rs 11760227 G A chr7 155311007 chr7 155311008 rs288765 G A chr7 155378850 chr7 155378835 G A chr7 155851133 chr7 155851134 rsl2386634 G A chr7 157208036 chr7 157208036 rs2109302 C T chr7 157406602 chr7 157406613 rsl 17321937 G A chr7 157758151 chr7 157758151 C T chr7 157763200 chr7 157763193 rsl 1976578 G A chr7 158417650 chr7 158417650 rs3793188 C T chr20 1806223 chr20 1806141 rsl3036379 C T chr20 3630242 chr20 3630242 rs45472900 C T chr20 3979438 chr20 3979439 rs6052269 G T chr20 4519694 chr20 4519695 rs3761232 G A chr20 4700020 chr20 4700021 rs2422961 G A chr20 8515266 chr20 8515266 rs6055929 C T chr20 9700795 chr20 9700796 rs77467524 G A chr20 16568031 chr20 16568031 rs56676408 C CG chr20 16828892 chr20 16828892 rsl 12837998 C T chr20 23273046 chr20 23273047 rs6048740 G A chr20 34496077 chr20 34496078 rsl 275392 G A chr20 36356153 chr20 36356141 rs6069597 T C chr20 40930797 chr20 40930785 rs208181 A G chr20 43089195 chr20 43089196 rs6065774 G A chr20 43714154 chr20 43714140 rs6104273 T C chr20 43755974 chr20 43755975 G A chr20 46211855 chr20 46211856 rsl 3044461 G A chr20 49840916 chr20 49840909 rs6126344 A C chr20 59325202 chr20 59325202 rs78580698 C T chr20 59354510 chr20 59354510 rsl3045809 C T chr20 59359393 chr20 59359393 rs6101325 c T chr20 59370918 chr20 59370919 rsl 12977494 G T 
chr20 59655012 chr20 59655012 rs2427223 C T chr20 59674039 chr20 59674040 rs57868457 G A chr20 59868876 chr20 59868877 rs6121817 G A chr20 61451548 chr20 61451548 rsl 044397 C T chr20 61602801 chr20 61602801 rs910946 C T chrl9 955710 chrl9 955710 rs4807399 C T chrl9 955822 chrl9 955823 rs4806908 G A chrl9 1407594 chrl9 1407595 rs3896894 G c chrl9 2436683 chrl9 2436698 rsl 1670337 T c chrl9 2471351 chrl9 2471352 rsl 14492803 G A chrl9 5150276 chrl9 5150277 rs8107318 G A chr!9 8907870 chrl9 8907876 rs4804378 G T chrl9 9116967 chrl9 9116967 rs8106716 C T chrl9 11915670 chrl9 11915670 C T chrl9 13466552 chrl9 13466561 rsl363343 T c chrl9 16028975 chrl9 16028975 rs 17722606 c T chrl9 16051801 chrl9 16051802 rs3826723 G A chrl9 17174811 chrl9 17174811 C T chrl9 21173301 chrl9 21173302 rsl988919 G A chrl9 24355894 chrl9 24355997 rs8187215 G A chrl9 34216606 chrl9 34216611 rs9749591 C T chrl9 34590865 chrl9 34590848 rs73014561 G A chrl9 34688329 chrl9 34688329 rsl2463326 C T chrl9 34716579 chrl9 34716570 rs7252103 A C chrl9 35829274 chrl9 35829274 rs 12974562 C T chrl9 35906446 chrl9 35906446 rsl6964560 C T chrl9 40236802 chrl9 40236803 rs2451996 G A chrl9 42721340 chrl9 42721340 rs2385126 C T chrl9 48285249 chrl9 48285261 rs 10426895 C T chrl9 49770221 chrl9 49770221 rsl2978619 C T chrl9 53508643 chrl9 53508644 rsl2980180 G A chrl9 55643600 chrl9 55643600 rs77008217 C T chrl9 55760242 chrl9 55760264 rs62116099 T c chrl9 59908706 chrl9 59908706 rsl325155 c T chrl9 60703924 chrl9 60703924 c T chrl9 60889870 chrl9 60889871 rs76080867 G A chrl9 61066851 chrl9 61066851 rs35093307 C T chr6 3134177 chr6 3134178 rs55728283 G A chr6 3657428 chr6 3657428 rs226935 C T chr6 3665281 chr6 3665282 rs226967 G A chr6 3665302 chr6 3665282 rs226967 G A chr6 3754508 chr6 3754509 rsl2197082 G A chr6 4629637 chr6 4629637 C T chr6 7202789 chr6 7202789 rs 12662874 C T chr6 7422212 chr6 7422212 rs2763119 C T chr6 13172487 chr6 13172382 rs9381653 T A chr6 15030518 chr6 15030514 rs367558 T C chr6 
16055474 chr6 16055474 rs6941696 c T chr6 16456848 chr6 16456848 rsl79984 c T chr6 18191570 chr6 18191571 rs77417815 G A chr6 23319085 chr6 23319086 rs35909592 G A chr6 25520681 chr6 25520681 rs4712934 C T chr6 29769803 chr6 29769803 rsl 13797808 C T chr6 29822038 chr6 29822032 rsl l5018116 G T chr6 29942752 chr6 29942752 rsl l l569173 C T chr6 30016088 chr6 30016089 rsl 12692946 G A chr6 30047522 chr6 30047523 rsl l4059410 G A chr6 31416606 chr6 31416607 rsl l3313361 G C chr6 34218235 chr6 34218235 rs9380409 C T chr6 34971693 chr6 34971694 rs9357186 G A chr6 35611441 chr6 35611441 rs2103681 C T chr6 39219801 chr6 39219802 rs228823 G A chr6 41835462 chr6 41835463 rs79529786 G A chr6 50325671 chr6 50325672 rsl417503 G A chr6 52931733 chr6 52931715 rs436675 C T chr6 55983872 chr6 55983873 rsl2660068 G A chr6 57482430 chr6 57482454 rs62400921 A G chr6 66999633 chr6 66999640 rs9294687 C T chr6 70914541 chr6 70914532 rsl 1759148 G A chr6 74674414 chr6 74674414 rs2917878 C T chr6 76889103 chr6 76889092 rs77479261 T C chr6 77065803 chr6 77065804 G A chr6 77218572 chr6 77218572 rs4598033 C T chr6 78092375 chr6 78092375 rs76359405 C T chr6 91505598 chr6 91505598 rsl923082 C T chr6 91520240 chr6 91520158 rs9362780 G A chr6 94527022 chr6 94527023 rs9363114 G A chr6 95061281 chr6 95061282 rs434310 G A chr6 97556469 chr6 97556434 rs6568697 G A chr6 99086988 chr6 99086989 rsl481451 G A chr6 109968481 chr6 109968481 rs9320293 C T chr6 114491691 chr6 114491692 rsl334902 G A chr6 133057292 chr6 133057293 rs45568334 G A chr6 133779789 chr6 133779790 rs3777840 G A chr6 139849739 chr6 139849739 rsl468708 C T chr6 144067944 chr6 144067944 rsl2528079 C T chr6 144771025 chr6 144771026 rs7738289 G A chr6 147955921 chr6 147955921 rs71566486 C T chr6 148604917 chr6 148604917 rs79426509 C T chr6 150339215 chr6 150339216 rs75997276 G A chr6 151092617 chr6 151092617 rs62434148 C T chr6 152084984 chr6 152084983 rs3020340 A G chr6 152348059 chr6 152348059 C T chr6 168239921 chr6 168239923 
rs4708433 C T chr6 168382994 chr6 168382994 rs56657236 c T chr6 169035961 chr6 169035961 rs58522233 c T chr6 169228265 chr6 169228265 rsl l l301366 c T chr6 170324143 chr6 170324144 G A chr6 170586031 chr6 170586031 rs4710723 C T chrl l 286674 chrl l 286675 rs72636979 G A chrl 1 1376170 chrl l 1376187 rs74046931 C G chrl 1 1404259 chrl l 1404243 rs73409585 C T chrl 1 1432204 chrl l 1432204 rs66529968 c T chrl 1 2391098 chrl l 2391098 rsl l605839 c T chrl 1 4202387 chrl l 4202357 rs79850225 T c chrl 1 5784028 chrl l 5784028 rsl 1607346 c T chrl 1 5890010 chrl l 5890010 rs2198445 c T chrl 1 5890111 chrl l 5889993 rs2198444 T G chrl 1 6737527 chrl l 6737528 rs4757969 G T chrl 1 7926543 chrl l 7926544 rsl510998 G A chrl 1 10244563 chrl l 10244447 rsl7295954 A C chrl 1 19499355 chrl 1 19499355 rsl7597260 C T chrl 1 20003681 chrl 1 20003680 rsl l 825935 G c chrl 1 24638115 chrl 1 24638108 rs3923615 G T chrl 1 24638136 chrl 1 24638108 rs3923615 G T chrl 1 26586186 chrl 1 26586187 rsl0835012 G c chrl 1 30808609 chrl 1 30808610 rs3741026 G A chrl 1 36194135 chrl 1 36194136 rsl 1605921 G A chrl 1 42853579 chrl 1 42853499 rsl0837957 A c chrl 1 42881323 chrl 1 42881279 rs79598717 C T chrl 1 43929787 chrl 1 43929788 rs66616283 G A chrl 1 43971264 chrl 1 43971264 rsl 0400220 C T chrl 1 44606989 chrl 1 44606989 rs7127343 C T chrl 1 44855135 chrl 1 44855135 rs4755908 C T chrl 1 61914644 chrl 1 61914627 rs2302361 T A chrl 1 62653400 chrl 1 62653316 rs2097075 G c chrl 1 64738413 chrl 1 64738413 rsl 2420456 C T chrl 1 65249688 chrl 1 65249688 rs481482 C T chrl 1 66026555 chrl 1 66026555 C T chrl 1 68353461 chrl 1 68353447 rs77818363 A G chrl 1 68829223 chrl l 68829224 G A chrl 1 69511500 chrl 1 69511475 rs61885142 T A chrl 1 69570099 chrl 1 69570099 rs7947026 C T chrl 1 74191076 chrl 1 74191076 rs536823 c T chrl 1 78356329 chrl 1 78356330 rs681267 G A chrl 1 79683029 chrl l 79683030 rsl 0792476 G A chrl 1 80299348 chrl 1 80299348 rs4945458 C T chrl 1 80393911 chrl 1 80393886 
rsl0792530 G A chrl 1 80524400 chrl 1 80524401 rs2512104 G A chrl 1 86328959 chrl 1 86328960 rsl l234888 G A chrl 1 86411474 chrl 1 86411445 rsl 1234911 T C chrl 1 86568788 chrl 1 86568788 rs7946112 C T chrl 1 88005197 chrl 1 88005198 rs67163709 G A chrl 1 91573544 chrl l 91573545 G A chrl l 98580740 chrl l 98580742 rs7938173 T C chrl l 108395653 chrl l 108395644 rs7927335 T C chrl l 111831541 chrl l 111831541 c T chrl l 123999512 chrl l 123999513 rs2156155 G A chrl l 125749828 chrl l 125749805 A G chrl l 126024189 chrl l 126024190 rs4937169 G C chrl l 128158107 chrl l 128158108 rs2284785 G A chrl l 129290825 chrl l 129290804 rs3734073 G A chrl l 130480464 chrl l 130480465 rs79528350 G A chrl l 131926781 chrl l 131926781 rs4400825 C T chrl l 131958536 chrl l 131958551 rs75470688 G A chrl l 131986396 chrl l 131986383 rs9787796 C T chrl l 133193474 chrl l 133193483 rs58888419 G A chrl l 134119657 chrl l 134119657 rs4937946 C T chr4 868048 chr4 868048 rs899387 C T chr4 1218946 chr4 1218947 rs73793145 G A chr4 1219071 chr4 1219071 rs73793148 C T chr4 1235385 chr4 1235277 rs2279281 T G chr4 1348894 chr4 1348895 rs79873757 G A chr4 3679320 chr4 3679320 C T chr4 4445519 chr4 4445519 rsl0017186 C T chr4 5508174 chr4 5508111 rs3821926 A G chr4 6217443 chr4 6217357 rs6824225 A G chr4 7795563 chr4 7795564 rsl3130675 G A chr4 9724236 chr4 9724219 rs4393994 C T chr4 10613189 chr4 10613190 rsl6876870 G A chr4 13441393 chr4 13441393 rs77764276 C T chr4 13710915 chr4 13710898 rs3843431 C A chr4 15351689 chr4 15351690 rsl 807250 G C chr4 15572770 chr4 15572771 rs2286461 G A chr4 15634935 chr4 15634935 rsl7478336 C T chr4 24710495 chr4 24710495 rs6819784 C T chr4 24970264 chr4 24970265 G c chr4 30810032 chr4 30810032 rsl587410 C T chr4 37878854 chr4 37878738 rs4833008 G A chr4 39482965 chr4 39482965 rs59496471 C T chr4 43797868 chr4 43797869 rs7656241 G A chr4 44321394 chr4 44321395 rs80199883 G A chr4 55517623 chr4 55517624 rsl 1728462 G A chr4 55559063 chr4 55559064 rs77772711 G T 
chr4 59819066 chr4 59819065 rs5027239 A G chr4 60591058 chr4 60591059 rs76979826 G A chr4 65851368 chr4 65851368 rsl l737511 C T chr4 78356258 chr4 78356228 rs4629478 A G chr4 78631735 chr4 78631736 rs6856567 G A chr4 111417683 chr4 111417683 rs2713947 C T chr4 113971463 chr4 113971463 rs72906881 C T chr4 114578185 chr4 114578185 rsl l7751698 C T chr4 116609419 chr4 116609388 rsl0027287 G A chr4 116629029 chr4 116629029 rs9884192 C T chr4 120441740 chr4 120441739 rsl 13434765 G A chr4 124606908 chr4 124606909 rsl7807333 G A chr4 127289521 chr4 127289521 C T chr4 133358739 chr4 133358740 rsl68339 G A chr4 133381018 chr4 133381019 rsl 3126205 G C chr4 139662612 chr4 139662612 rsl450432 C T chr4 148244491 chr4 148244491 C T chr4 150184734 chr4 150184735 rs4355369 G A chr4 163740399 chr4 163740400 rs62326271 G A chr4 182328617 chr4 182328617 C T chr4 186015738 chr4 186015649 rsl 11420399 T C chr4 188330197 chr4 188330198 rs55761513 G T chr4 190270951 chr4 190270953 rs35316875 A G chr4 190395130 chr4 190395130 rsl0021138 C A chr9 672944 chr9 672945 rsl 6922510 G A chr9 830970 chr9 830971 rs942073 G A chr9 1730278 chr9 1730278 rs4741586 C T chr9 1788528 chr9 1788529 rsl6934186 G A chr9 2186734 chr9 2186734 rsl886264 C T chr9 2198793 chr9 2198793 rsl0965207 C T chr9 7928329 chr9 7928329 C T chr9 9977332 chr9 9977333 rsl0978149 G A chr9 12292804 chr9 12292805 rs79962943 G T chr9 12319961 chr9 12319961 rs79334083 C T chr9 13694539 chr9 13694539 rsl998584 C T chr9 16872284 chr9 16872285 rs61202585 G A chr9 22544603 chr9 22544601 rsl l58382 G A chr9 27885371 chr9 27885371 rsl930021 C T chr9 44173605 chr9 44173606 rs28829650 G A chr9 85528826 chr9 85528826 rs4877803 C T chr9 85923713 chr9 85923713 rsl 1140389 C T chr9 89228881 chr9 89228882 rsl 15072974 G C chr9 89671061 chr9 89671065 rsl 1142008 T c chr9 90503718 chr9 90503719 rs4559317 G A chr9 90504278 chr9 90504296 rsl573234 T C chr9 91634970 chr9 91634970 rs73649344 C T chr9 92040775 chr9 92040775 rsl475628 c T chr9 
92142496 chr9 92142497 rs2254787 G A chr9 94639659 chr9 94639660 rs61743154 G A chr9 99310684 chr9 99310667 rs7044880 A G chr9 100069864 chr9 100069865 G A chr9 100807123 chr9 100807124 rsl6918128 G A chr9 100855291 chr9 100855291 rs74483774 C T chr9 103563252 chr9 103563252 rsl572563 C T chr9 106427883 chr9 106427898 rs2417536 G A chr9 107818719 chr9 107818720 rs 16925120 G A chr9 110589628 chr9 110589580 rs7870428 G T chr9 114055802 chr9 114055802 rsl0817303 C T chr9 124818467 chr9 124818468 rs653441 G A chr9 125158942 chr9 125158942 C T chr9 125997371 chr9 125997372 rs73575777 G A chr9 129336197 chr9 129336197 rsl0987675 C T chr9 130617908 chr9 130617908 rs7025290 C T chr9 133967698 chr9 133967699 rsl 11479389 G A chr9 134017310 chr9 134017311 rs 11243644 G A chr9 134032016 chr9 134031987 rs3829759 G A chr9 138279423 chr9 138279424 rs35302005 G C chr9 139425484 chr9 139425484 C T chr5 1136420 chr5 1136421 rsl l7206816 G A chr5 1180318 chr5 1180318 rs56221185 C T chr5 2061988 chr5 2061988 rs74365793 C T chr5 2188252 chr5 2188252 rs73027534 C T chr5 2194959 chr5 2194944 rsl0056878 G A chr5 2194964 chr5 2194944 rsl0056878 G A chr5 2738550 chr5 2738550 rsl3170001 C T chr5 3184398 chr5 3184398 rsl60899 C T chr5 3269672 chr5 3269672 rs57572328 C T chr5 3303986 chr5 3303987 rs62338472 G A chr5 3304117 chr5 3304118 rs77583350 G A chr5 5360238 chr5 5360239 rs34029195 G C chr5 6406001 chr5 6406001 rs271416 C T chr5 6406017 chr5 6406001 rs271416 C T chr5 6776546 chr5 6776546 rs274714 C T chr5 8114221 chr5 8114222 rs79587908 G A chr5 9019159 chr5 9019162 rs6863266 C T chr5 9197643 chr5 9197644 rsl6882093 G A chr5 9900752 chr5 9900753 rsl 0462870 G A chr5 11335256 chr5 11335257 rs4702796 G A chr5 19243058 chr5 19243058 rs6865590 C T chr5 28167021 chr5 28167022 rsl 0472602 G A chr5 40815447 chr5 40815448 rsl 54276 G A chr5 40903405 chr5 40903387 rs364443 A G chr5 46257191 chr5 46257191 rsl0052784 C T chr5 56857856 chr5 56857856 rs72765423 C T chr5 57111930 chr5 57111930 
rsl432477 c T chr5 58085398 chr5 58085399 G A chr5 68465953 chr5 68465954 rs 164707 G A chr5 78657453 chr5 78657453 rs3733885 C T chr5 79599567 chr5 79599567 rs79747351 C T chr5 90074285 chr5 90074285 rs4916818 C T chr5 116611972 chr5 116611972 C T chr5 119173180 chr5 119173174 rs7709933 C T chr5 123259371 chr5 123259373 rs 11745800 C T chr5 125918889 chr5 125918774 rs34731336 G A chr5 132902616 chr5 132902616 rs461418 C T chr5 132902642 chr5 132902643 rs462330 G A chr5 133223255 chr5 133223255 rs 1644312 C T chr5 133276022 chr5 133276023 rs62373722 G A chr5 135113312 chr5 135113312 rs4495176 C T chr5 135403503 chr5 135403372 rs4141306 A G chr5 136738011 chr5 136738012 rs34825120 G A chr5 143346948 chr5 143346949 rs312615 G T chr5 145309192 chr5 145309173 rs999237 T C chr5 146968646 chr5 146968647 rsl 14906228 G A chr5 150508726 chr5 150508726 rs55797624 C T chr5 157102545 chr5 157102546 rs6867670 G A chr5 160980854 chr5 160980854 rs79615688 C T chr5 161515354 chr5 161515355 rs647625 G A chr5 166025924 chr5 166025925 rsl 7067407 G A chr5 166760140 chr5 166760140 rs9313378 C T chr5 179480297 chr5 179480298 rs72813638 G A
Table 11. Summary of genotyping results of the trio
                                  HK GWAS                        HK Genotyping
                                  (955 control vs 677 T2D)       (421 control vs 1144 T2D)
SNP rs#           Gene      Chr   A1   A2   OR     P value       A1   A2   OR     P value
rs1052637         DDX18     2     C    G    1.35   0.0025        C    G    0.98   0.873
rs748767          CCDC93    2     G    A    0.78   0.0062        G    A    1.00   0.959
rs9998519         WFS1      4     C    T    0.62   0.0059        C    T    0.76   0.035
rs1534938         PACRG     6     T    G    0.72   1.82e-04      T    G    0.84   0.039
rs17497819        CAMK1D    10    G    T    2.27   0.0026        T    G    0.98   0.912
rs11597439        CUED2     10    C    G    1.95   0.0029        G    C    1.01   0.915
rs117510629       KDM2B     12    T    A    1.43   0.038         A    T    1.01   0.926
rs11065587        KDM2B     12    G    C    1.43   0.038         C    G    1.01   0.933
chr12:120395303   KDM2B     12    G    A    1.43   0.038         A    G    1.00   0.999
rs11065588        KDM2B     12    G    A    1.39   0.041         A    G    1.03   0.833
rs3743599         ADAT1     16    T    G    0.77   0.0041        C    A    0.86   0.086
rs3743598         ADAT1     16    T    G    0.71   0.002         C    A    0.82   0.057
rs734409          -         17    T    C    0.78   0.0081        G    A    1.13   0.178
rs3818744         PPP4R1L   20    G    A    0.83   0.03          G    A    0.88   0.127
rs17114359        UMODL1    21    A    C    1.25   0.01          C    A    0.97   0.689
rs220305          UMODL1    21    G    A    1.37   2.72e-04      T    C    1.03   0.697
rs2839467         UMODL1    21    G    A    1.25   0.011         A    G    0.93   0.403
rs93136           UMODL1    21    G    A    1.36   2.87e-04      T    C    1.02   0.788
Table 11. Summary of the genotyping results. Co-segregated mother-daughter SNPs from focus gene regions showing nominally significant P values in the Chinese GWAS meta-analysis were genotyped independently in 1144 Chinese T2D cases and 421 controls.
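By way of illustration only, the following Python sketch shows one way the two stages summarized in Table 11 could be cross-checked. It transcribes the A1 allele, odds ratio (OR), and P value of each row for both the HK GWAS and the HK Genotyping cohorts, and lists the SNPs whose association holds in both. The 0.05 cutoff, the requirement that A1 and the direction of the OR agree between cohorts, and the names `Row`, `TABLE_11`, and `replicated` are assumptions made for this example and are not part of the disclosure.

```python
from typing import List, NamedTuple


class Row(NamedTuple):
    """One row of Table 11 (A2 columns omitted for brevity)."""
    snp: str
    gene: str
    gwas_a1: str
    gwas_or: float
    gwas_p: float
    rep_a1: str
    rep_or: float
    rep_p: float


# Values transcribed from Table 11; an empty gene means none is listed.
TABLE_11 = [
    Row("rs1052637", "DDX18", "C", 1.35, 0.0025, "C", 0.98, 0.873),
    Row("rs748767", "CCDC93", "G", 0.78, 0.0062, "G", 1.00, 0.959),
    Row("rs9998519", "WFS1", "C", 0.62, 0.0059, "C", 0.76, 0.035),
    Row("rs1534938", "PACRG", "T", 0.72, 1.82e-04, "T", 0.84, 0.039),
    Row("rs17497819", "CAMK1D", "G", 2.27, 0.0026, "T", 0.98, 0.912),
    Row("rs11597439", "CUED2", "C", 1.95, 0.0029, "G", 1.01, 0.915),
    Row("rs117510629", "KDM2B", "T", 1.43, 0.038, "A", 1.01, 0.926),
    Row("rs11065587", "KDM2B", "G", 1.43, 0.038, "C", 1.01, 0.933),
    Row("chr12:120395303", "KDM2B", "G", 1.43, 0.038, "A", 1.00, 0.999),
    Row("rs11065588", "KDM2B", "G", 1.39, 0.041, "A", 1.03, 0.833),
    Row("rs3743599", "ADAT1", "T", 0.77, 0.0041, "C", 0.86, 0.086),
    Row("rs3743598", "ADAT1", "T", 0.71, 0.002, "C", 0.82, 0.057),
    Row("rs734409", "", "T", 0.78, 0.0081, "G", 1.13, 0.178),
    Row("rs3818744", "PPP4R1L", "G", 0.83, 0.03, "G", 0.88, 0.127),
    Row("rs17114359", "UMODL1", "A", 1.25, 0.01, "C", 0.97, 0.689),
    Row("rs220305", "UMODL1", "G", 1.37, 2.72e-04, "T", 1.03, 0.697),
    Row("rs2839467", "UMODL1", "G", 1.25, 0.011, "A", 0.93, 0.403),
    Row("rs93136", "UMODL1", "G", 1.36, 2.87e-04, "T", 1.02, 0.788),
]

ALPHA = 0.05  # assumed nominal significance threshold (not stated in the table)


def replicated(rows: List[Row], alpha: float = ALPHA) -> List[str]:
    """Return SNP IDs with P < alpha in both cohorts, the same A1 allele,
    and odds ratios on the same side of 1."""
    hits = []
    for r in rows:
        same_allele = r.gwas_a1 == r.rep_a1
        same_direction = (r.gwas_or - 1.0) * (r.rep_or - 1.0) > 0
        if r.gwas_p < alpha and r.rep_p < alpha and same_allele and same_direction:
            hits.append(r.snp)
    return hits


print(replicated(TABLE_11))  # → ['rs9998519', 'rs1534938']
```

Under these assumed criteria, only rs9998519 (WFS1) and rs1534938 (PACRG) pass in both cohorts.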

Claims

WHAT IS CLAIMED IS:
1. A method for detecting the presence or increased risk of developing type 2 diabetes (T2D) in a subject, the method comprising:
(a) measuring, in a tissue sample obtained from the subject, expression or DNA methylation levels of one or more T2D-associated genes, wherein the T2D-associated genes are selected from the group consisting of LRRC7, TXNDC12, TGFBR3, AJAP1, AGRN, APOB, EIF4E3, SLC35A4, PACRG, COL15A1, NR4A3, E2F8, PCDH17, PLEKHH3, SCARF2, CUEDC2, KDM2B, PDCD1, miR-16-1, and miR-16-2;
(b) comparing the expression or DNA methylation levels of the one or more T2D-associated genes with levels from the same genes in a standard control; and
(c) detecting the presence or increased risk of developing T2D when the expression or DNA methylation levels of the one or more T2D-associated genes are higher or lower than the levels in the standard control.
2. The method of claim 1, wherein step (a) comprises measuring expression levels of the one or more T2D-associated genes, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, EIF4E3, CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
3. The method of claim 2, wherein measuring expression levels comprises measuring levels of RNA transcripts of the one or more T2D-associated genes.
4. The method of claim 2, wherein measuring expression levels comprises measuring amounts of proteins or polypeptides resulting from translation of RNA transcripts of the one or more T2D-associated genes.
5. The method of claim 2, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, and EIF4E3.
6. The method of claim 2, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the expression levels of the one or more T2D-associated genes are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
7. The method of claim 1, wherein step (a) comprises measuring DNA methylation levels of the one or more T2D-associated genes and the T2D-associated genes are selected from the group consisting of PDCD1, EIF4E3, PACRG, and KDM2B.
8. The method of claim 7, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of PDCD1 and EIF4E3.
9. The method of claim 8, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is higher than the level in the standard control.
10. The method of claim 7, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of EIF4E3, PACRG, and KDM2B.
11. The method of claim 10, wherein step (c) comprises detecting the presence or increased risk of developing T2D when the DNA methylation level of a promoter region in the one or more T2D-associated genes is lower than the level in the standard control.
12. A method for detecting the presence or increased risk of developing T2D in a subject, the method comprising:
(a) determining, in a tissue sample obtained from the subject, a single-nucleotide genotype of the subject at a genomic locus, wherein the genomic locus is given by a SNP ID number provided in Table 9, 10, or 11; and
(b) detecting the presence or increased risk of developing T2D in the subject if the single-nucleotide genotype matches a reference genotype associated with T2D.
13. The method of claim 12, wherein the genomic locus is given by a SNP ID number provided in Table 11 and selected from the group consisting of rs1052637, rs748767, rs9998519, rs1534938, rs17497819, rs11597439, rs117510629, rs11065587, chr12:120395303, rs11065588, rs3743599, rs3743598, rs734409, rs3818744, rs17114359, rs220305, rs2839467, and rs93136.
14. The method of claim 13, wherein the reference genotype associated with T2D for each genomic locus is provided in column A1 of Table 11.
15. The method of claim 13, wherein the genomic locus is given by the SNP ID number rs9998519 and the reference genotype is C.
16. The method of claim 13, wherein the genomic locus is given by the SNP ID number rs1534938 and the reference genotype is T.
17. The method of claim 12, wherein the genomic locus is given by a SNP ID number provided in Table 9 and the reference genotype for the genomic locus is provided in the "risk genotype" column of Table 9.
18. The method of claim 12, wherein the genomic locus is given by a SNP ID number provided in Table 10 and the reference genotype for the genomic locus is provided in the "risk allele" column of Table 10.
19. The method of claim 12, wherein steps (a) and (b) are performed for a plurality of genomic loci.
20. The method of claim 1 or 12, wherein the sample is a blood or saliva sample.
21. The method of claim 1 or 12, wherein the subject is of Asian descent.
22. The method of claim 21, wherein the subject is a Chinese.
23. The method of claim 22, wherein the subject is a Han Chinese.
24. The method of claim 1 or 12, wherein the subject has a family history of T2D but has not been diagnosed with T2D.
25. The method of claim 1 or 12, wherein step (a) comprises an amplification reaction.
26. The method of claim 25, wherein the amplification reaction is a polymerase chain reaction (PCR).
27. A method of identifying genetic markers of T2D, the method comprising:
obtaining a multiomic data set from one or more T2D-positive subjects, wherein each T2D-positive subject has been diagnosed with T2D;
obtaining a multiomic data set from one or more T2D-negative subjects, wherein each T2D-negative subject has not been diagnosed with T2D;
identifying differences between the multiomic data sets obtained from the one or more T2D-positive subjects and the one or more T2D-negative subjects; and
identifying one or more genetic markers based on the differences.
28. The method of claim 27, wherein the T2D-positive subjects and T2D-negative subjects together comprise a family trio, and the family trio comprises a father, a mother, and an offspring.
29. The method of claim 27, wherein each multiomic data set comprises at least two of the following: DNA sequencing data, RNA sequencing data, and RNA expression level data.
30. The method of claim 27, wherein differences between the multiomic data sets from the one or more T2D-positive subjects and the one or more T2D-negative subjects are identified by comparing the multiomic data sets pair-wise.
31. The method of claim 27, wherein the one or more genetic markers are identified using network analysis.
32. The method of claim 27, wherein the genetic markers comprise differentially expressed genes, differentially expressed microRNAs, differentially methylated regions, or single-nucleotide polymorphisms.
33. The method of claim 32, wherein the genetic markers comprise a differentially expressed gene that contains or overlaps with a differentially methylated region.
34. A kit for detecting the presence or increased risk of developing type 2 diabetes (T2D) in a subject, the kit comprising: reagents for measuring, in a tissue sample obtained from the subject, expression or DNA methylation levels of one or more T2D-associated genes, wherein the T2D-associated genes are selected from the group consisting of LRRC7, TXNDC12, TGFBR3, AJAP1, AGRN, APOB, EIF4E3, SLC35A4, PACRG, COL15A1, NR4A3, E2F8, PCDH17, PLEKHH3, SCARF2, CUEDC2, KDM2B, PDCD1, miR-16-1, and miR-16-2; and a standard control,
wherein the presence or increased risk of developing T2D is detected when the expression or DNA methylation levels of the one or more T2D-associated genes are higher or lower than the levels in the standard control.
35. The kit of claim 34, wherein the reagents are used for measuring expression levels of the one or more T2D-associated genes, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, EIF4E3, CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
36. The kit of claim 35, wherein the presence or increased risk of developing T2D is detected when the expression levels of the one or more T2D-associated genes are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of KDM2B, TXNDC12, and EIF4E3.
37. The kit of claim 35, wherein the presence or increased risk of developing T2D is detected when the expression levels of the one or more T2D-associated genes are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of CUEDC2, PDCD1, PACRG, miR-16-1, and miR-16-2.
38. The kit of claim 34, wherein the reagents are used for measuring DNA methylation levels of the one or more T2D-associated genes and the T2D-associated genes are selected from the group consisting of PDCD1, EIF4E3, PACRG, and KDM2B.
39. The kit of claim 38, wherein the presence or increased risk of developing T2D is detected when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are higher than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of PDCD1 and EIF4E3.
40. The kit of claim 38, wherein the presence or increased risk of developing T2D is detected when the DNA methylation levels of the one or more T2D-associated genes, or regions of these genes, are lower than the levels in the standard control, and the T2D-associated genes are selected from the group consisting of EIF4E3, PACRG, and KDM2B.
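The direction rules of claims 36 and 37 (expression higher or lower than the standard control) can be sketched as a simple lookup. The gene-to-direction mapping is taken from those two claims; everything else is an assumption for illustration, including the function name `flagged_expression_markers`, the dictionary layout, the strict greater-than/less-than comparison (the claims do not state how large a difference must be), and the invented levels.

```python
# Direction of change associated with T2D, as recited in claims 36 and 37.
EXPRESSION_DIRECTION = {
    "KDM2B": "higher", "TXNDC12": "higher", "EIF4E3": "higher",
    "CUEDC2": "lower", "PDCD1": "lower", "PACRG": "lower",
    "miR-16-1": "lower", "miR-16-2": "lower",
}


def flagged_expression_markers(sample, control):
    """Return genes whose level departs from the control in the claimed direction.

    sample and control map gene names to expression levels in the same units.
    Genes missing from either dict are skipped. The strict comparison is an
    assumption; a real assay would need a validated cutoff.
    """
    flagged = []
    for gene, direction in EXPRESSION_DIRECTION.items():
        if gene not in sample or gene not in control:
            continue
        if direction == "higher" and sample[gene] > control[gene]:
            flagged.append(gene)
        elif direction == "lower" and sample[gene] < control[gene]:
            flagged.append(gene)
    return flagged


# Invented levels, for illustration only.
sample = {"KDM2B": 2.0, "CUEDC2": 0.5, "PACRG": 1.2}
control = {"KDM2B": 1.0, "CUEDC2": 1.0, "PACRG": 1.0}
print(flagged_expression_markers(sample, control))
```

The methylation rules of claims 39 and 40 could be handled the same way with a second mapping, keeping in mind that EIF4E3 appears in both the higher and the lower group there, which presumably reflects different regions of the gene.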
41. A kit for detecting the presence or increased risk of developing T2D in a subject, the kit comprising: reagents for determining, in a tissue sample obtained from the subject, a single-nucleotide genotype of the subject at a genomic locus, wherein the genomic locus is given by a SNP ID number provided in Table 9, 10, or 11; and wherein the presence or increased risk of developing T2D in the subject is detected if the single-nucleotide genotype matches a reference genotype associated with T2D.
42. The kit of claim 41, wherein the genomic locus is given by the SNP ID number rs9998519 and the reference genotype is C.
43. The kit of claim 41, wherein the genomic locus is given by the SNP ID number rs1534938 and the reference genotype is T.
44. The kit of claim 41, wherein the genomic locus is given by a SNP ID number provided in Table 9 and the reference genotype for the genomic locus is provided in the "risk genotype" column of Table 9.
45. The kit of claim 41, wherein the genomic locus is given by a SNP ID number provided in Table 10 and the reference genotype for the genomic locus is provided in the "risk allele" column of Table 10.
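The genotype check of claim 41 can be sketched as a lookup against a table of reference genotypes. Only the rs9998519 entry (reference genotype C, claim 42) comes from the claims; Tables 9 to 11 are not reproduced here. The function name `matching_loci`, the encoding of a subject's call as a string of alleles such as "CT", and the rule that a locus matches when the reference nucleotide is among the called alleles are all assumptions for illustration.

```python
# Reference genotype recited in claim 42; other loci would come from Tables 9-11.
REFERENCE_GENOTYPES = {"rs9998519": "C"}


def matching_loci(observed, reference=REFERENCE_GENOTYPES):
    """Return the SNP IDs at which the subject's call matches the reference.

    observed maps SNP IDs to called alleles, e.g. {"rs9998519": "CT"}.
    A locus counts as a match if the reference nucleotide is among the
    called alleles; this matching rule is an assumption, not from the source.
    """
    return sorted(
        snp for snp, ref in reference.items()
        if ref in observed.get(snp, "").upper()
    )


print(matching_loci({"rs9998519": "CT"}))
print(matching_loci({"rs9998519": "TT"}))
```

A stricter reading of "matches" (for example, requiring a homozygous call) would only change the membership test inside the generator expression.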
46. The kit of claim 34 or 41, wherein the reagents comprise those used for an amplification reaction.
47. A device for identifying genetic markers of T2D, the device comprising: a component for obtaining a multiomic data set from one or more T2D-positive subjects, wherein each T2D-positive subject has been diagnosed with T2D;
a component for obtaining a multiomic data set from one or more T2D-negative subjects, wherein each T2D-negative subject has not been diagnosed with T2D;
a component for identifying differences between the multiomic data sets obtained from the one or more T2D-positive subjects and the one or more T2D-negative subjects; and
a component for identifying one or more genetic markers based on the differences.
48. The device of claim 47, wherein the T2D-positive subjects and T2D-negative subjects together comprise a family trio, and the family trio comprises a father, a mother, and an offspring.
49. The device of claim 47, wherein each multiomic data set comprises at least two of the following: DNA sequencing data, RNA sequencing data, and RNA expression level data.
PCT/CN2014/000408 2013-04-12 2014-04-14 Use of multiomic signature to predict diabetes WO2014166303A2 (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201480031292.3A CN105431552B (en) 2013-04-12 2014-04-14 Use of multiomic markers for predicting diabetes

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
US201361811639P 2013-04-12 2013-04-12
US61/811,639 2013-04-12

Publications (2)

Publication Number Publication Date
WO2014166303A2 (en) 2014-10-16
WO2014166303A3 (en) 2015-07-16

Family

ID=51690067

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2014/000408 WO2014166303A2 (en) 2013-04-12 2014-04-14 Use of multiomic signature to predict diabetes

Country Status (2)

Country Link
CN (1) CN105431552B (en)
WO (1) WO2014166303A2 (en)

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109295197A (en) * 2018-08-31 2019-02-01 南京金域医学检验所有限公司 BSND gene SNP mutational site serotype specific primer and its application in coronary disease disease forecasting
JP2023509677A (en) * 2020-01-10 2023-03-09 ソマロジック オペレーティング カンパニー インコーポレイテッド Methods for determining impaired glucose tolerance
CN118773333A (en) * 2024-08-05 2024-10-15 河北农业大学 A method for distinguishing Taihang chicken and Bashang long-tail chicken

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2019527343A (en) * 2016-06-21 2019-09-26 ナント ホールディングス アイピー エルエルシーNant Holdings IP, LLC Exosome-induced therapy for cancer
WO2019129200A1 (en) * 2017-12-28 2019-07-04 安诺优达基因科技(北京)有限公司 C-site extraction method and apparatus
CN117025743A (en) * 2023-08-28 2023-11-10 长江大学 Method for participating in carotenoid biosynthesis in yellow peach through miRNA regulation network based on multiple-study identification
CN118053503B (en) * 2024-01-11 2024-12-06 中国农业科学院农业基因组研究所 A method and system for constructing a multi-omics database of invasive organisms

Family Cites Families (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US7470542B2 (en) * 2001-09-05 2008-12-30 Pride Proteomics A/S Proteins in type 2 diabetes
WO2007044860A2 (en) * 2005-10-11 2007-04-19 Tethys Bioscience, Inc. Diabetes-associated markers and methods of use thereof
WO2007128884A1 (en) * 2006-05-09 2007-11-15 Oy Jurilab Ltd Novel genes and markers in type 2 diabetes and obesity
CA2713909C (en) * 2008-02-01 2023-12-12 The General Hospital Corporation Use of microvesicles in diagnosis, prognosis and treatment of medical diseases and conditions
US8669057B2 (en) * 2009-05-07 2014-03-11 Veracyte, Inc. Methods and compositions for diagnosis of thyroid conditions
EP2895861A4 (en) * 2012-09-12 2016-06-22 Berg Llc Use of markers in the identification of cardiotoxic agents

Also Published As

Publication number Publication date
WO2014166303A3 (en) 2015-07-16
CN105431552B (en) 2020-03-03
CN105431552A (en) 2016-03-23

Legal Events

Date Code Title Description
WWE WIPO information: entry into national phase

Ref document number: 201480031292.3

Country of ref document: CN

122 EP: PCT application non-entry in European phase

Ref document number: 14782892

Country of ref document: EP

Kind code of ref document: A2