Genome-wide association studies (GWAS) have identified hundreds of loci associated with Crohn's disease (CD); however, as with all complex diseases, deriving pathogenic mechanisms from these non-coding GWAS discoveries has been challenging. To complement GWAS and better define actionable biological targets, we analysed sequence data from more than 30,000 CD patients and 80,000 population controls. We observe rare coding variants in established CD susceptibility genes as well as ten genes where coding variation directly implicates the gene in disease risk for the first time.
Inflammatory bowel disease (IBD) is a chronic inflammatory disease of the gut. Genetic association studies have identified the highly variable human leukocyte antigen (HLA) region as the strongest susceptibility locus for IBD, and specifically DRB1*01:03 as a determining factor for ulcerative colitis (UC). However, for most of the association signal such a delineation could not be made because of tight linkage disequilibrium structures within the HLA. The aim of this study was therefore to further characterize the HLA signal using a trans-ethnic approach. We performed a comprehensive fine mapping of single HLA alleles in UC in a cohort of 9,272 individuals of African American, East Asian, Puerto Rican, Indian and Iranian descent and 40,691 previously analyzed Caucasians, additionally analyzing whole HLA haplotypes. We computationally characterized the binding of associated HLA alleles to human self-peptides and analysed the physico-chemical properties of the HLA proteins an...
Adverse drug reactions (ADRs) are pharmacological events triggered by drug interactions with various sources of origin, including drug-drug interactions. While many computational studies explore models to predict ADRs originating from single drugs, only a few explore models that predict ADRs from drug combinations. Further, as far as we know, none of them has developed models using transcriptomic data, specifically the LINCS L1000 drug-induced gene expression data, to predict ADRs for drug combinations. In this study we use the TWOSIDES database as a source of ADRs originating from two-drug combinations. The 34,549 drug pairs common to these two databases were used to train an artificial neural network (ANN) to predict 243 ADRs that were induced by at least 10% of the drug pairs. Our model predicts the occurrence of these ADRs with an average accuracy of 82% across a multi-fold cross-validation. Source code and input dataset used in this study can be found at:...
Neutrophil dysfunction and GM-CSF auto-antibodies are observed in pediatric and adult patients with Crohn’s disease (CD). We associated damaging coding variants with low GM-CSF-induced STAT5 stimulation index (GMSI) in pediatric CD patients and implicated variation of neutrophil GM-CSF signaling in cell function and disease complications. Because many CD patients with low GMSI do not carry damaging coding mutations, we sought to test the hypothesis that non-coding variants contribute to this phenotype. We enrolled 77 CD and ulcerative colitis (UC) patients (24 low and 53 normal GMSI), performed whole genome sequencing, and measured the GMSI. We identified 4 non-coding variants (rs3808851, rs10974787, rs10974788 and rs10974789) in RCL1 significantly associated with variation of GMSI level (p < 0.011). They were validated in two independent cohorts: an RNAseq dataset (n = 50) and a blood eQTL dataset (n = 31,684). These variants are in LD and affect expression of JAK2 (p 0.005 to 0...
Biliary atresia (BA) is the most common cause of end‐stage liver disease in children and the primary indication for pediatric liver transplantation, yet underlying etiologies remain unknown. Approximately 10% of infants affected by BA exhibit various laterality defects (heterotaxy) including splenic abnormalities and complex cardiac malformations—a distinctive subgroup commonly referred to as the biliary atresia splenic malformation (BASM) syndrome. We hypothesized that genetic factors linking laterality features with the etiopathogenesis of BA in BASM patients could be identified through whole‐exome sequencing (WES) of an affected cohort. DNA specimens from 67 BASM subjects, including 58 patient–parent trios, from the National Institute of Diabetes and Digestive and Kidney Diseases–supported Childhood Liver Disease Research Network (ChiLDReN) underwent WES. Candidate gene variants derived from a prespecified set of 2,016 genes associated with ciliary dysgenesis and/or dysfunction o...
Individuals with monogenic disorders of phagocyte function develop chronic colitis that resembles Crohn's disease (CD). We tested for associations between mutations in genes encoding reduced nicotinamide adenine dinucleotide phosphate (NADPH) oxidases, neutrophil function, and phenotypes of CD in pediatric patients. We performed whole-exome sequence analysis to identify mutations in genes encoding NADPH oxidases (such as CYBA, CYBB, NCF1, NCF2, NCF4, RAC1, and RAC2) using DNA from 543 pediatric patients with inflammatory bowel diseases. Blood samples were collected from an additional 129 pediatric patients with CD and 26 children without IBD (controls); we performed assays for neutrophil activation, reactive oxygen species (ROS) production, and bacteria uptake and killing. Whole-exome sequence analysis was performed using DNA from 46 of the children with CD to examine associations with NADPH gene mutations; RNA sequence analyses were performed using blood cells from 46 children ...
The genetic contributions to pediatric-onset ulcerative colitis (UC), characterized by severe disease and extensive colonic involvement, are largely unknown. In adult-onset UC, genome-wide association studies (GWAS) have identified numerous loci, most of which confer a modest susceptibility risk (OR 0.84-1.14), with the exception of the human leukocyte antigen (HLA) region on chromosome 6 (OR 3.59). To study the genetic contribution to exclusively pediatric-onset UC, a GWAS was performed on 466 cases and 2099 healthy controls using the UK Biobank array. SNP2HLA was used to impute classical HLA alleles and their corresponding amino acids, and the results were compared with adult-onset UC. HLA explained almost the entire association signal, dominated by 191 single nucleotide polymorphisms (SNPs) (p = 5 × 10^-8 to 5 × 10^-10). Although their effects were very small, established SNPs in adult-onset UC loci had similar direction and magnitude in pediatric-onset UC. SNP2HLA imputation identified HLA-DRB1*0103 ...
Although gene expression has been studied in bacteria for decades, many aspects of the bacterial transcriptome remain poorly understood. Transcript structure, operon linkages, and information on absolute abundance all provide valuable insights into gene function and regulation, but none has ever been determined on a genome-wide scale for any bacterium. Indeed, these aspects of the prokaryotic transcriptome have been explored on a large scale in only a few instances, and consequently little is known about the absolute composition of the mRNA population within a bacterial cell. Here we report the use of a high-throughput sequencing-based approach in assembling the first comprehensive, single-nucleotide resolution view of a bacterial transcriptome. We sampled the Bacillus anthracis transcriptome under a variety of growth conditions and showed that the data provide an accurate and high-resolution map of transcript start sites and operon structure throughout the genome. Further, the sequ...
Yam (Dioscorea cayennensis subsp. rotundata (Poir) J. Miege) is an important staple food and source of carbohydrate for people in developing countries, especially in West Africa. Yam production is hampered by the inability to control dormancy adequately. The goal of this study was to investigate the molecular characteristics of dormancy. A time-course protein analysis and RNA fingerprinting were carried out at 0, 2, 4, 7, and 15 days using Sodium Dodecyl Sulfate – Polyacrylamide Gel Electrophoresis (SDS-PAGE) and Amplified Fragment Length Polymorphism (AFLP), respectively. Total RNA was isolated from dormant yam tubers incubated at 32°C ± 2°C and a relative humidity of 50% ± 5%. Complementary deoxyribonucleic acid (cDNA) generated from total RNA was subjected to AFLP techniques to identify differentially expressed genes up- or down-regulated during dormancy. cDNA-AFLP results using different primer combinations revealed an array of transcript derived fragments (TDFs). About 14% of the TD...
Background: In a recent study, we identified 1189 CpG sites whose DNA methylation (DNAm) level in blood distinguished Crohn’s disease (CD) cases from controls. We also demonstrated that the vast majority of these differences were a consequence of disease, rather than a cause of CD. Since methylation can be influenced by both genetic and environmental factors, here we focus on CpGs under demonstrable genetic control (methylation quantitative trait loci, or mQTLs). By comparing mQTL patterns across disease states and tissue (blood vs. ileum), we may distinguish patterns unique to CD. Such DNAm patterns may be relevant for the developmental origins of CD. Methods: We investigated three datasets: (i) 402 blood samples from 164 newly diagnosed pediatric CD patients taken at two time points, and 74 non-IBD controls; (ii) 780 blood samples from a non-CD adult population; and (iii) 40 ileal biopsies (17 CD cases and 23 non-IBD controls) from group (i). Genome-wide DNAm profiling and genotyping wer...
Papers by David Okou