Abstract
Cancer is one of the major diseases threatening human life. Advances in high-throughput sequencing technology make it possible to diagnose cancer accurately and to study its pathogenesis at the molecular level. In this study, we integrated differentially expressed genes with differential DNA methylation patterns and applied multiple machine learning methods to cancer diagnosis. The experimental results show that diagnostic performance improves significantly when multi-scale gene features from the RNA and epigenetic levels are integrated. The classifier's AUC increased by 7.4% with multi-scale gene features compared with differentially expressed genes alone, which supports the effectiveness of integrating multi-scale gene features for cancer diagnosis.
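The integration idea described in the abstract can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the data are synthetic Gaussian values rather than TCGA expression or methylation measurements, the feature counts and effect size are arbitrary, and a simple nearest-centroid scorer stands in for the (unspecified) machine learning methods used in the paper. The sketch only shows the mechanics of concatenating per-sample expression and methylation feature vectors and comparing AUC against expression features alone.

```python
import random

def auc(scores, labels):
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    # Count positive/negative pairs ranked correctly; ties count as half.
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))

def make_samples(n, n_features, shift, rng):
    """Synthetic rows; label-1 ("tumor") rows have their mean shifted by `shift`."""
    rows, labels = [], []
    for i in range(n):
        y = i % 2  # alternate normal (0) and tumor (1)
        rows.append([rng.gauss(shift * y, 1.0) for _ in range(n_features)])
        labels.append(y)
    return rows, labels

def centroid_scores(train_x, train_y, test_x):
    """Score test rows by projecting onto (tumor centroid - normal centroid)."""
    d = len(train_x[0])

    def centroid(label):
        rows = [x for x, y in zip(train_x, train_y) if y == label]
        return [sum(r[j] for r in rows) / len(rows) for j in range(d)]

    w = [a - b for a, b in zip(centroid(1), centroid(0))]
    return [sum(wj * xj for wj, xj in zip(w, x)) for x in test_x]

rng = random.Random(0)
n = 200
expr, labels = make_samples(n, 10, 0.3, rng)  # stand-in for expression features
meth, _ = make_samples(n, 10, 0.3, rng)       # stand-in for methylation features

# Integration step: concatenate the two feature vectors of each sample.
integrated = [e + m for e, m in zip(expr, meth)]

split = n // 2  # first half for training, second half for testing

def evaluate(features):
    scores = centroid_scores(features[:split], labels[:split], features[split:])
    return auc(scores, labels[split:])

auc_expr = evaluate(expr)
auc_integrated = evaluate(integrated)
print(f"expression only: {auc_expr:.3f}")
print(f"integrated:      {auc_integrated:.3f}")
```

The sketch uses only the standard library so that it runs without dependencies. In practice, a library classifier and a metric such as scikit-learn's `roc_auc_score` would be the more usual choice, and the numbers printed here say nothing about the 7.4% improvement reported in the paper.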
Acknowledgments
This project was sponsored by the Scientific Research Foundation for the Returned Overseas Chinese Scholars, State Education Ministry (No. 48, 2014-1685), and the Key Natural Science Project of Anhui Provincial Education Department (KJ2017A016).
Copyright information
© 2018 Springer Nature Switzerland AG
About this paper
Cite this paper
Hang, P., Shi, M., Long, Q., Li, H., Zhao, H., Ma, M. (2018). Integrating Multi-scale Gene Features for Cancer Diagnosis. In: Zhou, J., et al. Biometric Recognition. CCBR 2018. Lecture Notes in Computer Science, vol 10996. Springer, Cham. https://doi.org/10.1007/978-3-319-97909-0_67
DOI: https://doi.org/10.1007/978-3-319-97909-0_67
Publisher Name: Springer, Cham
Print ISBN: 978-3-319-97908-3
Online ISBN: 978-3-319-97909-0
eBook Packages: Computer Science (R0)