Analysis of Classification Methods for Gene Expression Data

Lamiaa Zakaria,
Hala M. Ebeid (ORCID: orcid.org/0000-0001-9843-842X),
Sayed Dahshan &
Mohamed F. Tolba (ORCID: orcid.org/0000-0003-3104-6418)

Part of the book series: Advances in Intelligent Systems and Computing (AISC, volume 921)

Included in the following conference series:

International Conference on Advanced Machine Learning Technologies and Applications

Abstract

Characterizing diseases at the molecular level is a major challenge for researchers in bioinformatics and cancer classification, as is identifying the genes that contribute to cancer. Cancer classification based on molecular-level investigation has attracted the interest of researchers because it provides a systematic, accurate and objective diagnosis for different cancer types. This paper presents several classification methods for gene expression data. We compared the efficiency of three classification methods: support vector machines, k-nearest neighbor and random forest. Two publicly available gene expression datasets, the Freije and Phillips datasets, were used in the classification experiments. The results showed that the support vector machine classifier achieved the best performance on both datasets compared with the other classifiers.
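To make the comparison described in the abstract concrete, the following is a minimal sketch of how three such classifiers could be scored side by side. It is not the authors' pipeline: it assumes scikit-learn, uses a synthetic matrix in place of the Freije and Phillips data, and the hyperparameters, the scaling step and the 5-fold cross-validation protocol are illustrative assumptions that the abstract does not specify.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for a gene expression matrix (rows = samples, columns = genes).
# This is NOT the Freije or Phillips data; the shape and class count are assumptions.
X, y = make_classification(
    n_samples=80, n_features=500, n_informative=20,
    n_classes=2, random_state=0,
)

# Hyperparameters are illustrative choices, not the paper's settings.
classifiers = {
    "support vector machine (linear kernel)": make_pipeline(
        StandardScaler(), SVC(kernel="linear", C=1.0)),
    "k-nearest neighbor (k=5)": make_pipeline(
        StandardScaler(), KNeighborsClassifier(n_neighbors=5)),
    "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
}

# Stratified folds keep the class balance similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)


def compare(models, X, y, cv):
    """Return the mean cross-validated accuracy of each model."""
    return {
        name: cross_val_score(model, X, y, cv=cv, scoring="accuracy").mean()
        for name, model in models.items()
    }


if __name__ == "__main__":
    for name, acc in compare(classifiers, X, y, cv).items():
        print(f"{name}: mean accuracy = {acc:.3f}")
```

Wrapping the scaler and the classifier in one pipeline means the scaling statistics are fitted only on the training folds, which avoids leaking information from the held-out samples into the accuracy estimate.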

Author information

Authors and Affiliations

Scientific Computing, Faculty of Computer and Information Sciences, Ain Shams University, Cairo, Egypt
Lamiaa Zakaria, Hala M. Ebeid, Sayed Dahshan & Mohamed F. Tolba

Corresponding author

Correspondence to Hala M. Ebeid.

Editor information

Editors and Affiliations

Faculty of Computers and Information, Cairo University, Giza, Egypt
Aboul Ella Hassanien
Faculty of Computers and Information, Benha University, Benha, Egypt
Ahmad Taher Azar
School of Computing, Science and Engineering, University of Salford, Salford, Greater Manchester, UK
Tarek Gaber
Department of Computer Science and Engineering, School of Computing and IT, Faculty of Engineering, Manipal University Jaipur, Jaipur, Rajasthan, India
Roheet Bhatnagar
Faculty of Computer and Information Science, Ain Shams University, Cairo, Egypt
Mohamed F. Tolba

About this paper

Cite this paper

Zakaria, L., Ebeid, H.M., Dahshan, S., Tolba, M.F. (2020). Analysis of Classification Methods for Gene Expression Data. In: Hassanien, A., Azar, A., Gaber, T., Bhatnagar, R., Tolba, M.F. (eds) The International Conference on Advanced Machine Learning Technologies and Applications (AMLTA2019). AMLTA 2019. Advances in Intelligent Systems and Computing, vol 921. Springer, Cham. https://doi.org/10.1007/978-3-030-14118-9_19

DOI: https://doi.org/10.1007/978-3-030-14118-9_19
Published: 17 March 2019
Publisher Name: Springer, Cham
Print ISBN: 978-3-030-14117-2
Online ISBN: 978-3-030-14118-9
eBook Packages: Intelligent Technologies and Robotics (R0)
