[go: up one dir, main page]

Skip to main content

GAN-Based Data Augmentation for Prediction Improvement Using Gene Expression Data in Cancer

  • Conference paper
  • First Online:
Computational Science – ICCS 2022 (ICCS 2022)

Abstract

Within the area of bioinformatics, Deep Learning (DL) models have shown exceptional results in applications in which histological images, scans and tomographies are used. However, when gene expression data is under analysis, the performance is often limited, further hampered by the complexity of these models that require several instances, in the order of thousands, to provide good results. Due to the difficulty and the costs involved in the collection of medical data, the application of Data Augmentation (DA) techniques to alleviate the lack of samples is a topic of great relevance. State-of-the-art models based on Conditional Generative Adversarial Networks (CGAN) and some introduced modifications are used in this work to investigate the effect of DA for prediction of the vital status of patients from RNA-Seq gene expression data. Experimental results on several real-world data sets demonstrate the effectiveness and efficiency of the proposed models. The application of DA methods significantly increase prediction accuracy, leading by 12% with respect to benchmark data sets and 3.15% with respect to data processed with feature selection. Results based on CGAN models outperform in most cases, alternative methods like the SMOTE or noise injection techniques.

This is a preview of subscription content, log in via an institution to check access.

Access this chapter

Subscribe and save

Springer+ Basic
$34.99 /Month
  • Get 10 units per month
  • Download Article/Chapter or eBook
  • 1 Unit = 1 Article or 1 Chapter
  • Cancel anytime
Subscribe now

Buy Now

Chapter
USD 29.95
Price excludes VAT (USA)
  • Available as PDF
  • Read on any device
  • Instant download
  • Own it forever
eBook
USD 89.00
Price excludes VAT (USA)
  • Available as EPUB and PDF
  • Read on any device
  • Instant download
  • Own it forever
Softcover Book
USD 119.99
Price excludes VAT (USA)
  • Compact, lightweight edition
  • Dispatched in 3 to 5 business days
  • Free shipping worldwide - see info

Tax calculation will be finalised at checkout

Purchases are for personal use only

Institutional subscriptions

Similar content being viewed by others

References

  1. Barile, B., Marzullo, A., Stamile, C., Durand-Dubief, F., Sappey-Marinier, D.: Data augmentation using generative adversarial neural networks on brain structural connectivity in multiple sclerosis. Comput. Methods Programs Biomed. 206, 106113 (2021). https://doi.org/10.1016/j.cmpb.2021.106113

    Article  Google Scholar 

  2. Breiman, L.: Random forests. Mach. Learn. 45(1), 5–32 (2001). https://doi.org/10.1023/a:1010933404324

    Article  MATH  Google Scholar 

  3. Chawla, N.V., Bowyer, K.W., Hall, L.O., Kegelmeyer, W.P.: SMOTE: synthetic minority over-sampling technique. J. Artif. Intell. Res. 16, 321–357 (2002). https://doi.org/10.1613/jair.953

    Article  MATH  Google Scholar 

  4. Cheerla, A., Gevaert, O.: Deep learning with multimodal representation for pancancer prognosis prediction. Bioinformatics 35(14), i446–i454 (2019). https://doi.org/10.1093/bioinformatics/btz342

    Article  Google Scholar 

  5. Douzas, G., Bacao, F.: Effective data generation for imbalanced learning using conditional generative adversarial networks. Expert Syst. Appl. 91, 464–471 (2018). https://doi.org/10.1016/j.eswa.2017.09.030

  6. Frid-Adar, M., Diamant, I., Klang, E., Amitai, M., Goldberger, J., Greenspan, H.: GAN-based synthetic medical image augmentation for increased CNN performance in liver lesion classification. Neurocomputing 321, 321–331 (2018). https://doi.org/10.1016/j.neucom.2018.09.013

  7. Goodfellow, I., Bengio, Y., Courville, A.: Deep Learning. MIT Press, Cambridge (2016)

    MATH  Google Scholar 

  8. Goodfellow, I., et al.: Generative adversarial nets. In: Advances in Neural Information Processing Systems, pp. 2672–2680 (2014)

    Google Scholar 

  9. Han, C., et al.: GAN-based synthetic brain MR image generation. In: 2018 IEEE 15th International Symposium on Biomedical Imaging (ISBI 2018), pp. 734–738. IEEE (2018). https://doi.org/10.1109/isbi.2018.8363678

  10. Heusel, M., Ramsauer, H., Unterthiner, T., Nessler, B., Hochreiter, S.: GANs trained by a two time-scale update rule converge to a local Nash equilibrium. In: Advances in Neural Information Processing Systems, vol. 30 (2017)

    Google Scholar 

  11. Hsu, W.N., Zhang, Y., Glass, J.: Unsupervised domain adaptation for robust speech recognition via variational autoencoder-based data augmentation. In: 2017 IEEE Automatic Speech Recognition and Understanding Workshop (ASRU), pp. 16–23. IEEE, December 2017. https://doi.org/10.1109/asru.2017.8268911

  12. Liu, Y., Zhou, Y., Liu, X., Dong, F., Wang, C., Wang, Z.: Wasserstein GAN-based small-sample augmentation for new-generation artificial intelligence: a case study of cancer-staging data in biology. Engineering 5(1), 156–163 (2019). https://doi.org/10.1016/j.eng.2018.11.018

  13. Maas, A.L., Hannun, A.Y., Ng, A.Y.: Rectifier nonlinearities improve neural network acoustic models. In: International Conference on Machine Learning, vol. 30, p. 3 (2013)

    Google Scholar 

  14. Marouf, M., et al.: Realistic in silico generation and augmentation of single-cell RNA-seq data using generative adversarial networks. Nat. Commun. 11(1), 1–12 (2020). https://doi.org/10.1038/s41467-019-14018-z

    Article  Google Scholar 

  15. Mirza, M., Osindero, S.: Conditional generative adversarial nets (2014)

    Google Scholar 

  16. Moreno-Barea, F.J., Jerez, J.M., Franco, L.: Improving classification accuracy using data augmentation on small data sets. Expert Syst. Appl. 161, 113696 (2020). https://doi.org/10.1016/j.eswa.2020.113696

    Article  Google Scholar 

  17. Moreno-Barea, F.J., Strazzera, F., Jerez, J.M., Urda, D., Franco, L.: Forward noise adjustment scheme for data augmentation. In: IEEE Symposium Series on Computational Intelligence (IEEE SSCI 2018) (2018). https://doi.org/10.1109/ssci.2018.8628917

  18. Piotrowski, A.P., Napiorkowski, J.J.: A comparison of methods to avoid overfitting in neural networks training in the case of catchment runoff modelling. J. Hydrol. 476, 97–111 (2013). https://doi.org/10.1016/j.jhydrol.2012.10.019

    Article  Google Scholar 

  19. Radford, A., Metz, L., Chintala, S.: Unsupervised representation learning with deep convolutional generative adversarial networks (2015)

    Google Scholar 

  20. Reed, R.D., Marks, R.J.: Neural Smithing: Supervised Learning in Feedforward Artificial Neural Networks. MIT Press, Cambridge (1998)

    Google Scholar 

  21. dos Santos Tanaka, F.H.K., Aranha, C.: Data augmentation using GANs. In: Proceedings of Machine Learning Research XXX 1, p. 16 (2019)

    Google Scholar 

  22. Schmidhuber, J.: Deep learning in neural networks: an overview. Neural Netw. 61, 85–117 (2015). https://doi.org/10.1016/j.neunet.2014.09.003

    Article  Google Scholar 

  23. Shao, S., Wang, P., Yan, R.: Generative adversarial networks for data augmentation in machine fault diagnosis. Comput. Ind. 106, 85–93 (2019). https://doi.org/10.1016/j.compindJ.2019.01.001

    Article  Google Scholar 

  24. Srivastava, N., Hinton, G., Krizhevsky, A., Sutskever, I., Salakhutdinov, R.: Dropout: a simple way to prevent neural networks from overfitting. J. Mach. Learn. Res. 15(1), 1929–1958 (2014)

    MathSciNet  MATH  Google Scholar 

  25. Tibshirani, R.: Regression shrinkage and selection via the Lasso. J. R. Stat. Soc. Ser. B (Methodol.) 58(1), 267–288 (1996). https://doi.org/10.1111/j.2517-6161.1996.tb02080.x

  26. Vale-Silva, L.A., Rohr, K.: Long-term cancer survival prediction using multimodal deep learning. Sci. Rep. 11(1), 1–12 (2021). https://doi.org/10.1038/s41598-021-92799-4

    Article  Google Scholar 

  27. Waheed, A., Goyal, M., Gupta, D., Khanna, A., Al-Turjman, F., Pinheiro, P.R.: CovidGAN: data augmentation using auxiliary classifier GAN for improved COVID-19 detection. IEEE Access 8, 91916–91923 (2020). https://doi.org/10.1109/access.2020.2994762

    Article  Google Scholar 

  28. Xu, B., Wang, N., Chen, T., Li, M.: Empirical evaluation of rectified activations in convolutional network (2015)

    Google Scholar 

  29. Zur, R.M., Jiang, Y., Pesce, L., Drukker, K.: Noise injection for training artificial neural networks: a comparison with weight decay and early stopping. Med. Phys. 36(10), 4810–4818 (2009). https://doi.org/10.1118/1.3213517

    Article  Google Scholar 

Download references

Acknowledgements

The authors acknowledge the support from MICINN (Spain) through grant TIN2017-88728-C2-1-R and PID2020-116898RB-I00, from Universidad de Málaga y Junta de Andalucía through grant UMA20-FEDERJA-045, and from Instituto de Investigación Biomédica de Málaga - IBIMA (all including FEDER funds). The results published here are based upon data generated by the TCGA Research Network: https://www.cancer.gov/tcga.

Author information

Authors and Affiliations

Authors

Corresponding author

Correspondence to Francisco J. Moreno-Barea .

Editor information

Editors and Affiliations

Rights and permissions

Reprints and permissions

Copyright information

© 2022 The Author(s), under exclusive license to Springer Nature Switzerland AG

About this paper

Check for updates. Verify currency and authenticity via CrossMark

Cite this paper

Moreno-Barea, F.J., Jerez, J.M., Franco, L. (2022). GAN-Based Data Augmentation for Prediction Improvement Using Gene Expression Data in Cancer. In: Groen, D., de Mulatier, C., Paszynski, M., Krzhizhanovskaya, V.V., Dongarra, J.J., Sloot, P.M.A. (eds) Computational Science – ICCS 2022. ICCS 2022. Lecture Notes in Computer Science, vol 13352. Springer, Cham. https://doi.org/10.1007/978-3-031-08757-8_3

Download citation

  • DOI: https://doi.org/10.1007/978-3-031-08757-8_3

  • Published:

  • Publisher Name: Springer, Cham

  • Print ISBN: 978-3-031-08756-1

  • Online ISBN: 978-3-031-08757-8

  • eBook Packages: Computer ScienceComputer Science (R0)

Publish with us

Policies and ethics