In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods

  • Original Article
Molecular Diversity

Abstract

Chemical-induced hematotoxicity is an important concern in drug discovery because it is often fatal when it occurs. Chemicals that can cause hematotoxicity therefore deserve special attention. In the present study, we focused on the in silico prediction of chemical-induced hematotoxicity with machine learning (ML) and deep learning (DL) methods. We collected a large data set containing 632 hematotoxic chemicals and 1525 approved drugs without hematotoxicity. Computational models were built using several ML and DL algorithms integrated in the Online Chemical Modeling Environment (OCHEM). A consensus model was developed from the three best individual models. It yielded a prediction accuracy of 0.83 and a balanced accuracy of 0.77 on external validation. The consensus model and the best individual model, which was developed with the random forest regression and classification algorithm (RFR) and QNPR descriptors, were made available at https://ochem.eu/article/135149. The relationship between 8 commonly used molecular properties and chemical-induced hematotoxicity was also investigated; several of these properties clearly differentiate hematotoxic from non-hematotoxic chemicals. In addition, 12 structural alerts responsible for chemical hematotoxicity were identified by frequency analysis of substructures from the Klekota–Roth fingerprint. These results should provide meaningful knowledge and useful tools for hematotoxicity evaluation in drug discovery and environmental risk assessment.
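
The abstract reports both accuracy and balanced accuracy for a consensus of three models on a data set in which non-hematotoxic drugs outnumber hematotoxic chemicals. The sketch below illustrates how the two metrics can differ in that situation. It is only an illustration under stated assumptions: the consensus is assumed to be a simple average of the per-model probabilities with a 0.5 threshold (the abstract does not specify the combination rule), and the labels and probabilities are made-up toy values, not data from the study.

```python
from typing import List, Sequence


def consensus_probabilities(model_probs: Sequence[Sequence[float]]) -> List[float]:
    """Average the predicted probabilities of several models, compound by compound.

    Assumption: simple unweighted averaging; the study's actual rule may differ.
    """
    n_models = len(model_probs)
    return [sum(per_compound) / n_models for per_compound in zip(*model_probs)]


def classify(probs: Sequence[float], threshold: float = 0.5) -> List[int]:
    """Label a compound as hematotoxic (1) when its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Fraction of compounds whose predicted label matches the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def balanced_accuracy(y_true: Sequence[int], y_pred: Sequence[int]) -> float:
    """Mean of sensitivity and specificity; requires both classes in y_true."""
    tp = sum(t == 1 and p == 1 for t, p in zip(y_true, y_pred))
    fn = sum(t == 1 and p == 0 for t, p in zip(y_true, y_pred))
    tn = sum(t == 0 and p == 0 for t, p in zip(y_true, y_pred))
    fp = sum(t == 0 and p == 1 for t, p in zip(y_true, y_pred))
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("y_true must contain both classes")
    sensitivity = tp / (tp + fn)
    specificity = tn / (tn + fp)
    return (sensitivity + specificity) / 2


# Toy external set: 3 hematotoxic (1) and 7 non-hematotoxic (0) compounds.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical probabilities from three individual models (illustrative only).
model_a = [0.9, 0.4, 0.3, 0.2, 0.1, 0.6, 0.2, 0.1, 0.3, 0.2]
model_b = [0.8, 0.7, 0.2, 0.1, 0.3, 0.4, 0.1, 0.2, 0.2, 0.1]
model_c = [0.7, 0.6, 0.4, 0.3, 0.2, 0.3, 0.3, 0.1, 0.1, 0.3]

consensus = consensus_probabilities([model_a, model_b, model_c])
y_pred = classify(consensus)

print("accuracy:", round(accuracy(y_true, y_pred), 3))
print("balanced accuracy:", round(balanced_accuracy(y_true, y_pred), 3))
```

In this toy case one of the three hematotoxic compounds is missed, so accuracy stays high because the majority class is predicted well, while balanced accuracy is lower because it weights sensitivity and specificity equally. The same pattern is consistent with the gap between the reported values of 0.83 and 0.77.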

Acknowledgements

This work was supported by the National Natural Science Foundation of China (grant 81803433). The authors gratefully acknowledge the encouragement and support from Miss Chaoyue Yang.

Author information

Corresponding author

Correspondence to Xiao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 989 KB)

About this article

Cite this article

Hua, Y., Shi, Y., Cui, X. et al. In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods. Mol Divers 25, 1585–1596 (2021). https://doi.org/10.1007/s11030-021-10255-x

  • DOI: https://doi.org/10.1007/s11030-021-10255-x
