In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods

Yuqing Hua^1,2,
Yinping Shi^2,
Xueyan Cui^2 &
…
Xiao Li^2,3 (ORCID: orcid.org/0000-0002-1148-9898)

Abstract

Chemical-induced hematotoxicity is an important concern in drug discovery because it can often be fatal. Chemicals that can cause hematotoxicity therefore deserve special attention. In the present study, we focused on in silico prediction of chemical-induced hematotoxicity with machine learning (ML) and deep learning (DL) methods. We collected a large data set containing 632 hematotoxic chemicals and 1525 approved drugs without hematotoxicity. Computational models were built using several machine learning and deep learning algorithms integrated in the Online Chemical Modeling Environment (OCHEM). A consensus model was developed from the three best individual models. It yielded a prediction accuracy of 0.83 and a balanced accuracy of 0.77 on external validation. The consensus model and the best individual model, developed with the random forest regression and classification algorithm (RFR) and QNPR descriptors, were made available at https://ochem.eu/article/135149. The relationship between 8 commonly used molecular properties and chemical-induced hematotoxicity was also investigated. Several of these properties clearly discriminated between hematotoxic and non-hematotoxic chemicals. In addition, 12 structural alerts responsible for chemical hematotoxicity were identified by frequency analysis of substructures from the Klekota–Roth fingerprint. These results should provide meaningful knowledge and useful tools for hematotoxicity evaluation in drug discovery and environmental risk assessment.
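
The abstract reports accuracy and balanced accuracy for a consensus of three models but does not say how the consensus is formed. The following is a minimal sketch, assuming the consensus is a plain average of the probabilities predicted by the individual models; this is an assumption and may differ from the procedure used on OCHEM. All names, probabilities, and labels are hypothetical toy values, not data from the study. Balanced accuracy is computed as the mean of sensitivity and specificity, which makes it less flattering than plain accuracy when one class dominates, as it does in a data set of 632 toxic and 1525 non-toxic compounds.

```python
from statistics import mean


def consensus_probability(model_probs):
    """Average the per-compound probabilities of several models.

    model_probs is a list of equally long probability lists, one per model.
    Simple averaging is an assumption made for this sketch.
    """
    return [mean(per_compound) for per_compound in zip(*model_probs)]


def classify(probs, threshold=0.5):
    """Label a compound as hematotoxic (1) if its probability reaches the threshold."""
    return [1 if p >= threshold else 0 for p in probs]


def evaluate(y_true, y_pred):
    """Return accuracy, sensitivity, specificity and balanced accuracy."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # fraction of toxic compounds recovered
    specificity = tn / (tn + fp)  # fraction of non-toxic compounds recovered
    return {
        "accuracy": (tp + tn) / len(y_true),
        "sensitivity": sensitivity,
        "specificity": specificity,
        "balanced_accuracy": (sensitivity + specificity) / 2,
    }


# Hypothetical toy data: 3 hematotoxic (1) and 7 non-hematotoxic (0) compounds.
y_true = [1, 1, 1, 0, 0, 0, 0, 0, 0, 0]

# Hypothetical predicted probabilities from three individual models.
model_a = [0.9, 0.6, 0.3, 0.2, 0.1, 0.4, 0.7, 0.2, 0.3, 0.1]
model_b = [0.8, 0.7, 0.4, 0.3, 0.2, 0.3, 0.6, 0.1, 0.2, 0.2]
model_c = [0.7, 0.5, 0.2, 0.1, 0.3, 0.5, 0.8, 0.3, 0.1, 0.3]

consensus = consensus_probability([model_a, model_b, model_c])
y_pred = classify(consensus)
metrics = evaluate(y_true, y_pred)

for name, value in metrics.items():
    print(f"{name}: {value:.2f}")
```

On this toy set the consensus misses one toxic compound and flags one non-toxic compound, so accuracy is higher than balanced accuracy. The abstract's reported values (0.83 versus 0.77) show a gap in the same direction.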

Acknowledgements

This work was supported by the National Natural Science Foundation of China (grant 81803433). The authors gratefully acknowledge the encouragement and support from Miss Chaoyue Yang.

Author information

Authors and Affiliations

School of Pharmacy, Shandong First Medical University, Taian, 271000, China
Yuqing Hua
Department of Clinical Pharmacy, The First Affiliated Hospital of Shandong First Medical University & Shandong Provincial Qianfoshan Hospital, Jinan, 250014, China
Yuqing Hua, Yinping Shi, Xueyan Cui & Xiao Li
Department of Clinical Pharmacy, Shandong Provincial Qianfoshan Hospital, Shandong University, Jinan, 250014, China
Xiao Li

Corresponding author

Correspondence to Xiao Li.

Ethics declarations

Conflict of interest

The authors declare that they have no conflict of interest.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Below is the link to the electronic supplementary material.

Supplementary file1 (PDF 989 KB)

About this article

Cite this article

Hua, Y., Shi, Y., Cui, X. et al. In silico prediction of chemical-induced hematotoxicity with machine learning and deep learning methods. Mol Divers 25, 1585–1596 (2021). https://doi.org/10.1007/s11030-021-10255-x

Received: 13 March 2021
Accepted: 14 June 2021
Published: 01 July 2021
Issue Date: August 2021
DOI: https://doi.org/10.1007/s11030-021-10255-x
