Abstract
Dropout is an Artificial Neural Network (ANN) training technique that has been shown to improve ANN performance across canonical machine learning (ML) datasets. Quantitative Structure-Activity Relationship (QSAR) datasets, used to relate chemical structure to biological activity in Ligand-Based Computer-Aided Drug Discovery, pose unique challenges for ML techniques, such as heavily biased dataset composition and a large number of descriptors relative to the number of actives. To test the hypothesis that dropout also improves QSAR ANNs, we conducted a benchmark on nine large QSAR datasets. Use of dropout improved both the enrichment at a fixed false positive rate and the log-scaled area under the receiver operating characteristic curve (logAUC) by 22–46% over conventional ANN implementations. Optimal dropout rates were found to be a function of the signal-to-noise ratio of the descriptor set and relatively independent of the dataset. Dropout ANNs with 2D and 3D autocorrelation descriptors outperformed both conventional ANNs and optimized fingerprint similarity search methods.
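
Conceptually, dropout randomly zeroes a fraction of hidden-unit activations on each training pass, forcing the network to learn redundant representations rather than co-adapted ones. The sketch below shows inverted dropout applied to one feed-forward hidden layer in Python/NumPy; the layer sizes, tanh activation, and 0.25 dropout rate are illustrative assumptions for the sketch, not the implementation or settings used in this work.

    import numpy as np

    rng = np.random.default_rng(0)

    def dropout(x, rate, training=True):
        """Inverted dropout: zero a fraction `rate` of activations during training
        and rescale survivors by 1/(1 - rate) so expected activations are unchanged."""
        if not training or rate == 0.0:
            return x
        keep = 1.0 - rate
        mask = rng.random(x.shape) < keep
        return x * mask / keep

    def hidden_layer(x, W, b, rate=0.25, training=True):
        # Feed-forward hidden layer with dropout applied to its output.
        a = np.tanh(x @ W + b)
        return dropout(a, rate, training)

    # Example: 32 compounds described by 128 descriptors, 8 hidden neurons
    # (all sizes are hypothetical).
    X = rng.standard_normal((32, 128))
    W = rng.standard_normal((128, 8)) * 0.1
    b = np.zeros(8)
    h_train = hidden_layer(X, W, b, training=True)   # dropout active during training
    h_test  = hidden_layer(X, W, b, training=False)  # dropout disabled at prediction time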






Cite this article
Mendenhall, J., Meiler, J. Improving quantitative structure–activity relationship models using Artificial Neural Networks trained with dropout. J Comput Aided Mol Des 30, 177–189 (2016). https://doi.org/10.1007/s10822-016-9895-2