Prediction of Molecular Bioactivity for Drug Design Using a Decision Tree Algorithm

Sanghoon Lee, Jihoon Yang & Kyung-whan Oh

Part of the book series: Lecture Notes in Computer Science (LNAI, volume 2843)

Included in the following conference series:

International Conference on Discovery Science

Abstract

A machine learning-based approach to the prediction of molecular bioactivity in new drugs is proposed. Two aspects of the task are addressed: feature subset selection and cost-sensitive classification. Both are needed to cope with the very large number of features and the unbalanced class distribution in a dataset of drug candidates. We designed a pattern classifier with these capabilities based on information theory and re-sampling techniques. Experimental results demonstrate the feasibility of the proposed approach. In particular, its classification accuracy was higher than that of the winner of the KDD Cup 2001 competition.
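
The abstract names two ingredients, an information-theoretic criterion for feature subset selection and re-sampling for unbalanced classes, without giving details. The sketch below illustrates one common reading of each, not the authors' actual procedure: features are ranked by information gain (the mutual information between a discrete feature and the class label), and the minority class is randomly oversampled until the classes are balanced. The function names, the toy data, and the choice of oversampling rather than undersampling are all assumptions made for illustration.

```python
import math
import random
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a sequence of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())


def information_gain(feature, labels):
    """Mutual information between one discrete feature column and the labels."""
    n = len(labels)
    conditional = 0.0
    for value in set(feature):
        subset = [y for x, y in zip(feature, labels) if x == value]
        conditional += len(subset) / n * entropy(subset)
    return entropy(labels) - conditional


def select_top_features(rows, labels, k):
    """Return the indices of the k features with the highest information gain."""
    n_features = len(rows[0])
    gains = [
        (information_gain([row[j] for row in rows], labels), j)
        for j in range(n_features)
    ]
    # Sort by descending gain; ties are broken by the lower feature index.
    gains.sort(key=lambda g: (-g[0], g[1]))
    return [j for _, j in gains[:k]]


def oversample_minority(rows, labels, seed=0):
    """Duplicate random rows of smaller classes until every class is equally frequent."""
    rng = random.Random(seed)
    counts = Counter(labels)
    target = max(counts.values())
    out_rows, out_labels = list(rows), list(labels)
    for cls, count in counts.items():
        pool = [row for row, y in zip(rows, labels) if y == cls]
        for _ in range(target - count):
            out_rows.append(rng.choice(pool))
            out_labels.append(cls)
    return out_rows, out_labels


# Hypothetical toy data: three binary features; label 1 = active, 0 = inactive.
rows = [
    [1, 0, 1],
    [1, 1, 0],
    [0, 0, 1],
    [0, 1, 0],
    [0, 0, 0],
    [0, 1, 1],
]
labels = [1, 1, 0, 0, 0, 0]

selected = select_top_features(rows, labels, k=1)
balanced_rows, balanced_labels = oversample_minority(rows, labels)
```

In this toy set only the first feature carries information about the label, so it is the one selected, and oversampling raises the two active examples to four. In a full pipeline the reduced, balanced data would then be passed to a decision tree learner; oversampling should be applied to training folds only, so that duplicated rows do not leak into the evaluation data.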

Author information

Authors and Affiliations

Department of Computer Science, Sogang University, 1 Shinsoo-Dong Mapo-Ku, Seoul, 121-742, Korea
Sanghoon Lee, Jihoon Yang & Kyung-whan Oh

Editor information

Editors and Affiliations

FG Knowledge Engineering, FB Informatik, Technical University Darmstadt, Hochschulstr. 10, 64289, Darmstadt
Gunter Grieser
Meme Media Laboratory, Hokkaido University, N13 W8, 0608628, Sapporo, Japan
Yuzuru Tanaka
Graduate School of Informatics, Kyoto University Yoshida Honmachi, Sakyo-ku, 606-850, Kyoto, Japan
Akihiro Yamamoto

Cite this paper

Lee, S., Yang, J., Oh, Kw. (2003). Prediction of Molecular Bioactivity for Drug Design Using a Decision Tree Algorithm. In: Grieser, G., Tanaka, Y., Yamamoto, A. (eds) Discovery Science. DS 2003. Lecture Notes in Computer Science, vol 2843. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-39644-4_32

DOI: https://doi.org/10.1007/978-3-540-39644-4_32
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-20293-6
Online ISBN: 978-3-540-39644-4
eBook Packages: Springer Book Archive
