Abstract
We present a novel application of inductive logic programming (ILP) to the problem of diterpene structure elucidation from 13C NMR spectra. Diterpenes are organic compounds of low molecular weight that are based on a skeleton of 20 carbon atoms. They are of significant chemical and commercial interest because of their use as lead compounds in the search for new pharmaceutical effectors. The structure elucidation of diterpenes from 13C NMR spectra is usually done manually by human experts with specialized background knowledge of peak patterns and chemical structures. In the process, each of the 20 skeletal atoms is assigned an atom number that corresponds to its proper place in the skeleton, and the diterpene is classified into one of the possible skeleton types. We address the problem of learning classification rules from a database of peak patterns for diterpenes with known structure. Recently, propositional learning was successfully applied to learn classification rules from spectra with assigned atom numbers. Because the assignment of atom numbers is a difficult process in itself (and possibly indistinguishable from the classification process), we apply ILP, i.e., relational learning, to the problem of classifying spectra without assigned atom numbers.
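The difference between the two settings named in the abstract can be illustrated with a minimal Python sketch. It is not the representation used in the paper: the `Peak` fields (a multiplicity code and a chemical shift in ppm), the function names, and all numeric values are assumptions made for illustration. With assigned atom numbers, a spectrum fits a fixed-length attribute record that a propositional learner can use directly. Without them, a spectrum is only an unordered collection of peak facts, which is the kind of input a relational learner works from.

```python
from collections import Counter
from typing import NamedTuple


class Peak(NamedTuple):
    multiplicity: str  # assumed encoding: "s", "d", "t" or "q"
    shift_ppm: float   # chemical shift; the values used below are invented


def propositional_features(assigned):
    """Fixed-length record: one pair of attributes per assigned atom number."""
    features = {}
    for atom_number in sorted(assigned):
        peak = assigned[atom_number]
        features[f"mult_{atom_number}"] = peak.multiplicity
        features[f"shift_{atom_number}"] = peak.shift_ppm
    return features


def relational_facts(molecule_id, peaks):
    """Unordered peak facts for one molecule; no atom numbers involved."""
    return {(molecule_id, p.multiplicity, p.shift_ppm) for p in peaks}


def multiplicity_counts(peaks):
    """An order-independent summary available without atom numbers."""
    return Counter(p.multiplicity for p in peaks)


# Toy spectrum with three peaks (a real diterpene skeleton has 20 carbons).
toy_peaks = [Peak("q", 21.4), Peak("t", 38.9), Peak("s", 147.2)]

# With atom numbers (a hypothetical assignment), the spectrum is a flat record.
record = propositional_features(
    {1: toy_peaks[1], 2: toy_peaks[0], 3: toy_peaks[2]}
)

# Without atom numbers, the spectrum is a set of facts, so peak order is irrelevant.
facts = relational_facts("m1", toy_peaks)
same_facts = relational_facts("m1", list(reversed(toy_peaks)))

print(record)
print(facts == same_facts)
print(multiplicity_counts(toy_peaks))
```

The flat record only exists once someone has decided which peak belongs to which atom number, which is the step the abstract describes as difficult in itself. The set of facts and the derived counts need no such decision.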
© 1997 Springer-Verlag Berlin Heidelberg
Cite this paper
Džeroski, S., Schulze-Kremer, S., Heidtke, K.R., Siems, K., Wettschereck, D. (1997). Applying ILP to diterpene structure elucidation from 13C NMR spectra. In: Muggleton, S. (ed.) Inductive Logic Programming. ILP 1996. Lecture Notes in Computer Science, vol 1314. Springer, Berlin, Heidelberg. https://doi.org/10.1007/3-540-63494-0_47
DOI: https://doi.org/10.1007/3-540-63494-0_47
Publisher Name: Springer, Berlin, Heidelberg
Print ISBN: 978-3-540-63494-2
Online ISBN: 978-3-540-69583-7
eBook Packages: Springer Book Archive