Abstract
RRL is a relational reinforcement learning system based on Q-learning in relational state-action spaces. It aims to enable agents to learn how to act in an environment that has no natural representation as a tuple of constants. For relational reinforcement learning, the learning algorithm used to approximate the mapping between state-action pairs and their so-called Q(uality)-values must be both highly reliable and able to handle the relational representation of state-action pairs. In this paper we investigate the use of Gaussian processes to approximate the Q-values of state-action pairs. In order to employ Gaussian processes in a relational setting, we propose graph kernels as a covariance function between state-action pairs. The standard prediction mechanism for Gaussian processes requires a matrix inversion, which can become numerically unstable when the kernel matrix has low rank. These instabilities can be avoided by employing QR-factorization, which leads to better and more stable performance of the algorithm and to a more efficient incremental update mechanism. Experiments conducted in the blocks world and with the Tetris game show that Gaussian processes with graph kernels can compete with, and often improve on, regression trees and instance-based regression as a generalization algorithm for RRL.
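To make the graph-kernel idea concrete, the following is a minimal sketch of a truncated geometric walk kernel computed on the direct product of two unlabeled graphs, in the spirit of the walk-counting kernels the paper builds on. The function name `product_walk_kernel`, the decay `beta`, the truncation depth `n_steps`, and the restriction to unlabeled graphs are illustrative assumptions; the kernel actually used on the graph encodings of state-action pairs also accounts for vertex and edge labels.

```python
import numpy as np

def product_walk_kernel(A1, A2, beta=0.1, n_steps=8):
    """Truncated geometric walk kernel between two unlabeled graphs.

    A1, A2  : 0/1 adjacency matrices of the two graphs.
    beta    : decay weight for longer walks (illustrative value).
    n_steps : truncation depth of the geometric series (illustrative).

    Length-n walks in the direct product graph correspond to pairs of
    length-n walks, one in each factor graph, so summing the entries of
    I + (beta*Ax) + (beta*Ax)^2 + ... counts common walks of the graphs.
    """
    Ax = np.kron(A1, A2)            # adjacency matrix of the direct product graph
    term = np.eye(Ax.shape[0])      # (beta * Ax)^0
    total = np.zeros_like(term)
    for _ in range(n_steps + 1):
        total += term
        term = beta * (term @ Ax)
    return float(total.sum())       # sum over all start/end vertex pairs

# e.g. kernel value between a path on 3 vertices and a triangle
P2 = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]])
C3 = np.array([[0, 1, 1], [1, 0, 1], [1, 1, 0]])
print(product_walk_kernel(P2, C3))
```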
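The stability point can likewise be sketched in a few lines: rather than inverting the kernel matrix explicitly, factor it as QR and solve the resulting triangular system by back-substitution. This is only an illustration of the idea, not the paper's implementation; the `noise` regularization term and the function name are assumptions added here, and the paper's incremental update of the factorization is omitted.

```python
import numpy as np

def gp_q_mean(K, k_star, y, noise=1e-6):
    """GP mean prediction of a Q-value via QR factorization.

    K      : (n, n) kernel matrix over stored state-action pairs,
             e.g. computed with a graph kernel on their graph encodings
    k_star : (n,) kernel values between the query pair and stored pairs
    y      : (n,) observed Q-values
    noise  : assumed observation-noise variance; also regularizes a
             (near-)rank-deficient K

    The GP mean is k_*^T (K + noise*I)^{-1} y.  Instead of forming the
    inverse, factor A = Q R and solve R * alpha = Q^T y.
    """
    A = K + noise * np.eye(K.shape[0])
    Q, R = np.linalg.qr(A)               # A = Q R, Q orthogonal, R upper triangular
    alpha = np.linalg.solve(R, Q.T @ y)  # triangular solve; no explicit inverse
    return float(k_star @ alpha)
```

As a design note, a QR factorization costs somewhat more than a Cholesky decomposition, but it degrades gracefully when many state-action pairs are (nearly) kernel-equivalent and the kernel matrix is therefore close to singular; appending a new training example can then be handled by updating the existing factors rather than refactoring from scratch, which is consistent with the incremental update mechanism the abstract refers to.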
Cite this article
Driessens, K., Ramon, J. & Gärtner, T. Graph kernels and Gaussian processes for relational reinforcement learning. Mach Learn 64, 91–119 (2006). https://doi.org/10.1007/s10994-006-8258-y