Abstract
When reinforcement learning is applied in a multi-agent environment, a credit-assignment problem arises, because it is often difficult to determine which agents actually contributed to a reward. If we reward all agents whenever a group of cooperating agents earns a reward, agents that did not contribute will also reinforce their policies. If, on the other hand, we reward only the obvious contributors, indirect contributions go unreinforced. As a first step toward easing this dilemma, we propose a classification of rewards and investigate its characteristics. We use a positioning task on the Soccer Server for our experiments. The empirical results show that direct reward takes effect faster and helps agents acquire individuality; in contrast, indirect reward takes effect more slowly, but agents tend to form a group and discover another effective positioning.
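The two reward schemes contrasted in the abstract can be sketched in code. The following is a minimal illustration, not the paper's actual formulation: the function names, the equal-sharing rule for indirect reward, and the tabular Q-learning update are all assumptions made for clarity.

```python
# Hypothetical sketch of "direct" vs. "indirect" credit assignment when a
# team of agents earns a shared reward. The equal-split rule for indirect
# reward is an illustrative assumption, not the paper's exact scheme.

def assign_rewards(team_reward, contributor, agents, scheme):
    """Split a team reward among agents under the given scheme."""
    if scheme == "direct":
        # Only the obvious contributor is rewarded; teammates get nothing,
        # so indirect contributions are never reinforced.
        return {a: (team_reward if a == contributor else 0.0) for a in agents}
    if scheme == "indirect":
        # Every agent shares the reward equally, including agents that
        # did not contribute, so non-contributors are also reinforced.
        share = team_reward / len(agents)
        return {a: share for a in agents}
    raise ValueError(f"unknown scheme: {scheme}")

def q_update(q, key, reward, next_max, alpha=0.1, gamma=0.9):
    """One-step tabular Q-learning update for a single (agent, state, action) key."""
    old = q.get(key, 0.0)
    q[key] = old + alpha * (reward + gamma * next_max - old)

# Example: agent "A" scores, but "B" and "C" created the opening.
agents = ["A", "B", "C"]
for scheme in ("direct", "indirect"):
    q = {}
    credits = assign_rewards(1.0, "A", agents, scheme)
    for agent, r in credits.items():
        q_update(q, (agent, "s0", "act"), r, next_max=0.0)
    print(scheme, q)
```

Under the direct scheme only agent A's Q-value moves; under the indirect scheme all three Q-values move by a smaller amount, which mirrors the trade-off the experiments examine.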
Copyright information
© 2003 Springer-Verlag Berlin Heidelberg
Ohta, M. (2003). Direct Reward and Indirect Reward in Multi-agent Reinforcement Learning. In: Kaminka, G.A., Lima, P.U., Rojas, R. (eds) RoboCup 2002: Robot Soccer World Cup VI. RoboCup 2002. Lecture Notes in Computer Science(), vol 2752. Springer, Berlin, Heidelberg. https://doi.org/10.1007/978-3-540-45135-8_31
Print ISBN: 978-3-540-40666-2
Online ISBN: 978-3-540-45135-8