Learning to Schedule Control Fragments for Physics-Based Characters Using Deep Q-Learning

Published: 27 June 2017

Abstract

Given a robust control system, physical simulation offers the potential for interactive human characters that move in realistic and responsive ways. In this article, we describe how to learn a scheduling scheme that reorders short control fragments as necessary at runtime to create a control system that can respond to disturbances and allows steering and other user interactions. These schedulers provide robust control of a wide range of highly dynamic behaviors, including walking on a ball, balancing on a bongo board, skateboarding, running, push-recovery, and breakdancing. We show that moderate-sized Q-networks can model the schedulers for these control tasks effectively and that those schedulers can be efficiently learned by the deep Q-learning algorithm.
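To make the idea concrete, here is a minimal, hypothetical sketch (not the authors' code) of the scheduling step the abstract describes: a small Q-network scores each candidate control fragment given the character's current state, and the scheduler picks the fragment with the highest Q-value. The state dimension, fragment count, network size, and random weights are all illustrative assumptions; a real scheduler would use the trained weights produced by deep Q-learning.

```python
# Hypothetical sketch of a Q-network fragment scheduler.
# STATE_DIM, NUM_FRAGMENTS, HIDDEN, and the random weights are
# illustrative assumptions, not values from the paper.
import numpy as np

rng = np.random.default_rng(0)

STATE_DIM = 8        # assumed: reduced character state (pose + velocity features)
NUM_FRAGMENTS = 4    # assumed: number of short control fragments to choose from
HIDDEN = 16          # a "moderate-sized" network, per the abstract

# Randomly initialized weights stand in for a trained Q-network.
W1 = rng.normal(scale=0.1, size=(STATE_DIM, HIDDEN))
b1 = np.zeros(HIDDEN)
W2 = rng.normal(scale=0.1, size=(HIDDEN, NUM_FRAGMENTS))
b2 = np.zeros(NUM_FRAGMENTS)

def q_values(state):
    """Return one Q-value per control fragment for the given state."""
    h = np.tanh(state @ W1 + b1)
    return h @ W2 + b2

def schedule_fragment(state, epsilon=0.0):
    """Epsilon-greedy scheduling: usually the best-scoring fragment,
    occasionally a random one (exploration during training)."""
    if rng.random() < epsilon:
        return int(rng.integers(NUM_FRAGMENTS))
    return int(np.argmax(q_values(state)))

state = rng.normal(size=STATE_DIM)
print(schedule_fragment(state))  # index of the fragment to run next
```

At runtime the scheduler would be invoked at each fragment boundary, so the character can reorder fragments in response to pushes or steering input rather than replaying a fixed sequence.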

Supplementary Material

  • JPG File (tog-30.jpg)
  • liu (liu.zip): supplemental movie, appendix, image, and software files for "Learning to Schedule Control Fragments for Physics-Based Characters Using Deep Q-Learning"
  • MP4 File (tog-30.mp4)



Published In
ACM Transactions on Graphics, Volume 36, Issue 3 (June 2017), 165 pages
ISSN: 0730-0301
EISSN: 1557-7368
DOI: 10.1145/3087678
Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than the author(s) must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.

Publisher

Association for Computing Machinery

New York, NY, United States

Publication History

Published: 27 June 2017
Accepted: 01 March 2017
Revised: 01 February 2017
Received: 01 September 2016
Published in TOG Volume 36, Issue 3


Author Tags

  1. Human simulation
  2. deep Q-learning
  3. motion control

Qualifiers

  • Research-article
  • Research
  • Refereed

Article Metrics

  • Downloads (last 12 months): 49
  • Downloads (last 6 weeks): 3
Reflects downloads up to 20 Feb 2025.

Cited By
  • (2024) Resolving Collisions in Dense 3D Crowd Animations. ACM Transactions on Graphics 43(5), 1-14. DOI: 10.1145/3687266. Online publication date: 6-Sep-2024.
  • (2024) Strategy and Skill Learning for Physics-based Table Tennis Animation. ACM SIGGRAPH 2024 Conference Papers, 1-11. DOI: 10.1145/3641519.3657437. Online publication date: 13-Jul-2024.
  • (2024) VMP: Versatile Motion Priors for Robustly Tracking Motion on Physical Characters. Proceedings of the ACM SIGGRAPH/Eurographics Symposium on Computer Animation, 1-11. DOI: 10.1111/cgf.15175. Online publication date: 21-Aug-2024.
  • (2024) Evolution-Based Shape and Behavior Co-Design of Virtual Agents. IEEE Transactions on Visualization and Computer Graphics 30(12), 7579-7591. DOI: 10.1109/TVCG.2024.3355745. Online publication date: Dec-2024.
  • (2023) Learning Human Dynamics in Autonomous Driving Scenarios. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 20739-20749. DOI: 10.1109/ICCV51070.2023.01901. Online publication date: 1-Oct-2023.
  • (2023) PhysDiff: Physics-Guided Human Motion Diffusion Model. 2023 IEEE/CVF International Conference on Computer Vision (ICCV), 15964-15975. DOI: 10.1109/ICCV51070.2023.01467. Online publication date: 1-Oct-2023.
  • (2022) Physics-based character controllers using conditional VAEs. ACM Transactions on Graphics 41(4), 1-12. DOI: 10.1145/3528223.3530067. Online publication date: 22-Jul-2022.
  • (2022) A Survey on Reinforcement Learning Methods in Character Animation. Computer Graphics Forum 41(2), 613-639. DOI: 10.1111/cgf.14504. Online publication date: 24-May-2022.
  • (2022) Muscle-driven virtual human motion generation approach based on deep reinforcement learning. Computer Animation and Virtual Worlds 33(3-4). DOI: 10.1002/cav.2092. Online publication date: 17-Jun-2022.
  • (2021) Motion Generation of a Single Rigid Body Character Using Deep Reinforcement Learning. Journal of the Korea Computer Graphics Society 27(3), 13-23. DOI: 10.15701/kcgs.2021.27.3.13. Online publication date: 1-Jul-2021.
