Abstract
This paper presents a study of two deep reinforcement learning techniques for the navigation of mobile robots: the Soft Actor-Critic (SAC) algorithm, which is compared with the Deep Deterministic Policy Gradient (DDPG) algorithm under the same conditions. To make a robot reach a target in an environment, both networks receive as inputs 10 laser range findings, the previous linear and angular velocities, and the relative position and angle of the mobile robot with respect to the target. As outputs, the networks produce the linear and angular velocities of the mobile robot. The reward function was designed to give the agent a positive reward only when it reaches the target and a negative reward only when it collides with an object. The proposed architecture was applied successfully in two simulated environments, and a comparison between the two techniques was made based on the results obtained, demonstrating that the SAC algorithm achieves superior performance for the navigation of mobile robots compared to the DDPG algorithm (code available at https://github.com/dranaju/project).
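To make the input layout and the sparse reward concrete, the sketch below shows one way they could be assembled in Python with NumPy. It is an illustrative sketch, not the authors' implementation: the function names, the goal radius, the collision distance, and the reward magnitudes are assumptions; only the 14-value input layout (10 laser readings, the 2 previous velocities, and the distance and angle to the target) and the sparse positive/negative reward structure come from the abstract.

```python
# Minimal sketch of the state vector and sparse reward described in the
# abstract. GOAL_RADIUS, COLLISION_DIST, and the reward magnitudes are
# illustrative assumptions, not values taken from the paper.
import numpy as np

GOAL_RADIUS = 0.2      # assumed: robot is "at the target" within 0.2 m
COLLISION_DIST = 0.15  # assumed: a laser reading below 0.15 m is a collision

def build_state(laser_scan, prev_v, prev_w, dist_to_goal, angle_to_goal):
    """Concatenate the 14 network inputs used in the paper: 10 laser range
    findings, the previous linear and angular velocities, and the distance
    and angle of the robot relative to the target."""
    assert len(laser_scan) == 10
    return np.concatenate(
        [laser_scan, [prev_v, prev_w, dist_to_goal, angle_to_goal]])

def reward(laser_scan, dist_to_goal, r_goal=100.0, r_collision=-10.0):
    """Sparse reward: positive only on reaching the target, negative only
    on collision, zero otherwise (magnitudes are assumed)."""
    if dist_to_goal < GOAL_RADIUS:
        return r_goal
    if min(laser_scan) < COLLISION_DIST:
        return r_collision
    return 0.0
```

Both SAC and DDPG networks then map this 14-dimensional state to the two continuous actions, the linear and angular velocities of the robot.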
Materials Availability
Available on GitHub at https://github.com/dranaju/project.
References
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.A.: Playing Atari with deep reinforcement learning, [Online]. Available: arXiv:1312.5602 (2013)
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1511.05952 (2016)
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1509.02971 (2016)
Schulman, J., Moritz, P., Levine, S., Jordan, M.I., Abbeel, P.: High-dimensional continuous control using generalized advantage estimation. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1506.02438 (2016)
Gu, S., Holly, E., Lillicrap, T., Levine, S.: Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp 3389–3396. IEEE (2017)
Mahmood, A.R., Korenkevych, D., Vasan, G., Ma, W., Bergstra, J.: Benchmarking reinforcement learning algorithms on real-world robots, CoRR, vol. abs/1809.07731. [Online]. Available: arXiv:1809.07731 (2018)
Gu, S., Lillicrap, T., Sutskever, I., Levine, S.: Continuous deep Q-learning with model-based acceleration. In: International Conference on Machine Learning, pp 2829–2838 (2016)
Zhu, Y., Mottaghi, R., Kolve, E., Lim, J.J., Gupta, A., Fei-Fei, L., Farhadi, A.: Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp 3357–3364. IEEE (2017)
Tai, L., Liu, M.: Towards cognitive exploration through deep reinforcement learning for mobile robots, CoRR, vol. abs/1610.01733. [Online]. Available: arXiv:1610.01733 (2016)
Tai, L., Paolo, G., Liu, M.: Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 31–36. IEEE (2017)
Chen, Y.F., Everett, M., Liu, M., How, J.P.: Socially aware motion planning with deep reinforcement learning. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 1343–1350. IEEE (2017)
Jesus, J.C., Bottega, J.A., Cuadros, M.A., Gamarra, D.F.: Deep deterministic policy gradient for navigation of mobile robots in simulated environments. In: 2019 19th International Conference on Advanced Robotics (ICAR), pp 362–367. IEEE (2019)
Pajaziti, A., Avdullahu, P.: SLAM: map building and navigation via ROS. Int. J. Intell. Syst. Appl. Eng. 2(4), 71–75 (2014)
Omara, H.I.M.A., Sahari, K.S.M.: Indoor mapping using Kinect and ROS. In: 2015 International Symposium on Agents, Multi-Agent Systems and Robotics (ISAMSR), pp 110–116. IEEE (2015)
François-Lavet, V., Henderson, P., Islam, R., Bellemare, M.G., Pineau, J.: An introduction to deep reinforcement learning. Found. Trends Mach. Learn. 11(3-4), 219–354 (2018). [Online]. Available: https://doi.org/10.1561/2200000071
Hausknecht, M., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. In: 2015 AAAI Fall Symposium Series (2015)
Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor, [Online]. Available: arXiv:1801.01290 (2018)
Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., Levine, S.: Soft actor-critic algorithms and applications, [Online]. Available: arXiv:1812.05905 (2018)
Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32(11), 1238–1274 (2013)
Zhelo, O., Zhang, J., Tai, L., Liu, M., Burgard, W.: Curiosity-driven exploration for mapless navigation with deep reinforcement learning, [Online]. Available: arXiv:1804.00456 (2018)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Uhlenbeck, G.E., Ornstein, L.S.: On the theory of the Brownian motion. Phys. Rev. 36(5), 823 (1930)
Plappert, M., Houthooft, R., Dhariwal, P., Sidor, S., Chen, R.Y., Chen, X., Asfour, T., Abbeel, P., Andrychowicz, M.: Parameter space noise for exploration, [Online]. Available: arXiv:1706.01905 (2017)
Achiam, J.: Spinning up in deep reinforcement learning, GitHub repository. [Online]. Available: https://github.com/openai/spinningup (2018)
Polyak, B.T., Juditsky, A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
Pfitscher, M., Welfer, D., do Nascimento, E.J., Cuadros, M.A.d.S.L., Gamarra, D.F.T.: Users activity gesture recognition on Kinect sensor using convolutional neural networks and FastDTW for controlling movements of a mobile robot. Intel. Artif. 22(63), 121–134 (2019)
da Silva, R.M., de Souza Leite Cuadros, M.A., Gamarra, D.F.T.: Comparison of a backstepping and a fuzzy controller for tracking a trajectory with a mobile robot. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds.) Intelligent Systems Design and Applications, pp 212–221. Cham, Springer International Publishing (2020)
Subramanian, V.: Deep Learning with PyTorch: a Practical Approach to Building Neural Network Models Using PyTorch. Packt Publishing Ltd (2018)
Wu, C.-J., Brooks, D., Chen, K., Chen, D., Choudhury, S., Dukhan, M., Hazelwood, K., Isaac, E., Jia, Y., Jia, B., et al.: Machine learning at facebook: understanding inference at the edge. In: 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp 331–344. IEEE (2019)
Laskin, M., Srinivas, A., Abbeel, P.: CURL: contrastive unsupervised representations for reinforcement learning. In: International Conference on Machine Learning, pp 5639–5650. PMLR (2020)
Dai, Z., Yang, Z., Yang, Y., Carbonell, J.G., Le, Q.V., Salakhutdinov, R.: Transformer-XL: attentive language models beyond a fixed-length context, [Online]. Available: arXiv:1901.02860 (2019)
Rao, D., McMahan, B.: Natural Language Processing with Pytorch: Build Intelligent Language Applications Using Deep Learning. O’Reilly Media, Inc (2019)
Pyo, Y., Cho, H., Jung, R., Lim, T.: ROS Robot Programming. ROBOTIS Co., Seoul (2015)
Joseph, L.: Mastering ROS for robotics programming. Packt Publishing Ltd (2015)
de Assis Brasil, P.M., Pereira, F.U., de Souza Leite Cuadros, M.A., Cukla, A.R., Tello Gamarra, D.F.: A study on global path planners algorithms for the simulated TurtleBot 3 robot in ROS. In: 2020 Latin American Robotics Symposium (LARS), 2020 Brazilian Symposium on Robotics (SBR) and 2020 Workshop on Robotics in Education (WRE), pp 1–6 (2020)
Fairchild, C., Harman, T.L.: ROS Robotics by Example. Packt Publishing Ltd (2016)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Acknowledgements
We would like to thank Fabio Ugalde Pereira for sharing the idea and the environments with symmetric and asymmetric map formats, and all participants of VersusAI for exchanging ideas and thoughts in the areas of Robotics and Artificial Intelligence.
Author information
Contributions
- Junior Costa de Jesus conceived the research, wrote the article, designed and programmed the experiments, and collected and processed the test data.
- Ricardo Bedin Grando wrote the article and collected and processed the test data.
- Victor Augusto Kich wrote the article, programmed the experiments, and collected and processed the test data.
- Alisson Henrique Kolling wrote the article, programmed the experiments, and collected and processed the test data.
- Marco Antonio de Souza Leite Cuadros contributed to the discussion and conception of the main ideas of the article and provided valuable comments.
- Daniel Fernando Tello Gamarra conceived the research, wrote the article, and participated in the discussion of the main ideas of the article.
Ethics declarations
Ethical Approval
The article has the approval of all the authors.
Consent to Participate
All the authors gave their consent to participate in this article.
Consent for Publication
The authors gave their consent for the publication of this article.
Competing interests
The authors declare that there are no conflicts of interest or competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
de Jesus, J., Kich, V.A., Kolling, A.H. et al. Soft Actor-Critic for Navigation of Mobile Robots. J Intell Robot Syst 102, 31 (2021). https://doi.org/10.1007/s10846-021-01367-5