Abstract
This paper presents a study of two deep reinforcement learning techniques for the navigation of mobile robots: the Soft Actor-Critic (SAC) algorithm, which is compared with the Deep Deterministic Policy Gradient (DDPG) algorithm under the same conditions. To make a robot reach a target in an environment, both networks receive as inputs 10 laser range findings, the previous linear and angular velocities, and the relative position and angle of the mobile robot with respect to the target. As outputs, the networks produce the linear and angular velocities of the mobile robot. The reward function was designed to give the agent a positive reward only when it reaches the target and a negative reward only when it collides with an object. The proposed architecture was applied successfully in two simulated environments, and a comparison between the two techniques was made based on the results obtained, demonstrating that the SAC algorithm achieves superior performance for the navigation of mobile robots compared to the DDPG algorithm (code available at https://github.com/dranaju/project).
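To make the input layout and the sparse reward concrete, the sketch below shows one way they could be assembled in Python with NumPy. It is an illustrative sketch, not the authors' implementation: the function names, the goal radius, the collision distance, and the reward magnitudes are assumptions; only the 14-value input layout (10 laser readings, the 2 previous velocities, and the distance and angle to the target) and the sparse positive/negative reward structure come from the abstract.

```python
# Minimal sketch of the state vector and sparse reward described in the
# abstract. GOAL_RADIUS, COLLISION_DIST, and the reward magnitudes are
# illustrative assumptions, not values taken from the paper.
import numpy as np

GOAL_RADIUS = 0.2      # assumed: robot is "at the target" within 0.2 m
COLLISION_DIST = 0.15  # assumed: a laser reading below 0.15 m is a collision

def build_state(laser_scan, prev_v, prev_w, dist_to_goal, angle_to_goal):
    """Concatenate the 14 network inputs used in the paper: 10 laser range
    findings, the previous linear and angular velocities, and the distance
    and angle of the robot relative to the target."""
    assert len(laser_scan) == 10
    return np.concatenate(
        [laser_scan, [prev_v, prev_w, dist_to_goal, angle_to_goal]])

def reward(laser_scan, dist_to_goal, r_goal=100.0, r_collision=-10.0):
    """Sparse reward: positive only on reaching the target, negative only
    on collision, zero otherwise (magnitudes are assumed)."""
    if dist_to_goal < GOAL_RADIUS:
        return r_goal
    if min(laser_scan) < COLLISION_DIST:
        return r_collision
    return 0.0
```

Both SAC and DDPG networks then map this 14-dimensional state to the two continuous actions, the linear and angular velocities of the robot.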
Materials Availability
Available on GitHub at https://github.com/dranaju/project.
References
Mnih, V., Kavukcuoglu, K., Silver, D., Graves, A., Antonoglou, I., Wierstra, D., Riedmiller, M.A.: Playing Atari with deep reinforcement learning, [Online]. Available: arXiv:1312.5602 (2013)
Schaul, T., Quan, J., Antonoglou, I., Silver, D.: Prioritized experience replay. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1511.05952 (2016)
Lillicrap, T.P., Hunt, J.J., Pritzel, A., Heess, N., Erez, T., Tassa, Y., Silver, D., Wierstra, D.: Continuous control with deep reinforcement learning. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1509.02971 (2016)
Schulman, J., Moritz, P., Levine, S., Jordan, M.I., Abbeel, P.: High-dimensional continuous control using generalized advantage estimation. In: Bengio, Y., LeCun, Y. (eds.) 4th International Conference on Learning Representations, ICLR 2016, San Juan, Puerto Rico, May 2-4, 2016, Conference Track Proceedings. [Online]. Available: arXiv:1506.02438 (2016)
Gu, S., Holly, E., Lillicrap, T., Levine, S.: Deep reinforcement learning for robotic manipulation with asynchronous off-policy updates. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp 3389–3396. IEEE (2017)
Mahmood, A.R., Korenkevych, D., Vasan, G., Ma, W., Bergstra, J.: Benchmarking reinforcement learning algorithms on real-world robots, CoRR, vol. abs/1809.07731. [Online]. Available: arXiv:1809.07731 (2018)
Gu, S., Lillicrap, T., Sutskever, I., Levine, S.: Continuous deep Q-learning with model-based acceleration. In: International Conference on Machine Learning, pp 2829–2838 (2016)
Zhu, Y., Mottaghi, R., Kolve, E., Lim, J.J., Gupta, A., Fei-Fei, L., Farhadi, A.: Target-driven visual navigation in indoor scenes using deep reinforcement learning. In: 2017 IEEE International Conference on Robotics and Automation (ICRA), pp 3357–3364. IEEE (2017)
Tai, L., Liu, M.: Towards cognitive exploration through deep reinforcement learning for mobile robots, CoRR, vol. abs/1610.01733. [Online]. Available: arXiv:1610.01733 (2016)
Tai, L., Paolo, G., Liu, M.: Virtual-to-real deep reinforcement learning: continuous control of mobile robots for mapless navigation. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 31–36. IEEE (2017)
Chen, Y.F., Everett, M., Liu, M., How, J.P.: Socially aware motion planning with deep reinforcement learning. In: 2017 IEEE/RSJ International Conference on Intelligent Robots and Systems (IROS), pp 1343–1350. IEEE (2017)
Jesus, J.C., Bottega, J.A., Cuadros, M.A., Gamarra, D.F.: Deep deterministic policy gradient for navigation of mobile robots in simulated environments. In: 2019 19th International Conference on Advanced Robotics (ICAR), pp 362–367. IEEE (2019)
Pajaziti, A., Avdullahu, P.: SLAM: map building and navigation via ROS. Int. J. Intell. Syst. Appl. Eng. 2(4), 71–75 (2014)
Omara, H.I.M.A., Sahari, K.S.M.: Indoor mapping using Kinect and ROS. In: 2015 International Symposium on Agents, Multi-Agent Systems and Robotics (ISAMSR), pp 110–116. IEEE (2015)
François-Lavet, V., Henderson, P., Islam, R., Bellemare, M.G., Pineau, J.: An introduction to deep reinforcement learning. Found. Trends Mach. Learn. 11(3-4), 219–354 (2018). [Online]. Available: https://doi.org/10.1561/2200000071
Hausknecht, M., Stone, P.: Deep recurrent Q-learning for partially observable MDPs. In: 2015 AAAI Fall Symposium Series (2015)
Van Hasselt, H., Guez, A., Silver, D.: Deep reinforcement learning with double Q-learning. In: Thirtieth AAAI Conference on Artificial Intelligence (2016)
Haarnoja, T., Zhou, A., Abbeel, P., Levine, S.: Soft actor-critic: off-policy maximum entropy deep reinforcement learning with a stochastic actor, [Online]. Available: arXiv:1801.01290 (2018)
Haarnoja, T., Zhou, A., Hartikainen, K., Tucker, G., Ha, S., Tan, J., Kumar, V., Zhu, H., Gupta, A., Abbeel, P., Levine, S.: Soft actor-critic algorithms and applications, [Online]. Available: arXiv:1812.05905 (2018)
Kober, J., Bagnell, J.A., Peters, J.: Reinforcement learning in robotics: a survey. Int. J. Robot. Res. 32(11), 1238–1274 (2013)
Zhelo, O., Zhang, J., Tai, L., Liu, M., Burgard, W.: Curiosity-driven exploration for mapless navigation with deep reinforcement learning, [Online]. Available: arXiv:1804.00456 (2018)
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A.A., Veness, J., Bellemare, M.G., Graves, A., Riedmiller, M., Fidjeland, A.K., Ostrovski, G., et al.: Human-level control through deep reinforcement learning. Nature 518(7540), 529–533 (2015)
Uhlenbeck, G.E., Ornstein, L.S.: On the theory of the Brownian motion. Phys. Rev. 36(5), 823 (1930)
Plappert, M., Houthooft, R., Dhariwal, P., Sidor, S., Chen, R.Y., Chen, X., Asfour, T., Abbeel, P., Andrychowicz, M.: Parameter space noise for exploration, [Online]. Available: arXiv:1706.01905 (2017)
Achiam, J.: Spinning up in deep reinforcement learning, GitHub repository. [Online]. Available: https://github.com/openai/spinningup (2018)
Polyak, B.T., Juditsky, A.B.: Acceleration of stochastic approximation by averaging. SIAM J. Control Optim. 30(4), 838–855 (1992)
Pfitscher, M., Welfer, D., do Nascimento, E.J., Cuadros, M.A.d.S.L., Gamarra, D.F.T.: Users activity gesture recognition on Kinect sensor using convolutional neural networks and FastDTW for controlling movements of a mobile robot. Intel. Artif. 22(63), 121–134 (2019)
da Silva, R.M., de Souza Leite Cuadros, M.A., Gamarra, D.F.T.: Comparison of a backstepping and a fuzzy controller for tracking a trajectory with a mobile robot. In: Abraham, A., Cherukuri, A.K., Melin, P., Gandhi, N. (eds.) Intelligent Systems Design and Applications, pp 212–221. Cham, Springer International Publishing (2020)
Subramanian, V.: Deep Learning with PyTorch: a Practical Approach to Building Neural Network Models Using PyTorch. Packt Publishing Ltd (2018)
Wu, C.-J., Brooks, D., Chen, K., Chen, D., Choudhury, S., Dukhan, M., Hazelwood, K., Isaac, E., Jia, Y., Jia, B., et al.: Machine learning at facebook: understanding inference at the edge. In: 2019 IEEE International Symposium on High Performance Computer Architecture (HPCA), pp 331–344. IEEE (2019)
Laskin, M., Srinivas, A., Abbeel, P.: CURL: contrastive unsupervised representations for reinforcement learning. In: International Conference on Machine Learning, pp 5639–5650. PMLR (2020)
Dai, Z., Yang, Z., Yang, Y., Carbonell, J.G., Le, Q.V., Salakhutdinov, R.: Transformer-XL: attentive language models beyond a fixed-length context, [Online]. Available: arXiv:1901.02860 (2019)
Rao, D., McMahan, B.: Natural Language Processing with Pytorch: Build Intelligent Language Applications Using Deep Learning. O’Reilly Media, Inc (2019)
Pyo, Y., Cho, H., Jung, R., Lim, T.: ROS Robot Programming. ROBOTIS Co., Seoul (2015)
Joseph, L.: Mastering ROS for robotics programming. Packt Publishing Ltd (2015)
de Assis Brasil, P.M., Pereira, F.U., de Souza Leite Cuadros, M.A., Cukla, A.R., Tello Gamarra, D.F.: A study on global path planners algorithms for the simulated TurtleBot 3 robot in ROS. In: 2020 Latin American Robotics Symposium (LARS), 2020 Brazilian Symposium on Robotics (SBR) and 2020 Workshop on Robotics in Education (WRE), pp 1–6 (2020)
Fairchild, C., Harman, T.L.: ROS Robotics by Example. Packt Publishing Ltd (2016)
Sutton, R.S., Barto, A.G.: Reinforcement Learning: An Introduction. MIT Press, Cambridge (2018)
Acknowledgements
We would like to thank Fabio Ugalde Pereira for sharing the idea and the environments with symmetric and asymmetric map formats, and all participants of VersusAI for exchanging ideas and thoughts in the areas of Robotics and Artificial Intelligence.
Author information
Contributions
- Junior Costa de Jesus conceived the research, wrote the article, designed and programmed the experiments, and collected and processed the test data.
- Ricardo Bedin Grando wrote the article and collected and processed the test data.
- Victor Augusto Kich wrote the article, programmed the experiments, and collected and processed the test data.
- Alisson Henrique Kolling wrote the article, programmed the experiments, and collected and processed the test data.
- Marco Antonio de Souza Leite Cuadros contributed to the discussion and conception of the main ideas of the article and provided valuable comments.
- Daniel Fernando Tello Gamarra conceived the research, wrote the article, and participated in the discussion of the main ideas of the article.
Ethics declarations
Ethical Approval
The article has the approval of all the authors.
Consent to Participate
All the authors gave their consent to participate in this article.
Consent for Publication
The authors gave their consent for the publication of this article.
Competing interests
The authors declare that there are no conflicts of interest or competing interests.
Additional information
Publisher’s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
de Jesus, J., Kich, V.A., Kolling, A.H. et al. Soft Actor-Critic for Navigation of Mobile Robots. J Intell Robot Syst 102, 31 (2021). https://doi.org/10.1007/s10846-021-01367-5