Adaptive attitude determination of bionic polarization integrated navigation system based on reinforcement learning strategy

  • *Corresponding author: Tao Du
  • The bionic polarization integrated navigation system includes three-axis gyroscopes, three-axis accelerometers, three-axis magnetometers, and polarization sensors, which together provide pitch, roll, and yaw. When the magnetometers suffer interference or the polarization sensors are obscured, the attitude accuracy degrades because of abnormal measurements. To improve the attitude accuracy of the integrated navigation system in such complex environments, an adaptive complementary filter based on a DQN (Deep Q-learning Network) is proposed. The complementary filter is first designed to fuse the measurements from the gyroscopes, accelerometers, magnetometers, and polarization sensors. Then, a reward function of the bionic polarization integrated navigation system is defined as a function of the absolute value of the attitude angle error. The action-value function is represented by a fully-connected network trained on historical sensor data. The policy is computed by the deep Q-learning network, and the action that maximizes the action-value function is obtained. Based on the selected action, three types of integration are switched automatically to adapt to different environments. Three simulation cases are conducted to validate the effectiveness of the proposed algorithm. The results show that the DQN-based adaptive attitude determination of the bionic polarization integrated navigation system can improve the accuracy of attitude estimation.

    Mathematics Subject Classification: Primary: 58F15, 58F17; Secondary: 53C35.

  • Figure 1.  Illustration of DQN

    Figure 2.  Variation of geomagnetic field intensity under geomagnetic interference

    Figure 3.  Comparison of decision actions under geomagnetic interference

    Figure 4.  Attitude estimation errors under geomagnetic interference

    Figure 5.  Variation of polarization angle under polarization interference

    Figure 6.  Comparison of decision actions under polarization interference

    Figure 7.  Attitude estimation errors under polarization interference

    Figure 8.  (a) Polarization angle under polarization interference. (b) Geomagnetic field intensity under geomagnetic interference

    Figure 9.  Comparison of decision-making actions when the magnetometer is disturbed and polarization is blocked

    Figure 10.  Attitude estimation errors when the magnetometer is disturbed and polarization is blocked

    Table 1.  Standard deviation of attitude angle for decision comparison in experiment 1

    Method                             Pitch (°)   Roll (°)   Yaw (°)
    Complementary filter without DQN   0.4958      0.5057     0.5866
    Complementary filter with DQN      0.4790      0.5022     0.3540

    Table 2.  Standard deviation of attitude angle for decision comparison in experiment 2

    Method                             Pitch (°)   Roll (°)   Yaw (°)
    Complementary filter without DQN   0.5450      0.5078     0.4700
    Complementary filter with DQN      0.4640      0.5031     0.4241

    Table 3.  Standard deviation of attitude angle for decision comparison in experiment 3

    Method                             Pitch (°)   Roll (°)   Yaw (°)
    Complementary filter without DQN   0.4966      0.5052     0.6005
    Complementary filter with DQN      0.4735      0.5031     0.3822
  • [1] R. Wehner, Desert ant navigation: How miniature brains solve complex tasks, J. Comparative Physiology A, 189 (2003), 579-588. doi: 10.1007/s00359-003-0431-1.
    [2] S. M. Reppert, H. Zhu and R. H. White, Polarized light helps monarch butterflies navigate, Current Biology, 14 (2004), 155-158. doi: 10.1016/j.cub.2003.12.034.
    [3] R. Muheim, Behavioural and physiological mechanisms of polarized light sensitivity in birds, Philos. Trans. Roy. Soc. B Bio. Sci., 366 (2011), 763-771. doi: 10.1098/rstb.2010.0196.
    [4] J. Chu, Z. Wang, L. Guan and Z. Liu, Integrated polarization dependent photodetector and its application for polarization navigation, IEEE Photonics Tech. Lett., 26 (2014), 469-472. doi: 10.1109/LPT.2013.2296945.
    [5] D. Lambrinos, H. Kobayashi, R. Pfeifer, M. Maris and T. Labhart, et al., An autonomous agent navigating with a polarized light compass, Adaptive Behavior, 6 (1997), 131-161. doi: 10.1177/105971239700600104.
    [6] J. Chu, K. Zhao, Q. Zhang and T. Wang, Construction and performance test of a novel polarization sensor for navigation, Sensors Actuators A Phys., 148 (2008), 75-82. doi: 10.1016/j.sna.2008.07.016.
    [7] D. Wang, H. Liang, H. Zhu and S. Zhang, A bionic camera-based polarization navigation sensor, Sensors, 14 (2014), 13006-13023. doi: 10.3390/s140713006.
    [8] J. Chahl and A. Mizutani, Biomimetic attitude and orientation sensors, IEEE Sensors J., 12 (2012), 289-297. doi: 10.1109/JSEN.2010.2078806.
    [9] T. Du, Y. H. Zeng, J. Yang, C. Z. Tian and P. F. Bai, Multi-sensor fusion SLAM approach for the mobile robot with a bio-inspired polarised skylight sensor, IET Radar Sonar Navigation, 14 (2020), 1950-1957. doi: 10.1049/iet-rsn.2020.0260.
    [10] C. J. C. H. Watkins and P. Dayan, Q-learning, Machine Learning, 8 (1992), 279-292. doi: 10.1007/BF00992698.
    [11] V. Mnih, K. Kavukcuoglu, D. Silver, A. A. Rusu and J. Veness, et al., Human-level control through deep reinforcement learning, Nature, 518 (2015), 529-533. doi: 10.1038/nature14236.
    [12] T. Hester, M. Vecerik, O. Pietquin, M. Lanctot and T. Schaul, et al., Deep Q-learning from demonstrations, in Proceedings of the Thirty-Second AAAI Conference on Artificial Intelligence, 2018, 3223-3230.
    [13] Q. Zhang, T. Du and C. Tian, A Sim2real method based on DDQN for training a self-driving scale car, Math. Found. Comput., 2 (2019), 315-331. doi: 10.3934/mfc.2019020.
    [14] J. Fan, Z. Wang, Y. Xie and Z. Yang, A theoretical analysis of deep Q-learning, preprint, 2020, arXiv: 1901.00137.
    [15] J. Yang, T. Du, X. Liu, B. Niu and L. Guo, Method and implementation of a bioinspired polarization-based attitude and heading reference system by integration of polarization compass and inertial sensors, IEEE Trans. Industrial Electron., 67 (2020), 9802-9812. doi: 10.1109/TIE.2019.2952799.
    [16] G. Shani, D. Heckerman and R. I. Brafman, An MDP-based recommender system, J. Mach. Learn. Res., 6 (2005), 1265-1295.
    [17] S. James and E. Johns, 3D simulation for robot arm control with deep Q-learning, preprint, 2016, arXiv: 1609.03759.
    [18] J. L. Crassidis and J. L. Junkins, Optimal Estimation of Dynamic Systems, Chapman & Hall/CRC Applied Mathematics and Nonlinear Science Series, 2, Chapman & Hall/CRC, Boca Raton, FL, 2004. doi: 10.1201/9780203509128.
    [19] S. de Marco, M.-D. Hua, T. Hamel and C. Samson, Position, velocity, attitude and accelerometer-bias estimation from IMU and bearing measurements, 2020 European Control Conference (ECC), St. Petersburg, Russia, 2020. doi: 10.23919/ECC51009.2020.9143918.
    [20] R. L. Farrenkopf, Analytic steady-state accuracy solutions for two common spacecraft attitude estimators, J. Guidance Control, 1 (1978), 282-284. doi: 10.2514/3.55779.
    [21] S. O. H. Madgwick, A. J. L. Harrison and R. Vaidyanathan, Estimation of IMU and MARG orientation using a gradient descent algorithm, IEEE International Conference on Rehabilitation Robotics, Zurich, Switzerland, 2011. doi: 10.1109/ICORR.2011.5975346.
    [22] T. Du, C. Tian, J. Yang, S. Wang, X. Liu and L. Guo, An autonomous initial alignment and observability analysis for SINS with bio-inspired polarized skylight sensors, IEEE Sensors J., 20 (2020), 7941-7956. doi: 10.1109/JSEN.2020.2981171.
    [23] R. Mahony, T. Hamel and J.-M. Pflimlin, Nonlinear complementary filters on the special orthogonal group, IEEE Trans. Automat. Control, 53 (2008), 1203-1218. doi: 10.1109/TAC.2008.923738.
    [24] V. Mnih, K. Kavukcuoglu, D. Silver, A. Graves and I. Antonoglou, et al., Playing Atari with deep reinforcement learning, preprint, 2013, arXiv: 1312.5602.