CN119002244B - Intelligent steering based on machine learning

Info

Publication number: CN119002244B
Application number: CN202411488117.9A
Authority: CN
Inventors: 程依春; 罗长江; 程猷益
Original assignee: Zero New Energy Technology Guangdong Co ltd
Current assignee: Zero New Energy Technology Guangdong Co ltd
Priority date: 2024-10-24
Filing date: 2024-10-24
Publication date: 2025-02-14
Anticipated expiration: 2044-10-24
Also published as: CN119002244A

Abstract

The present invention relates to an intelligent steering method based on machine learning and belongs to the field of steering gear control technology. The method comprises the following steps: planning a tracking path along which a ship returns from its current position to a target path, and selecting reference points from the tracking path as the error-calculation reference points for steering control; obtaining the tracking deviation from set parameters and detected parameters, and correcting the tracking deviation for operating-condition interference using a machine learning model; converting the corrected tracking deviation into a steering angle of the steering gear, sequentially establishing a steering control parameter table between every two reference points, and controlling the output with a PD controller, with the steering gear as the actuator. By planning the tracking path to select reference points for tracking deviation calculation, and by using machine learning to compensate for the deviation caused by interference factors, the invention improves the accuracy and efficiency of path tracking.

Description

Intelligent steering based on machine learning

Technical Field

The invention belongs to the technical field of steering engine control, and particularly relates to an intelligent steering system based on machine learning.

Background

Intelligent steering is a steering engine control method that combines advanced technologies and algorithms. It aims to improve steering precision and efficiency during navigation by automatically correcting the offset angle of the sailing track.

At present, steering engine systems use a PID controller to steer the ship, and the error calculation in the PID control process takes points on the target path directly as reference points. This can cause control oscillation in the steering engine, and it considers only local optimization rather than global path adaptation. In addition, steering angle control does not account for interference from operating conditions: external environmental factors such as wind, currents, and waves affect the actual track of the ship. As a result, the automatic correction process has low accuracy and low efficiency.

Therefore, an intelligent steering method is needed that considers the return-to-path process globally and, by planning a tracking path, achieves higher accuracy and efficiency, so as to better cope with environmental interference and dynamic changes, reduce oscillation, and improve the fit to the target path.

Disclosure of Invention

To solve the problems in the prior art, the invention provides an intelligent steering method based on machine learning. Reference points for tracking deviation calculation are selected by planning a tracking path, and the deviation caused by interference factors is compensated using machine learning, which improves the accuracy and efficiency of path tracking.

The aim of the invention can be achieved by the following technical scheme:

The disclosure provides an intelligent steering method based on machine learning, comprising the following steps:

S1, selecting a reference point: planning a tracking path along which the ship returns from its current position to the target path, and selecting reference points from the tracking path as the error-calculation reference points for steering control;

S2, obtaining rudder angle parameters: obtaining the tracking deviation from the set parameters and the detected parameters, and correcting the tracking deviation for operating-condition interference based on a machine learning model;

S3, tracking the target path: converting the corrected tracking deviation into a steering angle of the steering engine, sequentially establishing a steering control parameter table between every two reference points, and controlling the output with a PD controller, with the steering engine as the actuator;

Selecting a reference point further comprises the following sub-steps:

S11, generating a tracking path according to the current position of the ship and the target path;

S12, acquiring reference points: on the generated optimal tracking path, setting a reference point selection interval according to the required control fineness, and selecting a group of rudder angle control reference points starting from the start of the tracking path;

Generating the tracking path further comprises the following sub-steps:

S111, selecting a target point: setting thresholds for the direction difference and the curvature difference, sampling a group of candidate target points from the target path, traversing the candidates, screening those whose direction difference and curvature difference are within the thresholds, and selecting the point closest to the current position as the final target point;

S112, generating an initial tracking path: randomly generating the initial tracking path of the ship using B-spline control points, according to the current position of the ship and the selected target point;

S113, searching control points: encoding the control points of the B-spline curve and searching for the optimal control point configuration through the evolution process of a genetic algorithm, thereby achieving the optimal tracking path planning objective.

Further, generating the initial tracking path further comprises the following sub-steps:

Let the time domain be $[t_s, t_f]$. The B-spline curve trajectory with $n$ control points is:

$$P(t) = \sum_{i=1}^{n} P_i N_{i,k}(t), \quad t \in [t_s, t_f]$$

where $P(t)$ is the spline curve trajectory, $P_i$ are the control points of the B-spline curve, and $N_{i,k}(t)$ is the $i$-th B-spline basis function of order $k$. The basis function $N_{i,k}(t)$ is given by the de Boor-Cox recurrence formula:

$$N_{i,k}(t) = U_1 N_{i,k-1}(t) + U_2 N_{i+1,k-1}(t)$$

where $U_1$ and $U_2$ are the coefficients of the basis functions;

Introducing the B-spline curve into the vessel path representation, the set of control point coordinates is expressed as:

$$P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

The path generated by this set of control points is:

$$X(t) = \sum_{i=1}^{n} x_i N_{i,k}(t), \quad Y(t) = \sum_{i=1}^{n} y_i N_{i,k}(t)$$

where $X(t)$ represents the change in position of a point on the path along the X-axis, and $Y(t)$ represents the change in position of a point on the path along the Y-axis.

Further, searching control points further comprises the following sub-steps:

S1131, encoding the control points and solving for the control point coordinates:

;

where the formula uses the coordinates of the bisection point of the line sf connecting the start point s and the end point f of the path, the distance between s and f, a random number rand with rand ∈ [-0.5, 0.5], and the slope of the straight line sf;

S1132, establishing a fitness function: setting constraints according to the actual requirements of the error correction process, and establishing the fitness function;

S1133, genetic iterative optimization: minimizing the fitness function through iterative optimization to obtain the optimal path.

Further, establishing the fitness function further comprises the following sub-steps:

Defining the path distance: calculating the path distance from the positions of the points on the tracking path:

;

where the formula sums over the discrete points on the path z, indexed by i, and h is the number of discrete points;

Defining the path smoothness: calculating the path smoothness from the curvature:

;

where the direction angle of the i-th point on the path is calculated from the relative position of the i-th and (i+1)-th points, and the change in direction angle represents the curvature change across three adjacent path points;

Defining the path direction: defining a direction-fitting objective function that measures how consistent the direction of the tracking path is with the direction of the target path:

;

where the formula compares the direction angle of the i-th point of the ship on the tracking path with the direction angle of the i-th point on the target path;

Establishing the fitness function: the fitness function is obtained as the weighted sum of the indexes established above:

;

where the three weight coefficients adjust the importance of each index.

Further, the genetic iterative optimization further comprises the following sub-steps:

Randomly generating control point coordinates, determining the crossover probability, mutation probability, and number of iterations, and generating tracking paths with a B-spline interpolation algorithm from the coordinates of the start point and the target point;

Calculating the fitness function value of every path and recording it as the fitness of the corresponding individual, performing roulette wheel selection according to the fitness of each individual to obtain offspring, with the individual of highest fitness bypassing selection and entering the offspring directly;

Performing crossover and mutation to obtain the (n+1)-th generation of individuals and calculating their fitness; judging whether n exceeds the set number of iterations and, if so, proceeding to the next step, otherwise returning to the preceding operation; and performing B-spline interpolation on the optimal individual obtained to produce the optimal tracking path.

Further, obtaining the rudder angle parameters further comprises the following sub-steps:

S21, acquiring the tracking deviation: taking the selected reference point as the set parameter and the coordinate point fed back by the GPS and inertial integrated navigation system as the detected parameter, and obtaining the tracking deviation between the two;

S22, correcting the tracking deviation: collecting the operating-condition parameters of the ship and using a machine learning model to correct the tracking deviation for operating-condition interference, thereby compensating the tracking deviation.

Further, acquiring the tracking deviation further comprises the following sub-steps:

Acquiring real-time feedback coordinates with the GPS and inertial integrated navigation system, time-aligning the data, and converting the geographic coordinates provided by the GPS into a coordinate system consistent with the inertial navigation system;

Fusing the GPS and inertial navigation data with a Kalman filtering algorithm to obtain the optimal detected parameter, and calculating the direction deviation between the set parameter and the detected parameter as the tracking deviation.

Further, correcting the tracking deviation further comprises the following sub-steps:

S221, establishing a machine learning model: building a data set from collected historical navigation data, constructing and training the machine learning model, and integrating the model into the steering engine control system of the ship;

S222, acquiring ship operating-condition data: installing sensors corresponding to the prediction features screened during model training, acquiring the operating-condition data in real time, and transmitting the acquired data to the steering engine system;

S223, correcting the tracking deviation: inputting the acquired real-time operating-condition data into the machine learning model, which predicts and outputs the rudder angle compensation deviation.

Further, establishing the machine learning model further comprises the following sub-steps:

Building a data set from historical navigation data, wherein the data types include current position, target position, rudder angle, speed, heading, wind speed, wind direction, and ocean current information;

Cleaning the collected raw data, extracting the features used for prediction from the raw data by the correlation coefficient method, and combining the features by polynomial feature enhancement;

Setting the number of input-layer nodes according to the screened prediction features, setting the output layer to the compensation deviation, setting the number of hidden layers and the number of nodes in each hidden layer according to an empirical formula, and selecting an activation function to complete model construction;

Model training: dividing the collected data into a training set, a validation set, and a test set in the proportions 70%, 15%, and 15%; passing the data through the input layer to the hidden layer and then through the activation function to the output layer to calculate the predicted output; calculating the gradient of the loss function, back-propagating the error to each layer, and adjusting the weight and bias of each neuron;

Evaluating the performance of the model using the validation set data and the test set data.

Further, tracking the target path further comprises the following sub-steps:

Establishing the control output of the PD controller: the proportional and derivative control outputs are combined to obtain the final control output $u(t)$:

$$u(t) = K_p e(t) + K_d \frac{de(t)}{dt}$$

where $e(t)$ is the tracking deviation at time $t$, $K_p$ is the proportional gain, and $K_d$ is the derivative gain;

Obtaining the transfer function of the PD controller: the transfer function expresses the relationship between the error and the control output in the continuous time domain, and the transfer function $G_{PD}(s)$ of the PD controller is:

$$G_{PD}(s) = K_p + K_d s$$

where $s$ is the complex frequency variable of the Laplace transform;

Calculating the control signal online from the output formula of the PD controller, limiting the amplitude of the control signal, converting the control signal calculated by the PD algorithm into a control instruction for the steering engine, and driving the steering engine to rotate through the corresponding angle;

Optimizing the parameters of the PD controller with the deep deterministic policy gradient (DDPG) algorithm:

Updating the learning parameters of the Actor-Critic network based on a reward function: the reward function evaluates how well the control system operates under the current PD parameters, and its inputs are the tracking deviation obtained after the control system runs and the output of the PD algorithm. After the reward is calculated, a comparator compares the reward value with the value estimate output by the Critic network, and the resulting error is fed into the Actor-Critic network to update its learning parameters;

Training the PD parameters with the Actor-Critic network: the Actor adjusts the PD parameters according to the input deviation of the control system, and the Critic estimates the value of the adjustment from the adjusted PD parameters and the deviation obtained when the control system runs with them. When the training result meets the set requirement, training stops and the resulting PD parameters are output as the optimal values.
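
By way of illustration only, the PD output formula and the amplitude-limiting step described above can be sketched in discrete time as follows. The class name `PDController`, the sampling period `dt`, and the rudder limit are hypothetical values chosen for the example, and the backward-difference derivative is an assumed discretization rather than one stated in the disclosure.

```python
class PDController:
    """Discrete PD law u = Kp*e + Kd*de/dt with output saturation (illustrative)."""

    def __init__(self, kp, kd, dt, limit):
        self.kp = kp          # proportional gain K_p
        self.kd = kd          # derivative gain K_d
        self.dt = dt          # sampling period in seconds (assumed)
        self.limit = limit    # maximum rudder command magnitude (assumed)
        self.prev_error = None

    def step(self, error):
        # Backward difference approximates de/dt; zero on the first sample.
        if self.prev_error is None:
            derivative = 0.0
        else:
            derivative = (error - self.prev_error) / self.dt
        self.prev_error = error
        u = self.kp * error + self.kd * derivative
        # Amplitude limiting before the command is sent to the steering engine.
        return max(-self.limit, min(self.limit, u))
```

In such a sketch the gains `kp` and `kd` would be supplied by whatever tuning procedure is used, for example the DDPG-based optimization described above.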

The beneficial effects of the invention are as follows:

The method establishes a tracking path for correcting deviation of the ship's navigation track and selects reference points from the tracking path as the basis for deviation calculation in steering control. This divides steering control into finer stages and improves the stability of ship operation and the smoothness of regulation. On the basis of the calculated deviation, the method predicts the steering angle control under the ship's current operating conditions to compensate the deviation, which reduces the influence of interference factors on the actual track and improves the accuracy of path tracking. Finally, the deviation is input to a PD controller to complete steering control along the tracking path, responding rapidly to input changes and reducing overshoot and oscillation.

During tracking path planning, a fitness function is established from constraints set according to actual requirements, so that the generated optimal path is smooth, easy to control, and fitted to the target path; flexible selection of reference points makes steering adjustment controllable and improves control precision. During deviation compensation, a machine learning algorithm compensates for the rudder angle influence caused by operating-condition interference, so that the deviation fed to the PD controller is more accurate, the system responds faster, and path tracking efficiency improves. During steering control, the parameters of the PD controller are optimized with the deep deterministic policy gradient algorithm to adapt to different operating conditions, which strengthens the robustness of the control system.

Drawings

The present invention is further described below with reference to the accompanying drawings for the convenience of understanding by those skilled in the art.

FIG. 1 is a schematic diagram of the steps of intelligent steering based on machine learning according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of the steps of selecting a reference point according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of the steps of generating a tracking path according to an embodiment of the present invention;

FIG. 4 is a schematic diagram of the steps of obtaining rudder angle parameters according to an embodiment of the present invention;

FIG. 5 is a schematic diagram of the steps of tracking deviation correction according to an embodiment of the present invention.

Detailed Description

To further describe the technical means adopted by the invention to achieve its intended purpose and their effects, the specific implementation, structure, features, and effects of the invention are described in detail below with reference to the accompanying drawings and the preferred embodiment.

Intelligent steering based on machine learning uses a data-driven method to optimize steering engine control. Historical navigation data, including rudder angle, ship position, speed, and environmental conditions, are collected from sensors and ship systems. A model is trained with a machine learning algorithm to learn the steering compensation of the ship under different conditions, and the compensation is added to the deviation angle adjusted by the steering engine so as to reduce the influence of operating conditions on path tracking.

Specifically, the following details are given for each step of intelligent steering based on machine learning:

This embodiment provides an intelligent steering method based on machine learning which, as shown in FIG. 1, comprises the following steps:

S1, selecting a reference point: planning a tracking path along which the ship returns from its current position to the target path, and selecting reference points from the tracking path as the error-calculation reference points for steering control, comprising the following steps:

S11, generating a tracking path according to the current position of the ship and the target path, as shown in FIG. 3, comprising the following steps:

S111, selecting a target point: setting a threshold for the direction difference and a threshold for the curvature difference, and sampling a group of candidate target points from the target path; the points may be sampled uniformly or distributed according to the geometric characteristics of the path;

Traversing the candidate target points, screening those whose direction difference and curvature difference are within the thresholds, and selecting, from the points that meet the conditions, the point closest to the current position as the final target point.

It can be understood that, in this embodiment, the tracking path brings the ship from its current position back to the target path, so a point on the target path must be chosen as the end point of the tracking path. The target point should be chosen so that the path from the current position to the target point best matches the target path in curvature and direction, which ensures that the path direction fits and that the rudder angle adjustment is feasible.

It should be noted that if no sampled point meets the conditions, the thresholds may be relaxed by gradually increasing the direction difference threshold and the curvature difference threshold until a suitable target point is found. Alternatively, when no suitable point exists, the point with the smallest direction or curvature difference may be selected as the target point even though it does not fully satisfy the threshold condition. The thresholds may be set experimentally or empirically. Because points meeting the threshold condition are screened directly and the nearest one is preferred, the computational complexity is low and the selection runs in real time.
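
By way of illustration only, the screening, nearest-point selection, and threshold relaxation of step S111 might be sketched as follows. The function name, the dictionary layout of a candidate, the relaxation factor, and the number of relaxation rounds are all hypothetical. The sketch also assumes that the direction and curvature differences are measured between the ship's current state and the target path at each candidate, which the embodiment does not spell out.

```python
import math

def select_target_point(current, ship_heading, ship_curvature, candidates,
                        dir_thresh, curv_thresh, relax=1.5, max_rounds=5):
    """Pick the nearest candidate whose direction and curvature differences
    are within the thresholds, relaxing the thresholds if none qualifies."""
    def ang_diff(a, b):
        # Absolute angular difference wrapped into [0, pi].
        return abs((a - b + math.pi) % (2 * math.pi) - math.pi)

    for _ in range(max_rounds):
        ok = [c for c in candidates
              if ang_diff(c["heading"], ship_heading) <= dir_thresh
              and abs(c["curvature"] - ship_curvature) <= curv_thresh]
        if ok:
            return min(ok, key=lambda c: math.hypot(c["x"] - current[0],
                                                    c["y"] - current[1]))
        # No candidate qualified: gradually enlarge both thresholds.
        dir_thresh *= relax
        curv_thresh *= relax
    # Fallback: the candidate with the smallest direction difference.
    return min(candidates, key=lambda c: ang_diff(c["heading"], ship_heading))
```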

S112, generating an initial tracking path: randomly generating the initial tracking path of the ship using B-spline control points, according to the current position of the ship and the selected target point, comprising the following steps:

Let the time domain be $[t_s, t_f]$. The B-spline curve trajectory with $n$ control points is:

$$P(t) = \sum_{i=1}^{n} P_i N_{i,k}(t), \quad t \in [t_s, t_f]$$

where $P_i$ are the control points and the basis function $N_{i,k}(t)$ is given by the de Boor-Cox recurrence formula:

$$N_{i,k}(t) = U_1 N_{i,k-1}(t) + U_2 N_{i+1,k-1}(t)$$

where $U_1$ and $U_2$ are the coefficients of the basis functions.

The set of control point coordinates is expressed as:

$$P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

The path generated by this set of control points is:

$$X(t) = \sum_{i=1}^{n} x_i N_{i,k}(t), \quad Y(t) = \sum_{i=1}^{n} y_i N_{i,k}(t)$$

It should be noted that in this embodiment the shape and trend of the spline curve are adjusted through the control points of the B-spline curve, and the first and last control points are the start point and end point of the spline curve trajectory, respectively. B-spline interpolation can interpolate path points that are some distance apart; the interpolated curve is smoother and its curvature changes more continuously, which improves path quality. This in turn smooths the lateral deviation and heading deviation, prevents oscillation caused by abrupt error changes, and makes the transition of the ship's tracking path more natural and continuous.
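
By way of illustration only, a B-spline path can be evaluated from a set of control points with the standard Cox-de Boor recursion, as sketched below. The clamped uniform knot vector, the normalized parameter range [0, 1], the cubic default degree, and the function names are assumptions made for the example; the embodiment does not specify them. Clamping the knots is what makes the curve start and end at the first and last control points.

```python
import numpy as np

def basis(i, k, t, knots):
    """Cox-de Boor recursion for the i-th basis function of degree k."""
    if k == 0:
        return 1.0 if knots[i] <= t < knots[i + 1] else 0.0
    left = right = 0.0
    d1 = knots[i + k] - knots[i]
    d2 = knots[i + k + 1] - knots[i + 1]
    if d1 > 0:  # skip terms with zero-width knot spans
        left = (t - knots[i]) / d1 * basis(i, k - 1, t, knots)
    if d2 > 0:
        right = (knots[i + k + 1] - t) / d2 * basis(i + 1, k - 1, t, knots)
    return left + right

def bspline_path(ctrl, degree=3, samples=50):
    """Sample a clamped B-spline defined by control points `ctrl`."""
    ctrl = np.asarray(ctrl, dtype=float)
    n = len(ctrl)  # requires n >= degree + 1
    inner = np.linspace(0.0, 1.0, n - degree + 1)
    knots = np.concatenate([np.zeros(degree), inner, np.ones(degree)])
    ts = np.linspace(0.0, 1.0, samples, endpoint=False)
    pts = [sum(basis(i, degree, t, knots) * ctrl[i] for i in range(n))
           for t in ts]
    pts.append(ctrl[-1])  # t = 1 is added explicitly (half-open knot spans)
    return np.array(pts)
```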

S113, searching control points: encoding the control points of the B-spline curve and searching for the optimal control point configuration through the evolution process of a genetic algorithm, thereby achieving the optimal tracking path planning objective, comprising the following steps:

S1131, encoding the control points and solving for the control point coordinates:

;

where the formula uses the coordinates of the bisection point of the line sf connecting the start point s and the end point f of the path, the distance between s and f, a random number rand with rand ∈ [-0.5, 0.5], and the slope of the straight line sf.

It can be understood that the distance between the start point s and the end point f can be obtained with a distance formula from their longitude and latitude coordinates. With this representation, the control points of the B-spline curve serve as the individuals of the genetic algorithm: a group of control points represents one feasible path, and each tracking path represents one genetic individual for the subsequent optimization.

S1132, establishing a fitness function: setting constraints according to the actual requirements of the error correction process, and establishing the fitness function, comprising the following steps:

(1) Defining the path distance: calculating the path distance from the positions of the points on the tracking path:

;

where the formula sums over the discrete points on the path z, indexed by i, and h is the number of discrete points.

This index constrains the path distance so that the tracking path minimizes the total distance.

(2) Defining the path smoothness: calculating the path smoothness from the curvature:

;

where the direction angle of the i-th point on the path is calculated from the relative position of the i-th and (i+1)-th points, and the change in direction angle represents the curvature change across three adjacent path points.

To make the path as smooth as possible, the curvature should be as small as possible.

(3) Defining the path direction: defining a direction-fitting objective function that measures how consistent the direction of the tracking path is with the direction of the target path:

;

where the formula compares the direction angle of the i-th point of the ship on the tracking path with the direction angle of the i-th point on the target path.

Minimizing this objective function keeps the direction of the planned tracking path as close as possible to the direction of the target path and avoids deviation from the expected direction.

(4) Establishing the fitness function: the fitness function is obtained as the weighted sum of the indexes established above:

;

The tracking path planned in this embodiment is used, together with the rudder system, to correct the error of the ship in tracking the target path. During this process the ship stays within the error-correction range of the target path, so factors such as obstacles need not be considered. The problem actually solved therefore has three constraints: the path should be as short as possible, as smooth as possible, and as close to the target path as possible.
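
By way of illustration only, the three indexes and their weighted sum might be computed as sketched below. The specific forms are assumptions made for the example: path distance as the summed Euclidean length of the segments, smoothness as the summed absolute change in direction angle, and direction fit as the summed absolute difference from the target-path direction angles. The function names and the default weights are hypothetical.

```python
import numpy as np

def wrap(angle):
    """Wrap angles into [-pi, pi)."""
    return (angle + np.pi) % (2 * np.pi) - np.pi

def fitness(path, target_headings, weights=(1.0, 1.0, 1.0)):
    """Weighted sum of path distance, smoothness, and direction fit.

    path: (h, 2) discrete points; target_headings: (h - 1,) direction
    angles of the target path, one per path segment (assumed layout).
    """
    path = np.asarray(path, dtype=float)
    seg = np.diff(path, axis=0)
    # (1) Path distance: total length of the polyline.
    distance = np.sum(np.hypot(seg[:, 0], seg[:, 1]))
    # Direction angle of point i from the relative position of i and i + 1.
    theta = np.arctan2(seg[:, 1], seg[:, 0])
    # (2) Smoothness: accumulated change in direction angle.
    smoothness = np.sum(np.abs(wrap(np.diff(theta))))
    # (3) Direction fit: accumulated mismatch with the target direction.
    direction = np.sum(np.abs(wrap(theta - np.asarray(target_headings))))
    w1, w2, w3 = weights
    return w1 * distance + w2 * smoothness + w3 * direction
```

A straight path aligned with the target direction scores only its length, while a path with a right-angle bend is penalized by both the smoothness and the direction terms.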

S1133, genetic iterative optimization: minimizing the fitness function through iterative optimization to obtain the optimal path, comprising the following steps:

Randomly generating control point coordinates, determining parameters such as the crossover probability, mutation probability, and number of iterations, and generating tracking paths with a B-spline interpolation algorithm from the coordinates of the start point and the target point;

Calculating the fitness function value of every path and recording it as the fitness of the corresponding individual, performing roulette wheel selection according to the fitness of each individual to obtain offspring, with the individual of highest fitness bypassing selection and entering the offspring directly;

Performing crossover and mutation to obtain the (n+1)-th generation of individuals and calculating their fitness; judging whether n exceeds the set number of iterations and, if so, proceeding to the next step, otherwise returning to the preceding operation;

Performing B-spline interpolation on the optimal individual obtained to produce the optimal tracking path.

It should be noted that the method randomly generates tracking paths from B-spline control points, calculates the comprehensive cost function of each path in the dynamic flow field from the path distance, the path smoothness, and the fit to the target path, and iteratively searches for the optimal combination of control points with a genetic algorithm. This provides refined data for the subsequent rudder angle control of the rudder system, so that the ship follows the target path accurately, rudder angle fluctuation is reduced, and the sailing stability of the ship improves.
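
By way of illustration only, the loop of roulette wheel selection, elitism, crossover, and mutation might be sketched as below for a generic cost function over a flat vector of control point coordinates. The function name, the mapping from cost to roulette weight, single-point crossover, Gaussian mutation, and all default parameter values are assumptions made for the example. As a simplification, the elite individual is copied into the next generation but also remains eligible for roulette selection.

```python
import numpy as np

def genetic_minimize(cost, dim, pop_size=30, generations=60,
                     p_cross=0.8, p_mut=0.1, bounds=(-1.0, 1.0), seed=0):
    """Minimize `cost` over vectors of length `dim` with a simple GA."""
    rng = np.random.default_rng(seed)
    pop = rng.uniform(bounds[0], bounds[1], size=(pop_size, dim))
    for _ in range(generations):
        costs = np.array([cost(ind) for ind in pop])
        elite = pop[np.argmin(costs)].copy()  # best individual, kept as is
        # Roulette wheel: lower cost gets a larger slice (assumed mapping).
        weights = 1.0 / (1.0 + costs - costs.min())
        probs = weights / weights.sum()
        parents = pop[rng.choice(pop_size, size=pop_size - 1, p=probs)]
        children = parents.copy()
        # Single-point crossover on consecutive pairs of parents.
        for a in range(0, pop_size - 2, 2):
            if dim > 1 and rng.random() < p_cross:
                cut = rng.integers(1, dim)
                children[a, cut:] = parents[a + 1, cut:]
                children[a + 1, cut:] = parents[a, cut:]
        # Gaussian mutation applied gene by gene.
        mask = rng.random(children.shape) < p_mut
        children[mask] += rng.normal(0.0, 0.1, size=mask.sum())
        children = np.clip(children, bounds[0], bounds[1])
        pop = np.vstack([elite, children])
    costs = np.array([cost(ind) for ind in pop])
    return pop[np.argmin(costs)], float(costs.min())
```

In the setting of this embodiment, `cost` would decode the vector into control points, interpolate the B-spline path, and return its fitness function value.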

S12, acquiring reference points: on the generated optimal tracking path, setting a reference point selection interval according to the required control fineness, and selecting a group of rudder angle control reference points starting from the start of the tracking path.

It can be understood that the above steps generate an optimal tracking path along which the ship returns to the target path. Taking points on the tracking path as reference points, the error between the ship's current position and each reference point can be calculated and converted into a rudder angle control error, so that path tracking is completed under segmented control adjustment. Planning the tracking path improves the accuracy of path tracking and the stability of control. The tracking path also corrects the ship's position, so that the ship's speed and direction match the target path once it reaches the target point, which reduces the risk of yaw after arrival.
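
By way of illustration only, reference points might be picked along the tracking path at a fixed arc-length interval, as sketched below. The function name and the choice of arc length as the interval measure are assumptions; the embodiment only states that the interval follows the required control fineness.

```python
import numpy as np

def pick_reference_points(path, spacing):
    """Select path points roughly every `spacing` units of arc length,
    starting from the first point and always keeping the end point."""
    path = np.asarray(path, dtype=float)
    seg = np.hypot(*np.diff(path, axis=0).T)
    s = np.concatenate([[0.0], np.cumsum(seg)])  # arc length at each point
    marks = np.arange(0.0, s[-1], spacing)
    # First path point at or beyond each arc-length mark.
    idx = np.searchsorted(s, marks)
    idx = np.unique(np.append(idx, len(path) - 1))
    return path[idx]
```

A smaller `spacing` yields more reference points and therefore finer segmented control.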

S2, obtaining rudder angle parameters: obtaining the tracking deviation from the set parameters and the detected parameters, and correcting the tracking deviation for operating-condition interference based on a machine learning model, comprising the following steps:

S21, acquiring the tracking deviation: taking the selected reference point as the set parameter and the coordinate point fed back by the GPS and inertial integrated navigation system as the detected parameter, and obtaining the tracking deviation between the two, comprising the following steps:

Acquiring real-time feedback coordinates with the GPS and inertial integrated navigation system, wherein the inertial navigation system estimates the current position by measuring acceleration and angular velocity, and the GPS provides global positioning coordinates;

The GPS data and the inertial navigation data are typically sampled at different frequencies, so the data must be time-aligned to make their timestamps consistent. Synchronization is achieved by interpolation or data alignment methods;

Converting the geographic coordinates provided by the GPS (e.g., latitude and longitude) into a coordinate system consistent with the inertial navigation system (e.g., ENU or ECEF), so that the two data sources are compared within the same frame of reference.

Fusing the GPS and inertial navigation data with an algorithm such as Kalman filtering to obtain the optimal detected parameter and to reduce error or drift when the GPS signal is lost.

Calculating the direction deviation between the set parameter and the detected parameter as the tracking deviation.

GPS provides positioning with high precision that does not degrade over positioning time, while the inertial navigation system provides highly precise motion displacement. Data fusion combines the advantages of both: the global positioning capability of GPS and the high-frequency data of the inertial navigation system together improve the performance and precision of the whole system.
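
By way of illustration only, the fusion idea can be sketched with a scalar Kalman filter on one position axis: the inertial displacement drives the prediction and a GPS fix, when available, drives the correction. The function name, the noise variances `q` and `r`, and the one-dimensional state are simplifying assumptions; a practical filter would carry a multi-dimensional state.

```python
def kalman_fuse(gps_positions, ins_displacements, q=0.01, r=4.0, x0=0.0, p0=1.0):
    """Fuse INS displacement increments with GPS position fixes (1-D sketch).

    gps_positions: GPS fix per step, or None when the signal is lost.
    ins_displacements: displacement measured by the INS over each step.
    q, r: assumed process and measurement noise variances.
    """
    x, p = x0, p0
    estimates = []
    for z, dx in zip(gps_positions, ins_displacements):
        # Predict: propagate the position with the INS displacement.
        x, p = x + dx, p + q
        if z is not None:
            # Update: blend in the GPS fix using the Kalman gain.
            k = p / (p + r)
            x, p = x + k * (z - x), (1.0 - k) * p
        estimates.append(x)
    return estimates
```

When a GPS fix is missing, the estimate simply follows the inertial prediction, which corresponds to the drift-limiting behavior described above.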

In this embodiment, the tracking deviation of the set parameter and the detection parameter is mainly used to calculate the deflection angle of the steering engine, so as to implement intelligent steering to complete the navigation of the tracking path.
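As an illustration of the direction deviation, the sketch below computes the signed angle between the current heading and the bearing from the fused position to the reference point. The text does not spell out the exact definition, so this interpretation, the local east/north frame, and the function names are all assumptions.

```python
import math

def wrap_angle(a):
    """Wrap an angle in radians into the range [-pi, pi]."""
    return math.atan2(math.sin(a), math.cos(a))

def tracking_deviation(position, heading, reference):
    """Signed angle (rad) from the current heading to the bearing of the reference point.

    position, reference: (east, north) coordinates in metres (assumed local frame)
    heading: radians, measured counter-clockwise from the east axis (assumed convention)
    """
    bearing = math.atan2(reference[1] - position[1], reference[0] - position[0])
    return wrap_angle(bearing - heading)
```

A positive value means the reference point lies counter-clockwise of the current heading under the assumed convention; a real system would have to match the sign to the rudder direction.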

S22, tracking deviation correction: working condition parameters of the ship are collected, and a machine learning model is used to correct the tracking deviation for working condition interference and thereby compensate it, as shown in fig. 5, comprising the following steps:

S221, establishing a machine learning model: a data set is established from collected historical navigation data, a machine learning model is constructed and trained, and the model is integrated into the steering engine control system of the ship, comprising the following steps:

The data set is established from historical navigation data; the data types include, but are not limited to, current position, target position, rudder angle, navigational speed, heading, wind speed, wind direction and ocean current information. The collected raw data are cleaned by removing abnormal values and noise data and handling missing data, and are then normalized or standardized. Features for prediction are extracted from the raw data by the correlation coefficient method and are combined by polynomial feature enhancement.
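The preprocessing chain described above can be sketched with NumPy alone. The thresholds (`z_max`, `threshold`), the choice of a degree-2 expansion, and all function names are assumptions made for illustration; the text does not specify them. The sketch also assumes no feature column is constant.

```python
import numpy as np

def clean(X, y, z_max=3.0):
    """Drop rows with missing values, then rows with an outlier (|z-score| >= z_max) in any column."""
    keep = ~np.isnan(X).any(axis=1) & ~np.isnan(y)
    X, y = X[keep], y[keep]
    z = np.abs((X - X.mean(axis=0)) / X.std(axis=0))
    keep = (z < z_max).all(axis=1)
    return X[keep], y[keep]

def standardize(X):
    """Scale every column to zero mean and unit variance."""
    return (X - X.mean(axis=0)) / X.std(axis=0)

def select_by_correlation(X, y, threshold=0.1):
    """Indices of the columns whose absolute Pearson correlation with y is at least threshold."""
    r = np.array([np.corrcoef(X[:, j], y)[0, 1] for j in range(X.shape[1])])
    return np.flatnonzero(np.abs(r) >= threshold)

def polynomial_features(X):
    """Append all squares and pairwise products of the columns (degree-2 expansion)."""
    n = X.shape[1]
    cross = [X[:, i] * X[:, j] for i in range(n) for j in range(i, n)]
    return np.column_stack([X] + cross)
```

A typical order would be `clean`, then `select_by_correlation`, then `polynomial_features`, then `standardize`, although the text leaves the exact ordering open.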

The empirical formula for the number of hidden-layer neurons is:

$$h = \sqrt{m + n} + a$$

wherein h, m and n are the numbers of neurons in the hidden layer, the input layer and the output layer respectively, and a is a constant between 1 and 10.
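As a worked example, the commonly used empirical rule h = sqrt(m + n) + a, which is assumed here to be the rule the text refers to, can be evaluated as below. The function name and the rounding to the nearest integer are assumptions.

```python
import math

def hidden_neurons(m, n, a):
    """Empirical hidden-layer size h = sqrt(m + n) + a, rounded to the nearest integer.

    m: number of input-layer neurons, n: number of output-layer neurons,
    a: tuning constant expected to lie between 1 and 10.
    """
    if not 1 <= a <= 10:
        raise ValueError("a is expected to lie between 1 and 10")
    return round(math.sqrt(m + n) + a)
```

For instance, with eight input features, one output (the compensation deviation) and a = 3, the rule gives sqrt(9) + 3 = 6 hidden neurons; in practice a would be tuned against validation error.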

Model training: the collected data are divided into a training set, a validation set and a test set in the proportions 70%, 15% and 15%;

The data are passed from the input layer to the hidden layer and then, through the activation function, to the output layer, where the predicted output value is calculated. The gradient of the loss function is calculated and the error is back-propagated to each layer to adjust the weight and bias of each neuron. The model weights are updated over multiple iterations until the model converges or a set stop condition is reached (for example, the loss function no longer decreases significantly).

The performance of the model is evaluated using the validation set data. Common evaluation metrics include Mean Square Error (MSE) and Mean Absolute Error (MAE), and the validation set is used to detect and avoid overfitting of the model.

The model is then evaluated on the test set data to check its generalization capability on unseen data.
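The training procedure above can be sketched as a one-hidden-layer BP network in plain NumPy. Everything beyond the 70/15/15 split and the forward/backward scheme is an assumption: the tanh activation, the learning rate, the early-stopping `patience`, and the function names are illustrative only.

```python
import numpy as np

def split(X, y, rng):
    """Shuffle and split into 70% training, 15% validation, 15% test."""
    idx = rng.permutation(len(X))
    a, b = int(0.70 * len(X)), int(0.85 * len(X))
    tr, va, te = idx[:a], idx[a:b], idx[b:]
    return (X[tr], y[tr]), (X[va], y[va]), (X[te], y[te])

def mse(pred, y):
    return float(np.mean((pred - y) ** 2))

def forward(params, X):
    """Input layer -> tanh hidden layer -> linear output layer."""
    W1, b1, W2, b2 = params
    H = np.tanh(X @ W1 + b1)
    return H, (H @ W2 + b2).ravel()

def train_bp(train, val, hidden=6, lr=0.05, epochs=2000, patience=50, seed=0):
    """Full-batch gradient descent with back-propagation and early stopping."""
    rng = np.random.default_rng(seed)
    (X, y), (Xv, yv) = train, val
    params = [rng.normal(0, 0.5, (X.shape[1], hidden)), np.zeros(hidden),
              rng.normal(0, 0.5, (hidden, 1)), np.zeros(1)]
    best, best_val, stale = [p.copy() for p in params], np.inf, 0
    for _ in range(epochs):
        H, pred = forward(params, X)
        err = (pred - y)[:, None] * (2.0 / len(y))    # gradient of MSE w.r.t. output
        gW2, gb2 = H.T @ err, err.sum(axis=0)
        dH = (err @ params[2].T) * (1.0 - H ** 2)      # back-propagate through tanh
        gW1, gb1 = X.T @ dH, dH.sum(axis=0)
        for p, g in zip(params, (gW1, gb1, gW2, gb2)):
            p -= lr * g                                # in-place weight and bias update
        v = mse(forward(params, Xv)[1], yv)
        if v < best_val - 1e-9:
            best, best_val, stale = [p.copy() for p in params], v, 0
        else:
            stale += 1
            if stale >= patience:                      # validation loss stopped improving
                break
    return best, best_val
```

The parameters with the lowest validation loss are returned, and the held-out test set is scored once with `mse(forward(best, X_test)[1], y_test)` to estimate generalization.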

In the process, tracking deviation is obtained through deviation between the current position and the reference point, the tracking deviation is taken as a basis, and a deviation angle to be compensated for the ship under a real-time working condition is added, so that the actual track of the ship in the navigation of the tracking path is more accurate, and the ship can complete regression of the target path in the planned tracking path.

S3, tracking the target path: the corrected tracking deviation is converted into a steering angle of the steering engine, a steering control parameter table is established in turn between every two reference points, and a PD controller produces the control output with the steering engine as the actuator, comprising the following steps:

Proportional control (P): the proportional control part adjusts the output directly according to the current error.

$$u_P(t) = K_p e(t)$$

Where K_p is the proportional gain and e(t) is the tracking deviation at time t.

Differential control (D): the differential control part predicts the trend of the error and suppresses overshoot caused by rapid change of the error.

$$u_D(t) = K_d \frac{de(t)}{dt}$$

Where K_d is the differential gain.

The PD controller combines the proportional and differential control outputs to obtain the final control output u(t):

$$u(t) = K_p e(t) + K_d \frac{de(t)}{dt}$$

Transfer function of the PD controller: in the continuous time domain, the transfer function expresses the relationship from the error to the control output. In the Laplace transform domain, the transfer function G_PD(s) of the PD controller is expressed as:

$$G_{PD}(s) = K_p + K_d s$$

Where s is the complex frequency variable in the Laplace transform.
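For reference, the transfer function follows from the time-domain control law by taking the Laplace transform with zero initial conditions. This is a standard textbook step; U(s) and E(s) are names assumed here for the transforms of u(t) and e(t).

```latex
\begin{aligned}
u(t) &= K_p\, e(t) + K_d\, \frac{de(t)}{dt} \\
U(s) &= K_p\, E(s) + K_d\, s\, E(s) \\
G_{PD}(s) &= \frac{U(s)}{E(s)} = K_p + K_d\, s
\end{aligned}
```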

According to an output formula of the PD controller, calculating a control signal on line, and performing amplitude limiting treatment on the control signal to ensure that the output signal is in a physical range allowed by a steering engine;

and converting the control signal calculated by the PD algorithm into a control instruction (such as a PWM signal) of the steering engine, and driving the steering engine to execute corresponding angle rotation.
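The online calculation, amplitude limiting and conversion to a steering command can be sketched as a discrete-time PD loop. The backward-difference derivative, the 35 degree rudder limit and the 1000 to 2000 microsecond pulse range are assumptions for illustration, not values from the text; `PDController` and `angle_to_pwm_us` are hypothetical names.

```python
class PDController:
    """Discrete PD controller with output saturation."""

    def __init__(self, kp, kd, limit_deg=35.0):
        self.kp, self.kd, self.limit = kp, kd, limit_deg
        self.prev_error = None

    def update(self, error, dt):
        # Backward-difference approximation of de/dt; zero on the first sample
        derivative = 0.0 if self.prev_error is None else (error - self.prev_error) / dt
        self.prev_error = error
        u = self.kp * error + self.kd * derivative
        # Amplitude limiting keeps the command inside the allowed rudder range
        return max(-self.limit, min(self.limit, u))


def angle_to_pwm_us(angle_deg, limit_deg=35.0, center_us=1500.0, span_us=500.0):
    """Map a rudder angle to a PWM pulse width in microseconds (hypothetical calibration)."""
    return center_us + span_us * angle_deg / limit_deg
```

Each control cycle would call `update` with the corrected tracking deviation and pass the limited angle to `angle_to_pwm_us`; the real mapping depends on the steering engine's driver.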

Adjusting the parameters of the PD controller with DDPG (Deep Deterministic Policy Gradient), comprising:

The learning parameters of the Actor-Critic network are updated based on the reward function. The reward function is a weighted sum of its input signals and evaluates how well the control system performs under the current PD parameters. The inputs of the reward function are the tracking deviation obtained after the control system operates and the output of the PD algorithm. Once the reward is calculated, a comparator compares it with the value estimate output by the Critic network, and the resulting error is fed to the Actor-Critic network to update its learning parameters.

The PD parameters are trained based on the Actor-Critic network. The Actor-Critic network is a deep neural network architecture: the Actor adjusts the PD parameters according to the input deviation of the control system, and the Critic estimates the value of the adjustment from the adjusted PD parameters and the deviation obtained when the control system runs with them. Training stops once the training result meets the set requirement, and the resulting PD parameters are output as the optimal values.
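The reward and comparator described above can be illustrated with two small functions. The weights, the negative sign convention and the discounted bootstrap term are assumptions: the text only states that the reward is a weighted sum of the tracking deviation and the PD output and that it is compared with the Critic's value estimate. A full DDPG implementation (replay buffer, target networks, Actor and Critic updates) is outside the scope of this sketch.

```python
def reward(tracking_deviation, pd_output, w_dev=1.0, w_out=0.1):
    """Weighted sum of the input signals, negated so that a smaller deviation
    and a smaller control effort score higher (sign convention assumed)."""
    return -(w_dev * abs(tracking_deviation) + w_out * abs(pd_output))


def critic_error(r, q_estimate, q_next=0.0, gamma=0.99):
    """Comparator output: target value minus the Critic's value estimate.

    With q_next = 0 this is the plain difference r - q_estimate described in
    the text; the discounted term gamma * q_next is the usual temporal-difference
    extension and is an assumption here."""
    return r + gamma * q_next - q_estimate
```

The error returned by `critic_error` is the quantity that would drive the Critic update, while the Actor would be adjusted to propose K_p and K_d values that the Critic rates more highly.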

It should be noted that the selected reference points divide the tracking path into a plurality of segments. For each segment, the corresponding rudder angle parameters are obtained through deviation calculation and interference correction at its two end points, and the rudder angle parameters are input into the steering engine system, thereby completing the steering control for navigation along the tracking path.

The above are only preferred embodiments of the present invention and do not limit the invention in any form. Although the invention has been disclosed above by way of preferred embodiments, these are not intended to limit it. Any person skilled in the art may use the technical content disclosed above to make changes or modifications that yield equivalent embodiments, and any modification, equivalent variation or alteration made to the above embodiments according to the technical principles of the present invention falls within the scope of the present invention.

Claims

1. An intelligent steering system based on machine learning, characterized in that it comprises the following steps:

Select reference point: Plan the tracking path of the ship from the current position to the target path, and select the reference point from the tracking path as the reference point for error calculation of steering control;

Obtaining rudder angle parameters: Obtaining tracking deviation by setting parameters and detecting parameters, and correcting the operating conditions of the tracking deviation based on the machine learning model;

Target path tracking: The corrected tracking deviation is converted into the steering angle of the steering gear, and a steering control parameter table between every two reference points is established in sequence, with the steering gear as the actuator and the PD controller providing the control output;

The step of selecting a reference point further comprises the following sub-steps:

Generate tracking path: Generate tracking path according to the current position of the ship and the target path;

Get reference points: Set the reference point selection interval based on the control precision requirement on the generated optimal tracking path, and select a set of rudder angle control reference points starting from the starting point of the tracking path;

The step of generating a tracking path further comprises the following sub-steps:

Select target point: set the thresholds of direction difference and curvature difference, sample a set of candidate target points from the target path, traverse the candidate target points, filter out the target points whose direction difference and curvature difference are within the threshold range, and select the point closest to the current position as the final target point;

Generate initial tracking path: According to the current position of the ship and the selected target point, the initial tracking path of the ship is randomly generated using B-spline control points;

Search control points: Encode the control points of the B-spline curve and search for the best control point configuration through the evolution process of the genetic algorithm to achieve the optimal tracking path planning goal;

The generating of the initial tracking path further comprises the following sub-steps:

Assume that the time domain is [t_s, t_f] and the B-spline curve trajectory with n control points is:

$$Z(t) = \sum_{i=1}^{n} P_i N_{i,k}(t)$$

Where Z(t) is the spline curve trajectory, t ∈ [t_s, t_f]; P_i are the control points of the B-spline curve; N_{i,k}(t) is the i-th k-order B-spline basis function, which is given by the de Boor-Cox recursion formula, expressed as:

$$N_{i,k}(t) = U_1 N_{i,k-1}(t) + U_2 N_{i+1,k-1}(t)$$

Where U₁ and U₂ are the coefficients of the basis functions N_{i,k-1}(t) and N_{i+1,k-1}(t);

By introducing the B-spline curve into the ship path representation, a set of control point coordinates can be expressed as:

$$P = \{(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)\}$$

The path expression generated by this group of control points is:

$$X(t) = \sum_{i=1}^{n} x_i N_{i,k}(t), \quad Y(t) = \sum_{i=1}^{n} y_i N_{i,k}(t)$$

Where X(t) represents the position change of the point on the path along the x-axis, and Y(t) represents the position change of the point on the path along the y-axis.

2. The intelligent steering system based on machine learning according to claim 1, characterized in that: the searching control point further comprises the following sub-steps:

Encode the control points and solve for the control point coordinates:

;

In the formula, the base coordinates are those of the equally divided points of the line sf connecting the starting point s and the end point f of the path, d_sf is the distance between the starting point s and the end point f, rand is a random number with rand ∈ [-0.5, 0.5], and k_sf is the slope of the line sf;

Establishing fitness function: Setting constraints according to actual requirements in the error correction process to establish the fitness function;

Genetic iterative optimization: Minimize the fitness function through iterative optimization and obtain the optimal path.

3. The intelligent steering system based on machine learning according to claim 2, characterized in that: the step of establishing a fitness function further comprises the following sub-steps:

Define path distance: Calculate the path distance J_d based on the position of the points on the tracking path:

$$J_d = \sum_{i=1}^{h-1} \lVert z_{i+1} - z_i \rVert$$

In the formula, z_i represents the i-th discrete point on path z, and h is the number of discrete points;

Define path smoothness: Calculate the path smoothness from the curvature:

;

In the formula, the direction angle of the i-th point on the path is calculated from the relative position of the i-th point and the (i+1)-th point, and the curvature change is taken between three adjacent path points;

Define path direction: Define the path direction fitting objective function, which is used to measure the consistency between the direction of the path and the direction of the target path:

;

In the formula, the two angles compared are the direction angle of the ship at the i-th point on the tracking path and the direction angle of the i-th point on the target path;

Establish a fitness function: assign weights to the established indicators and perform weighted summation to obtain the fitness function:

;

In the formula, the three weight coefficients are used to adjust the importance of each indicator.

4. The intelligent steering system based on machine learning according to claim 3, characterized in that: the genetic iterative optimization further comprises the following sub-steps:

Randomly generate control point coordinates, determine the crossover probability, mutation probability and number of iterations, and use the B-spline interpolation algorithm to generate the tracking path based on the coordinates of the starting and target points.

Calculate the fitness function value of each path and record it as the fitness of each individual; perform roulette selection operation according to the fitness value of each individual to obtain the offspring, and the individual with the highest fitness among the individuals does not participate in the selection and directly enters the offspring;

Perform crossover and mutation operations to obtain n + 1 generation individuals; calculate the fitness of n + 1 generation individuals; determine whether n is greater than the set number of iterations, if it is greater than the maximum number of iterations, proceed to the next step, otherwise return to the above operation; based on the obtained optimal individual, perform B-spline interpolation to obtain the optimal tracking path.

5. The intelligent steering system based on machine learning according to claim 1, characterized in that: the step of obtaining the rudder angle parameter further comprises the following sub-steps:

Obtain tracking deviation: Based on the selected reference point as the setting parameter and the coordinate point feedback from the GPS and inertial integrated navigation system as the detection parameter, obtain the tracking deviation between the setting parameter and the detection parameter;

Tracking deviation correction: By collecting ship operating parameters, a machine learning model is used to correct the operating interference of the tracking deviation and compensate for the tracking deviation.

6. The intelligent steering system based on machine learning according to claim 5, characterized in that: the step of obtaining the tracking deviation further comprises the following sub-steps:

Use GPS and inertial combined navigation system to obtain real-time feedback coordinates, time-align the data, and convert the geographic coordinates provided by GPS into a coordinate system consistent with the inertial navigation system;

The Kalman filter algorithm is used to fuse the GPS and inertial combined navigation system data to obtain the optimal detection parameters, and the direction deviation between the set parameters and the detection parameters is calculated as the tracking deviation.

7. The intelligent steering system based on machine learning according to claim 6, characterized in that: the tracking deviation correction further comprises the following sub-steps:

Establishing a machine learning model: By collecting historical navigation data to establish a data set, constructing a machine learning model for training, and integrating the model into the ship's steering gear control system;

Collect ship operating data: Set up corresponding sensors based on the prediction features selected by model training, collect operating data in real time, and send the collected data to the steering gear system;

Correct tracking deviation: The real-time operating data acquired is input into the machine learning model to predict the output rudder angle compensation deviation.

8. The intelligent steering system based on machine learning according to claim 7, characterized in that: the step of establishing a machine learning model further comprises the following sub-steps:

Establishing data set: Establishing data set through historical navigation data. The data types include current position, target position, rudder angle, speed, heading, wind speed, wind direction, and ocean current information.

Clean the collected raw data, extract the features for prediction from the raw data by using the correlation coefficient method, and combine the features by using polynomial feature enhancement;

Construct BP neural network model: set the number of nodes in the input layer according to the selected prediction features, set the output layer to compensate for the deviation, set the number of layers and the number of nodes in each layer of the hidden layer according to the empirical formula, and select the activation function to complete the model construction;

Model training: The collected data is allocated as training set, validation set and test set in the ratio of 70%, 15% and 15% respectively; the data is passed to the hidden layer through the input layer, and then reaches the output layer through the activation function to calculate the output prediction value; the gradient of the loss function is calculated, and the error is back-propagated to each layer to adjust the weight and bias of each neuron; the weight of the model is updated through multiple iterations until the model converges or the set stop condition is reached;

The performance of the model is evaluated using validation set data and test set data.

9. The intelligent steering system based on machine learning according to claim 1, characterized in that: the target path tracking further comprises the following sub-steps:

Establish the control output of the PD controller: Combine the proportional and differential control outputs to get the final control output u(t):

$$u(t) = K_p e(t) + K_d \frac{de(t)}{dt}$$

Where e(t) is the tracking error at time t, K_p is the proportional gain, and K_d is the differential gain;

Obtain the transfer function of the PD controller: In the continuous time domain, the transfer function of the PD controller is expressed as the relationship from error to control output. In the Laplace transform domain, the transfer function G_PD(s) of the PD controller is expressed as:

$$G_{PD}(s) = K_p + K_d s$$

Where s is the complex frequency variable in the Laplace transform;

According to the output formula of the PD controller, the control signal is calculated online, the control signal is limited, and the control signal calculated by the PD algorithm is converted into a control instruction of the servo, which drives the servo to perform the corresponding angle rotation;

For the parameters of the PD controller, a deep deterministic policy gradient algorithm is used for optimization:

Based on the reward function, the learning parameters of the Actor-Critic network are updated, and the effect of the control system under the current PD parameters is evaluated by the reward function. The input value of the reward function is the tracking deviation obtained after the control system is running and the output result of the PD algorithm. After the reward function is calculated, its output value is compared with the value estimate output by the Critic network through a comparator, and the obtained error value is sent to the Actor-Critic network to update the learning parameters of the Actor-Critic network.

The PD parameters are trained based on the Actor-Critic network: the Actor adjusts the PD parameters according to the deviation input to the control system, and the Critic estimates the value of the adjustment from the adjusted PD parameters and the deviation obtained when the control system runs with them. When the training results meet the set requirements, the training is stopped and the obtained PD parameters are output as the optimal values.