Disclosure of Invention
In order to solve the problems in the prior art, the invention provides a traffic flow self-adaptive prediction method based on a multi-time step diagram. The technical problems to be solved by the invention are realized by the following technical scheme:
the invention provides a traffic flow self-adaptive prediction method based on a multi-time step diagram, which comprises the following steps:
Acquiring traffic sensor data, and respectively constructing a distance graph, a time-varying side weight topological graph and a multi-time step graph to respectively extract time features and space features;
The distance map, the time-varying side weight topological map and the multi-time step map are fused to obtain an adaptive topological map, wherein the adaptive topological map is used as an encoder to encode the extracted time features and the spatial features;
Performing space-time embedded modeling on the coded time features and the spatial features to obtain comprehensive space-time features;
Constructing a decoder, decoding the comprehensive space-time characteristics through the decoder to obtain a preliminary prediction result, and weighting and fusing the characteristic information of the encoder and the decoder by adopting a distraction mechanism to establish a mapping relation of the space-time characteristics between the encoder and the decoder;
The optimal prediction time of each node in the self-adaptive topological graph is self-adaptively determined through a reinforcement learning control module, a prediction motion vector is generated, and a prediction result is obtained according to the prediction motion vector and the preliminary prediction result;
And combining the performance loss function, the controller loss function and the early loss function to obtain a weighted total loss function, optimizing a training model, and outputting an accurate prediction result of traffic flow.
In one embodiment of the present invention, the obtaining traffic sensor data, respectively constructing a distance graph, a time-varying side-weight topology graph and a multi-time-step graph, to extract temporal features and spatial features, respectively, includes:
Acquiring traffic sensor data, wherein the traffic sensor data comprises time information and space information;
Constructing a graph structure, wherein the graph structure comprises a plurality of nodes;
quantizing the spatial relationship between the nodes by adopting a distance matrix to generate a distance graph;
introducing self-loop connection into each node, integrating node information and adjacent node information thereof, wherein the expression is as follows:
Wherein, the The method is based on the output characteristics of the nodes after convolution fusion calculation processing of the distance graph, wherein W d is a weight matrix capable of learning, and H in is the input node characteristics; a degree matrix which is a diagonal matrix calculated according to the distance matrix, wherein the elements on the diagonal line are the degree matrix Is the degree of the node i, namely the sum of the weights of the edges directly connected with the node i, byCarrying out normalization processing to balance the influence of the degree difference of different nodes on feature propagation; Beta 1 and beta 2 are super parameters which respectively represent the importance of the characteristics of the node itself and the characteristics of the adjacent nodes in the output characteristics;
constructing a time-varying side weight topological graph, wherein the expression is as follows:
Wherein, the Is a time-varying side weight topological graph; the ReLU function is a function for ensuring that the elements of the output matrix are non-negative values, the tanh function is a function for carrying out standardization processing on the matrix element values, alpha is a super parameter for adjusting the saturation of an activation function, X se is a dynamic embedded vector of a source node, and X te is a dynamic embedded vector of a target node; transpose of the dynamic embedded vector for the target node;
Generating a multi-time step diagram by using a dynamic time-warping method to calculate similarity between time sequences, wherein the expression is as follows:
Wherein, the The method is characterized by comprising the steps of calculating a distance for a multi-time step chart, calculating a distance for a similarity measure between traditional time sequences by using DTW, and controlling the attenuation rate of a dynamic time warping distance by using a parameter kappa.
In one embodiment of the present invention, fusing the distance map, the time-varying side weight topology map, and the multi-time step map to obtain an adaptive topology map includes:
and fusing the distance map and the time-varying side weight topological map through a weighted fusion strategy, wherein the expression is as follows:
Wherein, the The distance map and the variable-edge-weight topological map are fused; a degree matrix for the time-varying side weight topology map; The whole weighted fusion process is regulated through super parameter beta 1,β2,β3, beta 1,β2,β3 is super parameter, and the contribution degree of the self-loop connection, the distance graph and the time-varying side weight topological graph in the output characteristic representation is represented respectively; A time-varying side weight topological graph at a time t;
Introducing super parameter beta 4, representing the contribution degree of the multi-time step diagram to the final fusion diagram characteristic representation, and obtaining a self-adaptive topological diagram, wherein the expression is as follows:
wherein H out is an adaptive topological graph, and H t is a hidden state at time t.
In one embodiment of the present invention, performing space-time embedding modeling on the encoded temporal feature and the spatial feature to obtain a comprehensive space-time feature, including:
Performing result conversion on the encoded time features and the encoded space features by utilizing single-heat encoding to generate time embedding and space embedding respectively;
Integrating the time embedding and the space embedding to obtain comprehensive space-time characteristics, wherein the expression is as follows:
Wherein, the Is a comprehensive space-time feature; is space embedded; Is time embedded.
In one embodiment of the present invention, constructing a decoder, decoding the integrated space-time feature by the decoder to obtain a preliminary prediction result, weighting and fusing feature information of the encoder and the decoder by using a attention turning mechanism to establish a mapping relationship of space-time features between the encoder and the decoder, including:
Constructing a decoder, and decoding the comprehensive space-time characteristics through the decoder to obtain a preliminary prediction result;
Modeling the connection between the encoder and the decoder by adopting a distraction mechanism, wherein the expression is as follows:
Wherein ε i,t(t1,t2) is the attention score, e i,t′ is the space-time embedding processed by the ReLU activation function, t 1 and t 2 are both time nodes, W is the weight, b is the deviation parameter, and [ in ], [ in ] is the vector inner product; Is a scaling factor;
the decoder performs weighted summation on the hidden states of the previous time points according to the attention score, and the expression is as follows:
wherein T 2 = t..a.t-1, σ (·) is a normalized exponential function, W h is a weight parameter of the network in the decoder, b h is a bias parameter, ε t is an attention score; Is the hidden state at time t 1.
In one embodiment of the present invention, adaptively determining, by a reinforcement learning control module, an optimal prediction time of each node in the adaptive topology map, generating a predicted motion vector, and obtaining a predicted result according to the predicted motion vector and the preliminary predicted result, including:
the optimal prediction time of each node in the self-adaptive topological graph is self-adaptively determined through a reinforcement learning control module;
optimizing a prediction strategy based on the current state and the current strategy by the reinforcement learning control module, wherein the expression is as follows:
pt=σ(WcSt+bc);
at=π(St);
Wherein p t is the pause probability, W c is the weight parameter of the reinforcement learning control module, sigma is the activation function, b c is the deviation parameter of the reinforcement learning control module, S t is the current state, a t is the predicted motion vector indicating the optimal predicted time, pi is the current strategy;
multiplying the prediction motion vector by the nonlinear transformation result of the hidden state element by element to obtain a prediction result, wherein the expression is as follows:
Wherein, the W p is weight, b p is deviation parameter; the decoder outputs a hidden state for each time step t.
In one embodiment of the present invention, if the predicted motion vector of a node i in all time steps is 0, the optimal predicted time of the node is defaulted to T-1, and a predicted result is obtained
In one embodiment of the present invention, combining the performance loss function, the controller loss function, and the early loss function to obtain a weighted total loss function, optimizing a training model, and outputting an accurate prediction result of traffic flow, including:
calculating the average absolute error between the predicted result and the true value to obtain a performance loss function, wherein the expression is as follows:
Wherein L p is a performance loss function, N is the number of nodes, and X T,i is a true value;
and calculating negative logarithm pause probability to obtain an early loss function, wherein the expression is as follows:
wherein L e is the early loss function; for each node i, calculating by a strategy pi to obtain the optimal prediction time;
Using the prize tensor, maximizing the desired prize to improve the prediction accuracy, and obtaining a controller loss function, the expression of which is:
Wherein L c is a controller loss function, E represents the desire of the bracket interior, b t′ is a base line, S t is a hidden state, r t' is updated rewards, r is an overall rewards;
and obtaining a weighted loss function through a super parameter according to the performance loss function, the early loss function and the controller loss function, wherein the expression is as follows:
L=Lp+ω1Lc+ω2Le;
Wherein L is a weighted loss function, and omega 1 and omega 2 are super parameters;
And optimizing the training model according to the weighted loss function to enable the model training to be converged, and obtaining the accurate prediction result of the traffic flow after the training is completed and outputting.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the traffic flow self-adaptive prediction method based on the multi-time step map when executing the computer program.
The invention also provides a computer readable storage medium, on which a computer program is stored, which when being executed by a processor, implements the traffic flow adaptive prediction method based on multiple time step diagrams.
Compared with the prior art, the invention has the beneficial effects that:
According to the traffic flow self-adaptive prediction method based on the multi-time step diagram, a self-adaptive topological diagram network is obtained by fusing a distance diagram, a time-varying side weight topological diagram and a multi-time step diagram, a attention turning mechanism is introduced, a prediction technology combining the attention turning mechanism and the self-adaptive topological diagram network is designed by utilizing a reinforcement learning control module, comprehensive capture of time-space dynamic characteristics, effective alleviation of prediction errors and self-adaptive determination of optimal prediction time are realized, and therefore, the timeliness and reliability of prediction are remarkably improved while the prediction accuracy is ensured.
The invention inputs the real-time data sample collected by the sensor into the model, the self-adaptive topological graph network is used as an encoder to extract data characteristics, the output of the encoder is used for initializing the decoder, the reinforcement learning control module judges whether the current moment meets the condition of the optimal step length according to the network hiding state output by the decoder, if so, a prediction result is generated and is used as a final prediction result, namely, the number of vehicles passing in each sensor area in a future time period. Through the steps, the method can effectively improve the prediction accuracy and timeliness of traffic flow, and particularly when complex urban traffic data is processed, the method has higher prediction accuracy and flexibility than the traditional method.
The foregoing description is only an overview of the present invention, and is intended to be implemented in accordance with the teachings of the present invention, as well as the preferred embodiments thereof, together with the following detailed description of the invention, given by way of illustration only, together with the accompanying drawings.
Detailed Description
In order to further explain the technical means and effects adopted by the invention to achieve the preset aim, the following describes in detail a traffic flow self-adaptive prediction method based on a multi-time step chart according to the invention with reference to the attached drawings and the specific embodiments.
The foregoing and other features, aspects, and advantages of the present invention will become more apparent from the following detailed description of the preferred embodiments when taken in conjunction with the accompanying drawings. The technical means and effects adopted by the present invention to achieve the intended purpose can be more deeply and specifically understood through the description of the specific embodiments, however, the attached drawings are provided for reference and description only, and are not intended to limit the technical scheme of the present invention.
Example 1
The traditional space-time diagram convolution network is excellent in capturing complex space-time dependency, but generally depends on fixed prediction time or a single static diagram structure, cannot be fully suitable for a dynamically changeable traffic flow mode in an actual scene, particularly lacks adaptive optimization capability for the prediction time, is difficult to flexibly cope with real-time change requirements of different areas, and is very unfavorable for adaptively predicting traffic flow and avoiding congestion accidents. The invention provides a traffic flow self-adaptive prediction method based on a multi-time step map, which aims to solve the problem of insufficient dynamic data capture of a time sequence in the existing traffic flow prediction, in particular to short-time high-frequency fluctuation and long-time trend change.
As shown in fig. 1 and fig. 2, fig. 1 is an overall frame diagram of a traffic flow adaptive prediction method based on a multi-time step diagram according to an embodiment of the present invention, and fig. 2 is a flowchart of a traffic flow adaptive prediction method based on a multi-time step diagram according to an embodiment of the present invention.
In this embodiment, the traffic flow adaptive prediction method based on the multiple time step graphs includes the steps of:
And step 1, acquiring traffic sensor data, and respectively constructing a distance graph, a time-varying side weight topological graph and a multi-time step graph to respectively extract time characteristics and space characteristics.
In an alternative embodiment, step 1 comprises:
step 1.1, acquiring traffic sensor data, wherein the traffic sensor data comprises time information and space information;
and 1.2, constructing a graph structure which comprises a plurality of nodes.
Specifically, the graph structure is definedThe graph comprises N nodes V i epsilon V, wherein V is a vertex set and E is an edge set, and the relation between the nodes (namely the edge (V i,vj) epsilon E in the graph structure) is determined according to the relevance between the nodes. In order to quantify the connection strength between these nodes, the edge data is represented by a matrix A εR N×N, R being a real set, meaning that each element in the matrix is real, and each node information at time t is recorded by node data X t∈RN.
And 1.3, quantifying the spatial relationship among the nodes by adopting a distance matrix to generate a distance graph.
Specifically, a distance matrix is adoptedQuantifying the spatial relationship between nodes, and the expression of the distance matrix is:
Where d ij is the distance between nodes v i,vj, each element Representing the correlation between node i and node j based on distance computation, exp is an exponential function, for undirected graph,The method is a symmetric matrix, eta is an attenuation parameter used for adjusting the range of distance influence, and through an attenuation function, the correlation of space dimension among nodes can be considered as long as the geographic distance is within a certain range even though the nodes are relatively far away.
Step 1.4, introducing self-loop connection into each node, integrating node information and adjacent node information thereof, wherein the expression is as follows:
Wherein, the The method is based on the output characteristics of the nodes after convolution fusion calculation processing of the distance graph, wherein W d is a weight matrix capable of learning, and H in is the input node characteristics; a degree matrix which is a diagonal matrix calculated according to the distance matrix, wherein the elements on the diagonal line are the degree matrix Is the degree of the node i, namely the sum of the weights of the edges directly connected with the node i, byCarrying out normalization processing to balance the influence of the degree difference of different nodes on feature propagation; And beta 1 and beta 2 are super parameters, and respectively represent the importance of the characteristics of the node itself and the characteristics of the adjacent nodes in the output characteristics.
Specifically, the self-loop connection is introduced into each node, the characteristic information of the node is maintained, the influence of adjacent nodes is considered while the characteristic is updated, and therefore the information of the node and the information of adjacent nodes are integrated.
Step 1.5, constructing a time-varying side weight topological graph, wherein the expression is as follows:
Wherein, the The matrix element values are mapped between-1 and 1 to control the output intensity of an activation function, alpha is a super parameter used for adjusting the saturation of the activation function to control the sensitivity of a model to the connection intensity between nodes in a graph structure, X se is a dynamic embedded vector of a source node, and X te is a dynamic embedded vector of a target node; is a transpose of the dynamic embedded vector of the target node.
Specifically, at time t, the hidden state H t-1 and the current node value X t are spliced, the integrated characteristic representation obtained after splicing is used as input, after the integrated characteristic representation is sent into a graph convolution neural network, the output of the neural network and the learnable distance graph node embedding vector subjected to random initialization are subjected to bit multiplication operation, and dynamic node embedding with the dimension of N is generated. The time-varying edge topology node embedding operation has X se for the source node and X te for the target node. Then obtaining the adjacency matrix of the time-varying side weight topological graph through the similarity between node embedding of the time-varying side weight topological graph
Step 1.6, generating a multi-time step diagram by using a dynamic time regularization method to calculate similarity between time sequences, wherein the expression is as follows:
Wherein, the The method is characterized by comprising the steps of calculating a distance for a multi-time step chart, calculating a distance for a similarity measure between traditional time sequences by using DTW, and controlling the attenuation rate of a dynamic time warping distance by using a parameter kappa.
Specifically, for each time step, a multi-time step similarity matrix is created, which is expressed at the current time step asThe matrix (i.e., a multi-time-step graph) is used to calculate the similarity between each node from the time beginning to the current time step t, and for each element (i, j) in the matrix, calculate the similarity between nodes v i and v j at each time step t '(0. Ltoreq.t'. Ltoreq.t).
It will be appreciated that Dynamic Time Warping (DTW) is a method of measuring the similarity of two time series, and is particularly well suited to deal with differences due to time offsets or long dyssynchrony. The basic principle is to evaluate the similarity of two time series by non-linearly aligning them and finding the best matching path between them. Dynamic time warping allows the time series to be non-linearly deformed on the time axis, stretching or compressing certain parts so that the two time series can be better aligned. The distances between each time point are calculated recursively and accumulated to finally find a shortest matching path. The accumulated distance corresponding to the path reflects the similarity between the two sequences, and a smaller distance indicates a higher similarity.
And 2, fusing the distance map, the time-varying side weight topological map and the multi-time step map to obtain an adaptive topological map, wherein the adaptive topological map is used as an encoder to encode the extracted time features and the extracted space features.
In an alternative embodiment, step 2 comprises:
Step 2.1, fusing the distance map and the time-varying side weight topological map through a weighted fusion strategy, wherein the fusion process has the expression:
Wherein, the The distance map and the variable-edge-weight topological map are fused; a degree matrix for the time-varying side weight topology map; The whole weighted fusion process is regulated through super parameter beta 1,β2,β3, beta 1,β2,β3 is super parameter, and the contribution degree of the self-loop connection, the distance graph and the time-varying side weight topological graph in the output characteristic representation is represented respectively; is a time-varying side weight topology graph at time t.
Step 2.2, introducing super parameter beta 4, representing the contribution degree of the multi-time step diagram to the final fusion diagram characteristic representation, and obtaining a self-adaptive topological diagram, wherein the expression is as follows:
wherein H out is an adaptive topological graph, and H t is a hidden state at time t.
So far, the modeling of the self-adaptive topological graph for representing the final characteristic representation after the multi-graph fusion operation is completed, which can be simply described as:
Further, the matrix multiplication operation in the conventional gate-controlled loop unit (GRU) can be replaced by the multi-graph fusion calculation operation in the above steps, and the calculation formula is as follows:
Wherein, the AndRepresenting multiple graph fusion operations applied in different gating and state updating processes, respectively, specifically, changing matrix operation items of 'following new gate', 'resetting gate' and 'current hidden state' in a traditional gating circulation unit into the three graph fusion operation itemsAnd
The method comprises the steps of defining a basic graph structure comprising node sets and edge sets, using a matrix to represent weights of edges, recording space relations among nodes, generating a distance graph adjacent matrix based on an exponential decay function of node geographic distances to obtain a constructed distance graph structure quantifying space relations among nodes, introducing self-loop connection to the graph structure to reserve self characteristics, integrating information of the nodes and adjacent nodes through graph fusion convolution calculation operation, and obtaining preliminary space-time characteristic representation of the nodes.
The method comprises the steps of establishing a preliminary fusion graph of a distance graph and a time-varying side weight topological graph, combining a hidden state at the previous moment with a current node value at each time step, inputting a graph convolution neural network, embedding and calculating an adjacent matrix of the time-varying side weight topological graph based on the time-varying side weight topological graph nodes, normalizing similarity among nodes by using an activation function, adjusting sensitivity of a model to connection strength in the graph, combining characteristic information of the distance graph and the time-varying side weight topological graph, and generating a preliminary fusion graph representation through a weighted fusion strategy, wherein weight is adjusted by super parameters.
And 3, performing space-time embedded modeling on the coded time features and the spatial features to obtain comprehensive space-time features.
In an alternative embodiment, step 3 comprises:
Step 3.1, performing result conversion on the encoded time characteristics and space characteristics by utilizing single-heat encoding to generate time embedding and space embedding respectively;
and 3.2, integrating time embedding and space embedding to obtain comprehensive space-time characteristics, wherein the expression is as follows:
Wherein, the To integrate the space-time characteristics, a result tensor is embedded in real time space; is space embedded; Is time embedded.
Specifically, by using one-hot encoding (one-hot), the encoding result is converted through a two-layer fully connected neural network to generate an embedded result vector in the time dimensionWherein D represents the embedded dimension. The spatial data of node v i at time t is processed through a similar neural network to produce an embedded result vector in the spatial dimensionIntegrating the embedded vector in the time dimension with the embedded vector in the space dimension to obtain a space-time embedded result tensor after integrating the two dimensions
And 4, constructing a decoder, decoding the comprehensive space-time characteristics through the decoder to obtain a preliminary prediction result, and weighting and fusing the characteristic information of the encoder and the decoder by adopting a turning attention mechanism so as to establish a mapping relation of the space-time characteristics between the encoder and the decoder.
In an alternative embodiment, step 4 includes:
Step 4.1, constructing a decoder, and decoding the comprehensive space-time characteristics through the decoder to obtain a preliminary prediction result;
Modeling the connection between the encoder and the decoder by adopting a distraction mechanism, wherein the expression is as follows:
Wherein ε i,t(t1,t2) is the attention score, e i,t′ is the space-time embedding processed by the ReLU activation function, t 1 and t 2 are both time nodes, W is the weight, b is the deviation parameter, and [ in ], [ in ] is the vector inner product; For scaling factors, by scaling factors The effect of the increase in dimension is controlled to prevent the model from entering the low gradient region.
In particular, a transition-Attention mechanism (transition) is used to model the link between the encoder (recorded values) and the decoder (predicted results). For node i at time t, a similarity score between times t 1 and t 2 is calculated according to the turn attention mechanism to model the feature internal relationship.
Step 4.3, the decoder performs weighted summation on the hidden states of the previous time points according to the attention score, and the expression is as follows:
Wherein, T 2 = t..a., T-1; σ (·) is a normalized exponential function, and the attention score is normalized by the normalized exponential function to ensure that the score sum is 1;W h as a weight parameter of the network in the decoder, b h as a bias parameter, ε t as an attention score; Is the hidden state at time t 1.
The method is based on the principle that space-time embedding modeling is firstly carried out on nodes based on time and space data to generate space-time tensors, wherein a transfer meaning mechanism (transducer-attention) is used for establishing a mapping relation of space-time characteristics between an encoder and a decoder, similarity scores between time steps are calculated, weighted summation is carried out on hidden states of the decoder, and finally weighted prediction information in two dimensions of space and time is generated.
And 5, adaptively determining the optimal prediction time of each node in the adaptive topological graph through the reinforcement learning control module, generating a prediction motion vector, and obtaining a prediction result according to the prediction motion vector and the preliminary prediction result.
In an alternative embodiment, step 5 comprises:
Step 5.1, adaptively determining the optimal prediction time of each node in the adaptive topological graph through the reinforcement learning control module, wherein the optimal prediction time can be expressed as
Specifically, the optimal prediction time is an earlier prediction time which can give consideration to both the accuracy and timeliness of prediction, so that not only can the preparation for treatment be made in advance, but also the high accuracy of the prediction result can be ensured.
And 5.2, optimizing a prediction strategy based on the current state and the current strategy by a reinforcement learning control module, wherein the expression is as follows:
pt=σ(WcSt+bc) (12);
at=π(St) (13);
Wherein p t is the pause probability, W c is the weight parameter of the reinforcement learning control module, sigma is the activation function, b c is the deviation parameter of the reinforcement learning control module, S t is the current state, S t∈RN×(H+1);at is the predicted motion vector indicating the optimal predicted time, and pi is the current strategy.
Further, the epsilon greedy strategy can be applied to continue optimizing the prediction strategy after the step 5.2, and the expression is as follows:
In particular, the greedy strategy follows the greedy strategy (i.e., the current optimal action is selected) in most cases, but a random action is selected with a small probability ε to ensure a certain exploration. The epsilon greedy strategy can avoid sinking into local optimum, and gradually explore larger state space, so that a better strategy is found.
And 5.3, multiplying the prediction motion vector and the nonlinear transformation result of the hidden state element by element to obtain a prediction result, wherein the expression is as follows:
Wherein, the W p is weight, b p is deviation parameter; Generating prediction results by a linear rectifying unit (ReLU activation function) for processing decoder output hidden states for each time step t
Further, if the predicted motion vector of a node i in all time steps is 0, the optimal predicted time of the node i is defaulted to be T-1, and a predicted result is obtainedNamely, when the maximum time window is reached, if no action is triggered, the time T-1 is taken as the optimal prediction time to obtain a prediction result
The method is based on the principle that a reinforcement learning control module calculates an action vector of the optimal prediction moment based on the current state and a strategy, an epsilon greedy strategy is applied to balance between exploration and utilization, prediction moment is determined, and a prediction value of each time step is calculated according to the decision of the reinforcement learning control module and the output of a decoder, so that the optimization of the prediction moment is realized.
And 6, combining the performance loss function, the controller loss function and the early loss function to obtain a weighted total loss function, optimizing a training model, and outputting an accurate prediction result of the traffic flow.
In an alternative embodiment, step 6 includes:
And 6.1, calculating the average absolute error between the predicted result and the true value to obtain a performance loss function, wherein the expression is as follows:
Wherein L p is a performance loss function, N is the number of nodes; X T,i is a true value;
And 6.2, calculating negative logarithm pause probability to obtain an early loss function, wherein the expression is as follows:
wherein L e is the early loss function; for each node i, calculating by a strategy pi to obtain the optimal prediction time;
And 6.3, maximizing expected rewards by using the rewards tensor to improve prediction accuracy, and obtaining a controller loss function, wherein the expression is as follows:
Wherein L c is a controller loss function, E represents the desire of the bracket interior, b t′ is a base line, the base line b t′ is obtained by minimizing the mean square error learning between the base line b t′ and the integral rewards r depending on the hidden state S t of the controller, S t is a hidden state, r t' is an updated rewards, r is an integral rewards;
And 6.4, obtaining a weighted loss function through the super parameters according to the performance loss function, the early loss function and the controller loss function, wherein the expression is as follows:
L=Lp+ω1Lc+ω2Le (21);
Wherein L is a weighted loss function, and omega 1 and omega 2 are super parameters;
and 6.5, optimizing the training model according to the weighted loss function to enable the model training to converge, and obtaining the accurate prediction result of the traffic flow after the training is completed.
Specifically, combining the performance, timeliness and controller loss, and calculating to obtain a weighted loss function L through super parameters omega 1 and omega 2:
The expression of the mean absolute error MAE is:
The expression of the root mean square error RMSE is:
The expression for the mean absolute percentage error MAPE is:
Specifically, training and optimizing the model by considering the prediction precision, the prediction timeliness and the comprehensive loss function formed by the rewards multiple aspects generated by the control module until the model converges to obtain a prediction output result.
The method is based on the principle that model performance loss, early prediction loss and controller loss are modeled, the three loss functions are integrated into a total loss function in a weighting mode, wherein the weight is controlled by super parameters, then the super parameters such as initial learning rate, batch size and node embedding dimension are set through an adaptive moment estimation optimizer (Adam) training model, finally verification set errors are monitored by adopting an early stopping mechanism until the model converges, and a model capable of carrying out traffic flow adaptive prediction output in a complex traffic scene is obtained, so that model training and optimization are realized.
It can be understood that by adopting the traffic flow self-adaptive prediction method based on the multi-time step diagram, other model training methods, such as gradient descent optimization training, small batch training, RMSProp self-adaptive optimization training and the like, can be used after the prediction model is obtained, so that the training optimization process of the model is not limited in the invention.
For example, the initial learning rate is configured to be 0.001, the convolution layer depth of the multi-map fusion is 1, the hidden state H is set to 32, the node embedding dimension D is set to 40, the batch size is set to 32, the maximum observation window length T is selected to be 12, the super-parameters β 1,β2,β3,β4 are set to 0.05,0.95,0.95 and 0.95, respectively, and the loss function weight super-parameter ω 1,ω2 is set to 1 and between 0 and 0.1, respectively. The training process adopts an advanced stopping mechanism, an adaptive moment estimation optimizer (Adam) is selected as a model training tool, and training is performed on an Injettia GeForce RTX 3070Ti high-performance computing platform. And when the model training converges, obtaining a model with the final training completed, and outputting a traffic flow prediction result.
The data adopted by the simulation experiment of the invention is a public transportation speed data set collected by a los Angeles highway annular detector, and the data collected by 207 selected sensors is included, and the sampling rate is 5 minutes. The time span is from 3 months in 2012 to 1 month in 2012 to 30 months in 6 years, providing insight into the traffic patterns and congestion conditions in los Angeles areas, and being an ideal choice for evaluating the model in terms of traffic flow prediction accuracy. The results are shown in table 1 by comparing the frame of the present invention with the following baselines.
TABLE 1 Performance indicators of different models on datasets
Table 1 shows the prediction based on the present invention and other comparative baselines on the public transportation speed dataset collected by the los angeles circular detector, i.e. the comparison result of three indexes of average absolute error, root mean square error and average absolute percentage error at the same average optimal prediction time. As can be seen from the prediction results of the table 1, the traffic flow self-adaptive prediction method based on the multi-time step diagram can accurately realize the traffic flow prediction of the public traffic speed data set acquired by the los Angeles expressway annular detector, so that the optimal performance with statistical significance can be obtained in all evaluation indexes by adopting the traffic flow self-adaptive prediction model based on the multi-time step diagram.
The traffic flow self-adaptive prediction method based on the multi-time step diagram also has the following advantages:
(1) Dynamic space-time feature capturing capability the model of the invention realizes the effective modeling of space-time features by combining a distance graph and a time-varying side weight topological graph through a multi-graph fusion convolution calculation module. Compared with other prior art, the time-varying side weight topological graph has the advantages that the relationship among nodes is updated in real time, so that complex time-space association in traffic flow data is more easily captured compared with a single graph structure, and the time-varying side weight topological graph is particularly suitable for scenes with rapid traffic flow changes.
(2) The self-adaptive prediction time is determined, namely, the model can self-adaptively determine the optimal prediction time for each node through the reinforcement learning control module. Compared with the traditional method of fixed prediction time, the adaptive strategy is adopted to effectively reduce the prediction error caused by data delay, and the timeliness of prediction is superior to that of the prior art.
(3) The error propagation suppressing mechanism establishes strong correlation between the encoder and the decoder by using a distraction mechanism, and effectively relieves the problem of error accumulation in long-time sequence prediction by weighting and calculating important characteristics, so that compared with the technology in the field, the method and the device can fully guarantee and enhance the stability and the accuracy of prediction.
(4) And the multi-graph depth fusion is realized by a multi-level fusion mechanism of a distance graph, a time-varying side weight topological graph and a multi-time step graph, so that the model can be flexibly adapted to the characteristics of different time nodes and space positions. The method and the device can not be realized by a single graph structure, and the self-adaptive topological graph after multi-graph fusion can realize good balance between global structural stability and local dynamic change by self-adaptively adjusting the weights of the distance graph and the time-varying side weight topological graph through the super-parameters, so that hidden characteristic information between multiple nodes and multiple time step graphs can be more fully utilized, which is neglected in the prior art.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the traffic flow self-adaptive prediction method based on the multi-time step map when executing the computer program.
The invention also provides a computer readable storage medium having stored thereon a computer program which when executed by a processor implements a traffic flow adaptive prediction method based on a multi-time step map.
According to the traffic flow self-adaptive prediction method based on the multi-time step diagram, a self-adaptive topological diagram network is obtained by fusing a distance diagram, a time-varying side weight topological diagram and a multi-time step diagram, a attention turning mechanism is introduced, a prediction technology combining the attention turning mechanism and the self-adaptive topological diagram network is designed by utilizing a reinforcement learning control module, comprehensive capture of time-space dynamic characteristics, effective alleviation of prediction errors and self-adaptive determination of optimal prediction time are realized, and therefore, the timeliness and reliability of prediction are remarkably improved while the prediction accuracy is ensured.
The invention inputs the real-time data sample collected by the sensor into the model, the self-adaptive topological graph network is used as an encoder to extract data characteristics, the output of the encoder is used for initializing the decoder, the reinforcement learning control module judges whether the current moment meets the condition of the optimal step length according to the network hiding state output by the decoder, if so, a prediction result is generated and is used as a final prediction result, namely, the number of vehicles passing in each sensor area in a future time period. Through the steps, the method can effectively improve the prediction accuracy and timeliness of traffic flow, and particularly when complex urban traffic data is processed, the method has higher prediction accuracy and flexibility than the traditional method.
It should be noted that in this document relational terms such as first and second, and the like are used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Moreover, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that an article or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed. Without further limitation, an element defined by the phrase "comprising one does not exclude the presence of other like elements in an article or apparatus that comprises the element. The terms "connected" or "connected," and the like, are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. The orientation or positional relationship indicated by "upper", "lower", "left", "right", etc. is based on the orientation or positional relationship shown in the drawings, and is merely for convenience of description and to simplify the description, and is not indicative or implying that the apparatus or elements referred to must have a specific orientation, be constructed and operated in a specific orientation, and therefore should not be construed as limiting the invention.
The foregoing is a further detailed description of the invention in connection with the preferred embodiments, and it is not intended that the invention be limited to the specific embodiments described. It will be apparent to those skilled in the art that several simple deductions or substitutions may be made without departing from the spirit of the invention, and these should be considered to be within the scope of the invention.