CN112700072B

CN112700072B - Traffic condition prediction method, electronic device, and storage medium

Info

Publication number: CN112700072B
Application number: CN202110312654.8A
Authority: CN
Inventors: 周弘懿; 郭庆锋
Original assignee: Tongdun Technology Co ltd; Tongdun Holdings Co Ltd
Current assignee: Tongdun Technology Co ltd; Tongdun Holdings Co Ltd
Priority date: 2021-03-24
Filing date: 2021-03-24
Publication date: 2021-06-29
Anticipated expiration: 2041-03-24
Also published as: CN112700072A

Abstract

The application relates to a traffic condition prediction method, an electronic device, and a storage medium, belonging to the field of artificial intelligence. The traffic condition prediction method comprises the following steps: acquiring time-series traffic data collected by an ETC portal system, and preprocessing the time-series traffic data to obtain time-series traffic features; writing the time-series traffic features into a big data platform through Spark Streaming; reading the time-series traffic features with a Spark Jar on the big data platform, and training a flow model and a speed model with a dilated causal convolutional neural network algorithm; and predicting traffic flow and vehicle speed based on the trained flow model and speed model. According to the embodiments of the application, the collected vehicle data are comprehensive, which can improve the accuracy of traffic condition prediction; distributed data processing improves the timeliness of the data and allows large volumes of data to be processed quickly and efficiently; and the dilated causal convolutional neural network algorithm saves resources and speeds up training.

Description

Traffic condition prediction method, electronic device, and storage medium

Technical Field

The present application relates to the field of artificial intelligence, and more particularly, to a traffic condition prediction method, an electronic device, and a storage medium.

Background

Analysis of the construction and operation of intelligent highways across the provinces shows that several problems persist: demand is growing rapidly, service capacity lags behind, effective means are lacking, and data are inaccurate. At present, vehicle-road cooperation technology is still at the research stage and is very difficult to popularize, so effective means of information service are lacking.

For learning models of traffic conditions, the method currently adopted is to train the model on data from mobile phone Apps, but this scheme has serious limitations: vehicle driving data can be collected only when a large number of mobile phone Apps are running, and data from vehicles on which no App is running cannot be collected. The resulting loss of a large amount of data leaves the model poorly fitted, so the trained model deviates substantially from real traffic conditions and the accuracy of the traffic condition prediction is low.

Moreover, as the data volume of terminals grows rapidly, the real-time requirements on the data become higher, and the traditional storage and analysis approach based on relational databases is inevitably challenged and cannot meet the service requirements under a new tolling mode.

Therefore, how to improve the real-time performance of traffic data and the accuracy of traffic prediction becomes a problem to be solved urgently by those in the art.

Disclosure of Invention

The embodiment of the application provides a traffic condition prediction method, electronic equipment and a storage medium, so as to at least solve the problem of how to improve the real-time performance of traffic data and the accuracy of traffic prediction in the related art.

In a first aspect, an embodiment of the present application provides a traffic condition prediction method, including: acquiring time-series traffic data collected by an ETC portal system, and preprocessing the time-series traffic data to obtain time-series traffic features; writing the time-series traffic features into a big data platform through Spark Streaming; reading the time-series traffic features with a Spark Jar on the big data platform, and training a flow model and a speed model with a dilated causal convolutional neural network algorithm; and predicting traffic flow and vehicle speed based on the trained flow model and speed model.

In some of these embodiments, the data preprocessing includes at least one of data denoising, feature dimension reduction, missing value filling, data normalization, and feature selection.

In some of these embodiments, the feature dimension reduction comprises: reducing the dimension of the time-series traffic features by principal component analysis to obtain key features; and/or the missing value filling comprises: filling the missing values in the time-series traffic features by cubic spline interpolation.

In some embodiments, training the flow model and the speed model with the dilated causal convolutional neural network algorithm includes: splitting the time-series traffic features into a training data set, a validation data set, and a test data set; training the flow model and the speed model on the training data set by k-fold cross validation, and adjusting the hyper-parameters of each model based on the validation data set; and evaluating the accuracy of each model using the test data set and a preset evaluation function.

In some embodiments, the predicting the vehicle flow and the vehicle speed based on the trained flow model and speed model includes: acquiring real-time characteristics and off-line characteristics, wherein the real-time characteristics are obtained by reading time sequence traffic data acquired by an ETC portal system in real time through Flink and processing the data in an ETL (Extract-Transform-Load) mode, the off-line characteristics are obtained by calculating according to the real-time characteristics acquired in historical time, and the off-line characteristics are stored in a cache; splicing the real-time features and the off-line features to obtain spliced features; and substituting the splicing characteristics into a trained flow model and a trained speed model to predict the traffic flow and the speed.

In some of these embodiments, after training the flow model and the speed model with the dilated causal convolutional neural network algorithm, the method includes: predicting the traffic flow and the vehicle speed from the user's current position to the target position on the user's planned route by using the trained flow model and speed model; and calculating a first elapsed time for the planned route based on the predicted traffic flow and vehicle speed.

In some of these embodiments, after calculating the first elapsed time for the planned route based on the predicted traffic flow and vehicle speed, the method includes: when a congestion event is identified on the planned route by an event identification algorithm, calculating second elapsed times of all possible nearby travel routes according to the current position and the target position; and if a second elapsed time is less than the first elapsed time, recommending the route corresponding to that second elapsed time to the user terminal.

In some of these embodiments, calculating the first elapsed time for the planned route based on the predicted traffic flow and vehicle speed comprises: calculating the average speed according to a formula (the formula image is not reproduced in this text), in which one symbol denotes the average speed of the vehicle, another denotes the speed calculated by the speed model, and delta denotes the proportional difference between the traffic flow of the same road section at the same time in the week and the traffic flow of that road section at the current time; calculating the passing time of each road section from the average speed of each road section on the planned route; and accumulating the passing times of the road sections to obtain the first elapsed time.

In some of these embodiments, in the case where the second elapsed time is less than the first elapsed time, the method includes: calculating, for the second elapsed time, a time saving ratio according to a formula (the formula image is not reproduced in this text), where n denotes the number of road sections that the user needs to pass through to reach the target location, S denotes the serial number of a road section, ST denotes the expected time consumed on each road section, and a further symbol denotes the first elapsed time; and recommending the route corresponding to the second elapsed time to the user terminal if the time saving ratio is greater than a predetermined threshold.
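The route comparison described above can be sketched as follows. This is only an illustration under stated assumptions: the formulas for the average speed and the time saving ratio are not reproduced in this text, so the sketch takes a per-section average speed as given and assumes a ratio of (first elapsed time minus second elapsed time) divided by first elapsed time. The function names, the 0.1 threshold, and the data layout are all hypothetical.

```python
def route_time_hours(segments):
    """Accumulate per-section passing times.

    Each segment is (length_km, average_speed_kmh); the passing time of a
    section is its length divided by its average speed.
    """
    total = 0.0
    for length_km, speed_kmh in segments:
        if speed_kmh <= 0:
            raise ValueError("average speed must be positive")
        total += length_km / speed_kmh
    return total


def time_saving_ratio(first_time, second_time):
    # Assumed definition, not taken from the patent text:
    # relative saving of the alternative route versus the planned route.
    return (first_time - second_time) / first_time


def recommend(planned, alternatives, threshold=0.1):
    """Return (first_elapsed_time, best_alternative_or_None).

    An alternative is recommended only if it is faster than the planned
    route and its time saving ratio exceeds the threshold.
    """
    first = route_time_hours(planned)
    best = None
    for name, segments in alternatives.items():
        second = route_time_hours(segments)
        if second < first and time_saving_ratio(first, second) > threshold:
            if best is None or second < best[1]:
                best = (name, second)
    return first, best
```

With a planned route of two sections, `recommend([(10, 50), (20, 40)], {"A": [(35, 70)]})` compares 0.7 hours against 0.5 hours and would recommend route A under the assumed threshold.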

In a second aspect, an embodiment of the present application provides an electronic device, including a processor and a storage medium storing a computer program, where the computer program, when executed by the processor, implements the method according to any one of the above.

In a third aspect, an embodiment of the present application provides a computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements the method according to any one of the above.

According to the traffic condition prediction method, traffic information on different road sections is collected by the ETC portal system on the road for data mining. Users do not need to run any mobile phone program, the collected vehicle data are comprehensive, and the trained model is close to real road conditions, so the accuracy of traffic condition prediction can be improved. Applying distributed data processing technology (including Spark Streaming and Spark Jar) improves the timeliness of the data and allows large volumes of data to be processed quickly and efficiently. Adopting the dilated causal convolutional neural network algorithm saves resources and speeds up training, which improves overall data processing capability and saves computation cost. Moreover, because the data are timely and the prediction results are accurate, the predicted elapsed time of a route and the routes recommended to the user are also highly accurate.

Drawings

The accompanying drawings, which are included to provide a further understanding of the application and are incorporated in and constitute a part of this application, illustrate embodiment(s) of the application and together with the description serve to explain the application and not to limit the application. In the drawings:

FIG. 1 is a flow chart of a traffic condition prediction method according to an embodiment of the present application;

FIG. 2 is a schematic diagram of an overall algorithm architecture of a traffic condition prediction method according to an embodiment of the present application;

FIG. 3 is a schematic diagram of components extracted from two-dimensional data after principal component analysis according to an embodiment of the present application;

FIG. 4 is a schematic diagram comparing three interpolation methods according to an embodiment of the present application;

FIG. 5 is a schematic diagram of the structure of a causal convolution according to an embodiment of the present application;

FIG. 6 is a schematic diagram of a dilated convolutional network according to an embodiment of the present application;

FIG. 7 is an architecture diagram of a multi-layer differential dilated causal convolutional neural network model according to an embodiment of the present application;

FIG. 8 is a flow chart of training a flow model and a velocity model according to an embodiment of the present application;

FIG. 9 is a flow chart of the training and use of a flow model and a velocity model according to an embodiment of the present application;

FIG. 10 is a schematic view of a traffic condition prediction interface according to an embodiment of the present application;

FIG. 11 is a flow chart of a traffic inducement method based on traffic prediction conditions according to an embodiment of the present application;

FIG. 12 is a block diagram of an electronic device according to an embodiment of the present application.

Detailed Description

In order to make the objects, technical solutions and advantages of the present application more apparent, the present application will be described and illustrated below with reference to the accompanying drawings and embodiments. It should be understood that the specific embodiments described herein are merely illustrative of the present application and are not intended to limit the present application. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments provided in the present application without any inventive step are within the scope of protection of the present application.

The traffic flow prediction can provide support for intelligent application of the expressway, help road managers to take measures in time in emergency, effectively manage a traffic network to ensure normal operation of the road, help road planners to know the change trend of future traffic volume, help travelers to correctly select a travel mode, avoid time waste and delay of travel due to road congestion or emergencies, and accordingly improve travel quality of people.

Therefore, an embodiment of the present application provides a traffic condition prediction method, and fig. 1 is a flowchart of a traffic condition prediction method according to an embodiment of the present application, and as shown in fig. 1, the method includes:

S100: acquiring time-series traffic data collected by an ETC portal system, and preprocessing the time-series traffic data to obtain time-series traffic features;

S200: writing the time-series traffic features into a big data platform through Spark Streaming;

S300: reading the time-series traffic features with a Spark Jar on the big data platform, and training a flow model and a speed model with a dilated causal convolutional neural network algorithm;

S400: predicting traffic flow and vehicle speed based on the trained flow model and speed model.

As described above, the running information of vehicles on different road sections is collected by the ETC portal system on the road and used as training data for the model. Users do not need to run any mobile phone program, the collected vehicle data are comprehensive, and the trained model is close to real road conditions, so the accuracy of traffic condition prediction can be improved. Applying distributed data processing technology (including Spark Streaming and Spark Jar) improves the timeliness of the data and allows large volumes of data to be processed quickly and efficiently. Adopting the dilated causal convolutional neural network algorithm saves resources and speeds up training, which improves overall data processing capability and saves computation cost.
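To make the convolution named in step S300 concrete, the following minimal sketch shows a dilated causal one-dimensional convolution in plain Python. It is an illustration of the general technique only, not the network of the embodiment: the kernel weights, the dilation schedule (1, 2, 4), and the function names are assumptions. Causal means the output at time t depends only on inputs at time t or earlier; dilation means the kernel taps are spaced apart, so a few layers cover a long history.

```python
def dilated_causal_conv1d(x, weights, dilation):
    """y[t] = sum_k weights[k] * x[t - k * dilation].

    Inputs before the start of the series are treated as zero, so no
    future value ever contributes to y[t].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(weights):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out


def stack(x, layers):
    """Apply several (weights, dilation) layers one after another."""
    for weights, dilation in layers:
        x = dilated_causal_conv1d(x, weights, dilation)
    return x


def receptive_field(kernel_size, dilations):
    """Number of past time steps (including the current one) seen by the stack."""
    return 1 + (kernel_size - 1) * sum(dilations)
```

With kernel size 2 and dilations 1, 2, 4, `receptive_field(2, [1, 2, 4])` is 8, so three layers already see eight time steps, which is the reason such stacks need fewer layers than undilated convolutions for the same history length.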

As an example, FIG. 2 is a schematic diagram of the overall algorithm architecture of a traffic condition prediction method according to an embodiment of the present application. As shown in FIG. 2, the overall architecture is divided into four layers. The bottom layer is a big data platform and a stream processing platform (i.e., the computing platform in FIG. 2); the stream processing platform can be integrated with the big data platform and is used for processing and storing traffic data accessed in real time. Each traffic record corresponds to a specific time, so the traffic data are called "time-series traffic data". The data acquisition layer acquires time-series traffic data in real time from the front-end gantry industrial control computers of the ETC portal system. The modeling process comprises data preprocessing (including feature engineering), model training, and model testing and evaluation: a flow model and a speed model are obtained with the dilated causal convolutional neural network algorithm and then deployed on a machine learning platform (i.e., the big data platform). The data preprocessing includes ETL processing: data are extracted, cleaned, and transformed and then loaded into a data warehouse, so as to integrate scattered, disordered data with inconsistent standards. At the model application layer, traffic conditions are predicted by the models.

Data processing is the most important link in machine learning, and comprises the work of data set acquisition, data cleaning, data modeling and the like. For example, the raw data samples are vehicle data collected by a gantry system on the Shanghan corridor expressway, the content of the data comprises gantry numbers, license plates, vehicle types, operation occurrence time and the like, and the raw data are processed and then used for training the model. In order to more clearly illustrate the present application, a detailed procedure is set forth below.

Step S100: acquiring time-series traffic data collected by the ETC portal system, and preprocessing the time-series traffic data to obtain time-series traffic features.

The main factors considered in traffic prediction are spatial and temporal dependence. Spatial dependence means that changes in flow are governed mainly by the topology of the road network: the traffic state of an upstream highway affects the downstream highway through propagation, and the traffic state of the downstream highway affects the upstream state through feedback. Temporal dependence means that flow changes dynamically over time, mainly showing periodicity and trend. Highway traffic at a given time may be affected by traffic conditions at the previous time or earlier.

As an example, the raw data used in the embodiments of the present application come from an ETC portal system, which includes diverse edge devices such as the RSU (Road Side Unit) antenna of the transaction system and the camera snapshot system. In practice, the road coverage of ETC portals is high: on average, 1 to 2 portal devices are installed about every 2 kilometers, so the raw data samples collected in real time through the ETC portals are relatively comprehensive. Specifically, by organizing and mining the data generated by the ETC portal system, such as the ETC portal road network number, ETC portal road section number, ETC portal number, license plate color, and snapshot time, spatio-temporal characteristics can be depicted well. Time series fitting can be used for regression and prediction of information such as traffic flow and speed, and historical data (traffic data generated on special dates, festivals, and holidays) can be analyzed to extract characteristic periodic events or trends. In this way, the influence of events between upstream and downstream expressways can be predicted effectively and the occurrence of special conditions can be inferred. This makes it possible to obtain real-time traffic flow and speed information for the expressway network, to predict section traffic conditions (traffic flow and vehicle speed) 15 minutes, 30 minutes, and 60 minutes ahead, and to predict short-term changes in section traffic flow caused by events such as construction, accidents, and congestion.

As an example, the time-series traffic data used to train the model come from the following sources: table (MySQL): track_min_section, important fields: road section number, section number, and road section code; mileage stake GPS (MySQL): dim_miletone_gps, important fields: road section number, mileage stake distance, and GPS longitude and latitude; vehicle zone speed (Kafka): etl-sitteregon-speed, important fields: license plate number, license plate color, road section number, and speed.

Quality analysis on time series traffic data acquired by the ETC portal system:

Table 1: Toll station data statistics

As can be seen from Table 1, the total number of vehicles per day is about 500,000 to 600,000, of which vehicles whose entrance and exit both lie within the studied range account for about 50%, and the license plate recognition rate is about 95%. The data collected by the ETC portal system are therefore of high quality.

Therefore, in the embodiment of the application, the source data is identified and generated by the RSU antenna of the ETC portal system, and the ETC is a core charging system, and the collected data has the characteristics of low delay and high quality.

In addition, the mobile phone App and the vehicle-mounted intelligent device are developed by a specific manufacturer, data generated in the using process include identity information of a user, and information leakage risks exist.

The data preprocessing of the embodiments of the present application may include data denoising. Abnormal sample data arise when a driver encounters a special situation or because of individual driving habits, so the data need to be screened. For example, let the data set of an expressway toll station (equipped with an ETC portal) be U, in which a recorded datum is denoted r_{i,j}(t_m), where t_m is the time recorded for the vehicle from entering toll station i to entering toll station j. A left and right neighborhood range t_W of the time interval is set, which determines the time unit [t_m - t_W, t_m + t_W]. Because the distribution of travel times is asymmetric, boundary values are determined for the travel time within this time unit, and recorded data falling outside the boundary values are rejected.
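A minimal sketch of this window-based screening is shown below. The text does not state how the boundary values are computed, so the sketch assumes they are a lower and an upper quantile of the travel times inside the window, which can be chosen asymmetrically to match a skewed distribution. The record layout `(t_m, travel_time)`, the quantile levels, and the function names are hypothetical.

```python
def quantile(sorted_vals, q):
    """Linearly interpolated quantile of an already sorted list, 0 <= q <= 1."""
    if not sorted_vals:
        raise ValueError("cannot take a quantile of an empty list")
    pos = q * (len(sorted_vals) - 1)
    lo = int(pos)
    hi = min(lo + 1, len(sorted_vals) - 1)
    frac = pos - lo
    return sorted_vals[lo] + frac * (sorted_vals[hi] - sorted_vals[lo])


def denoise(records, t_w, q_low=0.05, q_high=0.95):
    """Keep records whose travel time lies within the window's boundary values.

    records: list of (t_m, travel_time) for one pair of stations (i, j).
    For each record, the window is every record with a timestamp in
    [t_m - t_w, t_m + t_w]; the boundary values are assumed to be the
    q_low and q_high quantiles of the travel times in that window.
    """
    kept = []
    for t_m, travel in records:
        window = sorted(tt for t, tt in records if t_m - t_w <= t <= t_m + t_w)
        low = quantile(window, q_low)
        high = quantile(window, q_high)
        if low <= travel <= high:
            kept.append((t_m, travel))
    return kept
```

For ten trips of which nine take 30 minutes and one takes 300 minutes, the 300-minute trip falls above the upper boundary and is dropped, while the ordinary trips are kept.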

The specific sample extraction logic is as follows:

(1) the entrance and exit station IDs both belong to the Shanghai corridor range;

(2) passenger cars with blue license plates are selected (the license plate number is not empty, the license plate color is blue, and the license plate number has more than 6 characters);

(3) abnormal entry-to-exit time values (too short or too long) are eliminated;

(4) lines whose number of passing vehicles is lower than a threshold are rejected.

In total, more than 3 million valid samples are screened out.
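The four extraction rules can be sketched as a simple filter. The record keys, the duration limits, and the interpretation of a "line" as an (entrance station, exit station) pair are assumptions made for illustration; the text does not specify them.

```python
from collections import Counter


def extract_samples(trips, corridor_stations, min_minutes, max_minutes, min_line_count):
    """Apply rules (1) to (4) to a list of trip records.

    Each trip is a dict with the hypothetical keys entry_station,
    exit_station, plate, plate_color, and minutes.
    """
    candidates = [
        t for t in trips
        # Rule (1): both stations lie inside the corridor range.
        if t["entry_station"] in corridor_stations
        and t["exit_station"] in corridor_stations
        # Rule (2): non-empty blue plate with more than 6 characters.
        and t["plate"]
        and t["plate_color"] == "blue"
        and len(t["plate"]) > 6
        # Rule (3): drop abnormally short or long travel times.
        and min_minutes <= t["minutes"] <= max_minutes
    ]
    # Rule (4): drop lines with fewer passing vehicles than the threshold.
    line_counts = Counter((t["entry_station"], t["exit_station"]) for t in candidates)
    return [
        t for t in candidates
        if line_counts[(t["entry_station"], t["exit_station"])] >= min_line_count
    ]
```

Rule (4) is applied after the other three so that the vehicle count of a line reflects only records that are otherwise valid; applying it first would be an equally defensible reading.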

The data preprocessing of the embodiment of the application may further include feature dimension reduction. Besides key data such as traffic flow and speed, the obtained data set contains many other features, such as weather data, vehicle type, time, and the configuration of the gantry equipment. If all of these are fed directly into the model, the excessively high feature dimension makes training too slow. Therefore, in the embodiment of the present application, the high-dimensional features are reduced; preferably, the features other than traffic flow, vehicle speed, and some key features are reduced by Principal Component Analysis (PCA), so that after the reduction a small number of features represents most of the feature data. Specifically, FIG. 3 is a schematic diagram of components extracted from two-dimensional data after principal component analysis according to the embodiment of the present application. As shown in FIG. 3, for a large number of two-dimensional samples, a key component is extracted from the first feature and the second feature by principal component analysis. The covariance matrix between each pair of features is calculated to obtain their degree of correlation, and the eigenvalues and eigenvectors of the covariance matrix are then computed. A larger eigenvalue indicates a larger variance of the data along the corresponding dimension and a higher signal-to-noise ratio, so that dimension describes the data as a whole better. The eigenvector corresponding to that eigenvalue is a principal component axis, and the principal component is finally extracted by projecting the data onto the eigenvector axis.
After principal component analysis, key features such as vehicle speed, traffic flow, the proportion of yellow-plate vehicles, and the number of yellow-plate vehicles are obtained, and the other features are not analyzed further. This improves the training speed of the model and saves computation cost.
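For the two-dimensional case illustrated in FIG. 3, the steps above (covariance matrix, eigenvalues and eigenvectors, projection) can be sketched in closed form without any numerical library. The function name and return layout are assumptions, and the sketch uses the sample covariance with an n - 1 divisor, which the text does not specify.

```python
import math


def pca_2d(points):
    """Principal component analysis of a list of (x, y) samples.

    Returns ((lam1, lam2), axis, scores), where lam1 >= lam2 are the
    eigenvalues of the covariance matrix, axis is the unit eigenvector
    of lam1, and scores are the projections of the centered samples
    onto that axis.
    """
    n = len(points)
    if n < 2:
        raise ValueError("need at least two samples")
    mx = sum(p[0] for p in points) / n
    my = sum(p[1] for p in points) / n
    # Entries of the 2x2 sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum((p[0] - mx) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - my) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mx) * (p[1] - my) for p in points) / (n - 1)
    # Closed-form eigenvalues of a symmetric 2x2 matrix.
    mean = (sxx + syy) / 2
    half_gap = math.hypot((sxx - syy) / 2, sxy)
    lam1, lam2 = mean + half_gap, mean - half_gap
    # Direction of the eigenvector belonging to the larger eigenvalue.
    theta = 0.5 * math.atan2(2 * sxy, sxx - syy)
    axis = (math.cos(theta), math.sin(theta))
    scores = [(p[0] - mx) * axis[0] + (p[1] - my) * axis[1] for p in points]
    return (lam1, lam2), axis, scores
```

The share of variance retained by the first component is `lam1 / (lam1 + lam2)`; for perfectly correlated features it equals 1, so the second dimension can be dropped without loss.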

The data preprocessing of the embodiments of the present application may further include missing value filling. During real-time acquisition of the portal data, the acquisition equipment may hit bugs or go down, so that portal data acquisition fails for a certain period. Statistics show that the missing-data rate in the data set is almost 4%, which greatly interferes with the accuracy of subsequent modeling.

For example, traffic flow and vehicle speed are one-dimensional data, and the embodiment of the present application fills their missing values by interpolation. Interpolation estimates new data points from known data points in the vicinity of the missing point. For one-dimensional input, this means fitting a higher-order curve to the input data and reading off the value of the new data point. The embodiment of the present application uses cubic spline interpolation, which divides the original sequence into several segments and fits a cubic function on each segment, such that the second derivative is continuous where the segments join.

In the embodiment of the present application, nearest neighbor interpolation (Nearest), linear interpolation (Linear), and cubic spline interpolation (Cubic) were compared experimentally. FIG. 4 is a schematic comparison of the three interpolation methods according to the embodiment of the present application. As shown in FIG. 4, cubic spline interpolation yields a relatively smooth curve and has the best fitting ability. Therefore, filling missing values by cubic spline interpolation can improve the accuracy of the subsequently trained model and hence of traffic condition prediction.
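The cubic spline filling can be sketched as follows. The text does not state which boundary condition is used, so this sketch assumes a natural spline (zero second derivative at both ends); the function names are hypothetical, and missing values are represented as `None`. Missing values before the first or after the last known point are extrapolated from the end segment, which is a simplification.

```python
from bisect import bisect_right


def natural_cubic_spline(xs, ys):
    """Return a function evaluating the natural cubic spline through (xs, ys).

    xs must be strictly increasing and contain at least three knots.
    """
    n = len(xs) - 1
    if n < 2:
        raise ValueError("need at least three knots")
    h = [xs[i + 1] - xs[i] for i in range(n)]
    # Tridiagonal system for the interior second derivatives M_1..M_{n-1}.
    diag = [2 * (h[i - 1] + h[i]) for i in range(1, n)]
    rhs = [6 * ((ys[i + 1] - ys[i]) / h[i] - (ys[i] - ys[i - 1]) / h[i - 1])
           for i in range(1, n)]
    m = len(diag)
    for j in range(1, m):  # forward elimination (Thomas algorithm)
        w = h[j] / diag[j - 1]
        diag[j] -= w * h[j]
        rhs[j] -= w * rhs[j - 1]
    inner = [0.0] * m
    inner[-1] = rhs[-1] / diag[-1]
    for j in range(m - 2, -1, -1):  # back substitution
        inner[j] = (rhs[j] - h[j + 1] * inner[j + 1]) / diag[j]
    second = [0.0] + inner + [0.0]  # natural boundary: M_0 = M_n = 0

    def evaluate(x):
        i = min(max(bisect_right(xs, x) - 1, 0), n - 1)
        a, b = xs[i + 1] - x, x - xs[i]
        return ((second[i] * a ** 3 + second[i + 1] * b ** 3) / (6 * h[i])
                + (ys[i] / h[i] - second[i] * h[i] / 6) * a
                + (ys[i + 1] / h[i] - second[i + 1] * h[i] / 6) * b)

    return evaluate


def fill_missing(series):
    """Replace None entries with spline estimates based on the known entries."""
    known = [(float(i), float(v)) for i, v in enumerate(series) if v is not None]
    spline = natural_cubic_spline([i for i, _ in known], [v for _, v in known])
    return [v if v is not None else spline(float(i)) for i, v in enumerate(series)]
```

Each segment gets its own cubic, and the shared second derivatives at the knots are what make the pieces join smoothly, matching the continuity property described above.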

The data preprocessing of the embodiment of the application may further include data normalization. Because the features differ in numerical range and in the scale of their units of measurement, using the raw values directly would distort the analysis results. After normalization, the values of all features lie in the same range, which makes them easy to compare.

Data normalization is essentially a linear transformation. Applying it resolves the training problem described above, so that the data are better represented in the training stage. The normalized data can be restored to the original data by the inverse linear transformation, without loss of precision.

When modeling flow, the flow value is of long (long integer) type and its values vary widely, so the MinMaxScaler normalization method is used to map the values into a specific range. In the model without a differential layer, flow values are never negative, and the range is set to (0, 1); in the scheme with a differential layer, negative values appear, and the range is set to (-1, 1).

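The formula for MinMaxScaler normalization is:

$$X_{\mathrm{std}} = \frac{X - X.\min(\mathrm{axis}=0)}{X.\max(\mathrm{axis}=0) - X.\min(\mathrm{axis}=0)}$$

where X_std represents the normalized data, and X.min(axis=0) and X.max(axis=0) represent the minimum value and the maximum value of the feature vector in the column direction, respectively.

X_scaled represents the data recovered after inverse normalization, and the expression is:

$$X_{\mathrm{scaled}} = X_{\mathrm{std}} \cdot \bigl(X.\max(\mathrm{axis}=0) - X.\min(\mathrm{axis}=0)\bigr) + X.\min(\mathrm{axis}=0)$$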
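A small sketch of min-max normalization and its inverse for a single feature column follows. It is written in plain Python rather than with a library scaler, and the function names and sample values are assumptions. The `feature_range` argument covers both target ranges mentioned above: (0, 1) for the scheme without a differential layer and (-1, 1) for the scheme with one.

```python
def fit_min_max(column):
    """Return the (minimum, maximum) of one feature column."""
    return min(column), max(column)


def scale(column, lo, hi, feature_range=(0.0, 1.0)):
    """Linearly map values from [lo, hi] to feature_range."""
    if hi == lo:
        raise ValueError("column is constant; min-max scaling is undefined")
    a, b = feature_range
    return [a + (x - lo) / (hi - lo) * (b - a) for x in column]


def inverse_scale(scaled, lo, hi, feature_range=(0.0, 1.0)):
    """Undo scale(): map values from feature_range back to [lo, hi]."""
    a, b = feature_range
    return [lo + (s - a) / (b - a) * (hi - lo) for s in scaled]
```

For example, flows of 120, 480, and 300 map to 0.0, 1.0, and 0.5, and applying `inverse_scale` with the same minimum and maximum returns the original flows up to floating-point rounding.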

In addition, label-type data can be converted into integers by label encoding, which facilitates training of the dilated causal convolutional neural network model, in other words, training of the flow model and the speed model with the dilated causal convolutional neural network algorithm.

The data preprocessing of the embodiment of the present application may further include feature selection. For example, according to the available data sources and the quality of the data, the time-series traffic features are constructed mainly from several aspects: line, time, congestion condition, and historical time-series traffic. In total, 100 features are constructed, and after screening the 50 features in Table 2 below remain:

Table 2: Information on the more than 50 remaining features

Based on the above contents, the time sequence traffic characteristics are obtained by performing data preprocessing on the time sequence traffic data, and then the model is trained by using the time sequence traffic characteristics, so that the generalization capability of the model can be increased, and the overfitting probability can be reduced.

Step S200: writing the time-series traffic features into a big data platform through Spark Streaming.

The embodiment of the application also takes into account that, as the data volume of terminals grows rapidly, the requirements on real-time performance become higher, and traditional storage and analysis based on relational databases is inevitably challenged and cannot meet the service requirements under a new tolling mode. The embodiment of the application solves these problems by means of distributed big data technologies. As an example, the data acquisition flow of the gantry equipment is as follows: the front-end gantry industrial control computer acquires data and writes the collected data into a message system in real time. Spark Streaming then consumes the message system in real time and writes the feature data into the big data platform. Spark Streaming does not process each record one by one in sequence, as in the related art; instead, it slices the incoming external data stream by time and processes the slices in batches. Writing can thus be done in real time, with a low delay on the order of seconds, whereas the delay of traditional offline exchange ranges from minutes at best to days at worst.
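The time-slicing idea can be illustrated without any Spark dependency. The sketch below is not the Spark Streaming API; it only mimics the micro-batch principle by grouping timestamped records into fixed-length intervals, each of which would then be processed as one batch. The function name, the event layout, and the 5-second interval are assumptions.

```python
from collections import defaultdict


def micro_batches(events, batch_seconds):
    """Group (timestamp_seconds, payload) events into time-sliced batches.

    Events whose timestamps fall into the same interval
    [k * batch_seconds, (k + 1) * batch_seconds) end up in the same batch.
    Batches are returned in time order; empty intervals are skipped.
    """
    if batch_seconds <= 0:
        raise ValueError("batch_seconds must be positive")
    batches = defaultdict(list)
    for timestamp, payload in events:
        batches[int(timestamp // batch_seconds)].append(payload)
    return [batches[key] for key in sorted(batches)]
```

Processing one batch per interval is what bounds the end-to-end delay to roughly the interval length, in line with the second-level latency described above.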

Based on the above, the ETC portal system acquires time-series traffic data in real time, the data are preprocessed to obtain time-series traffic features, and the features are then written into the big data platform. When the model is trained, because a large number of time-series traffic features are stored in the big data platform, they can be read directly from the platform for training; for example, the time-series traffic features of the last month can be read to train the model.

Step S300: reading the time-series traffic features with a Spark Jar on the big data platform, and training a flow model and a speed model with a dilated causal convolutional neural network algorithm.

As an example, after writing, the Spark task is packed into a Jar package, and the Spark-Submit is used for submitting, so that the Spark is a distributed task, and the calculation speed is high. For example, the code for training the model is encapsulated in a Spark Jar package and integrated with the big data platform, and the Spark Jar package can read the data in the big data platform, so that the functions of the big data platform, such as periodic scheduling and error retry, can be enjoyed, and thus, the periodic training of the model can be realized. The beneficial effects of high data real-time performance and high calculation speed are achieved.

As one example, the feature data set is split into three parts: a training data set, a validation data set, and a test data set. The training data set is divided into input vectors and output scalars, which appear in pairs as the input and output of the model. The validation data set provides an unbiased estimate of the models generated from the training data set while the hyper-parameters of the models are continually adjusted. The test data set is used to evaluate the accuracy of the trained model through an evaluation function.
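The pairing of input vectors with output scalars and the three-way split can be sketched as follows. This is a minimal illustration only: the window length `k`, the 70/15/15 percentages, the chronological (unshuffled) split, and the function names are assumptions that do not come from the text.

```python
from typing import List, Tuple

Pair = Tuple[List[float], float]

def make_pairs(series: List[float], k: int) -> List[Pair]:
    """Build (input vector of k past values, output scalar) pairs from a series."""
    return [(series[i:i + k], series[i + k]) for i in range(len(series) - k)]

def split_pairs(pairs: List[Pair], train_pct: int = 70, val_pct: int = 15):
    """Split pairs into training, validation and test parts, keeping time order.

    The percentages are illustrative assumptions, not values from the source.
    """
    n = len(pairs)
    n_train = n * train_pct // 100
    n_val = n * val_pct // 100
    return (pairs[:n_train],
            pairs[n_train:n_train + n_val],
            pairs[n_train + n_val:])

flow = [float(v) for v in range(25)]   # toy per-minute traffic flow values
pairs = make_pairs(flow, k=5)          # 20 (input, output) pairs
train, val, test = split_pairs(pairs)
```

Keeping the split in time order avoids validating on samples that precede the training data, which is a common choice for time series but is not stated in the source.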

Because different algorithms or different sets of hyper-parameters are used in the comparison experiments, a plurality of models is generated, and the best one needs to be selected as the final model. Assuming that the data set is T, one possible approach is to train each model on the training set T to obtain its hyperparameters, and then select the model with the minimum error rate. However, this approach has a significant problem: as described by the Occam's razor principle, although higher-order polynomials fit the training set better, they generalize worse and are easily over-fitted. The embodiment of the application adopts hold-out cross validation to solve this problem, which comprises the following steps:

(1) The data set is randomly split into a training set T and a verification set V; preferably, the ratio of the training set to the verification set is 7:3.

(2) Each model is trained on the training set T to obtain its hyperparameters.

(3) Using the obtained hyperparameters, each model is evaluated on the verification set V, and the model with the minimum error rate is selected.

Through the above steps, the optimal model is selected on the verification set, which largely prevents overfitting to the training set. This approach works well for large data volumes, but with a small data volume the data left for model training may be insufficient.
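The hold-out procedure above can be sketched as follows. The two candidate models (a constant predictor and a least-squares line), the mean-absolute-error function, and the toy data are hypothetical stand-ins chosen only to make the sketch runnable; the 7:3 split follows the text.

```python
import random

def fit_mean(train):
    """Hypothetical candidate 1: always predict the mean of the training outputs."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_line(train):
    """Hypothetical candidate 2: ordinary least-squares straight line."""
    n = len(train)
    mx = sum(x for x, _ in train) / n
    my = sum(y for _, y in train) / n
    slope = (sum((x - mx) * (y - my) for x, y in train)
             / sum((x - mx) ** 2 for x, _ in train))
    return lambda x: my + slope * (x - mx)

def mean_abs_error(predict, samples):
    return sum(abs(predict(x) - y) for x, y in samples) / len(samples)

def hold_out_select(candidates, data, train_pct=70, seed=0):
    """Train every candidate on T and pick the one with the lowest error on V."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)          # step (1): random split
    cut = len(shuffled) * train_pct // 100
    train_set, verification_set = shuffled[:cut], shuffled[cut:]
    best_name, best_error = None, float("inf")
    for name, fit in candidates.items():
        predict = fit(train_set)                   # step (2): train on T only
        error = mean_abs_error(predict, verification_set)  # step (3): score on V
        if error < best_error:
            best_name, best_error = name, error
    return best_name, best_error

data = [(float(x), 2.0 * x + 1.0) for x in range(20)]
best, err = hold_out_select({"mean": fit_mean, "line": fit_line}, data)
```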

In the embodiment of the application, in order to use the data efficiently, k-fold cross validation is used in the experiments to validate the model. k-fold cross validation works well for model training with a small data volume. In the subsequent experimental phase, k is set to 10. The specific steps of k-fold cross validation are as follows:

(1) The training set T is randomly partitioned into k disjoint subsets $T_1, T_2, \ldots, T_k$. Assuming that the number of training samples in T is m, each subset has m/k training samples.

(2) For each model, one subset is selected from $T_1, T_2, \ldots, T_k$ as the verification set and the remaining subsets are used as the training set; a hypothesis function is learned on the training set and verified on the verification set to obtain an error value.

(3) The process is repeated k times, and the final error value of the model is the average of the k error values.

(4) The model with the minimum error value is selected as the final model.
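The four steps of k-fold cross validation can be sketched as follows, with k = 10 as in the text. The two candidate models, the error function, and the toy data are hypothetical and serve only to make the sketch self-contained.

```python
import random

def fit_mean(train):
    """Hypothetical candidate 1: constant predictor."""
    mean_y = sum(y for _, y in train) / len(train)
    return lambda x: mean_y

def fit_ratio(train):
    """Hypothetical candidate 2: y = a * x fitted by least squares."""
    a = sum(x * y for x, y in train) / sum(x * x for x, _ in train)
    return lambda x: a * x

def mean_abs_error(predict, samples):
    return sum(abs(predict(x) - y) for x, y in samples) / len(samples)

def k_fold_error(fit, data, k=10, seed=0):
    """Average verification error of one model over k folds."""
    shuffled = data[:]
    random.Random(seed).shuffle(shuffled)
    folds = [shuffled[i::k] for i in range(k)]   # step (1): k disjoint subsets
    errors = []
    for i in range(k):                           # step (3): repeat k times
        verification = folds[i]                  # step (2): one fold verifies
        training = [s for j, fold in enumerate(folds) if j != i for s in fold]
        errors.append(mean_abs_error(fit(training), verification))
    return sum(errors) / k

data = [(float(x), 3.0 * x) for x in range(1, 41)]   # m = 40, so m/k = 4 per fold
scores = {name: k_fold_error(fit, data)
          for name, fit in (("mean", fit_mean), ("ratio", fit_ratio))}
best = min(scores, key=scores.get)                   # step (4): smallest error
```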

Based on the above steps, the quality of the model is evaluated with the test data set and a preset evaluation function. The evaluation function adopts the Mean Absolute Percentage Error (MAPE), which describes the average of the ratio between the error of each predicted value relative to the actual value and the actual value. The specific expression is as follows:

$$M = \frac{1}{n}\sum_{i=1}^{n}\left|\frac{y_i - \hat{y}_i}{y_i}\right| \times 100\%$$

wherein $y_i$ represents the actual value at the i-th time and $\hat{y}_i$ represents the predicted value at the i-th time; the ratios are averaged and then multiplied by 100% to obtain a percentage. For example, when the calculated value of M reaches a preset value, a trained model is obtained. It should be noted that, because the traffic data is time series data, there are n traffic flow samples and n vehicle speed samples at n times.
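As an illustration of the evaluation function, the following sketch computes MAPE in its standard form (mean of |actual − predicted| / |actual|, times 100). The function name and the sample values are illustrative; actual values of zero are not handled and would raise a division error.

```python
def mape(actual, predicted):
    """Mean Absolute Percentage Error, in percent."""
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and of equal length")
    ratios = [abs((a - p) / a) for a, p in zip(actual, predicted)]
    return 100.0 * sum(ratios) / len(ratios)

actual_flow = [100.0, 200.0, 400.0]      # toy actual traffic flow values
predicted_flow = [110.0, 180.0, 400.0]   # toy predictions
m_value = mape(actual_flow, predicted_flow)   # relative errors: 10%, 10%, 0%
```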

The following describes in detail the procedure of training the flow model and the speed model through the dilated causal convolutional neural network algorithm.

Neural network-based time sequence prediction models generally fall into two types: those based on a Recurrent Neural Network (RNN) and those based on a Convolutional Neural Network (CNN). The embodiment of the application provides a time Sequence Convolutional Neural Network (SCNN) model with added differential layers as the flow model and the speed model; at similar accuracy, this model uses relatively fewer resources, trains faster, and reduces the training time. SCNN is a one-dimensional CNN structure for modeling time series data and has several properties that an ordinary CNN lacks. First, its convolution layers are causal, i.e., the output at the current time step depends only on the current and preceding inputs, similar to the effect of a k-order Markov chain or the recurrent layer of an RNN. Second, dilated convolution can be used to give the convolution a very large receptive field, which saves resources and increases the training speed.

Specifically, SCNN adopts dilated causal convolution in the design of its convolutional layers, which is described in detail below:

(1) Causal convolution

The design of the dilated causal convolution is based on the causal convolution. The training of the SCNN follows a K-order Markov chain: a future state depends only on the preceding K state values and not on earlier states. Because the data is sequential, the input layer of the causal convolution is a one-dimensional convolution; a plurality of hidden layers is arranged above the input layer to abstract the characteristics of the input data, and the uppermost layer is the output layer. Fig. 5 is a schematic structural diagram of causal convolution according to an embodiment of the present application. As shown in fig. 5, after the first time step is predicted at the output layer (in fig. 5 the result is obtained from 5 input points), the predicted value is fed back as the last input point for the next time step so as to predict the next value, and this operation is repeated until all time steps in the training data have been trained.

This training approach simulates the chain structure of the RNN. Because a CNN has no recurrent connection layer, it trains faster than the RNN and its variants, especially when the input is a long sequence. However, causal convolution also has a problem: very large convolution kernels or very many hidden layers are required to enlarge the receptive field. As illustrated in fig. 5, the receptive field size equals the number of network layers (4) plus the input convolution kernel length (2) minus 1, i.e., 4+2-1=5, so 3 hidden layers are required to obtain an output with a receptive field of only 5.

A small receptive field means that the model can only learn a short time range and cannot learn more global temporal patterns. There are several ways to address this. The first is to use a longer convolution kernel or add more neural network layers, which increases the complexity of the model, so that training consumes more machine resources and takes longer. The second is to add a pooling layer for data compression, but pooling causes the time series data to lose part of its information and reduces the accuracy of the model.

(2) Dilated causal convolution

Because causal convolution can enlarge the receptive field only by increasing the convolution kernel length or the number of network layers, dilated causal convolution is adopted in the modeling process. The idea is to skip some input time steps so that the convolution kernel covers a region larger than its own length. Fig. 6 is a schematic diagram of a dilated convolutional network according to an embodiment of the present application; it shows a dilated convolutional network with a convolution kernel size of 3 and dilation coefficients of 1, 2 and 4 for the successive hidden layers.

The dilated convolution is formulated by assuming a one-dimensional input sequence, a convolution kernel of size k and a dilation coefficient of d in each layer, and n layers.

When d is 1, the dilated convolution reduces to a conventional convolution. For a fixed convolution kernel size, the receptive field grows in proportion to the dilation coefficient, so the receptive field of the convolutional neural network can be increased effectively. A dilated convolutional network can therefore enlarge the receptive field in two ways: using a larger convolution kernel or using a larger dilation coefficient, which correspond to the value of k and the value of d in the formula, respectively.
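To make the roles of k and d concrete, the following sketch implements a single dilated causal convolution, assuming the commonly used definition y[t] = Σ_{i=0}^{k-1} w[i]·x[t − d·i] with samples before the start of the sequence treated as zero. This definition, the function name, and the toy weights are assumptions for illustration and are not taken from the embodiment.

```python
def dilated_causal_conv1d(x, weights, d=1):
    """y[t] = sum_i weights[i] * x[t - d*i]; samples before t = 0 count as zero."""
    k = len(weights)                      # convolution kernel size
    out = []
    for t in range(len(x)):
        acc = 0.0
        for i in range(k):
            idx = t - d * i               # only the current or earlier samples
            if idx >= 0:
                acc += weights[i] * x[idx]
        out.append(acc)
    return out

x = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
ordinary = dilated_causal_conv1d(x, [1.0, 1.0], d=1)   # x[t] + x[t-1]
dilated = dilated_causal_conv1d(x, [1.0, 1.0], d=2)    # x[t] + x[t-2]
```

With d = 1 the kernel touches adjacent samples, as in a conventional causal convolution; with d = 2 the same two weights span three time steps, so the covered region grows without adding weights.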

Dilated convolution also has the advantage that the size of the receptive field grows exponentially with the number of network layers; for example, the receptive field of an 11-layer network is 1024, which is equivalent to a one-dimensional convolutional network of length 1024. A dilated convolutional network can obtain a very large receptive field with a small network depth, while the final amount of calculation is unchanged because the number of effective calculation points in each convolution kernel is unchanged. Compared with pooling, no input time series data is lost. Dilated convolution therefore solves the problem posed by causal convolution well.
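The receptive field figures quoted in this section can be checked with the standard formula for a stack of one-dimensional convolutions that share a kernel size, 1 + (k − 1)·Σd. The helper below is an illustrative sketch; reading the 11-layer example as an input layer plus ten dilated layers with kernel size 2 is an assumption.

```python
def receptive_field(kernel_size, dilations):
    """Receptive field of stacked 1-D convolutions with the given dilation per layer."""
    return 1 + (kernel_size - 1) * sum(dilations)

# Undilated causal stack as in fig. 5: 4 layers, kernel length 2 -> 4 + 2 - 1 = 5.
plain = receptive_field(2, [1, 1, 1, 1])

# Configuration described for fig. 6: kernel size 3, dilations 1, 2 and 4.
fig6 = receptive_field(3, [1, 2, 4])

# Ten layers with kernel size 2 and dilations doubling from 1 to 512.
doubling = receptive_field(2, [2 ** i for i in range(10)])
```

The doubling schedule gives 1024, which matches the quoted figure under the assumed layer count, while the same ten layers without dilation would cover only 11 time steps.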

After the structure of the dilated convolution is obtained, it is fused with the structure of the causal convolution for the time sequence prediction task. In this way, an effect similar to the recurrent layer of the RNN is achieved while the benefits of dilated convolution are retained.

(3) Multilayer differential dilated causal convolutional neural network

The multilayer differential dilated causal convolutional neural network (RES/DCNN) is a neural network structure designed according to the data characteristics of the traffic flow and the vehicle speed. It is based on dilated causal convolution, with a plurality of differential layers added to the dilated causal convolution layers. Fig. 7 is a schematic diagram of the architecture of the multilayer differential dilated causal convolutional neural network model according to an embodiment of the present application. With reference to fig. 7, the layers are described in sequence below:

a. Multiple differential layers

In the design of the model, the distribution of the traffic flow and vehicle speed data in the data set is unstable (non-stationary) owing to missing data. To address this, a plurality of differential layers and inverse differential layers are added so that the traffic flow and vehicle speed data tend to be stationary; stationary traffic data can be trained and predicted better in a neural network.

If a single differencing is not enough to make the traffic flow and vehicle speed data stationary, differencing is repeated until the data tends to be stationary. If the data is still not stationary after N rounds of differencing, it is not recommended to use the traffic flow and vehicle speed data for modeling analysis.
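Repeated differencing and its inverse can be sketched as follows. The sketch uses plain first-order differences applied `order` times and keeps the leading value of each round so that the original series can be restored; the function names and the toy series are assumptions, and no stationarity test is included.

```python
def difference(series, order=1):
    """Apply first-order differencing `order` times.

    Returns the differenced series and the leading value removed in each
    round, which is needed to undo the differencing later.
    """
    heads = []
    current = list(series)
    for _ in range(order):
        heads.append(current[0])
        current = [b - a for a, b in zip(current, current[1:])]
    return current, heads

def inverse_difference(diffed, heads):
    """Undo `difference` by cumulative summation, innermost round first."""
    current = list(diffed)
    for head in reversed(heads):
        restored = [head]
        for delta in current:
            restored.append(restored[-1] + delta)
        current = restored
    return current

flow = [10, 12, 15, 19, 24]                 # toy flow values with a growing trend
diffed, heads = difference(flow, order=2)   # second-order differences are constant
restored = inverse_difference(diffed, heads)
```

In this toy series the second-order differences are constant, which illustrates how repeated differencing removes a trend, and the inverse step of the same order recovers the original values exactly.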

b. Dilated causal convolution block

The data processed by the multiple differential layers is input into the dilated causal convolution block. The dilated causal convolution block is a stacked hierarchy in which each layer consists of a dilated causal convolution layer and a Dropout layer.

The dilated causal convolution layer is a one-dimensional convolutional neural network layer whose output space has a dimension of 32 and whose convolution kernel size is 2. The initial dilation coefficient is 1, and in the stacked hierarchy the dilation coefficient of each layer is twice that of the previous layer. The output of each layer in the dilated causal convolution block is connected to the input of the next layer to form a causal chain that simulates causal convolution.

A Dropout layer follows each dilated causal convolution layer; the DropConnect scheme is selected as the Dropout mode, with the ratio set to 0.3. The Dropout layer randomly sets weights to 0 according to the preset ratio each time the neural network parameters are updated during training, which helps to prevent overfitting. The same scheme is used for the subsequent Dropout layers.

c. Fully connected layer

The dilated causal convolution block is followed by a fully connected layer with an output dimension of 128 and a ReLU activation function. The fully connected network is introduced to flatten the results so that they fit the inputs of the subsequent neural network layers. A Dropout layer follows the fully connected network to further reduce the probability of overfitting. This Dropout layer is followed by another fully connected network whose dimension is 1, because the regression model outputs a single prediction result.

d. Inverse differential layer

Finally, an inverse differential layer follows the fully connected layer of dimension 1. The order of the inverse differencing is the same as that of the differential layers, and the inverse differential layer restores the differenced data to the original flow data.

After the layers are stacked, the flow values of the k time steps preceding the time step to be predicted are input into the model in strict chronological order. The output of the model is the predicted traffic flow and vehicle speed at the next time step. With a preset prediction time window of size n, this process is called n times to obtain the predicted traffic flow and vehicle speed over a future period of time.
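Calling the one-step model n times can be sketched as a rolling forecast in which each prediction is appended to the input window. Here `last_trend_model` is a hypothetical stand-in for the trained network, used only so that the loop can be executed; feeding predictions back into the window follows the feedback described for the causal convolution.

```python
def forecast(model, history, k, n):
    """Predict n future steps, feeding each prediction back into the k-step window."""
    if len(history) < k:
        raise ValueError("history must contain at least k values")
    window = list(history[-k:])          # the k most recent values, oldest first
    predictions = []
    for _ in range(n):
        next_value = model(window)       # one-step-ahead prediction
        predictions.append(next_value)
        window = window[1:] + [next_value]
    return predictions

def last_trend_model(window):
    """Hypothetical stand-in model: continue the most recent change."""
    return window[-1] + (window[-1] - window[-2])

history = [10.0, 12.0, 14.0, 16.0]       # toy per-step traffic flow
future = forecast(last_trend_model, history, k=3, n=3)
```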

Based on the above, fig. 8 is a flowchart of training the flow model and the speed model according to an embodiment of the present application. As shown in fig. 8, the vehicle interval speed data and the road section table are first accessed, and the vehicles passing per minute are counted by road section; the traffic flow, the average speed, the yellow-card proportion and the like are calculated by road section; features are selected; and a short-time flow model and a short-time speed model are trained through the dilated causal convolutional neural network algorithm, a trained model being obtained when the model evaluation reaches a preset value.

Step S400: predicting the traffic flow and the vehicle speed based on the trained flow model and speed model.

As an example, fig. 9 is a flowchart of training and using the flow model and the speed model according to an embodiment of the present application. As shown in fig. 9, ETL processing is performed on the raw time sequence traffic data collected in real time to obtain feature data of a unified standard, and the feature data is converted through feature engineering (including, for example, feature dimension reduction, default value filling, data normalization and/or feature selection) into time sequence traffic characteristics that can be used directly as model inputs, so that the flow model and the speed model are trained through the dilated causal convolutional neural network algorithm. In this process, after the feature data of a unified standard is obtained, it is also stored in batches in the Redis cache as offline features. When the traffic condition is predicted, the real-time features acquired in real time are spliced with the offline features in the Redis cache to obtain spliced features, and the spliced features are fed into the flow model and the speed model to predict the traffic flow and the vehicle speed. Because the prediction uses the spliced features, the current real-time traffic condition is taken into account, excessive dependence on historical time sequence training data is avoided, and the actual current traffic situation, including sudden congestion events, can be identified in time.

As an example, when the traffic condition is predicted, real-time features and offline features are obtained. The real-time features are obtained by using Flink to read, in real time, the time sequence traffic data acquired by the ETC portal system (the time sequence traffic data at the current time) and processing it in an ETL (Extract-Transform-Load) manner. For example, in table 2, "current vehicle speed", "current flow rate", "current number of yellow-card cars", "current yellow-card occupancy" and "current congestion index" are all real-time features. The offline features are calculated iteratively from the real-time data collected over historical time and are stored in a cache for easy reading and use; for example, in table 2, "speed and current difference for the first 5 minutes", "speed and difference for the first 10 minutes", "speed and difference for the first 15 minutes" and "difference for the first 10 minutes" are all offline features. In short, the real-time features can be read directly from the real-time data, whereas the offline features are obtained by continuously reading and computing over the real-time data and are therefore stored in the cache; for example, the traffic characteristics at the current time may be compared with those of a week earlier, and the calculation can be performed in various ways. The real-time features and the offline features are spliced to obtain spliced features, which are then fed into the trained flow model and speed model to predict the traffic flow and the vehicle speed. It should be noted that Flink is an open source stream processing framework whose core is a distributed streaming dataflow engine written in Java and Scala. Flink executes arbitrary dataflow programs in a data-parallel and pipelined manner, and its pipelined runtime system can execute both batch and stream processing programs. As a distributed data processing technology, Flink improves the calculation efficiency in the embodiment of the application. Moreover, processing the data in the ETL manner yields feature data of a unified standard.

Because the splicing characteristics are obtained by splicing the real-time characteristics and the off-line characteristics, the current actual traffic conditions (such as traffic accidents) can be considered, so that more accurate prediction can be made, namely the accuracy of the prediction result is improved.
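The splicing of real-time and offline features can be sketched as follows. A plain dictionary stands in for the Redis cache, and the feature names, the road section key `"S1"`, and the fixed feature order are hypothetical; the sketch only illustrates merging the two sources into one model input vector.

```python
def splice_features(realtime, offline_cache, section_id, feature_order):
    """Merge cached offline features with real-time features into one input vector."""
    offline = offline_cache.get(section_id, {})
    merged = {**offline, **realtime}     # real-time values win on a name clash
    missing = [name for name in feature_order if name not in merged]
    if missing:
        raise KeyError(f"missing features: {missing}")
    return [float(merged[name]) for name in feature_order]

# Stand-in for the cache: offline features keyed by road section (hypothetical names).
offline_cache = {"S1": {"speed_diff_5min": -3.0, "speed_diff_10min": -5.5}}
# Features read from the current real-time record (hypothetical names).
realtime = {"current_speed": 82.0, "current_flow": 41}
order = ["current_speed", "current_flow", "speed_diff_5min", "speed_diff_10min"]

vector = splice_features(realtime, offline_cache, "S1", order)
```

Fixing the feature order keeps the spliced vector aligned with the layout the model was trained on, which is a design assumption of this sketch.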

The embodiment of the application can also identify the real-time traffic events according to an event identification algorithm, wherein the event identification comprises the following contents:

(1) Driving direction identification: to discover events and congestion while the vehicle is driving, what lies ahead of the vehicle on its driving path must be identified; because the data contains only GPS longitude and latitude information, events behind the vehicle and on adjacent roads need to be eliminated by an algorithm.

The real-time program obtains the latest n records of a vehicle, sorts them by time, and takes the first and the last record. For each of the two points, the distance to the GPS positions of the hectometer stakes near the road section is calculated to find the nearest hectometer stake. If the hectometer stake numbers increase from the first point to the last, the vehicle is travelling in the ascending direction; if they decrease, it is travelling in the descending direction.
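The direction check based on hectometer stakes can be sketched as follows. The haversine formula is used for the distance, which is one reasonable choice and not necessarily the one used in the embodiment; the stake list, the track records, and the return labels are hypothetical.

```python
import math

def haversine_m(p, q):
    """Great-circle distance in metres between two (lat, lon) points in degrees."""
    lat1, lon1, lat2, lon2 = map(math.radians, (p[0], p[1], q[0], q[1]))
    a = (math.sin((lat2 - lat1) / 2) ** 2
         + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2)
    return 2 * 6371000.0 * math.asin(math.sqrt(a))

def nearest_stake(point, stakes):
    """stakes: list of (stake_number, (lat, lon)); return the closest stake number."""
    return min(stakes, key=lambda s: haversine_m(point, s[1]))[0]

def driving_direction(track, stakes):
    """track: list of (timestamp, (lat, lon)) records of one vehicle."""
    ordered = sorted(track, key=lambda r: r[0])          # sort by time
    first = nearest_stake(ordered[0][1], stakes)
    last = nearest_stake(ordered[-1][1], stakes)
    if last > first:
        return "ascending"
    if last < first:
        return "descending"
    return "unknown"

# Toy road running north: one stake roughly every 100 m (0.0009 degrees of latitude).
stakes = [(i, (30.0 + 0.0009 * i, 120.0)) for i in range(10)]
track = [(3, (30.0046, 120.0)), (1, (30.0010, 120.0)), (2, (30.0028, 120.0))]
direction = driving_direction(track, stakes)
```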

(2) Event direction identification: the event table is read first. The event table contains only road section information and no GPS data, so the distance between the event's stake position and the hectometer stakes on the same road section is calculated, the stake with the minimum absolute distance is selected, and the GPS information corresponding to this nearest stake is used. By this approximation, the GPS position of the event is obtained as accurately as possible.

The real-time program obtains the latest n records of a vehicle, sorts them by time, and takes the first and the last vehicle GPS records as the start position and end position of the vehicle in this time range. A direction angle calculated from the start and end positions represents the direction of the vehicle track. At the same time, a direction angle is calculated from the last vehicle GPS record to the GPS position of the stake, which represents the direction of the stake.

After the two direction angles are obtained, whether the event lies in the same direction as the vehicle can be determined, and the system pushes only events in the same direction to the vehicle where the terminal device is located.
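Comparing the two direction angles can be sketched with the standard initial-bearing formula. The 90-degree tolerance used to decide that the two directions agree is an assumption, since the text does not state a threshold, and the coordinates are toy values.

```python
import math

def bearing_deg(start, end):
    """Initial bearing from start to end, in degrees clockwise from north."""
    lat1, lon1, lat2, lon2 = map(math.radians, (start[0], start[1], end[0], end[1]))
    dlon = lon2 - lon1
    y = math.sin(dlon) * math.cos(lat2)
    x = (math.cos(lat1) * math.sin(lat2)
         - math.sin(lat1) * math.cos(lat2) * math.cos(dlon))
    return math.degrees(math.atan2(y, x)) % 360.0

def same_direction(track_start, track_end, event_pos, tolerance_deg=90.0):
    """True if the event lies roughly along the vehicle's heading (assumed tolerance)."""
    vehicle_angle = bearing_deg(track_start, track_end)   # direction of the track
    event_angle = bearing_deg(track_end, event_pos)       # direction of the stake
    diff = abs(vehicle_angle - event_angle) % 360.0
    diff = min(diff, 360.0 - diff)                        # smallest angle between them
    return diff < tolerance_deg

start, end = (30.00, 120.0), (30.01, 120.0)               # vehicle heading north
ahead = same_direction(start, end, (30.02, 120.0))        # event further north
behind = same_direction(start, end, (29.99, 120.0))       # event to the south
```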

Based on the above, the embodiment of the application can acquire the real-time traffic condition in time and predict it accurately. Fig. 10 is a schematic diagram of a traffic condition prediction interface according to an embodiment of the present application. As shown in fig. 10, the interface may display the predicted traffic conditions for the next 5 minutes, 15 minutes, 30 minutes and so on, including the average speed and the traffic volume, and may display further traffic conditions according to business demands, such as the congestion index and the yellow-card occupancy. This helps road planners understand the trend of future traffic volume and helps travelers choose a suitable travel mode, so that time wasted on road congestion or emergencies is avoided and the quality of travel is improved.

As described above, the embodiment of the application uses complete and accurate vehicle driving data as the input of model training, selects appropriate features to prevent under-fitting or over-fitting so that the model generalizes well, and models with a suitable machine learning algorithm. The high accuracy of the algorithm makes the traffic condition prediction result reliable and lays a foundation for an accurately predicted traffic guidance and reminder scheme.

Further, in the embodiment of the application, the data uploaded by the mobile terminal App is the real-time position of the vehicle, from which the road where the user is located can be determined and the passing time can be predicted; the user can then plan a route according to the predicted passing time. This can be called a traffic guidance method, the specific contents of which are as follows:

After the flow model and the speed model are trained by the dilated causal convolutional neural network algorithm, the trained models are used to predict the traffic flow and the vehicle speed from the user's current position to the target position along the user's planned route, and a first time consumption of the planned route is calculated from the predicted traffic flow and vehicle speed. Further, when a congestion event on the planned route is identified through the event identification algorithm, a second time consumption is calculated for every possible nearby travel route between the current position and the target position; if a second time consumption is less than the first time consumption, the corresponding route is recommended to the user terminal. There may be multiple possible nearby travel routes and therefore multiple second time consumptions. For example, if 5 second time consumptions are calculated for 5 possible travel routes and 3 of them are less than the first time consumption, the 3 corresponding routes may be recommended to the user terminal, and the user may select one of them as the new planned route according to his or her own situation. Optionally, when several second time consumptions are less than the first time consumption, the recommended routes are sorted according to a certain rule, for example by second time consumption, which makes selection easier for the user. Alternatively, only the route with the shortest second time consumption is recommended, and the user only needs to confirm whether to switch to it; if no confirmation is received within a preset time, the route is not switched by default. In addition, event identification may be based on events pushed by the transportation department, such as accident information and construction information.
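The comparison of the first time consumption with the second time consumptions can be sketched as follows. The route names and times are made up, and sorting by second time consumption is the example rule mentioned in the text; `top_n=1` corresponds to recommending only the fastest route.

```python
def recommend_routes(first_time, alternatives, top_n=None):
    """Return alternative routes faster than the planned route, fastest first.

    first_time: time consumption of the planned route.
    alternatives: mapping of route name to its second time consumption.
    """
    faster = [(name, t) for name, t in alternatives.items() if t < first_time]
    faster.sort(key=lambda item: item[1])
    return faster if top_n is None else faster[:top_n]

# Toy example: 5 possible nearby routes, 3 of which beat the planned route (50 min).
second_times = {"A": 45.0, "B": 62.0, "C": 38.0, "D": 50.0, "E": 41.5}
recommended = recommend_routes(50.0, second_times)
fastest_only = recommend_routes(50.0, second_times, top_n=1)
```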

As described above, in the embodiment of the present application the traffic condition prediction results are highly accurate and the data is highly real-time, so the route time consumption calculated from the prediction results and the resulting recommendations are likewise accurate and timely.

As an example, in the prediction stage, the planned route of the user is obtained, and Flink is used to read the App terminal data in real time to obtain the position information of the end user. The traffic condition (i.e., the traffic flow and the vehicle speed of each road section on the planned route) is then predicted according to the position information, and the time consumption of the planned route is calculated from the predicted traffic condition.

However, many large vehicles and engineering vehicles on the expressway seriously reduce the average speed. Therefore, when the time consumption of a route is predicted, the flow model can be used as an auxiliary calculation to improve the accuracy of the time consumption prediction. For example, the traffic flow and the vehicle speed are predicted by the flow model and the speed model; when the flow of a road section is about 20% lower than the historical average flow of that road section at the same time and no accident has occurred on the road section, it can be considered that the number of large vehicles and engineering vehicles on the road section has increased, and the speed is compensated by a percentage for this low-flow, low-speed condition so as to reduce the influence of large vehicles and engineering vehicles as far as possible. After the average speed is calculated, the passing time of each road section is calculated from its average speed, and the passing times of all road sections are accumulated to obtain the total time consumption of the route.
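The accumulation of per-section passing times and the low-flow check can be sketched as follows. Section lengths in kilometres and speeds in km/h are assumed inputs, the function names are hypothetical, and the actual percentage compensation of the speed is deliberately left out because its formula is not given here.

```python
def route_time_minutes(sections):
    """Total time of a route; sections is a list of (length_km, avg_speed_kmh)."""
    total_hours = 0.0
    for length_km, speed_kmh in sections:
        if speed_kmh <= 0:
            raise ValueError("average speed must be positive")
        total_hours += length_km / speed_kmh     # passing time of one section
    return total_hours * 60.0

def low_flow_suspected(predicted_flow, historical_flow, accident, threshold=0.2):
    """Flag the low-flow, low-speed case: flow well below history and no accident."""
    drop = (historical_flow - predicted_flow) / historical_flow
    return drop >= threshold and not accident

sections = [(10.0, 100.0), (6.0, 60.0), (3.0, 90.0)]   # toy route with three sections
total = route_time_minutes(sections)                   # 6 + 6 + 2 minutes
flagged = low_flow_suspected(70.0, 100.0, accident=False)
```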

For the calculation of the average speed, the following formula is preferably employed:

wherein the result represents the finally calculated average speed, and δ represents the proportional difference between the traffic flow of the road section at the same time in the previous week and its traffic flow at the current time. The average speed can be compensated according to the formula so as to reduce the influence of large vehicles and engineering vehicles on the average speed.

Therefore, starting from the prediction of traffic flow and vehicle speed, the embodiment of the application explores part of vehicle-road cooperation and realizes a traffic guidance method based on traffic flow prediction. Fig. 11 is a flowchart of the traffic guidance method based on traffic condition prediction according to an embodiment of the present application. As shown in fig. 11, the building GPS table, the event table and the GPS position information pushed by the vehicle-mounted terminal device are obtained; the latest GPS position information is then obtained and events ahead of the vehicle are identified; adjacent roads are extracted; the traffic flow and speed of the adjacent roads are calculated with the trained flow model and speed model; the required transit time (i.e., the time consumption) of the originally planned route and of the adjacent routes is calculated; the ratio between the time consumption of the originally planned route and that of each adjacent route is evaluated; and if the set ratio is met, the corresponding route is pushed to the user terminal.

The traffic guidance process is a process of using the model, and the required information is as follows:

(1) GPS pushed by the vehicle-mounted terminal device (kafka): dsp-feature-detector; important fields: equipment number, trigger time, GPS longitude and latitude.

(2) Building GPS (mysql): position_gps; important fields: road section number, building name, GPS longitude and latitude.

(3) Expressway event table (mysql): road_event; important fields: road section number, direction, road section location.

The algorithm for predicting the transit time is of the regression type. It must predict in real time and must simultaneously cover all small road sections near the vehicle-mounted terminal device to which results are pushed, which requires batch synchronous calculation over a large data volume while guaranteeing the accuracy of the results. The related technology cannot currently achieve this well, but the embodiment of the application solves the problem. Specifically, historical minute-level traffic condition data is selected as the data sample, and a learning model is established with the dilated causal convolutional neural network algorithm. After the model is trained, the portal data of each road section is read in real time through Flink, and the real-time traffic condition of each road section is predicted, for example the traffic flow and the vehicle speed of the road section after 5 minutes, 15 minutes, 30 minutes, 1 hour and 2 hours. The GPS data of the vehicle-mounted terminal is likewise read in real time through Flink, the calculation is keyed by license plate to find congestion events ahead, the passing time of nearby roads is estimated with the pre-trained model, and the optimal path with the shortest time consumption is recommended.

The event identification covers two types of events. The first type is actively reported events, including accidents, road maintenance and the like; the second type is data submitted by users of the vehicle-mounted terminal on the road. Events can be broadcast to user terminals in real time, so that users learn of the road conditions ahead in time and can actively choose a route. Suspected congestion events can also be calculated and reported to terminal users on the same road section, while the optimal nearby route is recommended through the route recommendation algorithm.

The optimal route selection algorithm is based on the transit time prediction algorithm. First, after a severe congestion event in the user's driving direction is identified by the event identification algorithm, all possible travel routes are calculated from the current position and the target position of the terminal user. Second, the predicted transit time of each possible travel route is calculated: the transit time of each road section is calculated from near to far by using the average speeds predicted for 5 minutes, 15 minutes, 30 minutes, 60 minutes and 120 minutes ahead, and the times are accumulated. Finally, the result is compared with the predicted time of the originally planned route, and if the time is significantly shortened, the new route is pushed to the client terminal by voice broadcast.
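The near-to-far accumulation described above can be sketched as follows. This is a minimal Python illustration under stated assumptions: the description does not say how a prediction horizon is matched to a road section, so the rule used here (take the smallest horizon that covers the time already elapsed on the route) is an assumption, as are the names `pick_horizon` and `route_time_min` and the input layout.

```python
from bisect import bisect_left

# Prediction horizons named in the description, in minutes.
HORIZONS_MIN = (5, 15, 30, 60, 120)


def pick_horizon(elapsed_min):
    """Smallest horizon covering the elapsed time, capped at the largest (assumed rule)."""
    i = bisect_left(HORIZONS_MIN, elapsed_min)
    return HORIZONS_MIN[min(i, len(HORIZONS_MIN) - 1)]


def route_time_min(sections):
    """Accumulate the transit time of a route, in minutes.

    sections: list of (length_km, {horizon_min: predicted_speed_kmh}),
    ordered from the section nearest the user to the farthest.
    """
    elapsed = 0.0
    for length_km, speed_by_horizon in sections:
        # Use the speed predicted for roughly the moment the user reaches this section.
        speed = speed_by_horizon[pick_horizon(elapsed)]
        if speed <= 0:
            raise ValueError("predicted speed must be positive")
        elapsed += length_km / speed * 60.0
    return elapsed
```

For example, a user who needs 10 minutes for the first section reaches the second section at the 10-minute mark, so under the assumed rule the 15-minute prediction is used for the second section.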

Here n denotes the number of road sections that the end user must pass through to reach the destination, S is the serial number of a road section (S = 1 denotes the first road section to be passed), and ST denotes the time expected to be consumed on each road section. The formula also uses the time the end user would take on the originally planned route. The calculated value M represents the time-saving ratio obtained by selecting the new route; when this ratio is larger than a preset threshold, the notice is pushed.
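The formula itself is not reproduced in the text. One form consistent with the stated definitions, offered as an assumption rather than as the application's exact expression, is given below. The symbol $T_0$ is introduced here for the time on the originally planned route, and $ST_S$ is the time expected on section $S$ of the new route.

```latex
M = \frac{T_0 - \sum_{S=1}^{n} ST_S}{T_0}
```

Under this assumed form, with $T_0 = 40$ minutes and a new route of three sections taking 8, 10 and 12 minutes, $M = (40 - 30)/40 = 0.25$, so the new route would be pushed whenever the preset threshold is below 0.25.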

Therefore, the short-term traffic condition prediction algorithm can be used to calculate the transit time of adjacent roads and to recommend the optimal adjacent road for intelligent traffic emergency evacuation, thereby optimizing traffic organization and guidance on the road network and relieving traffic congestion.

A specific example is set forth below.

Step one: after each working day, the platform acquires the portal transaction data table generated over the preceding week, segments the data into time intervals of 5 minutes, 15 minutes, 30 minutes, 1 hour and 2 hours, selects features, and trains an offline speed model and an offline flow model with the algorithm.

Step two: read the portal transaction data table of the current day, and store the offline indexes required for model prediction in Redis for use in subsequent prediction.

Step three: read the GPS data of the app terminal in real time to obtain the specific position of the user on the expressway; when congestion or an accident occurs ahead on the road section the user is driving on, input the features of the current time range for the adjacent road into the pre-trained speed model and flow model, and calculate the predicted traffic flow and the predicted passing speed of the adjacent road.

Step four: calculate the transit time required for each road section by dividing its distance by its speed, and accumulate the transit times of the road sections to obtain the transit time of the adjacent road and the transit time of the originally planned route.

Step five: if the transit time of the adjacent road is significantly shorter than that of the originally planned route, select the information of the adjacent road, push it to the user terminal, and remind the user to change route.
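Steps four and five can be sketched in a few lines of Python. The function names, the input layout, and the default threshold of 0.2 are hypothetical and chosen only for illustration; the example does not define what counts as "significantly shorter", so the relative-saving test used here is an assumption.

```python
def travel_time_h(sections):
    """Sum distance / speed over road sections; returns hours.

    sections: iterable of (distance_km, predicted_speed_kmh).
    """
    total = 0.0
    for distance_km, speed_kmh in sections:
        if speed_kmh <= 0:
            raise ValueError("predicted speed must be positive")
        total += distance_km / speed_kmh
    return total


def should_push(original, adjacent, threshold=0.2):
    """Return (push, saving) for an adjacent route versus the planned route.

    saving is the fraction of the planned route's time that the adjacent
    route saves; threshold is a hypothetical preset value.
    """
    t_orig = travel_time_h(original)
    if t_orig <= 0:
        raise ValueError("planned route must have a positive transit time")
    t_adj = travel_time_h(adjacent)
    saving = (t_orig - t_adj) / t_orig
    return saving > threshold, saving
```

With a congested planned route of two 10 km sections at 20 km/h and 40 km/h (0.75 h in total) and an adjacent route of two 12 km sections at 60 km/h (0.4 h in total), the saving is about 47 percent, so the adjacent route would be pushed under the assumed threshold.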

In summary, the traffic guidance method based on predicted traffic conditions in the embodiments of the present application includes the following. First, the vehicle information of each portal passage is acquired from the expressway station interval speed-measurement data pushed in real time, and indexes such as the traffic flow and average speed of each road section are counted through real-time calculation. Second, the traffic flow model and the speed model are trained by a machine learning algorithm on indexes such as the traffic flow and average speed of each road section; in the embodiment of the application, a congestion model can also be trained according to business requirements. Third, the congestion or accident condition of a road section is judged by analyzing in real time the GPS positions pushed by users who have installed the intelligent terminal, combined with real-time traffic flow information, and events such as congestion or accidents are pushed to all intelligent terminals within the affected area. Finally, when the congestion caused by an event is serious, the pre-trained traffic flow and speed models are used to calculate the predicted travel time of the originally planned route and of nearby roads, and the path with the shortest time consumption is selected and pushed to the terminal device.
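The first step of the summary, counting traffic flow and average speed per road section, amounts to a windowed aggregation. The Python sketch below shows one way to do it in batch form; the record layout `(section_no, epoch_seconds, speed_kmh)` and the function name `aggregate` are assumptions, and the application itself performs this calculation with a streaming engine rather than with in-memory code like this.

```python
from collections import defaultdict


def aggregate(records, window_min=5):
    """Count vehicles and average their speed per road section and time window.

    records: iterable of (section_no, epoch_seconds, speed_kmh).
    Returns {(section_no, window_start_epoch): (flow, mean_speed_kmh)}.
    """
    width = window_min * 60
    acc = defaultdict(lambda: [0, 0.0])  # key -> [vehicle count, speed sum]
    for section_no, ts, speed in records:
        key = (section_no, ts - ts % width)  # align to the start of the window
        acc[key][0] += 1
        acc[key][1] += speed
    return {key: (n, total / n) for key, (n, total) in acc.items()}
```

Calling the function with `window_min` set to 5, 15, 30, 60 or 120 yields the interval granularities mentioned in the example above.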

In the specific steps above, the distributed data processing technology provides a large data processing capacity and high efficiency; real-time calculation returns the model prediction results to the client terminal quickly; data processing and feature selection improve the generalization capability of the models and reduce the probability of overfitting; compared with an ordinary convolutional neural network, the expanded (dilated) causal convolutional neural network has a larger receptive field and can achieve similar accuracy with fewer layers; and compared with a recurrent neural network, its training performance is higher.
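The receptive-field claim can be made concrete with a small Python sketch. It is a generic illustration of dilated causal convolution, not the application's multilayer differential model; the function names are introduced here for illustration. With kernel size 2 and dilations 1, 2, 4 and 8, four stacked layers see 16 past time steps, whereas four undilated layers of the same kernel size see only 5.

```python
def receptive_field(kernel_size, dilations):
    """Number of input time steps visible to one output of the stacked layers."""
    return 1 + sum((kernel_size - 1) * d for d in dilations)


def dilated_causal_conv1d(x, weights, dilation):
    """One dilated causal convolution layer on a 1-D sequence.

    y[t] = sum_j weights[j] * x[t - j * dilation], treating samples before
    the start of the sequence as zero, so y[t] never depends on x[t+1:].
    """
    out = []
    for t in range(len(x)):
        acc = 0.0
        for j, w in enumerate(weights):
            idx = t - j * dilation
            if idx >= 0:
                acc += w * x[idx]
        out.append(acc)
    return out
```

Because each output depends only on current and earlier inputs, changing future samples leaves earlier outputs unchanged, which is the property that makes the layer suitable for forecasting.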

Fig. 12 is a block diagram of an electronic device according to an embodiment of the present application, and as shown in fig. 12, the electronic device may include a processor 81 and a memory 82 storing computer program instructions.

Specifically, the processor 81 may include a Central Processing Unit (CPU) or an Application-Specific Integrated Circuit (ASIC), or may be configured as one or more integrated circuits implementing the embodiments of the present application.

The memory 82 may include mass storage for data or instructions. By way of example, and not limitation, the memory 82 may include a Hard Disk Drive (HDD), a floppy disk drive, a Solid State Drive (SSD), flash memory, an optical disk, a magneto-optical disk, magnetic tape, or a Universal Serial Bus (USB) drive, or a combination of two or more of these. The memory 82 may include removable or non-removable (or fixed) media, where appropriate. The memory 82 may be internal or external to the data processing apparatus, where appropriate. In particular embodiments, the memory 82 is a non-volatile memory. In particular embodiments, the memory 82 includes Read-Only Memory (ROM) and Random Access Memory (RAM). The ROM may be mask-programmed ROM, Programmable ROM (PROM), Erasable PROM (EPROM), Electrically Erasable PROM (EEPROM), Electrically Alterable ROM (EAROM), or flash memory (FLASH), or a combination of two or more of these, where appropriate. The RAM may be a Static Random-Access Memory (SRAM) or a Dynamic Random-Access Memory (DRAM), where the DRAM may be a Fast Page Mode DRAM (FPM DRAM), an Extended Data Out DRAM (EDO DRAM), a Synchronous DRAM (SDRAM), and the like.

The memory 82 may be used to store or cache various data files for processing and/or communication use, as well as possible computer program instructions executed by the processor 81.

The processor 81 implements any one of the traffic condition prediction methods or traffic guidance methods based on predicted traffic conditions in the above embodiments by reading and executing the computer program instructions stored in the memory 82.

In some of these embodiments, the electronic device may also include a communication interface 83 and a bus 80. As shown in fig. 12, the processor 81, the memory 82, and the communication interface 83 are connected via the bus 80 to complete mutual communication.

The communication interface 83 is used to implement communication between the modules, devices, units and/or equipment in the embodiments of the present application. The communication interface 83 may also implement data communication with other components, such as external devices, image/data acquisition equipment, databases, external storage, and image/data processing workstations.

The bus 80 includes hardware, software, or both, coupling the components of the electronic device to one another. The bus 80 includes, but is not limited to, at least one of the following: a data bus, an address bus, a control bus, an expansion bus, and a local bus. By way of example, and not limitation, the bus 80 may include an Accelerated Graphics Port (AGP) or other graphics bus, an Enhanced Industry Standard Architecture (EISA) bus, a Front-Side Bus (FSB), a HyperTransport (HT) interconnect, an Industry Standard Architecture (ISA) bus, an InfiniBand interconnect, a Low Pin Count (LPC) bus, a memory bus, a Micro Channel Architecture (MCA) bus, a Peripheral Component Interconnect (PCI) bus, a PCI-Express (PCIe) bus, a Serial Advanced Technology Attachment (SATA) bus, a Video Electronics Standards Association Local Bus (VLB), or other suitable bus, or a combination of two or more of these. The bus 80 may include one or more buses, where appropriate. Although specific buses are described and shown in the embodiments of the application, any suitable bus or interconnect is contemplated by the application.

In addition, in combination with the traffic condition prediction method or the traffic guidance method based on predicted traffic conditions in the above embodiments, the embodiments of the present application may provide a computer-readable storage medium for implementation. The computer-readable storage medium has computer program instructions stored thereon; when executed by a processor, the computer program instructions implement any of the traffic condition prediction methods or traffic guidance methods based on predicted traffic conditions in the above embodiments.

The above embodiments express only several implementations of the present application, and their description is relatively specific and detailed, but they should not be construed as limiting the scope of the invention. It should be noted that a person skilled in the art can make several variations and modifications without departing from the concept of the present application, all of which fall within the scope of protection of the present application. Therefore, the protection scope of the present patent shall be subject to the appended claims.

Claims

1. A traffic condition prediction method, comprising:

acquiring time sequence traffic data acquired by an ETC portal system, and performing data preprocessing on the time sequence traffic data to obtain time sequence traffic characteristics;

writing the time sequence traffic characteristics into a big data platform through Spark Streaming;

reading the time sequence traffic characteristics by utilizing a Spark Jar on the big data platform, and training a flow model and a speed model through an extended causal convolutional neural network algorithm, wherein the flow model and the speed model are multilayer differential extended causal convolutional neural network models, and the multilayer differential extended causal convolutional neural network models comprise difference layers and inverse difference layers;

and predicting the traffic flow and the vehicle speed based on the trained flow model and speed model.

2. The method of claim 1, wherein the data pre-processing comprises at least one of data de-noising, feature dimensionality reduction, default value population, data normalization, and feature selection.

3. The method of claim 2, wherein:

the feature dimensionality reduction comprises: reducing the dimension of the time sequence traffic characteristics by principal component analysis to obtain key characteristics; and/or

the default value population comprises: filling the missing values in the time sequence traffic characteristics by cubic spline interpolation.

4. The method of claim 1, wherein training the flow model and the speed model through the extended causal convolutional neural network algorithm comprises:

splitting the time sequence traffic characteristics into a training data set, a verification data set and a test data set;

training a flow model and a speed model by using the training data set according to a k-fold cross validation method, and adjusting the hyper-parameters of each model based on the validation data set;

and evaluating the accuracy of each model by using the test data set and a preset evaluation function.

5. The method of claim 1, wherein predicting the traffic flow and the vehicle speed based on the trained flow model and speed model comprises:

acquiring real-time characteristics and offline characteristics, wherein the real-time characteristics are obtained by reading time sequence traffic data acquired by an ETC portal system in real time through Flink and processing the data in an ETL mode, the offline characteristics are obtained by calculating according to the real-time characteristics acquired in historical time, and the offline characteristics are stored in a cache;

splicing the real-time features and the off-line features to obtain spliced features;

and inputting the spliced features into the trained flow model and the trained speed model to predict the traffic flow and the vehicle speed.

6. The method of claim 1, wherein after training the flow model and the speed model through the extended causal convolutional neural network algorithm, the method comprises:

predicting the traffic flow and the vehicle speed from the current position of the user to the target position on the user planned route by using the trained traffic model and speed model;

calculating a first elapsed time for the planned route based on the predicted traffic flow and the vehicle speed.

7. The method of claim 6, wherein after calculating the first elapsed time for the planned route based on the predicted traffic flow and vehicle speed, the method comprises:

when a congestion event is identified on the planned route through an event identification algorithm, calculating second consumed time of all possible nearby traveling routes according to the current position and the target position;

and if the second consumed time is less than the first consumed time, recommending a route corresponding to the second consumed time to the user terminal.

8. The method of claim 6, wherein calculating the first elapsed time for the planned route based on the predicted traffic flow and vehicle speed comprises:

the average speed is calculated according to the following formula:

wherein the formula relates the average speed to the speed calculated by the speed model and to a value δ, and δ represents the proportional difference between the traffic flow of the road section at the same time of the week and the traffic flow of the same road section at the current time;

calculating the passing time of each road section according to the average speed of each road section on the planned route;

and accumulating the passing time of each road section to obtain the first consumed time.

9. The method of claim 7, wherein if the second elapsed time is less than the first elapsed time, the method comprises:

for the second elapsed time, calculating a time saving ratio:

wherein n represents the number of road sections which the end user needs to pass through to reach the target position, S represents the serial number of a road section, ST represents the time expected to be consumed on each road section, and the formula further uses the first elapsed time;

and if the time saving ratio is larger than a preset threshold value, recommending a route corresponding to the second consumed time to the user terminal.

10. An electronic device comprising a processor and a storage medium storing a computer program, wherein the computer program, when executed by the processor, implements the method of any of claims 1 to 9.

11. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 9.