CN115238963A - Traffic prediction method, device, electronic device and storage medium

Info

Publication number: CN115238963A
Application number: CN202210723890.3A
Authority: CN
Inventors: 吕宜生; 魏泽兵; 陈薏竹; 王晓; 王飞跃
Original assignee: Institute of Automation of Chinese Academy of Science
Current assignee: Institute of Automation of Chinese Academy of Science
Priority date: 2022-06-23
Filing date: 2022-06-23
Publication date: 2022-10-25

Abstract

The present invention provides a traffic prediction method, device, electronic device and storage medium. The method includes: determining historical traffic data and a missing position matrix of the historical traffic data; and inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain the prediction result output by the prediction completion model. The prediction completion model is obtained by training an initial model with sample data, sample missing data, the expected value of the sample data and a sample missing position matrix, starting from the weight parameters of the data completion module in the initial model; those weight parameters are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix. The method enables information exchange between the prediction module and the data completion module of the prediction completion model and completes prediction end to end, improving both the immediacy and the accuracy of the prediction results.

Description

Traffic prediction method, device, electronic device and storage medium

Technical Field

The present invention relates to the technical field of artificial intelligence, and in particular to a traffic prediction method, device, electronic device and storage medium.

Background

With the development of big data and artificial intelligence, the steadily accumulating volume of traffic data and a range of effective deep learning algorithms provide a practical basis for alleviating traffic congestion and reducing traffic pollution. In real traffic scenarios, however, physical detection equipment is difficult to maintain, and incidental faults such as data loss or errors can occur during data transmission and storage. As a result, the traffic data obtained is often of poor quality or has varying proportions of missing entries.

Although existing methods can achieve good prediction performance when data is missing, they have the following shortcomings:

First, existing methods treat data completion as an independent task, so traffic prediction with missing data is split into two stages, data completion and prediction, which gives poor immediacy.

Second, the data completion and prediction tasks do not interact effectively, so data completion errors accumulate further in the prediction task and reduce prediction accuracy.

Summary of the Invention

The present invention provides a traffic prediction method, device, electronic device and storage medium, to overcome the poor immediacy and low accuracy of prior-art traffic prediction when traffic data is missing.

The present invention provides a traffic prediction method, comprising:

determining historical traffic data and a missing position matrix of the historical traffic data;

inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model;

wherein the prediction completion model is obtained by training an initial model with sample data, sample missing data, the expected value of the sample data and a sample missing position matrix, starting from the weight parameters of the data completion module in the initial model; and the weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.

According to the traffic prediction method provided by the present invention, the training steps of the prediction completion model include:

determining the initial model, the initial model including a data completion module and a prediction module;

pre-training the data completion module based on the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix to obtain the weight parameters of the data completion module;

jointly training the data completion module and the prediction module based on the weight parameters, the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix to obtain the prediction completion model.

According to the traffic prediction method provided by the present invention, jointly training the data completion module and the prediction module includes:

jointly training the data completion module and the prediction module based on a joint training loss function, where the joint training loss function is constructed from the difference between the expected value of the sample data and the initial prediction result of the sample data output by the prediction module, and the difference between the sample missing data and the completion data of the sample data output by the data completion module.

According to the traffic prediction method provided by the present invention, the steps of obtaining the initial prediction result include:

determining sample reconstruction data based on the sample data and the completion data of the sample data;

determining, based on the sample reconstruction data and each time granularity, the sample reconstruction data corresponding to each time granularity;

determining a sample fusion feature based on the sample reconstruction data corresponding to each time granularity;

determining the initial prediction result based on the sample fusion feature.

According to the traffic prediction method provided by the present invention, determining the sample fusion feature based on the sample reconstruction data corresponding to each time granularity includes:

performing spatiotemporal feature extraction on the sample reconstruction data corresponding to each time granularity to obtain the sample spatiotemporal features corresponding to each time granularity;

determining a sample spatiotemporal fusion feature based on the sample spatiotemporal features corresponding to each time granularity and adaptive weights;

determining the sample fusion feature based on the sample spatiotemporal fusion feature and external factor features, where the external factor features include weather factor features and/or time period factor features.

According to the traffic prediction method provided by the present invention, pre-training the data completion module based on the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix to obtain the weight parameters of the data completion module includes:

determining the sample data, the sample missing data and the sample missing position matrix;

performing spatiotemporal feature extraction on the sample data, and completing the sample data based on the resulting sample spatiotemporal features to obtain the completion data corresponding to the sample data;

determining a completion module loss based on the completion data corresponding to the sample data, the sample missing data and the sample missing position matrix, and iteratively updating the parameters of the data completion module based on the completion module loss to obtain the weight parameters of the data completion module.

According to the traffic prediction method provided by the present invention, the sample data, the sample missing data and the sample missing position matrix are determined as follows:

determining original traffic data;

randomly deleting traffic data from the original traffic data at each preset proportion to obtain the sample data and the sample missing data corresponding to each preset proportion;

determining the sample missing position matrix corresponding to each preset proportion based on the spatiotemporal positions of the original traffic data and the spatiotemporal positions, within the original traffic data, of the sample missing data corresponding to that preset proportion;

determining the sample data, the sample missing data and the sample missing position matrix based on the sample data, the sample missing data and the sample missing position matrix corresponding to each preset proportion.
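The sample construction steps above can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `make_sample`, the use of a flat list for one region's series, the choice to store deleted entries as 0.0, and the three example proportions are all assumptions. The mask uses 1 for kept entries and 0 for deleted ones.

```python
import random

def make_sample(raw, ratio, seed=0):
    """Randomly delete a fraction `ratio` of the entries in `raw` (hypothetical helper).

    Returns (sample, missing, mask):
      sample  - raw with deleted entries set to 0.0 (placeholder, an assumption)
      missing - the deleted values at their original positions, 0.0 elsewhere
      mask    - 1 where the value is kept, 0 where it was deleted
    """
    rng = random.Random(seed)
    n_missing = round(len(raw) * ratio)
    dropped = set(rng.sample(range(len(raw)), n_missing))
    mask = [0 if i in dropped else 1 for i in range(len(raw))]
    sample = [v * m for v, m in zip(raw, mask)]
    missing = [v * (1 - m) for v, m in zip(raw, mask)]
    return sample, missing, mask

# One minute of synthetic readings, one per second (made-up values).
raw = [float(20 + (t % 7)) for t in range(60)]

# One (sample, missing, mask) triple per preset proportion.
dataset = {r: make_sample(raw, r, seed=i) for i, r in enumerate((0.1, 0.3, 0.5))}
```

Because the sample and the missing data occupy complementary positions, adding them entry by entry recovers the original series, which is a convenient sanity check when building such a dataset.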

The present invention also provides a traffic prediction device, comprising:

a determining module, configured to determine historical traffic data and a missing position matrix of the historical traffic data;

a prediction module, configured to input the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model.

The present invention also provides an electronic device, comprising a memory, a processor, and a computer program stored in the memory and executable on the processor, where the processor, when executing the program, implements any of the traffic prediction methods described above.

The present invention also provides a non-transitory computer-readable storage medium on which a computer program is stored, where the computer program, when executed by a processor, implements any of the traffic prediction methods described above.

The present invention also provides a computer program product, including a computer program that, when executed by a processor, implements any of the traffic prediction methods described above.

In the traffic prediction method, device, electronic device and storage medium provided by the present invention, a prediction completion model trained on the basis of the data completion model predicts from historical traffic data and produces the prediction result, which makes the prediction process of the prediction completion model end to end. During training of the prediction completion model, the parameters of its initial model are updated according to the output initial prediction result, which provides information exchange between the prediction module and the data completion module of the prediction completion model. Prediction is thus completed end to end, improving both the immediacy and the accuracy of the prediction results.

Brief Description of the Drawings

To explain the technical solutions of the present invention or of the prior art more clearly, the drawings needed to describe the embodiments or the prior art are briefly introduced below. Evidently, the drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can obtain other drawings from them without creative effort.

FIG. 1 is a schematic flowchart of the traffic prediction method provided by the present invention;

FIG. 2 is the first schematic flowchart of the prediction completion model training method provided by the present invention;

FIG. 3 is a schematic flowchart of the method for obtaining the initial prediction result provided by the present invention;

FIG. 4 is a schematic flowchart of the method for obtaining the sample fusion feature provided by the present invention;

FIG. 5 is a network structure diagram of the spatiotemporal feature extraction network provided by the present invention;

FIG. 6 is the second schematic flowchart of the prediction completion model training method provided by the present invention;

FIG. 7 is a network structure diagram of the prediction completion model provided by the present invention;

FIG. 8 is a schematic structural diagram of the traffic prediction device provided by the present invention;

FIG. 9 is a schematic structural diagram of the electronic device provided by the present invention.

Detailed Description

To make the objectives, technical solutions and advantages of the present invention clearer, the technical solutions of the present invention are described clearly and completely below with reference to the accompanying drawings. Evidently, the described embodiments are some, but not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on the embodiments of the present invention without creative effort fall within the protection scope of the present invention.

At present, data completion relies mainly on classical interpolation methods such as historical averaging and linear regression, on traditional machine learning methods such as Markov Chain Monte Carlo (MCMC), or on deep-learning-based data completion methods. Although existing methods can achieve good prediction performance when data is missing, data completion is a task separate from prediction, which gives poor immediacy; and because data completion and prediction do not interact, data completion errors accumulate in the prediction and lower the prediction accuracy.

Therefore, how to improve the immediacy and accuracy of traffic data prediction when data is missing is a technical problem that those skilled in the art urgently need to solve.

To address the above technical problem, an embodiment of the present invention provides a traffic prediction method. FIG. 1 is a schematic flowchart of the traffic prediction method provided by the present invention. As shown in FIG. 1, the method includes:

Step 110: determine historical traffic data and a missing position matrix of the historical traffic data.

It should be noted that the historical traffic data is a sequence of traffic data for a region over a time period. The traffic data may be traffic flow, speed, demand or similar data. The region may be one street or several streets, or one of multiple regions into which a city is divided by longitude and latitude; this embodiment of the present invention does not limit this.

In addition, the historical traffic data may be complete traffic data for a region over a time period, or traffic data for a region over a time period that contains missing entries; this embodiment of the present invention does not limit this. Complete traffic data means that no time step between the start and the end of the period has lost its traffic data, where the time step is the sampling interval. For example, with a time step of 1 second, one minute of complete traffic data contains 60 records, one per second. The missing position matrix of the historical traffic data has the same size as the complete traffic data and uses a flag value to record which entries of the historical traffic data are missing. For example, with a time step of 1 second, the missing position matrix for one minute of historical traffic data can be a 60x1 matrix; if the traffic data for the 15th second is missing, the 15th entry of the matrix is marked 0 and all other entries are marked 1. This embodiment of the present invention does not limit this.
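The 60x1 example can be written out as a short sketch. It assumes that a missing reading is represented by `None` in the incoming series and that the 15th second corresponds to 0-based index 14; both the representation and the helper name `missing_position_matrix` are illustrative choices, not taken from the patent.

```python
def missing_position_matrix(series):
    """Return a 0/1 list: 1 where a reading exists, 0 where it is missing (None)."""
    return [0 if v is None else 1 for v in series]

readings = [30.0] * 60   # one reading per second for one minute (made-up values)
readings[14] = None      # the reading for the 15th second is missing
mask = missing_position_matrix(readings)
```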

Step 120: input the historical traffic data and the missing position matrix into the prediction completion model to obtain the prediction result output by the prediction completion model.

The prediction completion model is obtained by training an initial model with sample data, sample missing data, the expected value of the sample data and a sample missing position matrix, starting from the weight parameters of the data completion module in the initial model. The weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.

When data is missing, current traffic prediction models usually first complete the missing data with a separate data completion model and then run the prediction model on the completed traffic data. This is not an end-to-end process, and it allows no information exchange between the data completion module and the prediction module. A prediction completion model therefore needs information exchange between its data completion module and its prediction module so that prediction is end to end; this improves the immediacy of prediction and limits the further accumulation of data completion errors in the prediction module, which in turn improves the accuracy of the prediction results. Accordingly, in this embodiment of the present invention, the data completion module gives the prediction completion model its data completion capability, and a joint training mechanism provides the information exchange between the data completion module and the prediction module, making the prediction process of the prediction completion model end to end.

Specifically, the data completion module in the initial model is pre-trained on the sample data, the sample missing data and the sample missing position matrix to obtain its weight parameters, which gives the initial model a data completion capability. On this basis, the parameters of the initial model are iteratively updated using the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix. During these updates the parameters of the data completion module are updated as well, so the data completion module is influenced by the initial prediction result output by the prediction module. The resulting prediction completion model therefore has both a data completion capability for missing-data scenarios and information exchange between its data completion module and its prediction module.

The prediction completion model obtained in this way predicts end to end and supports information exchange between its data completion module and its prediction module. Historical traffic data can therefore be input directly into the prediction completion model to obtain accurate and reliable prediction results.

It should be noted that the sample data may be incomplete traffic data for a region over a time period, with different missing proportions, and the sample missing data is the traffic data missing from the sample data. The missing proportion is the ratio of the number of missing traffic data entries to the number of entries in the complete traffic data. For example, for one minute of complete traffic data with a time step of 1 second and a missing rate of 10%, 6 randomly chosen records are missing from the corresponding sample data. This embodiment of the present invention does not limit this. In addition, the expected value of the sample data is the traffic data value, such as a flow value or a speed value, of the same region in a future time period of the same length; this embodiment of the present invention does not limit this either.

In the traffic prediction method provided by this embodiment of the present invention, a prediction completion model trained on the basis of the pre-trained weight parameters of the data completion module predicts from historical traffic data and produces the prediction result. During training of the prediction completion model, the parameters of its initial model are updated according to the output initial prediction result, which provides information exchange between the prediction module and the data completion module of the prediction completion model. Prediction is thus completed end to end, improving both the immediacy and the accuracy of the prediction results.

Based on the above embodiment, FIG. 2 is the first schematic flowchart of the prediction completion model training method provided by the present invention. As shown in FIG. 2, the training steps of the prediction completion model in step 120 include:

Step 210: determine an initial model, the initial model including a data completion module and a prediction module.

Step 220: pre-train the data completion module based on the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix to obtain the weight parameters of the data completion module.

Step 230: jointly train the data completion module and the prediction module based on the weight parameters, the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix to obtain the prediction completion model.

So that the prediction completion model can both complete data and produce predictions when data is missing, the initial model is built from a data completion module and a prediction module, and the data completion module is pre-trained to obtain its weight parameters.

Compared with the initial model before the data completion module is pre-trained, the initial model after pre-training already has a certain data completion capability.

The weight parameters, the sample data, the sample missing data, the expected value of the sample data and the sample missing position matrix are then used to jointly train the data completion module and the prediction module in the initial model to obtain the prediction completion model. That is, the parameters of the data completion module and the prediction module in the initial model are iteratively updated according to the initial prediction result output by the initial model. Here, the parameters of the data completion module are also updated according to the initial prediction result, which establishes information exchange between the data completion module and the prediction module of the prediction completion model.
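The two-stage schedule of steps 220 and 230 can be illustrated with a deliberately tiny toy. Everything here is an assumption made for illustration: the completion "module" is a single scalar `a` that fills gaps with `a` times the mean of the observed values, the prediction "module" is a single scalar `b` applied to the mean of the reconstructed series, the joint loss is a weighted sum with a hypothetical weight `w`, and gradients are taken by finite differences. The patent's modules are neural networks whose details are not given in this section.

```python
def completion_loss(a, x, mask):
    """Mean squared error on missing positions; gaps are filled with a * mean(observed)."""
    observed = [v for v, m in zip(x, mask) if m == 1]
    fill = a * sum(observed) / len(observed)
    errs = [(v - fill) ** 2 for v, m in zip(x, mask) if m == 0]
    return sum(errs) / len(errs)

def prediction_loss(a, b, x, mask, y):
    """Squared error of the forecast b * mean(reconstructed series) against y."""
    observed = [v for v, m in zip(x, mask) if m == 1]
    fill = a * sum(observed) / len(observed)
    rebuilt = [v if m == 1 else fill for v, m in zip(x, mask)]
    return (y - b * sum(rebuilt) / len(rebuilt)) ** 2

def joint_loss(a, b, x, mask, y, w=0.5):
    """Weighted sum of both losses; w is a hypothetical fixed weight."""
    return prediction_loss(a, b, x, mask, y) + w * completion_loss(a, x, mask)

def grad(f, params, eps=1e-6):
    """Central finite-difference gradient of f at params."""
    g = []
    for i in range(len(params)):
        hi = list(params)
        lo = list(params)
        hi[i] += eps
        lo[i] -= eps
        g.append((f(hi) - f(lo)) / (2 * eps))
    return g

def descend(f, params, lr=0.1, steps=200):
    """Plain gradient descent on f starting from params."""
    for _ in range(steps):
        params = [p - lr * g for p, g in zip(params, grad(f, params))]
    return params

x = [1.0, 1.2, 1.1, 1.3, 1.2, 1.4]   # complete toy series (ground truth)
mask = [1, 1, 0, 1, 0, 1]            # 0 marks a deleted reading
y = 1.5                              # expected future value

# Stage 1 (pre-training): update only the completion parameter a.
comp_before = completion_loss(0.0, x, mask)
(a,) = descend(lambda p: completion_loss(p[0], x, mask), [0.0])
comp_after = completion_loss(a, x, mask)

# Stage 2 (joint training): start from the pre-trained a, update a and b together.
joint_before = joint_loss(a, 0.0, x, mask, y)
a, b = descend(lambda p: joint_loss(p[0], p[1], x, mask, y), [a, 0.0])
joint_after = joint_loss(a, b, x, mask, y)
```

In stage 2 the prediction error depends on `a` through the reconstructed series, so the completion parameter is adjusted by the prediction result as well as by the completion error, which is the kind of interaction the joint training is meant to provide.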

In the traffic prediction method provided by this embodiment of the present invention, after the weight parameters of the data completion module in the initial model are obtained by pre-training, the data completion module and the prediction module are trained jointly. The resulting prediction completion model has data completion and prediction capabilities and also supports information exchange between the data completion module and the prediction module, which reduces error accumulation during training and improves the accuracy of the prediction results of the prediction completion model.

Based on the above embodiment, jointly training the data completion module and the prediction module in step 230 includes:

jointly training the data completion module and the prediction module in the initial model based on a joint training loss function, where the joint training loss function is constructed from the difference between the initial prediction result of the sample data output by the prediction module in the initial model and the expected value of the sample data, and the difference between the sample missing data and the completion data of the sample data output by the data completion module in the initial model.

Information exchange needs to be established between the data completion module and the prediction module of the prediction completion model, but in the initial model the gradient between the data completion module and the prediction module is interrupted, so the gradient of the data completion module cannot be passed to the prediction module. This embodiment of the present invention therefore jointly trains the data completion module and the prediction module in the initial model with the goal of minimizing the joint training loss.

Specifically, the joint training loss function is determined jointly from the loss function of the data completion module and the loss function of the prediction module in the initial model. It may, for example, be a weighted sum of the two or a nonlinear combination of the two; this embodiment of the present invention does not limit this.

The loss function of the data completion module in the initial model is constructed with the goal of minimizing the difference between the completion data of the sample data and the sample missing data. Further, the loss of the data completion module in the initial model can be calculated by the following formula:

[formula not reproduced in this text]

In the formula, x is the sample missing data, which is compared against the completion data of the sample data; m is the sample missing position matrix, so that when the data completion error is calculated only the data at positions flagged as missing in the sample missing position matrix takes part in the calculation; and β is a hyperparameter.
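Because the formula itself is not available here, the following is only a sketch of one loss that fits the description: an error restricted by the mask to missing positions, with a threshold hyperparameter β. It uses the smooth-L1 (Huber-style) form, which is an assumption on my part and not confirmed by the text; the function name and the sample values are also illustrative. The mask follows the convention used earlier, with 0 marking a missing position.

```python
def masked_smooth_l1(x, x_hat, mask, beta=1.0):
    """Smooth-L1 error averaged over positions where mask == 0 (missing).

    Assumed form per element, with d = |x - x_hat|:
      0.5 * d^2 / beta   if d < beta
      d - 0.5 * beta     otherwise
    """
    total, count = 0.0, 0
    for xi, xh, m in zip(x, x_hat, mask):
        if m == 1:          # observed position: excluded from the loss
            continue
        d = abs(xi - xh)
        total += 0.5 * d * d / beta if d < beta else d - 0.5 * beta
        count += 1
    return total / count if count else 0.0

x = [0.0, 5.0, 0.0, 7.0]       # sample missing data (true values at missing positions)
x_hat = [9.0, 5.5, 9.0, 4.0]   # completion output for every position
mask = [1, 0, 1, 0]
loss = masked_smooth_l1(x, x_hat, mask)
```

Here the large differences at the two observed positions do not contribute; only the errors 0.5 and 3.0 at the missing positions enter the average.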

The loss function of the prediction module in the initial model is constructed with the goal of minimizing the difference between the initial prediction result and the sample data expected value.

Further, the loss of the prediction module in the initial model can be calculated by the following formula:

The formula is defined over the sample data expected value y, the initial prediction result, and a hyperparameter β.

Further, the joint loss can be calculated by the following formula:

In the formula, the two loss terms are the loss values of data completion and of the prediction result, respectively. λ∈[0.0,1.0] is an adjustment factor that sets the maximum share of the data completion module loss in the joint loss function, and γ is a scaling factor that adjusts the initial share of the data completion module loss in the joint loss function.

Based on the foregoing embodiment, FIG. 3 is a schematic flowchart of the method for obtaining the initial prediction result provided by the present invention. As shown in FIG. 3, obtaining the initial prediction result includes the following steps:

Step 310: determine the sample reconstruction data based on the sample data and the completion data of the sample data;

Specifically, according to the spatiotemporal positions of the missing data in the sample data, the completion data of the sample data are inserted at the corresponding spatiotemporal positions to obtain the sample reconstruction data.

It should be noted that the sample data come from a complete, chronologically ordered traffic data sequence collected at fixed time steps for one region over one time period, and a spatiotemporal position denotes the position of a traffic datum within that sequence. For example, with a time step of 1 second and a time period of 1 minute, the complete sequence contains 60 chronologically ordered traffic data; the traffic datum at spatiotemporal position 15 is then the one with index 14 in the sequence, because sequence indices start from 0.

Step 320: determine the sample reconstruction data corresponding to each time granularity based on the sample reconstruction data and each time granularity;

Step 330: determine the sample fusion features based on the sample reconstruction data corresponding to each time granularity;

Step 340: determine the initial prediction result based on the sample fusion features.

If the data are viewed at different time granularities, for example with sampling intervals ranging from short to long, the prediction completion model can learn multiple temporal trend features at different time scales, which improves the accuracy of its prediction results.

Specifically, the sample reconstruction data are reconstructed at each time granularity to obtain the sample reconstruction data corresponding to each time granularity. Spatiotemporal features are extracted from the sample reconstruction data of each time granularity, the resulting spatiotemporal features are fused into the sample fusion features, and a prediction is made from the sample fusion features to obtain the initial prediction result.

It should be noted that the time granularities all differ from one another and cover time intervals from small to large, for example minutes, hours and days; the embodiment of the present invention does not limit this. Reconstructing the sample reconstruction data at a given time granularity can be expressed as aggregating the sample data within adjacent time granularities. For example, if the time granularity is 1 hour, the sample data within the two hours following the time point of the current sample data are merged into the sample reconstruction data corresponding to the current sample data.
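One simple way to picture reconstruction at several time granularities is to aggregate a fine-grained series over windows of increasing length. The sketch below assumes non-overlapping windows and averaging as the aggregation; both are illustrative choices that the description does not prescribe, and the function name `aggregate` is hypothetical.

```python
def aggregate(series, window):
    """Average consecutive, non-overlapping windows of `window` readings.

    Averaging and non-overlapping windows are assumptions made for this sketch.
    """
    if window <= 0:
        raise ValueError("window must be positive")
    usable = len(series) - len(series) % window  # drop an incomplete tail
    return [
        sum(series[i:i + window]) / window
        for i in range(0, usable, window)
    ]


# Two hours of hypothetical 1-minute readings.
minute_series = [float(v) for v in range(120)]
views = {
    "minute": aggregate(minute_series, 1),   # unchanged fine-grained view
    "hour": aggregate(minute_series, 60),    # one value per hour
}
```

Each entry in `views` would then feed the spatiotemporal feature extraction network of the matching granularity.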

In addition, the spatiotemporal features of the sample reconstruction data corresponding to each time granularity may be fused using preset weights or using adaptive weights. The values of the adaptive weights can be adjusted according to the loss given by the fusion loss function; the embodiment of the present invention does not limit this.

Based on the foregoing embodiment, FIG. 4 is a schematic flowchart of the method for obtaining the sample fusion features provided by the present invention. As shown in FIG. 4, step 330 includes:

Step 331: perform spatiotemporal feature extraction on the sample reconstruction data corresponding to each time granularity to obtain the sample spatiotemporal features corresponding to each time granularity;

Step 332: determine the sample spatiotemporal fusion features based on the sample spatiotemporal features corresponding to each time granularity and the adaptive weights;

Step 333: determine the sample fusion features based on the sample spatiotemporal fusion features and the external factor features; the external factor features include weather factor features and/or time period factor features.

Adaptive weights allow the prediction module in the initial model to adjust itself according to the loss calculated by the fusion loss function, that is, the weight proportions are adjusted in a data-driven manner. The trained prediction completion model can thus learn which time granularities yield the spatiotemporal features that most influence the prediction result, which improves prediction accuracy.

At the same time, traffic data are affected by external factors; for example, rainy days are more congested than sunny days, and the morning and evening rush hours are more congested than other periods. The embodiment of the present invention therefore further fuses the external factor features with the sample spatiotemporal features, which further improves prediction accuracy.

Specifically, spatiotemporal features are first extracted from the sample reconstruction data corresponding to each time granularity to obtain the sample spatiotemporal features corresponding to each time granularity. These features are then fused by weighting them with the adaptive weights to obtain the sample spatiotemporal fusion features. Finally, the sample spatiotemporal fusion features are fused with the external factor features to obtain the sample fusion features. The external factors include weather factors and/or time period factors; the time period factors may include morning and evening rush hours, holiday periods and the like, and the external factors may also include temperature, wind speed and the like. The embodiment of the present invention does not limit this.

It should be noted that the spatiotemporal features of the sample reconstruction data corresponding to each time granularity are extracted by the spatiotemporal feature extraction network that the prediction module provides for that time granularity, and the parameters of each of these networks are updated iteratively according to the loss value calculated by the fusion loss function.

Further, the time granularities include a minute granularity, an hour granularity and a day granularity. The minute-granularity spatiotemporal feature extraction network in the prediction module performs spatiotemporal feature extraction on the sample reconstruction data corresponding to the minute granularity to obtain the sample spatiotemporal features corresponding to the minute granularity.

The hour-granularity spatiotemporal feature extraction network in the prediction module performs spatiotemporal feature extraction on the sample reconstruction data corresponding to the hour granularity to obtain the sample spatiotemporal features corresponding to the hour granularity.

The day-granularity spatiotemporal feature extraction network in the prediction module performs spatiotemporal feature extraction on the sample reconstruction data corresponding to the day granularity to obtain the sample spatiotemporal features corresponding to the day granularity.

The minute-, hour- and day-granularity sample spatiotemporal features are then fused, using their respective adaptive weights W_m, W_h and W_d, through a fusion formula to obtain the sample spatiotemporal fusion features. The specific formula is as follows:

In the formula, ⊙ denotes the Hadamard product, and W_m, W_h and W_d are adjusted adaptively in a data-driven manner.
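One reading consistent with this description is a sum of element-wise weighted feature maps, F = W_m ⊙ F_m + W_h ⊙ F_h + W_d ⊙ F_d, where F_m, F_h and F_d are hypothetical names for the minute-, hour- and day-granularity sample spatiotemporal features. The sketch below illustrates that assumed form with fixed weights; in the embodiment the weights would be learned rather than set by hand.

```python
def hadamard(a, b):
    """Element-wise (Hadamard) product of two equally shaped 2-D lists."""
    return [[x * y for x, y in zip(ra, rb)] for ra, rb in zip(a, b)]


def fuse(features, weights):
    """Sum of Hadamard-weighted feature maps, one per time granularity.

    The additive form is an assumption; the original fusion formula is not shown.
    """
    keys = list(features)
    rows, cols = len(features[keys[0]]), len(features[keys[0]][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for key in keys:
        weighted = hadamard(weights[key], features[key])
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weighted[r][c]
    return fused


features = {"minute": [[1.0, 2.0]], "hour": [[3.0, 4.0]], "day": [[5.0, 6.0]]}
weights = {"minute": [[0.5, 0.5]], "hour": [[0.25, 0.25]], "day": [[0.25, 0.25]]}
fused = fuse(features, weights)
```

Because each weight has the same shape as its feature map, the model can emphasize different granularities at different spatiotemporal positions.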

Further, FIG. 5 is a network structure diagram of the spatiotemporal feature extraction network provided by the present invention. As shown in FIG. 5, features are first extracted from the sample data in a residual manner according to the spatiotemporal information in the sample data; the feature dimensions of the extracted features are then adjusted; finally, the adjusted features are input into a dilated causal convolution, which outputs the sample spatiotemporal features corresponding to the sample data. In the figure, BN is batch normalization, Conv2d, Conv2d_1 and Conv2d_2 are two-dimensional convolutional layers, ReLu is the activation function, and ResBlock 1...L are residual module units.
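For readers unfamiliar with dilated causal convolution, the following one-dimensional sketch shows the general technique: each output depends only on the current and earlier inputs, sampled at a spacing given by the dilation. It is a generic illustration, not the layer configuration of FIG. 5; the function name, kernel and dilation values are hypothetical.

```python
def dilated_causal_conv1d(signal, kernel, dilation=1):
    """Compute y[t] = sum_k kernel[k] * signal[t - k * dilation].

    Inputs before t = 0 are treated as zero, so no future value is ever used.
    """
    out = []
    for t in range(len(signal)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:
                acc += w * signal[idx]
        out.append(acc)
    return out


signal = [1.0, 2.0, 3.0, 4.0, 5.0]
# With kernel [1, 1] and dilation 2, each output is signal[t] + signal[t - 2].
out = dilated_causal_conv1d(signal, kernel=[1.0, 1.0], dilation=2)
```

Stacking such layers with growing dilation widens the receptive field over the time axis without increasing the kernel size.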

The traffic prediction method provided by the embodiment of the present invention reconstructs the sample reconstruction data at multiple time granularities, so that the prediction completion model can learn multi-trend spatiotemporal features. It also fuses the external factor features with the sample spatiotemporal fusion features obtained from the time granularities, so that the prediction completion model can learn the influence of external factors on the prediction result. Both measures further improve the accuracy of the prediction result output by the prediction completion model.

Based on the above embodiment, step 220 includes:

Step 410: determine the sample data, the sample missing data and the sample missing position matrix;

It should be noted that the sample data are sample traffic data at different missing-data ratios, the sample missing data are the sample traffic data missing from the corresponding sample data, and the sample missing position matrix marks the spatiotemporal positions of the sample traffic data missing from the corresponding sample data. The sample missing data can serve as the label of the corresponding sample data when the loss is finally calculated.

Step 420: perform spatiotemporal feature extraction on the sample data and, based on the resulting sample spatiotemporal features, complete the sample data to obtain the completion data corresponding to the sample data;

Step 430: determine the completion module loss based on the completion data corresponding to the sample data, the sample missing data and the sample missing position matrix, and iteratively update the parameters of the data completion module based on the completion module loss to obtain the weight parameters of the data completion module.

Specifically, spatiotemporal features are extracted from the sample data to obtain the corresponding sample spatiotemporal features, which are passed through a convolutional layer with an activation layer to obtain the initial completion result. The completion network loss function computes the completion module loss from the initial completion result, the sample missing data and the sample missing position matrix. The parameters of the data completion module are updated iteratively according to this loss until training is complete, yielding the weight parameters of the data completion module. Further, the completion network loss function is:

In the formula, x is the sample missing data.

Based on the above embodiment, the sample data, the sample labeled data and the missing position matrix are determined by the following steps:

Step 510: determine the original traffic data;

It should be noted that the original traffic data can be represented as a complete, chronologically ordered traffic data sequence for one region within one time period. The traffic data in the sequence have undergone standardization preprocessing, which is based on the following formula:

In the formula, x is the traffic data before standardization, x_max and x_min are respectively the maximum and minimum values of the traffic data in the corresponding traffic data sequence, and x′ is the traffic data after standardization. Based on the missing position encoding matrix M, the positions in x′ that correspond to missing data are filled with -1.
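The symbols x_max and x_min suggest min-max scaling, x′ = (x − x_min) / (x_max − x_min); that form is an assumption, since the formula itself is not shown in this text. The sketch below applies it and fills missing positions with -1. The mask convention (1 = observed, 0 = missing) and the choice to compute the extremes over observed readings only are also assumptions, and the function name is hypothetical.

```python
def normalize(sequence, mask, fill_value=-1.0):
    """Min-max scale observed readings and mark gaps with fill_value.

    Assumptions: mask value 1 = observed, 0 = missing; the minimum and
    maximum are taken over observed readings only.
    """
    observed = [x for x, m in zip(sequence, mask) if m == 1]
    x_min, x_max = min(observed), max(observed)
    span = x_max - x_min
    result = []
    for x, m in zip(sequence, mask):
        if m == 0:
            result.append(fill_value)   # gap marker, outside the [0, 1] range
        elif span == 0:
            result.append(0.0)          # constant sequence: avoid division by zero
        else:
            result.append((x - x_min) / span)
    return result


readings = [10.0, 0.0, 30.0, 20.0]  # the value at index 1 is a placeholder for a gap
mask = [1, 0, 1, 1]
scaled = normalize(readings, mask)
```

Since scaled values fall in [0, 1], the fill value -1 cannot be confused with a genuine reading.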

Step 520: randomly delete each preset proportion of traffic data from the original traffic data to obtain the sample data and the sample missing data corresponding to each preset proportion;

Step 530: determine the sample missing position matrix corresponding to each preset proportion based on the spatiotemporal positions of the original traffic data and the spatiotemporal positions, within the original traffic data, of the sample missing data corresponding to each preset proportion;

Step 540: determine the sample data, the sample missing data and the sample missing position matrix based on the sample data, the sample missing data and the sample missing position matrix corresponding to each preset proportion.

Specifically, for each preset proportion, the corresponding amount of traffic data is randomly deleted from the original traffic data. The deleted traffic data are the sample missing data for that preset proportion, and the traffic data that remain are the sample data for that preset proportion. The sample missing position matrix for each preset proportion is then determined from the spatiotemporal positions of the original traffic data and the spatiotemporal positions, within the original traffic data, of the corresponding sample missing data. Finally, the sample data, sample missing data and sample missing position matrices of all preset proportions together determine the sample data, sample missing data and sample missing position matrix used to train the data completion model and the prediction completion model.

It should be noted that the sample missing position matrix for each preset proportion may be determined as follows. An initial missing position matrix is determined from the spatiotemporal positions of the original traffic data. Then, based on the spatiotemporal positions within the original traffic data of the sample missing data for a given preset proportion, the corresponding index positions of the initial missing position matrix are marked as missing, which gives the sample missing position matrix for that preset proportion. Further, every element of the initial missing position matrix is 1, and in the sample missing position matrix the elements at the index positions of missing data carry the missing mark 0.

Further, random deletion is performed on the original traffic data at the preset proportions of 10%, 20%, 30%, 40% and 50%, giving the sample data and the sample missing data corresponding to each of these proportions.
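The construction of sample data, sample missing data and the missing position matrix can be sketched as follows for a flat sequence. The function name `make_sample`, the use of `None` as a gap placeholder and the fixed random seed are illustrative assumptions; real traffic data would be indexed by both space and time.

```python
import random


def make_sample(original, ratio, rng):
    """Randomly delete `ratio` of the entries; return (sample, missing, mask).

    The mask starts as all ones and gets 0 at every deleted position.
    None is a hypothetical placeholder for an absent value.
    """
    n = len(original)
    n_missing = round(n * ratio)
    missing_idx = set(rng.sample(range(n), n_missing))
    mask = [0 if i in missing_idx else 1 for i in range(n)]
    sample = [x if m == 1 else None for x, m in zip(original, mask)]
    missing = [x if m == 0 else None for x, m in zip(original, mask)]
    return sample, missing, mask


rng = random.Random(0)  # fixed seed so the sketch is repeatable
original = [float(v) for v in range(20)]
ratios = (0.1, 0.2, 0.3, 0.4, 0.5)
samples = {ratio: make_sample(original, ratio, rng) for ratio in ratios}
```

Each entry of `samples` holds one training triple; the deleted values double as labels for the completion loss.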

FIG. 6 is the second schematic flowchart of the training method of the prediction completion model provided by the present invention. FIG. 7 is a network structure diagram of the prediction completion model provided by the present invention. As shown in FIG. 6 and FIG. 7, the training method includes:

Step 610: aggregate the originally collected traffic data in chronological order to obtain the original traffic data, and standardize the original traffic data.

Step 620: construct missing data from the original traffic data, specifically:

Random deletion is performed on the original traffic data at the preset proportions of 10%, 20%, 30%, 40% and 50%, giving the sample data and the sample missing data corresponding to each proportion; the corresponding sample missing position matrices are then constructed from these sample data and sample missing data.

Step 630: pre-train the data completion model using the sample data, sample missing data and sample missing position matrices corresponding to the preset proportions of 10%, 20%, 30%, 40% and 50%, and transplant the model parameters of the trained data completion model, namely the weight parameters W^pre and the bias parameters b^pre, into the data completion module of the initial model.

Step 640: reconstruct the sample data together with the completion data of the sample data output by the data completion module of the initial model to obtain the sample reconstruction data. The reconstruction formula is:

$$X^{*} = X \cdot M + X^{im} \cdot M_{re}$$

In the formula, X and X^im denote the original data containing missing values and the completed data, respectively; M is the sample missing position matrix, which is a 0-1 binary adjacency matrix; M_re is the corresponding complement matrix; and X^* is the reconstruction result of the completed data.
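The reconstruction formula can be sketched element by element as follows. The sketch assumes that the products are element-wise and that the complement matrix M_re equals 1 − M, so observed positions keep their original values and missing positions take the completed values. The function name and sample values are hypothetical.

```python
def reconstruct(x, x_im, mask):
    """Apply X* = X * M + X_im * M_re element by element.

    Assumptions: products are element-wise, M_re = 1 - M,
    and mask value 1 = observed, 0 = missing.
    """
    return [
        [xv * m + iv * (1 - m) for xv, iv, m in zip(rx, ri, rm)]
        for rx, ri, rm in zip(x, x_im, mask)
    ]


x = [[0.2, -1.0], [-1.0, 0.8]]     # -1 marks a gap after standardization
x_im = [[0.25, 0.5], [0.3, 0.75]]  # output of a hypothetical completion module
mask = [[1, 0], [0, 1]]
x_star = reconstruct(x, x_im, mask)
```

Observed readings pass through unchanged, so the completion module can only influence the positions that were actually missing.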

Step 650: reconstruct the reconstruction result at the minute, hour and day granularities to capture multiple trends, obtaining the sample reconstruction data corresponding to each time granularity;

Step 660: apply the matching spatiotemporal feature extraction network of the prediction module in the initial model to the sample reconstruction data corresponding to each time granularity to obtain the sample spatiotemporal features corresponding to each time granularity, including the sample spatiotemporal features corresponding to the minute granularity.

Step 670: fuse the sample spatiotemporal features corresponding to each time granularity by adaptive weighting to obtain the sample spatiotemporal fusion features. The specific formula is:

Step 680: concatenate the sample spatiotemporal fusion features with the external factor features to obtain the sample fusion features. The specific formula for the concatenation is:

The external factor features are obtained by first one-hot encoding the external factors and then feeding them into a feature embedding network. Specifically, the external factors corresponding to the different time steps of the sample data are input into the feature embedding network for encoding: the weather, holiday, temperature and wind speed data, together with the time encoding, are input into the environmental feature embedding network. The formula is:

$$X_{ef} = \mathrm{Relu}(W X_{e} + b)$$

In the formula, X_e denotes the external factors, X_ef denotes the extracted external factor features, W denotes the preset weight, b denotes the preset offset, and Relu is the activation function, whose formula is:

In the formula, α is a hyperparameter.
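The embedding step can be sketched for a single time step as follows. Because the activation formula with the hyperparameter α is not shown in this text, the sketch substitutes the standard ReLU, max(0, x); that substitution, the category list, and the weight and offset values are all assumptions made for illustration.

```python
def one_hot(index, size):
    """One-hot vector of length `size` with a 1.0 at `index`."""
    return [1.0 if i == index else 0.0 for i in range(size)]


def relu(values):
    """Standard ReLU, used here as a stand-in for the activation in the text."""
    return [max(0.0, v) for v in values]


def embed(x_e, weight, bias):
    """Compute X_ef = ReLU(W X_e + b) for one time step."""
    linear = [
        sum(w * x for w, x in zip(row, x_e)) + b
        for row, b in zip(weight, bias)
    ]
    return relu(linear)


WEATHER = ["sunny", "rainy", "snowy"]  # hypothetical weather categories
# Weather (3 categories) followed by a holiday flag (2 categories: no, yes).
x_e = one_hot(WEATHER.index("rainy"), len(WEATHER)) + one_hot(1, 2)
weight = [[0.5, -1.0, 0.0, 0.0, 0.5],
          [0.0, 1.0, 0.0, 1.0, 0.5]]   # hypothetical preset weights
bias = [0.0, -0.5]                     # hypothetical preset offsets
x_ef = embed(x_e, weight, bias)
```

The resulting vector would then be concatenated with the sample spatiotemporal fusion features for the same time step.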

Step 690: pass the sample fusion features through multi-step regression to obtain the initial prediction result, calculate the loss from the initial prediction result and the sample data expected value, and iteratively update the parameters of the initial model based on the calculated loss to obtain the prediction completion model.

The traffic prediction device provided by the present invention is described below; the traffic prediction device described below and the traffic prediction method described above may be referred to in correspondence with each other.

FIG. 8 is a schematic structural diagram of the traffic prediction device provided by the present invention. As shown in FIG. 8, the device includes a determination module 810 and a prediction module 820.

Wherein:

the determination module 810 is configured to determine historical traffic data and a missing position matrix of the historical traffic data;

the prediction module 820 is configured to input the historical traffic data and the missing position matrix into the prediction completion model to obtain the prediction result output by the prediction completion model;

In the embodiment of the present invention, the determination module determines the historical traffic data and the missing position matrix of the historical traffic data, and the prediction module inputs the historical traffic data and the missing position matrix into the prediction completion model to obtain the prediction result output by the prediction completion model. The prediction completion model is obtained by training the initial model, starting from the weight parameters of the data completion module in the initial model, with the sample data, the sample missing data, the sample data expected value and the sample missing position matrix. The weight parameters of the data completion module in the initial model are obtained by pre-training on the sample data, the sample missing data and the sample missing position matrix. During training of the prediction completion model, the parameters of the initial model are updated according to the output initial prediction result. This realizes information exchange between the prediction module and the data completion module in the prediction model and achieves end-to-end prediction, improving the accuracy of the prediction result while improving immediacy.

Based on the above embodiment, the traffic prediction device further includes a prediction completion model training module, which includes:

an initial model determination sub-module, configured to determine the initial model, the initial model including a data completion module and a prediction module;

a completion module pre-training sub-module, configured to pre-train the data completion module based on the sample data, the sample missing data, the sample data expected value and the sample missing position matrix to obtain the weight parameters of the data completion module;

an initial model training sub-module, configured to jointly train the data completion module and the prediction module based on the weight parameters, the sample data, the sample missing data, the sample data expected value and the sample missing position matrix to obtain the prediction completion model.

Based on any of the above embodiments, the initial model training sub-module includes:

a joint loss calculation sub-module, configured to jointly train the data completion module and the prediction module based on the joint training loss function; the joint training loss function is constructed from the difference between the sample data expected value and the initial prediction result of the sample data output by the prediction module, and the difference between the sample missing data and the completion data of the sample data output by the data completion module.

Based on any of the above embodiments, the initial prediction sub-module includes:

a reconstruction data determination sub-module, configured to determine the sample reconstruction data based on the sample data and the completion data of the sample data;

a multi-time-granularity reconstruction data determination sub-module, configured to determine the sample reconstruction data corresponding to each time granularity based on the sample reconstruction data and each time granularity;

a feature fusion sub-module, configured to determine the sample fusion features based on the sample reconstruction data corresponding to each time granularity;

a prediction sub-module, configured to determine the initial prediction result based on the sample fusion features.

Based on any of the above embodiments, the feature fusion sub-module includes:

a spatiotemporal feature extraction sub-module, configured to perform spatiotemporal feature extraction on the sample reconstruction data corresponding to each time granularity to obtain the sample spatiotemporal features corresponding to each time granularity;

a spatiotemporal feature fusion sub-module, configured to determine the sample spatiotemporal fusion features based on the sample spatiotemporal features corresponding to each time granularity and the adaptive weights;

an external factor concatenation sub-module, configured to determine the sample fusion features based on the sample spatiotemporal fusion features and the external factor features; the external factor features include weather factor features and/or time period factor features.

Based on any of the above embodiments, the completion module pre-training sub-module includes:

The sample information determination sub-module is used to determine the sample data, the sample missing data and the sample missing position matrix;

The completion data acquisition sub-module is used to perform spatiotemporal feature extraction on the sample data and, based on the resulting sample spatiotemporal features corresponding to the sample data, complete the sample data to obtain the completion data corresponding to the sample data;

The model training sub-module is used to determine a completion module loss based on the completion data corresponding to the sample data, the sample missing data and the sample missing position matrix, and to iteratively update the parameters of the data completion module based on the completion module loss, to obtain the weight parameters of the data completion module.
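The loss-then-update loop of the pre-training stage can be shown with a deliberately small Python example. The completion module here is a toy stand-in, a single linear map `W` across sensors trained by plain gradient descent, and is not the spatiotemporal network of the embodiments. The masked squared-error loss, the zero-filling of deleted entries, the learning rate and all names are assumptions made for illustration.

```python
import numpy as np

def masked_mse(completed, missing_truth, missing_mask):
    """Squared error averaged over the deleted positions only."""
    n = max(missing_mask.sum(), 1)
    return np.sum(missing_mask * (completed - missing_truth) ** 2) / n

def pretrain(sample, missing_truth, missing_mask, steps=200, lr=0.05):
    """Iteratively update the toy completion parameters W.

    sample        : (time, sensors) data with deleted entries zero-filled
    missing_truth : true values at the deleted positions
    missing_mask  : 1 where a value was deleted, 0 elsewhere
    """
    W = np.eye(sample.shape[1])
    n = max(missing_mask.sum(), 1)
    for _ in range(steps):
        # Error of the current completion, restricted to deleted positions.
        residual = missing_mask * (sample @ W - missing_truth)
        # Gradient of masked_mse with respect to W.
        grad = 2.0 * sample.T @ residual / n
        W -= lr * grad
    return W
```

The returned `W` plays the role of the pre-trained weight parameters that a later joint training stage would start from.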

Based on any of the above embodiments, the sample information determination sub-module includes:

The original data determination sub-module is used to determine original traffic data;

The per-proportion sample information determination sub-module is used to randomly delete, for each preset proportion, the corresponding quantity of traffic data from the original traffic data, to obtain the sample data and the sample missing data corresponding to each preset proportion;

The per-proportion sample matrix determination sub-module is used to determine the sample missing position matrix corresponding to each preset proportion, based on the spatiotemporal positions of the original traffic data and the spatiotemporal positions, within the original traffic data, of the sample missing data corresponding to that preset proportion;

The sample information and matrix determination sub-module is used to determine the sample data, the sample missing data and the sample missing position matrix based on the sample data, the sample missing data and the sample missing position matrix corresponding to each preset proportion.
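A possible way to generate these training triples is sketched below in Python. The preset proportions, the zero-filling of deleted entries, the convention that the missing position matrix holds 1 at deleted entries, and the function name `make_samples` are all illustrative assumptions.

```python
import numpy as np

def make_samples(original, ratios=(0.1, 0.3, 0.5), seed=0):
    """Return one (sample, missing, mask) triple per preset proportion.

    original : (time, sensors) array of complete traffic data
    ratios   : hypothetical preset proportions of entries to delete
    """
    rng = np.random.default_rng(seed)
    triples = []
    for r in ratios:
        n_delete = int(round(r * original.size))
        # Pick distinct positions to delete, uniformly at random.
        idx = rng.choice(original.size, size=n_delete, replace=False)
        mask = np.zeros(original.size, dtype=int)
        mask[idx] = 1
        mask = mask.reshape(original.shape)
        sample = np.where(mask == 1, 0.0, original)   # data after deletion
        missing = np.where(mask == 1, original, 0.0)  # the deleted values only
        triples.append((sample, missing, mask))
    return triples
```

Training on several proportions at once exposes the completion module to both lightly and heavily degraded inputs; under the conventions above, `sample + missing` always recovers the original data.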

FIG. 9 is a schematic diagram of the physical structure of an electronic device. As shown in FIG. 9, the electronic device may include a processor 910, a communications interface 920, a memory 930 and a communication bus 940, wherein the processor 910, the communications interface 920 and the memory 930 communicate with one another through the communication bus 940. The processor 910 may invoke logic instructions in the memory 930 to execute a traffic prediction method, the method including: determining historical traffic data and a missing position matrix of the historical traffic data; inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model; wherein the prediction completion model is obtained by training an initial model, on the basis of the weight parameters of the data completion module in the initial model, by applying sample data, sample missing data, a sample data expected value and a sample missing position matrix; and the weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.
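The first step of the method, determining the missing position matrix of the historical traffic data, might look like the following Python sketch. It assumes that absent readings are encoded as NaN, that the matrix holds 1 at missing positions, and that gaps are zero-filled before the data enter the model; the text does not fix any of these conventions, and the function name is hypothetical.

```python
import numpy as np

def missing_position_matrix(history):
    """1 where a reading is absent (NaN, assumed encoding), 0 where observed."""
    return np.isnan(history).astype(int)

# Rows are time steps, columns are sensors (illustrative values).
history = np.array([[12.0, np.nan],
                    [np.nan, 9.0],
                    [11.0, 10.0]])
mask = missing_position_matrix(history)
# Zero-fill the gaps so the array can be fed to a model together with the mask.
model_input = np.nan_to_num(history, nan=0.0)
```

The pair `(model_input, mask)` corresponds to the two inputs that the prediction completion model receives.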

In addition, when the logic instructions in the memory 930 are implemented in the form of software functional units and sold or used as an independent product, they may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute all or part of the steps of the methods described in the embodiments of the present invention. The aforementioned storage medium includes various media that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk or an optical disc.

In another aspect, the present invention further provides a computer program product. The computer program product includes a computer program that can be stored on a non-transitory computer-readable storage medium; when the computer program is executed by a processor, the computer can execute the traffic prediction method provided by the above methods, the method including: determining historical traffic data and a missing position matrix of the historical traffic data; inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model; wherein the prediction completion model is obtained by training an initial model, on the basis of the weight parameters of the data completion module in the initial model, by applying sample data, sample missing data, a sample data expected value and a sample missing position matrix; and the weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.

In yet another aspect, the present invention further provides a non-transitory computer-readable storage medium on which a computer program is stored; when executed by a processor, the computer program implements the traffic prediction method provided by the above methods, the method including: determining historical traffic data and a missing position matrix of the historical traffic data; inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model; wherein the prediction completion model is obtained by training an initial model, on the basis of the weight parameters of the data completion module in the initial model, by applying sample data, sample missing data, a sample data expected value and a sample missing position matrix; and the weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.

The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed over multiple network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement it without creative effort.

From the description of the above embodiments, those skilled in the art can clearly understand that each embodiment can be implemented by software together with a necessary general-purpose hardware platform, and can of course also be implemented by hardware. Based on this understanding, the above technical solutions in essence, or the parts of them that contribute to the prior art, may be embodied in the form of a software product. The computer software product may be stored in a computer-readable storage medium, such as a ROM/RAM, a magnetic disk or an optical disc, and includes several instructions for causing a computer device (which may be a personal computer, a server, a network device, or the like) to execute the methods described in the embodiments or in certain parts of the embodiments.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some of their technical features may be replaced with equivalents; such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.

Claims

1. A traffic prediction method, comprising:

determining historical traffic data and a missing position matrix of the historical traffic data;

inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model;

wherein the prediction completion model is obtained by training an initial model, on the basis of weight parameters of a data completion module in the initial model, by applying sample data, sample missing data, a sample data expected value and a sample missing position matrix; and the weight parameters of the data completion module in the initial model are obtained by pre-training based on the sample data, the sample missing data and the sample missing position matrix.

2. The traffic prediction method according to claim 1, wherein the training step of the prediction completion model comprises:

determining the initial model; the initial model comprises a data completion module and a prediction module;

pre-training the data completion module based on the sample data, the sample missing data, the sample data expected value and the sample missing position matrix to obtain a weight parameter of the data completion module;

and performing joint training on the data completion module and the prediction module based on the weight parameter, the sample data, the sample missing data, the sample data expected value and the sample missing position matrix to obtain the prediction completion model.

3. The traffic prediction method of claim 2, wherein the jointly training the data completion module and the prediction module comprises:

performing joint training on the data completion module and the prediction module based on a joint training loss function; the joint training loss function is constructed from the difference between the sample data expected value and the initial prediction result of the sample data output by the prediction module, and the difference between the sample missing data and the completion data of the sample data output by the data completion module.

4. The traffic prediction method of claim 3, wherein the step of obtaining the initial prediction result comprises:

determining sample reconstruction data based on the sample data and the completion data of the sample data;

determining sample reconstruction data corresponding to each time granularity based on the sample reconstruction data and each time granularity;

determining sample fusion features based on the sample reconstruction data corresponding to each time granularity;

and determining the initial prediction result based on the sample fusion features.

5. The traffic prediction method according to claim 4, wherein the determining sample fusion features based on the sample reconstruction data corresponding to each time granularity comprises:

performing spatiotemporal feature extraction on the sample reconstruction data corresponding to each time granularity to obtain sample spatiotemporal features corresponding to each time granularity;

determining sample spatiotemporal fusion features based on the sample spatiotemporal features corresponding to each time granularity and adaptive weights;

determining the sample fusion features based on the sample spatiotemporal fusion features and external factor features; the external factor features include weather factor features and/or time period factor features.

6. The traffic prediction method according to any one of claims 2 to 5, wherein the pre-training the data completion module based on the sample data, the sample missing data, the sample data expected value and the sample missing position matrix to obtain the weight parameter of the data completion module comprises:

determining the sample data, the sample missing data and the sample missing position matrix;

performing spatiotemporal feature extraction on the sample data, and completing the sample data based on the obtained sample spatiotemporal features corresponding to the sample data to obtain completion data corresponding to the sample data;

and determining a completion module loss based on the completion data corresponding to the sample data, the sample missing data and the sample missing position matrix, and iteratively updating the parameters of the data completion module based on the completion module loss to obtain the weight parameters of the data completion module.

7. The traffic prediction method of claim 6, wherein the sample data, the sample missing data and the sample missing position matrix are determined by:

determining original traffic data;

randomly deleting, for each preset proportion, the corresponding quantity of traffic data from the original traffic data to obtain sample data corresponding to each preset proportion quantity and sample missing data corresponding to each preset proportion quantity;

determining a sample missing position matrix corresponding to each preset proportion quantity based on the spatiotemporal positions of the original traffic data and the spatiotemporal positions, in the original traffic data, of the sample missing data corresponding to each preset proportion quantity;

and determining the sample data, the sample missing data and the sample missing position matrix based on the sample data corresponding to each preset proportion quantity, the sample missing data corresponding to each preset proportion quantity and the sample missing position matrix corresponding to each preset proportion quantity.

8. A traffic prediction apparatus, comprising:

a determination module for determining historical traffic data and a missing position matrix of the historical traffic data;

a prediction module for inputting the historical traffic data and the missing position matrix into a prediction completion model to obtain a prediction result output by the prediction completion model;

9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor when executing the program implements the traffic prediction method according to any of claims 1 to 7.

10. A non-transitory computer-readable storage medium having stored thereon a computer program, which when executed by a processor implements the traffic prediction method according to any one of claims 1 to 7.