CN115909746A

CN115909746A - Traffic flow prediction method, system and medium based on federated learning

Info

Publication number: CN115909746A
Application number: CN202310006261.3A
Authority: CN
Inventors: 鲁鸣鸣; 何文勇
Original assignee: Central South University
Current assignee: Central South University
Priority date: 2023-01-04
Filing date: 2023-01-04
Publication date: 2023-04-04

Abstract

The invention discloses a traffic flow prediction method, system and medium based on federated learning. The method comprises the following steps: setting the hyperparameters of a server model and initializing it to obtain a global model; distributing the global model from the server to the clients to obtain the local models; updating each local model on its client using the local traffic flow data set; calculating the correlation between each local model update and the global model update, and screening the clients according to that correlation; sending the local model parameters of the screened clients to the server; aggregating the received local model parameters at the server to complete the global model update; repeating these steps until the models converge; and finally predicting the traffic flow at each client using the converged local model. The invention avoids uploading invalid parameters and reduces the communication overhead of federated learning training.

Description

A traffic flow prediction method, system and medium based on federated learning

Technical Field

The present invention belongs to the field of intelligent transportation technology, and in particular relates to a traffic flow prediction method, system and medium based on federated learning.

Background Art

As urbanization accelerates, cities are becoming more densely populated, the number of private cars keeps rising sharply, and residents' demand for public transportation services is growing rapidly. On the one hand, the large volume of vehicle exhaust is causing the urban environment to deteriorate; on the other hand, dense vehicle travel is making traffic congestion increasingly severe. These problems consume substantial manpower and money and seriously degrade people's travel experience. How to relieve traffic pressure and improve urban travel efficiency is therefore an urgent research problem. In recent years, intelligent transportation systems (ITS) [1] have been able to prevent some traffic congestion and accidents in time by providing sound traffic management decisions, so their research and development has attracted growing attention. Traffic flow prediction is an important research area within intelligent transportation systems [2]. It describes the real-time traffic volume on each road and captures how traffic flow on each road changes over time, and it has therefore been widely used in transportation applications.

Deep learning currently performs well on a range of traffic prediction problems. Existing traffic flow prediction methods mainly focus on training and running machine learning models in the cloud on traffic flow data collected by sensors. This leaves the computing power of edge sensor devices underused and exposes the data to the risk of leakage during transmission. In addition, because the volume of traffic data is growing explosively while GPU/CPU computing power improves relatively slowly, cloud servers fall far short of the computing needs of real scenarios [3]. In recent years, as the computing and storage capabilities of terminal devices have improved substantially, researchers have begun to move some services and computation onto terminal devices and to combine this with federated learning (FL) [4] into a new solution. This solution turns centralized training into collaborative, distributed training across terminal devices, which effectively addresses the problems above. However, existing work on federated traffic flow prediction is either too simple and lacks representational capacity [5], so the model predicts poorly, or relies on an overly complex algorithm design [6], which incurs heavy communication overhead between the edge and the computing side.

Summary of the Invention

To solve the problems of insufficient performance and heavy communication overhead in existing traffic flow prediction methods based on federated learning, the present invention proposes a traffic flow prediction method, system and medium that is based on federated learning and is lighter in communication. It avoids uploading invalid parameters and reduces the communication overhead of the federated learning training process.

To achieve the above technical objective, the present invention adopts the following technical solutions:

A traffic flow prediction method based on federated learning includes the following steps:

Step 1: Set the hyperparameters of the server model and initialize the basic structure of the model to obtain the global model parameters. The subscript G of the parameters denotes the global model and the superscript r denotes the iteration round, with r = 0 at initialization;

Step 2: The server distributes the global model parameters to the clients, giving each client its local model parameters. The subscripts of these parameters index the individual client local models;

Step 3: Each client trains its local model on its own local traffic flow data set, completing the current round of local model updates and yielding the updated parameters of each local model;

Step 4: Calculate the correlation between each client's local model update and the server's global model update, and use this correlation to select the clients that will upload their local model parameters to the server in this round;

Step 5: Each selected client sends its local model parameters to the server;

Step 6: The server aggregates the received local model parameters, which constitutes the current round of the global model update and yields the updated global model parameters;

Step 7: Update the round counter r and repeat Steps 2 to 6 until the global model and the local models converge;

Step 8: Each client uses its local model to predict its local traffic flow.

Furthermore, the hyperparameters set in Step 1 include the learning rate, the number of communication rounds, and the number of local training rounds.

Furthermore, the basic structure of the model adopts an Encoder-Decoder architecture. The Encoder module uses a gated recurrent neural network to capture the contextual information in the input traffic flow time series, that is, it converts the temporal dynamics hidden in the input time series into an intermediate hidden vector. The Decoder module uses a gated recurrent neural network and a fully connected network to predict the result.
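For reference, one common formulation of a gated recurrent unit (GRU) cell that such an Encoder or Decoder could use is sketched below. The symbols (input x_t, hidden state h_t, update gate z_t, reset gate r_t, weight matrices W and U, biases b) are illustrative assumptions and are not taken from the patent; in particular, the reset gate r_t is unrelated to the round index r used elsewhere in this document.

```latex
\begin{aligned}
z_t &= \sigma\left(W_z x_t + U_z h_{t-1} + b_z\right) \\
r_t &= \sigma\left(W_r x_t + U_r h_{t-1} + b_r\right) \\
\tilde{h}_t &= \tanh\left(W_h x_t + U_h \left(r_t \odot h_{t-1}\right) + b_h\right) \\
h_t &= \left(1 - z_t\right) \odot h_{t-1} + z_t \odot \tilde{h}_t
\end{aligned}
```

Under this formulation, the Encoder's intermediate hidden vector would be the hidden state h_t obtained after the last element of the input series has been processed.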

Further, Step 4 includes:

(1) Calculate the update of each client's local model in round r. The local model update of the k-th client in round r is determined by the parameters of the k-th client's local model in two successive rounds;

(2) Calculate the update of the server's global model in the current round, approximating it by the global model update of the preceding round;

(3) Calculate the correlation between each client's local model update and the global model update, measured by the number of parameters whose update values have the same sign in the two updates. For client k, the comparison runs over the model parameters, where M is the total number of model parameters: the update value of each parameter of the local model is compared with the update value of the corresponding parameter of the global model G, and sgn denotes the sign function;

(4) Screen all clients: if the correlation between a client's local model update and the global model update is greater than the set correlation threshold, that client's local model qualifies for upload to the server.
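The screening in sub-steps (3) and (4) can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the patent's implementation: the function names `relevance` and `select_clients` are hypothetical, parameters are flattened into plain lists, and the count of sign-matching parameters is divided by M so that the score lies in [0, 1] (the text only states that the count is what is measured).

```python
from typing import Dict, List, Sequence


def sgn(x: float) -> int:
    """Sign function: -1 for negative, 0 for zero, +1 for positive."""
    return (x > 0) - (x < 0)


def relevance(local_update: Sequence[float],
              global_update: Sequence[float]) -> float:
    """Share of the M parameters whose local and global updates have the same sign.

    Assumption: the sign-match count is normalised by M.
    """
    m = len(local_update)
    if m == 0 or m != len(global_update):
        raise ValueError("updates must be non-empty and of equal length")
    same = sum(1 for u, g in zip(local_update, global_update) if sgn(u) == sgn(g))
    return same / m


def select_clients(local_updates: Dict[str, Sequence[float]],
                   global_update: Sequence[float],
                   threshold: float) -> List[str]:
    """Return the ids of clients whose relevance is strictly above the threshold."""
    return [cid for cid, upd in local_updates.items()
            if relevance(upd, global_update) > threshold]


# Hypothetical example: client "a" agrees with the global direction on all
# four parameters, client "b" on only one of four.
updates = {"a": [0.2, -0.1, 0.3, 0.05], "b": [-0.2, 0.1, 0.3, -0.05]}
estimated_global_update = [0.1, -0.2, 0.1, 0.02]
selected = select_clients(updates, estimated_global_update, threshold=0.5)
```

With a threshold of 0.5, only client "a" would upload its parameters in this example, which is how the screening saves communication.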

Furthermore, the server aggregates the received local model parameters using the FedAVG algorithm.
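As a rough illustration of this aggregation, the sketch below averages flattened parameter vectors, weighting each client by the size of its local data set, which is the usual form of federated averaging. The function name `fedavg` and the use of sample counts as weights are assumptions; the text does not state which weights are used.

```python
from typing import List, Sequence


def fedavg(client_params: Sequence[Sequence[float]],
           num_samples: Sequence[int]) -> List[float]:
    """Weighted average of client parameter vectors (weights = sample counts)."""
    if not client_params or len(client_params) != len(num_samples):
        raise ValueError("need one sample count per client parameter vector")
    total = sum(num_samples)
    if total <= 0:
        raise ValueError("total number of samples must be positive")
    dim = len(client_params[0])
    aggregated = [0.0] * dim
    for params, n in zip(client_params, num_samples):
        if len(params) != dim:
            raise ValueError("all parameter vectors must have the same length")
        for j, p in enumerate(params):
            aggregated[j] += p * n / total
    return aggregated


# Hypothetical example: the second client holds three times as much data,
# so the result lies closer to its parameters.
new_global = fedavg([[1.0, 2.0], [3.0, 4.0]], [1, 3])
```

Only the clients that passed the screening contribute to this average, so the number of vectors passed in can vary from round to round.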

A traffic flow prediction system based on federated learning includes one server and a number of clients. The server and each client include a memory and a processor; the memory stores a computer program which, when executed by the processors, causes the processors of the server and the clients to jointly implement the traffic flow prediction method based on federated learning described in any of the above technical solutions.

A computer-readable storage medium stores a computer program which, when executed by a processor, implements the traffic flow prediction method based on federated learning described in any of the above technical solutions.

Beneficial Effects

Compared with the prior art, the present invention has the following beneficial effects:

The method calculates the correlation between each client's local model update and the server's global model update, and uploads only those client model parameters that the correlation shows to be effective for the global model update. This avoids uploading invalid parameters and greatly reduces the overall communication overhead of the system while keeping accuracy comparable. In addition, compared with traditional centralized methods, the method makes full use of the computing power of terminal devices and protects the privacy of the data on each device.

Brief Description of the Drawings

FIG. 1 is a flow chart of a specific implementation of the method of the present invention;

FIG. 2 is a schematic diagram of the structure of the global model in the method of the present invention;

FIG. 3 is a schematic diagram of the overall structure of the method of the present invention;

FIG. 4 compares the prediction accuracy of the method of the present invention with that of other algorithms;

FIG. 5 compares the communication overhead of the method of the present invention with that of other algorithms.

Detailed Description

To help those skilled in the art better understand the technical solutions in the embodiments of the present invention, and to make the above objects, features and advantages of the present invention clearer, the technical solutions are described in further detail below with reference to the accompanying drawings.

The flow chart of a specific embodiment of the method is shown in FIG. 1, and the process is as follows:

Step 1: Set the hyperparameters of the server model, such as the learning rate, the number of communication rounds and the number of local training rounds, and initialize the basic structure of the model to obtain the global model parameters. The subscript G of the parameters denotes the global model and the superscript r denotes the iteration round; r = 0 at initialization, so the initialized global model carries the superscript 0.

To better capture the dynamically changing temporal features in traffic flow data, the method adopts an Encoder-Decoder architecture as the basic structure of the global model, as shown in FIG. 2. The Encoder module uses a gated recurrent neural network (GRU) to capture contextual information along the time dimension, converting the temporal dynamics hidden in the input time series into an intermediate hidden vector. The Decoder module uses a GRU and a fully connected network (FNN) to predict the result. Initializing the global model (r = 0, where r denotes the iteration round) prepares it for the subsequent model distribution.

Step 2: The server distributes the global model parameters to the clients, giving each client its local model parameters. The subscripts of these parameters index the individual client local models.

Process ① in FIG. 3 represents the server distributing the parameters to each client.

Step 3: Each client trains its local model on its own local traffic flow data set, completing the current round of local model updates and yielding the updated parameters of each local model.

Process ② in FIG. 3 represents the local model training process on the client.

First, the method uses X to denote an input traffic flow time series, where n is the length of the series, the i-th element of X is the traffic flow data at the i-th time point, and D is the feature dimension of the traffic flow data at each time point. After reading the whole series X, the Encoder outputs an intermediate hidden state vector that represents the latent dynamic temporal pattern of the series. On client k, it is computed as follows:

[Equation (1)]

Equation (1) includes the initial hidden state vector as one of its terms.

Second, the method of this embodiment represents the time series predicted by the model as a sequence of length n, where n is the length of the predicted sequence. The defining feature of the Decoder module is that the input at each time step is the output of the previous time step. Note that the input at the first time step is the intermediate hidden vector output by the Encoder together with the last value of the input sequence. On client k, it is computed as follows:

[Equation (2)]
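The Encoder and Decoder passes described above can be sketched in plain Python as follows. This is a minimal, untrained illustration under several assumptions: the class names `GRUCell` and `Seq2Seq` are hypothetical, the GRU omits bias terms, the initial hidden state is a zero vector, weights are drawn at random, and the fully connected layer maps the hidden state back to the D input features so that each prediction can be fed in as the next Decoder input.

```python
import math
import random
from typing import List

Vector = List[float]


def _sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def _matvec(mat: List[Vector], vec: Vector) -> Vector:
    return [sum(w * v for w, v in zip(row, vec)) for row in mat]


def _rand_matrix(rows: int, cols: int, rng: random.Random) -> List[Vector]:
    return [[rng.uniform(-0.1, 0.1) for _ in range(cols)] for _ in range(rows)]


class GRUCell:
    """Minimal GRU cell (no biases) acting on the concatenation [x, h]."""

    def __init__(self, input_size: int, hidden_size: int, rng: random.Random):
        cols = input_size + hidden_size
        self.w_update = _rand_matrix(hidden_size, cols, rng)
        self.w_reset = _rand_matrix(hidden_size, cols, rng)
        self.w_cand = _rand_matrix(hidden_size, cols, rng)

    def step(self, x: Vector, h: Vector) -> Vector:
        z = [_sigmoid(a) for a in _matvec(self.w_update, x + h)]   # update gate
        r = [_sigmoid(a) for a in _matvec(self.w_reset, x + h)]    # reset gate
        gated_h = [ri * hi for ri, hi in zip(r, h)]
        cand = [math.tanh(a) for a in _matvec(self.w_cand, x + gated_h)]
        return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h, cand)]


class Seq2Seq:
    """Encoder-Decoder forward pass only; training is out of scope here."""

    def __init__(self, feature_dim: int, hidden_size: int, seed: int = 0):
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        self.encoder = GRUCell(feature_dim, hidden_size, rng)
        self.decoder = GRUCell(feature_dim, hidden_size, rng)
        self.fc = _rand_matrix(feature_dim, hidden_size, rng)  # fully connected layer

    def encode(self, series: List[Vector]) -> Vector:
        h = [0.0] * self.hidden_size          # assumed zero initial hidden state
        for x in series:
            h = self.encoder.step(x, h)
        return h                              # intermediate hidden vector

    def predict(self, series: List[Vector], horizon: int) -> List[Vector]:
        if not series:
            raise ValueError("input series must not be empty")
        h = self.encode(series)
        step_input = series[-1]               # last value of the input sequence
        outputs = []
        for _ in range(horizon):
            h = self.decoder.step(step_input, h)
            y = _matvec(self.fc, h)
            outputs.append(y)
            step_input = y                    # feed the output back as next input
        return outputs


model = Seq2Seq(feature_dim=2, hidden_size=4, seed=0)
history = [[0.1, 0.2], [0.2, 0.1], [0.3, 0.4]]
forecast = model.predict(history, horizon=5)
```

The feedback line `step_input = y` is what makes the Decoder autoregressive: apart from the first step, it never sees ground-truth values during prediction.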

Finally, the method of this embodiment updates the parameters by gradient descent, as follows:

[Equation (3)]

Equation (3) involves the learning rate and the computed gradient.
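A generic gradient descent update of this kind can be sketched as follows. The helper names `gradient_descent_step` and `local_training`, and the toy objective used in the example, are hypothetical; the patent's actual loss function is not given in this text.

```python
from typing import Callable, List, Sequence


def gradient_descent_step(params: Sequence[float], grads: Sequence[float],
                          learning_rate: float) -> List[float]:
    """Move each parameter against its gradient, scaled by the learning rate."""
    return [p - learning_rate * g for p, g in zip(params, grads)]


def local_training(params: Sequence[float],
                   grad_fn: Callable[[List[float]], List[float]],
                   learning_rate: float, local_epochs: int) -> List[float]:
    """Apply the update repeatedly for a fixed number of local training rounds."""
    current = list(params)
    for _ in range(local_epochs):
        current = gradient_descent_step(current, grad_fn(current), learning_rate)
    return current


# Toy objective f(w) = (w - 3)^2 with gradient 2 * (w - 3); its minimum is w = 3.
trained = local_training([0.0], lambda w: [2.0 * (w[0] - 3.0)],
                         learning_rate=0.1, local_epochs=100)
```

The learning rate and the number of local training rounds are the same hyperparameters that Step 1 sets on the server.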

Step 4: Calculate the correlation between each client's local model update and the server's global model update, and use this correlation to select the clients that will upload their local model parameters to the server in this round.

Process ③ in FIG. 3 represents the calculation of the correlation between each client's local model update and the global model update. The method of this embodiment divides it into the following four key processes:

(1) Calculate the update of each client's local model in round r:

[Equation (4)]

Equation (4) defines the local model update of the k-th client in round r in terms of the parameters of the k-th client's local model in two successive rounds;

(2) Calculate the update of the server's global model in the current round, approximating it by the global model update of the preceding round:

[Equation (5)]

Equation (5) gives the global model update of the server that is used for this estimate;

(3) Calculate the correlation between each client's local model update and the global model update, measured by the number of parameters whose update values have the same sign in the two updates:

[Equation (6)]

[Equation (7)]

These equations give the correlation between the local model update of client k and the global model update. They run over the index of the model parameters, where M is the total number of model parameters, and compare the update value of each parameter of the local model with the update value of the corresponding parameter of the global model G; sgn denotes the sign function;

(4) Screen all clients: if the correlation between a client's local model update and the global model update is greater than the set correlation threshold, that client's local model qualifies for upload to the server.
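As an illustration, the quantities defined in sub-steps (1) to (3) can be written in one plausible notation as follows. The symbols W (parameters), Δ (update), e (correlation), j (parameter index) and the indicator function, as well as the exact round offsets and the normalization by M, are assumptions chosen to match the verbal definitions; they are not the patent's own formulas.

```latex
\begin{aligned}
\Delta_k^{r} &= W_k^{r} - W_k^{r-1} \\
\Delta_G^{r} &\approx \Delta_G^{r-1} \\
e_k^{r} &= \frac{1}{M} \sum_{j=1}^{M}
  \mathbb{1}\left[\operatorname{sgn}\left(\Delta_{k,j}^{r}\right)
  = \operatorname{sgn}\left(\Delta_{G,j}^{r}\right)\right] \\
\mathbb{1}\left[a = b\right] &=
  \begin{cases} 1, & a = b \\ 0, & a \neq b \end{cases}
\end{aligned}
```

In this notation, client k would upload its parameters in round r only when e_k^r exceeds the correlation threshold.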

Step 5: Each selected client sends its local model parameters to the server.

Process ④ in FIG. 3 represents each client sending its local model parameters to the server.

Step 6: The server uses the FedAVG algorithm to aggregate the received local model parameters, which constitutes the current round of the global model update and yields the updated global model parameters.

Process ⑤ in FIG. 3 represents the server aggregating the parameters.

Step 7: Update the round counter r and repeat Steps 2 to 6 until the global model and the local models converge.
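Steps 1 to 7 can be tied together in a toy simulation such as the one below. It is a sketch under stated assumptions rather than the patent's implementation: the model is a two-parameter linear regression instead of the Encoder-Decoder network, a fixed number of rounds replaces the convergence check, every client uploads in the first round because no global update exists yet, and the global model is left unchanged in a round where no client passes the screening. All function names are hypothetical.

```python
from typing import List, Optional, Tuple

Params = List[float]
Dataset = List[Tuple[float, float]]  # (x, y) pairs


def sgn(v: float) -> int:
    return (v > 0) - (v < 0)


def local_train(params: Params, data: Dataset, lr: float, epochs: int) -> Params:
    """Full-batch gradient descent on mean squared error for y = w0 + w1 * x."""
    w0, w1 = params
    for _ in range(epochs):
        g0 = sum(2 * (w0 + w1 * x - y) for x, y in data) / len(data)
        g1 = sum(2 * (w0 + w1 * x - y) * x for x, y in data) / len(data)
        w0, w1 = w0 - lr * g0, w1 - lr * g1
    return [w0, w1]


def relevance(local_upd: Params, global_upd: Params) -> float:
    """Share of parameters whose local and global updates have the same sign."""
    same = sum(1 for u, g in zip(local_upd, global_upd) if sgn(u) == sgn(g))
    return same / len(local_upd)


def train(datasets: List[Dataset], rounds: int, lr: float, epochs: int,
          threshold: float) -> Tuple[Params, int]:
    global_params: Params = [0.0, 0.0]               # Step 1: initialise (r = 0)
    prev_global_update: Optional[Params] = None      # no global update yet
    uploads = 0                                      # number of client uploads
    for _ in range(rounds):                          # Step 7: repeat Steps 2-6
        received, weights = [], []
        for data in datasets:
            # Steps 2-3: start from the distributed global model, train locally
            local = local_train(list(global_params), data, lr, epochs)
            update = [l - g for l, g in zip(local, global_params)]
            # Step 4: screen by relevance to the previous global update
            if (prev_global_update is None
                    or relevance(update, prev_global_update) > threshold):
                received.append(local)               # Step 5: upload
                weights.append(len(data))
        if not received:                             # assumption: skip the round
            continue
        uploads += len(received)
        total = sum(weights)
        new_global = [sum(p[j] * n for p, n in zip(received, weights)) / total
                      for j in range(len(global_params))]    # Step 6: aggregate
        prev_global_update = [n - o for n, o in zip(new_global, global_params)]
        global_params = new_global
    return global_params, uploads


# Three hypothetical clients observing the same relation y = 1 + 2x on
# different ranges of x.
client_data = [[(x, 1.0 + 2.0 * x) for x in xs]
               for xs in ([0.0, 0.1, 0.2, 0.3], [0.4, 0.5, 0.6], [0.7, 0.8, 0.9, 1.0])]
final_params, total_uploads = train(client_data, rounds=20, lr=0.1, epochs=5,
                                    threshold=0.4)
```

Comparing `total_uploads` with `rounds * len(client_data)`, the number of uploads plain federated averaging would need, gives a rough measure of the communication saved by the screening.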

In the traffic flow prediction experiments of this embodiment, the main evaluation metrics are mean squared error, root mean squared error and mean absolute error. The data sets used are the METR-LA and PEMS-BAY traffic flow data sets. METR-LA contains traffic speed readings from 207 sensors on Los Angeles County highways over 4 months, with each sensor sampled every five minutes, for a total of 34,249 data samples. PEMS-BAY contains traffic speed readings from 325 sensors on Bay Area highways over 6 months, for a total of 52,093 data samples.
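For reference, these error metrics have standard definitions, sketched below in Python. MAPE is included because it appears in the comparisons reported for this embodiment; the function names and the example values are hypothetical and are not experimental data.

```python
import math
from typing import Sequence


def mse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean squared error."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def rmse(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Root mean squared error."""
    return math.sqrt(mse(y_true, y_pred))


def mae(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mape(y_true: Sequence[float], y_pred: Sequence[float]) -> float:
    """Mean absolute percentage error in percent; true values must be non-zero."""
    return 100.0 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)


# Made-up speeds purely to exercise the functions.
observed = [10.0, 20.0, 30.0]
predicted = [12.0, 18.0, 33.0]
scores = {"MSE": mse(observed, predicted), "RMSE": rmse(observed, predicted),
          "MAE": mae(observed, predicted), "MAPE": mape(observed, predicted)}
```

RMSE and MAE are in the units of the data (here, speed), while MAPE is relative, which is why zero-valued readings have to be excluded or masked before it is computed.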

FIG. 4(a) compares the prediction performance of the method of this embodiment (CM-FedSeq2Seq) with that of other methods on the METR-LA data set. First, compared with several classic centralized methods, CM-FedSeq2Seq achieves better results on multiple evaluation metrics. In particular, it reduces the MAE and MAPE errors by 38.4% and 44.0%, respectively, relative to ARIMA. In addition, CM-FedSeq2Seq also clearly outperforms the two most recent federated learning methods, FedGRU and CNFGNN. Relative to FedGRU, it reduces the error on each metric by 24.6%, 19.2% and 25.1%, respectively; relative to CNFGNN, it reduces the RMSE by 4.5% (CNFGNN reports only the RMSE, so only the RMSE is compared here).

FIG. 4(b) compares the prediction performance of the method of this embodiment (CM-FedSeq2Seq) with that of other methods on the PEMS-BAY data set. First, compared with the centralized methods considered, CM-FedSeq2Seq achieves better results on every evaluation metric. In particular, relative to FC-LSTM (one of the more accurate models), it reduces the MAE, RMSE and MAPE errors by 27.4%, 23.4% and 29.8%, respectively. CM-FedSeq2Seq also outperforms the two most recent federated learning methods, FedGRU and CNFGNN. Relative to FedGRU, it reduces the error on each metric by 22.9%, 21.6% and 27.5%, respectively; relative to CNFGNN, it reduces the RMSE by 0.52%.

To highlight the advantage of the method in communication overhead, the communication overhead of each method during the training phase was also compared. As shown in FIG. 5, CM-FedSeq2Seq achieves a better RMSE than the existing method CNFGNN while incurring far less communication overhead.

The experiments show that the present invention achieves higher prediction accuracy while also keeping communication overhead lower, and is therefore a more effective federated learning method for traffic flow prediction.

The above embodiments are preferred embodiments of the present application. Those of ordinary skill in the art may make various changes or improvements on this basis; provided they do not depart from the general concept of the present application, such changes or improvements shall fall within the scope of protection claimed by the present application.

References:

[1] Lin Y, Wang P, Ma M. Intelligent transportation system (ITS): Concept, challenge and opportunity[C]//2017 IEEE 3rd International Conference on Big Data Security on Cloud (BigDataSecurity), IEEE International Conference on High Performance and Smart Computing (HPSC), and IEEE International Conference on Intelligent Data and Security (IDS). IEEE, 2017: 167-172.

[2] Luo Q, Zhou Y. Spatial-temporal Structures of Deep Learning Models for Traffic Flow Forecasting: A Survey[C]//2021 4th International Conference on Intelligent Autonomous Systems (ICoIAS). IEEE, 2021: 187-193.

[3] Guo Y, Zhao R, Lai S, et al. Distributed machine learning for multiuser mobile edge computing systems[J]. IEEE Journal of Selected Topics in Signal Processing, 2022.

[4] Khan L U, Saad W, Han Z, et al. Federated learning for internet of things: Recent advances, taxonomy, and open challenges[J]. IEEE Communications Surveys & Tutorials, 2021.

[5] Liu Y, James J Q, Kang J, et al. Privacy-preserving traffic flow prediction: A federated learning approach[J]. IEEE Internet of Things Journal, 2020, 7(8): 7751-7763.

[6] Meng C, Rambhatla S, Liu Y. Cross-node federated graph neural network for spatio-temporal data modeling[C]//Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. 2021: 1202-1211.

Claims

1. A traffic flow prediction method based on federated learning, characterized by comprising the following steps:

Step 1: setting the hyperparameters of a server model, and initializing the basic structure of the model to obtain global model parameters, wherein the subscript G of the parameters denotes the global model and the superscript r denotes the iteration round, with r = 0 at initialization;

Step 2: distributing, by the server, the global model parameters to the clients to obtain the local model parameters of each client, wherein the subscripts of these parameters index the individual client local models;

Step 3: training the local model of each client on that client's local traffic flow data set to complete the current round of local model updates and obtain the updated parameters of each local model;

Step 4: calculating the correlation between the local model update of each client and the global model update of the server, and selecting, according to the correlation, the clients that are to upload their local model parameters to the server in the current round;

Step 5: sending, by each selected client, its local model parameters to the server;

Step 6: aggregating, by the server, the received local model parameters, which constitutes the current round of the global model update and yields the updated global model parameters;

Step 7: updating the round counter r, and repeating Steps 2 to 6 until the global model and each local model converge;

Step 8: predicting, by each client, the local traffic flow using its local model.

2. The traffic flow prediction method based on federated learning according to claim 1, wherein the hyperparameters set in Step 1 include a learning rate, a number of communication rounds and a number of local training rounds.

3. The traffic flow prediction method based on federated learning according to claim 1, wherein the basic structure of the model adopts an Encoder-Decoder architecture: the Encoder module uses a gated recurrent neural network to capture the contextual information in the input traffic flow time series, that is, it converts the temporal dynamics hidden in the input time series into an intermediate hidden vector; and the Decoder module uses a gated recurrent neural network and a fully connected network to predict the result.

4. The traffic flow prediction method based on federated learning according to claim 1, wherein Step 4 includes:

(1) calculating the update of each client's local model in round r, wherein the local model update of the k-th client in round r is determined by the parameters of the k-th client's local model in two successive rounds;

(2) calculating the update of the server's global model in the current round, approximating it by the global model update of the preceding round;

(3) calculating the correlation between the local model update of each client and the global model update, measured by the number of parameters whose update values have the same sign in the two updates, wherein, for client k, the comparison runs over the model parameters, M is the total number of model parameters, the update value of each parameter of the local model is compared with the update value of the corresponding parameter of the global model G, and sgn denotes the sign function;

(4) screening all clients: if the correlation between the local model update of a client and the global model update is greater than a set correlation threshold, the local model of that client qualifies for upload to the server.

5. The traffic flow prediction method based on federated learning according to claim 1, wherein the server aggregates the received local model parameters using the FedAVG algorithm.

6. A traffic flow prediction system based on federated learning, characterized by comprising one server and a number of clients, wherein the server and each client comprise a memory and a processor, the memory stores a computer program, and the computer program, when executed by the processors, causes the processors of the server and the clients to jointly implement the method of any one of claims 1 to 5.

7. A computer-readable storage medium, on which a computer program is stored, which, when being executed by a processor, carries out the method according to any one of claims 1 to 5.