
CN115423048B - A traffic flow anomaly detection method and system based on pattern similarity - Google Patents

A traffic flow anomaly detection method and system based on pattern similarity

Info

Publication number
CN115423048B
Authority
CN
China
Prior art keywords
traffic flow
similarity
pattern
flow data
short
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211365058.7A
Other languages
Chinese (zh)
Other versions
CN115423048A (en
Inventor
张彩明
马翔
袁晨迅
李雪梅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong University
Original Assignee
Shandong University
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong University
Priority to CN202211365058.7A
Publication of CN115423048A
Application granted
Publication of CN115423048B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G08 SIGNALLING
    • G08G TRAFFIC CONTROL SYSTEMS
    • G08G1/00 Traffic control systems for road vehicles
    • G08G1/01 Detecting movement of traffic to be counted or controlled
    • G08G1/0104 Measuring and analyzing of parameters relative to traffic conditions
    • G08G1/0125 Traffic data processing
    • G08G1/0129 Traffic data processing for creating historical data or processing based on historical data

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Chemical & Material Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Analytical Chemistry (AREA)
  • Traffic Control Systems (AREA)

Abstract

The invention discloses a traffic flow anomaly detection method and system based on pattern similarity, relating to the technical field of traffic flow anomaly detection models. The method comprises: extracting time-series features from traffic flow data using an improved long short-term memory (LSTM) neural network; segmenting the traffic flow data with a sliding window and clustering the segments, taking the short-term sequence corresponding to each cluster center as a pattern feature; computing the time-series similarity between the time-series features of different spatial positions; determining, for each pattern feature, the pattern feature nearest to it, and weighting the nearest-neighbor distances of the resulting pattern feature pairs to obtain the pattern similarity between different spatial positions; determining the sequence similarity from the time-series similarity and the pattern similarity, and constructing from it a dynamic traffic flow relationship graph across different times and spatial positions; and detecting abnormal traffic flow states using the dynamic relationship graph and the time-series similarity, thereby improving the accuracy of traffic flow anomaly detection.

Description

A traffic flow anomaly detection method and system based on pattern similarity

Technical field

The invention relates to the technical field of traffic flow anomaly detection models, and in particular to a traffic flow anomaly detection method and system based on pattern similarity.

Background

With the development of big data technology, artificial intelligence is widely used in traffic flow anomaly detection and traffic flow prediction. Accurately detecting traffic flow anomalies not only gives traffic management departments a useful basis for decisions but also helps travelers choose more suitable routes, which helps relieve traffic pressure.

Traffic flow at intersections is affected by time, weather, traffic policy and other factors, and shows clear periodicity. Existing machine-learning-based traffic flow anomaly detection algorithms have at least the following three problems:

(1) A single recurrent neural network model cannot extract information from the historical traffic flow sequence effectively enough.

(2) Existing traffic flow anomaly detection considers only the traffic conditions of a single intersection and ignores the correlated influence of other intersections.

(3) There is no effective metric for computing the similarity of traffic flow between different intersections.

Summary of the invention

To solve the above problems, the present invention proposes a traffic flow anomaly detection method and system based on pattern similarity. Time-series features and pattern features are extracted separately from the traffic flow data, and a dynamic traffic flow relationship graph is constructed, so that traffic flow anomalies can be judged and the accuracy of traffic flow anomaly detection improved.

To achieve the above object, the present invention adopts the following technical solutions:

In a first aspect, the present invention provides a traffic flow anomaly detection method based on pattern similarity, comprising:

acquiring traffic flow data;

extracting time-series features from the traffic flow data using an improved long short-term memory neural network, wherein the improved network obtains the time-series feature as a weighted sum of the hidden states obtained at different times;

segmenting the traffic flow data with a sliding window to obtain a short-term sequence set, clustering the short-term sequence set, and taking the short-term sequence corresponding to the cluster center of each category as a pattern feature;

computing the time-series similarity between the time-series features of different spatial positions;

determining, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and weighting the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity between different spatial positions;

determining the sequence similarity from the time-series similarity and the pattern similarity, and constructing from the sequence similarity a dynamic traffic flow relationship graph across different times and different spatial positions;

detecting abnormal traffic flow states using the dynamic traffic flow relationship graph and the time-series similarity.

As an optional implementation, when the hidden states obtained at different times are weighted and summed to obtain the time-series feature, each weight is determined by the correlation between the hidden state at that time and the traffic flow data. The weight formula is expressed in terms of the traffic flow data of day t, the hidden states, a correlation function, a parameter to be learned, the number of days of input traffic flow data, and a transpose operation.

As an optional implementation, the time-series similarity between the time-series features of different spatial positions is computed from the time-series feature of spatial position a on day t and that of spatial position b on day t: the two features are concatenated and passed through a network consisting of a weight matrix to be learned and the tanh activation function.

As an optional implementation, when the nearest-neighbor distances of the pattern feature pairs are weighted, the weight is the number of elements contained in the category to which the pattern feature belongs.

As an optional implementation, the sequence similarity is obtained by weighting the time-series similarity and the pattern similarity and summing them.

As an optional implementation, constructing the dynamic traffic flow relationship graph comprises:

constructing, from the sequence similarity of the traffic flow data at different spatial positions, a relationship graph of the different spatial positions at the same time;

introducing a connectivity matrix between the traffic flow data at different spatial positions, and constructing the dynamic traffic flow relationship graph from the relationship graph and the connectivity matrix;

where the construction uses parameters to be learned, the connectivity matrix, the tanh activation function, and a decreasing function of the time difference between the current moment and the moment indicated by the prior data.

As an optional implementation, the connectivity matrix is defined over pairs of spatial positions, where X_a is the traffic flow data at spatial position a, X_b is the traffic flow data at spatial position b, and the corresponding entry gives the connectivity relation between X_a and X_b.

In a second aspect, the present invention provides a traffic flow anomaly detection system based on pattern similarity, comprising:

a data acquisition module configured to acquire traffic flow data;

a time-series feature extraction module configured to extract time-series features from the traffic flow data using an improved long short-term memory neural network, wherein the improved network obtains the time-series feature as a weighted sum of the hidden states obtained at different times;

a pattern feature extraction module configured to segment the traffic flow data with a sliding window to obtain a short-term sequence set, cluster the short-term sequence set, and take the short-term sequence corresponding to the cluster center of each category as a pattern feature;

a time-series similarity determination module configured to compute the time-series similarity between the time-series features of different spatial positions;

a pattern similarity determination module configured to determine, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and to weight the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity between different spatial positions;

a dynamic relationship graph construction module configured to determine the sequence similarity from the time-series similarity and the pattern similarity, and to construct from the sequence similarity a dynamic traffic flow relationship graph across different times and different spatial positions;

an anomaly detection module configured to detect abnormal traffic flow states using the dynamic traffic flow relationship graph and the time-series similarity.

In a third aspect, the present invention provides an electronic device comprising a memory, a processor, and computer instructions stored in the memory and run on the processor; when the computer instructions are run by the processor, the method of the first aspect is carried out.

In a fourth aspect, the present invention provides a computer-readable storage medium for storing computer instructions; when the computer instructions are executed by a processor, the method of the first aspect is carried out.

Compared with the prior art, the beneficial effects of the present invention are as follows:

The invention proposes a traffic flow anomaly detection method and system based on pattern similarity. An improved long short-term memory neural network extracts time-series features, and pattern features are extracted as well so that the periodic characteristics of the traffic flow data are taken into account. After similarities are computed for both kinds of features, a dynamic traffic flow relationship graph is constructed. This graph accounts for the associations between different spatial positions and for the influence of different times on the current associations. Finally, a graph attention network judges traffic flow anomalies, improving the accuracy of traffic flow anomaly detection.

Advantages of additional aspects of the invention will be set forth in part in the description that follows, and in part will become apparent from the description or be learned by practice of the invention.

Brief description of the drawings

The accompanying drawings, which form a part of the present invention, provide a further understanding of the invention. The illustrative embodiments of the invention and their descriptions explain the invention and do not unduly limit it.

Fig. 1 is a flowchart of the traffic flow anomaly detection method based on pattern similarity provided by Embodiment 1 of the present invention;

Fig. 2 is a schematic diagram of the construction of the dynamic relationship graph provided by Embodiment 1 of the present invention;

Fig. 3 is a flowchart of the anomaly judgment provided by Embodiment 1 of the present invention.

Detailed description

The present invention is further described below with reference to the drawings and embodiments.

It should be noted that the following detailed description is exemplary and intended to provide further explanation of the invention. Unless otherwise indicated, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this invention belongs.

It should be noted that the terminology used here is only for describing specific embodiments and is not intended to limit the exemplary embodiments of the invention. As used herein, unless the context clearly indicates otherwise, the singular forms are intended to include the plural forms as well. The terms "comprising" and "having" and any variations of them are intended to cover non-exclusive inclusion: for example, a process, method, system, product, or device comprising a series of steps or units is not necessarily limited to the steps or units explicitly listed, and may include other steps or units that are not explicitly listed or that are inherent to the process, method, product, or device.

Where no conflict arises, the embodiments of the present invention and the features in the embodiments may be combined with one another.

Embodiment 1

This embodiment proposes a traffic flow anomaly detection method based on pattern similarity, as shown in Fig. 1, comprising:

acquiring traffic flow data;

extracting time-series features from the traffic flow data using an improved long short-term memory neural network, wherein the improved network obtains the time-series feature as a weighted sum of the hidden states obtained at different times;

segmenting the traffic flow data with a sliding window to obtain a short-term sequence set, clustering the short-term sequence set, and taking the short-term sequence corresponding to the cluster center of each category as a pattern feature;

computing the time-series similarity between the time-series features of different spatial positions;

determining, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and weighting the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity between different spatial positions;

determining the sequence similarity from the time-series similarity and the pattern similarity, and constructing from the sequence similarity a dynamic traffic flow relationship graph across different times and different spatial positions;

detecting abnormal traffic flow states using the dynamic traffic flow relationship graph and the time-series similarity.

In this embodiment, the traffic flow data over T days is defined as the sequence of daily records x_1, ..., x_T, where x_t, the traffic flow data of day t, holds one value per minute n = 1, ..., N, and N is the length of the intraday traffic flow data.

Because traffic flow data is affected by many complex factors, this embodiment models the acquired traffic flow data with an improved long short-term memory (LSTM) neural network in order to represent its time-series features as vectors.

LSTM is an improved variant of the recurrent neural network (RNN) and is widely used in time series modeling. Its gating units mitigate, to some extent, the vanishing gradient problem of RNNs. For each traffic flow record x_t, the LSTM is modeled as shown in formulas (1)-(6):

Input gate:

$$i_t = \sigma\left(W_i \cdot [h_{t-1}, x_t]\right) \quad (1)$$

Forget gate:

$$f_t = \sigma\left(W_f \cdot [h_{t-1}, x_t]\right) \quad (2)$$

$$\tilde{C}_t = \tanh\left(W_C \cdot [h_{t-1}, x_t]\right) \quad (3)$$

Output gate:

$$o_t = \sigma\left(W_o \cdot [h_{t-1}, x_t]\right) \quad (4)$$

Long memory:

$$C_t = f_t \odot C_{t-1} + i_t \odot \tilde{C}_t \quad (5)$$

Short memory:

$$h_t = o_t \odot \tanh(C_t) \quad (6)$$

where $W_i$, $W_f$, $W_C$ and $W_o$ are parameters to be learned, $C_t$ is the cell state, $\tilde{C}_t$ is the intermediate value of the cell state, $\odot$ is the Hadamard product, $\sigma$ is the sigmoid function, and $h_t$ is the hidden state. To simplify the explanation of the improved algorithm, the formulas above omit the bias terms.
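The gate structure described above can be sketched in NumPy as one step of a standard bias-free LSTM cell. This is only an illustrative sketch: the concatenated-input form `[h_{t-1}, x_t]`, the function name `lstm_step`, and the toy dimensions are assumptions, not details taken from the patent.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def lstm_step(x_t, h_prev, c_prev, W_i, W_f, W_C, W_o):
    """One bias-free LSTM step; each W_* has shape (hidden, hidden + input)."""
    z = np.concatenate([h_prev, x_t])    # [h_{t-1}, x_t]
    i_t = sigmoid(W_i @ z)               # input gate
    f_t = sigmoid(W_f @ z)               # forget gate
    c_tilde = np.tanh(W_C @ z)           # intermediate cell state
    o_t = sigmoid(W_o @ z)               # output gate
    c_t = f_t * c_prev + i_t * c_tilde   # long memory (cell state)
    h_t = o_t * np.tanh(c_t)             # short memory (hidden state)
    return h_t, c_t

# Toy run: hypothetical sizes and random weights, five time steps.
rng = np.random.default_rng(0)
hidden, inp = 4, 3
W_i, W_f, W_C, W_o = (rng.normal(size=(hidden, hidden + inp)) for _ in range(4))
h, c = np.zeros(hidden), np.zeros(hidden)
for x in rng.normal(size=(5, inp)):
    h, c = lstm_step(x, h, c, W_i, W_f, W_C, W_o)
```

Because the output gate lies in (0, 1) and tanh lies in (-1, 1), every component of the hidden state stays strictly inside (-1, 1).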

In most existing algorithms, the last hidden state h_t is taken as the output of the LSTM, which tends to ignore the features contained in the earlier hidden states.

For this reason, this embodiment fuses the hidden states of different times by weighted summation to obtain the time-series feature. The weights are defined by the correlation between each hidden state and x_t, so that a hidden state with higher correlation receives a larger weight, which improves the ability of the output to represent x_t, as shown in formulas (7)-(9):

[Formula (7)]

[Formula (8)]

[Formula (9)]

where the weight of each hidden state is determined by the correlation function, which contains a parameter to be learned.

After all traffic flow data has been processed in this way, the vector representation of x_t, that is, the time-series feature v_t, is obtained. The process is abbreviated as formula (10):

[Formula (10)]

where T is the number of days of input data.
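The weighted fusion of hidden states can be illustrated with a small NumPy sketch. The exact correlation function is not recoverable from the text, so this example assumes a bilinear score `h_i^T W_a x` followed by a softmax; `fuse_hidden_states`, `W_a`, and the shapes are hypothetical names and choices made only for illustration.

```python
import numpy as np

def fuse_hidden_states(H, x, W_a):
    """Weighted sum of hidden states.

    H: (T, d) hidden states, x: (n,) reference input,
    W_a: (d, n) bilinear weights (assumed form of the correlation function).
    """
    scores = H @ W_a @ x                           # one correlation score per hidden state
    scores = scores - scores.max()                 # shift for numerical stability
    alpha = np.exp(scores) / np.exp(scores).sum()  # softmax: weights sum to 1
    v = alpha @ H                                  # weighted sum -> time-series feature
    return v, alpha

rng = np.random.default_rng(1)
H = rng.normal(size=(6, 4))
x = rng.normal(size=3)
W_a = rng.normal(size=(4, 3))
v, alpha = fuse_hidden_states(H, x, W_a)
```

Under this assumption, hidden states that score higher against the input contribute more to `v`, which matches the stated intent that more strongly correlated hidden states receive larger weights.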

A pattern feature is a series of similar short-term data segments that recur in the historical data. This embodiment therefore proposes a pattern feature extraction method based on segmentation and clustering to capture periodic features.

First, a sliding window divides the traffic flow data into several short-term sequences. Specifically:

A sliding window divides the traffic flow data x_t of day t into M windows to build the short-term sequence set of day t, where each short-term sequence has the window length L, and M = N - L + 1.
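As a minimal sketch of the segmentation step, the relation M = N - L + 1 corresponds to a window of length L moved one sample at a time. The helper name `sliding_windows` and the toy series are assumptions for illustration.

```python
import numpy as np

def sliding_windows(x_t, L):
    """Split one day's series of length N into M = N - L + 1 overlapping windows (stride 1)."""
    return np.lib.stride_tricks.sliding_window_view(x_t, L)

x_t = np.arange(10.0)          # toy day with N = 10 samples
S_t = sliding_windows(x_t, 4)  # window length L = 4, so M = 7
```

`sliding_window_view` returns a read-only view, so copy the result (for example with `S_t.copy()`) before modifying the windows.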

Then, the short-term sequence set is clustered by the distance between short-term sequences to capture the recurring short-term sequences, that is, the pattern features.

Specifically, the short-term sequences are gathered into a set, and all short-term sequences in the set are clustered to capture the pattern features.

Short-term sequences in the same category are similar to one another. The cluster centers of the categories are taken as the pattern features of the traffic flow data of day t, where each element is the cluster center of one category and g is the number of categories.
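The clustering step can be sketched as follows. The text does not name a clustering algorithm, so this example assumes plain k-means with Euclidean distance and uses the cluster means as centers; `kmeans`, `n_iter`, and `seed` are hypothetical. The per-cluster counts are returned as well, since the weighting step later uses the number of elements in each category.

```python
import numpy as np

def kmeans(S, g, n_iter=50, seed=0):
    """Plain k-means on the rows of S; returns (centers, counts per cluster)."""
    rng = np.random.default_rng(seed)
    centers = S[rng.choice(len(S), size=g, replace=False)].astype(float)
    for _ in range(n_iter):
        d = np.linalg.norm(S[:, None, :] - centers[None, :, :], axis=2)
        labels = d.argmin(axis=1)
        for j in range(g):
            if np.any(labels == j):            # keep the old center if a cluster is empty
                centers[j] = S[labels == j].mean(axis=0)
    # Final assignment against the final centers.
    d = np.linalg.norm(S[:, None, :] - centers[None, :, :], axis=2)
    labels = d.argmin(axis=1)
    counts = np.bincount(labels, minlength=g)
    return centers, counts

# Toy short-term sequences: two recurring shapes with small noise.
rng = np.random.default_rng(42)
S = np.vstack([np.tile([0.0, 1.0, 0.0], (5, 1)), np.tile([1.0, 0.0, 1.0], (5, 1))])
S += 0.01 * rng.normal(size=S.shape)
centers, counts = kmeans(S, g=2)
```

If the pattern feature should be an actual short-term sequence rather than a mean, one option is to replace each center with the window closest to it.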

In this embodiment, similarity is computed separately for the time-series features and the pattern features; the sequence similarity is then determined from the time-series similarity and the pattern similarity, with the balance between the two controlled.

In this embodiment, the time-series similarity between the time-series features of the traffic flow data at different spatial positions (for example, different intersections) at the same time is computed as:

[Formula (11)]

[Formula (12)]

where the inputs are the time-series features of spatial positions a and b on day t, the network consists of a weight matrix to be learned and the tanh activation function, and the two features are concatenated before being fed to it.
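A rough sketch of the time-series similarity network follows. Only the ingredients (concatenation, a learned weight matrix, tanh) come from the text; collapsing the weight matrix to a single row `w` so that the output is a scalar is an assumption, and `temporal_similarity` is a hypothetical name.

```python
import numpy as np

def temporal_similarity(v_a, v_b, w):
    """Score two time-series features: concatenate them, then apply tanh(w . z)."""
    z = np.concatenate([v_a, v_b])   # concatenation of the two features
    return float(np.tanh(w @ z))     # scalar score in (-1, 1)

rng = np.random.default_rng(2)
v_a, v_b = rng.normal(size=4), rng.normal(size=4)
w = rng.normal(size=8)               # stands in for the learned weights
s_ab = temporal_similarity(v_a, v_b, w)
```

Note that a score built on concatenation is generally not symmetric in its two arguments unless the weights are constrained.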

In this embodiment, once the pattern features of the traffic flow data at all spatial positions have been obtained, the distance between the pattern features of the traffic flow data at spatial position a and those at spatial position b is computed to obtain their correlation, which determines the pattern similarity.

Because the pattern features of a position have no inherent order, and the pattern feature sets of two positions may contain different numbers of elements, the correspondence between the elements of the two sets is hard to establish during the computation. To keep the algorithm simple and robust, this embodiment computes the nearest-neighbor distance of each pattern feature, which resolves the lack of a one-to-one correspondence between the trend pattern features of different traffic flow data sequences.

The nearest-neighbor distance is the distance D_1NN between each pattern feature and the pattern feature closest to it, defined over the i-th pattern feature of position a and the j-th pattern feature of position b.

For each pair selected as nearest neighbors, the Euclidean distance between the two pattern features is recorded, and the nearest-neighbor distances of all elements of position a relative to position b are collected into an array.

Note that when a pattern feature of a is the nearest neighbor of a pattern feature of b, the latter is not necessarily the nearest neighbor of the former.

Two arrays are therefore needed: one holding the nearest-neighbor distances of all elements of a relative to b, and one holding those of all elements of b relative to a.

To make the distance metric between a and b symmetric, the two arrays are merged into one, whose length is the sum of the numbers of pattern features of a and b. The merged array contains the nearest-neighbor distances of all pattern features of both positions. The question is how to choose the most reasonable value to represent the distance between a and b: if the maximum value is chosen, a noise peak in any sequence would seriously distort the distance; if the minimum is chosen, most pairs of sequences would be nearly indistinguishable.

To take all patterns into account as far as possible, this embodiment weights all values in the merged array to obtain the pattern similarity. When the pattern feature sets were constructed, the number of elements in the category of each pattern feature was recorded for each position; these counts are likewise merged into one array, and the weighting function is shown in formula (13):

[Formula (13)]

where the formula uses the summation operation.
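The nearest-neighbor weighting described above can be sketched as follows. The merge of both directions and the use of cluster sizes as weights follow the text; normalizing by the total count (a weighted average) is an assumption, as are the names `nn_distances` and `weighted_nn_distance`.

```python
import numpy as np

def nn_distances(P, Q):
    """For each row of P, the Euclidean distance to its nearest row of Q."""
    d = np.linalg.norm(P[:, None, :] - Q[None, :, :], axis=2)
    return d.min(axis=1)

def weighted_nn_distance(P_a, c_a, P_b, c_b):
    """Count-weighted average of the merged nearest-neighbor distances."""
    D = np.concatenate([nn_distances(P_a, P_b), nn_distances(P_b, P_a)])
    C = np.concatenate([c_a, c_b]).astype(float)
    return float((C * D).sum() / C.sum())   # normalization is an assumption

# Toy pattern features (cluster centers) and the sizes of their clusters.
P_a, c_a = np.array([[0.0, 0.0], [1.0, 1.0]]), np.array([3, 1])
P_b, c_b = np.array([[0.0, 0.0], [1.0, 1.0], [5.0, 5.0]]), np.array([2, 2, 1])
score_ab = weighted_nn_distance(P_a, c_a, P_b, c_b)
score_ba = weighted_nn_distance(P_b, c_b, P_a, c_a)
```

In this toy case only the rare pattern `[5, 5]` of position b has no close match, and because its cluster holds a single element it contributes little to the score. Merging both directions makes the result symmetric in a and b.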

In this embodiment, the sequence similarity is determined from the time-series similarity and the pattern similarity, as shown in formula (14):

[Formula (14)]

where the weight parameter to be learned controls the balance between the two similarities.

In this embodiment, a relationship graph of the different spatial positions (traffic intersections) at the same time is built from the sequence similarity, and the relationship graphs of different times are then processed by a dynamic-graph-based LSTM to build a dynamic traffic flow relationship graph that carries temporal features.

Specifically, from the sequence similarity of the traffic flow data at different spatial positions, a relationship graph of the different spatial positions at the same time (for example, the same day) is built, as shown in formula (15):

[Formula (15)]

where the relationship graph is the one built for day t, and each of its entries is the sequence similarity between the traffic flow data at spatial positions a and b.
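One way to picture the per-day relationship graph is as a matrix whose entry (a, b) blends the two similarities. The sketch below assumes a convex combination with a single scalar weight `lam`; the text only says that the two similarities are weighted and summed with a learned parameter, so the exact form and the name `relation_graph` are hypothetical.

```python
import numpy as np

def relation_graph(S_time, S_pattern, lam):
    """Blend two (K, K) similarity matrices into one day's relationship graph."""
    return lam * S_time + (1.0 - lam) * S_pattern

# Toy similarities between K = 2 intersections.
S_time = np.array([[1.0, 0.2], [0.2, 1.0]])
S_pattern = np.array([[1.0, 0.6], [0.6, 1.0]])
G_t = relation_graph(S_time, S_pattern, lam=0.5)
```

In training, `lam` would be learned together with the other parameters rather than fixed by hand.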

The relationship graph reflects the associations on day t but ignores the influence of other times on the current associations. Therefore, borrowing the gating structure of LSTM, a dynamic relationship graph construction method is designed, as shown in Fig. 2, and named dynamic-graph-based LSTM (Dynamic-based LSTM, DGLSTM). This part in effect optimizes the input data; the corresponding formula is shown in formula (16):

[Formula (16)]

where the parameters are to be learned, and the connectivity matrix between different spatial positions guides the construction of the dynamic relationship graph. It is defined in formula (17):

[Formula (17)]

where the time difference is taken between the current moment and the moment indicated by the prior data. A decreasing function of the time difference assigns values to the prior data, meaning that the prior data is gradually forgotten as the time interval grows.
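The forgetting behavior can be illustrated with any decreasing function of the time difference. The sketch below assumes exponential decay; the text only requires the function to be decreasing, so `np.exp(-rate * dt)`, the `rate` parameter, and the name `decayed_prior` are illustrative choices.

```python
import numpy as np

def decayed_prior(A_prior, t_now, t_prior, rate=0.1):
    """Scale prior connectivity by a decreasing function of the time difference."""
    dt = t_now - t_prior
    return np.exp(-rate * dt) * A_prior

A_prior = np.array([[0.0, 1.0], [1.0, 0.0]])   # toy prior connectivity between two positions
recent = decayed_prior(A_prior, t_now=10, t_prior=9)
old = decayed_prior(A_prior, t_now=10, t_prior=1)
```

Prior data from nine steps ago contributes far less than prior data from one step ago, which is the gradual forgetting the text describes.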

DGLSTM can then be expressed as formulas (18)-(23):

Input gate:

[Formula (18)]

Forget gate:

[Formula (19)]

[Formula (20)]

Output gate:

[Formula (21)]

Long memory:

[Formula (22)]

Short memory:

[Formula (23)]

After DGLSTM, the dynamic relationship graph is obtained. Compared with the relationship graph of a single time, the dynamic relationship graph not only reflects the association between sequences at the current moment but is also influenced by the relationship graphs of other moments in history. For ease of description, the construction of the dynamic relationship graph is written as formula (24):

[Formula (24)]

In this embodiment, a graph attention network (GAT) is used to judge traffic flow anomalies. It aggregates the influence between similar sequences in order to capture the implicit information in the dynamic relationship graph. Compared with the traditional graph convolutional network (GCN) model, GAT can aggregate the influence of similar sequences selectively. The expression is given in Equation (25):

(25)

In Equation (25), the temporal feature of the k-th intersection is obtained by the LSTM from that intersection's traffic flow data; the sequence similarity between intersection k and intersection p is the value in row k, column p of the dynamic relationship graph; the weight matrix is to be learned; and the output of the GAT for intersection k is its implicit information.
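The aggregation described above can be sketched in a few lines. This is a simplified illustration under stated assumptions: the similarity values from the dynamic relationship graph are used directly as aggregation coefficients, and a tanh activation is applied to the sum. The activation and the helper names `matvec` and `aggregate` are hypothetical, since Equation (25) itself is not reproduced in the text.

```python
import math


def matvec(W, h):
    """Multiply weight matrix W (list of rows) by feature vector h."""
    return [sum(w * x for w, x in zip(row, h)) for row in W]


def aggregate(graph, features, W):
    """Similarity-weighted aggregation over intersections.

    graph    : graph[k][p] is the sequence similarity of intersections k and p
    features : features[p] is the temporal feature of intersection p
    W        : weight matrix (learned in practice, fixed here)
    Returns one output vector per intersection.
    Assumption: tanh activation; the source does not state the activation.
    """
    projected = [matvec(W, h) for h in features]
    outputs = []
    for k in range(len(features)):
        acc = [0.0] * len(W)
        for p, proj in enumerate(projected):
            for i, value in enumerate(proj):
                acc[i] += graph[k][p] * value
        outputs.append([math.tanh(a) for a in acc])
    return outputs


if __name__ == "__main__":
    graph = [[1.0, 0.8], [0.8, 1.0]]      # toy similarities
    features = [[0.2, 0.1], [0.4, -0.3]]  # toy temporal features
    W = [[1.0, 0.0], [0.0, 1.0]]          # identity projection
    print(aggregate(graph, features, W))
```

Because each neighbour's contribution is scaled by its similarity, intersections with dissimilar sequences have little effect on the output, which is the selective aggregation the paragraph refers to.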

The process of obtaining the implicit information of all intersections through GAT can be written as Equation (26):

(26)

For anomaly judgment, a label is defined, and a single fully connected (FC) layer is used as the prediction function to predict it, as shown in Figure 3. The prediction formula is given in Equation (27):

(27)

The output of Equation (27) is the predicted anomaly result.

In this embodiment, the graph attention network is trained with cross-entropy, as shown in Equation (28):

(28)

In Equation (28), the two quantities are, respectively, the true class and the predicted value of intersection k at time t, and L is the loss function, which is minimized to reduce the gap between the predicted value and the true class.
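A minimal sketch of the prediction layer and the loss follows. It assumes a binary label, a sigmoid output on the single FC layer, and a mean over samples. These choices, along with the names `fc_predict` and `cross_entropy`, are illustrative assumptions, because Equations (27) and (28) are not reproduced in the text.

```python
import math


def fc_predict(hidden, weights, bias):
    """Single fully connected layer followed by a sigmoid.

    hidden  : implicit information vector of one intersection
    weights : FC weights, same length as hidden (learned in practice)
    bias    : FC bias
    Returns a value in (0, 1), read here as the predicted anomaly score.
    """
    z = sum(w * h for w, h in zip(weights, hidden)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def cross_entropy(y_true, y_pred, eps=1e-12):
    """Mean binary cross-entropy between true classes and predictions.

    Predictions are clipped to [eps, 1 - eps] to avoid log(0).
    """
    total = 0.0
    for y, p in zip(y_true, y_pred):
        p = min(max(p, eps), 1.0 - eps)
        total += -(y * math.log(p) + (1 - y) * math.log(1.0 - p))
    return total / len(y_true)


if __name__ == "__main__":
    scores = [fc_predict(h, [0.5, -0.25], 0.1) for h in ([1.0, 2.0], [3.0, 0.5])]
    print(scores, cross_entropy([0, 1], scores))
```

The loss is small when predictions agree with the true classes and grows as they diverge, which is the gap that training minimizes.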

Because anomaly detection is a typical classification task, this embodiment evaluates the prediction performance of the graph attention network with two metrics that are widely accepted for classification: accuracy (ACC) and the Matthews correlation coefficient (MCC).

ACC expresses the prediction performance of the model fairly intuitively. It is computed as in Equation (29):

$$ACC = \frac{TP + TN}{TP + TN + FP + FN} \qquad (29)$$

Here, a true positive (TP) is a result where both the prediction and the true value are normal; a true negative (TN) is a result where both the prediction and the true value are abnormal; a false positive (FP) is a result predicted as normal that is actually abnormal; and a false negative (FN) is a result predicted as abnormal that is actually normal.
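The four counts and the accuracy can be computed as in the sketch below. It follows the convention stated above, where "normal" is the positive class. The label encoding (0 for normal, 1 for abnormal) and the function names are assumptions for illustration.

```python
NORMAL, ABNORMAL = 0, 1  # assumed label encoding


def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) with 'normal' treated as the positive class."""
    tp = tn = fp = fn = 0
    for actual, predicted in zip(y_true, y_pred):
        if predicted == NORMAL and actual == NORMAL:
            tp += 1
        elif predicted == ABNORMAL and actual == ABNORMAL:
            tn += 1
        elif predicted == NORMAL and actual == ABNORMAL:
            fp += 1
        else:  # predicted abnormal, actually normal
            fn += 1
    return tp, tn, fp, fn


def accuracy(tp, tn, fp, fn):
    """Fraction of all samples that were classified correctly."""
    return (tp + tn) / (tp + tn + fp + fn)


if __name__ == "__main__":
    truth = [0, 0, 1, 1, 0]
    preds = [0, 1, 1, 0, 0]
    counts = confusion_counts(truth, preds)
    print(counts, accuracy(*counts))
```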

MCC is a metric for evaluating binary classification performance; in essence it is a correlation coefficient between the actual and predicted classes. Its value lies between -1 and +1: +1 indicates perfect prediction, 0 indicates prediction no better than random, and -1 indicates total disagreement between prediction and observation. MCC is computed as in Equation (30):

$$MCC = \frac{TP \times TN - FP \times FN}{\sqrt{(TP + FP)(TP + FN)(TN + FP)(TN + FN)}} \qquad (30)$$
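The standard MCC definition can be written as a small function over the four confusion counts. Returning 0.0 when the denominator is zero is a common convention adopted here as an assumption; the text does not say how that case is handled.

```python
import math


def mcc(tp, tn, fp, fn):
    """Matthews correlation coefficient from confusion counts.

    Returns a value in [-1, 1]. When any marginal total is zero the
    denominator vanishes; 0.0 is returned in that case (assumed convention).
    """
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0
    return (tp * tn - fp * fn) / denominator


if __name__ == "__main__":
    print(mcc(2, 2, 0, 0))  # every prediction correct
    print(mcc(0, 0, 2, 2))  # every prediction wrong
    print(mcc(2, 1, 1, 1))  # mixed result
```

Unlike accuracy, this score stays informative when normal and abnormal samples are heavily imbalanced, which is typical of anomaly detection data.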

The method of this embodiment was compared with three conventional methods: the graph attention network (GAT), the temporal convolutional neural network (TCN), and the gated recurrent unit neural network (GRU). The method of this embodiment achieved the highest value on both metrics, which supports its effectiveness.

Embodiment 2

This embodiment provides a traffic flow anomaly detection system based on pattern similarity, comprising:

a data acquisition module, configured to acquire traffic flow data;

a temporal feature extraction module, configured to extract temporal features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM obtains the temporal features by a weighted sum of the hidden states obtained at different times;

a pattern feature extraction module, configured to segment the traffic flow data with a sliding window to obtain a set of short-term sequences, cluster the set, and take the short-term sequence corresponding to the cluster center of each class as a pattern feature;

a temporal similarity determination module, configured to compute the temporal similarity of the temporal features at different spatial locations;

a pattern similarity determination module, configured to determine, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and to weight the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial locations;

a dynamic relationship graph construction module, configured to determine the sequence similarity from the temporal similarity and the pattern similarity, and to construct, from the sequence similarity, a dynamic relationship graph of traffic flow at different times and different spatial locations;

an anomaly detection module, configured to detect abnormal traffic flow states using the traffic flow dynamic relationship graph and the temporal similarity.
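The pattern feature and pattern similarity modules listed above can be sketched as follows. This is a rough illustration under several assumptions: the cluster assignments are taken as given (no clustering algorithm is named in this passage), the representative sequence is the member closest to the cluster mean, Euclidean distance is used, and the nearest-neighbor distances are combined as a cluster-size-weighted average. All function names are hypothetical.

```python
import math


def sliding_windows(series, width, step=1):
    """Cut a long series into overlapping short-term sequences."""
    return [series[i:i + width] for i in range(0, len(series) - width + 1, step)]


def representative(cluster):
    """Return the cluster member closest to the cluster mean."""
    dim = len(cluster[0])
    center = [sum(seq[i] for seq in cluster) / len(cluster) for i in range(dim)]
    return min(cluster, key=lambda seq: math.dist(seq, center))


def pattern_distance(patterns_a, patterns_b):
    """Size-weighted mean nearest-neighbor distance from location a to b.

    Each argument is a list of (pattern_sequence, cluster_size) pairs.
    Assumption: weights are normalized by the total cluster size.
    """
    total_size = sum(size for _, size in patterns_a)
    weighted = 0.0
    for seq, size in patterns_a:
        nearest = min(math.dist(seq, other) for other, _ in patterns_b)
        weighted += size * nearest
    return weighted / total_size


if __name__ == "__main__":
    flow = [10, 12, 30, 33, 11, 13]
    windows = sliding_windows(flow, width=2, step=2)
    print(windows)
    print(representative([[10, 12], [11, 13], [30, 33]]))
```

A smaller distance indicates that two locations share more similar short-term patterns; how the distance is mapped to a similarity score is left open here.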

It should be noted that the above modules correspond to the steps described in Embodiment 1. The examples and application scenarios implemented by the modules are the same as those of the corresponding steps, but are not limited to the content disclosed in Embodiment 1. As part of the system, the modules can be executed in a computer system, for example as a set of computer-executable instructions.

Further embodiments also provide:

An electronic device, comprising a memory, a processor, and computer instructions stored in the memory and run on the processor. When the computer instructions are run by the processor, the method described in Embodiment 1 is carried out. For brevity, the details are not repeated here.

It should be understood that, in this embodiment, the processor may be a central processing unit (CPU). The processor may also be another general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, a discrete hardware component, and so on. A general-purpose processor may be a microprocessor or any conventional processor.

The memory may include read-only memory and random access memory, and provides instructions and data to the processor. Part of the memory may also include non-volatile random access memory. For example, the memory may also store information on the device type.

A computer-readable storage medium, used to store computer instructions. When the computer instructions are executed by a processor, the method described in Embodiment 1 is carried out.

The method in Embodiment 1 may be carried out directly by a hardware processor, or by a combination of hardware and software modules in the processor. The software modules may reside in a storage medium that is well established in the field, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory, or registers. The storage medium is located in the memory; the processor reads the information in the memory and completes the steps of the above method in combination with its hardware. To avoid repetition, a detailed description is omitted here.

Those of ordinary skill in the art will appreciate that the units and algorithm steps of the examples described in this embodiment can be implemented in electronic hardware or in a combination of computer software and electronic hardware. Whether these functions are executed in hardware or software depends on the specific application and the design constraints of the technical solution. Skilled practitioners may use different methods to implement the described functions for each specific application, but such implementations should not be regarded as going beyond the scope of this application.

Although the specific embodiments of the present invention have been described above with reference to the accompanying drawings, they do not limit the scope of protection of the present invention. Those skilled in the art should understand that, on the basis of the technical solution of the present invention, any modifications or variations that can be made without creative effort remain within the scope of protection of the present invention.

Claims (8)

1. A traffic flow anomaly detection method based on pattern similarity, characterized by comprising:
acquiring traffic flow data;
extracting temporal features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM obtains the temporal features by a weighted sum of the hidden states obtained at different times;
segmenting the traffic flow data with a sliding window to obtain a set of short-term sequences, clustering the set, and taking the short-term sequence corresponding to the cluster center of each class as a pattern feature;
computing the temporal similarity of the temporal features at different spatial locations;
determining, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and weighting the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial locations;
determining the sequence similarity from the temporal similarity and the pattern similarity, and constructing, from the sequence similarity, a dynamic relationship graph of traffic flow at different times and different spatial locations;
detecting abnormal traffic flow states using the traffic flow dynamic relationship graph and the temporal similarity;
wherein, in the weighted summation of the hidden states obtained at different times, the weights are determined by the correlation between the hidden states at different times and the traffic flow data, the weights being given by a formula in which x_t is the traffic flow data on day t and the remaining symbols denote, respectively, the hidden state, the correlation function, the parameters to be learned, the number of days of input traffic flow data, and the transpose operation;
and wherein the temporal similarity of the temporal features at different spatial locations is computed by a formula whose symbols denote, respectively, the temporal feature of spatial location a on day t, the temporal feature of spatial location b on day t, a network composed of a weight matrix to be learned and the activation function tanh, and the concatenation of the two temporal features.

2. The traffic flow anomaly detection method based on pattern similarity according to claim 1, characterized in that, when the nearest-neighbor distances of the pattern feature pairs are weighted, the weight is the number of elements contained in the class to which the pattern feature belongs.

3. The traffic flow anomaly detection method based on pattern similarity according to claim 1, characterized in that the sequence similarity is determined by weighting the temporal similarity and the pattern similarity and summing them.

4. The traffic flow anomaly detection method based on pattern similarity according to claim 1, characterized in that constructing the traffic flow dynamic relationship graph comprises:
constructing, from the sequence similarity of the traffic flow data at different spatial locations, a relationship graph of different spatial locations at the same time;
introducing a connectivity matrix between the traffic flow data at different spatial locations, and constructing the traffic flow dynamic relationship graph from the relationship graph and the connectivity matrix;
wherein the symbols of the construction formula denote, respectively, the parameters to be learned, the connectivity matrix, the activation function tanh, the current moment and the moment indicated by the prior data, the time difference, and a decreasing function.

5. The traffic flow anomaly detection method based on pattern similarity according to claim 4, characterized in that the connectivity matrix is defined by a formula in which X_a is the traffic flow data of spatial location a, X_b is the traffic flow data of spatial location b, and the result is the connectivity matrix between X_a and X_b.

6. A traffic flow anomaly detection system based on pattern similarity, characterized by comprising:
a data acquisition module, configured to acquire traffic flow data;
a temporal feature extraction module, configured to extract temporal features from the traffic flow data using an improved long short-term memory (LSTM) neural network, wherein the improved LSTM obtains the temporal features by a weighted sum of the hidden states obtained at different times;
a pattern feature extraction module, configured to segment the traffic flow data with a sliding window to obtain a set of short-term sequences, cluster the set, and take the short-term sequence corresponding to the cluster center of each class as a pattern feature;
a temporal similarity determination module, configured to compute the temporal similarity of the temporal features at different spatial locations;
a pattern similarity determination module, configured to determine, for each pattern feature, the pattern feature nearest to it so as to form a pattern feature pair, and to weight the nearest-neighbor distances of the pattern feature pairs to obtain the pattern similarity of different spatial locations;
a dynamic relationship graph construction module, configured to determine the sequence similarity from the temporal similarity and the pattern similarity, and to construct, from the sequence similarity, a dynamic relationship graph of traffic flow at different times and different spatial locations;
an anomaly detection module, configured to detect abnormal traffic flow states using the traffic flow dynamic relationship graph and the temporal similarity;
wherein, in the weighted summation of the hidden states obtained at different times, the weights are determined by the correlation between the hidden states at different times and the traffic flow data, the weights being given by a formula in which x_t is the traffic flow data on day t and the remaining symbols denote, respectively, the hidden state, the correlation function, the parameters to be learned, the number of days of input traffic flow data, and the transpose operation;
and wherein the temporal similarity of the temporal features at different spatial locations is computed by a formula whose symbols denote, respectively, the temporal feature of spatial location a on day t, the temporal feature of spatial location b on day t, a network composed of a weight matrix to be learned and the activation function tanh, and the concatenation of the two temporal features.

7. An electronic device, characterized by comprising a memory, a processor, and computer instructions stored in the memory and run on the processor, wherein the method according to any one of claims 1 to 5 is carried out when the computer instructions are run by the processor.

8. A computer-readable storage medium, characterized in that it is used to store computer instructions, wherein the method according to any one of claims 1 to 5 is carried out when the computer instructions are executed by a processor.
CN202211365058.7A 2022-11-03 2022-11-03 A traffic flow anomaly detection method and system based on pattern similarity Active CN115423048B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211365058.7A CN115423048B (en) 2022-11-03 2022-11-03 A traffic flow anomaly detection method and system based on pattern similarity

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202211365058.7A CN115423048B (en) 2022-11-03 2022-11-03 A traffic flow anomaly detection method and system based on pattern similarity

Publications (2)

Publication Number Publication Date
CN115423048A CN115423048A (en) 2022-12-02
CN115423048B true CN115423048B (en) 2023-04-25

Family

ID=84207956

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211365058.7A Active CN115423048B (en) 2022-11-03 2022-11-03 A traffic flow anomaly detection method and system based on pattern similarity

Country Status (1)

Country Link
CN (1) CN115423048B (en)

Families Citing this family (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN116361635B (en) * 2023-06-02 2023-10-10 中国科学院成都文献情报中心 Multidimensional time sequence data anomaly detection method

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2022047658A1 (en) * 2020-09-02 2022-03-10 大连大学 Log anomaly detection system
WO2022160902A1 (en) * 2021-01-28 2022-08-04 广西大学 Anomaly detection method for large-scale multivariate time series data in cloud environment

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
AU2012285379B2 (en) * 2011-07-20 2017-04-13 Elminda Ltd. Method and system for estimating brain concussion
US20200097808A1 (en) * 2018-09-21 2020-03-26 International Business Machines Corporation Pattern Identification in Reinforcement Learning
CN111145541B (en) * 2019-12-18 2021-10-22 深圳先进技术研究院 Traffic flow data prediction method, storage medium and computer equipment
CN112801404B (en) * 2021-02-14 2024-03-22 北京工业大学 Traffic prediction method based on self-adaptive space self-attention force diagram convolution

Also Published As

Publication number Publication date
CN115423048A (en) 2022-12-02

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant