CN109327345A - Method and device for detecting abnormal network traffic, and computer-readable storage medium - Google Patents
Method and device for detecting abnormal network traffic, and computer-readable storage medium
- Publication number
- CN109327345A, CN201710648870.3A
- Authority
- CN
- China
- Prior art keywords
- traffic
- current
- port
- historical
- flow
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/08—Monitoring or testing based on specific metrics, e.g. QoS, energy consumption or environmental parameters
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L43/00—Arrangements for monitoring or testing data switching networks
- H04L43/50—Testing arrangements
Landscapes
- Engineering & Computer Science (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Environmental & Geological Engineering (AREA)
- Data Exchanges In Wide-Area Networks (AREA)
Abstract
The invention discloses a method and device for detecting abnormal network traffic, and a computer-readable storage medium. The method includes: collecting, for multiple ports, current traffic data in the current time period and historical traffic data corresponding to the current time period; using a preset filter to extract a current traffic feature and a historical traffic feature of each port from the current traffic data and the historical traffic data, respectively; extracting high-level semantics of the port's current traffic and of its historical traffic from the current traffic feature and the historical traffic feature, respectively; calculating the semantic similarity between the port's current traffic and its historical traffic; judging whether the semantic similarity falls within a preset first value interval, to obtain a judgment result for the port's current traffic state; and identifying the faulty port according to the judgment results for the respective ports. With the method of the embodiments of the present invention, a faulty port can be found in time.
Description
Technical field
The present invention relates to the field of communications, and in particular to a method and device for detecting abnormal network traffic, and a computer-readable storage medium.
Background
At present, a network operator's equipment room usually houses a large number of data communication devices, which exchange a large volume of traffic with one another every day. Keeping these devices running normally is crucial to communication quality.
However, the inventors of the present application found that, because an equipment room contains many data communication devices (hereinafter "devices") and each device has many ports, when a port on one or more devices fails and causes abnormal network traffic, the faulty port cannot be found in time.
Summary of the invention
Embodiments of the present invention provide a method and device for detecting abnormal network traffic, and a computer-readable storage medium, which make it possible to find a faulty port in time.
In a first aspect, an embodiment of the present invention provides a method for detecting abnormal network traffic, including:
collecting, for multiple ports, current traffic data in the current time period and historical traffic data corresponding to the current time period;
using a preset filter to extract a current traffic feature and a historical traffic feature of the port from the current traffic data and the historical traffic data, respectively;
extracting high-level semantics of the current traffic and high-level semantics of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively;
calculating the semantic similarity between the current traffic and the historical traffic of the port according to the high-level semantics of the current traffic and the high-level semantics of the historical traffic;
judging whether the semantic similarity falls within a preset first value interval, to obtain a judgment result for the current traffic state of the port;
identifying the faulty port according to the judgment results for the current traffic state of the respective ports.
In some embodiments of the first aspect, extracting the high-level semantics of the current traffic and the high-level semantics of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively, includes:
using the multiply-by-ten-and-truncate method to extract the high-level semantics of the current traffic and the high-level semantics of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively.
In some embodiments of the first aspect, calculating the semantic similarity between the current traffic and the historical traffic of the port according to the high-level semantics of the current traffic and the high-level semantics of the historical traffic includes:
calculating the chi-square statistical distance between the high-level semantics of the current traffic and the high-level semantics of the historical traffic, and using the chi-square statistical distance as the semantic similarity between the current traffic and the historical traffic of the port.
In some embodiments of the first aspect, judging whether the semantic similarity of the port falls within the preset first value interval, to obtain the judgment result for the current traffic state of the port, includes:
judging whether the semantic similarity of the port falls within the preset first value interval;
if the semantic similarity of the port falls within the preset first value interval, judging the current traffic state of the port to be normal;
if the semantic similarity of the port does not fall within the preset first value interval, judging the current traffic state of the port to be abnormal.
In some embodiments of the first aspect, judging whether the semantic similarity of the port falls within the preset first value interval, to obtain the judgment result for the current traffic state of the port, further includes:
if the semantic similarity of the port falls within the preset first value interval, calculating the semantic similarity between the current inbound traffic and the current outbound traffic within the current traffic of the port;
judging whether the semantic similarity between the current inbound traffic and the current outbound traffic falls within a preset second value interval;
if the semantic similarity between the current inbound traffic and the current outbound traffic falls within the preset second value interval, judging the current traffic state of the port to be normal;
if the semantic similarity between the current inbound traffic and the current outbound traffic does not fall within the preset second value interval, judging the current traffic state of the port to be abnormal.
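The two-stage judgment described in these embodiments can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function names, the string results, and the treatment of both value intervals as closed intervals are assumptions made for the example.

```python
from typing import Optional, Tuple

# A value interval as (low, high). Treating it as closed is an assumption.
Interval = Tuple[float, float]


def in_interval(value: float, interval: Interval) -> bool:
    """Return True if value lies inside the (assumed closed) interval."""
    low, high = interval
    return low <= value <= high


def judge_port(sim_hist: float, first: Interval,
               sim_in_out: Optional[float] = None,
               second: Optional[Interval] = None) -> str:
    """Judge one port as "normal" or "abnormal".

    sim_hist   -- similarity between current and historical traffic
    sim_in_out -- similarity between current inbound and outbound traffic
                  (only used when the second-stage check is enabled)
    """
    # Stage 1: current-vs-historical similarity against the first interval.
    if not in_interval(sim_hist, first):
        return "abnormal"
    # Stage 2 (optional): inbound-vs-outbound similarity against the second interval.
    if sim_in_out is not None and second is not None:
        return "normal" if in_interval(sim_in_out, second) else "abnormal"
    return "normal"
```

Calling `judge_port` with only the first two arguments reproduces the single-stage embodiment; passing all four adds the inbound/outbound check for ports that pass the first stage.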
In some embodiments of the first aspect, using the preset filter to extract the current traffic feature and the historical traffic feature of the port from the current traffic data and the historical traffic data, respectively, includes:
using the preset filter to extract initial traffic features of each port from the current traffic data and the historical traffic data, respectively, the initial traffic features including an initial current traffic feature and an initial historical traffic feature;
performing feature activation on the initial current traffic feature and the initial historical traffic feature, respectively, to obtain the current traffic feature and the historical traffic feature of the port.
In some embodiments of the first aspect, using the preset filter to extract the current traffic feature and the historical traffic feature of the port from the current traffic data and the historical traffic data, respectively, includes:
using the preset filter to extract low-frequency traffic features and/or high-frequency traffic features of each port from the current traffic data and the historical traffic data, respectively, the low-frequency traffic features including a current low-frequency traffic feature and a historical low-frequency traffic feature, and the high-frequency traffic features including a current high-frequency traffic feature and a historical high-frequency traffic feature;
taking the current low-frequency traffic feature and the current high-frequency traffic feature as the current traffic feature of the port, and taking the historical low-frequency traffic feature and the historical high-frequency traffic feature as the historical traffic feature of the port.
In some embodiments of the first aspect, using the preset filter to extract the current traffic feature and the historical traffic feature of the port from the current traffic data and the historical traffic data, respectively, includes:
using the preset filter to extract inbound traffic features and/or outbound traffic features of each port from the current traffic data and the historical traffic data, respectively, the inbound traffic features including a current inbound traffic feature and a historical inbound traffic feature, and the outbound traffic features including a current outbound traffic feature and a historical outbound traffic feature;
taking the current inbound traffic feature and the current outbound traffic feature as the current traffic feature of the port, and taking the historical inbound traffic feature and the historical outbound traffic feature as the historical traffic feature of the port.
In some embodiments of the first aspect, after judging whether the semantic similarity of the port falls within the preset first value interval, to obtain the judgment result for the current traffic state of the port, the detection method further includes:
calculating, for each port, the error between the judgment result for its current traffic state and a pre-labeled result, and adding the errors of the respective ports to obtain an overall error for all ports;
using the overall error to update the free parameter of the preset filter, to obtain an updated filter;
using the updated filter to re-extract the current traffic feature and the historical traffic feature of each port, obtaining the overall error between the judgment results for the current traffic state of all ports and their corresponding pre-labeled results, and using that overall error to update the free parameter of the previous filter, until the value of the free parameter of the current filter falls outside the preset first value interval; the filter corresponding to the smallest overall error is set as the optimal filter;
determining the faulty port according to the judgment results for the current traffic state of the respective ports that correspond to the optimal filter.
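The filter-learning loop above can be read as a search over the filter's free parameter that keeps the value giving the smallest overall error. The sketch below is one possible reading under stated assumptions: the per-port error is taken to be a 0/1 mismatch against the pre-labeled result, the free parameter is swept over a fixed range in equal steps, and `judge_all` is a hypothetical callback that runs feature extraction and judgment for every port with a given parameter value. None of these names come from the patent.

```python
from typing import Callable, Dict, Optional, Tuple


def search_best_beta(judge_all: Callable[[float], Dict[str, str]],
                     labels: Dict[str, str],
                     beta_start: float, beta_stop: float,
                     step: float) -> Tuple[float, int]:
    """Sweep the free parameter and return (best_beta, smallest_overall_error).

    judge_all(beta) -- hypothetical callback returning {port: "normal"/"abnormal"}
    labels          -- pre-labeled result for each port
    """
    best_beta = beta_start
    best_err: Optional[int] = None
    # Index-based stepping avoids accumulating floating-point drift.
    n_steps = int(round((beta_stop - beta_start) / step))
    for i in range(n_steps + 1):
        beta = beta_start + i * step
        judgments = judge_all(beta)
        # Overall error: sum of per-port errors (assumed 0/1 mismatch).
        overall = sum(1 for port, label in labels.items()
                      if judgments[port] != label)
        if best_err is None or overall < best_err:
            best_beta, best_err = beta, overall
    return best_beta, best_err if best_err is not None else 0
```

The judgments produced with the returned parameter value are then the ones used to determine the faulty port.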
In a second aspect, an embodiment of the present invention provides a device for detecting abnormal network traffic, including:
a data collection module, configured to collect, for multiple ports, current traffic data in the current time period and historical traffic data corresponding to the current time period;
a feature extraction module, configured to use a preset filter to extract a current traffic feature and a historical traffic feature of the port from the current traffic data and the historical traffic data, respectively;
a semantic extraction module, configured to extract high-level semantics of the current traffic and high-level semantics of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively;
a similarity calculation module, configured to calculate the semantic similarity between the current traffic and the historical traffic of the port according to the high-level semantics of the current traffic and the high-level semantics of the historical traffic;
a traffic state judgment module, configured to judge whether the semantic similarity falls within a preset first value interval, to obtain a judgment result for the current traffic state of the port;
a confirmation module, configured to identify the faulty port according to the judgment results for the current traffic state of the respective ports.
In a third aspect, an embodiment of the present invention provides a device for detecting abnormal network traffic, including a memory, a processor, and a program stored in the memory and executable on the processor, where the processor implements the method for detecting abnormal network traffic described above when executing the program.
In a fourth aspect, an embodiment of the present invention provides a computer-readable storage medium storing a program which, when executed by a processor, implements the method for detecting abnormal network traffic described above.
According to the method for detecting abnormal network traffic provided by the embodiments of the present invention, current traffic data in the current time period and historical traffic data corresponding to the current time period are first collected for multiple ports; a preset filter is then used to extract the current traffic feature and the historical traffic feature of each port from the current traffic data and the historical traffic data, respectively. Next, the high-level semantics of the port's current traffic and of its historical traffic are extracted from the current traffic feature and the historical traffic feature, respectively, and the semantic similarity between the port's current traffic and historical traffic is calculated from them. By judging whether the semantic similarity falls within the preset first value interval, a judgment result for the port's current traffic state is obtained, and the faulty port can then be identified according to the judgment results for the respective ports.
As can be seen from the above, the method of the embodiments of the present invention extracts, for each port, the high-level semantics of the traffic in the current time period and in the corresponding historical time period, that is, highly abstracted traffic features. Because traffic behavior in the current time period resembles that in the corresponding historical time period, for example with a period of one day, it is enough to compare and judge the high-level semantics of the traffic in the two periods to obtain the traffic state of each port and thus find the faulty port in time.
Description of drawings
The present invention can be better understood from the following description of specific embodiments in conjunction with the accompanying drawings, in which the same or similar reference numerals denote the same or similar features.
FIG. 1 is a schematic flowchart of a method for detecting abnormal network traffic according to an embodiment of the present invention;
FIG. 2 is a schematic flowchart of a method for detecting abnormal network traffic according to another embodiment of the present invention;
FIG. 3 is a schematic flowchart of a method for detecting abnormal network traffic according to yet another embodiment of the present invention;
FIG. 4 is a schematic flowchart of a method for detecting abnormal network traffic according to still another embodiment of the present invention;
FIG. 5 is a schematic flowchart of a learning-based adaptive filter learning method according to an embodiment of the present invention;
FIG. 6 is a schematic flowchart of calculating high-level semantics based on output traffic at different granularities according to an embodiment of the present invention;
FIG. 7 is a schematic flowchart of calculating high-level semantics based on input traffic at different granularities according to an embodiment of the present invention;
FIG. 8 is a schematic structural diagram of a device for detecting abnormal network traffic according to an embodiment of the present invention;
FIG. 9 is a schematic diagram of the hardware structure of a device for detecting abnormal network traffic according to an embodiment of the present invention;
FIG. 10 is a schematic layout diagram of a test scenario according to an embodiment of the present invention;
FIG. 11 is a schematic diagram of historical traffic data when the traffic of the Trunk2 port is normal according to an embodiment of the present invention;
FIG. 12 is a schematic diagram of current traffic data when the traffic of the Trunk2 port is abnormal according to an embodiment of the present invention;
FIG. 13 is a schematic diagram of historical traffic data when the traffic of the Trunk1 port is normal according to an embodiment of the present invention;
FIG. 14 is a schematic diagram of current traffic data when the traffic of the Trunk1 port is abnormal according to an embodiment of the present invention.
Detailed description
Features and exemplary embodiments of various aspects of the embodiments of the present invention are described in detail below. In the following detailed description, numerous specific details are set forth to provide a thorough understanding of the embodiments. It will be apparent to those skilled in the art, however, that the embodiments may be practiced without some of these specific details. The following description of the embodiments is given only to provide a better understanding of them by way of example. The embodiments are by no means limited to any specific configuration or algorithm set forth below, and cover any modification, replacement, and improvement of elements, components, and algorithms that does not depart from their spirit. Well-known structures and techniques are not shown in the drawings or the following description, to avoid unnecessarily obscuring the embodiments.
The method and device for detecting abnormal network traffic provided in the embodiments of the present invention are applied to fault detection for abnormal network traffic of communication devices. With them, the traffic state in each device can be obtained and the faulty port found in time.
In general, traffic behavior in a core equipment room shows day-level periodic similarity: the traffic behavior in a given period of the current day resembles that of some previous day. For example, traffic in a given period on a given day of the current year is strongly similar to that on the same day of the previous year; traffic in a given period on a given day of the current month is strongly similar to that on the same day of the previous month; and traffic on a given day of a given week and month of the current year bears some similarity to that on the corresponding day of the same week and month of the previous year. This is because the date is related to how people use the network. For example, on holidays the traffic of devices deployed in hotspot areas differs from that at ordinary times; on rest days, more users go online over home Wi-Fi, so the traffic of devices deployed by mobile operators drops; and in different months, weather changes the times at which people travel, so mobile traffic differs from month to month.
At present, as data collection and storage capabilities improve, the traffic of each port of the equipment-room devices can be collected over a long time span, such as one year, so that this day-level periodic traffic behavior can be extracted and used.
FIG. 1 is a schematic flowchart of the method for detecting abnormal network traffic provided by an embodiment of the present invention. The method in FIG. 1 includes steps 101 to 106.
In step 101, current traffic data in the current time period and historical traffic data corresponding to the current time period are collected for multiple ports.
Here, the historical traffic data does not mean traffic that has already occurred earlier on the same day, but traffic data from the same time of day in the past. Comparing the current traffic with the traffic at the same time in the past makes it possible to describe their similarity or difference using frequency features over a relatively short time span.
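As a rough illustration of step 101, the sketch below picks the current window and the matching window from an earlier day out of one port's timestamped samples. The `(timestamp, value)` sample layout, the function name, and the `days_back` parameter are assumptions for the example; the patent does not specify a data format.

```python
from datetime import datetime, timedelta
from typing import List, Tuple

# One traffic sample for a port: (timestamp, traffic value). Hypothetical layout.
Sample = Tuple[datetime, float]


def matching_windows(samples: List[Sample], start: datetime,
                     length: timedelta,
                     days_back: int = 1) -> Tuple[List[float], List[float]]:
    """Return (current, historical) traffic values for one port.

    current    -- samples in [start, start + length)
    historical -- samples in the same time-of-day window, days_back days earlier
    """
    hist_start = start - timedelta(days=days_back)
    current = [v for t, v in samples if start <= t < start + length]
    historical = [v for t, v in samples
                  if hist_start <= t < hist_start + length]
    return current, historical
```

Setting `days_back` to 7 or 365, for instance, would compare against the same window a week or roughly a year earlier, in line with the periodic similarity described above.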
In step 102, a preset filter is used to extract the current traffic feature and the historical traffic feature of the port from the current traffic data and the historical traffic data, respectively.
The role of the filter is to extract traffic features. In one example, the filter may be a set of weight matrices. A weight matrix may be fixed or may carry free parameters. A weight matrix with free parameters can extract traffic features with different trends in different periods.
In practice, it has been found that the current traffic of the same equipment room, the same device, or the same port of the same device, compared with the traffic on some previous day, is extremely close in total volume, extremely close in trend, or both. To describe this similarity, traffic features can be divided into low-frequency features and high-frequency features: a low-pass filter extracts the low-frequency features, which describe the total traffic volume, and a high-pass filter extracts the high-frequency features, which describe the traffic trend. In one example, the low-pass filter w(k) is given by:
Correspondingly, the high-pass filter is given by:
$$g(k) = (-1)^{k}\, w(k) \tag{2}$$
where k is the error judgment interval, λ = -3/64, and β is a free parameter that is updated as β = β + k/128. Different values of β extract the high-frequency and low-frequency features of different traffic behaviors. In general, with β = 1/32 the filter has good spectral decomposition properties for time-invariant signals (see "A new design method of 9-7 biorthogonal filterbanks based on odd harmonic function", CSSP, Springer, 31(3):1245-1255, 2012).
Specifically, convolving w(k) with the traffic yields the low-frequency feature of the traffic, and convolving g(k) with the traffic yields its high-frequency component.
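The relation g(k) = (-1)^k w(k) and the two convolutions can be illustrated with a small sketch. The low-pass coefficients used in the usage note are a hypothetical three-tap kernel chosen only for the demonstration, since the expression for w(k) is not available in this text; indexing the taps from k = 0 is likewise an assumption.

```python
from typing import List


def high_pass_from(w: List[float]) -> List[float]:
    """Build g(k) = (-1)^k * w(k), with taps indexed from k = 0 (assumption)."""
    return [((-1) ** k) * wk for k, wk in enumerate(w)]


def convolve_valid(signal: List[float], kernel: List[float]) -> List[float]:
    """Discrete convolution, keeping only positions where the kernel fully overlaps."""
    n, m = len(signal), len(kernel)
    return [sum(kernel[j] * signal[i + m - 1 - j] for j in range(m))
            for i in range(n - m + 1)]


def extract_features(traffic: List[float], w: List[float]):
    """Return (low_frequency_feature, high_frequency_feature) for one traffic series."""
    g = high_pass_from(w)
    return convolve_valid(traffic, w), convolve_valid(traffic, g)
```

With the hypothetical kernel `w = [0.25, 0.5, 0.25]`, a constant series passes through the low-pass branch unchanged and gives zeros in the high-pass branch, while a rapidly alternating series does the opposite, which matches the volume-versus-trend split described above.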
In step 103, the high-level semantics of the current traffic and the high-level semantics of the historical traffic of the port are extracted from the current traffic feature and the historical traffic feature, respectively.
Preferably, the multiply-by-ten-and-truncate method can be used to extract the high-level semantics of the current traffic and of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively; that is, the histogram vector of the traffic feature is taken as the high-level semantics, so that each element of the high-level semantics carries enough information about traffic behavior.
The calculation is illustrated below with an example. Extracting the high-level semantics of the traffic means computing the histogram vector of the traffic feature.
Predefine a traffic feature vector:
$$s(n) = [0.8994,\ 0.9541,\ 1.0000,\ 0,\ 0.9414,\ 0.9852] \tag{3}$$
where s(n) is the traffic feature vector and n is the number of vector elements. The histogram vector of the traffic feature is obtained in the following order: multiply each element of s(n) by 10 and truncate the result to an integer (rounding down); add 1 to each truncated value, giving the vector [9, 10, 11, 1, 10, 10]; count how many times each of the values 1 to 11 occurs in [9, 10, 11, 1, 10, 10], giving the histogram vector, that is, the high-level semantics of the traffic, [1, 0, 0, 0, 0, 0, 0, 0, 1, 3, 1].
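The worked example can be reproduced with a few lines of code. This is a minimal sketch that assumes every feature value lies in [0, 1], which is what the 11 histogram bins in the example imply; the function name is chosen for illustration.

```python
import math
from typing import List


def high_level_semantics(features: List[float], bins: int = 11) -> List[int]:
    """Histogram vector via multiply-by-ten-and-truncate.

    Each feature x in [0, 1] is mapped to floor(10 * x) + 1, a value in 1..11,
    and the histogram counts how often each of those values occurs.
    """
    hist = [0] * bins
    for x in features:
        if not 0.0 <= x <= 1.0:
            raise ValueError("feature values are assumed to lie in [0, 1]")
        value = math.floor(x * 10) + 1   # 1..11
        hist[value - 1] += 1             # bin for value v is at index v - 1
    return hist


s = [0.8994, 0.9541, 1.0000, 0, 0.9414, 0.9852]
print(high_level_semantics(s))  # [1, 0, 0, 0, 0, 0, 0, 0, 1, 3, 1]
```

Note that the truncation is a floor, not rounding to nearest: 0.9541 maps to 9 + 1 = 10, which is what yields the intermediate vector [9, 10, 11, 1, 10, 10] given in the text.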
在步骤104中,根据当前流量的高层语义和历史流量的高层语义,计算端口的当前流量和历史流量之间的语义相似度。In step 104, the semantic similarity between the current traffic and the historical traffic of the port is calculated according to the high-level semantics of the current traffic and the high-level semantics of the historical traffic.
优选地,可以计算当前流量的高层语义和历史流量的高层语义之间的卡方统计距离,将卡方统计距离作为端口的当前流量和历史流量之间的语义相似度,即引入卡方统计距离刻画历史流量与当前流量的相似性。与欧式距离相比,由于引入协方差矩阵,卡方统计距离具有更高阶的消失距,能够排除语义无关分量对相似性的影响。Preferably, the chi-square statistical distance between the high-level semantics of the current traffic and the high-level semantics of the historical traffic can be calculated, and the chi-square statistical distance is used as the semantic similarity between the current traffic and the historical traffic of the port, that is, the chi-square statistical distance is introduced. Characterize the similarity between historical traffic and current traffic. Compared with Euclidean distance, due to the introduction of covariance matrix, chi-square statistical distance has a higher-order vanishing distance, which can exclude the influence of semantically irrelevant components on similarity.
In one example, let the high-level semantics of the historical traffic be hist_1, the high-level semantics of the current traffic be hist_2, and the covariance between the historical traffic and the current traffic be cov(hist_{1i}, hist_{2i}). The chi-square statistical distance between the historical traffic and the current traffic is then calculated by formula (4):
where χ²(hist_1, hist_2) is the chi-square statistical distance between the historical traffic and the current traffic, and i indexes the elements of the high-level-semantic histogram vector.
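As a rough illustration of comparing two histogram vectors, the sketch below uses the common symmetric chi-square histogram distance, the sum over i of (a_i − b_i)² / (a_i + b_i). This is a substituted, widely used form and an assumption on our part: it does not include the covariance term cov(hist_{1i}, hist_{2i}) that the description associates with formula (4). The function name `chi_square_distance` is hypothetical.

```python
def chi_square_distance(hist1, hist2):
    """Symmetric chi-square distance between two equal-length histograms.

    Assumption: standard form sum((a - b)^2 / (a + b)); bins where both
    counts are zero contribute nothing.
    """
    if len(hist1) != len(hist2):
        raise ValueError("histograms must have the same length")
    total = 0.0
    for a, b in zip(hist1, hist2):
        denom = a + b
        if denom > 0:
            total += (a - b) ** 2 / denom
    return total

historical = [1, 0, 0, 0, 0, 0, 0, 0, 1, 3, 1]
current = [0, 0, 0, 0, 0, 0, 0, 0, 2, 3, 1]
print(chi_square_distance(historical, current))
```

Identical histograms give a distance of 0, and the value grows as the two histograms diverge.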
In step 105, it is judged whether the semantic similarity between the current traffic and the historical traffic of the port falls within a preset first value interval, giving the judgment result for the current traffic state of the port.
In step 106, the faulty port is identified from the judgment results for the current traffic states of the individual ports.
In the method for detecting abnormal network traffic provided by the embodiments of the present invention, the current traffic data of multiple ports in the current time period and the historical traffic data corresponding to the current time period are first collected. A preset filter is then used to extract the current traffic features and the historical traffic features of each port from the current traffic data and the historical traffic data, respectively. Next, the high-level semantics of the current traffic and of the historical traffic are extracted from the current traffic features and the historical traffic features, respectively, and the semantic similarity between the current traffic and the historical traffic of the port is calculated from them. By judging whether the semantic similarity falls within the preset first value interval, the judgment result for the current traffic state of the port is obtained, and the faulty port can then be identified from the judgment results of the individual ports.
It follows that the method for detecting abnormal network traffic according to the embodiments of the present invention extracts the high-level semantics of the traffic, that is, highly abstracted traffic features, for each port in the current time period and in the historical time period corresponding to it. Because traffic behavior in the current time period resembles that in the corresponding historical time period, for example with a daily period, it is sufficient to compare and judge the high-level semantics of the traffic in the two periods in order to obtain the traffic state of each port and find the faulty port in time.
In addition, because the high-level semantics of traffic are a high-level concept and are not constrained by the limitations of low-level features (such as low-frequency and high-frequency features), using them as the basis for judging whether traffic is abnormal also better reflects how people understand and judge traffic anomalies.
FIG. 2 is a schematic flowchart of a method for detecting abnormal network traffic according to another embodiment of the present invention. FIG. 2 differs from FIG. 1 in that step 102 in FIG. 1 is refined into steps 1021 and 1022 in FIG. 2.
In step 1021, the preset filter is used to extract the initial traffic features of each port from the current traffic data and the historical traffic data, respectively; the initial traffic features include initial current traffic features and initial historical traffic features.
In step 1022, feature activation is applied to the initial current traffic features and the initial historical traffic features, respectively, giving the current traffic features and the historical traffic features of the port.
Activating the traffic features highlights abnormal traffic features and allows the traffic features to converge. For the calculation used in the activation, see formulas (15) and (16) below.
FIG. 3 is a schematic flowchart of a method for detecting abnormal network traffic according to yet another embodiment of the present invention. FIG. 3 differs from FIG. 1 in that step 105 in FIG. 1 is refined into steps 1051 and 1052 in FIG. 3.
In step 1051, it is judged whether the semantic similarity of the port falls within the preset first value interval.
In step 1052, if the semantic similarity between the current traffic and the historical traffic of the port falls within the preset first value interval, the current traffic state of the port is judged normal; if it does not, the current traffic state of the port is judged abnormal.
It should be noted that the method in FIG. 3 judges the current traffic state of the port mainly on the basis of the semantic similarity between the current traffic and the historical traffic.
FIG. 4 is a schematic flowchart of a method for detecting abnormal network traffic according to yet another embodiment of the present invention. The detection method in FIG. 4 includes steps 1053 to 1056, which are likewise a further refinement of step 105 in FIG. 1.
In step 1053, it is judged whether the semantic similarity of the port falls within the preset first value interval (the same as step 1051).
In step 1054, if the semantic similarity between the current traffic and the historical traffic of the port falls within the preset first value interval, the semantic similarity between the current inflow traffic and the current outflow traffic within the current traffic of the port is calculated.
In step 1055, it is judged whether the semantic similarity between the current inflow traffic and the current outflow traffic of the port falls within a preset second value interval.
In step 1056, if the semantic similarity between the current inflow traffic and the current outflow traffic of the port falls within the preset second value interval, the current traffic state of the port is judged normal; if it does not, the current traffic state of the port is judged abnormal.
It should be noted that the method in FIG. 4 applies when the semantic similarity between the current traffic and the historical traffic is already within the preset range, and additionally judges whether the inflow and outflow of the current traffic are conserved, which improves the accuracy of the port traffic judgment.
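The judgment flow of FIG. 3 and FIG. 4 can be sketched as a two-stage interval check. The sketch below is illustrative only: the function names and the interval bounds passed in are hypothetical, and it assumes, following FIG. 3, that a port whose first-stage similarity falls outside the first value interval is judged abnormal without the second check.

```python
NORMAL, ABNORMAL = 0, 1  # 1 marks abnormal traffic, 0 marks normal traffic

def in_interval(value, interval):
    """Return True when value lies within the closed interval (low, high)."""
    low, high = interval
    return low <= value <= high

def judge_port(hist_similarity, inout_similarity, first_interval, second_interval):
    """Two-stage judgment: current-vs-historical first, inflow-vs-outflow second."""
    # Stage 1 (steps 1051/1053): current traffic against historical traffic.
    if not in_interval(hist_similarity, first_interval):
        return ABNORMAL
    # Stage 2 (steps 1054 to 1056): current inflow against current outflow.
    if not in_interval(inout_similarity, second_interval):
        return ABNORMAL
    return NORMAL

# Hypothetical interval bounds, chosen only for the demonstration.
print(judge_port(0.4, 0.2, first_interval=(0.0, 1.0), second_interval=(0.0, 0.5)))
```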
According to embodiments of the present invention, the preset filter may also be used to extract the low-frequency traffic features and/or high-frequency traffic features of each port from the current traffic data and the historical traffic data, respectively. The low-frequency traffic features include current and historical low-frequency traffic features, and the high-frequency traffic features include current and historical high-frequency traffic features. The current low-frequency and current high-frequency traffic features are taken as the current traffic features of the port, and the historical low-frequency and historical high-frequency traffic features are taken as the historical traffic features of the port.
Additionally or alternatively, the preset filter may be used to extract the inflow traffic features and/or outflow traffic features of each port from the current traffic data and the historical traffic data, respectively. The inflow traffic features include current and historical inflow traffic features, and the outflow traffic features include current and historical outflow traffic features. The current inflow and current outflow traffic features are taken as the current traffic features of the port, and the historical inflow and historical outflow traffic features are taken as the historical traffic features of the port.
As described above, extracting traffic features from several perspectives as the basis for judging port traffic improves the accuracy of the judgment.
Network traffic has both transient and long-term behavior, so a non-fixed filter is needed to capture its time-varying features. To capture these time-varying features, the embodiments of the present invention also propose a learning-based adaptive filter learning method that yields filters with different time-varying characteristics.
FIG. 5 is a schematic flowchart of the learning-based adaptive filter learning method according to an embodiment of the present invention. It should be noted that steps 107 to 110 in FIG. 5 follow step 105 in FIG. 1.
In step 107, the error between the judgment result for the current traffic state of each port and its pre-labeled result is calculated, and the errors of all ports are summed to give the overall error.
In step 108, the overall error is used to update the free parameter of the preset filter, giving an updated filter.
In step 109, the updated filter is used to re-extract the current traffic features and the historical traffic features of each port, giving the overall error between the judgment results for the current traffic states of all ports and their corresponding pre-labeled results. The overall error is then used to update the free parameter of the previous filter, and this is repeated until the value of the free parameter of the current filter falls outside the preset first value interval; the filter with the smallest overall error is taken as the optimal filter. Here the preset first value interval is [1/128, 1/4]. The method takes β as the free parameter, with value interval [1/128, 1/4], so as to obtain filters with different time-varying characteristics.
In step 110, the faulty port is determined from the judgment results for the current traffic states of the ports that correspond to the optimal filter.
The flow of the learning-based adaptive filter learning method is illustrated below by an example with the following steps.
In advance, 50,000 traffic segments are selected that contain the current traffic data and the historical traffic data of every port of every device in the equipment room, and the traffic state of each of the 50,000 segments is labeled according to the historical fault records: an abnormal traffic state is labeled 1 and a normal traffic state is labeled 0.
The judgment result for the traffic state of each of the 50,000 segments is then obtained with the detection method described above (see step 105); likewise, an abnormal traffic state gives a judgment result of 1 and a normal traffic state gives 0.
Next, the overall error between the labeled results and the judgment results for the traffic states of the 50,000 traffic segments is computed by formula (5):
where H(y_n, l(z_n)) is the entropy of the overall error, y_n is the labeled traffic state of a traffic segment, l(z_n) is the judgment result obtained by the abnormality judgment for that segment, and n is the traffic-segment number.
Differentiating formula (5) gives:
H′ is then divided by 50,000 to normalize it to the range 0 to 1, and the range 0 to 1 is divided into 256 equal parts that serve as error decision intervals, numbered k. The value of k is obtained by determining which error decision interval H′ falls into; for example, if H′ is 3/256, then k = 3.
To speed up the search for the optimal filter, β is updated using k, where β = β + k/128. After the entire value range [1/128, 1/4] of β has been traversed, the filter bank corresponding to the β with the smallest H′ is selected as the optimal filter, which adaptively yields a filter with learning ability.
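The β search described above can be sketched as a simple loop. This is an illustration under stated assumptions rather than the method itself: `normalized_error` is a hypothetical callable that stands in for the whole pipeline (filtering, judgment, and the normalized error H′ in [0, 1] for a given β), k is taken as the ceiling of H′ × 256 so that H′ = 3/256 gives k = 3, and k is forced to be at least 1 so that the loop always advances.

```python
import math

BETA_MIN, BETA_MAX = 1 / 128, 1 / 4  # value interval of the free parameter

def search_beta(normalized_error):
    """Scan beta over [BETA_MIN, BETA_MAX] and keep the beta with the smallest H'."""
    beta = BETA_MIN
    best_beta, best_error = None, math.inf
    while beta <= BETA_MAX:
        error = normalized_error(beta)  # H' for the filter built from this beta
        if error < best_error:
            best_beta, best_error = beta, error
        # Number of the 1/256-wide error decision interval that H' falls into;
        # the lower bound of 1 is an assumption to guarantee progress.
        k = max(1, math.ceil(error * 256))
        beta += k / 128  # larger errors produce larger steps
    return best_beta, best_error

# Hypothetical error curve used only to exercise the loop.
print(search_beta(lambda beta: abs(beta - 1 / 8)))
```

Because the step size grows with the error, the scan moves quickly through poor values of β and slows down where the error is small.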
It will be understood that traffic behavior differs between granularities (equipment room, device, and port). According to embodiments of the present invention, the high-level semantics corresponding to each granularity layer can therefore be extracted at three levels, namely equipment-room granularity, device granularity, and port granularity, and the semantic similarity between the current traffic and the historical traffic can be compared layer by layer. This gives judgment results for the network traffic at several granularities and, in turn, filters adapted to the traffic behavior at each granularity.
To aid understanding by those skilled in the art, the feature extraction for traffic at equipment-room, device, and port granularity is illustrated below by example.
Assuming a traffic sampling interval of 1 minute, each port yields 1440 inflow samples and 1440 outflow samples in 24 hours. In the following, x is continuous time and n is discrete time; n = 1 denotes the current time point and n = N denotes the N-th time point counting back from the current time, so N is the number of samples used to compute the traffic features. In practice, to detect abnormal traffic promptly, N takes a value between 6 and 30, that is, the most recent 6 to 30 sampling points up to the current time point.
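The sampling arithmetic and the indexing convention (n = 1 is the current sample) can be made concrete with a short sketch. The helper name `recent_window` is hypothetical; the bounds 6 and 30 come from the text.

```python
SAMPLE_INTERVAL_MINUTES = 1
SAMPLES_PER_DAY = 24 * 60 // SAMPLE_INTERVAL_MINUTES  # 1440 samples per direction

def recent_window(samples, n_samples):
    """Return the most recent n_samples values, newest first.

    Index 0 of the result corresponds to n = 1 (the current time point),
    and index n_samples - 1 corresponds to n = N.
    """
    if not 6 <= n_samples <= 30:
        raise ValueError("N is expected to lie between 6 and 30")
    return samples[-n_samples:][::-1]

day_of_inflow = list(range(SAMPLES_PER_DAY))  # placeholder series, oldest first
print(recent_window(day_of_inflow, 6))  # [1439, 1438, 1437, 1436, 1435, 1434]
```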
The low-frequency feature of a port's traffic is extracted by formula (7):
The high-frequency feature of a port's traffic is extracted by formula (8):
The quantities in formulas (7) and (8) are the low-frequency feature of the l-th port, the high-frequency feature of the l-th port, and the traffic of the l-th port; π is the extraction-layer number; ω(k) is the low-frequency feature extraction function, that is, the weight matrix of the low-pass filter; g(k) is the high-frequency feature extraction function, that is, the weight matrix of the high-pass filter; and b^(π) is the bias corresponding to the extraction layer. Formula (7) characterizes the low-frequency component, over a period of time, of the traffic on the l-th port of a device, and formula (8) characterizes the high-frequency component of that traffic over the same period.
The low-frequency feature of a device's traffic is extracted by formula (9):
The high-frequency feature of a device's traffic is extracted by formula (11):
The quantities in formulas (9) and (11) are the low-frequency feature of the m-th device, the high-frequency feature of the m-th device, and the traffic of the m-th device; L is the total number of ports on the device. Formula (9) characterizes the low-frequency component, over a period of time, of the traffic of the m-th device, and formula (11) characterizes the high-frequency component of that traffic over the same period.
The low-frequency feature of the equipment room's traffic is extracted by:
$$s_R(n) = f(z_n) = f\left(w^{(\omega)}\, t_R(x) + b^{(\pi)}\right) \tag{12}$$
The high-frequency feature of the equipment room's traffic is extracted by:
$$d_R(n) = f(z_n) = f\left(g^{(\pi)}\, t_R(x) + b^{(\pi)}\right) \tag{14}$$
where $s_R(n)$ is the low-frequency feature of the equipment room, $d_R(n)$ is the high-frequency feature of the equipment room, $t_R(n)$ is the total traffic of the equipment room, and M is the total number of devices in the equipment room. Formula (12) characterizes the low-frequency component of the equipment-room traffic $t_R(x)$ over a period of time, and formula (14) characterizes its high-frequency component over the same period.
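Formulas (12) and (14) share the form f(weights · traffic + bias), which can be sketched as a weighted sum over the most recent samples followed by an activation. Everything concrete below is an assumption made for illustration: the Haar-like averaging and differencing weights, the default `math.tanh` activation, and the names `extract_feature` and `haar_like_filters` are hypothetical and are not taken from the description.

```python
import math

def extract_feature(window, weights, bias=0.0, activation=math.tanh):
    """Compute f(sum_k weights[k] * window[k] + bias) for one traffic window."""
    if len(window) != len(weights):
        raise ValueError("window and weights must have the same length")
    z = sum(w * t for w, t in zip(weights, window)) + bias
    return activation(z)

def haar_like_filters(n):
    """Illustrative weights: averaging (low-pass) and alternating-sign (high-pass)."""
    low_pass = [1.0 / n] * n
    high_pass = [(1.0 if k % 2 == 0 else -1.0) / n for k in range(n)]
    return low_pass, high_pass

low_pass, high_pass = haar_like_filters(4)
steady = [0.5, 0.5, 0.5, 0.5]  # flat traffic
spiky = [0.5, 0.5, 0.5, 4.5]   # same level with one burst
print(extract_feature(steady, high_pass), extract_feature(spiky, high_pass))
```

With these weights a flat window produces a zero high-frequency response, while a burst produces a clearly non-zero one, which is the behavior the high-frequency feature is meant to expose.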
Next, the results of formulas (7), (8), (9), (11), (12) and (14) can be activated; the activation is calculated by formulas (15) and (16):
where u is the number of iterations, γ is 0.25, and λ is 0.25. Substituting the results of formula (15) into formula (16) gives the converged traffic features of each layer. Because the traffic features of the network are extracted at several levels and are activated by formulas (15) and (16), they are called deep activation features of the traffic, or traffic features for short.
FIG. 6 is a schematic flowchart of calculating high-level semantics from outflow traffic at different granularities according to an embodiment of the present invention. The equipment room contains multiple devices, and each device has multiple ports. In FIG. 6, the total number of devices in the equipment room is M, and the m-th device has L ports.
Referring to FIG. 6, the high-level semantics corresponding to the outflow traffic features of each port are first calculated from the outflow traffic of that port;
then the high-level semantics corresponding to the outflow traffic features of each device are calculated from the outflow traffic of that device, where the outflow traffic of a device is the sum of the outflow traffic of all ports on the device;
finally, the high-level semantics corresponding to the outflow traffic features of the equipment room are calculated, where the outflow traffic of the equipment room is the sum of the outflow traffic of all ports in the equipment room.
FIG. 7 is a schematic flowchart of calculating high-level semantics from inflow traffic at different granularities according to an embodiment of the present invention. The symbols in FIG. 7 have the same meaning as in FIG. 6; the difference is that FIG. 6 calculates the high-level semantics of the outflow traffic, whereas FIG. 7 calculates the high-level semantics of the inflow traffic.
Referring to FIG. 7, the high-level semantics corresponding to the inflow traffic features of each port are first calculated from the inflow traffic of that port;
then the high-level semantics corresponding to the inflow traffic features of each device are calculated from the inflow traffic of that device, where the inflow traffic of a device is the sum of the inflow traffic of all ports on the device;
finally, the high-level semantics corresponding to the inflow traffic features of the equipment room are calculated, where the inflow traffic of the equipment room is the sum of the inflow traffic of all ports in the equipment room.
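The aggregation used in FIG. 6 and FIG. 7, where device traffic is the sum over its ports and equipment-room traffic is the sum over all ports, can be sketched as follows. The nested-dictionary layout and the name `aggregate_traffic` are hypothetical choices for the illustration; the same function applies to inflow or outflow series.

```python
def aggregate_traffic(room):
    """Sum per-port traffic series into per-device and equipment-room series.

    room maps device name -> {port name -> list of samples}; all series are
    assumed to have the same length and sampling times.
    """
    device_totals = {}
    for device, ports in room.items():
        # Element-wise sum over all ports of this device.
        device_totals[device] = [sum(values) for values in zip(*ports.values())]
    # Element-wise sum over all devices, i.e. over all ports in the room.
    room_total = [sum(values) for values in zip(*device_totals.values())]
    return device_totals, room_total

room = {
    "device-1": {"port-1": [1, 2], "port-2": [3, 4]},
    "device-2": {"port-1": [10, 20]},
}
print(aggregate_traffic(room))
```

The port, device, and room series can then each be passed through the same feature extraction and histogram steps to obtain high-level semantics at every granularity.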
Next, the traffic judgment process of the embodiments of the present invention is illustrated by example at equipment-room, device, and port granularity.
Table 1 lists the variable names of the semantic similarities between the high-level semantics of the traffic features at equipment-room, device, and port granularity. The rows of Table 1 correspond in turn to equipment-room, device, and port granularity. The columns correspond in turn to the semantic similarity between the current and historical inflow low-frequency features, between the current and historical outflow low-frequency features, between the current and historical inflow high-frequency features, between the current and historical outflow high-frequency features, between the current inflow and current outflow low-frequency features, and between the current inflow and current outflow high-frequency features.
To distinguish the variable names clearly, the first position of the subscript of the semantic similarity χ² indicates the granularity: P denotes port granularity, I denotes device granularity, and R denotes equipment-room granularity. In the remaining positions, I denotes traffic inflow, O denotes traffic outflow, S denotes a low-frequency feature, and G denotes a high-frequency feature.
Table 1
In Table 1, the traffic features at port granularity are calculated by formulas (7), (8) and (15); the traffic features at device granularity by formulas (9), (11) and (15); the traffic features at equipment-room granularity by formulas (12), (14) and (15); and the high-level semantic similarity by formula (4).
In one example, the traffic judgment principle of the embodiments of the present invention at equipment-room, device, and port granularity is as follows.
First, the traffic judgment starts at equipment-room granularity, with the following rules:
(1a) if its similarity condition is satisfied, the low-frequency feature of the equipment room's inflow traffic is judged normal;
(1b) if its similarity condition is satisfied, the high-frequency feature of the equipment room's inflow traffic is judged normal;
(1c) if its similarity condition is satisfied, the low-frequency feature of the equipment room's outflow traffic is judged normal;
(1d) if its similarity condition is satisfied, the high-frequency feature of the equipment room's outflow traffic is judged normal;
(1e) if (1a) and (1b) are both satisfied, or (1c) and (1d) are both satisfied, go to (1f); otherwise the traffic at equipment-room granularity is judged abnormal;
(1f) if both of its conditions are satisfied, the traffic of the equipment room is judged normal; otherwise the traffic at equipment-room granularity is judged abnormal.
Next, if the judgment at equipment-room granularity is that the traffic is abnormal, proceed to (2) and check the traffic of each device in the equipment room. Taking the m-th device as an example:
(2a) if its similarity condition is satisfied, the low-frequency feature of the m-th device's inflow traffic is judged normal;
(2b) if its similarity condition is satisfied, the high-frequency feature of the m-th device's inflow traffic is judged normal;
(2c) if its similarity condition is satisfied, the low-frequency feature of the m-th device's outflow traffic is judged normal;
(2d) if its similarity condition is satisfied, the high-frequency feature of the m-th device's outflow traffic is judged normal;
(2e) if (2a) and (2b) are both satisfied, or (2c) and (2d) are both satisfied, go to (2f); otherwise the traffic of the m-th device is judged abnormal;
(2f) if both of its conditions are satisfied, the traffic of the m-th device is judged normal; otherwise the traffic of the m-th device is judged abnormal.
Next, if the judgment at device granularity is that the traffic is abnormal, proceed to (3) and check the traffic of each port on the device. Taking the l-th port as an example:
(3a) if its similarity condition is satisfied, the low-frequency feature of the l-th port's inflow traffic is judged normal;
(3b) if its similarity condition is satisfied, the high-frequency feature of the l-th port's inflow traffic is judged normal;
(3c) if its similarity condition is satisfied, the low-frequency feature of the l-th port's outflow traffic is judged normal;
(3d) if its similarity condition is satisfied, the high-frequency feature of the l-th port's outflow traffic is judged normal;
(3e) if (3a) and (3b) are both satisfied, or (3c) and (3d) are both satisfied, go to (3f); otherwise the traffic of the l-th port is judged abnormal;
(3f) if both of its conditions are satisfied, the traffic of the l-th port is judged normal; otherwise the traffic of the l-th port is judged abnormal.
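The rules (a) to (f) have the same shape at every granularity, and the drill-down from equipment room to device to port can be sketched as below. The sketch takes the outcome of each condition as a precomputed boolean, because the threshold expressions themselves are not reproduced here; the data layout and the names `level_is_normal` and `find_faulty_ports` are hypothetical.

```python
def level_is_normal(checks):
    """Apply rules (e) and (f) to the boolean outcomes of conditions a, b, c, d, f."""
    inflow_ok = checks["a"] and checks["b"]   # inflow low- and high-frequency
    outflow_ok = checks["c"] and checks["d"]  # outflow low- and high-frequency
    if not (inflow_ok or outflow_ok):         # rule (e) fails: abnormal
        return False
    return checks["f"]                        # rule (f): both of its conditions hold

def find_faulty_ports(room):
    """Drill down room -> device -> port, descending only into abnormal levels."""
    if level_is_normal(room["checks"]):
        return []
    faulty = []
    for device_name, device in room["devices"].items():
        if level_is_normal(device["checks"]):
            continue
        for port_name, port_checks in device["ports"].items():
            if not level_is_normal(port_checks):
                faulty.append((device_name, port_name))
    return faulty

ok = {"a": True, "b": True, "c": True, "d": True, "f": True}
bad = {"a": False, "b": True, "c": True, "d": False, "f": True}
room = {
    "checks": bad,
    "devices": {
        "device-1": {"checks": ok, "ports": {"port-1": bad}},
        "device-2": {"checks": bad, "ports": {"port-1": ok, "port-2": bad}},
    },
}
print(find_faulty_ports(room))  # [('device-2', 'port-2')]
```

Checking the coarse levels first means that ports are inspected only on devices already judged abnormal, which keeps the number of port-level comparisons small.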
It should be noted that the above judgment steps yield, for each traffic, the label l(z_n) indicating whether it is abnormal: 1 means the traffic is abnormal and 0 means it is normal. According to embodiments of the present invention, these judgment steps can be used both for training and learning the free parameter and for online judgment of traffic anomalies.
FIG. 8 is a schematic structural diagram of an apparatus for detecting abnormal network traffic according to an embodiment of the present invention. The apparatus in FIG. 8 includes a data collection module 801, a feature extraction module 802, a semantic extraction module 803, a similarity calculation module 804, a traffic state judgment module 805, and a confirmation module 806.
The data collection module 801 is configured to collect the current traffic data of multiple ports in the current time period and the historical traffic data corresponding to the current time period.
数据采集模块801的实现形式可以数据库。在一个示例中,可以分别建立一个当前数据库,用于保留当天所有的流量数据;和一个历史数据库,用于保存一年的历史流量数据。其中,在利用历史数据库进行流量数据采集的过程中,还可以对流量数据的状态进行人工判断,将认为是正常的流量数据保存在历史流量数据库。这样当收集完一整年的数据后,历史流量数据库方可准确发挥异常流量检测作用。The implementation form of the data acquisition module 801 can be a database. In an example, a current database can be established to keep all the flow data of the current day; and a historical database can be established to save the historical flow data of one year. Among them, in the process of using the historical database to collect the traffic data, the state of the traffic data can also be manually judged, and the traffic data considered to be normal can be saved in the historical traffic database. In this way, after collecting data for a whole year, the historical traffic database can accurately detect abnormal traffic.
具体地,还可以建立一个映射数据库,用于映射上下游设备及其端口之间的对应关系。映射数据库分别于当前数据库和历史数据库连接,在进行网络的异常流量检测时,可以通过映射数据库获取某个机房中的所有设备与端口的信息,并根据所有设备与端口的信息从当前数据库获取当前的流量数据,并根据当天的具体日期,向历史流量数据库请求相关日期的历史流量。Specifically, a mapping database may also be established for mapping the correspondence between upstream and downstream devices and their ports. The mapping database is connected to the current database and the historical database respectively. When detecting abnormal traffic on the network, the information of all devices and ports in a certain computer room can be obtained through the mapping database, and the current database can be obtained from the current database according to the information of all devices and ports. and according to the specific date of the day, request the historical traffic of the relevant date from the historical traffic database.
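The lookup path through the three databases can be sketched as follows. This is an illustrative sketch only: plain dictionaries stand in for the mapping, current, and historical databases, and the room name, traffic values, and the function `collect` are hypothetical. How the "relevant date" is chosen is not specified here, so the caller passes it in explicitly.

```python
from datetime import date

# Hypothetical in-memory stand-ins for the three databases.
# Mapping database: equipment room -> list of (device, port) pairs.
mapping_db = {"room-A": [("FW13", "Trunk2"), ("RT06", "Trunk1")]}

# Current database: (device, port) -> traffic samples of the current period.
current_db = {
    ("FW13", "Trunk2"): [5.0, 6.0, 5.5],
    ("RT06", "Trunk1"): [0.0, 0.0, 0.0],
}

# Historical database: ((device, port), date) -> traffic samples judged normal.
history_db = {
    (("FW13", "Trunk2"), date(2015, 3, 23)): [5.1, 5.9, 5.4],
    (("RT06", "Trunk1"), date(2015, 3, 23)): [3.0, 3.2, 3.1],
}


def collect(room: str, history_date: date) -> dict:
    """Return {(device, port): (current_samples, historical_samples)} for a room."""
    result = {}
    for port_key in mapping_db[room]:           # step 1: devices and ports of the room
        current = current_db[port_key]          # step 2: current traffic
        history = history_db[(port_key, history_date)]  # step 3: matching history
        result[port_key] = (current, history)
    return result


pairs = collect("room-A", date(2015, 3, 23))
```

Keeping the mapping separate from the traffic stores means that a change in room topology only touches `mapping_db`, while the traffic tables stay keyed by device and port.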
The feature extraction module 802 is configured to use a preset filter to extract the current traffic feature and the historical traffic feature of a port from the current traffic data and the historical traffic data, respectively.
The semantic extraction module 803 is configured to extract the high-level semantics of the current traffic and the high-level semantics of the historical traffic of the port from the current traffic feature and the historical traffic feature, respectively.
The similarity calculation module 804 is configured to calculate the semantic similarity between the current traffic and the historical traffic of the port according to the high-level semantics of the current traffic and the high-level semantics of the historical traffic.
The traffic state judgment module 805 is configured to judge whether the semantic similarity falls within a preset first value range, and to obtain a decision result for the current traffic state of the port.
The confirmation module 806 is configured to identify the faulty port according to the decision results for the current traffic state of each port.
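The chain from module 802 to module 806 can be sketched end to end as follows. Every concrete choice in this sketch is an assumption made for illustration and is not taken from the patent: a centered moving average stands in for the preset filter, the mean absolute value of each band stands in for the high-level semantics, a simple ratio stands in for the semantic similarity, and the interval (0.5, 2.0) stands in for the preset first value range. All function names are hypothetical.

```python
from statistics import fmean


def split_bands(series, window=3):
    """Stand-in for module 802: split a series into low and high bands.

    Low band: centered moving average (shorter windows at the edges).
    High band: residual after removing the low band.
    """
    half = window // 2
    low = [fmean(series[max(0, i - half): i + half + 1]) for i in range(len(series))]
    high = [x - l for x, l in zip(series, low)]
    return low, high


def summarize(band):
    """Stand-in for module 803: reduce a band to one scalar (mean absolute value)."""
    return fmean(abs(x) for x in band)


def similarity(cur, hist, eps=1e-9):
    """Stand-in for module 804: ratio of current to historical summary."""
    if hist < eps:
        # Both negligible: treat as identical; otherwise the ratio is unbounded.
        return 1.0 if cur < eps else float("inf")
    return cur / hist


def port_state(current, history, interval=(0.5, 2.0)):
    """Stand-in for module 805: 1 = abnormal, 0 = normal."""
    lo, hi = interval
    for cur_band, hist_band in zip(split_bands(current), split_bands(history)):
        s = similarity(summarize(cur_band), summarize(hist_band))
        if not lo <= s <= hi:
            return 1
    return 0


def faulty_ports(current_by_port, history_by_port):
    """Stand-in for module 806: list the ports whose state is abnormal."""
    return [p for p in current_by_port
            if port_state(current_by_port[p], history_by_port[p]) == 1]


history = {"Trunk1": [5, 6, 5, 6, 5, 6], "Trunk2": [5, 6, 5, 6, 5, 6]}
current = {"Trunk1": [0, 0, 0, 0, 0, 0], "Trunk2": [5, 6, 5, 6, 5, 6]}
flagged = faulty_ports(current, history)
```

In this toy input the port with no traffic is flagged because its low-band ratio is 0, while the port whose traffic matches its history yields a ratio of 1 in both bands and is left unflagged.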
FIG. 9 is a schematic diagram of the hardware structure of an apparatus for detecting abnormal network traffic according to an embodiment of the present invention. As shown in FIG. 9, the apparatus includes a processor 901, a memory 902, a communication interface 903, and a bus 910. The processor 901, the memory 902, and the communication interface 903 are connected through the bus 910 and communicate with one another over it.
Specifically, the processor 901 may include a central processing unit (CPU) or an application-specific integrated circuit (ASIC), or may be configured as one or more integrated circuits that implement embodiments of the present invention.
The memory 902 may include mass storage for data or instructions. By way of example and not limitation, the memory 902 may include a hard disk drive (HDD), a floppy disk drive, flash memory, an optical disc, a magneto-optical disc, magnetic tape, a Universal Serial Bus (USB) drive, or a combination of two or more of these. Where appropriate, the memory 902 may include removable or non-removable (fixed) media. Where appropriate, the memory 902 may be internal or external to the resource interface device. In particular embodiments, the memory 902 is non-volatile solid-state memory. In particular embodiments, the memory 902 includes read-only memory (ROM). Where appropriate, the ROM may be mask-programmed ROM, programmable ROM (PROM), erasable PROM (EPROM), electrically erasable PROM (EEPROM), electrically alterable ROM (EAROM), flash memory, or a combination of two or more of these.
The communication interface 903 is mainly used to implement communication between the modules, apparatuses, units, and/or devices in the embodiments of the present invention.
In other words, the apparatus for detecting abnormal network traffic may be implemented to include the processor 901, the memory 902, the communication interface 903, and the bus 910. The processor 901, the memory 902, and the communication interface 903 are connected through the bus 910 and communicate with one another over it. The memory 902 stores program code; the processor 901 reads the executable program code stored in the memory 902 and runs the corresponding program to perform the method for detecting abnormal network traffic described above, thereby implementing the method and apparatus described with reference to FIG. 1 to FIG. 8.
FIG. 10 is a schematic layout diagram of a test scenario according to an embodiment of the present invention. The network devices shown in FIG. 10 include two firewalls (FW13 and FW14), two switches (SW07 and SW08), two routers (RT05 and RT06), and an abstract SAE gateway pool (SAEGW). The current traffic data was collected from 11:10 to 11:40 on March 14, 2016. The purpose of the experiment is to verify the effectiveness of the proposed scheme and to test how well the low-frequency and high-frequency features capture the short-term behavior of the current and historical traffic.
Table 2 and Table 3 show the semantic similarity of the traffic features of the equipment room and of the devices, respectively.
Table 2
Table 3
Referring to Table 2, although the current and historical inflow low-frequency features of the equipment room are within the range of decision formula (1a), the current low-frequency feature is about 1/6 lower than the historical low-frequency feature. The current high-frequency feature is more than about 6 times the historical high-frequency feature, so it can be determined that abnormal traffic and a potential device failure have occurred in the equipment room.
Referring to Table 3, extracting the short-term traffic features of the devices in the equipment room one by one shows that the current high-frequency features of FW13 and FW14 are significantly larger than their historical high-frequency features. Analyzing the traffic features of each of these devices at port granularity identifies the ports on FW13 and FW14 where abnormal traffic occurred. Referring to FIG. 11 and FIG. 12, the data in FIG. 11 is normal traffic data collected from the Trunk2 port at 11:40 on March 23, 2015, and the data in FIG. 12 is abnormal traffic data collected from the Trunk2 port at 11:40 on March 14, 2016. Comparing the traffic curves of some ports of FW13 in FIG. 11 and FIG. 12 shows a traffic singularity on port Trunk2 (indicated by the arrow in FIG. 12), which verifies the effectiveness of the proposed scheme for abnormal traffic detection.
Referring to Table 3, the high-frequency feature of RT06 does not satisfy formula (1c), and its current low-frequency feature is significantly smaller than its historical low-frequency feature. Calculation shows that both the low-frequency and high-frequency features of the Trunk1 port of RT06 are clearly abnormal. Referring to FIG. 13 and FIG. 14, the data in FIG. 13 is normal traffic data collected from the Trunk1 port at 11:40 on March 23, 2015, and the data in FIG. 14 is abnormal traffic data collected from the Trunk1 port at 11:40 on March 14, 2016, when the port failed. Comparing the traffic curves of some ports of RT06 in FIG. 13 and FIG. 14 shows that no traffic was collected on port Trunk1 (the traffic curve indicated by the arrow in FIG. 13), which means that port Trunk1 had failed and verifies the effectiveness of the proposed scheme for abnormal traffic detection.
Referring to Table 3, the current inflow and outflow features of RT06, FW13, and FW14 match each other fairly well, whereas the inflow and outflow features of SW08 do not match. If the match between inflow and outflow features were used as the basis for judging the traffic state, a misjudgment would occur. This illustrates, from another angle, the importance of using historical traffic for abnormal traffic detection.
It should be understood that the embodiments in this specification are described in a progressive manner; identical or similar parts of the embodiments may be cross-referenced, and each embodiment focuses on its differences from the others. For the apparatus embodiments, refer to the description of the method embodiments for the related parts. Embodiments of the present invention are not limited to the specific steps and structures described above and shown in the figures. Those skilled in the art may make various changes, modifications, and additions, or change the order of the steps, after grasping the spirit of the embodiments of the present invention. For brevity, detailed descriptions of known methods and techniques are omitted here.
Embodiments of the present invention may be implemented in other specific forms without departing from their spirit and essential characteristics. For example, the algorithms described in particular embodiments may be modified while the system architecture remains within the basic spirit of the embodiments of the present invention. The present embodiments are therefore to be regarded in all respects as illustrative and not restrictive. The scope of the embodiments of the present invention is defined by the appended claims rather than by the foregoing description, and all changes that fall within the meaning and range of equivalents of the claims are included in that scope.
Claims (12)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201710648870.3A CN109327345A (en) | 2017-08-01 | 2017-08-01 | Method and device for detecting abnormal network traffic, and computer-readable storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN109327345A true CN109327345A (en) | 2019-02-12 |
Family
ID=65245284
Country Status (1)
Country | Link |
---|---|
CN (1) | CN109327345A (en) |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102271091A (en) * | 2011-09-06 | 2011-12-07 | 电子科技大学 | A Classification Method for Network Abnormal Events |
CN104836694A (en) * | 2014-02-11 | 2015-08-12 | 中国移动通信集团河北有限公司 | Method and device for monitoring network |
US9268938B1 (en) * | 2015-05-22 | 2016-02-23 | Power Fingerprinting Inc. | Systems, methods, and apparatuses for intrusion detection and analytics using power characteristics such as side-channel information collection |
CN105553998A (en) * | 2015-12-23 | 2016-05-04 | 中国电子科技集团公司第三十研究所 | Network attack abnormality detection method |
Cited By (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN109889530A (en) * | 2019-03-05 | 2019-06-14 | 北京长亭科技有限公司 | Web application firewall system and computer storage medium |
CN109922082A (en) * | 2019-04-10 | 2019-06-21 | 杭州数梦工场科技有限公司 | The detection method and device and computer readable storage medium of Traffic Anomaly |
CN112242971A (en) * | 2019-07-16 | 2021-01-19 | 中兴通讯股份有限公司 | Flow abnormity detection method, device, network equipment and storage medium |
CN112242971B (en) * | 2019-07-16 | 2023-06-16 | 中兴通讯股份有限公司 | Traffic abnormality detection method and device, network equipment and storage medium |
CN112307077A (en) * | 2019-12-11 | 2021-02-02 | 深圳新阳蓝光能源科技股份有限公司 | Data archiving method, device, server and system |
CN112165471A (en) * | 2020-09-22 | 2021-01-01 | 杭州安恒信息技术股份有限公司 | Industrial control system flow abnormity detection method, device, equipment and medium |
CN118473834A (en) * | 2024-07-12 | 2024-08-09 | 商飞智能技术有限公司 | Network traffic characteristic identification method and device and electronic equipment |
CN119051996A (en) * | 2024-10-31 | 2024-11-29 | 上海斗象信息科技有限公司 | Training method and device, monitoring method and equipment of abnormal flow detection model |
CN119051996B (en) * | 2024-10-31 | 2025-02-25 | 上海斗象信息科技有限公司 | Training method and device for abnormal flow detection model, monitoring method and equipment |
CN119416134A (en) * | 2025-01-08 | 2025-02-11 | 曙光信息产业股份有限公司 | Method and device for detecting abnormal operation of intelligent computing network |
CN119416134B (en) * | 2025-01-08 | 2025-06-17 | 曙光信息产业股份有限公司 | Job anomaly detection method and device for intelligent computing network |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
RJ01 | Rejection of invention patent application after publication | ||
Application publication date: 20190212 |