JP6662812B2

JP6662812B2 - Calculation device and calculation method

Info

Publication number: JP6662812B2
Application number: JP2017111933A
Authority: JP
Inventors: 五十嵐　弓将
Original assignee: Nippon Telegraph and Telephone Corp
Current assignee: Nippon Telegraph and Telephone Corp
Priority date: 2017-06-06
Filing date: 2017-06-06
Publication date: 2020-03-11
Anticipated expiration: 2037-06-06
Also published as: JP2018207345A

Description

The present invention relates to a calculation device and a calculation method.

Communication measurement methods used in network monitors and LAN (Local Area Network) analyzers for observing computer networks fall broadly into two categories: measurement by packet capture and measurement of flow statistical information.

Measurement by packet capture is a technique that copies packets, which are the transmission units of information exchanged between the computer systems constituting a computer network, and analyzes the contents of the copied packets.

Measurement of flow statistical information is a technique in which network devices such as routers that constitute a computer network are equipped with a function for measuring the amount of communication passing through them, and the amount of information flowing through the computer network is thereby measured. Here, flow statistical information refers to statistical values such as the amount of information of the communication passing through a network device. Such a statistical value is also called a counter. One example of traffic statistical information is the number of bytes.

sFlow (registered trademark) (see, for example, Non-Patent Document 1) and NetFlow (see, for example, Non-Patent Document 2) are known as representative technical specifications for flow statistical information. sFlow is a traffic measurement technique based on statistical estimation that uses packet sampling and a counter for each communication line interface. NetFlow is a technique in which network devices such as routers and switches measure the number of packets and the number of bytes per flow.

Other known methods for measuring flow statistical information include: a method that arranges the packets of the flows in both the upstream and downstream directions of a session in order of observation time and uses their packet sizes as an array (see, for example, Non-Patent Document 3); a method that calculates the mean, median, and variance of the packet length of a session and the variance of the packet arrival interval (see, for example, Non-Patent Document 4); and a method that calculates the total number of packets and total number of bytes per session, the number of packets carrying a specific flag, and the mean, maximum, and variance of the size of all packets (see, for example, Non-Patent Document 5).

Non-Patent Document 1: Traffic Monitoring using sFlow, [online], [retrieved May 25, 2017], Internet (http://www.sflow.org/sFlowOverview.pdf)
Non-Patent Document 2: Omar Santos, "Network Security with NetFlow and IPFIX", Cisco Press, September 2015
Non-Patent Document 3: Yuji Izumi, Kazuyuki Tanaka, "Web Application Identification Based on Traffic Analysis", IEICE Technical Report CS2013-40 (2013-09), pp.61-66
Non-Patent Document 4: Tsuyoshi Kitamura, Takayuki Shizuno, Toshiya Okabe, "Application Identification Method Based on Flow Behavior Analysis", IEICE Technical Report NS2005-136 (2005-12), pp.13-16
Non-Patent Document 5: Liu Yingqiu, Li Wei, Li Yunchun, "Network Traffic Classification Using K-means Clustering", Second International Multisymposium on Computer and Computational Sciences, pp.360-365

However, the conventional techniques have the problem that effective communication measurement cannot always be performed with limited processing resources. For example, measurement by packet capture requires a large amount of storage and computing resources. Consequently, when the available processing resources are limited, measurement by packet capture may not be feasible. Measurement of flow statistical information, on the other hand, may fail to provide effective communication measurement.

For example, sFlow described in Non-Patent Document 1 samples packets intermittently at a fixed rate. Detection omissions and errors may therefore occur for flows with a low probability of being sampled, such as communication with very few packets or of very short duration. In addition, the measurement accuracy is affected by the method of selecting packets and by the period at which samples or counters are collected. As a result, effective communication measurement cannot always be performed.

Also, for example, NetFlow described in Non-Patent Document 2 measures the number of packets and the number of bytes per flow in network devices such as routers and switches. For an existing router or switch to perform both its original processing, such as packet switching, and the measurement-related processing, additional computing resources may be required.

The techniques described in Non-Patent Documents 3 to 5 perform measurement by referring only to header information such as per-packet length, arrival interval, and flags, without analyzing the data payload of the captured packets. However, the information that can be measured is limited, and generating additional flow statistical information requires measurement by packet capture.

The calculation device of the present invention includes: an acquisition unit that acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination; a classification unit that classifies the flow statistical information into groups such that the session of the underlying traffic is the same within each group; and a calculation unit that calculates, based on the flow statistical information, statistical information on the traffic of each group classified by the classification unit.

According to the present invention, effective communication measurement can be performed using limited processing resources.

FIG. 1 is a diagram illustrating an example of the configuration of a calculation system according to the first embodiment.
FIG. 2 is a diagram illustrating an example of the configuration of a calculation device according to the first embodiment.
FIG. 3 is a diagram for explaining a calculation method performed by the calculation device according to the first embodiment.
FIG. 4 is a flowchart illustrating the flow of processing of a router according to the first embodiment.
FIG. 5 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 6 is a diagram illustrating an example of the data configuration of a primary storage unit according to the first embodiment.
FIG. 7 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 8 is a flowchart illustrating the flow of processing of the calculation device according to the first embodiment.
FIG. 9 is a diagram illustrating an example of the data configuration of a secondary storage unit according to the first embodiment.
FIG. 10 is a diagram illustrating an example of a computer that executes a calculation program.

[Configuration of First Embodiment]
Embodiments of the calculation device and the calculation method according to the present application are described in detail below with reference to the drawings. The present invention is not limited by the embodiments described below. First, the configuration of the calculation system according to the first embodiment is described with reference to FIG. 1. FIG. 1 is a diagram illustrating an example of the configuration of the calculation system according to the first embodiment. As shown in FIG. 1, the calculation system 1 includes a calculation device 10, a client 20, a server 30, and a router 40.

Here, the router 40 generates flow statistical information based on the traffic generated by communication between the client 20 and the server 30. The flow statistical information includes the number of packets and the number of bytes. The communication for which flow statistical information is generated in the calculation system 1 is not limited to communication between a client and a server; it may be communication between clients, communication between servers, or communication by devices other than clients and servers. The device that generates the flow statistical information is not limited to the router 40 and may be any network device.

Here, flow statistical information can be described as statistical information obtained by dividing the information exchanged between communicating computer systems into units called flows, based on the source, destination, protocol, port, and the like that identify the computer systems, and by measuring and calculating the communication volume of each flow. Because a flow includes attributes concerning the direction in which information travels, namely the source and destination of the communication, there are normally two kinds of flows: transmission (outbound) and reception (return). In the present embodiment, the unit combining these two kinds of flows is called a session. In the following description, in the calculation system 1, the direction from the client 20 to the server 30 is called upstream, and the direction from the server 30 to the client 20 is called downstream.

In the present embodiment, as described above, the router 40, which is a network device, generates the flow statistical information. The router 40 may generate the flow statistical information per communication line interface of a computer system, or per finer unit of communication.

The calculation device 10 performs statistical operations on sessions based on the flow statistical information generated by the router 40, and can thereby obtain information that cannot be obtained from the flow statistical information alone, for example, information other than the number of packets and the number of bytes.

Next, the configuration of the calculation device 10 is described with reference to FIG. 2. FIG. 2 is a diagram illustrating an example of the configuration of the calculation device according to the first embodiment. As shown in FIG. 2, the calculation device 10 includes a communication unit 11, a storage unit 12, and a control unit 13.

The communication unit 11 performs data communication with other devices via a network. For example, the communication unit 11 is a NIC (Network Interface Card). The communication unit 11 performs data communication with, for example, the router 40.

The storage unit 12 is a storage device such as an HDD (Hard Disk Drive), an SSD (Solid State Drive), or an optical disk. The storage unit 12 may instead be a rewritable semiconductor memory such as a RAM (Random Access Memory), a flash memory, or an NVSRAM (Non Volatile Static Random Access Memory). The storage unit 12 stores the OS (Operating System) and the various programs executed by the calculation device 10. The storage unit 12 also stores various information used in executing the programs. The storage unit 12 includes a primary storage unit 121 and a secondary storage unit 122.

The control unit 13 controls the entire calculation device 10. The control unit 13 is, for example, an electronic circuit such as a CPU (Central Processing Unit) or an MPU (Micro Processing Unit), or an integrated circuit such as an ASIC (Application Specific Integrated Circuit) or an FPGA (Field Programmable Gate Array). The control unit 13 has an internal memory for storing programs that define various processing procedures and control data, and executes each process using the internal memory. The control unit 13 functions as various processing units through the operation of various programs. For example, the control unit 13 includes an acquisition unit 131, a classification unit 132, a calculation unit 133, and a saving unit 134.

The acquisition unit 131 acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination. The acquisition unit 131 acquires the flow statistical information generated by a network device such as the router 40.

The classification unit 132 classifies the flow statistical information into groups such that the session of the underlying traffic is the same within each group. For example, the classification unit 132 extracts information capable of identifying a session from the flow statistical information acquired by the acquisition unit 131, and classifies the flow statistical information by session of the underlying traffic based on the extracted information. Information capable of identifying a session includes the source and destination contained in the flow statistical information, the generation time of the flow statistical information, and the like.

For example, the classification unit 132 classifies first flow statistical information and second flow statistical information into the same group when the source and destination contained in the first flow statistical information are identical to the destination and source, respectively, contained in the second flow statistical information, and both the first and the second flow statistical information are based on traffic that occurred within a predetermined period. In this way, the calculation device 10 classifies the flow statistical information into per-session groups.
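The grouping rule described above can be sketched as follows. This is only an illustrative sketch: the `FlowRecord` type, its field names, and the choice to measure the predetermined period from the first record of a group are assumptions introduced here, not details given in the description.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow statistical record (field names are assumptions)."""
    src: str
    dst: str
    time: float  # time of the underlying traffic, in seconds


def classify_by_session(records, window):
    """Group records whose endpoints match (in either direction) and whose
    times fall within `window` seconds of the first record of the group."""
    groups = []  # each entry: (endpoint_key, start_time, members)
    for rec in sorted(records, key=lambda r: r.time):
        # Direction-independent key: (A, B) and (B, A) map to the same pair.
        key = tuple(sorted((rec.src, rec.dst)))
        for group_key, start, members in groups:
            if group_key == key and rec.time - start <= window:
                members.append(rec)
                break
        else:
            groups.append((key, rec.time, [rec]))
    return [members for _, _, members in groups]
```

With a 10-second window, an upstream record from A to B at 0 s and a downstream record from B to A at 1 s fall into one group, while a record between the same endpoints at 100 s starts a new group.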

The calculation unit 133 calculates, based on the flow statistical information, statistical information on the traffic of each group classified by the classification unit 132. For example, when the acquisition unit 131 acquires the number of packets and the number of bytes as flow statistical information, the calculation unit 133 calculates, as statistical information for each group, the average packet size, the maximum of the average packet size, the minimum of the average packet size, the standard deviation of the average packet size, the time average of the number of bytes, and information indicating whether packets were transmitted or received at each time. The calculation unit 133 can also calculate, as statistical information, a co-occurrence matrix based on the presence or absence of transmitted or received packets at each time.

The saving unit 134 saves the flow statistical information acquired by the acquisition unit 131, the calculation results of the calculation unit 133, and the like in the primary storage unit 121 or the secondary storage unit 122. In the following description, the flow statistical information acquired by the acquisition unit 131 is called input information, and the statistical information calculated by the calculation unit 133 is called session statistical information.

The input information and the calculation of the session statistical information are now described concretely with reference to FIG. 3. FIG. 3 is a diagram for explaining the calculation method performed by the calculation device according to the first embodiment.

The acquisition unit 131 generates, as input information, a session identifier INP_1 that uniquely identifies a session. The acquisition unit 131 also acquires the generation time INP_2 of the input information, the number of upstream packets INP_3, the number of downstream packets INP_4, the number of upstream bytes INP_5, the number of downstream bytes INP_6, and the elapsed time INP_7 since the session was established.

The session identifier INP_1 is a value that uniquely identifies one session established between a pair of communicating computer systems. The acquisition unit 131 can generate the session identifier by, for example, performing a bit operation, a hash operation, or the like on the addresses identifying the computer systems, the protocol number, the ports, the time, and so on.
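One way to derive such an identifier with a hash operation is sketched below. The function name, its parameters, the use of SHA-256, the 16-character truncation, and the sorting of the two endpoints (so that both directions of a session yield the same value) are all assumptions made for illustration; the description only states that a bit or hash operation over these attributes may be used.

```python
import hashlib


def session_identifier(addr_a, port_a, addr_b, port_b, protocol, start_time):
    """Hash the session attributes into a short hexadecimal identifier."""
    # Sort the endpoints so that upstream and downstream flows of the same
    # session produce identical hash input (an assumption of this sketch).
    first, second = sorted([(addr_a, port_a), (addr_b, port_b)])
    material = "|".join([
        f"{first[0]}:{first[1]}",
        f"{second[0]}:{second[1]}",
        str(protocol),
        str(start_time),
    ])
    return hashlib.sha256(material.encode("utf-8")).hexdigest()[:16]
```

Truncating the digest shortens the key at the cost of a higher collision probability, so the full digest may be preferable when many sessions are tracked.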

The generation time INP_2 is the time at which the input information acquired by the acquisition unit 131 was generated. The number of upstream packets INP_3 is the cumulative number of packets in the upstream direction (an integer of 0 or more) from the session start time to the generation time INP_2 in the session identified by the session identifier INP_1. The number of downstream packets INP_4 is the cumulative number of packets in the downstream direction (an integer of 0 or more) over the same period in the same session. The number of upstream bytes INP_5 is the cumulative number of bytes in the upstream direction (an integer of 0 or more) over the same period in the same session. The number of downstream bytes INP_6 is the cumulative number of bytes in the downstream direction (an integer of 0 or more) over the same period in the same session. The elapsed time INP_7 is the time elapsed from the session start time to the generation time INP_2. For more accurate calculation, the generation time INP_2 and the elapsed time INP_7 desirably have a resolution down to microseconds or milliseconds.

Here, the acquisition unit 131 acquires the input information generated at fixed intervals dT from the session start time. For example, as shown in FIG. 3, the acquisition unit 131 first acquires the input information INP(T_1) generated at time T_1, and then acquires the input information INP(T_2) generated at time T_2, when the time dT has elapsed from time T_1. In this manner, the acquisition unit 131 sequentially acquires input information until it acquires the input information generated at time T_n, which is the session end time. The input information at generation time T_k is denoted INP(T_k). INP(T_k) includes INP_1(T_k), INP_2(T_k), INP_3(T_k), INP_4(T_k), INP_5(T_k), INP_6(T_k), and INP_7(T_k). The interval dT at which the input information is generated is desirably constant, but may vary.

Next, the calculation unit 133 calculates two variables as session statistical information: the upstream average packet size AVE_1 and the downstream average packet size AVE_2 of each session. Based on INP(T_k) and the input information INP(T_{k-1}) at time T_{k-1}, which immediately precedes time T_k, the calculation unit 133 calculates the upstream average packet size AVE_1(T_k) and the downstream average packet size AVE_2(T_k) at time T_k as in equations (1) and (2), respectively.
AVE_1(T_k) = {INP_4(T_k) - INP_4(T_{k-1})} ÷ {INP_3(T_k) - INP_3(T_{k-1})} ... (1)
AVE_2(T_k) = {INP_6(T_k) - INP_6(T_{k-1})} ÷ {INP_5(T_k) - INP_5(T_{k-1})} ... (2)
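The per-interval average can be sketched as below. The sketch follows the verbal definition of the average packet size, that is, the difference in the cumulative byte count divided by the difference in the cumulative packet count for the same direction, and uses named fields rather than the INP_n indices. The `InputInfo` type, its field names, and the choice to return 0.0 when no packet arrived in the interval are assumptions of this sketch.

```python
from dataclasses import dataclass


@dataclass
class InputInfo:
    """One input-information record; field names are hypothetical."""
    up_packets: int    # cumulative upstream packets
    down_packets: int  # cumulative downstream packets
    up_bytes: int      # cumulative upstream bytes
    down_bytes: int    # cumulative downstream bytes


def _ratio(d_bytes, d_packets):
    # Assumption: report 0.0 when no packet arrived in the interval,
    # which also avoids division by zero.
    return d_bytes / d_packets if d_packets > 0 else 0.0


def average_packet_sizes(curr, prev):
    """Return (upstream, downstream) average packet size for prev -> curr."""
    up = _ratio(curr.up_bytes - prev.up_bytes,
                curr.up_packets - prev.up_packets)
    down = _ratio(curr.down_bytes - prev.down_bytes,
                  curr.down_packets - prev.down_packets)
    return up, down
```

For example, if the upstream counters move from 10 packets and 1000 bytes to 14 packets and 1600 bytes, the upstream average for that interval is 600 / 4 = 150 bytes per packet.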

In this way, the calculation unit 133 can calculate the average packet size by dividing the difference in the number of bytes between time T_{k-1} and time T_k by the difference in the number of packets over the same interval. The average packet size at generation time T_k is denoted AVE(T_k). AVE(T_k) includes AVE_1(T_k) and AVE_2(T_k).

Each average packet size is the average number of bytes per packet for the packets that flowed between time T_{k-1} and time T_k. When INP(T_{k-1}) does not exist (for example, when k = 1, that is, when INP(T_k) is the first input information generated after the start of the session), the calculation unit 133 performs the calculation with the preceding counter values used in equations (1) and (2), INP_3(T_{k-1}) through INP_6(T_{k-1}), set to 0. The saving unit 134 keeps INP(T_k) in the primary storage unit 121 until AVE_1(T_{k+1}) has been calculated. After AVE_1(T_{k+1}) has been calculated, the saving unit 134 may discard INP(T_k).
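The retention rule above means that only the most recent record needs to be held per session. A minimal sketch for one direction of one session follows; the class name, method name, and the 0.0 result for an interval without packets are assumptions introduced for illustration.

```python
class PacketSizeTracker:
    """Keeps only the previous cumulative counters for one direction."""

    def __init__(self):
        self._prev = None  # (cumulative packets, cumulative bytes)

    def update(self, packets, byte_count):
        """Consume the next cumulative counters and return the average
        packet size for the interval since the previous update."""
        # For the first record there is no predecessor, so the previous
        # counters are treated as 0.
        prev_packets, prev_bytes = self._prev if self._prev is not None else (0, 0)
        d_packets = packets - prev_packets
        d_bytes = byte_count - prev_bytes
        # The older record is no longer needed once this interval is done.
        self._prev = (packets, byte_count)
        return d_bytes / d_packets if d_packets > 0 else 0.0
```

Because each update overwrites the stored counters, memory use per session stays constant regardless of how long the session lasts.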

Further, using AVE(T_k), the calculation unit 133 calculates the upstream maximum average packet size AVE_MAX_1(T_k), the downstream maximum average packet size AVE_MAX_2(T_k), the upstream minimum average packet size AVE_MIN_1(T_k), the downstream minimum average packet size AVE_MIN_2(T_k), the standard deviation of the upstream average packet size AVE_SD_1(T_k), and the standard deviation of the downstream average packet size AVE_SD_2(T_k) as in equations (3) to (8).
AVE_MAX_1(T_k) = maximum of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (3)
AVE_MAX_2(T_k) = maximum of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (4)
AVE_MIN_1(T_k) = minimum of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (5)
AVE_MIN_2(T_k) = minimum of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (6)
AVE_SD_1(T_k) = standard deviation of {AVE_1(T_1), AVE_1(T_2), ..., AVE_1(T_k)} ... (7)
AVE_SD_2(T_k) = standard deviation of {AVE_2(T_1), AVE_2(T_2), ..., AVE_2(T_k)} ... (8)
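Equations (3) to (8) reduce to a maximum, a minimum, and a standard deviation over the averages observed so far in one direction, which might be sketched as follows. The function name and the dictionary keys are hypothetical, and the use of the population standard deviation (`statistics.pstdev`) is an assumption, since the description does not say whether the population or the sample form is intended.

```python
import statistics


def summarize_averages(averages):
    """Return max, min, and standard deviation of the average packet
    sizes AVE(T_1) ... AVE(T_k) for one direction of one session."""
    return {
        "max": max(averages),
        "min": min(averages),
        # Population standard deviation; an assumption of this sketch.
        "sd": statistics.pstdev(averages),
    }
```

Passing the list truncated to its first k entries yields the values at time T_k; `statistics.stdev` could be substituted if the sample standard deviation is wanted, but it requires at least two values.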

The calculation unit 133 also calculates the time average of the number of bytes that flowed in each of the upstream and downstream directions from the start to the end of the session, that is, the upstream flow rate FRATE_1 and the downstream flow rate FRATE_2, as in equations (9) and (10), respectively. Here, T_n is the time at which the session ended. As equations (9) and (10) show, the calculation unit 133 can calculate the flow rates solely from the input information at the end of the session, that is, INP(T_n).
FRATE_1 = INP_4(T_n) ÷ INP_7(T_n) ... (9)
FRATE_2 = INP_6(T_n) ÷ INP_7(T_n) ... (10)
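Since the counters are cumulative, the flow rates need only the final record of the session. A minimal sketch, using named arguments in place of the INP_n indices, follows; the function name, the parameter names, and the handling of a zero elapsed time are assumptions of this sketch.

```python
def flow_rates(up_bytes_total, down_bytes_total, elapsed):
    """Return (upstream, downstream) bytes per unit time over the whole
    session, computed from the final cumulative counters."""
    if elapsed <= 0:
        # Assumption: a session with no measurable duration has rate 0.0.
        return 0.0, 0.0
    return up_bytes_total / elapsed, down_bytes_total / elapsed
```

The unit of the result follows the unit of the elapsed time, so an elapsed time in milliseconds gives bytes per millisecond.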

The calculation unit 133 calculates a packet co-occurrence matrix. Here, a co-occurrence matrix is a matrix that expresses relative relationships between elements such as pixels, or the appearance patterns of words, and has generally been used in image recognition, language processing, and the like. In the present embodiment, the calculation unit 133 calculates the packet co-occurrence matrix as follows. First, for a given session, the calculation unit 133 indicates with a binary value of 0 or 1 whether one or more packets flowed between time T_{k-1} and time T_k. This binary value is denoted BOOL_1(T_k) for the upstream direction and BOOL_2(T_k) for the downstream direction.

Here, when INP_3(T_k) or INP_5(T_k) is greater than INP_3(T_{k-1}) or INP_5(T_{k-1}), respectively, the calculation unit 133 regards one or more packets as having flowed in the upstream direction between time T_{k-1} and time T_k, and sets the binary value to 1. This calculation can be expressed as in equations (11) and (12) below.
BOOL_1(T_k) = 1 if INP_3(T_k) - INP_3(T_{k-1}) > 0; 0 if INP_3(T_k) - INP_3(T_{k-1}) = 0 ... (11)
BOOL_2(T_k) = 1 if INP_5(T_k) - INP_5(T_{k-1}) > 0; 0 if INP_5(T_k) - INP_5(T_{k-1}) = 0 ... (12)

The binary values can also be calculated from the average packet size, as in Expressions (13) and (14).
BOOL_1(T_k)' = 1 if AVE_1(T_k) > 0; 0 if AVE_1(T_k) = 0 ... (13)
BOOL_2(T_k)' = 1 if AVE_2(T_k) > 0; 0 if AVE_2(T_k) = 0 ... (14)
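The two ways of deriving the binary value, Expressions (11)/(12) and Expressions (13)/(14), can be sketched as follows. The function names are hypothetical, and treating the counter before T_1 as 0 is an assumption made here so that the sequence has one value per time.

```python
def packet_flag(prev_count: int, curr_count: int) -> int:
    """Expressions (11)/(12): 1 if the cumulative packet counter grew, else 0."""
    return 1 if curr_count - prev_count > 0 else 0


def packet_flag_from_average(avg_size: float) -> int:
    """Expressions (13)/(14): 1 if the average packet size is positive, else 0."""
    return 1 if avg_size > 0 else 0


def flags_from_counters(counters, initial=0):
    """Turn cumulative packet counters at T_1..T_n into a 0/1 sequence.

    `initial` is the counter value assumed before T_1 (an assumption of
    this sketch, not something stated in the text).
    """
    flags = []
    prev = initial
    for curr in counters:
        flags.append(packet_flag(prev, curr))
        prev = curr
    return flags
```

For example, counters that stay flat between two times produce a 0 for that interval, and any increase produces a 1.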

The calculation unit 133 denotes the times at which input information was generated for a given session, in order from immediately after the start of the session (first) to its end (n-th), as {T_1, ..., T_n}. By determining the binary value for each of T_1 through T_n, it obtains, for each of the upstream and downstream directions, a sequence of n digits consisting of 0s and 1s.

Let BOOL_1 be the sequence that the calculation unit 133 obtains for upstream packets and BOOL_2 the sequence obtained for downstream packets. BOOL_1 and BOOL_2 each have a length of n digits, for example BOOL_1 = {1,0,...,1}. BOOL_1 and BOOL_2 express the pattern of whether packets were sent in a given session. For example, for a session that sends packets continuously from start to end, BOOL_1 and BOOL_2 are runs of consecutive 1s, such as {1,1,1,1,1,...,1}. When packets are sent intermittently, at intervals longer than the period at which the input information is generated, BOOL_1 and BOOL_2 may alternate between 1 and 0, for example {1,0,1,0,1,...,0} or {1,0,0,1,0,0,...,1}.

When a session sends information only immediately after it starts and then ends shortly afterwards, the sequence is short, and a sequence such as {1,0} may be generated. Thus, BOOL_1 and BOOL_2 express the packet transmission pattern of a session, but their length varies with the duration of the session and is difficult to predict.

The calculation unit 133 generates a co-occurrence matrix from BOOL_1 and BOOL_2 obtained by the above procedure. Because BOOL_1 and BOOL_2 are one-dimensional sequences of 0s and 1s, any two consecutive digits in a sequence can form only four combinations: {00}, {01}, {10}, and {11}. The calculation unit 133 takes each pair of adjacent digits in order from the beginning of BOOL_1 and BOOL_2 and totals the number of occurrences of each combination. As an example, consider how the calculation unit 133 generates the co-occurrence matrix when the input information is generated ten times, that is, when n = 10. When BOOL_1 = {1,1,1,1,1,1,1,1,1,1}, the calculation unit 133 generates the co-occurrence matrix MATRIX_1 as {00}=0, {01}=0, {10}=0, {11}=9. When BOOL_2 = {1,0,1,0,1,0,1,0,1,0}, the calculation unit 133 generates the co-occurrence matrix MATRIX_2 as {00}=0, {01}=4, {10}=5, {11}=0.

When the session is short, for example when n = 4 and BOOL_1 = {1,0,1,1}, the calculation unit 133 generates the co-occurrence matrix MATRIX_1 as {00}=0, {01}=1, {10}=1, {11}=1. By calculating the upstream and downstream co-occurrence matrices for a session in this way, the calculation unit 133 obtains a total of eight variables, since each co-occurrence matrix has four variables.
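The counting of adjacent pairs described above can be sketched in a few lines. The function name `cooccurrence` and the list layout `[{00}, {01}, {10}, {11}]` are choices made for this illustration.

```python
from collections import Counter


def cooccurrence(bits):
    """Count adjacent pairs in a 0/1 sequence.

    Returns the counts in the order [{00}, {01}, {10}, {11}].
    A sequence of length n yields n - 1 pairs in total.
    """
    counts = Counter(zip(bits, bits[1:]))
    return [counts[(0, 0)], counts[(0, 1)], counts[(1, 0)], counts[(1, 1)]]


matrix_1 = cooccurrence([1] * 10)       # the all-ones example with n = 10
matrix_2 = cooccurrence([1, 0] * 5)     # the alternating example with n = 10
matrix_short = cooccurrence([1, 0, 1, 1])  # the short-session example, n = 4
```

The result has a fixed size of four values regardless of n, which is what makes it usable as a feature even though the sequence length is unpredictable.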

[Processing of First Embodiment]
The processing flow of the calculation system 1 will be described with reference to FIGS. 4 to 9. FIG. 4 is a flowchart illustrating the processing flow of the router according to the first embodiment. FIGS. 5, 7, and 8 are flowcharts illustrating the processing flow of the calculation device according to the first embodiment. FIG. 6 is a diagram illustrating an example of the data configuration of the primary storage unit according to the first embodiment. FIG. 9 is a diagram illustrating an example of the data configuration of the secondary storage unit according to the first embodiment.

As shown in FIG. 4, the router 40 waits until a fixed time elapses (step S11, No) and, once the fixed time has elapsed (step S11, Yes), generates input information (step S12).

As shown in FIG. 5, the acquisition unit 131 reads the input information INP(T_k) for time T_k from the network device, that is, the router 40 (step S21). The classification unit 132 determines whether the input information INP(T_k) acquired by the acquisition unit 131 belongs to a new session (step S22).

If the input information INP(T_{k-1}) is not stored in the primary storage unit 121, the classification unit 132 determines that the input information INP(T_k) acquired by the acquisition unit 131 belongs to a new session (step S22, Yes). In this case, the storage unit 134 stores the input information INP(T_k) in the primary storage unit 121 (step S23). As shown in FIG. 6, the primary storage unit 121 stores input information. FIG. 6 shows that, for the input information INP(T_k) stored by the storage unit 134, the session identifier INP_1(T_k) is "xyz001", the generation time INP_2(T_k) is "20:40", the number of upstream packets INP_3(T_k) is "10", the number of upstream bytes INP_4(T_k) is "80", the number of downstream packets INP_5(T_k) is "400", the number of downstream bytes INP_6(T_k) is "10000", and the elapsed time INP_7(T_k) is "2". In this case, the storage unit 134 also stores statistical information in the primary storage unit 121. Because this is the first statistical information of the session, the storage unit 134 sets each value of the statistical information in the primary storage unit 121 to 0.

On the other hand, if the input information INP(T_{k-1}) is stored in the primary storage unit 121, the classification unit 132 determines that the input information INP(T_k) acquired by the acquisition unit 131 does not belong to a new session (step S22, No). That is, the classification unit 132 classifies the input information INP(T_{k-1}) and the input information INP(T_k) into the same group. In this case, the storage unit 134 stores the input information INP(T_k) in the primary storage unit 121. The acquisition unit 131 then reads the input information and the statistical information for time T_{k-1} from the primary storage unit 121 (step S24), and the calculation unit 133 calculates each item of statistical information (step S25).

In this way, the acquisition unit 131 can acquire, in chronological order, the flow statistical information corresponding to each of the times at fixed intervals. The classification unit 132 classifies the flow statistical information each time the acquisition unit 131 acquires it, and the calculation unit 133 calculates the statistical information on the traffic of each group each time the classification unit 132 performs classification.
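The branch at step S22 can be sketched with a dictionary standing in for the primary storage unit 121. This is a simplified illustration under assumptions: the record layout, the key `session_id`, and the function name `receive` are hypothetical and are not taken from the embodiment.

```python
def receive(primary: dict, record: dict):
    """Store the record for time T_k and return the record for T_{k-1}.

    Returns None when no earlier record exists for the session, which
    corresponds to "new session" (step S22, Yes). Otherwise the earlier
    record is returned so that statistics can be updated (steps S24, S25).
    """
    sid = record["session_id"]
    previous = primary.get(sid)
    # Only the latest record per session is kept in the primary store.
    primary[sid] = record
    return previous


primary_store = {}
first = {"session_id": "xyz001", "up_packets": 10, "up_bytes": 80}
second = {"session_id": "xyz001", "up_packets": 12, "up_bytes": 200}
was_new = receive(primary_store, first) is None
earlier = receive(primary_store, second)
```

Keying the store by session identifier means that records with the same identifier fall into the same group without a separate classification pass.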

As shown in FIG. 7, in step S25 of FIG. 5, the calculation unit 133 first calculates the average packet size at time T_k from the input information for time T_k and the input information for time T_{k-1} (step S251). Next, the calculation unit 133 calculates the maximum value, minimum value, and standard deviation of the average packet size at time T_k from the average packet size at time T_k and the average packet size at time T_{k-1} (step S252). Next, the calculation unit 133 calculates the co-occurrence matrix at time T_k from the input information for time T_k and the input information for time T_{k-1} (step S253). The storage unit 134 then stores each item of statistical information calculated by the calculation unit 133 in the primary storage unit 121.

When performing calculations on the average packet size, the calculation unit 133 does not need all of the input information from the start to the end of the session. Using only the input information and statistical information for time T_{k-1} and time T_k, it can perform the calculations as in Expressions (15) to (18).
AVE_MAX_1(T_k) = the larger of {AVE_1(T_{k-1}), AVE_1(T_k)} ... (15)
AVE_MAX_2(T_k) = the larger of {AVE_2(T_{k-1}), AVE_2(T_k)} ... (16)
AVE_MIN_1(T_k) = the smaller of {AVE_1(T_{k-1}), AVE_1(T_k)} ... (17)
AVE_MIN_2(T_k) = the smaller of {AVE_2(T_{k-1}), AVE_2(T_k)} ... (18)

As a result, in step S254, the storage unit 134 only needs to ensure that the statistical information at time T_k calculated by the calculation unit 133 is what remains in the primary storage unit 121. That is, the storage unit 134 may delete the statistical information for time T_{k-1} and store the statistical information for time T_k, or it may overwrite the statistical information for time T_{k-1} with that for time T_k. By discarding the statistical information of the previous time in this way, the primary storage unit 121 needs to hold the input information and statistical information for only one time, which reduces the required storage capacity.

Here, given a variable X = {x(1), x(2), ..., x(k), ..., x(n-1), x(n)}, the variance sig(k) at the k-th element, that is, the square of the standard deviation, is expressed by the recurrence relation of Expression (19). Using Expression (19), the calculation unit 133 can calculate AVE_SD_1(T_k) and AVE_SD_2(T_k) from AVE_SD_1(T_{k-1}) and AVE_SD_2(T_{k-1}). Here, u(k) is the mean of x(1) through x(k).
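Expression (19) itself does not appear in this text. For reference, one standard recurrence for the running mean and the running population variance is shown below. It is offered as an assumption about the general form such a recurrence takes, not as the exact expression of the publication; dividing by k rather than k - 1 (population variance) is likewise an assumption.

```latex
\begin{aligned}
u(k) &= u(k-1) + \frac{x(k) - u(k-1)}{k}, \\
\mathrm{sig}(k) &= \frac{k-1}{k}\,\mathrm{sig}(k-1)
  + \frac{k-1}{k^{2}}\,\bigl(x(k) - u(k-1)\bigr)^{2},
\qquad u(0) = 0,\ \mathrm{sig}(0) = 0 .
\end{aligned}
```

With this form, only k, u(k-1), and sig(k-1) need to be kept from the previous time, and the standard deviation is the square root of sig(k).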

The calculation unit 133 can also calculate the co-occurrence matrix at time T_k from the input information and statistical information for time T_{k-1}. First, the calculation unit 133 calculates BOOL_1(T_k)' and BOOL_2(T_k)' from AVE_1(T_k) and AVE_2(T_k) using Expressions (13) and (14). If BOOL_1(T_{k-1}) and BOOL_2(T_{k-1}) are stored in the primary storage unit 121, the calculation unit 133 can concatenate BOOL_1(T_{k-1}) with BOOL_1(T_k)', or BOOL_2(T_{k-1}) with BOOL_2(T_k)', to determine which of {00}, {01}, {10}, and {11} is produced, and can thereby calculate BOOL_1 and BOOL_2.

In this way, when the statistical information of the group into which the classification unit 132 has classified the input has already been calculated, the calculation unit 133 calculates the statistical information on the traffic of each group based on that already calculated statistical information and the flow statistical information acquired by the acquisition unit 131.
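The incremental update described above can be sketched as a small state object per direction that is updated once per time step. This is an illustration under several assumptions: the class name `DirectionStats` and its fields are hypothetical; the maximum and minimum are kept as running values over the session (the stored statistic from T_{k-1} is compared with the new average); and the variance uses a standard population-variance recurrence, since Expression (19) is not reproduced in this text.

```python
import math
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class DirectionStats:
    count: int = 0                    # k: number of averages folded in so far
    mean: float = 0.0                 # u(k)
    var: float = 0.0                  # sig(k), population variance
    max_avg: float = float("-inf")    # running maximum of the average size
    min_avg: float = float("inf")     # running minimum of the average size
    last_bit: Optional[int] = None    # BOOL value at the previous time
    matrix: list = field(default_factory=lambda: [0, 0, 0, 0])  # 00,01,10,11

    def update(self, avg: float) -> None:
        """Fold in the average packet size for time T_k."""
        self.count += 1
        k = self.count
        delta = avg - self.mean       # x(k) - u(k-1)
        self.var = (k - 1) / k * self.var + (k - 1) / (k * k) * delta * delta
        self.mean += delta / k
        self.max_avg = max(self.max_avg, avg)
        self.min_avg = min(self.min_avg, avg)
        bit = 1 if avg > 0 else 0     # Expressions (13)/(14)
        if self.last_bit is not None:
            # Pair (previous bit, current bit) selects one of four cells.
            self.matrix[2 * self.last_bit + bit] += 1
        self.last_bit = bit

    @property
    def std(self) -> float:
        return math.sqrt(self.var)


stats = DirectionStats()
for average in [30, 0, 60, 30]:
    stats.update(average)
```

The state has a fixed size, so memory use per session does not grow with the session length.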

The calculation unit 133 then determines whether the session has ended (step S26). If it determines that the session has ended (step S26, Yes), the calculation unit 133 calculates the per-session statistical information (step S27), substitutes k+1 for k (step S28), and proceeds to the processing for the next time. The per-session statistical information is, for example, the flow rate. If it determines that the session has not ended (step S26, No), the calculation unit 133 substitutes k+1 for k (step S28) and proceeds to the processing for the next time.

The calculation unit 133 can determine whether the session has ended based on whether the router 40 has generated input information for time T_{k+1}. That is, if INP(T_{k+1}) is input information with the same session identifier as INP_1(T_k), the calculation unit 133 determines that the session has not ended.

The calculation unit 133 may also determine whether the session has ended by referring to a flag included in the packet header. For example, the calculation unit 133 can check whether the FIN flag, which indicates the end of transmission in the TCP (Transmission Control Protocol) header, is ON or OFF, and determine that the session has ended if the FIN flag is ON. This method can be realized by adding the FIN flag to the input information.
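A FIN-based check can be sketched as a bit test, assuming the input information carries the TCP flags as an integer bit field (for example, a cumulative OR of the flags seen in the flow). The function name `session_ended` is hypothetical.

```python
TCP_FIN = 0x01  # FIN is the lowest-order bit of the TCP flags field
TCP_ACK = 0x10  # shown only to build the example values below


def session_ended(tcp_flags: int) -> bool:
    """Return True if the FIN bit is set in the given TCP flags value."""
    return bool(tcp_flags & TCP_FIN)


ended = session_ended(TCP_FIN | TCP_ACK)   # FIN and ACK both observed
ongoing = session_ended(TCP_ACK)           # only ACK observed
```

Note that a FIN in one direction closes only that direction of a TCP connection, so a real implementation would have to decide how to treat half-closed sessions; the text does not specify this.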

As shown in FIG. 8, in step S27 of FIG. 5, the calculation unit 133 first calculates the flow rate of the session from the input information for time T_k (step S271). The storage unit 134 then stores the statistical information of the primary storage unit 121 and the flow rate in the secondary storage unit 122, and deletes the input information and statistical information from the primary storage unit 121 (step S272).
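Steps S271 and S272 can be sketched as moving one session's entry from a primary store to a secondary store while adding the flow rates. The dictionary layout, key names, and the function name `finalize_session` are hypothetical and chosen only for illustration.

```python
def finalize_session(primary: dict, secondary: list, sid: str) -> dict:
    """Compute flow rates, append the row to secondary, and drop it from primary."""
    entry = primary.pop(sid)          # step S272: remove from primary storage
    inp = entry["input"]
    elapsed = inp["elapsed"]
    row = {
        **inp,
        **entry["stats"],
        # step S271: flow rates per Expressions (9) and (10)
        "frate_1": inp["up_bytes"] / elapsed,
        "frate_2": inp["down_bytes"] / elapsed,
    }
    secondary.append(row)             # step S272: persist to secondary storage
    return row


primary_store = {
    "abc123": {
        "input": {"session_id": "abc123", "up_bytes": 100,
                  "down_bytes": 2000, "elapsed": 4},
        "stats": {"ave_1": 30, "ave_2": 200},
    }
}
secondary_store = []
saved = finalize_session(primary_store, secondary_store, "abc123")
```

After the call the primary store no longer holds the session, which mirrors the deletion in step S272.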

As shown in FIG. 9, the secondary storage unit 122 stores input information and statistical information. FIG. 9 shows that, for the input information INP(T_k) already stored in the secondary storage unit 122, the session identifier INP_1(T_n) is "abc123", the generation time INP_2(T_n) is "20:29", the number of upstream packets INP_3(T_n) is "20", the number of upstream bytes INP_4(T_n) is "120", the number of downstream packets INP_5(T_n) is "650", the number of downstream bytes INP_6(T_n) is "30000", and the elapsed time INP_7(T_k) is "5". FIG. 9 also shows that, for the statistical information already stored in the secondary storage unit 122, the upstream average packet size AVE_1(T_n) is "30", the downstream average packet size AVE_2(T_n) is "200", the maximum upstream average packet size AVE_MAX_1(T_n) is "60", the maximum downstream average packet size AVE_MAX_2(T_n) is "500", the minimum upstream average packet size AVE_MIN_1(T_n) is "2", the minimum downstream average packet size AVE_MIN_2(T_n) is "50", the standard deviation of the upstream average packet size AVE_SD_1(T_n) is "30", the standard deviation of the downstream average packet size AVE_SD_2(T_n) is "300", the upstream co-occurrence matrix MATRIX_1(T_n) is "0,0,0,9", the downstream co-occurrence matrix MATRIX_2(T_n) is "0,2,2,5", the upstream flow rate FRATE_1(T_n) is "19", and the downstream flow rate FRATE_2(T_n) is "1300". The storage unit 134 may also, for example, create a row whose session identifier INP_1(T_n) is "xyz001" below the row whose session identifier INP_1(T_n) is "abc123" and store the input information and statistical information there.

[Example]
An example based on the first embodiment will be described. In this example, the router 40 collects flow statistical information using a method called NetFlow. The router 40 may use a method other than NetFlow, provided that the information needed as input information can be collected. For example, the router 40 can also generate the input information from the information specified in "Body of reply to OFPMP_FLOW request" of OpenFlow (Reference 1: OpenFlow Switch Specification (URL: https://www.opennetworking.org/images/stories/downloads/sdn-resources/onf-specifications/openflow/openflow-spec-v1.3.2.pdf)).

NetFlow is one of the mechanisms in standard use on the Internet by which network devices such as routers and switches are given a function for generating flow statistical information and for sending that information to remote collectors and analyzers. As shown in FIG. 1, the router 40 (NetFlow-Enabled Router) is the network device that generates flow statistical information for the communication passing through it, and the calculation device 10 (NetFlow Collector) is the device that receives the Flow Records, which are the flow statistical information, and collects and analyzes them.

In NetFlow Version 5, the following information is defined in the Flow Record Format.
1. srcaddr: Source IP address
2. dstaddr: Destination IP address
3. nexthop: IP address of next hop router
4. input: SNMP index of input interface
5. output: SNMP index of output interface
6. dPkts: Packets in the flow
7. dOctets: Total number of Layer 3 bytes in the packets of the flow
8. First: SysUptime at start of flow
9. Last: SysUptime at the time the last packet of the flow was received
10. srcport: TCP/UDP source port number or equivalent
11. dstport: TCP/UDP destination port number or equivalent
12. pad1: Unused (zero) bytes
13. tcp_flags: Cumulative OR of TCP flags
14. prot: IP protocol type (for example, TCP = 6; UDP = 17)
15. tos: IP type of service (ToS)
16. src_as: Autonomous system number of the source, either origin or peer
17. dst_as: Autonomous system number of the destination, either origin or peer
18. src_mask: Source address prefix mask bits
19. dst_mask: Destination address prefix mask bits
20. pad2: Unused (zero) bytes
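To make the field list concrete, the following sketch unpacks one Flow Record into a dictionary keyed by the field names above. The field widths (a 48-byte record in network byte order) are not given in this text; they are assumed here from the commonly documented NetFlow Version 5 layout, and the function name `parse_v5_record` is hypothetical.

```python
import socket
import struct

# Assumed widths: three 4-byte addresses, two 2-byte interface indexes,
# four 4-byte counters/timestamps, two 2-byte ports, four 1-byte fields,
# two 2-byte AS numbers, two 1-byte masks, and a 2-byte pad (48 bytes).
V5_RECORD = struct.Struct("!4s4s4sHHIIIIHHBBBBHHBBH")

FIELDS = (
    "srcaddr", "dstaddr", "nexthop", "input", "output",
    "dPkts", "dOctets", "First", "Last", "srcport", "dstport",
    "pad1", "tcp_flags", "prot", "tos", "src_as", "dst_as",
    "src_mask", "dst_mask", "pad2",
)


def parse_v5_record(data: bytes) -> dict:
    """Unpack one 48-byte record and render the addresses as dotted quads."""
    record = dict(zip(FIELDS, V5_RECORD.unpack(data)))
    for key in ("srcaddr", "dstaddr", "nexthop"):
        record[key] = socket.inet_ntoa(record[key])
    return record


# Build a synthetic record for illustration (values are made up).
raw = V5_RECORD.pack(
    socket.inet_aton("10.0.0.1"), socket.inet_aton("10.0.0.2"),
    socket.inet_aton("0.0.0.0"), 1, 2, 10, 80, 1000, 3000,
    40000, 443, 0, 0x1B, 6, 0, 0, 0, 24, 24, 0,
)
parsed = parse_v5_record(raw)
```

A real export packet also carries a header ahead of the records, which this sketch does not handle.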

The acquisition unit 131 of the calculation device 10 can generate the input information from the Flow Record Format information as follows. The acquisition unit 131 can generate INP_1 by applying a bit operation or a hash calculation to items 1, 2, 8, 10, 11, and 14 of the Flow Record Format. In other words, the acquisition unit 131 acquires information that can identify a session and generates a session identifier from it. The acquisition unit 131 may generate INP_2 by referring to the time at which the router 40 generated or sent the Flow Record, or the calculation device 10 may use the time at which it received the Flow Record as INP_2. The acquisition unit 131 generates INP_3 and INP_5 by referring to item 6 of the Flow Record Format, and generates INP_4 and INP_6 by referring to item 7. The acquisition unit 131 generates INP_7 by calculating the time difference between items 9 and 8 of the Flow Record Format.

Here, INP_3 and INP_5, and likewise INP_4 and INP_6, form upstream and downstream pairs. Because a NetFlow Flow Record describes a flow in only one direction, the acquisition unit 131 needs to find the matching Flow Record. In the upstream and downstream Flow Records, items 1 and 2, and items 10 and 11, of the Flow Record Format are swapped. That is, the acquisition unit 131 only needs to find a pair of flows whose source and destination are exchanged. The acquisition unit 131 then generates the INP corresponding to one session from the two paired Flow Records.
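The pairing and the construction of INP can be sketched as follows. Several points are assumptions of this sketch rather than statements of the text: the records are dictionaries keyed by the Flow Record field names; the session identifier is a truncated SHA-256 hash over the two endpoints in sorted order plus the protocol, so that both directions yield the same identifier; the start time (item 8) is left out of the hash because the two directions may not share it; and INP_7 is taken as the span from the earlier `First` to the later `Last` of the pair. All function names are hypothetical.

```python
import hashlib


def session_id(rec: dict) -> str:
    """Direction-independent identifier built from endpoints and protocol."""
    a = (rec["srcaddr"], rec["srcport"])
    b = (rec["dstaddr"], rec["dstport"])
    lo, hi = sorted([a, b])           # same order for both directions
    key = f"{lo}|{hi}|{rec['prot']}".encode()
    return hashlib.sha256(key).hexdigest()[:12]


def is_reverse(a: dict, b: dict) -> bool:
    """True if b has the source and destination of a exchanged."""
    return (
        (a["srcaddr"], a["srcport"], a["dstaddr"], a["dstport"], a["prot"])
        == (b["dstaddr"], b["dstport"], b["srcaddr"], b["srcport"], b["prot"])
    )


def build_input(up: dict, down: dict, received_at: str) -> dict:
    """Combine an upstream and a downstream record into one INP."""
    if not is_reverse(up, down):
        raise ValueError("records are not a source/destination-swapped pair")
    return {
        "INP_1": session_id(up),
        "INP_2": received_at,
        "INP_3": up["dPkts"], "INP_4": up["dOctets"],
        "INP_5": down["dPkts"], "INP_6": down["dOctets"],
        "INP_7": max(up["Last"], down["Last"]) - min(up["First"], down["First"]),
    }


up_rec = {"srcaddr": "10.0.0.1", "srcport": 40000, "dstaddr": "10.0.0.2",
          "dstport": 443, "prot": 6, "dPkts": 10, "dOctets": 80,
          "First": 1000, "Last": 3000}
down_rec = {"srcaddr": "10.0.0.2", "srcport": 443, "dstaddr": "10.0.0.1",
            "dstport": 40000, "prot": 6, "dPkts": 400, "dOctets": 10000,
            "First": 1010, "Last": 3005}
inp = build_input(up_rec, down_rec, "20:40")
```

Which record counts as upstream is decided by the caller here; the sketch does not infer direction from the addresses.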

At a given time T, the router 40 generates and sends Flow Records for all of the flows, that is, sessions, passing through it. Because the timing at which Flow Records are sent can be configured on the router 40, Flow Records can be sent at fixed intervals, for example every 10 seconds. The calculation device 10 receives the Flow Records and generates and updates the session statistical information by repeating the calculations according to the procedure of the embodiment.

When the client 20 starts communicating with the server 30, the router 40 records that communication as Flow Records. Two Flow Records are generated, one for the upstream direction and one for the downstream direction. In this example, the calculation system 1 performs processing in the following order.

(When a session starts and while it continues)
1. At the Flow Record send timing, the router 40 generates a Flow Record and sends it to the calculation device 10.
2. The calculation device 10 receives the Flow Record and records the reception time.
3. The calculation device 10 generates input information according to the procedure of the embodiment.
4. The calculation device 10 generates statistical information according to the procedure of the embodiment.
5. The calculation device 10 waits for the input information of the next time.
6. At the Flow Record send timing, the router 40 generates a Flow Record and sends it to the calculation device 10.
7. The calculation device 10 receives the Flow Record and records the reception time.

(When a session ends)
8. The client 20 or the server 30 ends the communication.
9. The router 40 detects the end of the communication, sends the Flow Record to the calculation device 10, and then deletes the Flow Record of the ended communication.
10. The calculation device 10 receives the Flow Record of step 9, records the reception time, and generates the input information and statistical information.
11. The calculation device 10 waits for the input information of the next time.
12. Because the Flow Record has been deleted, the router 40 sends nothing for the corresponding communication (or sends an empty Flow Record).
13. The calculation device 10 determines that the session has ended and calculates the flow rate.
14. The calculation device 10 writes the input information and statistical information to a secondary storage device or the like, and ends the calculation of the statistical information for the session identified by INP_1.
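Steps 11 to 13 amount to detecting sessions that were present at the previous send timing but are absent at the current one. A minimal sketch, assuming each export has already been turned into a dictionary from session identifier to input information; the function name `process_export` and the data layout are hypothetical:

```python
def process_export(active: dict, export: dict) -> dict:
    """Update the active sessions with one export and return the ended ones.

    `active` maps session id to the latest input information and is
    modified in place. A session is treated as ended when it is missing
    from `export`, which also covers an empty export.
    """
    ended = {}
    for sid in list(active):
        if sid not in export:
            # Nothing was sent for this session at this timing (step 12),
            # so it is judged to have ended (step 13).
            ended[sid] = active.pop(sid)
    active.update(export)
    return ended


active_sessions = {}
first_ended = process_export(
    active_sessions,
    {"a": {"INP_4": 80, "INP_7": 2}, "b": {"INP_4": 40, "INP_7": 2}},
)
second_ended = process_export(
    active_sessions,
    {"a": {"INP_4": 120, "INP_7": 4}},
)
```

The returned entries hold the last input information of each ended session, which is what the flow rate calculation in step 13 requires.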

[Effect of First Embodiment]
The acquisition unit 131 acquires flow statistical information, which is statistical information on traffic aggregated for each communication source and communication destination. The classification unit 132 classifies the flow statistical information into groups such that the sessions of the underlying traffic are the same. The calculation unit 133 calculates, based on the flow statistical information, statistical information on the traffic of each group classified by the classification unit 132.

In this way, the calculation device 10 of the present embodiment measures communication between computer systems without using packet capture. The present embodiment therefore has the effect that capture data need not be copied and stored, and no secondary storage device for storing capture data is required.

Furthermore, because packets are not copied in the present embodiment, no issues arise concerning the secrecy of the communication content contained in the payload or the protection of privacy. According to the present embodiment, encrypted communication used to protect the secrecy of communication content can also be measured without reading the payload.

In conventional measurement of flow statistical information, the statistical information that could be measured was limited to two types: the number of packets and the number of bytes. In contrast, the calculation device 10 of the present embodiment generates eight types of statistical information by calculation (number of packets, number of bytes, average packet size, maximum, minimum, and standard deviation of the average packet size, flow rate, and co-occurrence matrix). Therefore, if the number of values that one variable can take is M, the amount of information obtained with the conventional technique was M^2, whereas the present embodiment can obtain an amount of information of M^8.

Furthermore, in the present embodiment, the information required for the calculation need not be held for the entire period from the start to the end of a session. That is, each statistic other than the number of packets and the number of bytes can be calculated merely by temporarily storing one cycle's worth of the periodically acquired flow statistical information. In this way, the calculation device 10 of the present embodiment can measure communication effectively using limited processing resources.

The classification unit 132 can classify first flow statistical information and second flow statistical information into the same group when the source and destination included in the first flow statistical information are identical to the destination and source, respectively, included in the second flow statistical information, and when both the first flow statistical information and the second flow statistical information are based on traffic generated within a predetermined period. In this way, the calculation device 10 classifies flow statistical information into groups, one per session. As a result, according to the present embodiment, statistical information can be calculated on a per-session basis.
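The classification condition above can be illustrated with a minimal Python sketch. The record type `FlowRecord`, its field names, the `window` parameter, and the interpretation of "within a predetermined period" as a bound on the difference between observation times are all assumptions made for illustration; they are not taken from the embodiment.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class FlowRecord:
    """Hypothetical flow statistics record; field names are illustrative."""
    src: str            # source address
    dst: str            # destination address
    observed_at: float  # time the underlying traffic was observed (seconds)
    packet_count: int
    byte_count: int

def same_session(a: FlowRecord, b: FlowRecord, window: float) -> bool:
    """True when b is the reverse direction of a and both fall within `window` seconds."""
    reversed_endpoints = a.src == b.dst and a.dst == b.src
    within_period = abs(a.observed_at - b.observed_at) <= window
    return reversed_endpoints and within_period

def classify(records, window):
    """Place each record into the first group containing a matching reverse record."""
    groups = []
    for rec in records:
        for group in groups:
            if any(same_session(rec, other, window) for other in group):
                group.append(rec)
                break
        else:
            # No existing group matched, so this record starts a new group.
            groups.append([rec])
    return groups
```

For example, a record from a client to a server and a record from that server back to the client observed one second apart would be placed in one group, while a record involving a third host would start a separate group.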

The acquisition unit 131 can acquire at least the number of packets and the number of bytes as the flow statistical information. In this case, the calculation unit 133 can calculate, as statistical information for each group, the average packet size, the maximum of the average packet size, the minimum of the average packet size, the standard deviation of the average packet size, the time average of the number of bytes, and information indicating the presence or absence of packets transmitted or received at each time. In this way, the calculation device 10 can generate six types of statistical information from the number of packets and the number of bytes.
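As a rough illustration of how six statistics might be derived from only packet counts and byte counts, consider the following Python sketch. The function name `session_statistics`, the dictionary keys, and the specific formulas (for example, taking the maximum, minimum, and population standard deviation over per-interval average packet sizes, and dividing total bytes by elapsed time for the time average) are assumptions for illustration and may differ from the definitions used in the embodiment.

```python
import statistics

def session_statistics(samples, interval):
    """Derive per-session statistics from (packet_count, byte_count) pairs.

    samples:  one (packet_count, byte_count) pair per collection interval
    interval: length of one collection interval in seconds (assumed fixed)
    """
    total_packets = sum(p for p, _ in samples)
    total_bytes = sum(b for _, b in samples)
    # Average packet size within each interval that carried at least one packet.
    per_interval_avg = [b / p for p, b in samples if p > 0]
    # 1 if any packet was seen in the interval, otherwise 0.
    presence = [1 if p > 0 else 0 for p, _ in samples]
    return {
        "avg_packet_size": total_bytes / total_packets if total_packets else 0.0,
        "max_avg_packet_size": max(per_interval_avg, default=0.0),
        "min_avg_packet_size": min(per_interval_avg, default=0.0),
        "std_avg_packet_size": (
            statistics.pstdev(per_interval_avg) if per_interval_avg else 0.0
        ),
        "byte_rate": total_bytes / (len(samples) * interval) if samples else 0.0,
        "presence": presence,
    }
```

Calling `session_statistics([(2, 200), (0, 0), (4, 1200)], interval=10)` treats the middle interval as idle, so it contributes a 0 to the presence sequence and is excluded from the per-interval averages.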

The acquisition unit 131 can acquire, in chronological order, the flow statistical information corresponding to each of a series of times spaced at fixed intervals. In this case, the classification unit 132 can classify the flow statistical information each time the acquisition unit 131 acquires it. Likewise, the calculation unit 133 can calculate statistical information on the traffic of each group each time the classification unit 132 performs classification. As a result, the calculation device 10 can advance the calculation incrementally as the flow statistical information is generated.

When the statistical information of a group classified by the classification unit 132 has already been calculated, the calculation unit 133 can calculate the statistical information on the traffic of that group based on the already calculated statistical information and the flow statistical information newly acquired by the acquisition unit 131. Because this removes the need to retain all of the flow statistical information of a session, the storage capacity used can be reduced.
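One way to update statistics from previously calculated values without retaining past records is sketched below in Python. The standard deviation here uses Welford's online algorithm, which is a technique substituted for illustration and is not stated in the embodiment; the class name `RunningStats` and its fields are likewise hypothetical.

```python
import math
from dataclasses import dataclass

@dataclass
class RunningStats:
    """Per-session state updated once per collection interval (illustrative)."""
    intervals: int = 0        # number of intervals seen so far
    total_packets: int = 0
    total_bytes: int = 0
    n: int = 0                # intervals that carried at least one packet
    mean: float = 0.0         # running mean of per-interval average packet size
    m2: float = 0.0           # running sum of squared deviations (Welford)
    minimum: float = math.inf
    maximum: float = -math.inf

    def update(self, packet_count: int, byte_count: int) -> None:
        """Fold one interval's counters into the stored statistics."""
        self.intervals += 1
        self.total_packets += packet_count
        self.total_bytes += byte_count
        if packet_count == 0:
            return  # idle interval: no average packet size to fold in
        x = byte_count / packet_count
        self.n += 1
        delta = x - self.mean
        self.mean += delta / self.n
        self.m2 += delta * (x - self.mean)
        self.minimum = min(self.minimum, x)
        self.maximum = max(self.maximum, x)

    @property
    def std(self) -> float:
        """Population standard deviation of per-interval average packet size."""
        return math.sqrt(self.m2 / self.n) if self.n else 0.0
```

With this design, the state kept per session has a fixed size regardless of how long the session lasts, which matches the storage-saving motivation described above.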

The calculation unit 133 can calculate, as statistical information, a co-occurrence matrix based on the presence or absence of transmitted or received packets at each time. This makes it possible to analyze the appearance pattern of consecutive packets.
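A co-occurrence matrix over a presence/absence sequence might be computed as in the following Python sketch. It assumes that the matrix counts how often each ordered pair of adjacent values (00, 01, 10, 11) appears in the sequence; the function name `cooccurrence_matrix` and this adjacency interpretation are assumptions for illustration.

```python
def cooccurrence_matrix(sequence):
    """Count adjacent pairs in a sequence of 0/1 presence flags.

    matrix[i][j] is the number of positions where value i is
    immediately followed by value j.
    """
    matrix = [[0, 0], [0, 0]]
    for prev, curr in zip(sequence, sequence[1:]):
        matrix[prev][curr] += 1
    return matrix
```

Under this interpretation, one matrix would be computed for the upstream presence sequence and another for the downstream presence sequence of each session.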

[System configuration, etc.]
The components of each illustrated device are functional concepts and need not be physically configured as illustrated. That is, the specific form in which the devices are distributed or integrated is not limited to the illustrated one; all or part of them may be functionally or physically distributed or integrated in arbitrary units according to various loads, usage conditions, and the like. Furthermore, all or any part of the processing functions performed by each device may be realized by a CPU and a program analyzed and executed by that CPU, or may be realized as hardware using wired logic.

Of the processes described in the present embodiment, all or part of the processes described as being performed automatically may be performed manually, and all or part of the processes described as being performed manually may be performed automatically by a known method. In addition, the processing procedures, control procedures, specific names, and information including various data and parameters shown in the above description and in the drawings may be changed arbitrarily unless otherwise specified.

[Program]
In one embodiment, the calculation device 10 can be implemented by installing, on a desired computer, a calculation program that performs the above-described calculation of statistical information, provided as packaged software or online software. For example, causing an information processing device to execute the calculation program allows the information processing device to function as the calculation device 10. The information processing device referred to here includes desktop and notebook personal computers. The category also includes mobile communication terminals such as smartphones, mobile phones, and PHS (Personal Handyphone System) terminals, as well as slate terminals such as PDAs (Personal Digital Assistants).

The calculation device 10 can also be implemented as a calculation server device that treats a terminal device used by a user as a client and provides that client with a service for the above-described calculation of statistical information. For example, the calculation server device is implemented as a server device that provides a calculation service taking flow statistical information as input and producing session statistical information as output. In this case, the calculation server device may be implemented as a Web server, or as a cloud that provides the above-described statistical calculation service on an outsourced basis.

FIG. 10 is a diagram illustrating an example of a computer that executes the calculation program. The computer 1000 has, for example, a memory 1010 and a CPU 1020. The computer 1000 also has a hard disk drive interface 1030, a disk drive interface 1040, a serial port interface 1050, a video adapter 1060, and a network interface 1070. These components are connected by a bus 1080.

The memory 1010 includes a ROM (Read Only Memory) 1011 and a RAM 1012. The ROM 1011 stores, for example, a boot program such as a BIOS (Basic Input Output System). The hard disk drive interface 1030 is connected to a hard disk drive 1090. The disk drive interface 1040 is connected to a disk drive 1100. A removable storage medium such as a magnetic disk or an optical disk is inserted into the disk drive 1100. The serial port interface 1050 is connected to, for example, a mouse 1110 and a keyboard 1120. The video adapter 1060 is connected to, for example, a display 1130.

The hard disk drive 1090 stores, for example, an OS 1091, an application program 1092, a program module 1093, and program data 1094. That is, the program that defines each process of the calculation device 10 is implemented as the program module 1093, in which computer-executable code is written. The program module 1093 is stored in, for example, the hard disk drive 1090. For example, a program module 1093 for executing processing equivalent to the functional configuration of the calculation device 10 is stored in the hard disk drive 1090. Note that the hard disk drive 1090 may be replaced by an SSD.

The setting data used in the processing of the above-described embodiment is stored as the program data 1094 in, for example, the memory 1010 or the hard disk drive 1090. The CPU 1020 then reads the program module 1093 and the program data 1094 stored in the memory 1010 or the hard disk drive 1090 into the RAM 1012 as necessary and executes them.

Note that the program module 1093 and the program data 1094 need not be stored in the hard disk drive 1090; they may instead be stored in, for example, a removable storage medium and read by the CPU 1020 via the disk drive 1100 or the like. Alternatively, the program module 1093 and the program data 1094 may be stored in another computer connected via a network (a LAN, a WAN (Wide Area Network), or the like). In that case, the program module 1093 and the program data 1094 may be read from the other computer by the CPU 1020 via the network interface 1070.

Description of symbols
1 Calculation system
10 Calculation device
11 Communication unit
12 Storage unit
13 Control unit
20 Client
30 Server
40 Router
121 Primary storage unit
122 Secondary storage unit
131 Acquisition unit
132 Classification unit
133 Calculation unit
134 Storage unit

Claims

A calculation device comprising:
an acquisition unit that acquires flow statistical information, which is statistical information on traffic for each flow;
a classification unit that classifies the flow statistical information into groups such that the sessions of the underlying traffic are the same within each group; and
a calculation unit that, based on the flow statistical information, calculates, for each of the sessions corresponding to the groups classified by the classification unit, a first sequence of binary values indicating whether or not one or more upstream packets flowed for each of a plurality of times at which the flow statistical information was generated, and a second sequence of binary values indicating whether or not one or more downstream packets flowed for each of the plurality of times, and that further calculates a co-occurrence matrix based on the number of appearances of each of the combinations of arrangements included in the first sequence and a co-occurrence matrix based on the number of appearances of each of the combinations of arrangements included in the second sequence.

The calculation unit calculates, for each of the sessions corresponding to the groups, a first sequence of 0s and 1s indicating whether or not one or more upstream packets flowed for each of a plurality of times at which the flow statistical information was generated, and a second sequence of 0s and 1s indicating whether or not one or more downstream packets flowed for each of the plurality of times, and further calculates a co-occurrence matrix based on the number of occurrences of each of the combinations {00}, {01}, {10}, and {11} included in the first sequence, and a co-occurrence matrix based on the number of occurrences of each of the combinations {00}, {01}, {10}, and {11} included in the second sequence.

The calculation device according to claim 1 or 2, wherein the classification unit classifies first flow statistical information and second flow statistical information into the same group when the source and destination included in the first flow statistical information are identical to the destination and source, respectively, included in the second flow statistical information, and both the first flow statistical information and the second flow statistical information are based on traffic generated within a predetermined period.

The calculation device according to any one of claims 1 to 3, wherein
the acquisition unit acquires at least the number of packets and the number of bytes as the flow statistical information, and
the calculation unit calculates, as statistical information for each group, the average packet size, the maximum of the average packet size, the minimum of the average packet size, the standard deviation of the average packet size, the time average of the number of bytes, and information indicating the presence or absence of packets transmitted or received at each time.

The calculation device according to any one of claims 1 to 4, wherein
the acquisition unit acquires, in chronological order, the flow statistical information corresponding to each of a series of times spaced at fixed intervals,
the classification unit classifies the flow statistical information each time the acquisition unit acquires the flow statistical information, and
the calculation unit calculates statistical information on the traffic of each group each time the classification unit performs classification.

The calculation device according to claim 5, wherein, when the statistical information of a group classified by the classification unit has already been calculated, the calculation unit calculates the statistical information on the traffic of that group based on the already calculated statistical information and the flow statistical information acquired by the acquisition unit.

A calculation method performed by a calculation device, the method comprising:
an acquisition step of acquiring flow statistical information, which is statistical information on traffic for each flow;
a classification step of classifying the flow statistical information into groups such that the sessions of the underlying traffic are the same within each group; and
a calculation step of calculating, based on the flow statistical information, for each of the sessions corresponding to the groups classified in the classification step, a first sequence of binary values indicating whether or not one or more upstream packets flowed for each of a plurality of times at which the flow statistical information was generated, and a second sequence of binary values indicating whether or not one or more downstream packets flowed for each of the plurality of times, and of further calculating a co-occurrence matrix based on the number of appearances of each of the combinations of arrangements included in the first sequence and a co-occurrence matrix based on the number of appearances of each of the combinations of arrangements included in the second sequence.