WO2007006994A2

WO2007006994A2 - Static detection of anomalies in traffic concerning a service entity

Info

Publication number: WO2007006994A2
Application number: PCT/FR2006/050669
Authority: WO
Inventors: Hervé SIBERT; Emmanuel Besson; Aline Gouget
Original assignee: France Telecom
Priority date: 2005-07-07
Filing date: 2006-07-04
Publication date: 2007-01-18
Also published as: WO2007006994A3; FR2888438A1

Abstract

The invention concerns a device for fast detection of anomalies in the traffic (LT) concerning at least one service entity (SE) following a flood denial of service attack, wherein a module (MOD) provides a model of the normal activity of the entity through models for volume components of the traffic. Each model comprises a period of validity, statistical values and a conformity threshold dependent on the statistical values. A module (DET) determines, for at least one evaluation of a volume component at a later date, a deviation of the volume component relative to the model of that volume component whose period of validity includes the evaluation date. A module (ALE) determines a global alarm value based on the deviations of the volume components for the evaluation and signals an abnormal activity if the global alarm value exceeds a predetermined alarm value.

Description

Static detection of anomalies in traffic relating to a service entity

The present invention relates to the field of network security and flood denial of service attacks. More particularly, it relates to an anomaly detection in the traffic supported by a transmission link and relating to at least one service entity.

Telecommunications networks, such as the Internet, transmit data between different service entities via a common infrastructure. A service entity connected to such a network responds to client terminal requests by providing them with a service, that is, by performing well-defined actions requested by the clients. Service entities are, for example, a web server, a content or streaming server offering downloads or broadcasts of multimedia files, an e-mail server that relays messages, or a DNS domain name server that provides IP addresses corresponding to domain names. These service entities are, in some cases, endpoints of the network located at the premises of an operator's customers, while other service entities, such as DNS servers, are managed by the network operator itself. A denial of service attack is an attack that attempts to make a service entity unavailable. There are several types of denial of service attacks, for example specific queries that directly attack the service entity's operation by asking it to perform an "improper" action. Among the denial of service attacks, flood denial of service attacks consist in exceeding, and thus "flooding", the network capacity of the service entity or of the transmission link through which the service entity is connected to the network. In both cases, volume characteristics of the network traffic to the service entity increase suddenly. In order to detect denial of service attacks, there are two major families of detection methods.

The first family relates to signature detection. It consists of continuously observing the traffic in the vicinity of a service entity that is a potential target, and comparing the observations with traffic patterns stored in memory which characterize known attacks. The detection methods of the first family are particularly suitable for intrusion detection and for detecting denial of service attacks on service entities that are not based on flooding.

The invention relates more particularly to the second family, which concerns anomaly detection. An anomaly is a traffic evaluation that does not conform to a set of acceptable normal traffic evaluations. The set of acceptable evaluations is determined a priori using rules and expert knowledge. There are several types of rules, such as access control lists and security policies, set by the operator or by the customer who owns a service entity. These rules define characteristics that normal traffic must satisfy or, on the contrary, must not satisfy. However, these rules are insufficient: the most sensitive service entities, which are the most targeted by cybercrime, are also the most difficult to protect a priori; in particular, for service entities whose clients are not known in advance, the rules that can be set a priori are limited. The traffic that an attack sends to flood the service entity can indeed comply, in its form, with most rules set a priori. To avoid this, it is necessary to set, in addition, threshold-type rules on volume components of the traffic. For good detection quality, these thresholds must take into account the traffic of the service entity in normal times, which implies a phase in which the detection method learns the normal traffic, the success of the attack being based on the amount of attack traffic that actually reaches the service entity for processing. If the rules are traffic thresholds, their effectiveness and relevance depend on the difference between these thresholds and the activity of the service at the time of the attack. It is therefore important that such rules take into account the activity of the service, and rules defined a priori are insufficient.

The detection of behavioral anomalies consists in modeling the activity of the service entity to be protected by observing it and modeling its behavior. Few known anomaly detection methods are specifically oriented towards flood attacks. Given evaluations and a behavioral model of the service entity, these methods use statistical variables to predict the next evaluation(s). If the next actual evaluations deviate significantly from the predicted ones, an alert is reported. However, these anomaly detection methods also detect attacks other than flood attacks, which costs the security operator processing time.

Furthermore, methods for detecting anomalies dedicated to the detection of flood attacks are known, but their cost is high and their algorithms are kept confidential. Some of these methods for controlling flood attacks are designed to be installed at the core of the network and/or rely on information based on a counting mechanism installed in routers, which information characterizes the attacks only very roughly.

The purpose of the invention is to statically detect, near a service entity to be protected, behavioral anomalies both in the traffic directed towards this service entity and in the traffic that comes from it, without necessarily having to use predetermined rules for normal traffic or a prediction of evaluations of statistical variables, so as to report a flood denial of service attack very quickly and precisely.

To achieve this objective, a method for detecting anomalies in the traffic supported by a transmission link and relating to a service entity, including previously a modeling of the normal activity of the service entity, is characterized in that the modeling provides at least one model for a volume component of the traffic, each model comprising a validity period, statistical values relating to said volume component during said validity period and a compliance threshold dependent on the statistical values, said method comprising, for at least one volume component evaluation at a later date, a determination of a deviation of the evaluated volume component from the model of the volume component having a validity period that substantially includes the date of the evaluation, and a determination of an alert value according to the deviation of at least one volume component.

The invention detects and characterizes a flood denial of service attack in order to protect the service entities of an operator or of its customers, for example. When a flood denial of service attack is detected, the invention characterizes it by deviations of the evaluated volume components, which represent an abnormal dispersion of the evaluated volume components compared to the heterogeneities of the volume components in the normal models and, consequently, compared to the averages of the volume components in these models. The invention then quantifies the anomaly to be detected by the alert value, which depends on the deviations of the volume components for the current evaluation and for previous evaluations.

Thus, the invention makes it possible to detect an abnormal activity in the traffic of the service entity when the alert value exceeds a predetermined alert level. The conformity of a volume component evaluation is assessed, contrary to the prior art, according to the date of this evaluation; it is thus possible to take into account the normal variation of the traffic over a certain period, for example during a day.

It should be noted that, in the context of the invention, the evaluations of volume components may relate to the traffic destined for the service entity to be protected, or to the traffic originating from this entity, or to both of these traffics at the same time.

The detection of behavioral anomalies according to the invention is "static" in that it relies on stable periods of activity of the monitored service entity, and in that it determines models relevant for the detection which comprise statistical values relating to the volume components, these values constituting static variables that are therefore stationary over time.

According to a preferred embodiment, the modeling comprises periodic evaluations of volume components relating to the service entity's traffic for several predetermined durations; a determination of periods of stable activity of the service entity, performed recursively by determining statistical values of each volume component and partitioning the values of the volume components during each predetermined duration, based on the statistical values, into clusters of which the most heterogeneous is selected and partitioned into other clusters, and so on until the last cluster selected has a heterogeneity below a predetermined heterogeneity threshold; and a determination of several statistical values and a compliance threshold for each of the volume components and for each of the clusters, each model thus gathering a cluster of evaluations of a respective volume component for a validity period. According to particular arrangements, in order to ensure effective detection of anomalies when the normal traffic of the service entity changes, the invention provides for an updating of models that periodically replaces the current old models relating to the service entity with updated models based on recent values of the volume components associated with the service entity. Advantageously, the recent values of the evaluated volume components are purged of any value that has led to the signaling of an abnormal activity.

The invention also relates to a device for detecting anomalies in the traffic supported by a transmission link and relating to a service entity, the normal activity of the service entity being previously modeled, and to a traffic probe including the device for detecting anomalies. The device is characterized in that it comprises means for modeling the normal activity of the service entity by at least one model for a volume component of the traffic, each model comprising a period of validity, statistical values relating to said volume component during said validity period and a compliance threshold dependent on the statistical values; means for determining, for at least one volume component evaluation at a later date, a deviation of the evaluated volume component from the model of the volume component having a validity period substantially including the date of the evaluation; and means for determining an alert value based on the deviation of at least one volume component.

Finally, the invention relates to a computer program adapted to be implemented in an anomaly detection device according to the invention. The program comprises instructions which, when the program is loaded and executed in said device, perform the steps of the method of detecting anomalies according to the invention.

Other features and advantages of the present invention will emerge more clearly on reading the following description of several preferred embodiments of the invention, given by way of non-limiting examples, with reference to the corresponding appended drawings in which:

FIG. 1 is a schematic block diagram of a device for detecting static anomalies in an internet type network, according to the invention; and FIG. 2 is an algorithm of the static anomaly detection method according to the invention.

With reference to FIG. 1, a device DD for the static detection of anomalies according to the invention is included in a traffic probe and serves substantially as a software-based agent for inhibiting attacks against service entities SE in a high-speed packet telecommunication network RT managed by an operator and operating according to IP ("Internet Protocol"). Data is transmitted in a transmission link LT of the network RT, which constitutes part of the Internet. The data are, for example, contained in packets transmitted by client terminals TC and intended for service entities SE comprising web servers. The transmission link LT is located near the service entities SE whose traffic is to be monitored by the probe, so that the probe detects anomalies in the traffic which characterize abnormal behavior of the clients of the service entities, particularly flood attacks on the service entities.

For example, the transmission link LT to which the probe is connected in listening mode according to FIG. 1 is located in the network RT at the point of entry of one or more service entities on the network, between the network RT and the last router RO of the operator of the network RT, which precedes a router connected to a service entity SE. The probe is not connected to a client link between the router RO and the service entity SE, whose traffic may already be congested due to the limited capacity of the client link.

Alternatively, the probe is inserted in-line (in cut-through mode) on the transmission link LT so as to also remove abnormal packets destined for one or more monitored service entities.

The detection device DD reports an alert from the observation of the traffic relating to the entity or the service entities in the link LT so that the operator takes appropriate measures to counter an attack directed at the entity or service entities.

Whatever the implementation of the probe, limited or not to the static detection device of the invention, the probe issues alerts from listening to traffic at a point of the network RT.

As shown in FIG. 1, the static detection device DD may be in the form of a computer or a workstation, or a local or distributed computer system. In connection with the invention, the device DD comprises the following modules: a human-machine management interface HMI, local or remote, including in particular a keyboard and a screen, to activate the device automatically or manually by an operator and in particular to enter service entity identifiers and characteristics, select data for detection and read alerts; a service entity declaration module DEC for selecting the portion of the traffic to be monitored in the link LT which is intended for the service entities SE to be protected; a mediation module MED connected in listening mode according to FIG. 1, or alternatively in-line on the transmission link LT, to evaluate volume components of traffic; a modeling module MOD that constructs normal activity models for the service entities to be protected, based on traffic volume component values evaluated by the mediation module MED, so as to produce static behavior models for each protected service entity and for each volume component; a database DB for recording models relating to the volume components of the service entities; an expert module EXP to introduce expert knowledge about the models relating to each service entity; an anomaly detection module DET detecting anomalies in the observed traffic destined for the service entities to be protected, which anomalies are detected in the form of abnormal deviations of traffic volume components evaluated by the mediation module MED with respect to the models of normal activity; and an alert module ALE delivering alerts following abnormal deviations of volume components.

The method of static detection of traffic anomalies according to the invention is implemented in the detection device DD and comprises three main steps E1, E2 and E3, as shown in FIG. 2. The steps E1, E2 and E3 comprise respectively sub-steps E10 to E12, E20 to E29 and E30 to E35. Sub-steps are also indicated in FIG. 1 at the level of the links between the modules of the detection device DD involved in the execution of these sub-steps. The first step E1 includes a declaration of the service entities SE to be protected, made explicitly by the operator, who thereby defines a set of services to be protected provided by these entities.

The second step E2 models the static activity of the service entities SE to be protected. It constructs behavioral models that reflect normal service entity activity resulting from normal use by their clients. Step E2 therefore considers the specificities of each of the service entities to be protected. Volume components for detecting anomalies are predetermined for each service entity to be protected. Then, the normal activity of each service entity is modeled according to the predetermined volume components, which are evaluated from the actual traffic and stored in the DB database. This first modeling considers each "evaluation" or "aggregation of evaluations" as a point in a space whose dimension P is the number of volume components considered for a given service entity. Then, the activity of each service entity is "split" into several time models during which the activity of the service entity is considered stable, and which are associated with separate clusters (clouds) of points in the space of dimension P.

The third step E3 detects abnormal activity in the monitored traffic of the service entities in the link LT. Step E3 tests whether the evaluations of the volume components of traffic correspond to models. Step E3 can also test whether the appearance of the evaluations of the volume components, in particular with respect to an order of evaluations in the models and to time references, for example expressed in hours, days of the week and months, is admissible or not, using graphs that define permissible and impermissible transitions between two models and data contained in the models produced in the modeling step E2. The detection step E3 provides a vector of deviations from the modeled values.

At the first substep E10 of the service entity declaration step E1, the operator defines the service entities SE that he wishes to protect in the declaration module DEC via the human-machine interface HMI. The DEC module thus selects the portion of the traffic in the link LT which is intended for each declared service entity.

The operator explicitly defines service entities either by entering identifiers and characteristics of the service entities to be protected, which are transmitted to the DEC module, or by entering identifiers of the service entities, which are passed to the DEC module; the latter then reads the characteristics of the selected service entities, in correspondence with the identifiers of the entities, from a list prerecorded in a database associated with the link LT. The characteristics identifying a service entity are for example triplets. Typically, the characteristic triplet of a service entity includes a destination IP address or destination IP address class, a transport protocol, and a port. Service entity identifiers may also be specific, for example the characteristic triplets themselves. For example, a declared service entity has a characteristic triplet (destination IP address, transport protocol(s) T, port(s) P) comprising:

a destination IP address in the format 'w.x.y.z/m', with w, x, y and z between 0 and 255 and m between 0 and 32, that is, possibly assigned a mask according to the CIDR (Classless Inter-Domain Routing) notation; for example, 159.151.254.0/25 designates the addresses from 159.151.254.0 to 159.151.254.127, since 2^(32-25) - 1 = 2^7 - 1 = 127;

a list of transport protocols in the format 'p1;p2', with p1, p2, ... a finite number of values taken from keywords such as 'tcp', 'udp', 'ip' and 'icmp', the value 'ip' covering both 'tcp' and 'udp'; and

a list of ports in the format 'p1;p3-p4', with p1, p2, p3, p4, ... integer values between 0 and 65535, which can be given as a range of integer values, for example '3200-3299', or as a list of integer values, for example '3200;3202;3208'.
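The characteristic triplet described above can be illustrated with a minimal Python sketch. The helper names (parse_protocols, parse_ports, matches) are hypothetical and are not part of the described device; only the address, protocol-list and port-list formats are taken from the description. The standard ipaddress module is used to expand the CIDR notation.

```python
import ipaddress

# Keywords listed in the description; the helper names below are illustrative only.
PROTOCOL_KEYWORDS = {"tcp", "udp", "ip", "icmp"}

def parse_protocols(spec):
    """Parse a list such as 'tcp;udp'; the value 'ip' covers both 'tcp' and 'udp'."""
    protocols = {p.strip() for p in spec.split(";")}
    unknown = protocols - PROTOCOL_KEYWORDS
    if unknown:
        raise ValueError(f"unknown protocol keyword(s): {sorted(unknown)}")
    if "ip" in protocols:
        protocols |= {"tcp", "udp"}
    return protocols

def parse_ports(spec):
    """Parse a list such as '3200;3202;3208' or a range such as '3200-3299'."""
    ports = set()
    for item in spec.split(";"):
        low, _, high = item.strip().partition("-")
        first, last = int(low), int(high or low)
        if not 0 <= first <= last <= 65535:
            raise ValueError(f"invalid port item: {item!r}")
        ports.update(range(first, last + 1))
    return ports

def matches(triplet, dst_ip, protocol, dst_port):
    """Tell whether a packet (destination IP, protocol, destination port) belongs to the entity."""
    network, protocols, ports = triplet
    return (ipaddress.ip_address(dst_ip) in network
            and protocol in protocols
            and dst_port in ports)

# The /25 example of the description: 128 addresses, from .0 to .127.
network = ipaddress.ip_network("159.151.254.0/25")
triplet = (network, parse_protocols("tcp"), parse_ports("3200-3299"))
```

With these assumptions, matches(triplet, "159.151.254.5", "tcp", 3250) selects the packet, whereas a destination address of 159.151.254.200 falls outside the /25 block and is ignored.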

"Supplementary" service entities may be implicitly declared as a result of the explicit declaration of the service entities to be protected. An implicit supplementary service entity may be a service entity declared by default to cover attacks on specific protocols, such as undesirable traffic using the Internet Control Message Protocol (ICMP). An implicit supplementary service entity may be constructed as a complement to at least one declared service entity having an IP address, and relate to traffic other than that relating to the services hosted by the declared service entity but having said IP address and, for example, different ports.

"Complementary" service entities are automatically inferred from the characteristics of the declared service entities in order to synthesize the activity of the link LT of the network that is not explicitly declared by the operator. The supplementary service entities to be protected have characteristic triplets (destination IP address, transport protocol(s) T, all ports except P) and (destination IP address, transport protocol(s) except T, port(s) P).

As a variant, at a step E10a that can be combined with the step E10 or replace it for certain service entities, the declaration of service entities is automatic, by means of listening to the traffic in the link LT by the declaration module DEC. In this variant, the DEC module is connected in listening mode according to FIG. 1. The DEC module establishes a list of the service entities requested by the packets that pass in the link LT and classifies these service entities by frequency of appearance of the packets intended for these entities, in order to obtain a preliminary list of service entities. Then, this list is automatically reduced in order, for example, to keep in the memory of the DEC module only the service entities requested by more than 1% of the packets, or is modified by the operator. The list thus established is the list of service entities to be protected.
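The automatic reduction of the preliminary list can be sketched as follows, assuming that each observed packet has already been mapped to the identifier of the service entity it requests. The function name auto_declare and the min_share parameter are illustrative; the 1% default is the example value given above.

```python
from collections import Counter

def auto_declare(packet_targets, min_share=0.01):
    """Return the service entities requested by more than min_share of the packets.

    packet_targets: iterable of service entity identifiers, one per observed packet.
    The result is ordered by decreasing frequency of appearance.
    """
    counts = Counter(packet_targets)
    total = sum(counts.values())
    return [entity for entity, n in counts.most_common() if n / total > min_share]

# 200 observed packets: 'rare' is requested by 0.5% of them and is dropped.
observed = ["web"] * 150 + ["dns"] * 49 + ["rare"] * 1
declared = auto_declare(observed)
```

The operator could still edit the resulting list, as the variant allows.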

After the step E10 or E10a, the service entities to be protected by the device DD are well identified. The modeling module MOD is initialized by receiving the list of identifiers of the service entities to be protected provided by the declaration module DEC. The MOD module receives the list automatically at the sub-step E11, or alternatively after validation by the operator via the HMI interface in the sub-step E11a.

For each service entity to be protected, the MOD modeling module associates default traffic evaluation parameters, which are a granularity and a default list of volume components of the traffic destined for, or possibly originating from, the service entity SE.

The granularity defines a service entity traffic evaluation period and may be a default value dependent on a characteristic of the service entity. For example, the granularity is 10 seconds if the port P of the service entity is 80 (HTTP), or 1 minute if the port is 23 (Telnet). The volume components of traffic in the default list are characteristics that can be read from network and transport layer information and aggregated as counts of counters reset at each traffic evaluation period defined by the granularity. The volume components are for example:

at the level of the IP network layer: the volumetry, expressed as a traffic rate in bytes per second; the volubility, expressed as a traffic rate in packets per second; the connectivity, expressed as a number of different source addresses in packets destined for the service entity; and the fragmentation, expressed as a fragmented packet rate as a function of the DF and MF bits in the flag field of the header of an IP packet; and

at the Transmission Control Protocol (TCP) level: the connection opening, expressed as a rate of packets including a SYN control bit at "1"; the "stress", expressed as a rate of packets including a PUSH control bit at "1" indicating that the data is to be transferred to the upper layer; the urgency, expressed as a rate of packets including an urgent-data control bit URG at "1"; and the volatility, expressed as a number of different ports solicited.
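The default volume components listed above can be evaluated for one evaluation period along the following lines. This is only a sketch: the Packet record is a hypothetical, simplified view of the header fields, the rates are assumed to be expressed per second, and the volatility is assumed to count destination ports.

```python
from dataclasses import dataclass

@dataclass
class Packet:
    """Hypothetical, simplified packet record (not the device's internal format)."""
    src_ip: str
    dst_port: int
    length: int                         # total packet length in bytes
    fragmented: bool = False            # derived from the IP fragmentation bits
    tcp_flags: frozenset = frozenset()  # e.g. frozenset({"SYN", "PSH", "URG"})

def volume_components(packets, period_s):
    """Evaluate the default volume components over one evaluation period."""
    def flag_rate(flag):
        return sum(flag in p.tcp_flags for p in packets) / period_s

    return {
        "volumetry": sum(p.length for p in packets) / period_s,          # bytes/s
        "volubility": len(packets) / period_s,                           # packets/s
        "connectivity": len({p.src_ip for p in packets}),                # distinct sources
        "fragmentation": sum(p.fragmented for p in packets) / period_s,  # fragments/s
        "opening": flag_rate("SYN"),
        "stress": flag_rate("PSH"),
        "urgency": flag_rate("URG"),
        "volatility": len({p.dst_port for p in packets}),                # distinct ports
    }

sample = [
    Packet("10.0.0.1", 80, 100, tcp_flags=frozenset({"SYN"})),
    Packet("10.0.0.2", 80, 300),
    Packet("10.0.0.1", 443, 600, fragmented=True, tcp_flags=frozenset({"PSH"})),
]
components = volume_components(sample, period_s=10)
```

Each call yields one point in the space of volume components used later by the modeling step.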

To the default list of traffic volume components can be added a list of other specific volume components that depend on the service entity, such as quantities specific to the application protocol used. For example, volume components to be evaluated on a HyperText Transfer Protocol (HTTP) service are the number of requests using a specific method, such as the 'GET' method; the number of requests to CGI (Common Gateway Interface), PHP (Personal Home Page) or JSP (Java Server Page) pages; the number of 2xx, 3xx, 4xx or 5xx response codes returned by the server, which involves measuring return traffic; or the version of the protocol used (0.9, 1.0, 1.1).

However, in the sub-step E11a, via the HMI interface, the operator can manually modify the list of volume components as well as any other evaluation parameter that has taken a default value, such as the granularity.

In the substep E12, the module MOD transmits a list of the service entities, possibly after validation by the operator, to the MED mediation module of the DD device. For the normal activity of each service entity to be protected that the MOD module must model, the service entity list includes the identifier and characteristics of the service entity and the volume components of the traffic necessary for the modeling, so that, from these volume components, the MOD module produces a model of static behavior of the service entity.

At the beginning E20 of the service entity activity modeling step E2, the mediation module MED extracts from each relevant packet captured in the link LT and destined for an identified service entity to be protected, in particular, the source IP address, the destination IP address, the transport protocol field value, the source port, the destination port, the total packet length in bytes, the flag field with the fragmentation bits, and the list of TCP flags, in order to evaluate the volume components.

For each service entity, the MED module comprises counters COM which respectively count, for example, bytes, packets, predetermined source addresses and predetermined control bits to express the volume components, and which are thus assigned to the evaluation of the volume components of the service entity. The counters are reset regularly at each evaluation period according to the granularity, for example of the order of a few seconds, typically 10 seconds. Each volume component evaluation is time-stamped with the start date of the period. During substep E20, at the end of each evaluation period, the counter counts constitute values of the requested traffic volume components that are aggregated, formatted and provided by the MED module to the MOD module.
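The behavior of counters that are reset at each evaluation period and time-stamped with the start of that period can be sketched as a simple bucketing of packet timestamps. The function name aggregate and the (timestamp, length) input format are assumptions made for illustration; only byte and packet counters are shown.

```python
from collections import defaultdict

def aggregate(packets, granularity_s=10):
    """Group (timestamp_s, length_bytes) pairs into evaluation periods.

    Each period is keyed by its start time, which plays the role of the
    time stamp of the evaluation; counters restart from zero in each period.
    """
    periods = defaultdict(lambda: {"bytes": 0, "packets": 0})
    for timestamp, length in packets:
        start = int(timestamp // granularity_s) * granularity_s
        periods[start]["bytes"] += length
        periods[start]["packets"] += 1
    return dict(sorted(periods.items()))

evaluations = aggregate([(0.5, 100), (9.9, 50), (10.0, 20), (25.3, 40)])
```

In this sketch a period without any packet produces no entry; a real probe would presumably report zero counts for such a period.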

The volume components provided by the MED module are temporarily stored in an evaluation database included in the MOD module. For a given service entity, the MOD module processes a series of sets of volume component values that have been evaluated for a predetermined duration which is much greater than the evaluation period given by the granularity adopted for the aggregation of the volume components, or else defined by the operator. In particular, since the models are established with a period of validity of the order of an hour in the day, or even of a day in the month, the predetermined duration for collecting the volume components is between one week and at least one month. The set of predetermined durations during which the counts of the COM counters are read and supplied to the MOD module to perform the first modeling when the detection device DD is started is called the learning phase. From these volume components evaluated during the learning phase for the service entities to be protected, the module MOD produces models of static behavior of each protected service entity, as explained below.

To model the behavior of the traffic destined for the given service entity SE, and thus its normal activity, the module MOD recursively determines break points in the series of sets of volume components evaluated during the learning phase, in order to determine periods of stable activity called clusters, in the substep E21. For each service entity, the MOD modeling module defines periods of stable activity over which the clusters extend.

For example, the modeling uses a hybrid approach based on both an a priori definition and an a posteriori definition of the behavior periods of the clusters. The evaluated volume components are first grouped into distinct clusters according to their day of the week. Then, for each day of the week, a partitioning analysis discovers the distinct periods of activity, with an upper limit on the number of clusters thus discovered which are sufficiently distinct from one another. This limit is derived from the desired modeling fineness. Thus, the number of possible clusters in a day can be limited to a minimum number, and/or the duration of each cluster can be limited to a minimum duration.

The recursion of the modeling includes for example a centering-reduction (standardization) of each volume component x evaluated in the initial cluster of a day, by determining statistical values of the volume component, such as the mean of the evaluated values of the volume component and their standard deviation, and replacing each value of the volume component with the ratio of the difference between that value and the mean to the standard deviation, so that the mean of the ratios relating to the volume component is zero and their standard deviation equals 1. Euclidean distances between the ratios of two evaluations, over all volume components, are determined.
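The centering-reduction and the Euclidean distance between two evaluations can be written as a short Python sketch. It assumes the population standard deviation (so that the ratios have a standard deviation of exactly 1) and replaces a zero standard deviation by 1 to avoid a division by zero; both choices are assumptions, as is the function name.

```python
from math import dist
from statistics import mean, pstdev

def center_and_reduce(evaluations):
    """Replace each value by (value - mean) / standard deviation, per volume component.

    evaluations: list of rows, one row per evaluation, one column per volume component.
    """
    columns = list(zip(*evaluations))
    means = [mean(column) for column in columns]
    # Assumption: a constant component keeps a divisor of 1 to avoid dividing by zero.
    deviations = [pstdev(column) or 1.0 for column in columns]
    return [[(value - m) / s for value, m, s in zip(row, means, deviations)]
            for row in evaluations]

# Two evaluations of two volume components (e.g. bytes/s and packets/s).
ratios = center_and_reduce([[1.0, 10.0], [3.0, 30.0]])
distance = dist(ratios[0], ratios[1])   # Euclidean distance between the two evaluations
```

After this transformation, components measured in very different units contribute comparably to the distance.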

Then, recursively, the initial cluster is partitioned into clusters of which the most heterogeneous is partitioned into other clusters, and so on. For each cluster to be partitioned, signed functions are evaluated, each equal to the sum of the ratios for all the volume components relating to one evaluation included in the cluster. Each sign change in the time series of the signed functions relating to the evaluations included in the cluster is characterized by a negative product of the signed functions relating to two successive evaluations. The sign change indicates a separation that breaks the series between the end of one new cluster and the beginning of the next new cluster within the cluster to be partitioned. For each new cluster, a heterogeneity is estimated, equal to the sum of the squares of the Euclidean distances between the center of this new cluster and the ratios relating to all the volume components and evaluations in this new cluster. The cluster with the highest heterogeneity is then selected to be partitioned into clusters, among which the most heterogeneous cluster is selected as before, and so on until the heterogeneity of a selected cluster is less than a predetermined heterogeneity threshold.
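The recursive partitioning can be sketched as follows. The description does not specify how a cluster that is already homogeneous in sign is split again, so this sketch assumes that the ratios are re-centered and re-reduced inside each cluster before its signed functions are evaluated, and that partitioning stops if the selected cluster cannot be split. All function names are illustrative.

```python
from statistics import mean, pstdev

def standardize(rows):
    """Center and reduce each column; a constant column keeps a divisor of 1."""
    columns = list(zip(*rows))
    means = [mean(c) for c in columns]
    deviations = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, deviations)] for row in rows]

def heterogeneity(rows):
    """Sum of squared Euclidean distances between the rows and their center."""
    center = [mean(c) for c in zip(*rows)]
    return sum(sum((v - c) ** 2 for v, c in zip(row, center)) for row in rows)

def split_on_sign_change(rows):
    """Cut a time-ordered cluster wherever the signed function changes sign."""
    # Assumption: the signed function is evaluated on ratios re-standardized locally.
    signed = [sum(row) for row in standardize(rows)]
    pieces, start = [], 0
    for i in range(1, len(rows)):
        if signed[i - 1] * signed[i] < 0:   # negative product = sign change
            pieces.append(rows[start:i])
            start = i
    pieces.append(rows[start:])
    return pieces

def partition(rows, threshold):
    """Repeatedly split the most heterogeneous cluster until it is below threshold."""
    clusters = [rows]
    while True:
        worst = max(range(len(clusters)), key=lambda i: heterogeneity(clusters[i]))
        if heterogeneity(clusters[worst]) < threshold:
            return clusters
        pieces = split_on_sign_change(clusters[worst])
        if len(pieces) == 1:                # assumption: stop when no split is possible
            return clusters
        clusters[worst:worst + 1] = pieces  # keep the clusters in time order

# One day with a quiet period followed by a busy period (two volume components).
raw = [[1.0, 10.0], [2.0, 11.0], [1.0, 10.0], [2.0, 11.0],
       [9.0, 50.0], [10.0, 51.0], [9.0, 50.0], [10.0, 51.0]]
clusters = partition(standardize(raw), threshold=1.0)
```

On this toy series the initial cluster has a heterogeneity of 16 (eight evaluations, two standardized components) and is cut once, at the transition between the quiet and busy periods.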

The MOD module produces a set of clusters that include consecutively evaluated sets of volume components having permissible heterogeneity. The preliminary recognition of stable behaviors thus separates the initial cluster of values of the volume components evaluated during the learning phase into the most homogeneous clusters possible. Each cluster corresponds to a validity period included in a predetermined period such as a day. As a result, for a learning phase over several weeks, each day has a set of clusters.

In sub-step E22, for each of the clusters thus determined and for each volume component x, the MOD module calculates statistical values, for example the mean μ_x, the standard deviation σ_x and the population, equal to the number of evaluations in the cluster, together with a compliance threshold S_x that depends on the previously calculated statistical values. The compliance threshold S_x for a volume component x is, for example, equal to the sum of the mean μ_x of the volume component and the product of the standard deviation σ_x of the volume component by the square root of the ratio of the heterogeneity of the cluster at the end of the partitioning to the number of evaluations in the cluster.
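As a minimal sketch of the example threshold given above, S_x = μ_x + σ_x · sqrt(H / n), where H is the heterogeneity of the cluster at the end of the partitioning and n the number of evaluations in the cluster. The function name `compliance_threshold`, the use of the population standard deviation and the numerical values are assumptions made for illustration.

```python
import math
from statistics import fmean, pstdev

def compliance_threshold(values, cluster_heterogeneity):
    """Return (mean, standard deviation, population, threshold) for one component."""
    n = len(values)
    mu = fmean(values)
    sigma = pstdev(values)  # assumption: population standard deviation
    threshold = mu + sigma * math.sqrt(cluster_heterogeneity / n)
    return mu, sigma, n, threshold

# Hypothetical values of one volume component in a cluster of four evaluations,
# with a hypothetical cluster heterogeneity of 16.
mu, sigma, population, s_x = compliance_threshold([8.0, 10.0, 12.0, 10.0], 16.0)
```

With these values the mean is 10, the standard deviation is the square root of 2 and the ratio H / n is 4, so the threshold lies two standard deviations above the mean.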

Finally, in substep E23, the MOD module combines each cluster with each volume component x to produce a model, which is a list including the identifier of the service entity, a validity period calculated from the start and end dates of the evaluations in the cluster, such as Monday between 9:00 am and 10:30 am, the granularity of the cluster, the creation date of the model, the designation of the volume component x, the statistical values, such as the mean μ_x and the standard deviation σ_x, relating to the volume component x in the cluster, and the compliance threshold S_x, which depends on the preceding statistical values and relates to the volume component. The cluster is thus defined by the data of the models of the different volume components, and the evaluations that led to its creation can be erased in the MOD module. The models of all the clusters are transferred to the DB database, which stores them in substep E24, and/or to the expert module EXP described later, in substep E25.
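One possible representation of such a model is sketched below. The class name `ComponentModel`, its field names, the encoding of the validity period as a weekday with start and end times, and all the values shown are assumptions made for illustration; the description only lists the items that a model contains.

```python
from dataclasses import dataclass
from datetime import date, time

@dataclass
class ComponentModel:
    entity_id: str        # identifier of the service entity
    weekday: int          # validity period: day of the week, 0 = Monday (assumption)
    start: time           # validity period: start time
    end: time             # validity period: end time
    granularity_s: int    # evaluation period, in seconds (assumed unit)
    created: date         # creation date of the model
    component: str        # designation of the volume component x
    mean: float           # mean of the component in the cluster
    std: float            # standard deviation of the component in the cluster
    population: int       # number of evaluations in the cluster
    threshold: float      # compliance threshold S_x

    def covers(self, weekday, moment):
        """True if the validity period includes the given day and time."""
        return weekday == self.weekday and self.start <= moment <= self.end

# Hypothetical model valid on Mondays between 9:00 am and 10:30 am.
model = ComponentModel("web-1", 0, time(9, 0), time(10, 30), 60,
                       date(2006, 7, 4), "packets", 100.0, 10.0, 90, 120.0)
```

Because the model keeps only the statistics of the cluster, the individual evaluations no longer need to be retained once the model has been produced.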

The MOD modeling module includes a MAM model update sub-module that periodically reads the current old models relating to a given service entity in the DB to provide updated models replacing the current old models. This model replacement is intended to reflect the potential evolution of customer usage for communication with SE protected service entities and thus the evolution of the actual LT link traffic between TC terminals and protected service entities. The updating period may be a duration substantially equal to that of the learning phase.

For a model update E26, which repeats at least substeps E20 to E25, the MOD module produces updated models based on recent values of the volume components associated with the service entity, evaluated with the granularity defined in said current models and supplied by the MED module since the last update, as in step E20. The recent values of the evaluated volume components are first stored temporarily in the evaluation database and then purified of any value that led to the generation of an alert, and therefore to the reporting of an abnormal activity of the service entity. The purification verifies that these recent values are consistent with the existing old models, that is to say that they do not trigger an alert when the detection function detailed below is applied.

The purification can however be skipped by decision of the operator or according to the definition of different modes of updating: for example an "obsolete" mode for which the purification is skipped, and a "normal" mode for which the purification is performed.

Then, the MAM submodule introduces the recently evaluated value of the volume component into the set of values of the volume component of any associated model whose validity period substantially includes the evaluation date of the recent value of the volume component. The statistical values of the volume component, such as the mean, the dispersion represented by the standard deviation, and the population, as well as the compliance threshold for the volume component associated with the model, are updated by the MAM submodule according to the recent value of the volume component.

Preferably, the MAM update submodule updates the statistical values, including the compliance threshold, by applying a weighting to the values of the volume component of the current old model. The weighting depends on the population of the values of the volume component in the old model, which is a priori different from the population of the recently evaluated values of the volume component because of the purification. Advantageously, the weighting in the MAM submodule depends on the creation date of the current old model so as to give less weight to the current old model; for example, the "population" parameter is divided by a coefficient that increases with the age of the current old model.
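A weighted update of this kind may be sketched as follows. The linear aging law, the parameter `aging_days`, the function name `update_stats` and the pooling of the two sets of statistics by their weights are all assumptions of this sketch; the description only states that the population of the old model is divided by a coefficient that increases with its age. The recalculation of the compliance threshold is not shown.

```python
import math
from statistics import fmean, pstdev

def update_stats(old_mean, old_std, old_population, age_days, recent_values,
                 aging_days=30.0):
    """Merge the statistics of an old model with recently evaluated values."""
    # Assumed aging law: the coefficient grows linearly with the age of the model.
    coefficient = 1.0 + age_days / aging_days
    w_old = old_population / coefficient   # reduced weight of the old model
    w_new = len(recent_values)             # weight of the recent values
    new_mean, new_std = fmean(recent_values), pstdev(recent_values)
    total = w_old + w_new
    mean = (w_old * old_mean + w_new * new_mean) / total
    # Pooled variance of two weighted groups around the merged mean.
    variance = (w_old * (old_std ** 2 + (old_mean - mean) ** 2)
                + w_new * (new_std ** 2 + (new_mean - mean) ** 2)) / total
    return mean, math.sqrt(variance), total

# Hypothetical old model (mean 10, population 4, 30 days old) and two recent values.
mean, std, population = update_stats(10.0, 0.0, 4, 30, [20.0, 20.0])
```

In this example the age of the old model halves its population, so the old and recent values carry equal weight and the updated mean lies midway between them.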

Consequently, the MOD module establishes an age of each registered model expressed in number of days, in association with the identifier of the service entity, the validity period (start and end times of a day of the week or of the month) and an integer phase code indicating whether the model is being built during the learning phase, in use for detection, or temporarily disabled, or has generated a recent alert.

After updating, the updated models are transferred to the DB database, which records them as in substep E24, and/or to the expert module EXP as in substep E25, to replace and delete the current old models.

At the end of each modeling E23, expert knowledge may be introduced by the expert module EXP. In a substep E27 following substep E25, the expert module EXP processes the models transmitted by the MOD module after the modeling and concerning one service entity, or several service entities having common characteristics such as an IP address. The EXP module creates, from the current models, advanced conformance schemes that are used in conjunction with the models in the detection step E3.

For example, for each service entity and each granularity, the EXP module creates an automaton whose eligibility graph has as vertices the models relating to the service entity and the granularity, and whose directed edges form a rule base. For example, an edge between model states ETa and ETb is associated with a rule of the type "there is a transition from the state ETa to the state ETb if the validity period of the model associated with the state ETa immediately precedes the validity period associated with the state ETb". Through the man-machine interface HMI, the operator can modify the automaton by adding other edges. The resulting automaton is associated with each model of the service entity for the granularity and is stored in the DB database in a substep E28.
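The construction of such an eligibility graph may be sketched as follows. The sketch assumes that validity periods are expressed in minutes from the start of the day and that "immediately precedes" means that one period ends exactly when the next begins; the function names and the state ETc are hypothetical.

```python
def build_eligibility_graph(periods):
    """Build directed edges between models whose validity periods follow each other.

    periods maps a model state to (start_minute, end_minute) of its validity period.
    """
    edges = {state: set() for state in periods}
    for a, (_, end_a) in periods.items():
        for b, (start_b, _) in periods.items():
            # Assumed rule: ETa -> ETb if the period of ETa ends where ETb starts.
            if a != b and end_a == start_b:
                edges[a].add(b)
    return edges

def add_edge(edges, a, b):
    """Edge added by the operator through the man-machine interface."""
    edges[a].add(b)

# Hypothetical models: 9:00-10:30, 10:30-12:00 and 12:00-14:00.
periods = {"ETa": (540, 630), "ETb": (630, 720), "ETc": (720, 840)}
edges = build_eligibility_graph(periods)
automatic = {state: set(successors) for state, successors in edges.items()}
add_edge(edges, "ETa", "ETc")
```

The dictionary `automatic` keeps the edges derived from the rule alone, while `edges` also contains the edge added by the operator.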

As an alternative to substep E27, the EXP module is activated only when the operator activates the detection module DET. The EXP module then reads from the DB database the models associated with the evaluation parameters of the requested detection, and requests the modification and validation of the eligibility graph by the operator, before sending the associated automata to the DET module in substep E29.

The abnormal activity detection step E3 comprises initialization substeps E30 to E32 and comparison sub-steps to the models E33 to E35.

The DET anomaly detection module is activated either automatically after substeps E24 and E25 of the learning phase, that is to say after the first modeling, or by the operator via the man-machine interface HMI. In substep E30, the DET module receives from the declaration module DEC and/or from the operator via the HMI interface the list of identifiers and characteristics of the service entities to be protected, and possibly the list of evaluation parameters, including the granularity of the evaluations, for each service entity. The DET module reads from the DB database all the current models corresponding to the parameters; if no granularity is specified for a service entity, the DET module retrieves from the DB database the current, and therefore most recent, models for this service entity and reads their granularities. Thus, the DET module has the list of service entities to protect and the current models relevant for detection. The DET module loads and stores all these current models in RAM.

For each service entity SE that the device DD is to protect, the identifier and characteristics of the service entity and the traffic evaluation parameters of that entity that are necessary for the detection and contained in the models are, after possible modification and validation by the operator, then applied by the DET module to the mediation module MED, in substep E31.

In response to the characteristics and parameters of the service entity, the mediation module MED periodically delivers to the detection module DET an evaluation of the traffic intended for the service entity, in substep E32. This evaluation includes the counts of the counters COM that are assigned to the evaluation of the volume components of the service entity and that are reset periodically, at the evaluation period corresponding to the granularity requested for the service entity.

The evaluated volume components of the service entity have "instantaneous" values expressed by the counts of the respective counters COM, which are delivered by the MED module to be processed by the DET detection module. Following the evaluation, the DET module calls the current models relating to the volume components. Each volume component may be associated with at least one current model of normal activity transferred from the DB database into the RAM. The current model of normal activity is considered relevant if its validity period, widened on either side by a time window, includes the date of the evaluation. The time window compensates for normal traffic activity that occurs slightly shifted in time, and thus avoids false alarms. In substep E33, for each volume component x, the DET module determines a deviation D_x of the evaluation with respect to each relevant current model of normal activity. The deviation D_x is, for example, the ratio between the distance (absolute value of the difference) between the instantaneous value x of the evaluated volume component and its mean μ_x in the model, and the distance (absolute value of the difference) between the compliance threshold S_x in the model and the mean μ_x in the model. The DET module also determines a global deviation DG as a function of the deviations for the evaluated volume components, for example equal to the root mean square of the deviation values.
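The example deviation D_x = |x - μ_x| / |S_x - μ_x| and the global deviation DG taken as the root mean square of the deviations may be sketched as follows. The function names, the component names and the numerical values are hypothetical, and the sketch assumes that the compliance threshold differs from the mean, so that the denominator is not zero.

```python
import math

def deviation(value, mean, threshold):
    """Ratio of the distance value-mean to the distance threshold-mean."""
    # Assumption: threshold != mean, otherwise the ratio is undefined.
    return abs(value - mean) / abs(threshold - mean)

def global_deviation(deviations):
    """Root mean square of the deviations of the evaluated volume components."""
    return math.sqrt(sum(d * d for d in deviations) / len(deviations))

# Hypothetical relevant models: component -> (mean, compliance threshold).
models = {"packets": (100.0, 120.0), "bytes": (5000.0, 6000.0)}
# Hypothetical instantaneous values delivered for one evaluation.
evaluation = {"packets": 160.0, "bytes": 5500.0}

d = {name: deviation(evaluation[name], mu, s) for name, (mu, s) in models.items()}
dg = global_deviation(list(d.values()))
```

A deviation greater than 1 means that the evaluated value lies farther from the mean than the compliance threshold does; here this is the case for the component "packets" only.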

If expert knowledge has been introduced according to substep E29, the DET module uses it to reduce the number of false positives, that is, of alerts that do not correspond to actual anomalies. For example, when an eligibility graph has been created by the expert module EXP as described in substep E27, the expert knowledge depends on the statistical values of close models, that is to say models whose distance in the graph to said current model is small, such as the models separated from the current model by one edge, and whose validity periods are not far from the evaluation date of the deviation value of the volume component associated with the current model, so that the expert knowledge is used to refine, for example to decrease, the deviation value. When calculating the ratio between the distances for the deviation value D_x, the value of the compliance threshold S_x of the model can, for example, be replaced by the greatest of the compliance thresholds in the models close to said current model.
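This refinement may be sketched as follows, assuming that the close models are those separated from the current model by one edge in either direction and that the threshold of the current model itself takes part in the maximum, so that the refinement can only decrease the deviation. These two points, the function name `refined_threshold` and the values are assumptions made for illustration.

```python
def refined_threshold(current, thresholds, edges):
    """Greatest compliance threshold among the current model and its neighbours."""
    neighbours = set(edges.get(current, ()))                               # successors
    neighbours |= {m for m, succ in edges.items() if current in succ}      # predecessors
    # Assumption: the current model is included in the maximum.
    return max(thresholds[m] for m in neighbours | {current})

# Hypothetical eligibility graph and compliance thresholds of one volume component.
edges = {"ETa": {"ETb"}, "ETb": {"ETc"}, "ETc": set()}
thresholds = {"ETa": 120.0, "ETb": 150.0, "ETc": 110.0}

value, mean = 160.0, 100.0   # hypothetical evaluation and mean of model ETa
plain = abs(value - mean) / abs(thresholds["ETa"] - mean)
refined = abs(value - mean) / abs(refined_threshold("ETa", thresholds, edges) - mean)
```

In this example the neighbouring model ETb has a higher threshold than ETa, so the deviation computed for ETa decreases once the expert knowledge is applied.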

Then, in substep E34, for each evaluation, the DET module groups the values of the deviations D_x of the volume components and of the global deviation DG in an alert vector, which thus represents the more or less pronounced similarity of the evaluation to the current models relevant to this evaluation. The alert vector is delivered to the ALE alert module responsible for the output of alerts. Finally, in substep E35, the ALE alert module determines a global alert value by examining said alert vector. If the global alert value exceeds a predetermined alert level, the ALE module reports abnormal activity in the traffic of the service entity SE by transmitting an alert to the operator via the HMI interface, and/or to an external device. The transmitted alert is accompanied by the values of the volume components of the evaluation that triggered the alert, the compliance thresholds relevant at the time of the evaluation, and a type of alert depending on the values of the deviations of the volume components.
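The description does not specify how the global alert value is derived from the alert vector. The following sketch assumes, purely for illustration, that it is the largest entry of the vector and that the type of alert names the volume components whose deviation exceeds 1; the function name `evaluate_alert` and the values are hypothetical.

```python
def evaluate_alert(deviations, global_dev, alert_level):
    """Return an alert description, or None if the activity is considered normal."""
    vector = dict(deviations, DG=global_dev)   # alert vector: D_x values and DG
    alert_value = max(vector.values())         # assumption: largest entry
    if alert_value <= alert_level:
        return None
    # Assumed alert type: components lying beyond their compliance threshold.
    exceeded = sorted(name for name, d in deviations.items() if d > 1.0)
    return {
        "alert_value": alert_value,
        "type": "+".join(exceeded) or "global",
        "vector": vector,
    }

# Hypothetical deviations for one evaluation.
alert = evaluate_alert({"packets": 3.0, "bytes": 0.5}, 2.15, alert_level=2.0)
quiet = evaluate_alert({"packets": 3.0, "bytes": 0.5}, 2.15, alert_level=5.0)
```

With a predetermined alert level of 2.0 the evaluation is reported as abnormal, whereas with a level of 5.0 no alert is issued.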

In practice, the method of detecting anomalies according to the invention is provided to be able to detect anomalies relating to the traffic of several service entities supported by the transmission link. Previously, each service entity is declared by a destination address, at least one transport protocol and at least one port and a list of volume components to be evaluated according to a predetermined evaluation period.

The invention described herein relates to a method and a computing device DD for detecting anomalies in the traffic supported by the transmission link and relating to one or more service entities SE. According to a preferred implementation, the steps of the method of the invention are determined by the instructions of a computer program incorporated in the computing device. The program comprises program instructions which, when said program is loaded and executed in the device, whose operation is then controlled by the execution of the program, carry out the steps of the method according to the invention.

Accordingly, the invention also applies to a computer program, including a computer program on or in an information carrier, adapted to implement the invention. This program can use any programming language, and be in the form of source code, object code, or intermediate code between source code and object code such as in a form partially compiled, or in any other form desirable for implementing the method according to the invention.

The information carrier may be any entity or device capable of storing the program. For example, the medium may comprise storage means or recording medium, such as a ROM, for example a CD ROM or a microelectronic circuit ROM, or a USB key, or a magnetic recording means, for example a floppy disk or a hard disk.

On the other hand, the information medium may be a transmissible medium such as an electrical or optical signal, which may be conveyed via an electrical or optical cable, by radio or by other means. The program according to the invention can in particular be downloaded to an Internet type network.

Alternatively, the information carrier may be an integrated circuit in which the program is incorporated, the circuit being adapted to execute or to be used in carrying out the method according to the invention.

Claims

1 - Method for detecting anomalies in the traffic supported by a transmission link (LT) and relating to a service entity (SE), including previously a modeling of the normal activity of the service entity, characterized in that the modeling (E2) provides at least one model for a volume component of the traffic, each model comprising a validity period, statistical values relating to said volume component during said validity period and a compliance threshold depending on the statistical values, said method comprising, for at least one evaluation of a volume component at a later date, a determination (E31 - E33) of a deviation of the evaluated volume component from the model relating to the volume component and having a validity period substantially including the date of the evaluation, and a determination (E34, E35) of an alert value according to the deviation of at least one volume component.

2 - Method according to claim 1, wherein the deviation of the volume component in a model is the ratio between the distance between the evaluated volume component and the average of the volume component in the model, and the distance between the compliance threshold in the model and the average.

3 - Process according to claim 1 or 2, wherein the modeling comprises periodic evaluations (E20) of volume components relating to the traffic of the service entity (SE) during a plurality of predetermined periods, a recursive determination of periods of stable activity of the service entity by determining (E21) statistical values of each volume component and partitioning the values of the volume components during each predetermined period, based on the statistical values, into clusters of which the most heterogeneous is selected and partitioned into other clusters, and so on until the last cluster selected has a heterogeneity less than a predetermined heterogeneity threshold, and a determination (E22) of several statistical values and of a compliance threshold for each of the volume components and for each of the clusters, each model thus gathering a cluster of evaluations of a respective volume component during a respective validity period.

4 - Process according to claim 3, wherein the compliance threshold for a volume component and a cluster is equal to the sum of the average of the volume component and the product of the standard deviation of the volume component by the square root of the ratio of the heterogeneity of the cluster to the number of evaluations in the cluster.

5 - Process according to any one of claims 1 to 4, wherein the deviation of each volume component is refined by expert knowledge (E27) comprising at least one of said statistical values and said compliance thresholds for models whose validity periods are not far from the evaluation date of the deviation value.

6 - Method according to any one of claims 1 to 5, comprising an update of models (E26) to periodically replace current old models relating to the service entity with updated models based on recent values of the volume components associated with the service entity.

7 - Process according to claim 6, comprising a purification of the recent values of the assessed volume components of any value that led to the report of an abnormal activity.

8 - Method according to claim 6 or 7, wherein the statistical values and the compliance threshold in a model are updated by applying a weighting dependent on the model to the values of the volume component of the model.

9 - Method according to any one of claims 1 to 8, detecting anomalies relating to the traffic of several service entities (SE) supported by the transmission link (LT), and previously comprising a declaration of each service entity by a destination address, at least one transport protocol and at least one port and a list of volume components to be evaluated according to a predetermined evaluation period.

10 - Device (DD) for detecting anomalies in the traffic supported by a transmission link (LT) and relating to a service entity (SE), the normal activity of the service entity being previously modeled, characterized in that it comprises a means (MOD) for modeling the normal activity of the service entity by at least one model for a volume component of the traffic, each model comprising a validity period, statistical values relating to said volume component during said validity period and a compliance threshold dependent on the statistical values, means (DET) for determining, for at least one evaluation of a volume component at a later date, a deviation of the evaluated volume component from the model relating to the volume component and having a validity period substantially including the date of the evaluation, and means (ALE) for determining an alert value based on the deviation of at least one volume component.

11 - Traffic probe including a device (DD) according to claim 10 for detecting anomalies.

12 - Computer program capable of being implemented in a device (DD) for detecting anomalies in the traffic supported by a transmission link (LT) and relating to a service entity (SE), the normal activity of the service entity being previously modeled, said program including instructions which, when the program is loaded and executed in said device, perform the steps of: modeling (E2) the normal activity of the service entity by at least one model for a volume component of the traffic, each model comprising a validity period, statistical values relating to said volume component during said validity period and a compliance threshold dependent on the statistical values, determining (E31 - E33), for at least one evaluation of a volume component at a later date, a deviation of the evaluated volume component from the model relating to the volume component and having a validity period substantially including the date of the evaluation, and determining (E34, E35) an alert value as a function of the deviation of at least one volume component.