Method and system for identifying Voice over IP traffic in a network
Technical field
The present invention relates generally to network traffic identification and, more particularly, to a method and system for identifying Voice over IP (VoIP) traffic in a network.
Background art
Over the past decade, communication technology has developed rapidly. In particular, the rapid growth of the Internet has not only reduced the cost of staying connected; its wide availability has also encouraged communication modes traditionally based on dedicated networks (for example telephone, fax, and television) to migrate to it. Internet applications are the computer software that implements these communication modes, and they usually follow the design philosophy copied from dedicated networks: one complex (and usually powerful) server surrounded by many simple clients. The main problem with this design is that the server becomes a single point of failure: once it fails or becomes congested, the whole service collapses immediately.
To address this weakness, peer-to-peer (P2P) communication and overlay networks have developed continuously since 1999 (marked by Napster), and today a variety of Internet applications, including file sharing, content delivery, telephony, and video broadcasting, make wide use of this breakthrough. Technically, a P2P network is decentralized by balancing the roles of the communication participants, which together form an overlay network. In such a network, any party can receive services from the network (acting as a client) while providing services to other parties (acting as a server). The workload is thus distributed evenly, resources are used efficiently, and the failure of a single participant has only local impact and is recovered automatically by the network's self-adaptation mechanisms.
Recently, P2P-based Internet telephony applications (Voice over IP, VoIP) have attracted considerable attention from many quarters. Third-party P2P telephony applications are of particular interest to network operators because they compete directly with traditional telephone services and are usually hard to control, hard to charge for, and even hard to measure, which seriously damages operators' revenue. For example, Skype (a free voice communication program, see http://www.skype.com), the most popular P2P telephony application with millions of registered users, offers high voice quality and easy-to-use software. Following its success in fixed networks, it is only a matter of time before Skype offers its service to mobile users.
Operators have traditionally had a strong interest in service control. To enforce it, an operator must install service control devices (such as firewalls) in the network. For these devices, traffic identification is the prerequisite for all subsequent processing. To this end, the present invention proposes a novel method for identifying VoIP traffic in a network. The method does not depend on any specific application implementation, so it can identify VoIP traffic stably and can be implemented with high performance.
Traditional VoIP applications use well-known ports or centralized servers, or contain signatures (also called application feature strings) that service control devices can distinguish. In contrast, the new generation of VoIP applications, which use P2P and encryption technology, poses a great challenge to such devices and gives those applications the ability to evade identification.
On the P2P side, a P2P-based network architecture allows a VoIP call to take place between any calling and called party without involving a specific server, and allows the call to send packets via any port or even to tunnel through another protocol (for example HTTP). This prevents a service control device from learning of a VoIP call by checking whom the client contacts. For example, two Skype users may set up a call connection directly between themselves; if a direct connection is not the best choice, a Skype super node (SN) is selected dynamically to relay the call traffic. One notable phenomenon is that many applications contact a well-known server during start-up, for example to register or log in. This is not sufficient for identifying VoIP traffic, however, because these applications usually establish another peer-to-peer connection to deliver the actual data content (for example text messages or voice).
On the encryption side, many P2P applications integrate this technology: packet payloads are encrypted end to end, which prevents an inspector in the middle from seeing the actual content they carry and effectively defeats signature-based devices. Skype again serves as an example: its software includes an encryption layer that applies strong TLS/AES encryption to (almost) all packets it sends.
A further practical problem arises because these VoIP applications violate the traditional rule of "one connection per type of communication". As shown in Fig. 1, a VoIP application tends to carry everything over a single connection, unlike traditional applications, which assign their traffic (signaling, text messages, multimedia, and so on) to different socket connections. This creates a new problem for operators attempting service differentiation: services now have a finer granularity than connections, and one connection may contain several different types of service. Existing methods and devices cannot handle such services because they usually assume that a socket connection is the smallest unit to be processed.
This class of problem belongs to the broader research field known as P2P traffic identification. The prior art generally falls into two categories: signature-based methods and behavior-based methods. Their main drawbacks are as follows.
Signature-based methods inspect the packets of a connection (sometimes only the first few packets) for signatures, usually control keywords, that have been derived empirically. If the control keywords match, the connection is concluded to have been generated by an application with those signatures [1] [2]. Techniques such as regular expressions can help express complex signatures. The main drawback that makes these methods less effective, however, is their complete dependence on signatures. They must update their signature databases frequently to reflect software upgrades, which increases maintenance cost. In addition, they cannot identify end-to-end encrypted packets, and this case will become increasingly common as software vendors recognize the threat of being blocked. Finally, byte-by-byte inspection requires high-performance CPUs and large memories, which raises the cost and price of such products.
Behavior-based methods statistically analyze the packet interactions between the peers that form a P2P overlay network. In [3], K. Suh et al. propose detecting the presence of Skype relay nodes in a given network by observing the rising and falling edges of the bit rate of each connection at the network's ingress and egress points. X. Wang et al. [4] suggest marking a suspect connection by adding timing disturbances in order to trace its final destination. In terms of performance, because these methods must compare and correlate the statistics of every pair of connections, their complexity is O(n log n), and they scale poorly when the number of monitored connections is very large, which is typical in carrier-grade networks. Furthermore, these methods assume that the traffic statistics of both access links (the first hop and the last hop of a connection) can be delivered synchronously to the service identification device. This assumption is unlikely to hold in practice, because a VoIP call may span different networks, operators, or even countries. Even where it is possible, the operator of a single network would incur great cost to obtain the statistics of other operators.
Summary of the invention
Aiming at high stability and low complexity, the present invention identifies VoIP traffic effectively by exploiting its intrinsic behavioral characteristics. Because wideband adaptive VoIP codecs are considered promising for future voice applications, the invention improves identification accuracy by combining active identification with existing passive identification.
According to one aspect of the present invention, a method for identifying VoIP traffic in a network is provided. The method comprises the following steps: dividing the users corresponding to all data connections into potential VoIP application users and ordinary users according to user identity; monitoring the traffic of the potential VoIP application users and collecting statistics of interest; correlating the collected statistics with prior knowledge of the traffic distribution of VoIP applications and calculating a similarity SI_p; if the analysis reveals that the statistics of a connection correlate highly with that prior knowledge, applying a disturbance to the monitored connection of the potential VoIP application user and observing the feedback SI_a; combining the similarity SI_p and the feedback SI_a to calculate an identification result D; and, if D is greater than a threshold, identifying the monitored connection of the potential VoIP application user as a VoIP application connection.
A user who communicates with a well-known server may be regarded as a potential VoIP application user.
In some cases it is difficult to define "well-known server" precisely. All users can then be defined as potential VoIP application users, and all their connections are subject to traffic monitoring.
In addition, the method may use heuristics to analyze the interaction during user login and thereby achieve better performance. These heuristics include, but are not limited to, applying deep packet inspection (DPI) to the user's login request packets and analyzing the user's login connection pattern.
The statistics of interest fall into two classes: at the connection level they include duration and mean bit rate; at the packet level they include packet size and timestamp. The statistics of interest are selected according to the codec used by the specific VoIP application.
In the method of the invention, the disturbance is, for example, a packet drop rate. The user identity is the identifier that the network uses to distinguish users; for example, it is at least one of the PDP context of a WCDMA mobile communication network, the IP address of a fixed communication network, or a PPP session number.
According to another aspect of the invention, a system for identifying VoIP traffic in a network is provided. Its three key modules are a packet statistics collector, a detector and analyzer, and a disturbance generator. The packet statistics collector monitors the traffic of the potential VoIP application users separated out by the general classifier and collects statistics of interest. The detector and analyzer correlates the statistics output by the packet statistics collector with prior knowledge of the traffic distribution of VoIP applications to decide whether a connection contains VoIP traffic. The disturbance generator interacts with the detector and analyzer and generates statistical disturbances on the monitored connections of potential VoIP application users.
In addition to these three modules, the system includes an optional module that can improve performance: a VoIP attribute recognizer. It uses heuristics to analyze the interaction during user login. It accepts the connections that have passed the layer 3/layer 4 signaling filter and returns its result to the general classifier through an internal control message mechanism. The heuristics include, but are not limited to, applying deep packet inspection (DPI) to the user's login request packets and analyzing the user's login connection pattern.
The system also relies on two common modules: the general classifier and the layer 3/layer 4 signaling filter. The layer 3/layer 4 signaling filter is connected after the general classifier. It filters information related to VoIP and returns the filter result to the general classifier through the internal control message mechanism, so that the general classifier can divide the users corresponding to all data connections into potential VoIP application users and ordinary users according to user identity. As basic functions of the overall service subsystem, these two modules serve all the protocol-specific identification modules built on top of them.
In the detector and analyzer, the collected statistics are correlated with prior knowledge of the traffic distribution of VoIP applications, and a similarity SI_p is calculated. If the analysis reveals that the statistics of a connection correlate highly with that prior knowledge, a disturbance is applied to the monitored connection of the potential VoIP application user and the feedback SI_a is observed. The similarity SI_p and the feedback SI_a are combined to calculate an identification result D. If D is greater than a threshold, the monitored connection of the potential VoIP application user is identified as a VoIP application connection.
The system of the invention can be installed at a point in the network where the traffic to be identified converges.
The advantages of the invention cover the following five aspects:
Precision
By combining active disturbance with passive monitoring, the invention can identify VoIP traffic characterized by P2P and encryption, which no prior art can do. The known VoIP traffic distribution provides a baseline accuracy, which is strengthened by exploiting the adaptive behavior of variable bit rate (VBR) VoIP codecs.
Stability
The proposed method depends on the characteristics of VoIP codecs rather than on VoIP applications (and their implementations), which contributes greatly to its stability. Because VoIP codecs change less readily than VoIP applications, the method can identify a large number of VoIP applications without having to be tailored application by application, or even version by version.
Scalability
Collecting and analyzing packet-level statistics is more efficient than traditional byte-by-byte payload inspection and better suited to large-scale deployment. If offline analysis is required, the invention also saves memory or disk storage space.
Generality
If the traffic distribution of the corresponding application is known, the invention can be generalized to identify VoIP traffic produced by other applications in many types of network (for example fixed telephone networks).
Legal, privacy, logistical, and financial advantages
Because the method does not require payload inspection, it avoids the legal and privacy issues faced by the other methods used in many similar products. Because of its stability, it also reduces the operator's maintenance cost by eliminating frequent updates of signature or server (super node) databases. Finally, because the proposed method requires no special-purpose hardware, it can run on a general-purpose hardware platform, so the operator needs to keep fewer types and a smaller stock of spare parts in its warehouse for replacing failed hardware.
Description of drawings
The invention is described in more detail below with reference to the accompanying drawings, in which:
Fig. 1 shows the different services that a VoIP application may contain.
Fig. 2 shows a system structure for identifying Skype-like VoIP traffic.
Embodiments
For a better understanding of the invention, the terms used herein are explained as follows:
VoIP: Voice over IP, a technology for transmitting digitized voice over IP networks.
P2P: peer-to-peer, a distributed network architecture, usually implemented on top of an existing network.
SN: super node, a special peer in a P2P network that provides shared services such as indexing and relaying. A P2P network may contain many super nodes.
HTTP: Hypertext Transfer Protocol, a protocol commonly used to deliver web content.
Codec: coder/decoder, a computer program for converting digital media to and from a specialized format.
VBR: variable bit rate, a technique that can change the output bit rate of media encoding during operation.
DPI: deep packet inspection, a technique that classifies traffic according to signatures contained in the packets.
Note that the implementation details of VoIP applications differ from one type to another. A representative implementation is shown below for Skype, currently the most popular application. Fig. 2 is a block diagram for identifying Skype-like VoIP traffic. It consists of three passive identification modules (one of them optional), one active identification module, and two shared system modules:
Packet statistics collector 201 (passive module)
This module monitors the traffic of potential VoIP users and collects the statistics needed for later analysis. The statistics of interest have two levels: at the connection level they include duration and mean bit rate; at the packet level they include packet size and timestamp. These statistics are selected according to the codec used by the specific VoIP application.
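The two levels of statistics gathered by module 201 can be sketched as follows. This is a minimal illustration, not the implementation of the invention: the class name `ConnectionStats`, its method names, and the assumption that packets arrive in timestamp order are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import List, Tuple


@dataclass
class ConnectionStats:
    """Statistics for one monitored connection (hypothetical structure).

    Only timestamps and sizes are stored; the payload is never inspected.
    Packets are assumed to be added in timestamp order.
    """
    packets: List[Tuple[float, int]] = field(default_factory=list)  # (seconds, bytes)

    def add_packet(self, timestamp: float, size: int) -> None:
        self.packets.append((timestamp, size))

    @property
    def duration(self) -> float:
        """Connection level: seconds between the first and the last packet."""
        if len(self.packets) < 2:
            return 0.0
        return self.packets[-1][0] - self.packets[0][0]

    @property
    def mean_bit_rate(self) -> float:
        """Connection level: average bits per second over the duration."""
        if self.duration <= 0:
            return 0.0
        total_bits = 8 * sum(size for _, size in self.packets)
        return total_bits / self.duration

    def inter_arrival_times(self) -> List[float]:
        """Packet level: gaps between consecutive packet timestamps."""
        times = [t for t, _ in self.packets]
        return [b - a for a, b in zip(times, times[1:])]
```

A caller would create one `ConnectionStats` per monitored connection and call `add_packet` for every observed packet header.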
Detector and analyzer 202 (passive module)
Taking these statistics as input, this module decides whether a connection may contain VoIP traffic by correlating the input with prior knowledge of the traffic distribution of certain applications (for example Skype). Network queuing affects the delay that packets experience at network devices, particularly when the network is heavily loaded. To deal with this, the variation in packet inter-arrival times is treated as noise, and statistical signal processing techniques are used to identify the expected VoIP traffic.
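The text does not specify how module 202 measures the similarity SI_p. As one possible illustration only, the sketch below compares a normalized packet-size histogram of the observed connection with a reference histogram by histogram intersection. The choice of measure, the function names, and the bin width of 20 bytes are assumptions, not part of the described method.

```python
from typing import Dict, List


def size_histogram(sizes: List[int], bin_width: int = 20) -> Dict[int, float]:
    """Normalized histogram of packet sizes, keyed by the start of each bin."""
    if not sizes:
        return {}
    counts: Dict[int, int] = {}
    for size in sizes:
        bin_start = (size // bin_width) * bin_width
        counts[bin_start] = counts.get(bin_start, 0) + 1
    total = len(sizes)
    return {bin_start: count / total for bin_start, count in counts.items()}


def passive_similarity(observed: Dict[int, float],
                       reference: Dict[int, float]) -> float:
    """Histogram intersection in [0, 1]; 1 means the distributions coincide.

    This stands in for SI_p under the assumption stated in the lead-in.
    """
    return sum(min(p, reference.get(b, 0.0)) for b, p in observed.items())
```

Here `reference` plays the role of the prior knowledge of the traffic distribution; in practice it would be measured beforehand for the codec of interest.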
VoIP attribute recognizer 203 (passive module, optional)
Some clues about certain applications (for example Skype) have been reported in the literature. Where such clues are available, they are used as heuristics to help distinguish VoIP applications. By analyzing the interaction during user login with these heuristics, the method achieves higher accuracy. Possible heuristics include applying deep packet inspection (DPI) to the user's login request packets and analyzing the user's login connection pattern. However, these clues are highly sensitive to changes in the software implementation and cannot identify VoIP traffic on their own, because a VoIP application may open a new connection after login to carry the actual voice data, and that new connection often shows none of the clues. This module is optional. It further processes the filter result of the layer 3/layer 4 signaling filter 206 and submits its result to the general classifier 205 through the internal control message mechanism. If the module is absent, the result of the layer 3/layer 4 signaling filter is supplied directly to the general classifier to define the potential users.
Disturbance generator 204 (active module)
By interacting with the detector and analyzer, this module applies a statistical disturbance, such as a packet drop rate, to the target connection. The speech codec of the target connection perceives the disturbance as a change in network conditions and triggers its adaptation mechanism. How the target connection reacts to the disturbance is then recorded and analyzed by the detector and analyzer.
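A packet drop rate disturbance, and one conceivable way to turn the observed reaction into the feedback SI_a, might look like the sketch below. The text does not define how SI_a is computed; the relative change in mean bit rate used here is an assumption, as are the function names.

```python
import random
from typing import Iterable, List, Optional, Tuple

Packet = Tuple[float, int]  # (timestamp in seconds, size in bytes)


def drop_packets(packets: Iterable[Packet], drop_rate: float,
                 rng: Optional[random.Random] = None) -> List[Packet]:
    """Forward each packet with probability 1 - drop_rate, drop it otherwise."""
    if not 0.0 <= drop_rate <= 1.0:
        raise ValueError("drop_rate must lie in [0, 1]")
    rng = rng or random.Random()
    return [p for p in packets if rng.random() >= drop_rate]


def active_feedback(rate_before: float, rate_after: float) -> float:
    """Assumed SI_a in [0, 1]: relative bit-rate change after the disturbance.

    An adaptive codec is expected to change its sending rate when it sees
    loss; a connection that does not react yields a value near 0.
    """
    if rate_before <= 0:
        return 0.0
    change = abs(rate_before - rate_after) / rate_before
    return min(1.0, change)
```

The bit rates passed to `active_feedback` would be measured on the sender's traffic before and after the drop rate is applied.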
General classifier 205 (passive module, common system function)
Many systems and platforms already have such a module for traffic classification. The invention uses this general classifier to divide all data connections into two classes according to user identity: potential VoIP users and ordinary users. The user identity is the identifier that the network uses to distinguish users, for example the PDP context of a WCDMA mobile communication network, the IP address of a fixed communication network, or a PPP session number. Connections belonging to potential VoIP users are passed to the packet statistics collector for further processing.
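The split performed by module 205 can be illustrated with a short sketch. The tuple layout `(user_identity, connection_id)` and the function name are hypothetical; the user identity could be any of the identifiers named above.

```python
from typing import Dict, Iterable, List, Set, Tuple

Connection = Tuple[str, str]  # (user identity, connection id), hypothetical layout


def split_connections(
    connections: Iterable[Connection],
    potential_users: Set[str],
) -> Tuple[Dict[str, List[str]], Dict[str, List[str]]]:
    """Group connections by user and split them into monitored and ordinary.

    Connections of users in potential_users would be handed to the packet
    statistics stage; all others are left alone.
    """
    monitored: Dict[str, List[str]] = {}
    ordinary: Dict[str, List[str]] = {}
    for user, conn_id in connections:
        target = monitored if user in potential_users else ordinary
        target.setdefault(user, []).append(conn_id)
    return monitored, ordinary
```

If "well-known server" cannot be defined precisely, passing the set of all users as `potential_users` reproduces the fallback in which every connection is monitored.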
Layer 3/layer 4 signaling filter 206 (passive module, common system function)
As mentioned above, when a client logs in to its service network, Skype and most SIP-based applications need to exchange information with one of the well-known servers in that service network. If this is observed, the module marks the user as a potential VoIP user. It then returns its filter result to the general classifier through the internal control message mechanism shown by the dashed line in Fig. 2. Since potential VoIP users are relatively few, this step confines the monitoring task to only a small fraction of the traffic.
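The marking step of module 206 only needs layer 3 and layer 4 header fields. The sketch below assumes a hypothetical header tuple and a hypothetical server list; the address 192.0.2.10 comes from the range reserved for documentation, and 5060 is the default SIP port.

```python
from typing import Iterable, Set, Tuple

Header = Tuple[str, str, int]   # (user identity, destination IP, destination port)
Server = Tuple[str, int]        # (IP, port) of a well-known server


def find_potential_users(headers: Iterable[Header],
                         known_servers: Set[Server]) -> Set[str]:
    """Return the users seen contacting any well-known server."""
    return {
        user
        for user, dst_ip, dst_port in headers
        if (dst_ip, dst_port) in known_servers
    }


# Hypothetical example data, not taken from any real service.
KNOWN_SERVERS = {("192.0.2.10", 5060)}
```

The returned set is what the filter would report back through the internal control message mechanism.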
According to the layer 3/layer 4 filtering result, all network users (the input on the left of Fig. 2) are divided into potential VoIP users and ordinary users. Every connection of a potential VoIP user is monitored and its statistics are collected, without regard to the actual content carried in the packet payload. After the statistics are analyzed, the results are correlated with the known traffic distribution of the corresponding application, and a similarity SI_p is calculated. Once the analysis reveals a high correlation between the two, a disturbance is added to the connection and its feedback, defined as SI_a, is observed. Finally, the two quantities are combined to calculate the identification result D (see equation (1)). For example, SI_a and SI_p are summed with weight δ, and if the identification result D is greater than a threshold η_0, the connection is identified as a VoIP connection (see equation (2)).
D = D(SI_p, SI_a)    (1)
In equation (2), the weight δ and the threshold η_0 are determined according to the specific VoIP application, and the value of δ lies between 0 and 1.
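Equation (2) itself does not appear in the text above. One form that is consistent with the description (a sum weighted by δ, with δ between 0 and 1, compared against η_0) is the convex combination D = δ·SI_p + (1 − δ)·SI_a. The sketch below uses that form purely as an assumption; which of the two quantities carries δ is not stated in the text, and the function names are hypothetical.

```python
def recognition_result(si_p: float, si_a: float, delta: float) -> float:
    """Assumed form of D: delta * SI_p + (1 - delta) * SI_a."""
    if not 0.0 <= delta <= 1.0:
        raise ValueError("delta must lie in [0, 1]")
    return delta * si_p + (1.0 - delta) * si_a


def is_voip(si_p: float, si_a: float, delta: float, eta_0: float) -> bool:
    """Identify the connection as VoIP when D exceeds the threshold eta_0."""
    return recognition_result(si_p, si_a, delta) > eta_0
```

With δ = 1 the decision rests on passive similarity alone, and with δ = 0 on the active feedback alone; both δ and η_0 would be tuned per VoIP application as the text states.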
The system of the invention is implemented as software modules and can be installed at a point in the network where the traffic to be identified converges. Possible installation points are, for example, the Internet border of the operator's core network (in particular the equipment performing service differentiation and control functions) or a general gateway. Existing equipment platforms, such as Cisco CSG/SSG, Nokia ISN, and Siemens IPS, can provide sufficient hardware and software platform support for implementing the system.
Although the invention has been described above with reference to the examples in the accompanying drawings, it is obviously not limited to them and can be modified in many ways within the scope disclosed by the appended claims.
References
[1] S. Ehlert, S. Petgang. Analysis and signature of Skype VoIP session traffic. Technical Report, July 2006.
[2] ipp2p, http://www.ipp2p.org/. l7-filter, http://l7-filter.sourceforge.net/.
[3] K. Suh et al. Characterizing and detecting Skype-relayed traffic. IEEE Infocom '06, April 2006.
[4] X. Wang et al. Tracking anonymous peer-to-peer VoIP calls on the Internet. ACM CCS '05, November 2005.