A method for detecting probe failures in an Internet performance measurement system
Technical field
The present invention relates to the field of Internet technology, and in particular to a method for detecting probe failures in an Internet performance measurement system.
Background
The Internet is a key part of today's information network infrastructure, yet its end-to-end performance has always been a major problem for network administrators.
With the rapid development of Internet technology and network services, user demand for network resources has grown at an unprecedented rate, and networks have become increasingly complex. The growing number of users and applications places a heavy load on the network and overloads network equipment, which degrades network performance. Improving performance therefore requires extracting and analyzing network performance indicators, and this need gave rise to network performance measurement. The main purposes of network performance measurement are to find network bottlenecks, optimize network configuration, uncover potential hazards in the network, manage network performance more effectively, verify and control the quality of the network services provided, and quantify, compare, and verify service providers' quality-of-service metrics.
The Internet is a packet-switched network based on TCP/IP, in which data packets are addressed and forwarded hop by hop at the network layer. Each hop's node is responsible only for forwarding at that node, the nodes are independent of one another, and current network management systems take the individual node as the managed object, so it is difficult for network administrators to obtain an overall picture of network performance. Against this background, a network measurement system is needed that takes the position of an Internet user, treats the network as a black box, and measures network performance actively.
Many projects around the world study active network measurement, such as IEPM, NIMI, NLANR AMP, and Surveyor. Among them, the TWAMP (Two-Way Active Measurement Protocol) protocol (RFC 5357) developed by the IETF is one of the more influential approaches.
The TWAMP protocol is based on end-to-end measurement: the measuring entities are all hosts, and network devices do not take part in the measurement. TWAMP comprises two separate protocols:
● TWAMP-Control: used to set up measurement sessions, negotiate session parameters (such as packet length, start time, timeout, and the packet-sending distribution), start and stop measurement sessions, and retrieve measurement results (uses TCP);
● TWAMP-Test: specifies the format of measurement packets and is used to exchange measurement packets between measurement nodes (uses UDP).
To improve openness, TWAMP separates the control protocol from the measurement protocol. The control protocol of an actual TWAMP system does not have to be TWAMP-Control, but the underlying measurement protocol must be TWAMP-Test. This guarantees interoperability of the measurement process, since measurement nodes using different control protocols can all take part in measurement, and it also reflects the openness of the measurement.
The TWAMP protocol comprises five functional entities:
● Session-Sender: the measurement node that sends measurement packets in a TWAMP-Test session;
● Session-Receiver: the measurement node that receives measurement packets in a TWAMP-Test session;
● Server: a server that manages one or more TWAMP-Test sessions, can configure each TWAMP-Test session on each measurement node, and can return the measurement results of each TWAMP-Test session;
● Control-Client: a host that initiates requests to set up TWAMP-Test sessions and controls the start and stop of sessions;
● Fetch-Client: a host that initiates requests to retrieve the measurement results of TWAMP-Test sessions.
The relationship between the five functional entities is shown in Figure 1.
The TWAMP protocol first assumes that the nodes taking part in a measurement (Session-Sender and Session-Receiver) are under the control of different controllers: Session-Sender is controlled by Control-Client, and Session-Receiver is controlled by Server. The protocols between Session-Sender and Control-Client, and between Session-Receiver and Server, can therefore be control protocols defined by each controller, whereas the public TWAMP-Control protocol can be used between Control-Client and Server and between Fetch-Client and Server. This makes it possible to measure network performance between hosts belonging to different controllers and to obtain the data through one open interface. Some research institutions, such as the University of Aveiro, have implemented TWAMP protocol systems; in their systems, the protocols left unspecified in the figure also use the TWAMP-Control protocol.
Because the TWAMP protocol is based on end-to-end measurement and uses ordinary UDP packets, the measurement process is hard to detect and monitor and can reflect the conditions that real user traffic experiences. Security was also considered in the design: the protocol includes authentication and encryption mechanisms between Client and Server and between Sender and Receiver. In addition, TWAMP supports measurement with small packets: the minimum packet is 42 bytes unencrypted and 60 bytes encrypted. The TWAMP protocol also has some shortcomings. First, the measurement results reflect only the performance between hosts at the network edge, which does not help with locating faults inside the network. Second, the protocol is very open, which makes it more adaptable but also introduces security problems such as man-in-the-middle attacks. In summary, the TWAMP protocol is well suited to network performance measurement carried out by users.
TWAMP itself is only a communication protocol that defines the measurement types and measurement control parameters exchanged between probes and between a probe and the server. If the TWAMP protocol is to be used in an actual measurement system, the reliability and availability of the whole system must be considered: the system needs to judge in real time whether each probe is in a normal state and, if there is a fault, whether the fault lies on the probe node or in the network, so that the test results can be handled correctly.
At present, a web search finds no organization or individual that has proposed a similar approach for server-side discovery of probe faults in an Internet performance measurement system that supports the TWAMP protocol.
Summary of the invention
The object of the present invention is to establish, on top of a TWAMP protocol system, a method of fault discovery between the Control Server and the probes (Session-Sender and Session-Receiver), so that the whole system can judge the working state of each probe.
To achieve the above object, the present invention adopts the following technical solution:
A method for detecting probe failures in an Internet performance measurement system, where the Internet performance measurement system is based on the TWAMP protocol. After a probe registers successfully with the server, the server sets up a first memory table and a second memory table. The first memory table stores the real-time information sent by probes and the time at which that information was received; the second memory table stores the information records of registered probes. The method further comprises:
Step 10: set the missed-connection count to 0 and empty the second memory table;
Step 20: the server receives the real-time information sent by a probe and saves this information, together with the time at which it was received, in the first memory table; it saves the information records of registered probes, including each probe's name, IP address, and gateway address, in the second memory table;
Step 30: read one information record from the second memory table;
Step 40: match the information record against the real-time information in the first memory table. If it matches, real-time information from the probe was received within the timeout period; go to step 50. If it does not match, no real-time information from the probe was received within the timeout period; go to step 70;
Step 50: set the probe's active state to "normal";
Step 60: set the probe's missed-connection count to 0, delete the data in the first memory table, and go to step 30;
Step 70: set the probe's active state to "fault";
Step 80: increase the probe's missed-connection count by 1 and delete the data in the first memory table;
Step 90: judge whether the missed-connection count is greater than 3. If not, go to step 30; if so, judge that the probe under test has failed.
Further, the server receiving the real-time information sent by a probe in step 20, and saving this information together with the time at which it was received in the first memory table, specifically comprises:
Step 21: read the server's IP address and the keepalive timing interval from memory;
Step 22: configure a scheduler to send a keepalive packet to the server once every keepalive timing interval;
Step 23: receive the keepalive packet, and save the information in the keepalive packet and the time at which it was received in the first memory table.
Further, after step 22, the method also comprises:
Step 221: create the UDP port used for keepalive reporting;
Step 222: encapsulate the keepalive packet, filling in the corresponding information according to the defined keepalive message format;
Step 223: send the keepalive packet to the server over UDP by calling the send function;
Step 224: close the UDP port.
Further, the message fields of the keepalive packet are set to message length, message type, and probe name, respectively.
Further, after judging that the probe under test has failed, the method also comprises:
Step 110: read the IP address and gateway IP address of the probe to be checked from the second memory table;
Step 120: run a ping test against the IP address read in step 110 and determine whether the ping succeeds. If so, go to step 130; if not, go to step 150;
Step 130: judge the probe's fault type to be "software fault";
Step 140: set the probe's missed-connection count to 0, delete the data in the first memory table, and go to step 180;
Step 150: run a ping test against the probe's gateway IP address and determine whether the ping succeeds. If so, go to step 160; if not, go to step 170;
Step 160: judge the probe's fault type to be "host fault", and go to step 180;
Step 170: judge the probe's fault type to be "network failure", and go to step 180;
Step 180: write the probe's fault type to the log.
Further, step 110 specifically comprises:
Step 111: read the IP address of the probe to be checked from the second memory table;
Step 112: judge whether the second memory table contains a gateway IP address corresponding to the IP address from step 111. If so, go to step 114; if not, go to step 113;
Step 113: read the probe information corresponding to the IP address from step 111 from the probe basic information table in the database, add it to the second memory table, and go to step 111;
Step 114: read the gateway IP address of the probe to be checked from the second memory table.
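Steps 111 to 114 amount to a cache lookup with a database fallback. The following Python sketch illustrates that idea only; the function name, the dictionary layout of the second memory table, keying by probe name, and the `load_from_db` callable are all hypothetical and are not specified by the text.

```python
def lookup_probe_addresses(name, table2, load_from_db):
    """Return (probe_ip, gateway_ip) for a probe, refilling table2 if needed.

    table2 is assumed to map probe name -> {"ip": ..., "gateway_ip": ...}.
    load_from_db is a hypothetical callable that reads the probe basic
    information table and returns a record of the same shape.
    """
    record = table2.get(name)
    if record is None or not record.get("gateway_ip"):
        # Step 113: the gateway address is missing, so reload from the database
        # and add the record to the second memory table.
        record = load_from_db(name)
        table2[name] = record
    # Steps 111 and 114: read both addresses from the (now complete) record.
    return record["ip"], record["gateway_ip"]
```

Passing the database reader in as a parameter keeps the sketch independent of any particular database and makes the fallback path easy to exercise.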
The present invention adds state maintenance and fault discovery to the probes and server that implement the TWAMP protocol. After a probe registers with the server, it periodically reports Keepalive information to the server. When the Keepalive times out, the server initiates a check of the probe's state and judges whether the probe is in a fault state.
Description of drawings
Fig. 1 is a schematic diagram of the relationships between TWAMP protocol functional entities in the prior art;
Fig. 2 is a schematic diagram of the probe-side registration flow of the present invention;
Fig. 3 is a schematic diagram of the probe registration request message format of the present invention;
Fig. 4 is a schematic diagram of the probe-side Keepalive processing flow of the present invention;
Fig. 5 is a schematic diagram of the Keepalive message format of the present invention;
Fig. 6 is a schematic diagram of the listening process and memory tables created by the present invention;
Fig. 7 is a schematic flow diagram of probe status judgment in the present invention;
Fig. 8 is a schematic flow diagram of probe fault detection in the present invention.
Embodiment
To make the objects, technical solutions, and advantages of the present invention clearer, the invention is described in further detail below with reference to the drawings and embodiments. The specific embodiments described here only serve to explain the invention and are not intended to limit it. Probe fault discovery in the present invention comprises processing flows on two sides: the probe and the server.
On the probe side, an administrator first writes the registration information to a local file on the probe. After the probe starts, it automatically runs the registration flow and sends the information needed for fault discovery to the server. As shown in Figure 2, the processing flow comprises the following steps:
Step 10: create the TCP socket used for registration.
Step 20: send a connection request to the server through this TCP socket.
Step 30: after the connection request is accepted, generate the registration message from the registration information file and send the registration request to the server.
Step 40: receive and read the registration reply message returned by the server.
Step 50: call the common module to close the connection.
Step 60: take the Accept field out of the registration reply message read in step 40; this field indicates whether registration succeeded. Return the registration result according to the Accept value.
As shown in Figure 3, the probe registration request message has the following fields:
● Message length: 4 bytes.
● Message type: 2 bytes; 1 denotes the probe registration command, and 2 denotes the periodic online (keepalive) command.
● Probe capability: 4 bytes (32 bits).
● Probe name: 32 bytes; the name the user configured for the probe.
● Password: 32 bytes; the registration password the user configured for the probe.
● Probe IP address: 256 bytes; the IP address of the probe.
● Probe gateway IP address: 256 bytes; the gateway address of the probe.
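The field list above fixes the size of each field, so the request can be laid out with a fixed-width binary format. The sketch below is one possible encoding in Python. The field sizes come from the list; the network (big-endian) byte order, NUL padding of text fields, the text encodings, the length field counting the whole message, and all names are assumptions that the text does not specify.

```python
import struct

# 4-byte length, 2-byte type, 4-byte capability, then four fixed-width
# text fields of 32, 32, 256, and 256 bytes. "!" (network byte order,
# no alignment padding) is an assumption.
REGISTER_FORMAT = "!IHI32s32s256s256s"
MSG_TYPE_REGISTER = 1  # 1 = probe registration command


def pack_register_request(capability, name, password, probe_ip, gateway_ip):
    """Build a registration request; struct pads short strings with NULs."""
    # Assumed: the length field holds the size of the whole message.
    total_length = struct.calcsize(REGISTER_FORMAT)
    return struct.pack(
        REGISTER_FORMAT,
        total_length,
        MSG_TYPE_REGISTER,
        capability,
        name.encode("utf-8"),
        password.encode("utf-8"),
        probe_ip.encode("ascii"),
        gateway_ip.encode("ascii"),
    )
```

With these sizes the message is always 586 bytes long (4 + 2 + 4 + 32 + 32 + 256 + 256), which lets the receiver read it with a single fixed-size read.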
After the probe registers successfully, it starts the Keepalive thread, which periodically refreshes the probe's status on the server. As shown in Figure 4, the processing flow of the probe-side Keepalive thread is as follows:
Step 10: read the keepalive-related information from memory, including the IP address of the Control Server and the keepalive timing interval.
Step 20: set the timer period to the configured interval so that the Keepalive sending code is called at that fixed interval to report that the probe is online. Each time the timer fires, a thread is started automatically to run steps 21 to 24.
Step 21: create the UDP socket used for Keepalive reporting.
Step 22: encapsulate the Keepalive packet, filling in the corresponding information according to the defined Keepalive message format.
Step 23: send the Keepalive packet: call the send function to send the datagram to the Control Server through the UDP port.
Step 24: close the Keepalive UDP socket.
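The periodic send in steps 20 to 24 can be sketched in Python as follows. This is an illustration under assumptions, not the described implementation: it uses one long-lived background thread that sleeps between sends instead of starting a new thread on each timer tick, it sends the first packet immediately, and the function names and the `stop_event` parameter are hypothetical. The payload is taken as already-encoded bytes.

```python
import socket
import threading


def send_keepalive(server_addr, payload):
    """Steps 21 to 24: create a UDP socket, send one datagram, close the socket."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(payload, server_addr)


def start_keepalive(server_addr, payload, interval, stop_event):
    """Step 20: send once per interval on a background thread until stopped."""
    def loop():
        while not stop_event.is_set():
            send_keepalive(server_addr, payload)
            # wait() returns early when stop_event is set, so shutdown is prompt.
            stop_event.wait(interval)

    thread = threading.Thread(target=loop, daemon=True)
    thread.start()
    return thread
```

Opening and closing the socket on every send mirrors steps 21 and 24; a probe that reports very frequently could keep one socket open instead.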
The probe side sends keepalive packets over a UDP port. As shown in Figure 5, the fields of the Keepalive message are as follows:
● Message length: 4 bytes.
● Message type: 2 bytes; 2 denotes the periodic online (keepalive) command, and 1 denotes the probe registration command.
● Probe name: 32 bytes; the name the user configured for the probe.
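A Keepalive message with these three fields can be encoded and decoded as in the Python sketch below. The field sizes and the type value 2 come from the list above; the network byte order, NUL padding, UTF-8 encoding, the length field counting the whole message, and the function names are assumptions.

```python
import struct

# 4-byte length, 2-byte type, 32-byte probe name; byte order is assumed.
KEEPALIVE_FORMAT = "!IH32s"
MSG_TYPE_KEEPALIVE = 2  # 2 = periodic online (keepalive) command


def pack_keepalive(probe_name):
    """Encode a Keepalive message for the given probe name."""
    total_length = struct.calcsize(KEEPALIVE_FORMAT)  # assumed to cover the whole message
    return struct.pack(
        KEEPALIVE_FORMAT, total_length, MSG_TYPE_KEEPALIVE, probe_name.encode("utf-8")
    )


def unpack_keepalive(data):
    """Decode a Keepalive message and return the probe name."""
    length, msg_type, raw_name = struct.unpack(KEEPALIVE_FORMAT, data)
    if msg_type != MSG_TYPE_KEEPALIVE:
        raise ValueError("not a keepalive message")
    # Strip the NUL padding added by struct when the name is shorter than 32 bytes.
    return raw_name.rstrip(b"\x00").decode("utf-8")
```

The server side would call `unpack_keepalive` on each received datagram and record the returned name together with the arrival time.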
As shown in Figure 6, the processing flow of the server comprises the following steps:
Steps 110-120: the server accepts the probe registration request and starts the probe information processing thread;
Steps 121-123: the server matches the received probe registration information against the information in the database;
Step 124: save the key information in the probe registration information, such as the IP address and gateway address, to the corresponding location in the database;
Steps 125-126: send the registration reply message to the probe and close the connection.
When the server starts the UDP listening port, it also initializes the probe scanning process. The process reads the probe status scan interval from the Control Server parameter file, scans once per interval, and makes the corresponding probe status judgment.
The server sets up a dedicated memory table 1: when keepalive information is received from a probe, the probe name and the time at which the Control Server received the keepalive information are saved in this table on the Control Server. The server also sets up memory table 2, a hash data structure keyed as "probe name - IP address". As shown in Figure 7, the probe status judgment flow is as follows:
● Step 10: start the judgment, set the probe's missed-connection count to 0, and empty memory table 2;
● Step 20: build memory table 2 from the registered probe information; it stores each probe's name, IP address, and gateway address;
● Step 30: take one probe record from memory table 2;
● Step 40: match this record against the probe names in memory table 1. If it matches, keepalive information from the probe was received within the timeout period: go to step 50, set the probe status to "normal", return to step 30, and take the next probe record. If it does not match, no keepalive information from the probe was received within the timeout period: go to step 70;
● Step 70: set the probe status to "fault";
● Step 80: increase this probe's missed-connection count by 1 and delete this probe's entry from memory table 1;
● Step 90: judge whether the missed-connection count is greater than 3. If it is not, wait for the keepalive timeout period and then return to step 40; if it is, start the probe fault judgment thread.
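One pass of the status judgment above can be sketched in Python as a function over the two memory tables. The sketch assumes that one pass runs per keepalive timeout period and that each table is a plain dictionary; the function name, the dictionary layouts, and the choice to return the probes that need diagnosis (rather than starting a thread) are hypothetical simplifications.

```python
MAX_MISSES = 3  # fault diagnosis starts once the count is greater than 3


def scan_probes(registered, heard, misses, status):
    """Run one scan pass and return the probes that need fault diagnosis.

    registered: memory table 2, probe name -> (ip, gateway_ip)
    heard:      memory table 1, probe name -> time the last keepalive arrived
    misses:     probe name -> consecutive passes without a keepalive
    status:     probe name -> "normal" or "fault"
    """
    to_diagnose = []
    for name in registered:
        if name in heard:
            # Steps 50 and 60: a keepalive arrived within the timeout period.
            status[name] = "normal"
            misses[name] = 0
        else:
            # Steps 70 to 90: no keepalive; count the miss and check the limit.
            status[name] = "fault"
            misses[name] = misses.get(name, 0) + 1
            if misses[name] > MAX_MISSES:
                to_diagnose.append(name)
        # Clear the entry so the next pass requires a fresh keepalive.
        heard.pop(name, None)
    return to_diagnose
```

Because the limit is checked with "greater than 3", a probe is handed to fault diagnosis on its fourth consecutive missed pass.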
The probe fault judgment flow determines the type of probe fault, which is one of three cases: probe software fault, probe host fault, or probe network fault. As shown in Figure 8, the judgment flow is as follows:
● Step 110: read the IP address of the probe to be checked from memory table 2;
● Step 120: judge whether table 2 contains the probe's IP address and gateway IP address; if not, read the corresponding information from the database again;
● Step 140: read the probe's IP address from memory table 2;
● Step 150: run a ping test against the probe's IP address. If the ping succeeds, the probe host itself and the network are both fault-free, so the server's failure to receive keepalive messages must be caused by a fault in the probe software; remove the information related to this probe from memory table 1 and write the probe's "software fault" to the log. If the ping fails, go to step 180 for further fault judgment;
● Step 180: ping the probe's gateway IP address. If the ping succeeds, the network is fault-free and the probe host has failed; record this in the log. If the ping fails, the network connecting the probe has failed; record "network failure" in the log.
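The two-stage ping decision above reduces to a short function. In the Python sketch below, the `ping` helper shells out to the system `ping` command and is only one assumed way to run the test; the function names, the timeout value, and the returned label strings are illustrative. The reachability check is passed in as a parameter so that the decision logic can be tried without network access.

```python
import platform
import subprocess


def ping(host, timeout_s=2):
    """Return True if one echo request to host succeeds (uses the system ping)."""
    count_flag = "-n" if platform.system() == "Windows" else "-c"
    try:
        result = subprocess.run(
            ["ping", count_flag, "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        return False
    return result.returncode == 0


def classify_fault(probe_ip, gateway_ip, ping_fn=ping):
    """Classify a silent probe using the probe-then-gateway ping order."""
    if ping_fn(probe_ip):
        return "software fault"   # host and network reachable, so the probe software failed
    if ping_fn(gateway_ip):
        return "host fault"       # network reaches the gateway, but not the probe host
    return "network failure"      # not even the gateway is reachable
```

For example, `classify_fault("192.0.2.10", "192.0.2.1", lambda host: host == "192.0.2.1")` yields "host fault", because only the gateway answers.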
With the above steps, every possible form of probe fault can be judged, so that the network performance measurement system can accurately reflect the state of the network.
The above is only a preferred embodiment of the present invention and is not intended to limit its scope. Any modification or equivalent replacement that does not depart from the spirit and scope of the invention shall fall within the scope of protection of the claims.