Detailed Description
The technical solutions in the embodiments of the present invention will be clearly and completely described below with reference to the drawings in the embodiments of the present invention, and it is obvious that the described embodiments are only a part of the embodiments of the present invention, and not all of the embodiments. All other embodiments, which can be derived by a person skilled in the art from the embodiments given herein without making any creative effort, shall fall within the protection scope of the present invention.
The embodiment of the invention provides a method for determining webpage time delay, which is applied to a device for determining the webpage time delay. The determining device of the web page delay may specifically be a network device such as a terminal or a server, which has data analysis and processing capabilities. As shown in fig. 1, the method includes:
Step 101, acquiring target traffic data from all traffic data stored in the unidirectional DPI device according to the webpage identifier and the terminal IP address.
The target flow data at least comprises the access time of a specified terminal corresponding to the terminal IP address to access a specified webpage corresponding to the webpage identification, and the response time of the specified terminal in the current access to respond to the specified webpage.
The unidirectional DPI equipment adopts a DPI technology, and can identify the application layer data of the data packet, so that the network flow is analyzed in detail. Because the DPI device capable of simultaneously acquiring the uplink traffic and the downlink traffic is expensive at present, and the deployment number of the DPI device is in direct proportion to the data volume of the network traffic to be analyzed, in order to save the investment cost, a unidirectional DPI device is generally used to detect the uplink traffic of the network.
As shown in fig. 2, the unidirectional DPI device is located between the designated terminal and the server, and is configured to detect the upstream traffic sent by the terminal to the server. In the embodiment of the present invention, the traffic data sent when the terminal accesses the web page is based on the Transmission Control Protocol (TCP). TCP uses a three-way handshake to establish a connection between a terminal and a server. During the handshake, the terminal first sends a first synchronization (SYN) message to the server; the first SYN message passes through the unidirectional DPI device on its way to the server, and the terminal waits for the server to confirm it. After receiving the first SYN message, the server confirms it and sends a second SYN message and a first acknowledgement (ACK) message to the terminal. After receiving the second SYN message and the first ACK message, the terminal sends a second ACK message to the server. After the server receives the second ACK message, the three-way handshake is complete, the connection is established, and the terminal can access the webpage content on the server.
It should be noted that, when the unidirectional DPI device receives the first SYN packet, the receiving time of the first SYN packet is recorded, and the receiving time is determined as the access time; and when the unidirectional DPI equipment receives the second ACK message, recording the receiving time of the second ACK message, and determining the receiving time as response time.
If the unidirectional DPI device monitors only the first SYN message and not the second ACK message, possible causes are that the server did not respond to the access request of the user because it was busy or for a similar reason, that the first SYN message did not reach the server because of poor network quality between the unidirectional DPI device and the server, or that the second SYN message and the first ACK message sent by the server did not reach the terminal. In such cases, the situation can be improved by optimizing the network or by having the terminal initiate the access request to the server again.
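As an illustrative, non-limiting sketch, the recording of the access time and the response time from uplink-only traffic might look as follows. The record type `UplinkSegment`, the flow key, and the function `track_handshakes` are hypothetical names introduced for illustration; the sketch assumes that the first pure ACK seen on a flow after its SYN is the second ACK message of the handshake.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional, Tuple

# (src_ip, src_port, dst_ip, dst_port); hypothetical flow identifier
FlowKey = Tuple[str, int, str, int]


@dataclass
class UplinkSegment:
    """One uplink TCP segment as seen by the DPI device (hypothetical record)."""
    time: float   # capture time in seconds
    flow: FlowKey
    syn: bool     # SYN flag set
    ack: bool     # ACK flag set


@dataclass
class Handshake:
    access_time: float                     # arrival of the first SYN message
    response_time: Optional[float] = None  # arrival of the second ACK message


def track_handshakes(segments: List[UplinkSegment]) -> Dict[FlowKey, Handshake]:
    flows: Dict[FlowKey, Handshake] = {}
    for seg in sorted(segments, key=lambda s: s.time):
        if seg.syn and not seg.ack:
            # First SYN of the flow: record the access time (retransmissions ignored).
            flows.setdefault(seg.flow, Handshake(access_time=seg.time))
        elif seg.ack and not seg.syn:
            hs = flows.get(seg.flow)
            if hs is not None and hs.response_time is None:
                # Assumed to be the second ACK message: record the response time.
                hs.response_time = seg.time
    return flows
```

A flow whose `response_time` remains `None` corresponds to the case in which only the first SYN message was monitored.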
Step 102, determining the difference value between the response time and the access time as the single round-trip delay.
In the embodiment of the present invention, $t_1$ denotes the access time and $t_2$ denotes the response time, and the single Round-Trip Time (RTT) is expressed as:
$$RTT = t_2 - t_1$$
Step 103, determining all single round-trip time delays when the specified terminal accesses the specified webpage within the specified time.
Considering that a user accesses web page elements, such as pictures, script files and the like, in a web page in a process of accessing the web page, when the user accesses a web page containing a plurality of web page elements, the web page elements need to be downloaded from a corresponding server, the web page delay includes the downloading time of the web page elements, and the time consumed for downloading different web page elements from different servers is different. The designated time can be set manually, and the average access time of the same webpage accessed by the user in the historical data can be referred during setting.
In the embodiment of the invention, the difference between the response time and the access time of one access request initiated by the terminal is used as the single round-trip delay, and the target of the access request may be either the webpage itself or one of its webpage elements.
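A minimal sketch of steps 102 and 103 is given below, assuming each access is available as a pair of access time and response time in seconds. The function name `single_rtts` and the window parameters are hypothetical, and selecting accesses by whether the access time falls within the specified time is an assumption of this sketch.

```python
from typing import List, Tuple


def single_rtts(
    accesses: List[Tuple[float, float]],
    window_start: float,
    window_end: float,
) -> List[float]:
    """Return RTT = t2 - t1 for every access whose access time t1 lies in the window."""
    return [
        t2 - t1
        for t1, t2 in accesses
        if window_start <= t1 <= window_end
    ]
```

For example, with accesses `(10.0, 10.25)` and `(12.0, 12.5)` inside the window, the single round-trip delays are 0.25 s and 0.5 s.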
Step 104, determining the specified webpage time delay of the specified terminal for accessing the specified webpage according to all the single round-trip time delays.
The embodiment of the invention obtains the flow data of the webpage accessed by the terminal from the unidirectional DPI equipment, thereby determining the webpage time delay when the appointed terminal accesses the appointed webpage according to the flow data, and because the unidirectional DPI equipment does not need to be deployed at the user side, and the flow data when all users access the webpage are transmitted to the server after passing through the unidirectional DPI equipment, the calculated webpage time delay can reflect the service perception when all users access the webpage without the cooperation of the users; and when the webpage time delay of a certain user at a certain moment needs to be acquired, the flow data of the user at the moment is only needed to be screened from the flow data to calculate the webpage time delay, so that the calculated webpage time delay truly reflects the real-time service perception of the user.
In order to facilitate the screening of the target flow data from all the flow data, in one implementation manner of the embodiment of the present invention, a screening basis needs to be determined. Therefore, on the basis of the implementation shown in fig. 1, the implementation shown in fig. 3 can also be realized. Step 101 is to obtain target traffic data in all traffic data stored in the unidirectional DPI device according to the terminal IP address and the web page identifier, and may perform steps 1011 to 1015:
Step 1011, obtaining the Uniform Resource Identifier (URI) of the specified web page, the web page elements of the specified web page, and the URIs of the web page elements.
It should be noted that the web page elements at least include pictures, sounds, videos, style sheet files, script files, and the like in the web page.
For a webpage needing to acquire webpage time delay, the webpage needs to be accessed, and a HyperText Markup Language (HTML) source file of the webpage is stored, wherein the structure of the HTML source file comprises a Head (Head) part and a Body (Body) part, the Head part provides relevant information of the webpage, and the Body part provides specific content of the webpage. After the HTML source file is obtained, the determining device for webpage time delay analyzes the head part of the HTML source file to obtain the URI of the webpage and the URI corresponding to the webpage elements.
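As a hedged illustration of how element URIs could be collected from a stored HTML source file, the following sketch uses the Python standard library. The class `ElementURICollector`, the function `collect_element_uris`, and the tag-to-attribute table are assumptions made for illustration (the table is not exhaustive), and the sketch scans the whole source file rather than only the Head part.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class ElementURICollector(HTMLParser):
    # Tag -> attribute carrying the element URI (assumed, not exhaustive).
    URI_ATTRS = {
        "img": "src",
        "script": "src",
        "link": "href",
        "audio": "src",
        "video": "src",
        "source": "src",
    }

    def __init__(self, page_uri):
        super().__init__()
        self.page_uri = page_uri
        self.element_uris = []

    def handle_starttag(self, tag, attrs):
        wanted = self.URI_ATTRS.get(tag)
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the URI of the web page.
                self.element_uris.append(urljoin(self.page_uri, value))


def collect_element_uris(page_uri, html_source):
    collector = ElementURICollector(page_uri)
    collector.feed(html_source)
    return collector.element_uris
```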
Step 1012, obtaining a first reference field and a second reference field of a HyperText Transfer Protocol (HTTP) header in each data packet in all the traffic data.
It should be noted that, after all the traffic data sent by the unidirectional DPI device is received, the "Host" field and the "Request URI" field of the HTTP header in each data packet are extracted and concatenated, with the Host field first and the Request URI field second, to obtain the URI of the web page element accessed by the data packet. The field obtained by this concatenation, namely the URI of the web page element accessed by the data packet, is used as the first reference field. Through the first reference field, data packets whose URI is the same as that of a web page element in the specified web page can be screened out from all the traffic data.
In addition to the first reference field, the access source (Referer) field of the HTTP header needs to be extracted as the second reference field. When the terminal accesses a webpage element, it sends an access request to the server storing the webpage file or the webpage element file, and the data packet of the access request carries a Referer field to inform the server which webpage the request was linked from. Therefore, the web page to which a web page element belongs can be determined by extracting the Referer field from the terminal's data packet, that is, by the second reference field.
Step 1013, if the first reference field is the same as one of the URIs of the web page element and the second reference field is the same as the URI of the specified web page, determining a data packet including the first reference field and the second reference field in all the traffic data as a reference data packet.
The first reference field of each data packet in all the traffic data, namely the URI of the web page element accessed by the data packet, is compared with the URIs of the web page elements in the specified web page, and the second reference field, namely the Referer field in the data packet, is compared with the URI of the specified web page. If both the first reference field and the second reference field are the same as the corresponding URI, the data packet including the first reference field and the second reference field is determined to be a data packet accessing the specified web page, namely a reference data packet.
In the embodiment of the present invention, the webpage identifier may specifically be a first reference field and a second reference field.
In the embodiment of the present invention, after the unidirectional DPI device acquires the first reference field and the second reference field, the reference packet may be screened out according to the first reference field and the second reference field.
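The screening of steps 1012 and 1013 can be sketched as follows. The packet representation (a dictionary with an `"http"` entry holding the header fields), the key name `"Request-URI"`, and the function names are hypothetical; the sketch also assumes that the element URIs are stored in the same scheme-less form produced by concatenating the Host and Request URI fields, and that the page URI is stored in the same form as the Referer field.

```python
def first_reference_field(headers):
    """Concatenate Host and Request URI, Host first."""
    return headers.get("Host", "") + headers.get("Request-URI", "")


def select_reference_packets(packets, page_uri, element_uris):
    """Keep packets whose first reference field matches one element URI
    and whose second reference field (Referer) matches the page URI."""
    selected = []
    for pkt in packets:
        headers = pkt["http"]
        if (
            first_reference_field(headers) in element_uris
            and headers.get("Referer") == page_uri
        ):
            selected.append(pkt)
    return selected
```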
Step 1014, acquiring the terminal IP address of the specified terminal.
After the reference data packets are screened out, the reference data packets need to be filtered according to the terminal IP address of the specified terminal used by the user, so as to screen out the target data packets of the specified terminal accessing the specified webpage.
Step 1015, determining, among the reference data packets, the reference data packets whose source IP address is the same as the terminal IP address as target data packets.
The target data packet constitutes a target flow, and the target flow at least comprises target flow data.
It should be noted that the terminal has a fixed and unique IP address, and the data packet sent by the terminal to the server carries the IP address, i.e., the source IP address. Because the source IP has uniqueness, if the source IP address included in the data packet is the same as the terminal IP address of the designated terminal, the data packet is determined to be the data packet sent by the designated terminal.
In the embodiment of the invention, the target data packet is screened from the data packets forming all the flow data, and the target data packet is the data packet sent when the appointed terminal accesses the appointed webpage, so that the webpage time delay of the appointed terminal accessing the appointed webpage can be determined according to the target flow data.
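A brief sketch of steps 1014 and 1015 follows, assuming each reference data packet is a dictionary with a `"src_ip"` entry (a hypothetical representation). The standard `ipaddress` module is used so that textually different spellings of the same address compare equal.

```python
import ipaddress


def select_target_packets(reference_packets, terminal_ip):
    """Keep the reference packets whose source IP equals the terminal IP."""
    wanted = ipaddress.ip_address(terminal_ip)
    return [
        pkt
        for pkt in reference_packets
        if ipaddress.ip_address(pkt["src_ip"]) == wanted
    ]
```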
In order to determine the web page delay, in an implementation manner of the embodiment of the present invention, the IP address of the server storing the web page elements needs to be determined first, so as to distinguish target data packets accessing different servers. Therefore, on the basis of the implementation shown in fig. 3, the implementation shown in fig. 4 may also be implemented: after step 1015 is executed to determine, among the reference data packets, the reference data packets whose source IP address is the same as the terminal IP address as target data packets, step 105 may be further executed:
Step 105, acquiring a destination IP address of the target data packet, and determining the destination IP address as the IP address of the target server.
It should be noted that, because the web page delay is related to the time for acquiring each web page element from the server, in the embodiment of the present invention, before calculating the web page delay, it is necessary to determine the target server to which the web page element belongs, that is, determine the destination IP address included in the target data packet, that is, the IP address of the server storing the web page element, according to the header of the network layer protocol included in the screened target data packet.
In the embodiment of the invention, the target IP address contained in the target data packet is used as the IP address of the target server, so that compared with the target server to which the webpage element belongs obtained according to the webpage, the time for determining the IP address of the target server is saved, and the workload of the webpage delay determining device is reduced.
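Reading the destination IP address from the network layer header can be illustrated with the following sketch, which assumes an IPv4 header supplied as raw bytes; the function name `ipv4_addresses` is hypothetical. In an IPv4 header, the source address occupies bytes 12 to 15 and the destination address occupies bytes 16 to 19.

```python
import socket
import struct


def ipv4_addresses(ip_header: bytes):
    """Return (source_ip, destination_ip) from a raw IPv4 header."""
    if len(ip_header) < 20 or ip_header[0] >> 4 != 4:
        raise ValueError("not an IPv4 header")
    # Bytes 12..15 hold the source address, bytes 16..19 the destination address.
    src, dst = struct.unpack("!4s4s", ip_header[12:20])
    return socket.inet_ntoa(src), socket.inet_ntoa(dst)
```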
In order to accurately determine the web page delay, in an implementation manner of the embodiment of the present invention, it is necessary to calculate the web page delay by comprehensively considering the single round trip delay of the terminal for accessing the web page and the single round trip delay of the terminal for accessing the web page element, and therefore, on the basis of the implementation manner shown in fig. 4, the implementation manner shown in fig. 5 may also be implemented. Step 104 determines, according to all single round-trip delays, a specified webpage delay of a specified terminal for accessing a specified webpage, which may be specifically executed as steps 1041 to 1043:
Step 1041, dividing, among all the single round-trip time delays, the single round-trip time delays for accessing the target server corresponding to the same IP address into a group.
In the embodiment of the invention, all single round-trip time delays counted in the specified time are grouped according to the IP address of the accessed server, and the single round-trip time delays for accessing the servers corresponding to the same IP address are divided into a group. To facilitate the counting and comparison of the number of times a given terminal accesses each server, each group of single round trip delays may be represented using the following form:
$$IP_1:\ RTT_{1,1},\ RTT_{1,2},\ \ldots,\ RTT_{1,n_1}$$
$$IP_2:\ RTT_{2,1},\ RTT_{2,2},\ \ldots,\ RTT_{2,n_2}$$
$$\ldots$$
$$IP_k:\ RTT_{k,1},\ RTT_{k,2},\ \ldots,\ RTT_{k,n_k}$$
wherein $IP_1$, $IP_2$ and $IP_k$ respectively indicate the server corresponding to the first IP address, the server corresponding to the second IP address and the server corresponding to the k-th IP address; $RTT_{1,n_1}$, $RTT_{2,n_2}$ and $RTT_{k,n_k}$ respectively indicate the single round trip delay of the $n_1$-th access of the designated terminal to the server corresponding to the first IP address, the single round trip delay of the $n_2$-th access to the server corresponding to the second IP address, and the single round trip delay of the $n_k$-th access to the server corresponding to the k-th IP address.
It should be noted that the amount of data in each group may not be equal. This may be because more of the web page elements accessed by the user are stored on one server, so that the user accesses that server many times and other servers fewer times; alternatively, the user may access the same web page element many times while accessing other web page elements only a few times, which also results in unequal numbers of accesses to each server.
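The grouping of step 1041 can be sketched as follows, assuming each single round-trip delay is available together with the IP address of the accessed server; the function name `group_by_server` and the input layout are hypothetical. The length of each group gives the number of accesses to that server, which need not be equal across groups.

```python
from collections import defaultdict


def group_by_server(samples):
    """Group (server_ip, rtt) samples into {server_ip: [rtt, ...]}."""
    groups = defaultdict(list)
    for server_ip, rtt in samples:
        groups[server_ip].append(rtt)
    return dict(groups)
```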
Step 1042, determining the weight corresponding to each target server according to the formula
$$\alpha_i = \frac{n_i}{\sum_{j=1}^{k} n_j}$$
wherein $\alpha_i$ indicates the weight corresponding to the target server corresponding to the i-th IP address, $n_i$ indicates the number of times that the designated terminal accesses the target server corresponding to the i-th IP address in the designated time, and $k$ indicates the total number of the target servers.
Step 1043, calculating the specified webpage time delay $d$ according to the formula
$$d = \sum_{i=1}^{k} \alpha_i d_i$$
wherein $d_i$ denotes the median of the group formed by the single round trip delays for accessing the target server corresponding to the i-th IP address.
And considering that the single round trip delay in each group has a long tail effect, selecting the median of the group formed by the single round trip delay in each group to calculate the time delay of the specified webpage.
It should be noted that before determining the median of each group of single round trip delays, each group needs to be sorted from large to small or from small to large. In this embodiment of the present invention, based on the method for representing each group of single round trip delays in step 1041, and with $RTT_{i,(j)}$ denoting the $j$-th value of the sorted group corresponding to the i-th IP address, $d_i$ may be determined as follows:
① when $n_i$ is odd, $d_i = RTT_{i,((n_i+1)/2)}$;
② when $n_i$ is even, $d_i = \frac{1}{2}\left(RTT_{i,(n_i/2)} + RTT_{i,(n_i/2+1)}\right)$.
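Steps 1042 and 1043 can be illustrated by the following sketch. It assumes that the weight of a server is its share of all accesses in the specified time and that the specified webpage delay is the weighted sum of the group medians; the function name `page_delay` and the dictionary layout are hypothetical. `statistics.median` sorts the group and averages the two middle values when the group size is even.

```python
from statistics import median


def page_delay(groups):
    """groups maps a server IP to its list of single round-trip delays."""
    total = sum(len(rtts) for rtts in groups.values())
    if total == 0:
        raise ValueError("no single round-trip delays available")
    delay = 0.0
    for rtts in groups.values():
        if not rtts:
            continue  # a server with no samples contributes nothing
        weight = len(rtts) / total       # assumed weight: share of accesses
        delay += weight * median(rtts)   # median limits the long-tail effect
    return delay
```

For example, with delays of 10, 30 and 20 ms for one server and 40 ms for another, the medians are 20 ms and 40 ms, the weights are 0.75 and 0.25, and the resulting delay is 25 ms.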
It should be noted that, after the specified webpage delay is calculated, the service perception of the user for the specified webpage can be evaluated by comparing the specified webpage delay with a preset threshold. If the specified webpage delay is greater than the preset threshold, the service perception of the user is poor and the webpage problem needs to be located. For example, the round-trip delay of each server is compared with the preset threshold; if the round-trip delay of one or more servers is greater than the preset threshold, the physical attribution and load state of those servers, the network path by which the user accesses them, the network link utilization, the operating state of the network equipment, and the like are analyzed in detail. The problem of the webpage service is determined through this analysis, and the network is optimized accordingly, thereby reducing the webpage delay and improving user perception.
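The per-server comparison described above might be sketched as follows. Using the group median as the round-trip delay of a server is an assumption of this sketch, and the function name `slow_servers` is hypothetical.

```python
from statistics import median


def slow_servers(groups, threshold):
    """Return the server IPs whose median single round-trip delay exceeds the threshold."""
    return sorted(
        ip
        for ip, rtts in groups.items()
        if rtts and median(rtts) > threshold
    )
```

The servers returned are candidates for the detailed analysis of load state, network path and link utilization.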
In the embodiment of the invention, the single round-trip time delay obtained in the appointed time is grouped according to the different accessed servers, the time delay difference of the appointed terminal when accessing different servers is considered, and the subsequent positioning of the webpage problem is facilitated; and then selecting the median of each group to calculate the time delay of the specified webpage, namely, when determining the time delay of the webpage, not only taking the time delay of one or more times of the specified terminal accessing the specified webpage or the webpage element as the time delay of the webpage, but also considering that the user may frequently access the specified webpage and the webpage element in a certain time, calculating the time delay of the specified webpage by using the single time webpage time delay of all the specified terminals accessing the specified webpage in the certain time, so that the calculation result of the time delay of the specified webpage is more accurate.
An embodiment of the present invention further provides a device 20 for determining a web page delay, where the device 20 is configured to execute the method flows shown in fig. 1, fig. 3, fig. 4, and fig. 5, and as shown in fig. 6, the device 20 includes:
the obtaining module 21 is configured to obtain target traffic data in all traffic data stored in the unidirectional DPI device according to the web page identifier and the terminal IP address, where the target traffic data at least includes access time of a specific terminal corresponding to the terminal IP address to access a specific web page corresponding to the web page identifier, and response time of the specific terminal in this access to respond to the specific web page.
And a determining module 22, configured to determine a difference between the response time and the access time acquired by the acquiring module 21 as a single round trip delay.
The determining module 22 is further configured to determine all single round trip delays of the specified terminal, which are obtained by the obtaining module 21, when the specified terminal accesses the specified web page within the specified time.
The determining module 22 is further configured to determine, according to all the single round-trip delays, a specified webpage delay of the specified terminal for accessing the specified webpage.
In an implementation manner of the embodiment of the present invention, the obtaining module 21 is further configured to obtain a uniform resource identifier URI of the specified web page, a web page element of the specified web page, and a URI of the web page element.
The obtaining module 21 is further configured to obtain a first reference field and a second reference field of a HTTP header in each data packet in all the traffic data.
The determining module 22 is further configured to determine, as the reference data packet, a data packet in all the traffic data that includes the first reference field and the second reference field if the first reference field is the same as one of the URIs of the web page elements and the second reference field is the same as the URI of the specified web page.
The obtaining module 21 is further configured to obtain a terminal IP address of the specified terminal.
The determining module 22 is further configured to determine, as a target data packet, a reference data packet in the reference data packet, where a source IP address is the same as a terminal IP address, where the target data packet constitutes a target traffic, and the target traffic at least includes target traffic data.
In an implementation manner of the embodiment of the present invention, the determining module 22 is further configured to obtain a destination IP address of the target data packet, and determine the destination IP address as an IP address of the target server.
In an implementation manner of the embodiment of the present invention, the determining module 22 is configured to:
divide, among all the single round-trip time delays, the single round-trip time delays for accessing the target server corresponding to the same IP address into a group;
determine the weight corresponding to each target server according to the formula
$$\alpha_i = \frac{n_i}{\sum_{j=1}^{k} n_j}$$
wherein $\alpha_i$ indicates the weight corresponding to the target server corresponding to the i-th IP address, $n_i$ indicates the number of times that the designated terminal accesses the target server corresponding to the i-th IP address in the designated time, and $k$ indicates the total number of the target servers;
calculate the specified webpage time delay $d$ according to the formula
$$d = \sum_{i=1}^{k} \alpha_i d_i$$
wherein $d_i$ denotes the median of the group formed by the single round trip delays for accessing the target server corresponding to the i-th IP address.
Compared with the prior art that an operator cannot acquire the webpage time delay when all users access the webpage due to the fact that probes cannot be installed in partial areas, the webpage time delay determining device acquires the flow data of the webpage accessed by the terminal from the unidirectional DPI equipment, and accordingly determines the webpage time delay when the designated terminal accesses the designated webpage according to the flow data; and when the webpage time delay of a certain user at a certain moment needs to be acquired, the flow data of the user at the moment is only needed to be screened from the flow data to calculate the webpage time delay, so that the calculated webpage time delay truly reflects the real-time service perception of the user.
As shown in fig. 7, an embodiment of the present application provides a schematic structural diagram of a network device. The network device 30 includes a processor 31 and a transceiver 33. The processor 31 is configured to control and manage the actions of the network device 30, for example, to perform the steps performed by the determining module 22 described above, and/or to perform other processes of the techniques described herein. The network device 30 may also include a memory 32 and a bus 34. The memory 32 is used for storing program code and data of the network device; the transceiver 33 is used to support communication between the network device and other network entities, for example, to perform the steps performed by the obtaining module 21.
The processor 31 may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure herein. The processor 31 may be a central processing unit, a general purpose processor, a digital signal processor, an application specific integrated circuit, a field programmable gate array or other programmable logic device, a transistor logic device, a hardware component, or any combination thereof. The processor may also be a combination that implements a computing function, for example, a combination of one or more microprocessors, or a combination of a digital signal processor (DSP) and a microprocessor.
Memory 32 may include volatile memory, such as random access memory; the memory 32 may also include non-volatile memory, such as read-only memory, flash memory, a hard disk, or a solid state disk; the memory may also comprise a combination of memories of the kind described above.
The bus 34 may be an Extended Industry Standard Architecture (EISA) bus or the like. The bus 34 may be divided into an address bus, a data bus, a control bus, and the like. For ease of illustration, only one thick line is shown in FIG. 7, but this is not intended to represent only one bus or type of bus.
Through the above description of the embodiments, it is clear to those skilled in the art that, for convenience and simplicity of description, the foregoing division of the functional modules is merely used as an example, and in practical applications, the above function distribution may be completed by different functional modules according to needs, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the above described functions. For the specific working processes of the system, the apparatus and the unit described above, reference may be made to the corresponding processes in the foregoing method embodiments, and details are not described here again.
The steps of a method or algorithm described in connection with the disclosure herein may be embodied in hardware or in software instructions executed by a processor. The software instructions may consist of corresponding software modules that may be stored in RAM, flash memory, ROM, Erasable Programmable Read Only Memory (EPROM), Electrically Erasable Programmable Read Only Memory (EEPROM), registers, a hard disk, a removable hard disk, a compact disc read only memory (CD-ROM), or any other form of storage medium known in the art. An exemplary storage medium is coupled to the processor such that the processor can read information from, and write information to, the storage medium. Of course, the storage medium may also be integral to the processor. The processor and the storage medium may reside in an Application Specific Integrated Circuit (ASIC). Computer-readable media include both computer storage media and communication media, the latter including any medium that facilitates transfer of a computer program from one place to another. A storage medium may be any available medium that can be accessed by a general purpose or special purpose computer. A computer readable storage medium may be, for example, but is not limited to, an electronic, magnetic, optical, electromagnetic, infrared, or semiconductor system, apparatus, or device, or any combination of the foregoing. More specific examples (a non-exhaustive list) of the computer readable storage medium include the following: an electrical connection having one or more wires, a portable computer diskette, a hard disk, a Random Access Memory (RAM), a Read-Only Memory (ROM), an Erasable Programmable Read-Only Memory (EPROM), an optical fiber, a portable Compact Disc Read-Only Memory (CD-ROM), an optical storage device, a magnetic storage device, or any suitable combination of the foregoing.
In embodiments of the present application, a computer readable storage medium may be any tangible medium that can contain, or store a program for use by or in connection with an instruction execution system, apparatus, or device.
The above is only an embodiment of the present application, but the scope of the present application is not limited thereto, and any changes or substitutions within the technical scope of the present disclosure should be covered by the scope of the present application. Therefore, the protection scope of the present application shall be subject to the protection scope of the claims.