
CN101547126B - Network virus detecting method based on network data streams and device thereof - Google Patents


Info

Publication number
CN101547126B
CN101547126B CN2008101028494A CN200810102849A CN101547126B CN 101547126 B CN101547126 B CN 101547126B CN 2008101028494 A CN2008101028494 A CN 2008101028494A CN 200810102849 A CN200810102849 A CN 200810102849A CN 101547126 B CN101547126 B CN 101547126B
Authority
CN
China
Prior art keywords
virus
network
matching
module
characteristic
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN2008101028494A
Other languages
Chinese (zh)
Other versions
CN101547126A (en
Inventor
华东明
肖小剑
邓炜
周涛
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Credit Information Technology Co ltd
Original Assignee
Beijing Venus Information Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Venus Information Technology Co Ltd
Priority to CN2008101028494A
Publication of CN101547126A
Application granted
Publication of CN101547126B
Status: Active
Anticipated expiration

Landscapes

  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method and device for detecting network viruses based on network data streams in a TCP/IP network. The method comprises: classifying network viruses according to the type of their host files; fragmenting the network virus signatures according to their description type; reassembling a specific number of network data packets into a network data stream; matching the classified and fragmented virus signatures against the network data stream to detect viruses hidden in it; and scanning for the format features of web page files during matching so as to detect embedded network viruses. Because the data stream is matched against the virus signatures in two passes and embedded viruses are also detected, network viruses spreading in a TCP/IP network can be detected effectively and accurately, network users are protected from viruses spread over the network, and a safe network environment is provided for them.

Description

Network virus detection method and device based on network data flow
Technical Field
The invention relates to the technical field of network and information security, in particular to a network virus detection method and device based on network data flow in a TCP/IP network.
Background
With the development of the Internet and of network applications, people carry out e-commerce, resource sharing, and entertainment over the network. The network has become an indispensable part of work, life, and study, and the demand for the security of information on the network keeps growing. Firewall, intrusion detection, and anti-virus products remain the mainstream products in the information security market. As network viruses spread rapidly and combine with increasingly mature hacker techniques, it has become necessary to integrate traditional anti-virus technology with network security technology. Fusing traditional anti-virus technology with intrusion detection technology yields network virus detection technology, whose basic principle is as follows: a packet capture device obtains all network traffic flowing through the equipment; the data packets then undergo data link layer protocol parsing, IP protocol parsing and fragment reassembly, transport layer protocol parsing and stream reassembly, application layer protocol parsing and data restoration, and finally virus signature matching and response, so that viruses in the network are detected.
A network virus detection device mainly involves the following technologies: network packet capture, protocol parsing, reassembly, decompression, decoding, virus unpacking (shell removal), script separation, and pattern matching.
Network virus detection technology has developed in three main directions: first, file-based detection; second, packet-based detection; third, detection based on network data streams. File-based detection starts scanning only once the whole file has been received, so the data occupies a large amount of memory, scanning takes a long time, and performance is low; if only part of the file is scanned, the false positive rate is relatively high. Packet-based detection performs relatively well, but the packets captured by the device arrive out of order, so both the false positive rate and the false negative rate are relatively high. The present invention is a detection technology based on network data streams: it starts matching as soon as part of the file has been received, which shortens processing time and improves performance, and it puts the data packets in order, which lowers the false positive and false negative rates.
Disclosure of Invention
The invention aims to overcome the defects of a network virus detection technology based on files and data packets, and provides a device for detecting network viruses based on network data flow in a TCP/IP network, so that the network viruses can be detected quickly and accurately, the safety of information in the network is ensured, and a quick and safe network application environment is provided for network users.
The purpose of the invention is realized by the following technical scheme:
A network virus detection method based on network data streams comprises the following steps:
A. classifying the network viruses according to the types of their host files;
B. fragmenting the network virus signatures according to their description modes;
C. obtaining the web page file format features for each type of web page file;
D. reassembling a specific number of network data packets into a network data stream;
E. matching the network data stream against the network virus signatures, using the classified network virus information bases, the virus signature fragments, and the web page file format features, and detecting the network viruses hidden in the network data stream.
Provided that a network virus information base is available, step A comprises:
A1. reading the network virus information base;
A2. decrypting the virus information;
A3. parsing the virus information;
preferably, the virus information includes: virus name, virus type, file offset, and virus signature;
A4. dividing the network viruses into PE viruses, macro viruses, script viruses, and other viruses according to the format type of the host file.
Preferably, the step B includes:
for a network virus described by a non-regular expression, two fragments are extracted at random from the signature, in address order; for a network virus described by a regular expression, one fragment is extracted at random from the part of the signature preceding all regular expression descriptors, and the remainder of the signature is taken as the other fragment.
Preferably, the first of the two virus signature fragments comprises: the virus name, the virus description mode, the file offset, the virus signature code, and a pointer to the second virus signature fragment;
preferably, the second virus signature fragment comprises: a virus offset and a virus signature code.
Preferably, the step C includes:
C1. analyzing the various web page file formats to obtain their file format features and form a web page file format feature library;
C2. reading the web page file format feature information from the web page file format feature library;
C3. parsing the web page file format feature information;
C4. inserting the web page file format features into the network virus signature data structure, so as to detect script viruses embedded in PE and document format files.
Preferably, the step D includes:
D1. storing the data packets in a cache in sequence order and incrementing a packet counter by 1;
D2. when the packet counter is greater than or equal to the sender's packet-count window, or the packet counter is smaller than that window but the elapsed time is greater than or equal to the time window, passing the buffered data packets up and resetting the packet counter and the start time to 0.
Preferably, the step E includes:
E1. attaching the network virus signature library that corresponds to the file type, and scanning the network data stream for the first fragments of all network virus signatures with a multi-pattern matching algorithm;
E2. when the match in E1 succeeds and the matched virus is described by a non-regular expression, calculating the position of the virus's second fragment in the network data stream and scanning the stream from that position with a single-pattern matching algorithm;
E3. when the match in E1 succeeds and the matched virus is described by a regular expression, scanning the network data stream with the automaton that a regular expression algorithm generated from the remainder of the signature;
E4. if the scan in E2 or E3 matches, issuing a virus response.
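The two-pass matching of step E can be sketched as follows. This is a minimal illustration under stated assumptions, not the patented implementation: the `Signature` class, its field names, and the `scan` and `confirm` functions are hypothetical; a plain `bytes.find` loop stands in for the multi-pattern matching algorithm, which the text does not name; and the second-fragment offset is assumed to be measured from the start of the first fragment.

```python
import re
from dataclasses import dataclass
from typing import Optional


@dataclass
class Signature:
    """Hypothetical container for one fragmented virus signature."""
    name: str
    first: bytes                        # 1st fragment, searched in pass 1 (E1)
    second: Optional[bytes] = None      # 2nd fragment of a non-regex signature
    second_offset: int = 0              # assumed: offset from start of 1st fragment
    rest: Optional[re.Pattern] = None   # compiled remainder of a regex signature


def confirm(stream: bytes, start: int, sig: Signature) -> bool:
    """Pass 2: verify the rest of the signature at a pass-1 hit."""
    if sig.rest is not None:
        # E3: the regex remainder must match right after the 1st fragment.
        return sig.rest.match(stream, start + len(sig.first)) is not None
    if sig.second is None:
        return True  # signature consists of a single fragment
    # E2: the 2nd fragment is expected at a computed position.
    return stream.startswith(sig.second, start + sig.second_offset)


def scan(stream: bytes, signatures: list[Signature]) -> list[str]:
    """Return the names of all signatures found in the stream."""
    hits = []
    for sig in signatures:
        # E1: naive stand-in for a multi-pattern matcher.
        start = stream.find(sig.first)
        while start != -1:
            if confirm(stream, start, sig):
                hits.append(sig.name)  # E4: a real device would respond here
                break
            start = stream.find(sig.first, start + 1)
    return hits
```

The point of the split is that pass 1 only compares short first fragments, and the more expensive check in `confirm` runs only at positions where a first fragment was actually found.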
A network data flow based network virus detection apparatus, comprising:
a virus information base for storing virus information, including the virus name, virus type, description mode, offset, and virus signature code;
a web page file format feature library for storing the feature marks that identify web page files such as html, htm, php, jsp, jspx and asp;
an operating parameter library for storing the on/off switches of the decompression, decoding, and unpacking modules and the response modes;
an initialization module for initializing the operating parameters, initializing the web page file format features, and preprocessing the virus signatures, where preprocessing comprises reading, decrypting, parsing, classifying, and fragmenting the virus signatures, building the virus feature tree, and building the single-pattern regular expression automaton;
a virus detection module for preprocessing the data stream (stream reassembly, decompression, decoding, and virus unpacking), matching viruses across streams with the multi-pattern matching algorithm, matching viruses across streams with the single-pattern regular expression algorithm, and responding;
a cache recovery module for reclaiming the cache space allocated while the device is running.
First, the initialization module reads, decrypts, and parses the network virus information base, the web page file format feature library, and the operating parameter library; next, the virus detection module preprocesses the network data stream and matches it against the virus signatures, responding if a virus is found; finally, the dynamically allocated cache is reclaimed when the process terminates.
With the technical scheme provided by the invention, classifying the viruses allows a different virus library to be used for each file type, which reduces the number of signatures to match; fragmenting the signatures and matching the data stream in two passes shortens the match length per virus, which improves the performance of the device; and reassembling the network data packets into streams allows decompression and virus matching to work across stream boundaries, which lowers the false positive and false negative rates.
Drawings
FIG. 1 is a schematic networking diagram of a virus detection apparatus in a TCP/IP network;
FIG. 2 is a schematic diagram of the apparatus of the method of the present invention;
FIG. 3 is a schematic diagram of a network virus classification model according to the present invention;
FIG. 4 is a schematic diagram of a network virus fragmentation model according to the present invention;
FIG. 5 is a main flow diagram of the method of the present invention;
FIG. 6 is a schematic diagram of the network virus feature preprocessing flow of the present invention;
FIG. 7 is a schematic diagram illustrating a web page file format feature preprocessing flow according to the present invention;
FIG. 8 is a schematic diagram of a network data stream reassembly procedure in accordance with the present invention;
FIG. 9 is a schematic diagram of a network data flow preprocessing flow in the present invention;
FIG. 10 is a schematic diagram of the network virus matching process in the present invention.
Detailed Description
The core of the method is as follows: the network viruses are classified by file type, so that each file type corresponds to a type of network virus and fewer signatures have to be matched against each data stream; each virus signature is split into fragments and scanned in two passes, which shortens the scan length per virus; the network data stream is reassembled, which lowers the false positive and false negative rates; decompression, decoding, unpacking, and matching work across streams, so that the device can detect viruses while they are being transmitted in the network; and embedded network viruses are detected, so that the device can find script viruses hidden in PE and OLE2 files.
As those of ordinary skill in the art know, the general workflow of network virus detection is:
in the initialization stage, network virus information is read from the virus information base, decrypted, and parsed; in the detection stage, a network packet capture device acquires data packets, the protocols are parsed and the data reassembled, and decompression, decoding, unpacking, and virus scanning are carried out; in the response stage, the detection result and the action to be taken are reported to the control end. The detection stage can work in one of three modes: packet-based, data-stream-based, or file-based. The method of the invention performs data-stream-based detection in a TCP/IP network while keeping this general framework and flow.
The device networking structure for detecting network virus based on data flow in TCP/IP is shown in FIG. 1. Wherein,
the local area network comprises network users and network services inside the local area network;
the network virus detection device based on the data flow is used for detecting the network data flow passing through and providing safety protection for the local area network;
the Internet, including routers, may transport and route network traffic.
The apparatus structure of the method of the present invention will be described in detail with reference to FIG. 2:
the network virus detection device based on the network data flow comprises an information base, an initialization module, a network virus characteristic detection module and a cache recovery module; the information base comprises a virus information base, a webpage format feature base and an operation parameter base.
The initialization module comprises an operation parameter initialization module, a preprocessing network virus characteristic module and a webpage file format characteristic initialization module, wherein the preprocessing network virus characteristic module comprises a virus information reading, decrypting, analyzing, classifying and fragmenting module, a characteristic tree creating module, a single-mode data structure and a regular expression automaton creating module;
the network virus characteristic detection module comprises a network data stream reconstruction module, a data stream preprocessing module, a virus characteristic matching module and a response module; the data stream preprocessing module comprises a cross-stream decompression, decoding and virus shelling module.
The virus characteristic matching module is provided with a multi-mode matching module and a single-mode regular expression matching module.
And the cache recovery module is used for recovering the cache space applied in the initialization module when the network virus detection device based on the network data flow is finished.
The classification model of the network viruses in the present invention is explained in detail with reference to fig. 3:
FIG. 3 classifies both the virus libraries and the file types so that each file type corresponds to a virus library, which reduces the number of signatures used to scan a single network data stream. The virus library is divided into a PE virus library, a macro virus library, a script virus library, and a library of other viruses. Files are divided into compressed and uncompressed files. Compressed files comprise zip, rar, chm, and cab files. Uncompressed files are divided into specially encoded files, which comprise Base64-encoded files, and files without special encoding, which are divided into Windows PE files, Windows documents, web pages, and other files. Windows PE files comprise exe, com, dll, sys, and vxd files; Windows documents comprise doc, xls, and ppt files; web pages comprise html, htm, php, asp, aspx, and jsp files.
Viruses hidden in Windows PE files form the PE virus library, viruses hidden in Windows documents form the macro virus library, viruses hidden in web pages form the script virus library, and viruses hidden in other files form the library of other viruses.
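The correspondence between file types and virus libraries can be sketched as a lookup table. This is only an illustration: the names `LIBRARY_BY_EXTENSION` and `select_library` and the library labels are hypothetical, and the sketch keys on the file extension for brevity, whereas the description obtains the file type from the file header after protocol parsing.

```python
from pathlib import PurePosixPath

# Extension groups as listed for FIG. 3; library labels are illustrative.
LIBRARY_BY_EXTENSION = {
    **dict.fromkeys(("exe", "com", "dll", "sys", "vxd"), "pe"),
    **dict.fromkeys(("doc", "xls", "ppt"), "macro"),
    **dict.fromkeys(("html", "htm", "php", "asp", "aspx", "jsp"), "script"),
}


def select_library(filename: str) -> str:
    """Pick the virus library to attach for a file, defaulting to 'other'."""
    ext = PurePosixPath(filename).suffix.lstrip(".").lower()
    return LIBRARY_BY_EXTENSION.get(ext, "other")
```

Only the signatures in the selected library need to be matched against the stream, which is where the reduction in matching work comes from.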
The fragment model of the complex virus in the present invention is explained in detail with reference to fig. 4:
To shorten the match length per virus, the invention splits each virus signature into fragments. Signatures are first divided by description form into those described by a non-regular expression and those described by a regular expression. For a signature described by a non-regular expression, 2 fragments are extracted at random, in address order, and the 1st fragment is inserted into the feature tree of the virus library the virus belongs to; for a signature described by a regular expression, the 1st fragment is inserted into the feature tree of the virus library and the 2nd fragment is inserted into the single-pattern regular expression automaton.
In order that those skilled in the art will better understand the present invention, the present invention will be described in further detail below with reference to the flowchart shown in fig. 5. The method comprises the following steps:
Step 501: make the operating parameters of the network virus detection device recognizable, specifically: read the operating parameters from the device's operating parameter library, parse them, and assign them to the corresponding variables.
The structure of the device operating parameters is shown in Table 1 below.
Table 1:
Device operating parameter | Values
Detection type | 1: detect compressed files; 2: detect mail; 3: detect packed (shelled) virus files; 4: detect embedded virus files
Response type | 1: alarm; 2: drop packet
Matching algorithm type | 1: multi-pattern matching algorithm 1; 2: multi-pattern matching algorithm 2; 3: single-pattern regular expression algorithm
Step 502: make the network viruses recognizable, specifically: read the virus information from the virus library; decrypt, parse, fragment, and classify it; build the feature tree; and build the single-pattern data structure and the regular expression automaton.
The structure table of the network virus information is shown in table 2 below.
Table 2:
No. | Data field
1 | Virus name
2 | Virus type (1: PE virus; 2: macro virus; 3: script virus; 4: other viruses)
3 | Description mode (1: non-regular expression; 2: regular expression)
4 | File offset of the virus signature relative to the file header
5 | Virus signature
The array structure of the 1 st segment of the network virus signature is shown in table 3 below.
Table 3:
No. | Data field
1 | Length of the 1st fragment of the virus signature
2 | Case sensitivity (1: case sensitive; 2: case insensitive)
3 | Description mode (1: non-regular expression; 2: regular expression)
4 | Virus signature value
Step 503: make the web page file format features recognizable, specifically: read the feature values from the web page file format feature library, parse them, and insert them into the corresponding feature trees.
The structure of the web page file format feature is shown in table 4 below.
Table 4:
serial number Data field
1 html
2 htm
3 PHP
4 asp
5 aspx
6 jsp
Step 504: make the network data packets recognizable, specifically: capture data packets from the network link, then perform data frame parsing, IP packet parsing and fragment reassembly, and transport layer message parsing.
Step 505: reassemble the recognized data messages, and report the network data stream when the number of messages reaches the message-count window.
Step 506: preprocess the network data stream: if it is a compressed file, decompress it; if it is a file in a special encoding format, decode it; if it contains a packed (shelled) virus, unpack it; then attach the corresponding virus library.
Step 507: match the 1st fragments of the virus signatures and the web page file format features with a multi-pattern matching algorithm. If a web page file format feature matches, detect embedded viruses; if a virus signature matches, scan the network data stream with the signature's 2nd fragment according to its description form.
Step 508: if the matching result shows that a virus is present, make the corresponding response.
Step 509: when the virus detection device terminates, reclaim the dynamically allocated cache resources.
The flow of FIG. 5 is further illustrated by an application example.
For example: the device is configured to detect compressed files, mail, packed virus files, and embedded virus files; to raise an alarm and drop the packet when a virus is detected; and to scan with a multi-pattern matching algorithm and the single-pattern regular expression algorithm.
The virus information is:
Exploit.HTML.ObjectType:3:9a:3c6f626a65637420747970653d222f2f2f2f2f2f2f2f2f2f2f2f7468284461746529203d20313220416e6420446179284461746529203d203239205468656e{2-3}20202020456e64204966*636f6d706f6e656e742e4578706f7274202822433a5c537572726f756e642e6b65792229646174613d226d732d6974733a6d68746d6c3a66696c653a2f2f(63|64)3a5c
The web page file format features are:
html, htm, PHP, asp, aspx, and jsp.
The structural table of the operating parameters of the apparatus is shown in table 5 below.
Table 5:
Device operating parameter | Value
Detection type | 1 & 2 & 3 & 4
Response type | 1 & 2
Matching algorithm type | 1 & 3
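As a rough sketch of how such combined values could be held in memory, the codes of Table 1 can be modeled as enumerations and a `1 & 3` style value parsed into a set of enabled options. The class names, member names, and the `parse_values` helper are assumptions made for illustration; the text does not describe how the parameter library is stored or parsed.

```python
from enum import Enum


class Detection(Enum):
    COMPRESSED = 1
    MAIL = 2
    PACKED = 3
    EMBEDDED = 4


class Response(Enum):
    ALARM = 1
    DROP_PACKET = 2


class Algorithm(Enum):
    MULTI_PATTERN_1 = 1
    MULTI_PATTERN_2 = 2
    SINGLE_PATTERN_REGEX = 3


def parse_values(text: str, enum_cls):
    """Turn a value such as '1 & 3' into a frozenset of enum members."""
    return frozenset(enum_cls(int(part)) for part in text.split("&"))


# The configuration of Table 5, held as sets of enabled options.
config = {
    "detection": parse_values("1 & 2 & 3 & 4", Detection),
    "response": parse_values("1 & 2", Response),
    "algorithm": parse_values("1 & 3", Algorithm),
}
```

A set is used rather than bit flags because the codes in Table 1 are consecutive integers, not powers of two.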
The structure of the network virus information is shown in table 6 below.
Table 6:
serial number Data field
1 Exploit.HTML.ObjectType
2 3
3 2
4 9a
5 3c6f626a65637420747970653d222f2f2f2f2f2f2f2f2f2f2f2 f74682 84461746529203d20313220416e6420446179284461746529203 d203239205468656e{2-3}20202020456e64204966*636f 6d706f6e656e742e4578706f7274202822433 a5c537572726f756 e642e6b65792229646174613d226d732d6974733a6d68746d6c3 a66696c653a2f2f(63|64)3a5c
The array structure of the 1 st segment of the network virus signature is shown in table 7 below.
Table 7:
serial number Data field
1 10byte
2 1
3 3c6f626a656374207479
First, the network data packets are preprocessed; the preprocessed network data stream is:
01 00 0c cc cc cc 00 0e d7 bd a4 c0 01 63 aa aa 03 00 00 0c 20 00
02 b4 50 76 00 01 00 0d 76 65 6e 75 73 64 65 70 32 00 05 01 00 43
69 73 63 6f 20 49 6e 74 65 72 6e 65 74 77 6f 72 6b 20 4f 70 65 72
61 74 69 6e 67 20 53 79 73 74 65 6d 20 53 6f 66 74 77 61 72 65 20
0a 49 4f 53 20 28 74 6d 29 20 43 32 36 30 30 20 53 6f 66 74 77 61
72 65 20 28 43 32 36 3c6f626a656374207479 30 30 2d 49 2d 4d 29 2c
20 56 65 72 73 30 00 04 00 08 00 00 00 01 00 07 00 09 c0 a8 0a 00
18 00 0b 00 05 00
The 1st fragment of the network virus signature is then matched against the network data stream, and the match succeeds. Because the signature is described by a regular expression, the regular expression automaton is used to match the remainder; if that match does not succeed, further data packets are captured and detected.
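The signature in this example mixes hex-encoded bytes with descriptors such as `{2-3}`, `*`, and `(63|64)`, whose meaning the text does not define. Assuming that `{m-n}` stands for m to n arbitrary bytes, `*` for any number of arbitrary bytes, and `(aa|bb)` for a choice between hex byte strings, the notation could be translated into a Python bytes regular expression roughly as follows. The function name `compile_signature` is hypothetical, and Python's `re` engine stands in for the single-pattern regular expression automaton.

```python
import re

# One token: {m-n}, *, (hex|hex|...), or a run of hex byte pairs.
TOKEN = re.compile(
    r"\{(\d+)-(\d+)\}|\*|\(([0-9a-fA-F|]+)\)|((?:[0-9a-fA-F]{2})+)"
)


def compile_signature(sig: str) -> re.Pattern:
    """Translate the assumed signature notation into a bytes regex."""
    parts = []
    pos = 0
    for m in TOKEN.finditer(sig):
        if m.start() != pos:
            raise ValueError(f"unrecognized text at position {pos}")
        pos = m.end()
        if m.group(1) is not None:        # {m-n}: m to n arbitrary bytes
            parts.append(b".{%d,%d}" % (int(m.group(1)), int(m.group(2))))
        elif m.group(0) == "*":           # *: any number of arbitrary bytes
            parts.append(b".*?")
        elif m.group(3) is not None:      # (aa|bb): alternative byte strings
            alts = [re.escape(bytes.fromhex(a)) for a in m.group(3).split("|")]
            parts.append(b"(?:" + b"|".join(alts) + b")")
        else:                             # literal hex bytes
            parts.append(re.escape(bytes.fromhex(m.group(4))))
    if pos != len(sig):
        raise ValueError(f"unrecognized text at position {pos}")
    # DOTALL so that '.' also matches the newline byte.
    return re.compile(b"".join(parts), re.DOTALL)
```

For instance, `compile_signature("3c6f{2-3}20*(63|64)3a5c")` yields a pattern that looks for `<o`, two or three arbitrary bytes, a space, any bytes, then `c:\` or `d:\`.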
The present invention is further described in detail with reference to the flowchart shown in fig. 6. The method comprises the following steps:
Step 601: read the virus information from the virus library.
Step 602: decrypt the ciphertext of the virus information.
Step 603: parse the plaintext of the virus information.
Step 604: classify the virus signatures into PE viruses, macro viruses, script viruses, and other viruses.
Step 605: fragment the virus signatures. If a signature is described by a non-regular expression, extract 2 virus fragments from it at random; if it is described by a regular expression, extract at random a 1st fragment of a set length from the part preceding the 1st regular expression descriptor, and take the remainder as the 2nd fragment.
Step 606: insert the 1st fragment of each virus signature into the corresponding virus feature tree.
Step 607: if the signature is described by a regular expression, insert the part of the signature that follows the 1st fragment into the single-pattern regular expression automaton.
Step 608: if the signature is described by a non-regular expression, save the 2nd fragment together with its virus offset.
The array structure of the 2 nd segment of the network virus signature is shown in table 8 below.
Table 8:
No. | Data field
1 | Offset of the 2nd virus fragment from the 1st virus signature fragment
2 | Length of the 2nd fragment of the virus signature
3 | Virus signature
The above-described flow of fig. 6 is further illustrated by an application example.
For example: first, the ciphertext of the virus information is read from the virus library and decrypted, and the plaintext is parsed.
The ciphertext of the virus information is:
436f707972696768742028632920313938362d3230303320627920636973636f2053797374656d732c20496e632e0a436f6d70696c6564204672692033302d4d61792d30332030323a3435206279206b656c6c7974687700060010636973636f2032363231584d00020011000000010101cc0004c0a81cfe0003001346617374772e636973636f2e636f6d2f7461630a4152452028666331290a54414376
The plaintext of the virus information is:
Exploit.HTML.ObjectType:3:9a:3c6f626a65637420747970653d222f2f2f2f2f2f2f2f2f2f2f2f7468284461746529203d20313220416e6420446179284461746529203d203239205468656e{2-3}20202020456e64204966*636f6d706f6e656e742e4578706f7274202822433a5c537572726f756e642e6b65792229646174613d226d732d6974733a6d68746d6c3a66696c653a2f2f(63|64)3a5c
The structure table of the network virus information is shown in table 9 below.
Table 9:
serial number Data field
1 Exploit.HTML.ObjectType
2 3
3 2
4 9a
5 3c6f626a65637420747970653d222f2f2f2f2f2f2f2f2f2f2f2 f74682 84461746529203d20313220416e6420446179284461746529203 d203239205468656e{2-3}20202020456e64204966*636f 6d706f6e656e742e4578706f7274202822433 a5c537572726f756 e642e6b65792229646174613d226d732d6974733a6d68746d6c3 a66696c653a2f2f(63|64)3a5c
Next, the virus signature is fragmented, and its 1st fragment is inserted into the multi-pattern matching feature tree that corresponds to its virus type.
The array structure of the 1st fragment of the network virus signature is shown in Table 10 below.
Table 10:
serial number Data field
1 10byte
2 1
3 3c6f626a656374207479
Finally, depending on the description mode: if the signature is described by a non-regular expression, the 2nd fragment is cached and a jump table is created; if it is described by a regular expression, a regular expression automaton is constructed.
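The parsing and fragmenting in this example can be sketched as follows. Several points are assumptions read off the sample record rather than stated in the text: that the plaintext fields are separated by colons in the order name, type, offset, signature; that the offset `9a` is hexadecimal; that `{`, `*`, and `(` mark regular expression descriptors; and that the 1st fragment is simply the leading 10 bytes of the literal prefix (the text says it is extracted at random, and Table 10 shows a 10-byte fragment). The names `VirusRecord` and `parse_record` are hypothetical.

```python
import re
from dataclasses import dataclass

VIRUS_TYPES = {1: "PE", 2: "macro", 3: "script", 4: "other"}  # codes of Table 2
DESCRIPTOR = re.compile(r"[{*(]")   # assumed regular expression descriptors
FIRST_FRAGMENT_BYTES = 10           # fragment length shown in Table 10


@dataclass
class VirusRecord:
    name: str
    virus_type: str
    file_offset: int
    is_regex: bool
    first_fragment: bytes
    remainder: str   # rest of the signature, still in hex notation


def parse_record(line: str) -> VirusRecord:
    """Parse 'name:type:offset:signature' and split off the 1st fragment."""
    name, type_code, offset, signature = line.strip().split(":", 3)
    m = DESCRIPTOR.search(signature)
    literal_prefix = signature[: m.start()] if m else signature
    first_hex = literal_prefix[: FIRST_FRAGMENT_BYTES * 2]
    return VirusRecord(
        name=name,
        virus_type=VIRUS_TYPES[int(type_code)],
        file_offset=int(offset, 16),      # assumed hexadecimal
        is_regex=m is not None,
        first_fragment=bytes.fromhex(first_hex),
        remainder=signature[len(first_hex):],
    )
```

Applied to a shortened form of the sample record, the 1st fragment comes out as the bytes `3c6f626a656374207479`, the value listed in Table 10.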
The present invention is further described in detail with reference to the flowchart shown in fig. 7. The method comprises the following steps:
Step 701: read the web page file format features from the web page file format feature library.
Step 702: parse the web page file format features.
Step 703: insert the web page file format features into the PE virus feature tree.
Step 704: insert the web page file format features into the macro virus feature tree.
The above-described flow of fig. 7 is further illustrated by an application example.
For example: firstly, reading the format characteristics of the webpage file from a webpage file format characteristic library and analyzing;
the structure of the web page file format feature is shown in table 11 below.
Table 11:
serial number Data field
1 html
2 htm
3 PHP
4 asp
5 aspx
6 jsp
Then, inserting the features in the table into the PE virus tree;
finally, the features in the table are inserted into the macro virus tree.
The network data stream reassembly is described in further detail with reference to the flowchart shown in fig. 8. The method comprises the following steps:
step 801: storing the data packets into a cache in sequence, and adding 1 to a packet counter;
step 802: when the packet counter value is greater than or equal to the sending end packet number window value, or the packet counter value is smaller than the sending end packet number window value and the time value is greater than or equal to the time window value, uploading the data packet;
step 803: the packet counter is 0 and the initial time is 0.
The above-described flow of fig. 8 is further illustrated by an application example.
For example: assume the packet-count window is 10 and the time window is 1 second. The capture device has already captured the data packets with sequence numbers 1, 2, 3, 4, 5, 6, 7, 9 and 10, so the packet count is 9, which is less than 10, and the elapsed time is less than 1 second. It now captures the data packet with sequence number 8, which is inserted into the stream reassembly queue in sequence order; the queue order becomes 1, 2, 3, 4, 5, 6, 7, 8, 9 and 10, and the packet count equals 10.
The 10 data packets are then uploaded, the counter is reset to 0, and the initial time is reset to 0.
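The windowing logic of Fig. 8 can be sketched as below. This is an illustrative reading of the steps, not the patented implementation: the class name `ReassemblyBuffer` is hypothetical, the caller supplies the current time explicitly, and `None` is used in place of 0 to mark an unset initial time.

```python
import bisect
from typing import List, Optional, Tuple

Packet = Tuple[int, bytes]  # (sequence number, payload)


class ReassemblyBuffer:
    def __init__(self, packet_window: int = 10, time_window: float = 1.0) -> None:
        self.packet_window = packet_window
        self.time_window = time_window
        self.packets: List[Packet] = []
        self.counter = 0
        self.start_time: Optional[float] = None  # assumption: None means "not started"

    def add(self, seq: int, payload: bytes, now: float) -> Optional[List[Packet]]:
        """Step 801: cache the packet in sequence order and count it."""
        if self.start_time is None:
            self.start_time = now
        bisect.insort(self.packets, (seq, payload))
        self.counter += 1
        # Step 802: upload when either the count window or the time window is reached.
        if self.counter >= self.packet_window or now - self.start_time >= self.time_window:
            return self.flush()
        return None

    def flush(self) -> List[Packet]:
        """Step 803: hand over the ordered batch and reset counter and initial time."""
        batch, self.packets = self.packets, []
        self.counter = 0
        self.start_time = None
        return batch
```

Feeding the buffer the sequence numbers 1 to 7, 9, 10 and then 8 within one second makes the tenth call return the packets ordered 1 to 10, as in the example above.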
The network data flow preprocessing flow is further described in detail with reference to the flow chart shown in fig. 9. The method comprises the following steps:
Step 901: Obtain the file type from the header of the file recovered by protocol analysis.
Step 902: If the file is a compressed file, call the decompression algorithm that corresponds to its extension and decompress it.
Step 903: If the file uses a special encoding format, call the corresponding decoding algorithm and decode it.
Step 904: If the file is an ordinary file, determine whether it is a Windows PE file, a Windows document, a web page, or another kind of file.
Step 905: If the file is a Windows PE file and unshelling (unpacking) is required, unshell it.
Step 906: Attach the virus library corresponding to PE-format files.
Step 907: Attach the macro virus library corresponding to Windows documents.
Step 908: Attach the script virus library corresponding to web page files.
Step 909: Attach the picture virus library corresponding to picture files.
Step 910: Attach the virus libraries corresponding to other files.
The above-described flow of fig. 9 is further illustrated by an application example.
For example: protocol analysis shows that the network data stream is base64-encoded, so it is decoded; the decoded file is of type rar, so it is decompressed; the decompressed file is a Windows PE format file, so it is unshelled if unshelling is required, and the PE virus library is attached.
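A rough Python sketch of this decode, decompress and classify chain is shown below. Several things here are assumptions made for illustration: gzip stands in for rar because the Python standard library has no rar decoder, the file type is guessed from well-known magic bytes rather than from protocol headers, unshelling is left out, and the function and library names are hypothetical.

```python
import base64
import gzip
from typing import Tuple

OLE2_MAGIC = b"\xd0\xcf\x11\xe0\xa1\xb1\x1a\xe1"  # header of OLE2 compound documents

LIBRARY_FOR = {
    "pe": "PE virus library",
    "document": "macro virus library",
    "webpage": "script virus library",
    "picture": "picture virus library",
    "other": "other virus library",
}


def classify(data: bytes) -> str:
    """Step 904: decide what kind of ordinary file this is (by magic bytes)."""
    if data[:2] == b"MZ":
        return "pe"
    if data[:8] == OLE2_MAGIC:
        return "document"
    if data[:8] == b"\x89PNG\r\n\x1a\n" or data[:3] == b"\xff\xd8\xff":
        return "picture"
    if data.lstrip()[:1] == b"<":
        return "webpage"
    return "other"


def preprocess(payload: bytes, encoding: str) -> Tuple[bytes, str]:
    """Decode, decompress, then pick the virus library to attach."""
    if encoding == "base64":          # step 903: special encoding format
        payload = base64.b64decode(payload)
    if payload[:2] == b"\x1f\x8b":    # step 902: compressed file (gzip magic)
        payload = gzip.decompress(payload)
    # Step 905 (unshelling a packed PE file) would go here; it is omitted.
    return payload, LIBRARY_FOR[classify(payload)]
```

The order of the two `if` blocks mirrors the example, where decoding exposes the archive and decompression exposes the PE file.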
The virus feature matching flow is described in further detail with reference to the flowchart shown in Fig. 10. The method comprises the following steps:
Step 1001: Scan the network data stream using the virus feature tree built from the 1st segments of all viruses and a multi-pattern matching algorithm.
Step 1002: If a match succeeds and the matched pattern is a web page file format feature, attach the script virus library corresponding to web page files.
Step 1003: Detect script viruses embedded in PE and OLE2 files.
Step 1004: Determine whether the virus is described by a non-regular expression or by a regular expression.
Step 1005: If the virus is described by a non-regular expression, calculate the position of the 2nd virus segment and match it.
Step 1006: If the virus is described by a regular expression, attach the virus automaton.
Step 1007: Scan the network data stream using the virus automaton and a single-pattern regular expression algorithm.
The above-described flow of fig. 10 is further illustrated by an application example.
For example: the virus library type is set as a PE virus library, the format characteristic of the webpage file is PHP, and the virus characteristic description mode is a regular expression.
Firstly, matching a feature tree generated by the 1 st virus feature of a PE virus library and a webpage file format feature with a network data stream, wherein the matching is successful, and if the feature is PHP, hanging a script virus library and matching; if the virus characteristic is described by the regular expression, the virus characteristic regular expression automata is hooked and matched.
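The two-stage matching of steps 1001 and 1004 to 1007 can be sketched as follows. This is a simplification under stated assumptions: a naive scan over all first segments stands in for a real multi-pattern algorithm, Python's `re` module stands in for the virus automaton, the second literal segment is assumed to sit exactly at the computed position, and the attaching of the script virus library (steps 1002 and 1003) is left out. `Signature` and `scan` are hypothetical names.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass
class Signature:
    name: str
    first: bytes            # 1st segment, always a literal
    mode: str               # "literal" or "regex"
    second: bytes           # 2nd literal segment, or the remaining regular expression
    second_offset: int = 0  # bytes between the end of the 1st segment and the 2nd


def scan(stream: bytes, signatures: List[Signature]) -> List[str]:
    """Return the names of all signatures whose two stages both match."""
    hits = []
    for sig in signatures:
        start = stream.find(sig.first)            # stage 1: first-segment match
        while start != -1:
            after = start + len(sig.first)
            if sig.mode == "regex":
                # Stage 2 (regex): match the remainder right after the 1st segment.
                matched = re.compile(sig.second, re.DOTALL).match(stream, after) is not None
            else:
                # Stage 2 (literal): compute the 2nd segment's position and compare.
                matched = stream.startswith(sig.second, after + sig.second_offset)
            if matched:
                hits.append(sig.name)
                break
            start = stream.find(sig.first, start + 1)
    return hits
```

Only streams that pass the cheap first-segment check pay for the second stage, which is the point of splitting each feature into two segments.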
While the present invention has been described with respect to the embodiments, those skilled in the art will appreciate that there are numerous variations and permutations of the present invention without departing from the spirit of the invention, and it is intended that the appended claims cover such variations and modifications as fall within the true spirit of the invention.

Claims (6)

1. A network virus detection method based on network data flow is characterized by comprising the following steps:
A. classifying the network viruses according to different host file types;
B. fragmenting the network virus according to different description modes;
C. acquiring different webpage file format characteristics aiming at different webpage file types;
D. recombining a network data stream for a specific number of network data packets;
E. matching the network data stream with the network virus characteristics according to different types of virus characteristic libraries, virus characteristic fragments, webpage file format characteristics and network data streams, and detecting network viruses implicit in the network data streams;
the step B comprises the following steps:
according to the description mode of the network virus feature: for a network virus feature described by a non-regular expression, two segments are randomly extracted, in order, from the feature; for a network virus feature described by a regular expression, one segment is randomly extracted from the part preceding all regular-expression descriptors, and the remaining part is taken as the other segment;
the first of the two network virus feature segments comprises: a virus name, a virus description mode, a file offset, a virus signature code, and a pointer to the second of the two virus feature segments;
the second viral fragment comprises: a virus signature offset and a virus signature code;
the step E comprises the following steps:
E1, according to the file type, attaching the corresponding type of network virus feature library, and scanning the network data stream for the first segments of all network viruses using a multi-pattern matching algorithm;
E2, when the match in E1 succeeds and the corresponding virus is described by a non-regular expression, calculating the position of the second virus segment in the network data stream and scanning the network data stream from that position using a single-pattern matching algorithm;
E3, when the match in E1 succeeds and the corresponding virus is described by a regular expression, scanning the network data stream using a regular expression algorithm with the automaton built from the virus features remaining after the first virus segment;
E4, if the match in E2 or E3 succeeds, making a virus response.
2. The method according to claim 1, wherein the step a comprises:
A1, reading the network virus information base;
A2, decrypting the virus information;
A3, parsing the virus information;
the network virus information comprises: virus name, virus type, file offset and virus signature;
A4, classifying the network viruses into PE viruses, macro viruses, script viruses, picture viruses and other viruses according to the format type of their host files.
3. The method according to claim 1, wherein the step C comprises:
C1, reading the web page file format feature information from the web page file format feature library;
C2, parsing the web page file format feature information;
C3, inserting the web page file format features into the network virus feature data structure so as to detect script viruses embedded in PE and document format files.
4. The method according to claim 1, wherein the step D comprises:
storing the data packets in a cache in sequence order, and incrementing a packet counter by 1;
and when the packet counter is greater than or equal to the sender's packet-count window, or the packet counter is below that window and the elapsed time is greater than or equal to the time window, uploading the data packets and resetting the packet counter to 0 and the initial time to 0.
5. A network virus detection apparatus based on network data flow, comprising:
the virus information base is used for storing information of the network viruses, and comprises virus names, virus types, description modes, offsets and virus characteristic codes;
a web page file format feature library for storing a feature tag identifying a web page file;
the operation parameter library is used for storing the switch parameters and the response modes of the compression, decoding and shelling modules;
the initialization module is used for initializing the operation parameters, initializing the web page file format features, preprocessing the viruses, and creating the virus feature tree, the single-pattern data structure of the virus features and the regular expression automaton;
the virus detection module is used for preprocessing the network data stream, matching viruses across streams with a multi-pattern matching algorithm, matching viruses across streams with a single-pattern regular expression algorithm, and responding;
the cache recovery module is used for reclaiming the cache space allocated while the device is running;
wherein the initialization module first reads, decrypts and parses the network virus information base, the web page file format feature library and the operation parameter library; the virus detection module then preprocesses the network data stream and performs virus feature matching, and responds if a virus is found; finally, the dynamically allocated cache is reclaimed when the process terminates.
6. The apparatus of claim 5, comprising:
the initialization module comprises an operation parameter initialization module, a network virus feature preprocessing module and a web page file format feature initialization module, wherein the network virus feature preprocessing module comprises a module for reading, decrypting, parsing, classifying and fragmenting virus information, a feature tree creation module, and a module for creating the single-pattern data structure and the regular expression automaton;
the virus detection module comprises a network data stream reassembly module, a data stream preprocessing module, a virus feature matching module and a response module; the data stream preprocessing module comprises cross-stream decompression, decoding and virus unshelling modules;
the virus feature matching module comprises a multi-pattern matching module and a single-pattern regular expression matching module.
CN2008101028494A 2008-03-27 2008-03-27 Network virus detecting method based on network data streams and device thereof Active CN101547126B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN2008101028494A CN101547126B (en) 2008-03-27 2008-03-27 Network virus detecting method based on network data streams and device thereof

Publications (2)

Publication Number Publication Date
CN101547126A CN101547126A (en) 2009-09-30
CN101547126B true CN101547126B (en) 2011-10-12

Family

ID=41194035

Family Applications (1)

Application Number Title Priority Date Filing Date
CN2008101028494A Active CN101547126B (en) 2008-03-27 2008-03-27 Network virus detecting method based on network data streams and device thereof

Country Status (1)

Country Link
CN (1) CN101547126B (en)

Families Citing this family (21)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN102045366A (en) * 2011-01-05 2011-05-04 上海北塔软件股份有限公司 Method for actively discovering network attacked by viruses
CN102307189B (en) * 2011-08-18 2014-02-26 华为数字技术(成都)有限公司 Malicious code detection method and network equipment
CN102694801B (en) * 2012-05-21 2015-08-05 华为技术有限公司 Method for detecting virus, device and firewall box
WO2014005303A1 (en) * 2012-07-04 2014-01-09 华为技术有限公司 Anti-virus method and apparatus and firewall device
CN103778370B (en) 2012-10-17 2016-08-24 腾讯科技(深圳)有限公司 Virus document processing method and client device
CN103546448A (en) * 2012-12-21 2014-01-29 哈尔滨安天科技股份有限公司 Network virus detection method and system based on format parsing
CN103580948A (en) * 2012-12-27 2014-02-12 哈尔滨安天科技股份有限公司 Method and device for detecting network based on structural-file index information
CN103095714A (en) * 2013-01-25 2013-05-08 四川神琥科技有限公司 Trojan horse detection method based on Trojan horse virus type classification modeling
CN103246847B (en) * 2013-05-13 2016-03-23 腾讯科技(深圳)有限公司 A kind of method and apparatus of macrovirus killing
CN104283726A (en) * 2013-07-01 2015-01-14 南京理工大学常熟研究院有限公司 P2P flow detecting system based on flow statistical characteristics and fuzzy pattern recognition
CN104424438B (en) * 2013-09-06 2018-03-16 华为技术有限公司 A kind of antivirus file detection method, device and the network equipment
CN104850782B (en) * 2014-02-18 2019-05-14 腾讯科技(深圳)有限公司 Match the method and device of virus characteristic
CN104243486B (en) * 2014-09-28 2018-03-23 中国联合网络通信集团有限公司 A kind of method for detecting virus and system
CN104732148A (en) * 2015-04-14 2015-06-24 北京汉柏科技有限公司 Distributed searching and killing method and system
CN105939314A (en) * 2015-09-21 2016-09-14 杭州迪普科技有限公司 Network protection method and device
CN108090353B (en) * 2017-11-03 2021-09-03 安天科技集团股份有限公司 Knowledge-driven regression detection method and system for shell-added codes
CN109547433A (en) * 2018-11-21 2019-03-29 安徽云融信息技术有限公司 A kind of detection method of internet worm
CN110855719B (en) * 2019-12-13 2021-12-17 成都安恒信息技术有限公司 Low-delay TCP (Transmission control protocol) cross-message firewall detection method
CN114024765B (en) * 2021-11-15 2022-07-22 北京智维盈讯网络科技有限公司 Firewall strategy convergence method based on combination of bypass flow and firewall configuration
CN114244581B (en) * 2021-11-29 2024-03-29 西安四叶草信息技术有限公司 Cache poisoning vulnerability detection method and device, electronic equipment and storage medium
CN114840538A (en) * 2022-04-21 2022-08-02 成都安恒信息技术有限公司 Method and system for testing safety equipment by updating virus library

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
EP1280298A1 (en) * 2001-07-26 2003-01-29 BRITISH TELECOMMUNICATIONS public limited company Method and apparatus of detecting network activity
CN1625121A (en) * 2003-12-05 2005-06-08 中国科学技术大学 A Layered Cooperative Network Virus and Malicious Code Identification Method
CN1909488A (en) * 2006-08-30 2007-02-07 北京启明星辰信息技术有限公司 Virus detection and invasion detection combined method and system
CN101141458A (en) * 2007-10-12 2008-03-12 网经科技(苏州)有限公司 Network data pipelining type analysis process method

Also Published As

Publication number Publication date
CN101547126A (en) 2009-09-30

Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
C14 Grant of patent or utility model
GR01 Patent grant
C41 Transfer of patent application or patent right or utility model
C56 Change in the name or address of the patentee
CP03 Change of name, title or address

Address after: 100193 Beijing city Haidian District Dongbeiwang qimingxingchenmansionproject Building No. 21 West Road No. 8 Zhongguancun Software Park

Patentee after: VENUSTECH GROUP Co.,Ltd.

Address before: 100094, Beijing Haidian District 8 West Road, Zhongguancun Software Park, 21, Venus building

Patentee before: BEIJING VENUSTECH Inc.

TR01 Transfer of patent right

Effective date of registration: 20161110

Address after: 100193 Beijing city Haidian District Dongbeiwang qimingxingchenmansionproject Building No. 21 West Road No. 8 Zhongguancun Software Park

Patentee after: BEIJING VENUSTECH CYBERVISION Co.,Ltd.

Address before: 100193 Beijing city Haidian District Dongbeiwang qimingxingchenmansionproject Building No. 21 West Road No. 8 Zhongguancun Software Park

Patentee before: VENUSTECH GROUP Co.,Ltd.

TR01 Transfer of patent right

Effective date of registration: 20170315

Address after: 100193 Beijing City, Haidian District, northeast Wang West Road, building 21, floor -1-3, floor four, room two, room 21, 2419

Patentee after: Beijing Credit Information Technology Co.,Ltd.

Address before: 100193 Beijing city Haidian District Dongbeiwang qimingxingchenmansionproject Building No. 21 West Road No. 8 Zhongguancun Software Park

Patentee before: BEIJING VENUSTECH CYBERVISION Co.,Ltd.

TR01 Transfer of patent right