
CN107612911B - Method for detecting infected hosts and C&C servers based on DNS traffic

Info

Publication number
CN107612911B
Authority
CN
China
Prior art keywords
domain name
dns
server
information
random
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201710850732.3A
Other languages
Chinese (zh)
Other versions
CN107612911A (en)
Inventor
蔡福杰
范渊
刘元
李凯
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
DBAPPSecurity Co Ltd
Original Assignee
DBAPPSecurity Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
2017-09-20
Filing date
2017-09-20
Publication date
2020-05-01
Application filed by DBAPPSecurity Co Ltd
Priority to CN201710850732.3A
Publication of CN107612911A
Application granted
Publication of CN107612911B
Status: Active

Landscapes

  • Data Exchanges In Wide-Area Networks (AREA)

Abstract

The invention relates to a method for detecting infected hosts and C&C servers based on DNS traffic. The method comprises: constructing a training set and training an algorithm for identifying random domain names; collecting the DNS traffic passing through any network card and parsing it to obtain DNS information; and using the algorithm to judge whether the domain name in the DNS information is a random domain name. If it is, the method judges whether the domain name was resolved successfully: if resolution failed, the infected host is identified and its information is stored; if resolution succeeded, the C&C server is identified and its information is stored. An alarm is then raised, and the alarm information is stored and displayed. Using DNS traffic information, the invention effectively identifies hosts infected by viruses and Trojans that call back using a DGA, together with the C&C servers behind them, with high accuracy. Because DNS traffic is much smaller in volume than the traffic examined by other traffic-based detection means, the cost is lower and the efficiency is higher.

Description

Method for detecting infected hosts and C&C servers based on DNS traffic
Technical Field
The invention relates to the technical field of network security, and in particular to a method for detecting infected hosts and C&C servers based on DNS (Domain Name System) traffic, in the field of APT (Advanced Persistent Threat) detection.
Background
Detecting hosts infected by viruses and Trojans, and the C&C servers behind them (remote command-and-control servers, which issue instructions to infected hosts), has always been an important part of network security research. After being infected by a virus or Trojan, a computer often calls back to the C&C server to obtain new instructions or to transmit collected confidential content to it.
In the past, viruses and Trojans often used fixed domain names for the callback, which were easily detected. Increasingly, however, they use a DGA (domain generation algorithm) to generate the callback domain names: within a given period, hundreds or even thousands of relatively random domain names are generated from a seed such as the date and are accessed one by one. The attacker registers some of these domain names and points them at the C&C server; when the infected host accesses a domain name registered by the attacker, the callback succeeds, which evades some detection techniques.
Accessing these domain names necessarily generates DNS traffic, and a DGA necessarily produces a large number of random domain names whose access fails. Infected hosts and C&C servers can be detected on this basis.
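To make the date-seeded generation described in the Background concrete, the following is a toy sketch only. It does not reproduce any real malware family; the function name `toy_dga`, the use of SHA-256, and the 12-character label length are all assumptions chosen for illustration. It shows why both the infected host and the attacker can derive the same list of domain names from the date alone.

```python
import hashlib
from datetime import date


def toy_dga(seed_date: date, count: int = 5, tld: str = ".com") -> list[str]:
    """Derive `count` pseudo-random domain names from a date seed (illustrative only)."""
    domains = []
    for i in range(count):
        # Hash the date plus a counter so that every index yields a different label.
        digest = hashlib.sha256(f"{seed_date.isoformat()}-{i}".encode()).hexdigest()
        # Map the first 12 hex digits to the letters a..p to get a random-looking label.
        label = "".join(chr(ord("a") + int(c, 16)) for c in digest[:12])
        domains.append(label + tld)
    return domains


if __name__ == "__main__":
    for name in toy_dga(date(2017, 9, 20)):
        print(name)
```

Because the output depends only on the seed, most generated names are never registered, which is the source of the failed lookups the method relies on.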
Disclosure of Invention
The invention aims to discover infected hosts and the C&C servers behind them in a timely manner, and provides a method for detecting infected hosts and C&C servers based on DNS traffic to solve this technical problem.
The technical solution adopted by the invention is a method for detecting infected hosts and C&C servers based on DNS traffic, comprising the following steps:
step 1: constructing a training set, and training to obtain an algorithm for identifying random domain names;
step 2: collecting the DNS traffic passing through any network card using a traffic collection module;
step 3: parsing the collected DNS traffic according to the DNS protocol specification to obtain DNS information; the DNS information comprises the domain name, whether the domain name was resolved successfully, and the client IP;
step 4: judging whether the domain name in the DNS information is a random domain name using the algorithm obtained in step 1; if so, proceeding to the next step; if not, returning to step 2;
step 5: judging whether the domain name was resolved successfully; if resolution failed, proceeding to the next step; if it succeeded, proceeding to step 7;
step 6: identifying the infected host; if the client is an infected host, storing the information and proceeding to step 8; otherwise, returning to step 2;
step 7: identifying the C&C server; if the server is a C&C server, storing the information and proceeding to step 8; otherwise, returning to step 2;
step 8: raising an alarm, and storing and displaying the alarm information; returning to step 2.
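The branching in steps 4 and 5 can be sketched as a small routing function. This is a minimal illustration, not the patented implementation: the `DnsRecord` type, the `route` function, and the returned labels are hypothetical names, and the random-domain classifier is passed in as a callable so the sketch stays independent of how step 1 is realised.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class DnsRecord:
    """Fields named in step 3 (hypothetical container type)."""
    domain: str
    resolved: bool
    client_ip: str
    timestamp: float = 0.0
    server_ip: Optional[str] = None


def route(record: DnsRecord, is_random: Callable[[str], bool]) -> str:
    """Decide which step handles a record, following steps 4 and 5."""
    if not is_random(record.domain):
        return "skip"                    # step 4: not random, go back to step 2
    if not record.resolved:
        return "identify_infected_host"  # step 5: resolution failed, go to step 6
    return "identify_cc_server"          # step 5: resolution succeeded, go to step 7


if __name__ == "__main__":
    looks_random = lambda d: any(ch.isdigit() for ch in d)  # stand-in classifier
    print(route(DnsRecord("x9k2q7.com", False, "10.0.0.5"), looks_random))
```

Keeping the classifier behind a callable means the same routing works whether step 1 yields a weighted score, as in the embodiment, or some other model.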
Preferably, in step 1, the training set includes normal domain names and random domain names generated by random algorithms.
Preferably, the algorithm for identifying random domain names is obtained by assigning weights to the domain name length, the number of digits, the letter-digit switching frequency, the proportion of digits, the maximum length of consecutive letters, and the number of special characters.
Preferably, in step 3, the DNS information further includes the DNS server, the time of the request, and the actual server IP corresponding to a successfully resolved domain name.
Preferably, step 6 comprises the following steps:
step 6.1: obtaining the client IP of the current DNS information;
step 6.2: if, within a time period T, the client corresponding to the client IP continuously produces random domain names that share the same features and fail to resolve, the number of such domain names before and after the time period T is small, and the duration of T is within 30 minutes, the client is considered infected and is regarded as an infected host;
step 6.3: for the infected host, extracting and storing the time period of the random domain names that failed to resolve and the features the domain names share, then proceeding to step 8; otherwise, returning to step 2.
Preferably, in step 6.2, the same features include sharing a partial character string or having the same length.
Preferably, the same features further include having identical second-level domain names but different top-level domain names.
Preferably, in step 6, the stored information includes the infected host IP, the random domain names whose access failed, and the time period of the failed accesses.
Preferably, in step 7, the C&C server is identified as follows: for a successfully resolved domain name, if the client has been identified as an infected host in step 6, the access time of the domain name falls within the time period in which the infected host accessed a large number of random domain names that failed to resolve, and the domain name has the same features as those failed domain names, the actual server IP corresponding to the domain name is identified as the C&C server.
Preferably, the stored information includes the client IP that accessed the C&C server, the time of access, the domain name accessed, and the C&C server IP, and the infected host is associated with the C&C server.
The invention provides a method for detecting infected hosts and C&C servers based on DNS traffic. Using DNS traffic information, it effectively identifies hosts infected by viruses and Trojans that call back using a DGA, together with the C&C servers behind them, with high accuracy. Because DNS traffic is much smaller in volume than the traffic examined by other traffic-based detection means, the invention costs less and is more efficient.
Drawings
FIG. 1 is a flow chart of obtaining the algorithm for identifying random domain names according to the invention;
FIG. 2 is a flow chart of detecting infected hosts and C&C servers based on DNS traffic according to the invention.
Detailed Description
It should be noted that the method for detecting infected hosts and C&C servers through DNS traffic according to the invention is an application of computer technology in the field of information security. The applicant believes that, after carefully reading this application and correctly understanding the principle and purpose of the invention, a person skilled in the art can fully implement the invention by combining it with the prior art and applying his or her own software programming skills. Everything mentioned in this application falls within this scope, and the applicant does not enumerate each item.
The present invention will be described in further detail with reference to the accompanying drawings and specific embodiments.
The invention relates to a method for detecting infected hosts and C&C servers based on DNS traffic, comprising the following steps.
Step 1: construct a training set, and train to obtain an algorithm for identifying random domain names.
In step 1, the training set includes normal domain names and random domain names generated by random algorithms.
The algorithm for identifying random domain names is obtained by assigning weights to the domain name length, the number of digits, the letter-digit switching frequency, the proportion of digits, the maximum length of consecutive letters, and the number of special characters.
In the invention, the first 50 foreign websites and the first 10 domestic websites in the Alexa ranking are collected as the training set of normal domain names, and several DGA algorithms are collected and used to generate 100,000 DGA domain names as the training set of random domain names.
In the invention, the second-level domain name of each domain name is extracted. Based on experience, a reasonably suitable weight is assigned to each of the second-level domain name's length, number of digits, letter-digit switching frequency, proportion of digits, maximum length of consecutive letters, and number of special characters. The weights are then adjusted continuously by an algorithm until an algorithm that can distinguish normal domain names from random domain names is obtained.
The invention provides one embodiment of the algorithm for identifying random domain names; in practice, a technician may adjust it or add settings according to the technical requirements. Let the domain name length be a1, the number of digits a2, the letter-digit switching frequency a3, the proportion of digits a4, the maximum length of consecutive letters a5, and the number of special characters a6. Compute x = x1*a1 + (x2*a2 + x3*a3 + x4*a4 + x5*a5 + x6*a6)/a1, where x1, x2, x3, x4, x5 and x6 are weights that are updated continuously according to the sample domain names. Whether the domain name is random is finally judged by whether x reaches a preset threshold y.
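The six features and the score x from this embodiment can be sketched as follows. The exact feature definitions are assumptions where the text leaves them open: the switching frequency is counted here as the number of adjacent letter/digit transitions, special characters are taken to be anything non-alphanumeric, and the second-level label is naively the label before the last dot (which ignores multi-part suffixes such as `.co.uk`). The function names, the weights, and the threshold in the usage lines are hypothetical placeholders, not trained values.

```python
def second_level_label(domain: str) -> str:
    """Naive second-level label: the part just before the last dot (assumption)."""
    parts = domain.lower().rstrip(".").split(".")
    return parts[-2] if len(parts) >= 2 else parts[0]


def features(label: str) -> list:
    """Return [a1..a6] for a label, following the feature list in the description."""
    a1 = len(label)                                   # length
    a2 = sum(c.isdigit() for c in label)              # number of digits
    a3 = sum(                                         # letter/digit switches
        1 for p, c in zip(label, label[1:])
        if (p.isdigit() and c.isalpha()) or (p.isalpha() and c.isdigit())
    )
    a4 = a2 / a1 if a1 else 0.0                       # proportion of digits
    longest = run = 0
    for c in label:                                   # longest run of letters
        run = run + 1 if c.isalpha() else 0
        longest = max(longest, run)
    a5 = longest
    a6 = sum(not c.isalnum() for c in label)          # special characters
    return [a1, a2, a3, a4, a5, a6]


def score(label: str, weights) -> float:
    """x = x1*a1 + (x2*a2 + x3*a3 + x4*a4 + x5*a5 + x6*a6) / a1."""
    a1, a2, a3, a4, a5, a6 = features(label)
    if a1 == 0:
        return 0.0
    x1, x2, x3, x4, x5, x6 = weights
    return x1 * a1 + (x2 * a2 + x3 * a3 + x4 * a4 + x5 * a5 + x6 * a6) / a1


def is_random(domain: str, weights, threshold: float) -> bool:
    """Judge a domain random when its score reaches the threshold y."""
    return score(second_level_label(domain), weights) >= threshold


if __name__ == "__main__":
    w = (0.0, 1.0, 1.0, 1.0, 0.0, 0.0)  # placeholder weights, not trained
    print(score(second_level_label("www.a1b2c3.com"), w))
```

In practice the weights and the threshold would come out of the training described in step 1; the values above only exercise the arithmetic.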
Step 2: collect the DNS traffic passing through any network card using a traffic collection module.
In the invention, the traffic collection module uses the Libpcap library to capture the DNS packets flowing through the network card.
Step 3: parse the collected DNS traffic according to the DNS protocol specification to obtain DNS information; the DNS information comprises the domain name, whether the domain name was resolved successfully, and the client IP.
In step 3, the DNS information further includes the DNS server, the time of the request, and the actual server IP corresponding to a successfully resolved domain name.
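As an illustration of step 3, the sketch below parses the payload of a DNS response using the message layout from RFC 1035 (12-byte header, question section, answer records, name compression). It is a simplified sketch under several assumptions: the message carries exactly one question, only IPv4 A records are extracted, and "successfully resolved" is taken to mean RCODE 0 with at least one A record. The names `read_name` and `parse_dns_response` are hypothetical. The client IP and DNS server IP come from the enclosing IP header, which this sketch does not handle.

```python
import socket
import struct


def read_name(msg: bytes, offset: int) -> tuple[str, int]:
    """Decode a possibly compressed DNS name; return (name, offset after the name)."""
    labels = []
    end = None   # offset just past the name at its original position
    hops = 0
    while True:
        length = msg[offset]
        if length & 0xC0 == 0xC0:            # two-byte compression pointer
            if end is None:
                end = offset + 2
            offset = ((length & 0x3F) << 8) | msg[offset + 1]
            hops += 1
            if hops > 20:                    # guard against pointer loops
                raise ValueError("too many compression pointers")
            continue
        offset += 1
        if length == 0:                      # root label terminates the name
            break
        labels.append(msg[offset:offset + length].decode("ascii", "replace"))
        offset += length
    return ".".join(labels), (end if end is not None else offset)


def parse_dns_response(msg: bytes) -> dict:
    """Extract the queried domain, the RCODE, and any A-record addresses."""
    _id, flags, _qd, ancount, _ns, _ar = struct.unpack("!6H", msg[:12])
    domain, offset = read_name(msg, 12)
    offset += 4                              # skip QTYPE and QCLASS
    ips = []
    for _ in range(ancount):
        _, offset = read_name(msg, offset)
        rtype, _rclass, _ttl, rdlength = struct.unpack("!HHIH", msg[offset:offset + 10])
        offset += 10
        if rtype == 1 and rdlength == 4:     # A record carrying an IPv4 address
            ips.append(socket.inet_ntoa(msg[offset:offset + 4]))
        offset += rdlength
    rcode = flags & 0x000F
    return {
        "domain": domain,
        "rcode": rcode,
        "resolved": rcode == 0 and bool(ips),  # assumption: success means an address came back
        "server_ips": ips,
    }
```

An NXDOMAIN reply (RCODE 3) is the typical failed lookup for an unregistered DGA name and yields `resolved` equal to `False` here.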
Step 4: use the algorithm for identifying random domain names obtained in step 1 to judge whether the domain name in the DNS information is a random domain name; if so, proceed to the next step; if not, return to step 2.
Step 5: judge whether the domain name was resolved successfully; if resolution failed, proceed to the next step; if it succeeded, proceed to step 7.
Step 6: identify the infected host; if the client is an infected host, store the information and proceed to step 8; otherwise, return to step 2.
The step 6 includes the following steps.
Step 6.1: the client IP of the current DNS information is obtained.
Step 6.2: if, within a time period T, the client corresponding to the client IP continuously produces random domain names that share the same features and fail to resolve, the number of such domain names before and after the time period T is small, and the duration of T is within 30 minutes, the client is considered infected and is regarded as an infected host.
In step 6.2, the same features include sharing a partial character string or having the same length.
The same features also include having identical second-level domain names but different top-level domain names.
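The three "same feature" criteria can be sketched as a pairwise check. The description does not say how long a shared partial string must be, so the `min_shared` parameter (default 4) is an assumption, as are the function names and the naive split into second-level label and top-level domain.

```python
from difflib import SequenceMatcher


def split_domain(domain: str) -> tuple[str, str]:
    """Return (second-level label, top-level domain) using a naive split on dots."""
    parts = domain.lower().rstrip(".").split(".")
    if len(parts) < 2:
        return parts[0], ""
    return parts[-2], parts[-1]


def shared_substring_len(a: str, b: str) -> int:
    """Length of the longest contiguous substring common to both strings."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    return matcher.find_longest_match(0, len(a), 0, len(b)).size


def same_feature(d1: str, d2: str, min_shared: int = 4) -> bool:
    """True if two domains share one of the features listed for step 6.2."""
    s1, t1 = split_domain(d1)
    s2, t2 = split_domain(d2)
    if s1 == s2 and t1 != t2:        # same second-level name, different TLD
        return True
    if len(s1) == len(s2):           # same length
        return True
    return shared_substring_len(s1, s2) >= min_shared  # shared partial string


if __name__ == "__main__":
    print(same_feature("qwertyui.com", "asdfghjk.net"))
```

Equal length alone is a weak signal for ordinary domains, which is why the method applies this check only to names already judged random in step 4.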
Step 6.3: for the infected host, extract and store the time period of the random domain names that failed to resolve and the features the domain names share, then proceed to step 8; otherwise, return to step 2.
In step 6, the stored information includes the infected host IP, the random domain names whose access failed, and the time period of the failed accesses.
In the invention, the random domain names that failed to resolve are classified and stored by the client IP obtained in step 3, and the same domain name is counted only once.
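A per-client store that counts each failed domain only once might look like the sketch below. The class name `FailedDomainStore` and its methods are hypothetical, and keeping the first-seen timestamp for each domain is an assumption made so that later steps can reason about time periods.

```python
from collections import defaultdict


class FailedDomainStore:
    """Failed random domains grouped by client IP, each domain kept once."""

    def __init__(self):
        # client_ip -> {normalised domain: first-seen timestamp}
        self._by_client = defaultdict(dict)

    def add(self, client_ip: str, domain: str, timestamp: float) -> bool:
        """Record a failure; return True only the first time this client hits the domain."""
        seen = self._by_client[client_ip]
        key = domain.lower().rstrip(".")
        if key in seen:
            return False
        seen[key] = timestamp
        return True

    def timestamps(self, client_ip: str) -> list:
        """Sorted first-seen times of the distinct failed domains for a client."""
        return sorted(self._by_client.get(client_ip, {}).values())


if __name__ == "__main__":
    store = FailedDomainStore()
    store.add("10.0.0.5", "kqxzvbnm.com", 100.0)
    store.add("10.0.0.5", "KQXZVBNM.com.", 130.0)   # repeat, ignored
    print(store.timestamps("10.0.0.5"))
```

Deduplicating before counting prevents a client that retries one dead name many times from looking like a DGA burst.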
In the invention, to prevent misjudgment, a client that appears infected is generally first treated as a suspected infected host and observed continuously until the number of random domain names whose access fails drops significantly; if the duration is within a preset threshold (e.g. 30 minutes), the client is considered infected and is regarded as an infected host.
For example, if the client corresponding to the client IP fails to resolve random domain names with the same features more than 20 times in any 5-minute period, fewer than 2 such failures occur in the 5-minute periods before and after it, and the duration T is within 30 minutes (5 minutes is within 30 minutes), the client may be considered infected and regarded as an infected host.
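The example thresholds above can be turned into a burst detector. This sketch is one possible reading, not the patented procedure: time is cut into fixed 5-minute bins, a bin is "busy" when it holds more than 20 failures, consecutive busy bins form a burst, and the burst qualifies when the bins on either side hold fewer than 2 failures and the burst lasts at most 30 minutes. The function name `find_burst` and its parameter names are hypothetical; the input is assumed to be the timestamps (in seconds) of distinct failed random domains with the same features for one client.

```python
from collections import Counter


def find_burst(times, window=300, min_hits=20, max_quiet=2, max_duration=1800):
    """Return (start, end) in seconds of a qualifying burst, or None."""
    if not times:
        return None
    counts = Counter(int(t // window) for t in times)      # failures per bin
    busy = sorted(b for b, n in counts.items() if n > min_hits)
    i = 0
    while i < len(busy):
        j = i
        while j + 1 < len(busy) and busy[j + 1] == busy[j] + 1:
            j += 1                                         # extend run of busy bins
        first, last = busy[i], busy[j]
        duration = (last - first + 1) * window
        quiet = (counts.get(first - 1, 0) < max_quiet
                 and counts.get(last + 1, 0) < max_quiet)
        if quiet and duration <= max_duration:
            return first * window, (last + 1) * window
        i = j + 1
    return None


if __name__ == "__main__":
    burst = [600 + 10 * k for k in range(25)]   # 25 failures inside one 5-minute bin
    print(find_burst(burst))
```

Fixed bins are coarse: a burst that straddles a bin boundary can be split below the threshold, so a sliding window would be a natural refinement. The returned interval is what step 6.3 would store as the time period of the failed domains.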
In the invention, the information saved in step 6 can be used to identify the C&C server.
Step 7: identify the C&C server; if the server is a C&C server, store the information and proceed to step 8; otherwise, return to step 2.
In step 7, the C&C server is identified as follows: for a successfully resolved domain name, if the client has been identified as an infected host in step 6, the access time of the domain name falls within the time period in which the infected host accessed a large number of random domain names that failed to resolve, and the domain name has the same features as those failed domain names, the actual server IP corresponding to the domain name is identified as the C&C server.
The stored information includes the client IP that accessed the C&C server, the time of access, the domain name accessed, and the C&C server IP, and the infected host is associated with the C&C server.
In the invention, in step 7, if the client IP that accessed a suspected C&C domain name has not yet been identified as an infected host, the record is kept for a period of time; if the client IP is identified as an infected host within that period, whether the domain name is a C&C domain name is then determined.
In the invention, in principle, step 6 and step 7 are not performed on the same record. Different records may nevertheless share a client IP, some of them involving the infected host and others the C&C server. When other records with the same client IP have already caused that client IP to be identified as an infected host in step 6, with the features and time range of the failed domain names extracted, whether the current record points to a C&C server is determined by checking whether its domain name matches the extracted domain name features and time range.
In the invention, association means that the C&C server is associated with the infected host through the client IP.
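The step 7 matching logic, including the deferred check for clients not yet known to be infected, can be sketched as follows. All names here (`InfectionProfile`, `CcDetector`, `check`, `mark_infected`) are hypothetical. The feature comparison is injected as a callable, a single sample domain stands in for the stored features, and expiry of held records after "a period of time" is left out for brevity.

```python
from dataclasses import dataclass, field
from typing import Callable, Optional


@dataclass
class InfectionProfile:
    """What step 6.3 stores for an infected host (simplified)."""
    start: float
    end: float
    sample_domain: str   # one failed domain representing the shared features


@dataclass
class CcDetector:
    same_feature: Callable[[str, str], bool]
    infected: dict = field(default_factory=dict)   # client_ip -> InfectionProfile
    pending: list = field(default_factory=list)    # records awaiting a verdict

    def check(self, client_ip, domain, timestamp, server_ip) -> Optional[dict]:
        """Handle one successfully resolved random domain."""
        profile = self.infected.get(client_ip)
        if profile is None:
            # Client not (yet) known as infected: hold the record for later.
            self.pending.append((client_ip, domain, timestamp, server_ip))
            return None
        in_window = profile.start <= timestamp <= profile.end
        if in_window and self.same_feature(domain, profile.sample_domain):
            # Associate the C&C server with the infected host via the client IP.
            return {"client_ip": client_ip, "time": timestamp,
                    "domain": domain, "cc_server_ip": server_ip}
        return None

    def mark_infected(self, client_ip, profile: InfectionProfile) -> list:
        """Register an infected host and re-check the records held for it."""
        self.infected[client_ip] = profile
        held = [r for r in self.pending if r[0] == client_ip]
        self.pending = [r for r in self.pending if r[0] != client_ip]
        hits = []
        for record in held:
            hit = self.check(*record)
            if hit is not None:
                hits.append(hit)
        return hits
```

A usage pattern would be to call `check` for every record routed to step 7 and `mark_infected` whenever step 6 confirms a host; either call can yield the alarm data for step 8.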
Step 8: raise an alarm, and store and display the alarm information; return to step 2.
Finally, it should be noted that the above is only a specific embodiment of the invention. Obviously, the invention is not limited to the above embodiment, and many variations are possible. All variations that a person skilled in the art can derive directly from, or associate with, the disclosure of the invention shall be considered within the scope of protection of the invention.

Claims (9)

1. A method for detecting infected hosts and C&C servers based on DNS traffic, characterized in that the method comprises the following steps:
step 1: constructing a training set, and training to obtain an algorithm for identifying random domain names;
step 2: collecting the DNS traffic passing through any network card using a traffic collection module;
step 3: parsing the collected DNS traffic according to the DNS protocol specification to obtain DNS information; the DNS information comprises the domain name, whether the domain name was resolved successfully, and the client IP;
step 4: judging whether the domain name in the DNS information is a random domain name using the algorithm obtained in step 1; if so, proceeding to the next step; if not, returning to step 2;
step 5: judging whether the domain name was resolved successfully; if resolution failed, proceeding to the next step; if it succeeded, proceeding to step 7;
step 6: identifying the infected host; if the client is an infected host, storing the information and proceeding to step 8; otherwise, returning to step 2;
step 6 comprises the following steps:
step 6.1: obtaining the client IP of the current DNS information;
step 6.2: if, within a time period T, the client corresponding to the client IP continuously produces random domain names that share the same features and fail to resolve, the number of such domain names before and after the time period T is small, and the duration of T is within 30 minutes, the client is considered infected and is regarded as an infected host;
step 6.3: for the infected host, extracting and storing the time period of the random domain names that failed to resolve and the features the domain names share, then proceeding to step 8; otherwise, returning to step 2;
step 7: identifying the C&C server; if the server is a C&C server, storing the information and proceeding to step 8; otherwise, returning to step 2;
step 8: raising an alarm, and storing and displaying the alarm information; returning to step 2.
2. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: in step 1, the training set includes normal domain names and random domain names generated by random algorithms.
3. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: the algorithm for identifying random domain names is obtained by assigning weights to the domain name length, the number of digits, the letter-digit switching frequency, the proportion of digits, the maximum length of consecutive letters, and the number of special characters.
4. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: in step 3, the DNS information further includes the DNS server, the time of the request, and the actual server IP corresponding to a successfully resolved domain name.
5. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 4, wherein: in step 6.2, the same features include sharing a partial character string or having the same length.
6. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 5, wherein: the same features also include having identical second-level domain names but different top-level domain names.
7. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: in step 6, the stored information includes the infected host IP, the random domain names whose access failed, and the time period of the failed accesses.
8. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: in step 7, the C&C server is identified as follows: for a successfully resolved domain name, if the client has been identified as an infected host in step 6, the access time of the domain name falls within the time period in which the infected host accessed a large number of random domain names that failed to resolve, and the domain name has the same features as those failed domain names, the actual server IP corresponding to the domain name is identified as the C&C server.
9. The method for detecting infected hosts and C&C servers based on DNS traffic of claim 1, wherein: the stored information includes the client IP that accessed the C&C server, the time of access, the domain name accessed, and the C&C server IP, and the infected host is associated with the C&C server.
Priority Applications (1)

Application Number | Priority Date | Filing Date | Title
CN201710850732.3A (CN107612911B) | 2017-09-20 | 2017-09-20 | Method for detecting infected hosts and C&C servers based on DNS traffic

Publications (2)

Publication Number | Publication Date
CN107612911A | 2018-01-19
CN107612911B | 2020-05-01

Family

ID=61060185

Country Status (1)

Country Link
CN (1) CN107612911B (en)

Families Citing this family (5)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
CN107645503B (en) * | 2017-09-20 | 2020-01-24 | 杭州安恒信息技术股份有限公司 | A rule-based detection method for malicious domain names belonging to DGA family
CN109120733B (en) * | 2018-07-20 | 2021-06-01 | 杭州安恒信息技术股份有限公司 | A detection method using DNS for communication
CN109474593B (en) * | 2018-11-09 | 2021-04-20 | 杭州安恒信息技术股份有限公司 | Method for identifying C&C periodic loop back connection behaviors
CN113315737A (en) * | 2020-02-26 | 2021-08-27 | 深信服科技股份有限公司 | APT attack detection method and device, electronic equipment and readable storage medium
CN111654487B (en) * | 2020-05-26 | 2022-04-19 | 南京云利来软件科技有限公司 | DGA domain name identification method based on bypass network full flow and behavior characteristics

Citations (4)

* Cited by examiner, † Cited by third party
Publication number | Priority date | Publication date | Assignee | Title
WO2007010395A2 (en) * | 2005-07-22 | 2007-01-25 | Alcatel Lucent | Dns based enforcement for confinement and detection of network malicious activities
CN105072214A (en) * | 2015-08-28 | 2015-11-18 | 携程计算机技术(上海)有限公司 | C&C domain name identification method based on domain name feature
CN105577660A (en) * | 2015-12-22 | 2016-05-11 | 国家电网公司 | DGA domain name detection method based on random forest
CN106576058A (en) * | 2014-08-22 | 2017-04-19 | 迈克菲股份有限公司 | System and method to detect domain generation algorithm malware and systems infected by such malware

Also Published As

Publication number Publication date
CN107612911A (en) 2018-01-19

Legal Events

Code Title
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information
    Address after: Zhejiang Zhongcai Building No. 68 Binjiang District road Hangzhou City, Zhejiang Province, the 310052 and 15 layer
    Applicant after: Dbappsecurity Co.,Ltd.
    Address before: Zhejiang Zhongcai Building No. 68 Binjiang District road Hangzhou City, Zhejiang Province, the 310052 and 15 layer
    Applicant before: DBAPPSECURITY Co.,Ltd.
GR01 Patent grant
EE01 Entry into force of recordation of patent licensing contract
    Application publication date: 20180119
    Assignee: Hangzhou Anheng Information Security Technology Co.,Ltd.
    Assignor: Dbappsecurity Co.,Ltd.
    Contract record no.: X2024980043369
    Denomination of invention: Method for detecting infected hosts and C&C servers based on DNS traffic
    Granted publication date: 20200501
    License type: Common License
    Record date: 20241231