
CN106027564B - Method and device for detecting the safety of an anti-crawler strategy - Google Patents

Method and device for detecting the safety of an anti-crawler strategy

Info

Publication number
CN106027564B
Authority
CN
China
Prior art keywords
crawler
data
page
target object
security policy
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201610537443.3A
Other languages
Chinese (zh)
Other versions
CN106027564A (en)
Inventor
崔广宇
李巍
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ctrip Computer Technology Shanghai Co Ltd
Original Assignee
Ctrip Computer Technology Shanghai Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ctrip Computer Technology Shanghai Co Ltd filed Critical Ctrip Computer Technology Shanghai Co Ltd
Priority to CN201610537443.3A priority Critical patent/CN106027564B/en
Publication of CN106027564A publication Critical patent/CN106027564A/en
Application granted granted Critical
Publication of CN106027564B publication Critical patent/CN106027564B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L63/00 Network architectures or network communication protocols for network security
    • H04L63/20 Network architectures or network communication protocols for network security for managing network security; network security policies in general
    • H ELECTRICITY
    • H04 ELECTRIC COMMUNICATION TECHNIQUE
    • H04L TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
    • H04L67/00 Network arrangements or protocols for supporting network services or applications
    • H04L67/01 Protocols
    • H04L67/02 Protocols based on web technology, e.g. hypertext transfer protocol [HTTP]

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Security & Cryptography (AREA)
  • Computer Networks & Wireless Communication (AREA)
  • Signal Processing (AREA)
  • Computer Hardware Design (AREA)
  • Computing Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Information Transfer Between Computers (AREA)

Abstract

The invention discloses a method and device for detecting the safety of an anti-crawler strategy. The method comprises: embedding, in a first front-end page of a website, anti-crawler code that implements an anti-crawler strategy; using the anti-crawler code to detect whether a user accessing the first front-end page is a crawler, and recording each user detected as a crawler as a target object; verifying whether the target object is in fact a crawler, and counting the number of target objects that are not crawlers; and calculating, from that number, the accidental injury rate (that is, the false-positive rate) of the anti-crawler strategy, which is used to measure the safety of the strategy. The invention remedies the shortcoming of the prior art that inadequate safety testing of anti-crawler strategies can harm the system. It detects the safety of an anti-crawler strategy accurately, so that the strategy can be corrected or updated in time, prevents an unsafe strategy from affecting the stability of the online system, and keeps the system stable while crawlers are being detected.

Description

Method and device for detecting the safety of an anti-crawler strategy
Technical field
The present invention relates to a method and device for detecting the safety of an anti-crawler strategy.
Background art
The volume of crawler traffic on the internet keeps growing, and crawlers are varied and constantly evolving. Anti-crawler mechanisms therefore face increasingly severe challenges, and new anti-crawler strategies must be released frequently to deal with new crawlers. However, every release of a new anti-crawler strategy can affect the stability of the online system. If the strategy is not safe enough, it damages the online system while blocking crawlers, and the loss outweighs the gain.
Summary of the invention
The technical problem to be solved by the present invention is to overcome the defect in the prior art that inadequate safety testing of anti-crawler strategies can harm the system, by providing a method and device that can accurately detect the safety of an anti-crawler strategy.
The present invention solves the above technical problem through the following technical solutions:
The present invention provides a method for detecting the safety of an anti-crawler strategy, characterized in that the method comprises:
S1: embedding, in a first front-end page of a website, anti-crawler code for implementing an anti-crawler strategy;
S2: using the anti-crawler code to detect whether a user accessing the first front-end page is a crawler, and recording each user detected as a crawler as a target object;
S3: verifying whether the target object is a crawler, and counting the number of times a target object is not a crawler;
S4: calculating the accidental injury rate of the anti-crawler strategy from that number, the accidental injury rate being used to measure the safety of the anti-crawler strategy.
Here, the anti-crawler strategy is implemented by the anti-crawler code. The counted number of times a target object is not a crawler is the number of times the anti-crawler code mistakenly detected a non-crawler user as a crawler, that is, the number of detection errors (false positives). This technical solution measures the safety of the anti-crawler strategy by calculating its accidental injury rate. If the safety is high, the strategy can be deployed to the online system; if the safety is low, the strategy can be corrected or updated in time. This prevents an unsafe strategy from affecting the stability of the online system and keeps the system stable while crawlers are being detected.
Preferably, S3 verifies whether the target object is a crawler through the following step:
S31: determining whether the target object accesses a second front-end page of the website; if so, the target object is not a crawler; if not, the target object is a crawler.
Here, the second front-end page is related to the first front-end page and is a page that a crawler usually does not access after accessing the first front-end page. If the target object accesses the second front-end page, the target object is not a crawler and the anti-crawler code detected it incorrectly; if the target object does not access the second front-end page, the target object is a crawler and the anti-crawler code detected it correctly. This technical solution verifies efficiently and accurately whether the target object is a crawler, and thereby further verifies the safety of the anti-crawler code.
Preferably, the anti-crawler code comprises a front-end portion, the front-end portion comprises code for detecting crawlers, and S1 comprises:
S11: embedding a first page in the first front-end page of the website;
S12: configuring the code for detecting crawlers into the first page.
In this technical solution, the code for detecting crawlers is configured into the first page, which is in turn embedded in the first front-end page. Even if the code or the first page contains an error, the display of the first front-end page is not affected, which makes the code easy to change.
Preferably, the anti-crawler code further comprises a back-end portion, which is used to write first data to the client, to intercept the first data at the second front-end page, and to count the total number of first data intercepted;
S1 further comprises: writing the first data to the client using the back-end portion;
S3 comprises: intercepting the first data at the second front-end page using the back-end portion, and counting the total number of first data intercepted;
The accidental injury rate equals the total number / the total number of visitors to the second front-end page.
This technical solution determines whether the target object has accessed the second front-end page by checking whether the first data is intercepted there. If the second front-end page intercepts the first data, the target object has accessed the second front-end page and is not a crawler; if it does not, the target object has not accessed the second front-end page and is a crawler. The total number of first data intercepted equals the number of target objects that are not crawlers. The higher the accidental injury rate, the lower the safety of the anti-crawler strategy; the lower the accidental injury rate, the higher its safety.
Preferably, S11 further comprises: setting the probability with which the first page is embedded in the first front-end page;
The accidental injury rate equals the total number / the probability / the total number of visitors to the second front-end page.
Preferably, S1 further comprises: setting a validity time for the first data, the first data expiring once the validity time is exceeded.
Preferably, different anti-crawler strategies correspond to different first data.
This technical solution is particularly suitable for testing multiple anti-crawler strategies at the same time: different first data is set for each anti-crawler strategy (anti-crawler code), the totals of the different first data intercepted at the second front-end page are counted separately, and the safety of each strategy is determined by distinguishing the first data.
Preferably, the method further comprises:
S5: determining whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe.
The present invention also provides a device for detecting the safety of an anti-crawler strategy, characterized in that the device comprises:
An embedding unit, used to embed, in a first front-end page of a website, anti-crawler code for implementing an anti-crawler strategy;
A detection unit, used to detect, with the anti-crawler code, whether a user accessing the first front-end page is a crawler, and to record each user detected as a crawler as a target object;
A verification unit, used to verify whether the target object is a crawler and to count the number of times a target object is not a crawler;
A computing unit, used to calculate the accidental injury rate of the anti-crawler strategy from that number, the accidental injury rate being used to measure the safety of the anti-crawler strategy.
Preferably, the verification unit verifies whether the target object is a crawler through the following module:
A judgment module, used to determine whether the target object accesses a second front-end page of the website; if so, the target object is not a crawler; if not, the target object is a crawler.
Preferably, the anti-crawler code comprises a front-end portion, the front-end portion comprises code for detecting crawlers, and the embedding unit comprises:
A page module, used to embed a first page in the first front-end page of the website;
A configuration module, used to configure the code for detecting crawlers into the first page.
Preferably, the anti-crawler code further comprises a back-end portion, which is used to write first data to the client, to intercept the first data at the second front-end page, and to count the total number of first data intercepted;
The embedding unit further comprises:
A data module, used to write the first data to the client using the back-end portion;
The verification unit comprises:
An interception module, used to intercept the first data at the second front-end page using the back-end portion and to count the total number of first data intercepted;
The accidental injury rate equals the total number / the total number of visitors to the second front-end page.
Preferably, the page module is also used to set the probability with which the first page is embedded in the first front-end page;
The accidental injury rate equals the total number / the probability / the total number of visitors to the second front-end page.
Preferably, the data module is also used to set a validity time for the first data, the first data expiring once the validity time is exceeded.
Preferably, different anti-crawler strategies correspond to different first data.
Preferably, the device further comprises:
A comparing unit, used to determine whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe.
On the basis of common knowledge in the art, the above preferred conditions can be combined arbitrarily to obtain the preferred embodiments of the present invention.
The positive effect of the present invention is as follows: the present invention measures the safety of an anti-crawler strategy by calculating its accidental injury rate. It detects the safety of the strategy accurately, so that the strategy can be corrected or updated in time, prevents an unsafe strategy from affecting the stability of the online system, and keeps the system stable while crawlers are being detected.
Brief description of the drawings
Fig. 1 is a flowchart of the method for detecting the safety of an anti-crawler strategy according to preferred Embodiment 1 of the present invention.
Fig. 2 is a schematic diagram of the device for detecting the safety of an anti-crawler strategy according to preferred Embodiment 2 of the present invention.
Detailed description of the embodiments
The present invention is further illustrated below by way of embodiments, but is not thereby limited to the scope of those embodiments.
Embodiment 1
A method for detecting the safety of an anti-crawler strategy, as shown in Fig. 1, comprises:
Step 101: writing anti-crawler code for implementing an anti-crawler strategy. The anti-crawler code comprises a front-end portion and a back-end portion:
The front-end portion comprises code for detecting crawlers;
The back-end portion is used to write first data to the client and to set the validity time of the first data, the first data expiring once the validity time is exceeded. In a specific implementation, the first data can be realized with a cookie (sometimes written in the plural, cookies: data stored on the user's local terminal, i.e. the client); both the value and the validity time of the cookie can be customized;
The back-end portion is also used to intercept the first data at the second front-end page and to count the total number of first data intercepted.
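As a rough illustration of the server-side write step, the sketch below builds a Set-Cookie header value whose value and validity time are configurable. The cookie name `ac_probe`, the 30-minute lifetime, and the function name are hypothetical; the text says only that a cookie with a customizable value and validity time can be used. The sketch also assumes the cookie is written only for clients that the detection code has flagged, which the text implies but does not spell out.

```python
from http.cookies import SimpleCookie

# Hypothetical names and values, chosen only for illustration.
MARKER_COOKIE_NAME = "ac_probe"   # assumed name of the "first data" cookie
MARKER_TTL_SECONDS = 30 * 60      # assumed validity time (30 minutes)


def build_marker_cookie(strategy_id: str, ttl_seconds: int = MARKER_TTL_SECONDS) -> str:
    """Return a Set-Cookie header value marking a client flagged by `strategy_id`."""
    cookie = SimpleCookie()
    cookie[MARKER_COOKIE_NAME] = strategy_id
    # Max-Age makes the browser discard the cookie once the validity time passes.
    cookie[MARKER_COOKIE_NAME]["max-age"] = ttl_seconds
    # Path "/" lets the second front-end page receive the cookie as well.
    cookie[MARKER_COOKIE_NAME]["path"] = "/"
    return cookie[MARKER_COOKIE_NAME].OutputString()


header_value = build_marker_cookie("strategy-v2")
```

A short validity time keeps a stale marker from being counted long after the detection event; the right value depends on how quickly real users typically move from the first page to the second.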
Step 102: embedding a first page in the first front-end page of the website and setting the probability with which the first page is embedded in the first front-end page. The first front-end page is typically a front-end page with a large volume of random visits, such as the hotel details page of an online travel website. The first page can be realized as an iframe (an HTML tag) page whose JavaScript (an interpreted scripting language) code is dynamically configurable; because the iframe page is an independent sandbox, a JavaScript error in it does not affect the parent page (i.e. the first front-end page). The probability is the probability with which the first page is embedded in the first front-end page and controls how often the first page appears. For example, given 100 first front-end pages, a probability of 10% means that 10 of the 100 are selected to embed the first page, and a probability of 1 means that the first page is embedded in all 100.
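The probabilistic embedding in step 102 can be sketched as follows. This is a minimal illustration, not the patented implementation: it assumes an independent random draw per page render (so a probability of 10% yields about 10 embeddings per 100 renders on average rather than exactly 10), and the iframe URL and function name are hypothetical.

```python
import random

# Hypothetical URL of the iframe page that hosts the detection script.
PROBE_IFRAME_URL = "/ac-probe.html"


def render_first_page(body_html: str, probability: float, rng: random.Random) -> str:
    """Return the first front-end page, embedding the probe iframe with probability P."""
    if not 0.0 <= probability <= 1.0:
        raise ValueError("probability must be between 0 and 1")
    # rng.random() lies in [0, 1), so probability == 1 always embeds the iframe
    # and probability == 0 never does.
    if rng.random() < probability:
        iframe = f'<iframe src="{PROBE_IFRAME_URL}" style="display:none"></iframe>'
        return body_html + iframe
    return body_html
```

Passing the random generator in explicitly keeps the sampling reproducible in tests; a low probability limits how many real users are exposed to an untested strategy.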
Step 103: releasing the anti-crawler code, so that the code for detecting crawlers is configured into the first page and the first data is written to the client. The code automatically detects whether a user accessing the first front-end page is a crawler, and the back-end portion writes the first data to the client.
In this embodiment, each user detected as a crawler is recorded as a target object, and whether the target object is a crawler is verified by determining whether it accesses the second front-end page of the website. The second front-end page is related to the first front-end page and is a page that a crawler usually does not access after accessing the first front-end page, such as the order page of an online travel website. If the target object accesses the second front-end page, it is not a crawler: the anti-crawler code detected it incorrectly and the target object was accidentally injured. If the target object does not access the second front-end page, it is a crawler: the anti-crawler code detected it correctly and the target object was not accidentally injured.
Step 104: intercepting the first data at the second front-end page and counting the total number of first data intercepted. Step 104 is carried out by the back-end portion. Intercepting the first data serves to determine whether the target object has accessed the second front-end page, and thereby to verify whether the target object is a crawler. If the second front-end page intercepts the first data, the target object has accessed the second front-end page and is not a crawler; if the second front-end page does not intercept the first data, the target object has not accessed the second front-end page and is a crawler.
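The interception and counting in step 104 might look like the sketch below, which parses the Cookie request header on each visit to the second front-end page. The class name and the cookie name `ac_probe` are hypothetical, and the sketch counts requests rather than unique visitors, which is a simplifying assumption; the source speaks of the total number of visitors.

```python
from http.cookies import SimpleCookie

MARKER_COOKIE_NAME = "ac_probe"  # hypothetical name of the "first data" cookie


class SecondPageCounter:
    """Counts visits to the second front-end page and marker cookies seen there."""

    def __init__(self) -> None:
        self.total_visits = 0  # O in the source's formula
        self.intercepted = 0   # C in the source's formula

    def on_request(self, cookie_header: str) -> bool:
        """Record one request; return True if it carried the marker cookie."""
        self.total_visits += 1
        cookies = SimpleCookie()
        cookies.load(cookie_header)  # parses a raw "name=value; name2=value2" header
        if MARKER_COOKIE_NAME in cookies:
            # A flagged client reached a page crawlers rarely visit:
            # treat it as an accidentally injured real user.
            self.intercepted += 1
            return True
        return False
```

For example, feeding the counter the headers `"ac_probe=strategy-v2; session=abc"`, `"session=xyz"`, and `""` leaves `total_visits` at 3 and `intercepted` at 1.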
Step 105: calculating the accidental injury rate of the anti-crawler strategy, which is used to measure the safety of the strategy.
Let the accidental injury rate be Q, the probability with which the first page is embedded in the first front-end page be P, the total number of first data intercepted be C, and the total number of visitors to the second front-end page be O. Then:
Q = C / P / O.
When P = 1, Q = C / O.
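A small worked example of the formula, using the symbols Q, C, P, and O from the text. The function name and the sample numbers are invented for illustration and do not come from the source.

```python
def accidental_injury_rate(intercepted: int, probability: float, total_visitors: int) -> float:
    """Compute Q = C / P / O with the symbols defined in the text."""
    if not 0.0 < probability <= 1.0:
        raise ValueError("probability P must be in (0, 1]")
    if total_visitors <= 0:
        raise ValueError("total visitors O must be positive")
    # Dividing by P scales the count up to what full (P = 1) embedding would give.
    return intercepted / probability / total_visitors


# Illustrative numbers: 5 marker cookies intercepted, the probe embedded
# in 10% of first-page renders, 10,000 visitors to the second page.
q_sampled = accidental_injury_rate(5, 0.1, 10_000)   # roughly 0.005, i.e. 0.5%
q_full = accidental_injury_rate(5, 1.0, 10_000)      # roughly 0.0005 when P = 1
```

Dividing by P compensates for sampling: if only a tenth of first-page views carry the probe, each intercepted marker stands in for about ten potential accidental injuries under full deployment.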
Step 106: determining whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe. The threshold can be customized. If the anti-crawler strategy is safe, it can be formally deployed to production.
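The comparison in step 106 reduces to a single check, sketched here as a release gate. The default threshold of 0.001 and the returned messages are hypothetical; the text says only that the threshold is customizable.

```python
def release_decision(rate: float, threshold: float = 0.001) -> str:
    """Return the action suggested by step 106 for a measured accidental injury rate."""
    # The strategy counts as safe only when the rate is strictly lower than the threshold.
    if rate < threshold:
        return "deploy to production"
    return "revise or update the strategy"
```

A rate exactly equal to the threshold is treated as unsafe here, following the wording "lower than a threshold".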
When multiple anti-crawler strategies need to be tested at the same time, different first data can be set for different strategies. The totals of the different first data intercepted at the second front-end page are counted separately, and the safety of each strategy is determined by distinguishing the first data.
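Testing several strategies at once can be sketched by giving each strategy its own marker value and tallying the values intercepted at the second front-end page. The function name, the strategy identifiers, and the idea of a per-strategy embedding probability are assumptions made for this illustration.

```python
from collections import Counter
from typing import Dict, Iterable


def rates_per_strategy(
    intercepted_values: Iterable[str],
    probabilities: Dict[str, float],
    total_visitors: int,
) -> Dict[str, float]:
    """Compute Q = C / P / O separately for each strategy's marker value."""
    counts = Counter(intercepted_values)  # C per strategy; missing keys count as 0
    return {
        strategy: counts[strategy] / p / total_visitors
        for strategy, p in probabilities.items()
    }


# Illustrative input: marker values seen at the second page, per-strategy P, and O.
rates = rates_per_strategy(["A", "A", "B"], {"A": 1.0, "B": 0.5, "C": 1.0}, 100)
```

With these made-up numbers, strategies A and B both come out at about 0.02 and strategy C at 0, so each strategy can be compared against the threshold independently.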
Embodiment 2
The device for detecting the safety of an anti-crawler strategy of this embodiment, as shown in Fig. 2, comprises: an embedding unit 201, a detection unit 202, a verification unit 203, a computing unit 204, and a comparing unit 205.
The anti-crawler strategy is implemented by anti-crawler code, which comprises a front-end portion and a back-end portion:
The front-end portion comprises code for detecting crawlers;
The back-end portion is used to write first data to the client and to set the validity time of the first data, the first data expiring once the validity time is exceeded. In a specific implementation, the first data can be realized with a cookie (sometimes written in the plural, cookies: data stored on the user's local terminal, i.e. the client); both the value and the validity time of the cookie can be customized;
The back-end portion is also used to intercept the first data at the second front-end page and to count the total number of first data intercepted.
The embedding unit 201 is used to embed, in the first front-end page of the website, the anti-crawler code for implementing the anti-crawler strategy.
Specifically, the embedding unit 201 comprises:
A page module 2011, used to embed a first page in the first front-end page of the website and to set the probability with which the first page is embedded in the first front-end page. The first front-end page is typically a front-end page with a large volume of random visits, such as the hotel details page of an online travel website. The first page can be realized as an iframe (an HTML tag) page whose JavaScript (an interpreted scripting language) code is dynamically configurable; because the iframe page is an independent sandbox, a JavaScript error in it does not affect the parent page (i.e. the first front-end page). The probability is the probability with which the first page is embedded in the first front-end page and controls how often the first page appears. For example, given 100 first front-end pages, a probability of 10% means that 10 of the 100 are selected to embed the first page, and a probability of 1 means that the first page is embedded in all 100.
A configuration module 2012, used to configure the code for detecting crawlers into the first page by releasing the anti-crawler code.
A data module 2013, used to write the first data to the client by releasing the anti-crawler code. The writing of the first data to the client is carried out by the back-end portion.
The detection unit 202 is used to detect, with the code, whether a user accessing the first front-end page is a crawler, and to record each user detected as a crawler as a target object.
The verification unit 203 is used to verify whether the target object is a crawler and to count the number of times a target object is not a crawler. Specifically, the verification unit 203 verifies whether the target object is a crawler through the following module:
A judgment module 2031, used to determine whether the target object accesses the second front-end page of the website; if so, the target object is not a crawler; if not, the target object is a crawler. The second front-end page is related to the first front-end page and is a page that a crawler usually does not access after accessing the first front-end page, such as the order page of an online travel website. If the target object accesses the second front-end page, it is not a crawler: the anti-crawler code detected it incorrectly and the target object was accidentally injured. If the target object does not access the second front-end page, it is a crawler: the anti-crawler code detected it correctly and the target object was not accidentally injured.
The verification unit 203 further comprises:
An interception module 2032, used to intercept the first data at the second front-end page using the back-end portion and to count the total number of first data intercepted. Intercepting the first data serves to determine whether the target object has accessed the second front-end page, and thereby to verify whether the target object is a crawler. If the second front-end page intercepts the first data, the target object has accessed the second front-end page and is not a crawler; if the second front-end page does not intercept the first data, the target object has not accessed the second front-end page and is a crawler.
The computing unit 204 is used to calculate the accidental injury rate of the anti-crawler strategy, which is used to measure the safety of the strategy.
Let the accidental injury rate be Q, the probability with which the first page is embedded in the first front-end page be P, the total number of first data intercepted be C, and the total number of visitors to the second front-end page be O. Then:
Q = C / P / O.
When P = 1, Q = C / O.
The comparing unit 205 is used to determine whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe. The threshold can be customized. If the anti-crawler strategy is safe, it can be formally deployed to production.
When multiple anti-crawler strategies need to be tested at the same time, different first data can be set for different strategies. The totals of the different first data intercepted at the second front-end page are counted separately, and the safety of each strategy is determined by distinguishing the first data.
Although specific embodiments of the present invention have been described above, those skilled in the art will understand that these are merely illustrative and that the protection scope of the present invention is defined by the appended claims. Those skilled in the art may make various changes or modifications to these embodiments without departing from the principle and substance of the present invention, and all such changes and modifications fall within the protection scope of the present invention.

Claims (16)

1. A method for detecting the safety of an anti-crawler strategy, characterized in that the method comprises:
S1: embedding, in a first front-end page of a website, anti-crawler code for implementing an anti-crawler strategy;
S2: using the anti-crawler code to detect whether a user accessing the first front-end page is a crawler, and recording each user detected as a crawler as a target object;
S3: verifying whether the target object is a crawler, and counting the number of times a target object is not a crawler;
S4: calculating the accidental injury rate of the anti-crawler strategy from that number, the accidental injury rate being used to measure the safety of the anti-crawler strategy.
2. The method for detecting the safety of an anti-crawler strategy according to claim 1, characterized in that S3 verifies whether the target object is a crawler through the following step:
S31: determining whether the target object accesses a second front-end page of the website; if so, the target object is not a crawler; if not, the target object is a crawler.
3. The method for detecting the safety of an anti-crawler strategy according to claim 2, characterized in that the anti-crawler code comprises a front-end portion, the front-end portion comprises code for detecting crawlers, and S1 comprises:
S11: embedding a first page in the first front-end page of the website;
S12: configuring the code for detecting crawlers into the first page.
4. The method for detecting the safety of an anti-crawler strategy according to claim 3, characterized in that the anti-crawler code further comprises a back-end portion, which is used to write first data to the client, to intercept the first data at the second front-end page, and to count the total number of first data intercepted;
S1 further comprises: writing the first data to the client using the back-end portion;
S3 comprises: intercepting the first data at the second front-end page using the back-end portion, and counting the total number of first data intercepted;
The accidental injury rate equals the total number / the total number of visitors to the second front-end page.
5. The method for detecting the safety of an anti-crawler strategy according to claim 4, characterized in that S11 further comprises: setting the probability with which the first page is embedded in the first front-end page;
The accidental injury rate equals the total number / the probability / the total number of visitors to the second front-end page.
6. The method for detecting the safety of an anti-crawler strategy according to claim 4, characterized in that the back-end portion is also used to set a validity time for the first data, the first data expiring once the validity time is exceeded.
7. The method for detecting the safety of an anti-crawler strategy according to claim 4, characterized in that different anti-crawler strategies correspond to different first data.
8. The method for detecting the safety of an anti-crawler strategy according to claim 1, characterized in that the method further comprises:
S5: determining whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe.
9. A device for detecting the safety of an anti-crawler strategy, characterized in that the device comprises:
An embedding unit, used to embed, in a first front-end page of a website, anti-crawler code for implementing an anti-crawler strategy;
A detection unit, used to detect, with the anti-crawler code, whether a user accessing the first front-end page is a crawler, and to record each user detected as a crawler as a target object;
A verification unit, used to verify whether the target object is a crawler and to count the number of times a target object is not a crawler;
A computing unit, used to calculate the accidental injury rate of the anti-crawler strategy from that number, the accidental injury rate being used to measure the safety of the anti-crawler strategy.
10. The device for detecting the safety of an anti-crawler strategy according to claim 9, characterized in that the verification unit verifies whether the target object is a crawler through the following module:
A judgment module, used to determine whether the target object accesses a second front-end page of the website; if so, the target object is not a crawler; if not, the target object is a crawler.
11. The device for detecting the safety of an anti-crawler strategy according to claim 10, characterized in that the anti-crawler code comprises a front-end portion, the front-end portion comprises code for detecting crawlers, and the embedding unit comprises:
A page module, used to embed a first page in the first front-end page of the website;
A configuration module, used to configure the code for detecting crawlers into the first page.
12. The device for detecting the safety of an anti-crawler strategy as claimed in claim 11, characterized in that the anti-crawler code further includes a back-end portion, the back-end portion being used to write first data to the client, to intercept the first data at the second front-end page, and to count the total number of first data intercepted;
The embedded unit further includes:
A data module, configured to use the back-end portion to write the first data to the client;
The authentication unit includes:
An interception module, configured to use the back-end portion to intercept the first data at the second front-end page and to count the total number of first data intercepted;
The accidental injury rate is equal to the total divided by the total number of visitors to the second front-end page.
13. The device for detecting the safety of an anti-crawler strategy as claimed in claim 12, characterized in that the page module is further configured to set the probability with which the first page is embedded in the first front-end page;
The accidental injury rate is equal to the total divided by the probability and then divided by the total number of visitors to the second front-end page.
14. The device for detecting the safety of an anti-crawler strategy as claimed in claim 12, characterized in that the back-end portion is further configured to set an effective time for the first data, the first data becoming invalid once the effective time is exceeded.
15. The device for detecting the safety of an anti-crawler strategy as claimed in claim 12, characterized in that different anti-crawler strategies correspond to different first data.
16. The device for detecting the safety of an anti-crawler strategy as claimed in claim 9, characterized in that the device further includes:
A comparing unit, configured to determine whether the accidental injury rate is lower than a threshold; if so, the anti-crawler strategy is safe; if not, the anti-crawler strategy is unsafe.
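
The arithmetic in claims 12 to 16 can be illustrated with a short Python sketch. It is not the patented implementation: the class `StrategyProbe`, its method names, the in-memory dictionary standing in for data written to the client, and the sample numbers are all assumptions made for illustration. It only shows how the intercepted total, the embedding probability, the effective time, and the threshold might fit together.

```python
import time
from dataclasses import dataclass, field


@dataclass
class StrategyProbe:
    """Hypothetical bookkeeping for one anti-crawler strategy under test."""
    marker: str               # label for the "first data" of this strategy (claim 15); not used in the arithmetic
    embed_probability: float  # probability that the first page is embedded (claim 13)
    ttl_seconds: float        # effective time of the first data (claim 14)
    issued: dict = field(default_factory=dict)  # client id -> time the first data was written
    intercepted: int = 0      # first data intercepted at the second front-end page
    visitors: int = 0         # total visitors to the second front-end page

    def write_marker(self, client_id, now=None):
        """Record that first data was written to a client flagged by the detection code."""
        self.issued[client_id] = time.time() if now is None else now

    def visit_second_page(self, client_id, now=None):
        """Count a visit to the second page; intercept the first data if still valid."""
        now = time.time() if now is None else now
        self.visitors += 1
        # Remove the record so each piece of first data is counted at most once.
        issued_at = self.issued.pop(client_id, None)
        if issued_at is not None and now - issued_at <= self.ttl_seconds:
            self.intercepted += 1

    def accidental_injury_rate(self):
        """total / probability / visitors, following claims 12 and 13."""
        if self.visitors == 0 or self.embed_probability <= 0:
            raise ValueError("need at least one visitor and a positive probability")
        return self.intercepted / self.embed_probability / self.visitors

    def is_safe(self, threshold):
        """Claim 16: the strategy is safe if the rate is below the threshold."""
        return self.accidental_injury_rate() < threshold


# Illustrative numbers only: two clients flagged, five visits to the second page.
probe = StrategyProbe(marker="strategy-A", embed_probability=0.5, ttl_seconds=3600)
probe.write_marker("client-1", now=0)
probe.write_marker("client-2", now=0)
for cid in ["client-1", "client-3", "client-4", "client-5"]:
    probe.visit_second_page(cid, now=10)
probe.visit_second_page("client-2", now=7200)  # first data has expired by now
print(probe.accidental_injury_rate(), probe.is_safe(threshold=0.5))
```

With an embedding probability of 1 the expression reduces to the formula of claim 12; dividing by a smaller probability scales the intercepted total back up to an estimate for all visitors.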
CN201610537443.3A 2016-07-08 2016-07-08 Detect the method and device of anti-crawler security policy Active CN106027564B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201610537443.3A CN106027564B (en) 2016-07-08 2016-07-08 Detect the method and device of anti-crawler security policy


Publications (2)

Publication Number Publication Date
CN106027564A CN106027564A (en) 2016-10-12
CN106027564B true CN106027564B (en) 2019-05-21

Family

ID=57108853

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201610537443.3A Active CN106027564B (en) 2016-07-08 2016-07-08 Detect the method and device of anti-crawler security policy

Country Status (1)

Country Link
CN (1) CN106027564B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107147640B (en) * 2017-05-09 2019-12-31 网宿科技股份有限公司 Method and system for identifying web crawlers
CN107943949B (en) * 2017-11-24 2020-06-26 厦门集微科技有限公司 Method and server for determining web crawler
CN109543454B (en) * 2019-01-25 2022-07-12 腾讯科技(深圳)有限公司 Anti-crawler method and related equipment
CN111181933B (en) * 2019-12-19 2022-04-26 贝壳技术有限公司 Web crawler detection method and device, storage medium and electronic equipment

Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101582887A (en) * 2009-05-20 2009-11-18 成都市华为赛门铁克科技有限公司 Safety protection method, gateway device and safety protection system
CN102495861A (en) * 2011-11-24 2012-06-13 中国科学院计算技术研究所 System and method for identifying web crawlers
CN102790700A (en) * 2011-05-19 2012-11-21 北京启明星辰信息技术股份有限公司 Method and device for recognizing webpage crawler
CN105743901A (en) * 2016-03-07 2016-07-06 携程计算机技术(上海)有限公司 Server, anti-crawler system and anti-crawler verification method

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8347386B2 (en) * 2008-10-21 2013-01-01 Lookout, Inc. System and method for server-coupled malware prevention
US9323649B2 (en) * 2013-09-30 2016-04-26 International Business Machines Corporation Detecting error states when interacting with web applications


Non-Patent Citations (3)

* Cited by examiner, † Cited by third party
Title
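A survey of Web crawler detection techniques; 吴晓晖; 《湖北汽车工业学院学报》; 20120331; full text
About crawlers, reading this one article is enough; ctriptech; 《https://segmentfault.com/a/1190000005840672》; 20160630; full text
Crawler and anti-crawler tricks you do not know about; 36氪的朋友们; 《http://36kr.com/p/5079327.html》; 20160612; full text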

Also Published As

Publication number Publication date
CN106027564A (en) 2016-10-12


Legal Events

Date Code Title Description
C06 Publication
PB01 Publication
C10 Entry into substantive examination
SE01 Entry into force of request for substantive examination
GR01 Patent grant