CN103501306B - URL identification method, server and system - Google Patents
- Publication number
- CN103501306B (CN201310503007.0A)
- Authority
- CN
- China
- Prior art keywords
- network address
- malicious
- pages
- content
- detected
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L63/00—Network architectures or network communication protocols for network security
- H04L63/14—Network architectures or network communication protocols for network security for detecting or protecting against malicious traffic
- H04L63/1441—Countermeasures against malicious traffic
- H04L63/1483—Countermeasures against malicious traffic service impersonation, e.g. phishing, pharming or web spoofing
-
- G—PHYSICS
- G06—COMPUTING; CALCULATING OR COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
Landscapes
- Engineering & Computer Science (AREA)
- Computer Security & Cryptography (AREA)
- Theoretical Computer Science (AREA)
- Databases & Information Systems (AREA)
- General Engineering & Computer Science (AREA)
- Physics & Mathematics (AREA)
- General Physics & Mathematics (AREA)
- Data Mining & Analysis (AREA)
- Computer Hardware Design (AREA)
- Computing Systems (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention discloses a URL identification method, comprising: obtaining the page content corresponding to a URL to be checked; matching the page content against the page templates in a previously generated malicious page template library; and, when the matching similarity between the page content and any one of the page templates exceeds a first preset threshold, determining that the URL to be checked is a malicious URL. Embodiments of the invention also provide a corresponding server. The URL identification method provided by the embodiments can identify malicious URLs quickly, thereby improving Internet security.
Description
Technical field
The present invention relates to the field of Internet technology, and in particular to a URL identification method, server and system.
Background
While the Internet brings convenience to people's lives, its security situation gives little cause for optimism: Trojans and viruses of all kinds spread unchecked disguised as normal files, and phishing websites that imitate legitimate sites to steal user accounts and passwords are increasingly rampant.
There are generally two approaches to identifying and combating malicious websites. The first is based on user reports and manual review: a user submits a suspicious Uniform Resource Locator (URL), also called a web page address or simply a web address, and the URL is added to a malicious URL list once manual review confirms that it is malicious. The second is based on recognizing features of the URL itself.
In researching and practicing the prior art, the inventors found that both the manual review method and the URL feature recognition method take a long time to determine whether a URL is malicious, so malicious URLs are identified inefficiently.
Summary of the invention
Embodiments of the present invention provide a URL identification method that can identify malicious URLs quickly, thereby improving Internet security. Embodiments of the present invention also provide a corresponding server and system.
A first aspect of the present invention provides a URL identification method, including:
obtaining the page content corresponding to a URL to be checked;
matching the page content against any page template in a previously generated malicious page template library;
when the matching similarity between the page content and any one of the page templates exceeds a first preset threshold, determining that the URL to be checked is a malicious URL.
With reference to the first aspect, in a first possible implementation, the method further includes:
storing the malicious URL in a preset malicious URL library, and collecting blacklisted URLs into the malicious URL library.
With reference to the first possible implementation of the first aspect, in a second possible implementation, the method further includes:
updating the malicious page template library according to the malicious URL library.
With reference to the second possible implementation of the first aspect, in a third possible implementation, updating the malicious page template library according to the malicious URL library includes:
obtaining the page content corresponding to each URL in the malicious URL library;
computing the similarity between every two page contents among the page contents corresponding to the URLs, and placing URLs whose page contents have a similarity exceeding a second preset threshold into the same set;
taking the page content corresponding to the URLs in any set that contains more URLs than a third preset threshold as a malicious page template, and storing the malicious page template in the malicious page template library.
With reference to the first aspect, or any one of the first to third possible implementations of the first aspect, in a fourth possible implementation, obtaining the page content corresponding to the URL to be checked includes:
receiving the URL to be checked sent by a client;
downloading, according to the URL to be checked, the page content corresponding to the URL to be checked.
A second aspect of the present invention provides a server, including:
an obtaining unit, configured to obtain the page content corresponding to a URL to be checked;
a matching unit, configured to match the page content obtained by the obtaining unit against each page template in a previously generated malicious page template library;
a determining unit, configured to determine that the URL to be checked is a malicious URL when the matching unit finds that the matching similarity between the page content and any one of the page templates exceeds a first preset threshold.
With reference to the second aspect, in a first possible implementation, the server further includes:
a storage unit, configured to store the malicious URL in a preset malicious URL library;
a collecting unit, configured to collect blacklisted URLs into the malicious URL library.
With reference to the first possible implementation of the second aspect, in a second possible implementation, the server further includes:
an updating unit, configured to update the malicious page template library according to the malicious URL library.
With reference to the second possible implementation of the second aspect, in a third possible implementation, the updating unit includes:
an obtaining subunit, configured to obtain the page content corresponding to each URL in the malicious URL library;
a computing subunit, configured to compute the similarity between every two page contents among the page contents obtained by the obtaining subunit;
a dividing subunit, configured to place URLs whose page contents have a similarity, as computed by the computing subunit, exceeding a second preset threshold into the same set;
a determining subunit, configured to take the page content corresponding to the URLs in any set, as divided by the dividing subunit, that contains more URLs than a third preset threshold as a malicious page template;
a storage subunit, configured to store the malicious page template determined by the determining subunit in the malicious page template library.
With reference to the second aspect, or any one of the first to third possible implementations of the second aspect, in a fourth possible implementation, the obtaining unit includes:
a receiving subunit, configured to receive the URL to be checked sent by a client;
a download subunit, configured to download, according to the URL to be checked received by the receiving subunit, the page content corresponding to the URL to be checked.
A third aspect of the present invention provides a URL identification system, including a server and a client, where the server is the server described in the technical solution above.
Embodiments of the present invention obtain the page content corresponding to a URL to be checked; match the page content against any page template in a previously generated malicious page template library; and, when the matching similarity between the page content and any one of the page templates exceeds a first preset threshold, determine that the URL to be checked is a malicious URL. Compared with the low efficiency of malicious URL identification in the prior art, the URL identification method provided by the embodiments can identify malicious URLs quickly, thereby improving Internet security.
Brief description of the drawings
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings needed for describing the embodiments are briefly introduced below. The drawings described below are evidently only some embodiments of the present invention, and a person skilled in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic diagram of one embodiment of the URL identification method in the embodiments of the present invention;
Fig. 2 is a schematic diagram of another embodiment of the URL identification method in the embodiments of the present invention;
Fig. 3 is a schematic diagram of one embodiment of the server in the embodiments of the present invention;
Fig. 4 is a schematic diagram of another embodiment of the server in the embodiments of the present invention;
Fig. 5 is a schematic diagram of another embodiment of the server in the embodiments of the present invention;
Fig. 6 is a schematic diagram of another embodiment of the server in the embodiments of the present invention;
Fig. 7 is a schematic diagram of another embodiment of the server in the embodiments of the present invention;
Fig. 8 is a schematic diagram of another embodiment of the server in the embodiments of the present invention;
Fig. 9 is a schematic diagram of one embodiment of the URL identification system in the embodiments of the present invention.
Detailed description of the embodiments
Embodiments of the present invention provide a URL identification method that can identify malicious URLs quickly, thereby improving Internet security. Embodiments of the present invention also provide a corresponding server and system. Each is described in detail below.
The technical solutions in the embodiments of the present invention are described clearly and completely below with reference to the drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person skilled in the art on the basis of these embodiments without creative effort fall within the scope of protection of the present invention.
Referring to Fig. 1, one embodiment of the URL identification method provided by the embodiments of the present invention includes:
101. Obtain the page content corresponding to the URL to be checked.
102. Match the page content against any page template in the previously generated malicious page template library.
The previously generated malicious page template library may be derived from the page content of previously accumulated user-reported URLs or blacklisted URLs.
A blacklisted URL is one identified as malicious by a security software program, or one confirmed as malicious by manual review after operations staff receive a user report.
103. When the matching similarity between the page content and any one of the page templates exceeds the first preset threshold, determine that the URL to be checked is a malicious URL.
The first preset threshold may be 80%, 90% or another value.
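Steps 102 and 103 can be sketched as follows. This is a minimal illustration, not the patented implementation: the function names `similarity` and `is_malicious` are hypothetical, the threshold of 0.8 is only one of the example values mentioned above, and Python's `difflib.SequenceMatcher` is used as a stand-in similarity measure because the text does not fix a particular one at this point.

```python
from difflib import SequenceMatcher
from typing import Sequence

# Assumed value; the text gives 80% and 90% only as examples.
FIRST_THRESHOLD = 0.8


def similarity(a: str, b: str) -> float:
    """Similarity in [0, 1]. SequenceMatcher is a stand-in measure (assumption)."""
    return SequenceMatcher(None, a, b).ratio()


def is_malicious(page_content: str, templates: Sequence[str],
                 threshold: float = FIRST_THRESHOLD) -> bool:
    """Return True if the page exceeds `threshold` similarity with any template."""
    return any(similarity(page_content, t) > threshold for t in templates)
```

With an empty template library the function returns False for every page, so a URL is flagged only when at least one template has been generated beforehand.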
A malicious URL is a URL in which malicious programs such as Trojans or viruses have been deliberately planted, and which lures users into visiting it through disguised "service content". Once a user visits such a URL, the planted Trojan, virus or other program is triggered, the visitor's computer is infected, and the visitor risks losing accounts or private information. Malicious URLs tend to appear among little-known URLs of a sales or recommendation nature.
Embodiments of the present invention obtain the page content corresponding to a URL to be checked; match the page content against any page template in a previously generated malicious page template library; and, when the matching similarity between the page content and any one of the page templates exceeds a first preset threshold, determine that the URL to be checked is a malicious URL. Compared with the low efficiency of malicious URL identification in the prior art, the URL identification method provided by the embodiments can identify malicious URLs quickly, thereby improving Internet security.
Optionally, on the basis of the embodiment corresponding to Fig. 1, in an optional embodiment of the URL identification method provided by the embodiments of the present invention, the method may further include:
storing the malicious URL in a preset malicious URL library, and collecting blacklisted URLs into the malicious URL library.
In this embodiment, after a URL to be checked is found to be malicious, it can be stored in the malicious URL library. The malicious URL library may hold previously accumulated user-reported URLs or blacklisted URLs, and the server can continuously collect user-reported or blacklisted URLs.
Optionally, on the basis of the optional embodiment above, in another optional embodiment of the URL identification method provided by the embodiments of the present invention, the method may further include:
updating the malicious page template library according to the malicious URL library.
In this embodiment, because malicious URLs keep increasing, the malicious page template library also needs to be updated continuously so that malicious URLs can be confirmed efficiently.
Optionally, on the basis of the other optional embodiment above, in yet another optional embodiment of the URL identification method provided by the embodiments of the present invention, updating the malicious page template library according to the malicious URL library includes:
obtaining the page content corresponding to each URL in the malicious URL library;
computing the similarity between every two page contents among the page contents corresponding to the URLs, and placing URLs whose page contents have a similarity exceeding a second preset threshold into the same set;
taking the page content corresponding to the URLs in any set that contains more URLs than a third preset threshold as a malicious page template, and storing the malicious page template in the malicious page template library.
In this embodiment, the similarity of web page contents is computed by pairwise comparison. Many algorithms can measure text similarity, such as longest common subsequence, minimum edit distance, Hamming distance and the cosine of feature vectors; the present invention is not limited in this respect, and minimum edit distance is used here only as an illustration. Suppose text A is "<html>hi</html>" (string length 15) and text B is "<html>hello</html>" (string length 18). Converting text A into text B requires changing the character 'i' to 'e' and then inserting the characters 'l', 'l' and 'o', which takes at least 4 steps, so the minimum edit distance is 4. The similarity of texts A and B can be defined as 1 - (minimum edit distance) / (the larger of the lengths of A and B), that is, 1 - 4/18 ≈ 0.78. If the similarity threshold is set to 0.8, the similarity is below the threshold, and text A and text B are considered dissimilar.
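The worked example above can be reproduced with a short sketch. The function names `edit_distance` and `edit_similarity` are hypothetical; the code uses the standard dynamic-programming (Levenshtein) formulation of minimum edit distance, with unit cost for insertion, deletion and substitution, and the similarity definition given in the text.

```python
def edit_distance(a: str, b: str) -> int:
    """Minimum number of single-character insertions, deletions and substitutions."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]  # distance from a[:i] to the empty prefix of b
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or keep
        prev = curr
    return prev[-1]


def edit_similarity(a: str, b: str) -> float:
    """1 - (edit distance) / (length of the longer text), as defined in the text."""
    longest = max(len(a), len(b))
    if longest == 0:
        return 1.0  # two empty texts are treated as identical (assumption)
    return 1 - edit_distance(a, b) / longest


text_a = "<html>hi</html>"
text_b = "<html>hello</html>"
print(edit_distance(text_a, text_b))              # 4
print(round(edit_similarity(text_a, text_b), 2))  # 0.78
```

The dynamic program runs in time proportional to the product of the two lengths, which may matter for full pages; the text leaves the choice of measure open, so a cheaper measure could be substituted.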
The URLs are then grouped according to the similarity results. For example, suppose there are 8 URLs, among which the similar pairs are (URL1, URL3), (URL3, URL7) and (URL4, URL6). Placing similar URLs into the same set divides all the URLs into the following sets:
Set 1: URL1, URL3, URL7
Set 2: URL4, URL6
Set 3: URL2
Set 4: URL5
Set 5: URL8
The sets are sorted by size, and the page content of the URLs in the sets that meet the requirement is taken as malicious templates. For example, if sets containing at least 3 similar URLs are selected as templates, only Set 1 in the example above meets the requirement. The threshold on set size can be adjusted according to actual conditions.
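The grouping in this example can be sketched with a union-find structure. This is one possible way to build the sets, not necessarily the one used by the inventors; the names `group_similar` and `select_template_groups` are hypothetical. Note that URL1 and URL7 end up in the same set through URL3 even though they are not listed as a similar pair, which matches Set 1 above.

```python
def group_similar(urls, similar_pairs):
    """Group URLs so that every similar pair shares a set; largest sets first."""
    parent = {u: u for u in urls}

    def find(u):
        # Follow parent links to the set representative, halving paths on the way.
        while parent[u] != u:
            parent[u] = parent[parent[u]]
            u = parent[u]
        return u

    for a, b in similar_pairs:
        root_a, root_b = find(a), find(b)
        if root_a != root_b:
            parent[root_b] = root_a  # merge the two sets

    groups = {}
    for u in urls:
        groups.setdefault(find(u), []).append(u)
    # sorted() is stable, so sets of equal size keep their first-seen order.
    return sorted(groups.values(), key=len, reverse=True)


def select_template_groups(groups, min_size=3):
    """Keep sets with at least `min_size` URLs (3 is the example value in the text)."""
    return [g for g in groups if len(g) >= min_size]


urls = [f"URL{i}" for i in range(1, 9)]
pairs = [("URL1", "URL3"), ("URL3", "URL7"), ("URL4", "URL6")]
groups = group_similar(urls, pairs)
print(groups[0])                       # ['URL1', 'URL3', 'URL7']
print(select_template_groups(groups))  # [['URL1', 'URL3', 'URL7']]
```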
The second preset threshold may be the same as or different from the first preset threshold, and the third preset threshold may be a value such as 3, 4 or 5; no limitation is imposed here.
Optionally, on the basis of Fig. 1 or any optional embodiment corresponding to Fig. 1, in another optional embodiment of the URL identification method provided by the embodiments of the present invention, obtaining the page content corresponding to the URL to be checked may include:
receiving the URL to be checked sent by a client;
downloading, according to the URL to be checked, the page content corresponding to the URL to be checked.
In this embodiment, after receiving the URL to be checked, the server can also look up the page content corresponding to that URL directly in locally stored page content.
For ease of understanding, and referring to Fig. 2, the URL identification process in the embodiments of the present invention is described below using an application scenario as an example:
S100. The server obtains the URL to be checked.
S110. Determine whether the page content corresponding to the URL to be checked can be downloaded. If it can, perform step S130; if it cannot, perform step S120.
S120. When the page content corresponding to the URL to be checked cannot be downloaded, set the status of the URL to non-malicious.
S130. When the page content corresponding to the URL to be checked can be downloaded, determine whether the page content matches any page in the malicious page template library. If it matches, perform step S140; if it does not, perform step S150.
S140. After the page content corresponding to the URL to be checked matches a page in the malicious page template library, confirm that the URL to be checked is a malicious URL and mark it as such.
S150. When there is no match, continue with the existing detection logic of the prior art.
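The branching in steps S100 to S150 can be sketched as a single function. Everything here is illustrative: `check_url` and its three callable parameters are hypothetical names, the downloader is assumed to return None when the page cannot be fetched, and the fallback stands for whatever existing detection logic a deployment already has.

```python
from typing import Callable, Optional


def check_url(url: str,
              download: Callable[[str], Optional[str]],
              matches_template: Callable[[str], bool],
              fallback: Callable[[str, str], str]) -> str:
    """Classify a URL following steps S100 to S150."""
    content = download(url)            # S110: try to download the page content
    if content is None:
        return "non-malicious"         # S120: nothing to download
    if matches_template(content):      # S130: compare with the template library
        return "malicious"             # S140: a template matched
    return fallback(url, content)      # S150: hand over to existing detection logic
```

Passing the downloader, matcher and fallback as parameters keeps the sketch free of network access and lets each step be replaced independently, for example with a cache lookup in place of a download.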
Referring to Fig. 3, one embodiment of the server 20 provided by the embodiments of the present invention includes:
an obtaining unit 201, configured to obtain the page content corresponding to a URL to be checked;
a matching unit 202, configured to match the page content obtained by the obtaining unit 201 against each page template in a previously generated malicious page template library;
a determining unit 203, configured to determine that the URL to be checked is a malicious URL when the matching unit 202 finds that the matching similarity between the page content and any one of the page templates exceeds a first preset threshold.
In this embodiment, the obtaining unit 201 obtains the page content corresponding to the URL to be checked; the matching unit 202 matches the page content obtained by the obtaining unit 201 against each page template in the previously generated malicious page template library; and the determining unit 203 determines that the URL to be checked is a malicious URL when the matching unit 202 finds that the matching similarity between the page content and any one of the page templates exceeds the first preset threshold. Compared with the low efficiency of malicious URL identification in the prior art, the server provided by the embodiments can identify malicious URLs quickly, thereby improving Internet security.
Optionally, on the basis of the embodiment corresponding to Fig. 3, and referring to Fig. 4, in another embodiment of the server provided by the embodiments of the present invention, the server 20 further includes:
a storage unit 204, configured to store the malicious URL in a preset malicious URL library;
a collecting unit 205, configured to collect blacklisted URLs into the malicious URL library.
Optionally, on the basis of the embodiment corresponding to Fig. 4, and referring to Fig. 5, in another embodiment of the server provided by the embodiments of the present invention, the server 20 further includes:
an updating unit 206, configured to update the malicious page template library according to the malicious URL library.
Optionally, on the basis of the embodiment corresponding to Fig. 5, and referring to Fig. 6, in another embodiment of the server provided by the embodiments of the present invention, the updating unit 206 includes:
an obtaining subunit 2061, configured to obtain the page content corresponding to each URL in the malicious URL library;
a computing subunit 2062, configured to compute the similarity between every two page contents among the page contents obtained by the obtaining subunit 2061;
a dividing subunit 2063, configured to place URLs whose page contents have a similarity, as computed by the computing subunit 2062, exceeding a second preset threshold into the same set;
a determining subunit 2064, configured to take the page content corresponding to the URLs in any set, as divided by the dividing subunit 2063, that contains more URLs than a third preset threshold as a malicious page template;
a storage subunit 2065, configured to store the malicious page template determined by the determining subunit 2064 in the malicious page template library.
Optionally, on the basis of the embodiment corresponding to Fig. 3, and referring to Fig. 7, in another embodiment of the server provided by the embodiments of the present invention, the obtaining unit 201 includes:
a receiving subunit 2011, configured to receive the URL to be checked sent by a client;
a download subunit 2012, configured to download, according to the URL to be checked received by the receiving subunit 2011, the page content corresponding to the URL to be checked.
Embodiments of the present invention also provide a computer storage medium storing a program which, when executed, performs some or all of the steps of the URL identification method described above.
Referring to Fig. 8, Fig. 8 is a schematic structural diagram of the server 20 in the embodiments of the present invention. The server 20 may include an input device 210, an output device 220, a processor 230 and a memory 240.
The memory 240 may include read-only memory and random access memory, and provides instructions and data to the processor 230. Part of the memory 240 may also include non-volatile random access memory (NVRAM).
The memory 240 stores the following elements, executable modules or data structures, or a subset or superset of them:
Operating instructions: include various operating instructions, used to implement various operations.
Operating system: includes various system programs, used to implement various basic services and to handle hardware-based tasks.
In the embodiments of the present invention, the processor 230 performs the following operations by calling the operating instructions stored in the memory 240 (the operating instructions may be stored in the operating system):
obtaining the page content corresponding to a URL to be checked;
matching the page content against any page template in a previously generated malicious page template library;
when the matching similarity between the page content and any one of the page templates exceeds a first preset threshold, determining that the URL to be checked is a malicious URL.
Compared with the low efficiency of malicious URL identification in the prior art, the URL identification method provided by the embodiments can identify malicious URLs quickly, thereby improving Internet security.
The processor 230 controls the operation of the server 20 and may also be called a central processing unit (CPU). The memory 240 may include read-only memory and random access memory, and provides instructions and data to the processor 230. Part of the memory 240 may also include non-volatile random access memory (NVRAM). In a specific application, the components of the server 20 are coupled together through a bus system 250, which, in addition to a data bus, may include a power bus, a control bus, a status signal bus and so on. For clarity of illustration, however, all of these buses are labeled as the bus system 250 in the figure.
The method disclosed in the above embodiments of the present invention may be applied in, or implemented by, the processor 230. The processor 230 may be an integrated circuit chip with signal processing capability. During implementation, each step of the method may be completed by an integrated logic circuit of hardware in the processor 230 or by instructions in the form of software. The processor 230 may be a general-purpose processor, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and can implement or execute the methods, steps and logic block diagrams disclosed in the embodiments of the present invention. The general-purpose processor may be a microprocessor or any conventional processor. The steps of the method disclosed in the embodiments may be performed directly by a hardware decoding processor, or by a combination of hardware and software modules in a decoding processor. The software module may reside in a storage medium well established in the art, such as random access memory, flash memory, read-only memory, programmable read-only memory, electrically erasable programmable memory or a register. The storage medium is located in the memory 240; the processor 230 reads the information in the memory 240 and completes the steps of the method in combination with its hardware.
Alternatively, described malice network address also can be stored in the malice URL library pre-set by processor 230, and
Collecting is drawn black network address to described malice URL library.
Alternatively, processor 230 also can update described malice Page Template storehouse according to described malice URL library.
Alternatively, the page that each network address during processor 230 specifically can obtain described malice URL library is corresponding
Face content;Calculate the similarity of any two content of pages in the content of pages that each network address described is corresponding,
The similarity of described any two content of pages is divided into identity set more than the network address of the second predetermined threshold value;
The network address quantity content of pages corresponding more than network address in arbitrary set of the 3rd preset threshold value will be comprised as evil
Meaning Page Template, and described malice Page Template is stored in described malice Page Template storehouse.
Optionally, the input device 210 may receive the URL to be detected sent by the user terminal;
the processor 230 downloads the page content corresponding to the URL to be detected according to that URL.
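As a rough illustration of the receive-then-download step, the sketch below accepts a submitted URL and fetches its page content. The function names `download_page` and `handle_submission`, the restriction to http(s) URLs, the timeout, and the UTF-8 fallback are assumptions made for the example, not details given in the text.

```python
from typing import Callable
from urllib.request import urlopen


def download_page(url: str, timeout: float = 10.0) -> str:
    """Fetch the page behind url and return its body as text."""
    with urlopen(url, timeout=timeout) as response:
        # Fall back to UTF-8 when the server does not declare a charset.
        charset = response.headers.get_content_charset() or "utf-8"
        return response.read().decode(charset, errors="replace")


def handle_submission(url: str,
                      fetch: Callable[[str], str] = download_page) -> tuple[str, str]:
    """Accept a URL sent by a user terminal and return (url, page content)."""
    url = url.strip()
    if not url.startswith(("http://", "https://")):
        raise ValueError("only http(s) URLs are accepted in this sketch")
    return url, fetch(url)
```

The `fetch` parameter is injectable so that the downloader can be replaced, for example by a sandboxed crawler, without changing the receiving logic.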
Referring to Fig. 9, an embodiment of the URL identification system provided by the embodiments of the
present invention includes a server 20 and a user terminal 30, where the server 20 and the user
terminal 30 are communicatively connected.
In the embodiments of the present invention there may be multiple user terminals. Fig. 9 depicts only
three, but in practice there may be many.
The server 20 is configured to obtain the page content corresponding to a URL to be detected; match
the page content against any page template in a pre-generated malicious page template library; and,
when the matching similarity between the page content and any such page template exceeds the first
preset threshold, determine that the URL to be detected is a malicious URL.
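The server-side decision can be sketched as a threshold check against each template in turn. This is a minimal illustration, assuming `difflib.SequenceMatcher` as the similarity measure and a hypothetical function name `is_malicious`; neither is prescribed by the text.

```python
from difflib import SequenceMatcher


def is_malicious(page_content: str,
                 template_library: list[str],
                 first_threshold: float) -> bool:
    """Return True if the page matches any template above the threshold."""
    for template in template_library:
        # Assumed similarity measure: ratio of matching characters in [0, 1].
        score = SequenceMatcher(None, page_content, template).ratio()
        if score > first_threshold:
            return True
    return False
```

Stopping at the first template that exceeds the threshold reflects the "any page template" wording: a single sufficiently similar template is enough to flag the URL.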
Those of ordinary skill in the art will understand that all or some of the steps in the methods of
the above embodiments can be completed by a program instructing the relevant hardware. The program
can be stored in a computer-readable storage medium, and the storage medium may include ROM, RAM, a
magnetic disk, an optical disc, or the like.
The URL identification method, server, and system provided by the embodiments of the present
invention have been described in detail above. Specific examples are used herein to explain the
principles and implementations of the present invention, and the description of the above embodiments
is intended only to help in understanding the method of the present invention and its core idea. At
the same time, for those of ordinary skill in the art, there will be changes in the specific
implementations and scope of application according to the idea of the present invention. In summary,
the content of this specification should not be construed as limiting the present invention.
Claims (7)
1. A URL identification method, characterized by comprising:
obtaining the page content corresponding to a URL to be detected;
matching the page content against any page template in a pre-generated malicious page template library;
when the matching similarity between the page content and said any page template exceeds a first preset threshold,
determining that the URL to be detected is a malicious URL;
updating the malicious page template library according to the malicious URL library;
wherein updating the malicious page template library according to the malicious URL library comprises:
obtaining the page content corresponding to each URL in the malicious URL library;
computing the similarity of any two page contents among the page contents corresponding to the URLs, and
placing URLs whose page contents have a similarity exceeding a second preset threshold into the same set;
taking the page content corresponding to the URLs in any set containing more URLs than a third preset threshold
as a malicious page template, and storing the malicious page template in the malicious page template library.
2. The method according to claim 1, characterized in that the method further comprises:
storing the malicious URL in a preset malicious URL library, and collecting blacklisted URLs into the
malicious URL library.
3. The method according to any one of claims 1-2, characterized in that obtaining the page content
corresponding to the URL to be detected comprises:
receiving the URL to be detected sent by a user terminal;
downloading, according to the URL to be detected, the page content corresponding to the URL to be detected.
4. A server, characterized by comprising:
an obtaining unit, configured to obtain the page content corresponding to a URL to be detected;
a matching unit, configured to match the page content obtained by the obtaining unit against any page template
in a pre-generated malicious page template library;
a determining unit, configured to determine that the URL to be detected is a malicious URL when the matching unit
finds that the matching similarity between the page content and said any page template exceeds a first preset threshold;
an updating unit, configured to update the malicious page template library according to the malicious URL library;
wherein the updating unit comprises:
an obtaining subunit, configured to obtain the page content corresponding to each URL in the malicious URL library;
a computing subunit, configured to compute the similarity of any two page contents among the page contents
corresponding to the URLs obtained by the obtaining subunit;
a dividing subunit, configured to place URLs whose page contents have a similarity, as computed by the computing
subunit, exceeding a second preset threshold into the same set;
a determining subunit, configured to take the page content corresponding to the URLs in any set, divided by the
dividing subunit, that contains more URLs than a third preset threshold as a malicious page template;
a storing subunit, configured to store the malicious page template determined by the determining subunit in the
malicious page template library.
5. The server according to claim 4, characterized in that the server further comprises:
a storage unit, configured to store the malicious URL in a preset malicious URL library;
a collecting unit, configured to collect blacklisted URLs into the malicious URL library.
6. The server according to any one of claims 4-5, characterized in that the obtaining unit comprises:
a receiving subunit, configured to receive the URL to be detected sent by a user terminal;
a downloading subunit, configured to download, according to the URL to be detected received by the receiving
subunit, the page content corresponding to the URL to be detected.
7. A URL identification system, characterized by comprising a server and a user terminal,
wherein the server is the server according to any one of claims 4-6.
Priority Applications (2)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310503007.0A CN103501306B (en) | 2013-10-23 | 2013-10-23 | A kind of network address knows method for distinguishing, server and system |
PCT/CN2014/088468 WO2015058631A1 (en) | 2013-10-23 | 2014-10-13 | Method, server and system for malicious url identification |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201310503007.0A CN103501306B (en) | 2013-10-23 | 2013-10-23 | A kind of network address knows method for distinguishing, server and system |
Publications (2)
Publication Number | Publication Date |
---|---|
CN103501306A CN103501306A (en) | 2014-01-08 |
CN103501306B true CN103501306B (en) | 2016-09-14 |
Family
ID=49866478
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201310503007.0A Active CN103501306B (en) | 2013-10-23 | 2013-10-23 | A kind of network address knows method for distinguishing, server and system |
Country Status (2)
Country | Link |
---|---|
CN (1) | CN103501306B (en) |
WO (1) | WO2015058631A1 (en) |
Families Citing this family (11)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN103501306B (en) * | 2013-10-23 | 2016-09-14 | 腾讯科技(武汉)有限公司 | A kind of network address knows method for distinguishing, server and system |
CN104852883A (en) * | 2014-02-14 | 2015-08-19 | 腾讯科技(深圳)有限公司 | Method and system for protecting safety of account information |
CN104079560A (en) * | 2014-06-05 | 2014-10-01 | 腾讯科技(深圳)有限公司 | Web address security detecting method and device and server |
CN108683666B (en) * | 2018-05-16 | 2021-04-16 | 新华三信息安全技术有限公司 | Webpage identification method and device |
CN109992666A (en) * | 2019-03-22 | 2019-07-09 | 阿里巴巴集团控股有限公司 | Method, apparatus and non-transitory machine readable media for processing feature library |
CN111198939B (en) * | 2019-12-27 | 2021-11-23 | 北京健康之家科技有限公司 | Statement similarity analysis method and device and computer equipment |
CN114172676B (en) * | 2020-09-10 | 2024-07-16 | 中国移动通信有限公司研究院 | Malicious website detection method, device, equipment and storage medium |
CN112084501B (en) * | 2020-09-18 | 2024-06-25 | 珠海豹趣科技有限公司 | Malicious program detection method and device, electronic equipment and storage medium |
CN113098859B (en) * | 2021-03-30 | 2023-03-31 | 深圳市欢太科技有限公司 | Webpage page rollback method, device, terminal and storage medium |
CN113239305A (en) * | 2021-05-19 | 2021-08-10 | 中国电子科技集团公司第三十研究所 | Target detection and identification method in cloud computing environment |
CN113904827B (en) * | 2021-09-29 | 2024-03-19 | 恒安嘉新(北京)科技股份公司 | Identification method and device for counterfeit website, computer equipment and medium |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102082792A (en) * | 2010-12-31 | 2011-06-01 | 成都市华为赛门铁克科技有限公司 | Phishing webpage detection method and device |
Family Cites Families (5)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN102693236A (en) * | 2011-03-24 | 2012-09-26 | 苏州风采信息技术有限公司 | Bad information filtering method based on content understanding |
CN102170640A (en) * | 2011-06-01 | 2011-08-31 | 南通海韵信息技术服务有限公司 | Mode library-based smart mobile phone terminal adverse content website identifying method |
CN102339320B (en) * | 2011-11-04 | 2013-08-28 | 华为数字技术(成都)有限公司 | Malicious web recognition method and device |
CN102609516A (en) * | 2012-02-08 | 2012-07-25 | 苏州中联互通信息科技有限公司 | Content understanding-based bad information filter method |
CN103501306B (en) * | 2013-10-23 | 2016-09-14 | 腾讯科技(武汉)有限公司 | A kind of network address knows method for distinguishing, server and system |
-
2013
- 2013-10-23 CN CN201310503007.0A patent/CN103501306B/en active Active
-
2014
- 2014-10-13 WO PCT/CN2014/088468 patent/WO2015058631A1/en active Application Filing
Also Published As
Publication number | Publication date |
---|---|
CN103501306A (en) | 2014-01-08 |
WO2015058631A1 (en) | 2015-04-30 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN103501306B (en) | A kind of network address knows method for distinguishing, server and system | |
CN107204960B (en) | Webpage identification method and device and server | |
CN104317938B (en) | Web page interlinkage validation verification method and device | |
CN107241296B (en) | Webshell detection method and device | |
CN106708952B (en) | A kind of Webpage clustering method and device | |
CN105224606A (en) | A kind of disposal route of user ID and device | |
CN110099059A (en) | A kind of domain name recognition methods, device and storage medium | |
CN105224600B (en) | A kind of detection method and device of Sample Similarity | |
CN104182482B (en) | A kind of news list page determination methods and the method for screening news list page | |
CN104079559B (en) | A kind of website safety detection method, device and server | |
CN110096872B (en) | Detection method of webpage intrusion script attack tool and server | |
CN107547671A (en) | A kind of URL matching process and device | |
CN110535806A (en) | Monitor method, apparatus, equipment and the computer storage medium of abnormal website | |
KR102257139B1 (en) | Method and apparatus for collecting information regarding dark web | |
CN104391978A (en) | Method and device for storing and processing web pages of browsers | |
CN109561163B (en) | Method and device for generating uniform resource locator rewriting rule | |
CN103336693A (en) | Method and device for establishing refer chain and security detection device | |
CN106911635A (en) | A kind of method and device of detection website with the presence or absence of backdoor programs | |
CN111125704B (en) | Webpage Trojan horse recognition method and system | |
CN110825947B (en) | URL deduplication method, device, equipment and computer readable storage medium | |
CN104715018A (en) | Intelligent SQL injection resistant method based on semantic analysis | |
CN106911636B (en) | A method and device for detecting whether a website has a backdoor program | |
CN103870590B (en) | Webpage identification method and device with error-reported characteristic | |
CN105095309A (en) | Webpage processing method and device | |
KR101524618B1 (en) | Apparatus for colleting of harmful sites and method thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant |