Detailed Description
So that the manner in which the features and elements of the disclosed embodiments can be understood in detail, a more particular description of the disclosed embodiments, briefly summarized above, may be had by reference to the embodiments, some of which are illustrated in the appended drawings. In the following description of the technology, for purposes of explanation, numerous details are set forth in order to provide a thorough understanding of the disclosed embodiments. However, one or more embodiments may be practiced without these details. In other instances, well-known structures and devices may be shown in simplified form in order to simplify the drawing.
The terms "first," "second," and the like in the description and in the claims, and the above-described drawings of embodiments of the present disclosure, are used for distinguishing between similar elements and not necessarily for describing a particular sequential or chronological order. It should be understood that the data so used may be interchanged under appropriate circumstances such that embodiments of the present disclosure described herein may be made. Furthermore, the terms "comprising" and "having," as well as any variations thereof, are intended to cover non-exclusive inclusions.
The term "plurality" means two or more unless otherwise specified.
In the embodiments of the present disclosure, the character "/" indicates that the preceding and following objects are in an "or" relationship. For example, A/B represents: A or B.
The term "and/or" describes an associative relationship between objects and indicates that three relationships may exist. For example, A and/or B represents: A, or B, or A and B.
The term "correspond" may refer to an association or binding relationship; "A corresponds to B" means that there is an association or binding relationship between A and B.
The technical solution of the embodiments of the present disclosure can be applied to a smart terminal or a server. In some embodiments, the smart terminal includes a smartphone, a tablet, a computer, or another device capable of accessing a website.
In the embodiments of the present disclosure, the smart terminal or the server is used to distinguish the URL links of a target website. When the smart terminal or the server accesses the target website, it can determine from the ICP filing information of the target website whether a URL link to be distinguished is an internal link or an external link of the target website, which facilitates subsequent processing of the link.
Referring to fig. 1, an embodiment of the present disclosure provides a method for distinguishing an internal link and an external link of a target website, including:
Step S101, the electronic device obtains first ICP (Internet Content Provider) filing information of the target website.
Step S102, the electronic device obtains the URL link to be distinguished of the target website.
Step S103, the electronic device distinguishes, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website.
With the method for distinguishing internal links and external links of a target website provided by the embodiments of the present disclosure, the first ICP filing information of the target website and the URL link to be distinguished of the target website are obtained, and the URL link to be distinguished is then classified as an internal link or an external link of the target website according to the first ICP filing information. Because the classification uses the first ICP filing information of the target website, there is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Optionally, the target website is a non-commercial website.
Optionally, the electronic device obtaining the first ICP filing information of the target website includes: the electronic device accesses a first website home page of the target website; the electronic device extracts the content of the first website home page; and the electronic device obtains the first ICP filing information of the target website from the content of the first website home page. In this way, the first ICP filing information is obtained from the home page content and is then used to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website. There is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Optionally, the electronic device accessing the first website home page of the target website includes: the electronic device obtains a URL link of the target website; the electronic device obtains the home page address of that URL link; and the electronic device accesses the first website home page corresponding to the home page address. Optionally, the home page address of the URL link of the target website is the website address corresponding to the host field of the URL link of the target website.
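As an illustration only, deriving a home page address from the host field of a URL link could be sketched as follows. The function name home_page_address and the default "http" scheme are assumptions made for this sketch and are not part of the disclosure.

```python
from urllib.parse import urlsplit


def home_page_address(url: str, default_scheme: str = "http") -> str:
    """Return the home page address implied by the host field of `url`.

    Hypothetical helper: the disclosure only states that the home page
    address corresponds to the host field of the URL link.
    """
    # urlsplit only recognises the host when a scheme is present,
    # so a default scheme is prepended to bare links such as "www.aaa.com/a".
    if "://" not in url:
        url = f"{default_scheme}://{url}"
    parts = urlsplit(url)
    # Keep only scheme and host; drop path, query, and fragment.
    return f"{parts.scheme}://{parts.netloc}/"
```

For a link such as www.aaa.com/a/b/files/202111/index.html, the sketch keeps only the host part and discards the path and file name.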
In some embodiments, the electronic device accesses a first website homepage of the target website, then extracts content of the first website homepage through a crawler, and the electronic device obtains first ICP filing information of the target website from the content of the first website homepage.
Optionally, the first ICP record information of the target website is an ICP record number of a first website home page of the target website.
In some embodiments, according to the relevant policies and regulations, a non-commercial internet information service provider must file its website information before the website goes online and obtain ICP filing information, that is, an ICP filing number. The ICP filing number is then placed at the bottom of the website home page for public inquiry and verification, so the ICP filing information can be regarded as a public attribute field of the content of the website home page. The ICP filing number is therefore particularly suitable for distinguishing internal links from external links of a non-commercial website: there is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the non-commercial website is improved. Moreover, by using this attribute specific to non-commercial websites, the distinction can be completed simply by parsing the ICP filing information, a general attribute, in the page content, which improves the simplicity, efficiency, and accuracy of distinguishing internal links from external links of the target website.
Optionally, the electronic device obtaining the first ICP filing information of the target website from the content of the first website home page includes: the electronic device performs format matching on the content of the first website home page and, when a field matching a preset format is found, determines that field as the first ICP filing information of the target website.
Optionally, the preset format is the format of the ICP filing number.
In some embodiments, the format of the ICP filing number is: "province abbreviation + ICP filing mark" + "main ICP filing number" + "website serial number"; or, the format of the ICP filing number is: "ICP gets to" + "main ICP filing number" + "website serial number".
Optionally, under the condition that the first ICP filing information of the target website is not acquired, the electronic device extracts the URL link to be distinguished of the target website from the first website home page; the electronic equipment distinguishes whether the URL link to be distinguished is an internal link or an external link of the target website by using a URL identification method.
Optionally, the electronic device obtaining the URL link to be distinguished of the target website includes: when the electronic device has obtained the first ICP filing information of the target website, the electronic device extracts the URL link to be distinguished of the target website from the first website home page. In this way, the URL link to be distinguished is extracted once the first ICP filing information is available, so that it can be classified as an internal link or an external link of the target website according to the first ICP filing information. There is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Optionally, the electronic device extracting the URL link to be distinguished of the target website from the first website home page includes: the electronic device performs format matching on the content of the first website home page; and when a field matching a preset URL format is found, the electronic device determines that field as a URL link to be distinguished of the target website.
Optionally, the URL links to be distinguished include secondary links of the target website, tertiary links of the target website, and the like.
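The extraction of URL links by format matching might look like the following sketch. The disclosure does not specify the preset URL format, so the pattern below (absolute http/https links only) and the names URL_PATTERN and extract_candidate_links are assumptions.

```python
import re
from typing import List

# Assumed "preset URL format": scheme, host, optional port, optional path.
URL_PATTERN = re.compile(r"https?://[A-Za-z0-9.-]+(?::\d+)?(?:/[^\s\"'<>]*)?")


def extract_candidate_links(page_content: str) -> List[str]:
    """Collect fields of the home page content that match the URL format."""
    links: List[str] = []
    for match in URL_PATTERN.finditer(page_content):
        url = match.group(0)
        if url not in links:  # keep first occurrence, preserve page order
            links.append(url)
    return links
```

Relative links would need to be resolved against the home page address before matching; that step is left out of this sketch.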
Optionally, the electronic device distinguishing, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website includes: the electronic device obtains a first host field of the first website home page; the electronic device obtains a second host field of the URL link to be distinguished; when the first host field is the same as the second host field, the electronic device determines the URL link to be distinguished as an internal link of the target website; and when the first host field is different from the second host field, the electronic device distinguishes, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website. In this way, links whose host field equals that of the home page are accepted as internal links directly, and only links with a different host field are classified by ICP filing information. There is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
In some embodiments, the URL link includes: a host field, a path field, and a file field. For example, in www.aaa.com/a/b/files/202111/index.html, www.aaa.com is the host field; a/b/files/202111 is the path field; index.html is the file field; and aaa.com is the domain name field.
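A small sketch of splitting a URL link into the host, path, and file fields named above, using a link of the form www.aaa.com/a/b/files/202111/index.html. The names UrlFields and split_url_fields are hypothetical. The domain name field is deliberately not derived here, because finding the registrable domain reliably requires a public suffix list.

```python
from typing import NamedTuple
from urllib.parse import urlsplit


class UrlFields(NamedTuple):
    host: str
    path: str
    file: str


def split_url_fields(url: str) -> UrlFields:
    """Split a URL link into host, path (directory), and file fields."""
    if "://" not in url:
        url = "http://" + url  # assumed default scheme so the host is parsed
    parts = urlsplit(url)
    # Everything before the last "/" is the path field, the rest is the file field.
    directory, _, file_name = parts.path.rpartition("/")
    return UrlFields(parts.hostname or "", directory.strip("/"), file_name)
```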
Referring to fig. 2, another method for distinguishing an internal link and an external link of a target website is provided in an embodiment of the present disclosure, including:
Step S201, the electronic device accesses a first website home page of the target website.
Step S202, the electronic device extracts the content of the first website home page.
Step S203, the electronic device obtains first ICP filing information of the target website from the content of the first website home page.
Step S204, the electronic device obtains the URL link to be distinguished of the target website.
Step S205, the electronic device obtains a first host field of the first website home page.
Step S206, the electronic device obtains a second host field of the URL link to be distinguished.
Step S207, when the first host field is the same as the second host field, the electronic device determines the URL link to be distinguished as an internal link of the target website.
Step S208, when the first host field is different from the second host field, the electronic device distinguishes, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website.
With the method for distinguishing internal links and external links of a target website provided by the embodiments of the present disclosure, the first ICP filing information of the target website is obtained from the first website home page of the target website, the URL link to be distinguished of the target website is obtained from the first website home page, and the URL link to be distinguished is then classified as an internal link or an external link of the target website according to the first host field, the second host field, and the first ICP filing information. There is therefore no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Referring to fig. 3, another method for distinguishing between an internal link and an external link of a target website is provided in an embodiment of the present disclosure, which includes:
Step S301, the electronic device accesses a first website home page of the target website.
Step S302, the electronic device extracts the content of the first website home page.
Step S303, the electronic device obtains first ICP filing information of the target website from the content of the first website home page.
Step S304, when the electronic device has obtained the first ICP filing information of the target website, the electronic device extracts the URL link to be distinguished of the target website from the first website home page.
Step S305, the electronic device obtains a first host field of the first website home page.
Step S306, the electronic device obtains a second host field of the URL link to be distinguished.
Step S307, when the first host field is the same as the second host field, the electronic device determines the URL link to be distinguished as an internal link of the target website.
Step S308, when the first host field is different from the second host field, the electronic device distinguishes, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website.
With the method for distinguishing internal links and external links of a target website provided by the embodiments of the present disclosure, the first ICP filing information of the target website is obtained from the first website home page of the target website; when the first ICP filing information has been obtained, the URL link to be distinguished of the target website is extracted from the first website home page; and the URL link to be distinguished is then classified as an internal link or an external link of the target website according to the first host field, the second host field, and the first ICP filing information. There is therefore no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Optionally, when the first host field is different from the second host field, the electronic device distinguishing, according to the first ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website includes: the electronic device accesses, according to the second host field, a second website home page of the URL link to be distinguished; the electronic device extracts the content of the second website home page; the electronic device obtains second ICP filing information of the URL link to be distinguished from the content of the second website home page; and the electronic device distinguishes, according to the first ICP filing information and the second ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website. In this way, when the two host fields differ, the second ICP filing information is obtained by way of the second host field and compared with the first ICP filing information. There is no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
Optionally, the electronic device accessing, according to the second host field, the second website home page of the URL link to be distinguished includes: the electronic device accesses the second website home page that corresponds to the second host field of the URL link to be distinguished.
Optionally, the electronic device obtaining the second ICP filing information of the URL link to be distinguished from the content of the second website home page includes: the electronic device performs format matching on the content of the second website home page; and when a field matching the preset format is found, the electronic device determines that field as the second ICP filing information of the URL link to be distinguished.
In some embodiments, when the first host field is different from the second host field, the electronic device accesses the second website home page that corresponds to the second host field of the URL link to be distinguished, then extracts the content of the second website home page through a crawler, and obtains the second ICP filing information of the URL link to be distinguished from that content by format matching.
Optionally, the electronic device distinguishing, according to the first ICP filing information and the second ICP filing information, whether the URL link to be distinguished is an internal link or an external link of the target website includes: when the first ICP filing information is the same as the second ICP filing information, the electronic device determines that the URL link to be distinguished is an internal link of the target website; and/or, when the first ICP filing information is different from the second ICP filing information, the electronic device determines that the URL link to be distinguished is an external link of the target website. There is therefore no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
In some embodiments, the first ICP filing information and the second ICP filing information are determined to be the same when their "province abbreviation" fields, "main ICP filing number" fields, and "website serial number" fields are all the same.
In some embodiments, the first ICP filing information and the second ICP filing information are determined to be different when any of the "province abbreviation" field, the "main ICP filing number" field, and the "website serial number" field differs between them.
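The field-by-field comparison described in the two paragraphs above could be sketched as follows. The type IcpFiling and the functions same_filing and classify_by_filing are hypothetical names, and the sample filing values in the usage note are invented for illustration.

```python
from typing import NamedTuple


class IcpFiling(NamedTuple):
    province: str      # "province abbreviation" field
    main_number: str   # "main ICP filing number" field
    site_serial: str   # "website serial number" field


def same_filing(first: IcpFiling, second: IcpFiling) -> bool:
    """True only when all three fields are identical."""
    return (
        first.province == second.province
        and first.main_number == second.main_number
        and first.site_serial == second.site_serial
    )


def classify_by_filing(first: IcpFiling, second: IcpFiling) -> str:
    """Same filing information: internal link; otherwise: external link."""
    return "internal" if same_filing(first, second) else "external"
```

For example, two filings that differ only in the website serial number are treated as different under this rule, so the link would be classified as external.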
Optionally, after acquiring the first filing information and the second filing information, the electronic device stores the first filing information and the second filing information.
Referring to fig. 4, another method for distinguishing between an internal link and an external link of a target website is provided in an embodiment of the present disclosure, which includes:
Step S401, the electronic device accesses a first website home page of the target website.
Step S402, the electronic device extracts the content of the first website home page.
Step S403, the electronic device obtains first ICP filing information of the target website from the content of the first website home page.
Step S404, when the electronic device has obtained the first ICP filing information of the target website, the electronic device extracts the URL link to be distinguished of the target website from the first website home page.
Step S405, the electronic device obtains a first host field of the first website home page.
Step S406, the electronic device obtains a second host field of the URL link to be distinguished.
Step S407, the electronic device determines whether the first host field is the same as the second host field; if the first host field is the same as the second host field, step S412 is executed; if the first host field is different from the second host field, step S408 is executed.
Step S408, the electronic device accesses, according to the second host field, the second website home page of the URL link to be distinguished.
Step S409, the electronic device extracts the content of the second website home page.
Step S410, the electronic device obtains second ICP filing information of the URL link to be distinguished from the content of the second website home page.
Step S411, the electronic device determines whether the first ICP filing information is the same as the second ICP filing information; if the first ICP filing information is the same as the second ICP filing information, step S412 is executed; if the first ICP filing information is different from the second ICP filing information, step S413 is executed.
Step S412, the electronic device determines that the URL link to be distinguished is an internal link of the target website.
Step S413, the electronic device determines that the URL link to be distinguished is an external link of the target website.
With the method for distinguishing internal links and external links of a target website provided by the embodiments of the present disclosure, the first ICP filing information of the target website is obtained from the first website home page of the target website, and, when the first ICP filing information has been obtained, the URL link to be distinguished of the target website is extracted from the first website home page. When the first host field is the same as the second host field, the URL link is determined to be an internal link of the target website. When the first host field is different from the second host field, the second ICP filing information of the URL link is obtained from the second website home page of the URL link, and the URL link to be distinguished is classified as an internal link or an external link of the target website according to the first ICP filing information and the second ICP filing information. There is therefore no need to consider misjudgment caused by the relationship among the main domain name of the target website, the IP address of the target website, the main domain name of the URL link to be distinguished, and the IP address of the URL link to be distinguished, and the accuracy of distinguishing internal links from external links of the target website is improved.
In some embodiments, when internal and external links are determined for the target website www.xxx1.gov.cn, the obtained URL links to be distinguished of the target website include URL1 "www.xxx1.gov.cn/hudong/hdjl/..." and URL2 "jw.xxx1.gov.cn/xxgk/zfxxgkml/...". If URL1 and URL2 are distinguished by the URL identification method, the main domain name of the target website is xxx1.gov.cn, the main domain name of URL1 is xxx1.gov.cn, and the main domain name of URL2 is xxx1.gov.cn; the three main domain names are the same, and domain name resolution shows that they share a common IP address, x.x.111.13. According to the URL identification method, URL1 and URL2 are therefore both internal links of the target website. However, the obtained ICP filing information of the target website is the same as that of URL1 and different from that of URL2, so when the method for distinguishing internal links and external links of this embodiment is used, URL1 is determined to be an internal link of the target website and URL2 an external link of the target website. Checking the sponsoring or organizing units of the target website, URL1, and URL2 shows that the sponsoring unit of the target website and URL1 is the xx1 municipal xx2 government, while the sponsoring unit of URL2 is the xx1 municipal xx3 committee; that is, URL2 is indeed an external link of the target website. Therefore, by using the ICP filing information of the target website, this scheme can avoid misjudging external links that have the same IP address and the same main domain name as internal links, and the accuracy of distinguishing internal links from external links of the target website is improved.
In some embodiments, when internal and external links are determined for the target website www.xxx2.gov.cn, the obtained URL links to be distinguished of the target website include URL3 "www.xxx2.gov.cn/col/col80524/index.html" and URL4 "www.xxx2.cn/col/col80524/index.html". If URL3 and URL4 are distinguished by the URL identification method, the main domain name of the target website is xxx2.gov.cn, the main domain name of URL3 is xxx2.gov.cn, and the main domain name of URL4 is xxx2.cn. The main domain name of the target website is the same as that of URL3, and domain name resolution shows that both have the same IP address, 119.188.x.x, so URL3 is an internal link of the target website. The main domain name of the target website is different from that of URL4, and domain name resolution gives the IP address of URL4 as 202.110.x.x, which is different from the IP address of the target website, so URL4 is an external link of the target website. However, the obtained ICP filing information of the target website is the same as that of both URL3 and URL4, so when the method for distinguishing internal links and external links of this embodiment is used, both URL3 and URL4 are determined to be internal links of the target website. Checking the sponsoring or organizing units of the target website, URL3, and URL4 shows that all three are sponsored by the xx3 and xx4 government; that is, URL3 and URL4 are both internal links of the target website.
Therefore, by using the ICP filing information of the target website, this scheme can avoid misjudging internal links that have different IP addresses and not entirely identical main domain names as external links, and the accuracy of distinguishing internal links from external links of the target website is improved.
In some embodiments, in business cloud and intensive website construction scenarios, the URL identification method cannot accurately distinguish internal links from external links. The embodiments of the present disclosure introduce ICP filing information to classify the URL links to be distinguished, so that whether a URL link is an internal link or an external link of the target website is decided from the perspective of the management domain, which improves the accuracy of the classification.
Referring to FIG. 5, an apparatus for distinguishing the internal links and external links of a target website according to an embodiment of the present disclosure includes a first obtaining module 1, a second obtaining module 2, and a distinguishing module 3. The first obtaining module 1 is configured to obtain first ICP filing information of a target website; the second obtaining module 2 is configured to obtain a URL link to be distinguished of the target website; and the distinguishing module 3 is configured to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information.
With the apparatus for distinguishing the internal links and external links of a target website provided by the embodiment of the present disclosure, the first ICP filing information of the target website and the URL link to be distinguished of the target website are obtained, and the URL link to be distinguished is then distinguished as an internal link or an external link of the target website according to the first ICP filing information. In this way, misjudgments arising from the relationship among the main domain name and IP address of the target website and the main domain name and IP address of the URL link to be distinguished need not be considered. The first ICP filing information of the target website is used to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website, which improves the accuracy of distinguishing the internal links from the external links of the target website.
Optionally, the first obtaining module is configured to obtain the first ICP filing information of the target website by: accessing a first website home page of a target website; extracting the content of a first website home page; and acquiring first ICP filing information of the target website from the content of the first website home page.
Optionally, the second obtaining module is configured to obtain the URL link to be distinguished of the target website by: extracting the URL link to be distinguished of the target website from the first website home page in the case where the first ICP filing information of the target website has been obtained.
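Extracting the URL links to be distinguished from the home page content might look like the following sketch, which uses Python's standard `html.parser` module. The names `LinkCollector` and `extract_links` are hypothetical, and restricting the result to `http`/`https` anchors is an assumption of this sketch rather than a requirement stated in the disclosure.

```python
from html.parser import HTMLParser
from typing import List
from urllib.parse import urljoin


class LinkCollector(HTMLParser):
    """Collects absolute http(s) URLs from <a href="..."> tags."""

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.links: List[str] = []

    def handle_starttag(self, tag, attrs):
        if tag != "a":
            return
        for name, value in attrs:
            if name == "href" and value:
                # Resolve relative links against the home page URL.
                absolute = urljoin(self.base_url, value)
                if absolute.startswith(("http://", "https://")):
                    self.links.append(absolute)


def extract_links(base_url: str, page_content: str) -> List[str]:
    """Return the links found in the page, in document order."""
    collector = LinkCollector(base_url)
    collector.feed(page_content)
    return collector.links
```

Each returned URL is then a candidate "URL link to be distinguished" for the distinguishing step.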
Optionally, the distinguishing module is configured to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information by: acquiring a first host field of the first website home page; acquiring a second host field of the URL link to be distinguished; in the case where the first host field is the same as the second host field, determining that the URL link to be distinguished is an internal link of the target website; and in the case where the first host field is different from the second host field, distinguishing whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information.
Optionally, the distinguishing module is configured to distinguish, in the case where the first host field is different from the second host field, whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information by: accessing, according to the second host field, a second website home page of the URL link to be distinguished; extracting the content of the second website home page; acquiring second ICP filing information of the URL link to be distinguished from the content of the second website home page; and distinguishing whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information and the second ICP filing information.
Optionally, the distinguishing module is configured to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website according to the first ICP filing information and the second ICP filing information by: in the case where the first ICP filing information is the same as the second ICP filing information, determining that the URL link to be distinguished is an internal link of the target website; and/or, in the case where the first ICP filing information is different from the second ICP filing information, determining that the URL link to be distinguished is an external link of the target website.
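The decision flow of the distinguishing module (host field comparison first, ICP filing comparison second) can be sketched as follows. The names `host_field` and `classify_link` and the injected `lookup_icp` callable are hypothetical. The URL's host name is used here as a stand-in for the host field, and a candidate whose filing information cannot be obtained is treated as external; both are assumptions of this sketch.

```python
from typing import Callable, Optional
from urllib.parse import urlsplit


def host_field(url: str) -> str:
    """Return the lowercase host of a URL, tolerating a missing scheme."""
    if "://" not in url:
        url = "//" + url
    return (urlsplit(url).hostname or "").lower()


def classify_link(
    target_home_url: str,
    first_icp: str,
    candidate_url: str,
    lookup_icp: Callable[[str], Optional[str]],
) -> str:
    """Return "internal" or "external" for candidate_url.

    lookup_icp maps a host to the ICP filing information found on that
    host's home page, or None if none was found (hypothetical callable).
    """
    first_host = host_field(target_home_url)
    second_host = host_field(candidate_url)

    # Same host field: internal link, no ICP lookup needed.
    if first_host == second_host:
        return "internal"

    # Different host field: compare the ICP filing information.
    second_icp = lookup_icp(second_host)
    if second_icp is not None and second_icp == first_icp:
        return "internal"
    return "external"
```

A usage example with made-up filing strings, where a dictionary lookup stands in for accessing the second website home page:

```python
records = {"www.xxx2.cn": "FILING-A", "www.other.example": "FILING-B"}
result = classify_link(
    "http://www.xxx2.gov.cn/",
    "FILING-A",
    "http://www.xxx2.cn/col/col80524/index.html",
    records.get,
)
```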
As shown in FIG. 6, an apparatus for distinguishing the internal links and external links of a target website according to an embodiment of the present disclosure includes a processor 100 and a memory 101. Optionally, the apparatus may also include a communication interface 102 and a bus 103. The processor 100, the communication interface 102, and the memory 101 may communicate with each other via the bus 103. The communication interface 102 may be used for information transfer. The processor 100 may invoke logic instructions in the memory 101 to perform the method for distinguishing the internal links and external links of a target website of the above-described embodiments.
In addition, the logic instructions in the memory 101 may be implemented in the form of software functional units and stored in a computer readable storage medium when the logic instructions are sold or used as independent products.
The memory 101, which is a computer-readable storage medium, may be used for storing software programs, computer-executable programs, such as program instructions/modules corresponding to the methods in the embodiments of the present disclosure. The processor 100 executes functional applications and data processing by executing program instructions/modules stored in the memory 101, namely, implements the method for distinguishing between an internal link and an external link of a target website in the above embodiments.
The memory 101 may include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function, and the data storage area may store data created according to the use of the terminal device, and the like. In addition, the memory 101 may include a high-speed random access memory, and may also include a nonvolatile memory.
With the apparatus for distinguishing the internal links and external links of a target website provided by the embodiment of the present disclosure, the first ICP filing information of the target website and the URL link to be distinguished of the target website can be obtained, and the URL link to be distinguished is then distinguished as an internal link or an external link of the target website according to the first ICP filing information. In this way, misjudgments arising from the relationship among the main domain name and IP address of the target website and the main domain name and IP address of the URL link to be distinguished need not be considered. The first ICP filing information of the target website is used to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website, which improves the accuracy of distinguishing the internal links from the external links of the target website.
The embodiment of the disclosure provides an electronic device, which includes the above device for distinguishing the internal link and the external link of a target website.
With the electronic device provided by the embodiment of the present disclosure, the first ICP filing information of the target website and the URL link to be distinguished of the target website are obtained, and the URL link to be distinguished is then distinguished as an internal link or an external link of the target website according to the first ICP filing information. In this way, misjudgments arising from the relationship among the main domain name and IP address of the target website and the main domain name and IP address of the URL link to be distinguished need not be considered. The first ICP filing information of the target website is used to distinguish whether the URL link to be distinguished is an internal link or an external link of the target website, which improves the accuracy of distinguishing the internal links from the external links of the target website.
Optionally, the electronic device comprises a smart terminal or a server. Optionally, the smart terminal comprises a smartphone, a tablet, a computer, or another device capable of accessing a website.
The disclosed embodiments provide a storage medium storing computer-executable instructions configured to perform the above-described method for distinguishing between internal and external links of a target website.
The disclosed embodiments provide a computer program product comprising a computer program stored on a computer-readable storage medium, the computer program comprising program instructions which, when executed by a computer, cause the computer to perform the above-described method for distinguishing between an internal link and an external link of a target website.
The computer-readable storage medium described above may be a transitory computer-readable storage medium or a non-transitory computer-readable storage medium.
The technical solution of the embodiments of the present disclosure may be embodied in the form of a software product. The computer software product is stored in a storage medium and includes one or more instructions that enable a computer device (which may be a personal computer, a server, or a network device) to execute all or part of the steps of the method of the embodiments of the present disclosure. The aforementioned storage medium may be a non-transitory storage medium, including a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code, or it may be a transitory storage medium.
The above description and drawings sufficiently illustrate embodiments of the disclosure to enable those skilled in the art to practice them. Other embodiments may incorporate structural, logical, electrical, process, and other changes. The examples merely typify possible variations. Individual components and functions are optional unless explicitly required, and the sequence of operations may vary. Portions and features of some embodiments may be included in or substituted for those of others. Furthermore, the words used in the specification are words of description only and are not intended to limit the claims. As used in the description of the embodiments and the claims, the singular forms "a", "an" and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. Similarly, the term "and/or" as used in this application is meant to encompass any and all possible combinations of one or more of the associated listed items. Furthermore, the terms "comprises" and/or "comprising," when used in this application, specify the presence of stated features, integers, steps, operations, elements, and/or components, but do not preclude the presence or addition of one or more other features, integers, steps, operations, elements, components, and/or groups thereof. Without further limitation, an element defined by the phrase "comprising a …" does not exclude the presence of other like elements in a process, method or apparatus that comprises the element. In this document, each embodiment may be described with emphasis on its differences from other embodiments, and for the same or similar parts the embodiments may refer to one another. Where a method, product, or the like disclosed in an embodiment corresponds to a method section disclosed in an embodiment, reference may be made to the description of that method section for the relevant details.
Those of skill in the art would appreciate that the various illustrative elements and algorithm steps described in connection with the embodiments disclosed herein may be implemented as electronic hardware or combinations of computer software and electronic hardware. Whether such functionality is implemented as hardware or software may depend upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the disclosed embodiments. It can be clearly understood by the skilled person that, for convenience and brevity of description, the specific working processes of the system, the apparatus and the unit described above may refer to the corresponding processes in the foregoing method embodiments, and are not described herein again.
In the embodiments disclosed herein, the disclosed methods and products (including but not limited to devices, apparatuses, etc.) may be implemented in other ways. For example, the above-described apparatus embodiments are merely illustrative. The division of the units may be merely a logical division, and in actual implementation there may be another division; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not executed. In addition, the shown or discussed mutual coupling, direct coupling, or communication connection may be an indirect coupling or communication connection through some interfaces, devices or units, and may be in an electrical, mechanical or other form. The units described as separate parts may or may not be physically separate, and parts displayed as units may or may not be physical units; they may be located in one place or distributed on a plurality of network units. Some or all of the units can be selected according to actual needs to implement the present embodiment. In addition, functional units in the embodiments of the present disclosure may be integrated into one processing unit, or each unit may exist alone physically, or two or more units may be integrated into one unit.
The flowchart and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to embodiments of the present disclosure. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). In some alternative implementations, the functions noted in the block may occur out of the order noted in the figures. For example, two blocks shown in succession may, in fact, be executed substantially concurrently, or the blocks may sometimes be executed in the reverse order, depending upon the functionality involved. In the description corresponding to the flowcharts and block diagrams in the figures, operations or steps corresponding to different blocks may also occur in different orders than disclosed in the description, and sometimes there is no specific order between the different operations or steps. For example, two sequential operations or steps may in fact be executed substantially concurrently, or they may sometimes be executed in the reverse order, depending upon the functionality involved. Each block of the block diagrams and/or flowchart illustrations, and combinations of blocks in the block diagrams and/or flowchart illustrations, can be implemented by special purpose hardware-based systems that perform the specified functions or acts, or combinations of special purpose hardware and computer instructions.