CN105335511A - Webpage access method and device - Google Patents
- Publication number
- CN105335511A (application number CN201510725908.3A)
- Authority
- CN
- China
- Prior art keywords
- proxy server
- webpage
- access
- information
- restricted
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/957—Browsing optimisation, e.g. caching or content distillation
- G06F16/9574—Browsing optimisation, e.g. caching or content distillation of access to content, e.g. by caching
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/955—Retrieval from the web using information identifiers, e.g. uniform resource locators [URL]
- G06F16/9566—URL specific, e.g. using aliases, detecting broken or misspelled links
-
- G—PHYSICS
- G06—COMPUTING OR CALCULATING; COUNTING
- G06F—ELECTRIC DIGITAL DATA PROCESSING
- G06F16/00—Information retrieval; Database structures therefor; File system structures therefor
- G06F16/90—Details of database functions independent of the retrieved data types
- G06F16/95—Retrieval from the web
- G06F16/958—Organisation or management of web site content, e.g. publishing, maintaining pages or automatic linking
-
- H—ELECTRICITY
- H04—ELECTRIC COMMUNICATION TECHNIQUE
- H04L—TRANSMISSION OF DIGITAL INFORMATION, e.g. TELEGRAPHIC COMMUNICATION
- H04L67/00—Network arrangements or protocols for supporting network services or applications
- H04L67/50—Network services
- H04L67/56—Provisioning of proxy services
- H04L67/563—Data redirection of data network streams
Landscapes
- Engineering & Computer Science (AREA)
- Databases & Information Systems (AREA)
- Theoretical Computer Science (AREA)
- Data Mining & Analysis (AREA)
- Physics & Mathematics (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Computer Networks & Wireless Communication (AREA)
- Signal Processing (AREA)
- Information Transfer Between Computers (AREA)
Abstract
The invention provides a webpage access method and device. In embodiments of the invention, once access to a webpage is determined to be restricted, information of a proxy server is obtained so that the webpage can be accessed by using that information. Because the information of the proxy server is obtained automatically, the user does not need to search manually for websites that publish proxy servers; operation is simple and the success rate is high, which improves the efficiency and reliability of webpage access.
Description
[Technical Field]
The present invention relates to Internet technology, and in particular to a webpage access method and device.
[Background]
With the development of the Internet industry, the information provided by webpage content and the data displayed on webpages have become increasingly rich. When webpages are accessed, the websites to which some webpages belong are access-restricted websites, for example foreign websites or school websites, so these webpages cannot be accessed normally.
In this case, the user has to search with relevant keywords, for example "proxy server publishing website", to find websites that publish proxy servers. The user then visits those websites and configures each published proxy server in turn until an available proxy server allows the webpages to be accessed. This makes the operation complicated and time-consuming, and the success rate is low, which reduces the efficiency and reliability of webpage access.
[Summary of the Invention]
Aspects of the present invention provide a webpage access method and device for improving the efficiency and reliability of webpage access.
One aspect of the present invention provides a webpage access method, comprising:
determining that access to a webpage is restricted;
obtaining information of a proxy server; and
accessing the webpage by using the information of the proxy server.
In the above aspect and any possible implementation, an implementation is further provided in which determining that access to the webpage is restricted comprises:
obtaining an access request for the webpage;
determining, according to the access request for the webpage, that the webpage cannot be accessed;
determining, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and
determining that access to the webpage is restricted.
In the above aspect and any possible implementation, an implementation is further provided in which obtaining the information of the proxy server comprises:
obtaining the information of the proxy server according to an identifier of the webpage.
In the above aspect and any possible implementation, an implementation is further provided in which, before the information of the proxy server is obtained, the method further comprises:
obtaining a proxy server set by using a web crawler, the proxy server set including information of each of at least one available proxy server, so that the information of the proxy server can be obtained from the proxy server set.
In the above aspect and any possible implementation, an implementation is further provided in which, after the proxy server set is obtained by using the web crawler, the method further comprises:
performing quality verification on the at least one proxy server; and
filtering out the information of any proxy server that fails the quality verification.
Another aspect of the present invention provides a webpage access device, comprising:
an access unit, configured to determine that access to a webpage is restricted; and
an acquiring unit, configured to obtain information of a proxy server;
the access unit being further configured to access the webpage by using the information of the proxy server.
In the above aspect and any possible implementation, an implementation is further provided in which the access unit is further configured to:
obtain an access request for the webpage;
determine, according to the access request for the webpage, that the webpage cannot be accessed;
determine, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and
determine that access to the webpage is restricted.
In the above aspect and any possible implementation, an implementation is further provided in which the acquiring unit is specifically configured to:
obtain the information of the proxy server according to an identifier of the webpage.
In the above aspect and any possible implementation, an implementation is further provided in which the device further comprises a collecting unit, configured to:
obtain a proxy server set by using a web crawler, the proxy server set including information of each of at least one available proxy server, so that the information of the proxy server can be obtained from the proxy server set.
In the above aspect and any possible implementation, an implementation is further provided in which the collecting unit is further configured to:
perform quality verification on the at least one proxy server; and
filter out the information of any proxy server that fails the quality verification.
As can be seen from the above technical solutions, embodiments of the present invention determine that access to a webpage is restricted and then obtain information of a proxy server, so that the webpage can be accessed by using that information. Because the information of the proxy server is obtained automatically, the user does not need to search manually for websites that publish proxy servers; operation is simple and the success rate is high, which improves the efficiency and reliability of webpage access.
In addition, with the technical solutions provided by the present invention, quality verification is performed on each of the at least one available proxy server in the obtained proxy server set, and the information of any proxy server that fails the verification is filtered out, which effectively ensures the quality of the obtained proxy servers.
In addition, with the technical solutions provided by the present invention, the user does not need to search manually for websites that publish proxy servers; the process is completely transparent to the user, which effectively improves the user's access experience.
[Brief Description of the Drawings]
To describe the technical solutions in the embodiments of the present invention more clearly, the drawings used in the description of the embodiments or the prior art are briefly introduced below. The drawings described below show some embodiments of the present invention, and a person of ordinary skill in the art can derive other drawings from them without creative effort.
Fig. 1 is a schematic flowchart of a webpage access method provided by an embodiment of the present invention;
Fig. 2 is a schematic structural diagram of a webpage access device provided by another embodiment of the present invention;
Fig. 3 is a schematic structural diagram of a webpage access device provided by yet another embodiment of the present invention.
[Detailed Description]
To make the objectives, technical solutions, and advantages of the embodiments of the present invention clearer, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from these embodiments without creative effort fall within the protection scope of the present invention.
It can be understood that a webpage in the present invention, also called a web page or Web page, may be a webpage (Web Page) written in HyperText Markup Language (HTML), that is, an HTML page; or a webpage written in HTML and the Java language, that is, a Java Server Page (JSP); or a webpage written in another language. This embodiment places no particular limitation on this. A webpage may include display blocks, each defined by one or more webpage tags such as HTML tags or JSP tags. These blocks are called webpage elements, for example text, pictures, hyperlinks, buttons, input boxes, and drop-down boxes.
It should be noted that the terminal in the embodiments of the present invention may include, but is not limited to, a mobile phone, a personal digital assistant (PDA), a wireless handheld device, a tablet computer, a personal computer (PC), an MP3 player, an MP4 player, and a wearable device (for example, smart glasses, a smart watch, or a smart band).
In addition, the term "and/or" herein only describes an association between associated objects and indicates that three relationships are possible. For example, "A and/or B" may mean A alone, both A and B, or B alone. The character "/" herein generally indicates an "or" relationship between the objects before and after it.
Fig. 1 is a schematic flowchart of a webpage access method provided by an embodiment of the present invention. As shown in Fig. 1, the method includes the following steps.
101. Determine that access to a webpage is restricted.
102. Obtain information of a proxy server.
103. Access the webpage by using the information of the proxy server.
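As a rough illustration only, steps 101 to 103 can be sketched in Python as follows. The function name `access_webpage` and the three injected callables are hypothetical and do not appear in the patent text; they stand in for whatever mechanism an implementation uses to detect the restriction, obtain proxy information, and perform the access.

```python
from typing import Callable, Optional


def access_webpage(
    url: str,
    is_restricted: Callable[[str], bool],        # step 101 (hypothetical callable)
    get_proxy: Callable[[str], Optional[str]],   # step 102 (hypothetical callable)
    fetch: Callable[[str, Optional[str]], str],  # performs the access; proxy=None means direct
) -> str:
    """Sketch of steps 101-103: detect restriction, obtain a proxy, access via it."""
    # Step 101: determine whether access to the webpage is restricted.
    if not is_restricted(url):
        return fetch(url, None)
    # Step 102: obtain the information of a proxy server automatically.
    proxy = get_proxy(url)
    if proxy is None:
        raise RuntimeError(f"no proxy server information available for {url}")
    # Step 103: access the webpage by using the proxy server information.
    return fetch(url, proxy)
```

Passing the three behaviours in as callables keeps the control flow separate from networking details, so the same skeleton could run in a terminal application, a plug-in, or a server-side component.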
It should be noted that steps 101 to 103 may be executed by an application located in a local terminal, by a functional unit such as a plug-in or software development kit (SDK) within an application located in the local terminal, by a search engine located in a network-side server, or by a distributed system located on the network side. This embodiment places no particular limitation on this.
It can be understood that the application may be a native program (nativeApp) installed on the terminal, for example a browser application or the Mobile Baidu application, or may be a web program (webApp) running in a browser on the terminal. This embodiment places no particular limitation on this.
In this way, it is determined that access to a webpage is restricted and information of a proxy server is then obtained, so that the webpage can be accessed by using that information. Because the information of the proxy server is obtained automatically, the user does not need to search manually for websites that publish proxy servers; operation is simple and the success rate is high, which improves the efficiency and reliability of webpage access.
It should be noted that the webpage in this embodiment may be a webpage of a PC website or a webpage of a mobile website. This embodiment places no particular limitation on this.
At present, when an application such as a browser or the Baidu app accesses a webpage, it first downloads the main resource of the webpage and then parses and renders that main resource. When it parses the uniform resource locator (URL) of a sub-resource referenced in the main resource, it starts downloading that sub-resource and continues rendering the main resource according to the sub-resource. If the website to which the webpage belongs is an access-restricted website, the main resource of the webpage cannot be downloaded, and information indicating that the webpage cannot be accessed is output directly.
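The parsing step described above, in which sub-resource URLs are discovered in the main resource, can be illustrated with a minimal Python sketch based on the standard `html.parser` module. The class name `SubresourceCollector`, the helper `subresource_urls`, and the choice of tags (`img`, `script`, `link`) are assumptions made for this example; a real browser handles many more cases.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin


class SubresourceCollector(HTMLParser):
    """Collects URLs of sub-resources referenced by a page's main resource."""

    # Which attribute carries the sub-resource URL for each tag (illustrative subset).
    ATTR_BY_TAG = {"img": "src", "script": "src", "link": "href"}

    def __init__(self, base_url: str) -> None:
        super().__init__()
        self.base_url = base_url
        self.urls: list[str] = []

    def handle_starttag(self, tag, attrs):
        wanted = self.ATTR_BY_TAG.get(tag)
        if wanted is None:
            return
        for name, value in attrs:
            if name == wanted and value:
                # Resolve relative references against the main resource URL.
                self.urls.append(urljoin(self.base_url, value))


def subresource_urls(base_url: str, html: str) -> list[str]:
    collector = SubresourceCollector(base_url)
    collector.feed(html)
    collector.close()
    return collector.urls
```

If the main resource itself cannot be downloaded, there is nothing to feed into such a parser, which is why a restricted site fails before any sub-resource is requested.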
Optionally, in a possible implementation of this embodiment, in 101, an access request for the webpage may be obtained; it may be determined, according to the access request, that the webpage cannot be accessed; it may be determined, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and it may then be determined that access to the webpage is restricted.
After the access request for the webpage triggered by the user is obtained, the access request is sent to the server of the website to which the webpage belongs. If the website is an access-restricted website, the access request is blocked and cannot reach the server of the website. Information indicating that the webpage cannot be accessed is then received, and at this point it can be determined that the webpage cannot be accessed.
Because a webpage may be inaccessible for many reasons, after it is determined that the webpage cannot be accessed, the access restriction list also needs to be queried to determine whether the website to which the webpage belongs is an access-restricted website. If the website is in the access restriction list, it can be determined that the website is an access-restricted website.
In summary, because the webpage that the user wants to access cannot be accessed and the website to which it belongs is an access-restricted website, it can be determined that access to the webpage is restricted.
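The two-part check described above (the page cannot be accessed, and its site is on the access restriction list) might look like the following Python sketch. The names `site_of`, `is_access_restricted`, and `try_access` are hypothetical, and representing the access restriction list as a set of host names is an assumption of this example.

```python
from typing import Callable, Set
from urllib.parse import urlsplit


def site_of(url: str) -> str:
    """Return the lower-cased host name of the site a page belongs to."""
    return (urlsplit(url).hostname or "").lower()


def is_access_restricted(
    url: str,
    try_access: Callable[[str], bool],  # hypothetical: True if the page loaded
    restricted_sites: Set[str],         # assumed form of the access restriction list
) -> bool:
    # First condition: the access request fails.
    if try_access(url):
        return False
    # Second condition: a page can be unreachable for many reasons, so the
    # restriction list is consulted before concluding that access is restricted.
    return site_of(url) in restricted_sites
```

A page that fails to load but whose site is not on the list is treated as an ordinary failure, so no proxy is used for it. This sketch matches host names exactly and does not handle subdomains.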
In the present invention, in 102, the obtained information of the proxy server may include, but is not limited to, a uniform resource locator (URL) or uniform resource name (URN) of the proxy server, an IP address, or another access identifier. This embodiment places no particular limitation on this.
Optionally, in a possible implementation of this embodiment, in 102, the information of one proxy server may be obtained, or the information of multiple proxy servers may be obtained.
If the information of one proxy server is obtained, that information is used to perform the subsequent step 103.
If the information of multiple proxy servers is obtained, a preset selection strategy may be used to select the information of one proxy server first, and that information is then used to perform the subsequent step 103. If access to the webpage is still restricted, the information of the next proxy server is selected and the above operation is repeated until access to the webpage is no longer restricted.
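The retry loop over multiple proxy servers can be sketched as below. The selection strategy here is simply "try in the given order", which is an assumption, since the text only says the strategy is preset. `access_via_proxies` and `fetch_with_urllib` are hypothetical names; the latter shows one possible fetch function built on the standard `urllib.request.ProxyHandler`.

```python
import urllib.request
from typing import Callable, Iterable, Tuple


def fetch_with_urllib(url: str, proxy: str, timeout: float = 10.0) -> bytes:
    """One possible fetch function: route the request through `proxy` with urllib."""
    handler = urllib.request.ProxyHandler({"http": proxy, "https": proxy})
    opener = urllib.request.build_opener(handler)
    with opener.open(url, timeout=timeout) as response:
        return response.read()


def access_via_proxies(
    url: str,
    proxies: Iterable[str],
    fetch_via: Callable[[str, str], bytes] = fetch_with_urllib,
) -> Tuple[str, bytes]:
    """Try each proxy in order until one succeeds; return (proxy, body)."""
    for proxy in proxies:
        try:
            return proxy, fetch_via(url, proxy)
        except OSError:
            # urllib.error.URLError and ConnectionError are OSError subclasses;
            # treat them as "still restricted" and move on to the next proxy.
            continue
    raise RuntimeError(f"no proxy could access {url}")
```

Returning the proxy that worked, together with the page body, lets a caller record which proxy is usable for the page.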
Optionally, in a possible implementation of this embodiment, in 102, the information of the proxy server may be obtained according to an identifier of the webpage. Specifically, a mapping between a webpage and the information of its available proxy servers may be stored in advance, so as to associate the webpage with the proxy servers available to it. The information of the proxy server corresponding to the identifier of the webpage can then be obtained from the mapping, which ensures that the obtained proxy server information is usable.
In a specific implementation, the identifier of the webpage and the information of the proxy server corresponding to that identifier may be stored in correspondence with each other in a database or in a file system.
The identifier of the webpage may include, but is not limited to, a parameter name and a parameter value of the identifier; the information of the proxy server may include, but is not limited to, a parameter name and a parameter value of the information. This embodiment places no particular limitation on this.
The database may be a relational database, for example an Oracle database, a DB2 database, a SQL Server database, or a MySQL database, or may be a key-value database, for example a NoSQL (Not Only SQL) database or a Redis database. This embodiment places no particular limitation on this.
For example, the parameter name and parameter value of the identifier of each webpage, together with the parameter value of the information of the proxy server corresponding to that identifier, may be stored in correspondence in a database or in a file system. The parameter value of the proxy server information corresponding to the identifier of each webpage may be used as the Key, the parameter name and parameter value of the identifier of that webpage may be used as the Value, and the two may be stored in correspondence in a key-value database.
As another example, the parameter name and parameter value of the identifier of each webpage, together with the parameter name and parameter value of the information of the proxy server corresponding to that identifier, may be stored in correspondence in a database or in a file system. The parameter name and parameter value of the proxy server information corresponding to the identifier of each webpage may be used as the Key, the parameter name and parameter value of the identifier of that webpage may be used as the Value, and the two may be stored in correspondence in a key-value database.
It should be noted that, when the identifier of the webpage and the information of the proxy server are stored, at least one of the time of first storage (Init_time) and the time of the subsequent update (update_time) also needs to be recorded, to meet the basic needs of later management operations.
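A minimal sketch of such a store, using Python's built-in `sqlite3` module as the relational database, is given here. The table name `page_proxy`, the column names, and the function names are hypothetical; `init_time` and `update_time` follow the two timestamps named in the text. For lookup convenience this sketch keys the table on the webpage identifier, which is a choice made for the example and differs from the key-value layout described above, where the proxy information serves as the Key.

```python
import sqlite3
import time
from typing import Optional


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS page_proxy ("
        " page_id TEXT PRIMARY KEY,"   # identifier of the webpage
        " proxy_info TEXT NOT NULL,"   # e.g. URL or IP:port of the proxy
        " init_time REAL NOT NULL,"    # time of first storage
        " update_time REAL NOT NULL)"  # time of the latest update
    )
    return conn


def save_mapping(conn: sqlite3.Connection, page_id: str, proxy_info: str,
                 now: Optional[float] = None) -> None:
    now = time.time() if now is None else now
    # Update an existing row first; init_time is left untouched on updates.
    cur = conn.execute(
        "UPDATE page_proxy SET proxy_info = ?, update_time = ? WHERE page_id = ?",
        (proxy_info, now, page_id),
    )
    if cur.rowcount == 0:
        conn.execute(
            "INSERT INTO page_proxy VALUES (?, ?, ?, ?)",
            (page_id, proxy_info, now, now),
        )
    conn.commit()


def lookup_proxy(conn: sqlite3.Connection, page_id: str) -> Optional[str]:
    row = conn.execute(
        "SELECT proxy_info FROM page_proxy WHERE page_id = ?", (page_id,)
    ).fetchone()
    return row[0] if row else None
```

Keeping both timestamps makes it possible to expire or re-verify old entries during later management operations.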
Specifically, the database or the file system may be deployed on a storage device of the terminal.
For example, the storage device of the terminal may be a slow storage device, such as the hard disk of a computer system, or the non-running memory, that is, the physical storage, of a mobile phone, for example read-only memory (ROM) or a memory card. This embodiment places no particular limitation on this.
As another example, the storage device of the terminal may be a fast storage device, such as the memory of a computer system, or the running memory, that is, the system memory, of a mobile phone, for example random access memory (RAM). This embodiment places no particular limitation on this.
Optionally, in a possible implementation of this embodiment, before 102, a web crawler may further be used to obtain a proxy server set. The proxy server set includes information of each of at least one available proxy server, so that the information of the proxy server can be obtained from the proxy server set.
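The patent does not specify how the crawler recognizes proxy server information on a page. Purely as an assumption for illustration, the sketch below extracts `IP:port` pairs from already-downloaded page text with a regular expression; `PROXY_RE` and `extract_proxies` are hypothetical names, and the `http://` prefix is a convention chosen for this example.

```python
import re
from typing import Set

# Matches dotted-quad IPv4 addresses followed by ":port" (illustrative pattern).
PROXY_RE = re.compile(r"\b((?:\d{1,3}\.){3}\d{1,3}):(\d{2,5})\b")


def extract_proxies(page_text: str) -> Set[str]:
    """Build a proxy server set from the text of a crawled listing page."""
    found: Set[str] = set()
    for ip, port in PROXY_RE.findall(page_text):
        octets_ok = all(0 <= int(part) <= 255 for part in ip.split("."))
        port_ok = 0 < int(port) <= 65535
        if octets_ok and port_ok:
            found.add(f"http://{ip}:{port}")
    return found
```

Using a set removes duplicates when the same proxy is listed on several crawled pages.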
After the proxy server set is obtained by using the web crawler, quality verification may further be performed on the at least one proxy server, and the information of any proxy server that fails the quality verification may then be filtered out. Performing quality verification on each of the at least one available proxy server in the obtained proxy server set, and filtering out the information of those that fail, effectively ensures the quality of the obtained proxy servers.
Quality verification refers to verifying properties of a proxy server such as stability and timeliness, to ensure that the proxy server is usable. It can be understood that the quality verification may be performed periodically, for example once a day or once a week, which further ensures the quality of the obtained proxy servers.
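One simple way to approximate a stability check is to probe each proxy several times and keep only those that succeed often enough. The sketch below does this; `filter_proxies`, the `check` callable, and the thresholds `attempts` and `min_success` are all assumptions of this example, since the patent does not define how stability or timeliness is measured.

```python
from typing import Callable, Iterable, Set


def filter_proxies(
    proxies: Iterable[str],
    check: Callable[[str], bool],  # hypothetical probe: True if the proxy responded
    attempts: int = 3,
    min_success: int = 2,
) -> Set[str]:
    """Keep only proxies that pass at least `min_success` of `attempts` probes."""
    kept: Set[str] = set()
    for proxy in proxies:
        successes = sum(1 for _ in range(attempts) if check(proxy))
        if successes >= min_success:
            kept.add(proxy)
    return kept
```

Running such a function on a schedule, for example daily, over the stored proxy server set would correspond to the periodic verification mentioned above.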
In this embodiment, it is determined that access to a webpage is restricted and information of a proxy server is then obtained, so that the webpage can be accessed by using that information. Because the information of the proxy server is obtained automatically, the user does not need to search manually for websites that publish proxy servers; operation is simple and the success rate is high, which improves the efficiency and reliability of webpage access.
In addition, with the technical solutions provided by the present invention, quality verification is performed on each of the at least one available proxy server in the obtained proxy server set, and the information of any proxy server that fails the verification is filtered out, which effectively ensures the quality of the obtained proxy servers.
In addition, with the technical solutions provided by the present invention, the user does not need to search manually for websites that publish proxy servers; the process is completely transparent to the user, which effectively improves the user's access experience.
It should be noted that, for brevity, each of the foregoing method embodiments is described as a series of combined actions. However, a person skilled in the art should know that the present invention is not limited by the described order of actions, because according to the present invention some steps may be performed in another order or simultaneously. A person skilled in the art should also know that the embodiments described in the specification are all preferred embodiments, and the actions and modules involved are not necessarily required by the present invention.
In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the related descriptions of the other embodiments.
Fig. 2 is a schematic structural diagram of a webpage access device provided by another embodiment of the present invention. As shown in Fig. 2, the webpage access device of this embodiment may include an access unit 21 and an acquiring unit 22. The access unit 21 is configured to determine that access to a webpage is restricted; the acquiring unit 22 is configured to obtain information of a proxy server; and the access unit 21 is further configured to access the webpage by using the information of the proxy server.
It should be noted that the webpage access device provided by this embodiment may be an application located in a local terminal, a functional unit such as a plug-in or software development kit (SDK) within an application located in the local terminal, a search engine located in a network-side server, or a distributed system located on the network side. This embodiment places no particular limitation on this.
It can be understood that the application may be a native program (nativeApp) installed on the terminal, or may be a web program (webApp) running in a browser on the terminal. This embodiment places no particular limitation on this.
Optionally, in a possible implementation of this embodiment, the access unit 21 may be further configured to: obtain an access request for the webpage; determine, according to the access request, that the webpage cannot be accessed; determine, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and determine that access to the webpage is restricted.
Optionally, in a possible implementation of this embodiment, the acquiring unit 22 may be specifically configured to obtain the information of the proxy server according to an identifier of the webpage.
Optionally, in a possible implementation of this embodiment, as shown in Fig. 3, the webpage access device provided by this embodiment may further include a collecting unit 31, which may be configured to obtain a proxy server set by using a web crawler. The proxy server set includes information of each of at least one available proxy server, so that the information of the proxy server can be obtained from the proxy server set.
Further, the collecting unit 31 may also be configured to perform quality verification on the at least one proxy server, and to filter out the information of any proxy server that fails the quality verification.
It should be noted that the method in the embodiment corresponding to Fig. 1 can be implemented by the webpage access device provided by this embodiment. For a detailed description, refer to the related content in the embodiment corresponding to Fig. 1; details are not repeated here.
In this embodiment, the access unit determines that access to a webpage is restricted and the acquiring unit then obtains information of a proxy server, so that the access unit can access the webpage by using that information. Because the information of the proxy server is obtained automatically, the user does not need to search manually for websites that publish proxy servers; operation is simple and the success rate is high, which improves the efficiency and reliability of webpage access.
In addition, with the technical solutions provided by the present invention, the collecting unit performs quality verification on each of the at least one available proxy server in the obtained proxy server set and filters out the information of any proxy server that fails the verification, which effectively ensures the quality of the obtained proxy servers.
In addition, with the technical solutions provided by the present invention, the user does not need to search manually for websites that publish proxy servers; the process is completely transparent to the user, which effectively improves the user's access experience.
Those skilled in the art can be well understood to, and for convenience and simplicity of description, the system of foregoing description, the specific works process of device and unit, with reference to the corresponding process in preceding method embodiment, can not repeat them here.
In several embodiment provided by the present invention, should be understood that, disclosed system, apparatus and method, can realize by another way.Such as, device embodiment described above is only schematic, such as, the division of described unit, be only a kind of logic function to divide, actual can have other dividing mode when realizing, such as multiple unit or assembly can in conjunction with or another system can be integrated into, or some features can be ignored, or do not perform.Another point, shown or discussed coupling each other or direct-coupling or communication connection can be by some interfaces, and the indirect coupling of device or unit or communication connection can be electrical, machinery or other form.
The described unit illustrated as separating component or can may not be and physically separates, and the parts as unit display can be or may not be physical location, namely can be positioned at a place, or also can be distributed in multiple network element.Some or all of unit wherein can be selected according to the actual needs to realize the object of the present embodiment scheme.
In addition, each functional unit in each embodiment of the present invention can be integrated in a processing unit, also can be that the independent physics of unit exists, also can two or more unit in a unit integrated.Above-mentioned integrated unit both can adopt the form of hardware to realize, and the form that hardware also can be adopted to add SFU software functional unit realizes.
The above-mentioned integrated unit realized with the form of SFU software functional unit, can be stored in a computer read/write memory medium.Above-mentioned SFU software functional unit is stored in a storage medium, comprising some instructions in order to make a computer equipment (can be personal computer, server, or the network equipment etc.) or processor (processor) perform the part steps of method described in each embodiment of the present invention.And aforesaid storage medium comprises: USB flash disk, portable hard drive, ROM (read-only memory) (Read-OnlyMemory, ROM), random access memory (RandomAccessMemory, RAM), magnetic disc or CD etc. various can be program code stored medium.
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that they may still modify the technical solutions described in the foregoing embodiments or make equivalent replacements of some of their technical features, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A method for accessing a webpage, characterized by comprising:
determining that access to a webpage is restricted;
obtaining information of a proxy server; and
accessing the webpage by using the information of the proxy server.
2. The method according to claim 1, characterized in that determining that access to the webpage is restricted comprises:
obtaining an access request for the webpage;
determining, according to the access request for the webpage, that the webpage cannot be accessed;
determining, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and
determining that access to the webpage is restricted.
3. The method according to claim 1, characterized in that obtaining the information of the proxy server comprises:
obtaining the information of the proxy server according to an identifier of the webpage.
4. The method according to any one of claims 1 to 3, characterized in that, before obtaining the information of the proxy server, the method further comprises:
obtaining a proxy server set by using a web crawler, the proxy server set comprising information of each of at least one available proxy server, so that the information of the proxy server is obtained according to the proxy server set.
5. The method according to claim 4, characterized in that, after obtaining the proxy server set by using the web crawler, the method further comprises:
performing quality verification on the at least one proxy server; and
filtering out the information of any proxy server that fails the quality verification.
6. An apparatus for accessing a webpage, characterized by comprising:
an access unit, configured to determine that access to a webpage is restricted; and
an obtaining unit, configured to obtain information of a proxy server;
wherein the access unit is further configured to access the webpage by using the information of the proxy server.
7. The apparatus according to claim 6, characterized in that the access unit is further configured to:
obtain an access request for the webpage;
determine, according to the access request for the webpage, that the webpage cannot be accessed;
determine, according to an access restriction list, that the website to which the webpage belongs is an access-restricted website; and
determine that access to the webpage is restricted.
8. The apparatus according to claim 6, characterized in that the obtaining unit is specifically configured to:
obtain the information of the proxy server according to an identifier of the webpage.
9. The apparatus according to any one of claims 6 to 8, characterized in that the apparatus further comprises a collection unit, configured to:
obtain a proxy server set by using a web crawler, the proxy server set comprising information of each of at least one available proxy server, so that the information of the proxy server is obtained according to the proxy server set.
10. The apparatus according to claim 9, characterized in that the collection unit is further configured to:
perform quality verification on the at least one proxy server; and
filter out the information of any proxy server that fails the quality verification.
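The flow recited in the method claims (detect a restricted page, pick a proxy, retry through it, and filter the proxy set by quality) can be illustrated with a minimal Python sketch. It is not the applicant's implementation. The names `Proxy`, `filter_proxies`, `is_access_restricted`, `select_proxy`, and `access_page` are hypothetical, the network call is abstracted behind an injected `fetch` callable, and mapping a page to a proxy by hashing its URL is only an assumed way of using the page identifier; the claims do not specify how the identifier selects a proxy.

```python
from dataclasses import dataclass
from typing import Callable, Iterable, List, Optional, Set
from urllib.parse import urlsplit


@dataclass(frozen=True)
class Proxy:
    """Information of one proxy server (hypothetical shape: host and port)."""
    host: str
    port: int


# fetch(url, proxy) returns the page body, or None when the page cannot be reached.
Fetch = Callable[[str, Optional[Proxy]], Optional[str]]


def filter_proxies(candidates: Iterable[Proxy],
                   verify: Callable[[Proxy], bool]) -> List[Proxy]:
    """Keep only the proxies that pass quality verification."""
    return [p for p in candidates if verify(p)]


def is_access_restricted(url: str, direct_failed: bool,
                         restricted_sites: Set[str]) -> bool:
    """Restricted means: the page could not be accessed AND its site is on the list."""
    site = urlsplit(url).hostname or ""
    return direct_failed and site in restricted_sites


def select_proxy(url: str, pool: List[Proxy]) -> Optional[Proxy]:
    """Assumption: use the URL as the page identifier and hash it into the pool."""
    if not pool:
        return None
    return pool[sum(url.encode("utf-8")) % len(pool)]


def access_page(url: str, fetch: Fetch, restricted_sites: Set[str],
                pool: List[Proxy]) -> Optional[str]:
    """Try direct access first; fall back to a proxy only for restricted pages."""
    body = fetch(url, None)
    if body is not None:
        return body
    if not is_access_restricted(url, True, restricted_sites):
        return None  # failure is not attributed to an access restriction
    proxy = select_proxy(url, pool)
    return fetch(url, proxy) if proxy is not None else None
```

In practice the proxy set would come from a web crawler and `verify` would probe each proxy (for example, for reachability and latency); both are left as injected inputs here so the sketch stays self-contained.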
Priority Applications (5)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510725908.3A CN105335511A (en) | 2015-10-30 | 2015-10-30 | Webpage access method and device |
| US15/745,987 US20180225387A1 (en) | 2015-10-30 | 2016-05-23 | Method and apparatus for accessing webpage, apparatus and non-volatile computer storage medium |
| EP16858633.7A EP3273362A4 (en) | 2015-10-30 | 2016-05-23 | Webpage access method, apparatus, device and non-volatile computer storage medium |
| PCT/CN2016/082981 WO2017071189A1 (en) | 2015-10-30 | 2016-05-23 | Webpage access method, apparatus, device and non-volatile computer storage medium |
| JP2017548061A JP6488508B2 (en) | 2015-10-30 | 2016-05-23 | Web page access method, apparatus, device, and program |
Applications Claiming Priority (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510725908.3A CN105335511A (en) | 2015-10-30 | 2015-10-30 | Webpage access method and device |
Publications (1)
| Publication Number | Publication Date |
|---|---|
| CN105335511A (en) | 2016-02-17 |
Family
ID=55286038
Family Applications (1)
| Application Number | Priority Date | Filing Date | Title |
|---|---|---|---|
| CN201510725908.3A Pending CN105335511A (en) | 2015-10-30 | 2015-10-30 | Webpage access method and device |
Country Status (5)
| Country | Link |
|---|---|
| US (1) | US20180225387A1 (en) |
| EP (1) | EP3273362A4 (en) |
| JP (1) | JP6488508B2 (en) |
| CN (1) | CN105335511A (en) |
| WO (1) | WO2017071189A1 (en) |
Cited By (6)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017071189A1 (en) * | 2015-10-30 | 2017-05-04 | 百度在线网络技术(北京)有限公司 | Webpage access method, apparatus, device and non-volatile computer storage medium |
| CN108769278A (en) * | 2018-04-11 | 2018-11-06 | 北京中科闻歌科技股份有限公司 | A kind of social media account management method and system |
| CN110147271A (en) * | 2019-05-15 | 2019-08-20 | 重庆八戒传媒有限公司 | Promote the method, apparatus and computer readable storage medium of crawler agent quality |
| CN111428179A (en) * | 2020-03-19 | 2020-07-17 | 北大方正集团有限公司 | Image monitoring method, device and electronic equipment |
| CN111767450A (en) * | 2020-07-27 | 2020-10-13 | 深圳快学教育科技有限公司 | Browser data acquisition system and method |
| CN112583780A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Method, device, system and equipment for accessing website data by using proxy IP |
Families Citing this family (13)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US8560604B2 (en) | 2009-10-08 | 2013-10-15 | Hola Networks Ltd. | System and method for providing faster and more efficient data communication |
| US9241044B2 (en) | 2013-08-28 | 2016-01-19 | Hola Networks, Ltd. | System and method for improving internet communication by using intermediate nodes |
| US11023846B2 (en) | 2015-04-24 | 2021-06-01 | United Parcel Service Of America, Inc. | Location-based pick up and delivery services |
| US11057446B2 (en) | 2015-05-14 | 2021-07-06 | Bright Data Ltd. | System and method for streaming content from multiple servers |
| EP3472717B1 (en) | 2017-08-28 | 2020-12-02 | Luminati Networks Ltd. | Method for improving content fetching by selecting tunnel devices |
| US11190374B2 (en) | 2017-08-28 | 2021-11-30 | Bright Data Ltd. | System and method for improving content fetching by selecting tunnel devices |
| US20210067577A1 (en) | 2019-02-25 | 2021-03-04 | Luminati Networks Ltd. | System and method for url fetching retry mechanism |
| CN111641664B (en) * | 2019-03-01 | 2023-12-05 | 北京京东尚科信息技术有限公司 | A crawler device service request method, device, system and storage medium |
| LT4027618T (en) | 2019-04-02 | 2024-08-26 | Bright Data Ltd. | Managing a non-direct url fetching service |
| US10637956B1 (en) * | 2019-10-01 | 2020-04-28 | Metacluster It, Uab | Smart proxy rotator |
| CN111488392B (en) * | 2020-04-16 | 2023-07-07 | 北京思特奇信息技术股份有限公司 | Query method, query system and electronic equipment |
| CN114595253A (en) * | 2022-02-22 | 2022-06-07 | 深圳海域信息技术有限公司 | Brand monitoring method, device, electronic device and medium |
| KR102681000B1 (en) * | 2023-02-28 | 2024-07-04 | 쿠팡 주식회사 | Operating method for electronic apparatus for managing transmission of information and electronic apparatus supporting thereof |
Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| CN101102313A (en) * | 2007-06-21 | 2008-01-09 | 潘晓梅 | Network download system and method with automatically replaced proxy server and its method |
| US20080195665A1 (en) * | 2007-02-09 | 2008-08-14 | Proctor & Stevenson Limited | Tracking web server |
| CN101931635A (en) * | 2009-06-18 | 2010-12-29 | 北京搜狗科技发展有限公司 | Network resource access method and proxy device |
| CN102694772A (en) * | 2011-03-23 | 2012-09-26 | 腾讯科技(深圳)有限公司 | Apparatus, system and method for accessing internet web pages |
| CN104462570A (en) * | 2014-12-26 | 2015-03-25 | 小米科技有限责任公司 | Webpage content obtaining method and device |
Family Cites Families (14)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US6829638B1 (en) * | 2000-08-03 | 2004-12-07 | International Business Machines Corporation | System and method for managing multiple proxy servers |
| US7483910B2 (en) * | 2002-01-11 | 2009-01-27 | International Business Machines Corporation | Automated access to web content based on log analysis |
| US20030145046A1 (en) * | 2002-01-31 | 2003-07-31 | Keller S. Brandon | Generating a list of addresses on a proxy server |
| CN101800758B (en) * | 2009-02-09 | 2012-09-05 | 华为终端有限公司 | Mobile terminal network visiting method, system and gateway |
| US20100205215A1 (en) * | 2009-02-11 | 2010-08-12 | Cook Robert W | Systems and methods for enforcing policies to block search engine queries for web-based proxy sites |
| US9009330B2 (en) * | 2010-04-01 | 2015-04-14 | Cloudflare, Inc. | Internet-based proxy service to limit internet visitor connection speed |
| US9049244B2 (en) * | 2011-04-19 | 2015-06-02 | Cloudflare, Inc. | Registering for internet-based proxy services |
| CN103024933B (en) * | 2011-09-28 | 2016-01-20 | 腾讯科技(深圳)有限公司 | A kind of method of mobile Internet access system and access mobile Internet |
| US9386114B2 (en) * | 2011-12-28 | 2016-07-05 | Google Inc. | Systems and methods for accessing an update server |
| CN103678311B (en) * | 2012-08-31 | 2018-11-13 | 腾讯科技(深圳)有限公司 | Web access method and system, crawl Routing Service device based on transfer mode |
| US9241044B2 (en) * | 2013-08-28 | 2016-01-19 | Hola Networks, Ltd. | System and method for improving internet communication by using intermediate nodes |
| CN104767837B (en) * | 2014-01-08 | 2018-08-24 | 阿里巴巴集团控股有限公司 | A kind of method and device of identification agent IP address |
| CN103973682B (en) * | 2014-04-30 | 2018-09-04 | 北京奇虎科技有限公司 | Carry out the method and device of web page access |
| CN105335511A (en) * | 2015-10-30 | 2016-02-17 | 百度在线网络技术(北京)有限公司 | Webpage access method and device |
- 2015
  - 2015-10-30 CN CN201510725908.3A patent/CN105335511A/en active Pending
- 2016
  - 2016-05-23 JP JP2017548061A patent/JP6488508B2/en active Active
  - 2016-05-23 WO PCT/CN2016/082981 patent/WO2017071189A1/en not_active Ceased
  - 2016-05-23 EP EP16858633.7A patent/EP3273362A4/en not_active Ceased
  - 2016-05-23 US US15/745,987 patent/US20180225387A1/en not_active Abandoned
Patent Citations (5)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| US20080195665A1 (en) * | 2007-02-09 | 2008-08-14 | Proctor & Stevenson Limited | Tracking web server |
| CN101102313A (en) * | 2007-06-21 | 2008-01-09 | 潘晓梅 | Network download system and method with automatically replaced proxy server and its method |
| CN101931635A (en) * | 2009-06-18 | 2010-12-29 | 北京搜狗科技发展有限公司 | Network resource access method and proxy device |
| CN102694772A (en) * | 2011-03-23 | 2012-09-26 | 腾讯科技(深圳)有限公司 | Apparatus, system and method for accessing internet web pages |
| CN104462570A (en) * | 2014-12-26 | 2015-03-25 | 小米科技有限责任公司 | Webpage content obtaining method and device |
Cited By (9)
| Publication number | Priority date | Publication date | Assignee | Title |
|---|---|---|---|---|
| WO2017071189A1 (en) * | 2015-10-30 | 2017-05-04 | 百度在线网络技术(北京)有限公司 | Webpage access method, apparatus, device and non-volatile computer storage medium |
| CN108769278A (en) * | 2018-04-11 | 2018-11-06 | 北京中科闻歌科技股份有限公司 | A kind of social media account management method and system |
| CN110147271A (en) * | 2019-05-15 | 2019-08-20 | 重庆八戒传媒有限公司 | Promote the method, apparatus and computer readable storage medium of crawler agent quality |
| CN110147271B (en) * | 2019-05-15 | 2020-04-28 | 重庆八戒传媒有限公司 | Method and device for improving quality of crawler proxy and computer readable storage medium |
| CN112583780A (en) * | 2019-09-30 | 2021-03-30 | 北京国双科技有限公司 | Method, device, system and equipment for accessing website data by using proxy IP |
| CN112583780B (en) * | 2019-09-30 | 2023-04-07 | 北京国双科技有限公司 | Method, device, system and equipment for accessing website data by using proxy IP |
| CN111428179A (en) * | 2020-03-19 | 2020-07-17 | 北大方正集团有限公司 | Image monitoring method, device and electronic equipment |
| CN111428179B (en) * | 2020-03-19 | 2023-09-19 | 新方正控股发展有限责任公司 | Picture monitoring method and device and electronic equipment |
| CN111767450A (en) * | 2020-07-27 | 2020-10-13 | 深圳快学教育科技有限公司 | Browser data acquisition system and method |
Also Published As
| Publication number | Publication date |
|---|---|
| WO2017071189A1 (en) | 2017-05-04 |
| EP3273362A1 (en) | 2018-01-24 |
| JP2018514846A (en) | 2018-06-07 |
| EP3273362A4 (en) | 2018-04-25 |
| JP6488508B2 (en) | 2019-03-27 |
| US20180225387A1 (en) | 2018-08-09 |
Similar Documents
| Publication | Title | Publication Date |
|---|---|---|
| CN105335511A (en) | Webpage access method and device | |
| US9245274B2 (en) | Identifying selected dynamic content regions | |
| CN104965764A (en) | Static resource processing method and apparatus | |
| US9734257B2 (en) | Exported overlays | |
| US20170083527A1 (en) | Surfacing applications based on browsing activity | |
| US9251283B2 (en) | Instrumenting a website with dynamically generated code | |
| CN112384940B (en) | Mechanism for crawling e-commerce resource pages on the web | |
| CN104331474A (en) | Page processing method and device | |
| CN111427577A (en) | Code processing method and device and server | |
| CN104731869A (en) | Page display method and device | |
| CN109284450B (en) | Method and device for determining order forming paths, storage medium and electronic equipment | |
| US20120072918A1 (en) | Generation of generic universal resource indicators | |
| CN103177096A (en) | Page element positioning method based on text attribute and page element positioning device based on text attribute | |
| CN109074401B (en) | Extraction of primary content of a linked list | |
| CN104951536B (en) | Searching method and device | |
| US7496843B1 (en) | Web construction framework controller and model tiers | |
| CN105260463A (en) | Detail page processing method and apparatus | |
| CN114238839A (en) | Page generation method and device, electronic equipment and storage medium | |
| KR101352259B1 (en) | Advertisement providing method for general personal computer or mobile terminal and mobile advertisement building method for supporting the same | |
| CN113282285A (en) | Code compiling method and device, electronic equipment and storage medium | |
| CN119226590B (en) | Website data updating method, device, equipment and computer medium | |
| CN104657882A (en) | Method and device for obtaining popularization effect data | |
| CN101145936A (en) | A method and system for adding tags in Web pages | |
| CN117010926A (en) | User preference mining method and device, electronic equipment and medium | |
| CN105677672A (en) | Page display method and device |
Legal Events
| Date | Code | Title | Description |
|---|---|---|---|
| | C06 | Publication | |
| | PB01 | Publication | |
| | C10 | Entry into substantive examination | |
| | SE01 | Entry into force of request for substantive examination | |
| | RJ01 | Rejection of invention patent application after publication | |
| | RJ01 | Rejection of invention patent application after publication | Application publication date: 20160217 |