CN102693237B - Webpage content adaptation and encapsulation system and method - Google Patents
Webpage content adaptation and encapsulation system and method
- Publication number
- CN102693237B (application number CN201110071330.6A)
- Authority
- CN
- China
- Prior art keywords
- tree
- content
- web page
- ast
- abstract syntax
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Expired - Fee Related
Landscapes
- Information Transfer Between Computers (AREA)
Abstract
The present invention discloses a system and method for adapting and encapsulating webpage content. The method comprises: generating a set of related abstract syntax trees (ASTs) from the Document Object Model (DOM) tree of a source webpage; adapting the text, pictures, audio, and video in each AST with a separate adaptation strategy for each content type; classifying the content of each AST by topic; and adding application extensions to the AST according to the classification result and a value-added service strategy. Finally, taking the AST as the basic unit of granularity, a default or user-customized display template is used to encapsulate the ASTs into a web page that matches the terminal's characteristics, the user's personal preferences, and the application extensions, and the page is returned to the terminal for display.
Description
Technical Field
The present invention relates to technology for adapting the display of WEB pages to different terminals, and in particular to a system and method for adapting and encapsulating webpage content.
Background Art
At present, WEB pages on the Internet are generally accessed from PC clients. With the convergence of telecommunications, broadcasting, and Internet networks ("three-network convergence"), accessing Internet information from mobile phones and television terminals is becoming the trend. As smartphones become more widespread, accessing WEB information from mobile terminals is also becoming more convenient. However, because the vast majority of web pages on the Internet are designed specifically for PC terminals, other terminals face an adaptation and conversion problem when they access WEB sites.
Chinese patent CN101815093A discloses a method for adapting webpage content on a mobile terminal. The method extracts valid data from a source webpage or other data sources, organizes the data into webpage form, and sends it to the terminal for display. This method applies a DOM-tree-based adaptation strategy directly to modify the structure and content of the webpage, which reduces the coupling between the source webpage and terminal performance and suits mobile terminals such as mobile phones. However, it cannot meet the differentiated requirements of different user terminals. Moreover, because the page structure of WEB sites is complex, deleting large amounts of information directly from the DOM tree easily destroys the basic structure of the original child nodes.
Summary of the Invention
The object of the present invention is to provide a system and method for adapting and encapsulating webpage content that solves the above problems.
To achieve this object, the present invention provides a method for adapting and encapsulating webpage content, comprising:
(1) a source webpage and parameter acquisition step: acquiring the source webpage and the terminal characteristic parameters;
(2) an abstract syntax tree generation step: using the DOM (Document Object Model) tree of the source webpage, generating a set of related abstract syntax trees according to the terminal characteristic parameters;
(3) an adaptation step: performing adaptation and conversion on each item of content in the abstract syntax trees;
(4) a new-application adding step: classifying the content of each abstract syntax tree by topic, and adding new nodes and content to the abstract syntax tree according to the classification result;
(5) a webpage encapsulation step: taking the abstract syntax tree as the basic unit of granularity, using a display template to reorganize and encapsulate the abstract syntax trees into a web page, and displaying the webpage content on the display terminal.
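Step (2) can be illustrated with a minimal sketch using only Python's standard library. The patent does not specify how the DOM tree is split into a set of trees or how the terminal parameters influence the split, so the rule used here (one tree per top-level element of `<body>`) and the names `Node`, `TreeBuilder`, and `build_ast_set` are assumptions made for illustration only.

```python
from html.parser import HTMLParser

# Elements that never have a closing tag; they must not be pushed on the stack.
VOID_TAGS = {"img", "br", "hr", "input", "meta", "link"}


class Node:
    """One node of the simplified tree (hypothetical structure)."""

    def __init__(self, tag, attrs=None, text=""):
        self.tag = tag
        self.attrs = dict(attrs or {})
        self.text = text
        self.children = []


class TreeBuilder(HTMLParser):
    """Builds a DOM-like tree of Node objects from HTML text."""

    def __init__(self):
        super().__init__()
        self.root = Node("root")
        self.stack = [self.root]

    def handle_starttag(self, tag, attrs):
        node = Node(tag, attrs)
        self.stack[-1].children.append(node)
        if tag not in VOID_TAGS:
            self.stack.append(node)

    def handle_endtag(self, tag):
        # Pop back to the nearest matching open tag, if there is one.
        for i in range(len(self.stack) - 1, 0, -1):
            if self.stack[i].tag == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        if data.strip():
            self.stack[-1].children.append(Node("#text", text=data.strip()))


def find(node, tag):
    """Depth-first search for the first node with the given tag."""
    if node.tag == tag:
        return node
    for child in node.children:
        hit = find(child, tag)
        if hit is not None:
            return hit
    return None


def build_ast_set(html):
    """Return one subtree per top-level element of <body> (assumed split rule)."""
    builder = TreeBuilder()
    builder.feed(html)
    body = find(builder.root, "body") or builder.root
    return [child for child in body.children if child.tag != "#text"]
```

Each returned subtree can then be adapted, classified, and encapsulated independently, which is the sense in which the tree serves as the basic unit of granularity.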
In the above technical solution, the step of acquiring the source webpage and the terminal characteristic parameters further comprises:
(1-1) the terminal sends the proxy server a request that includes the destination URL and product-related parameters;
(1-2) on receiving the user's request, the proxy server parses out the requested URL and the product model of the terminal device, and looks up the corresponding profile document in the profile document library or database. If the document is found, it is parsed to obtain the parameters of the terminal device. If there is no profile document or database record for that product model, the terminal is asked to send its detailed parameters and product model to the proxy server again, and a corresponding profile document is generated and stored.
The terminal characteristic parameters include: terminal product model, screen size, screen resolution, color quality, terminal storage capacity, and processor parameters.
The classification in step (4) uses a text classifier trained by a machine learning method, and that classifier assigns the content of each AST to topics by category and level.
The present invention also provides a system for adapting and encapsulating webpage content, the system comprising:
a source webpage and parameter acquisition module, configured to parse the request sent by the user to the terminal server and to acquire the source webpage and the terminal characteristic parameters;
an abstract syntax tree generation module, configured to use the DOM (Document Object Model) tree of the source webpage to generate a set of related abstract syntax trees according to the terminal characteristic parameters;
an adaptation module, configured to perform adaptation and conversion on each item of content in the abstract syntax trees;
a new-application adding module, configured to classify the content of each abstract syntax tree by topic and to add new nodes and content to the abstract syntax tree according to the classification result;
a webpage encapsulation module, configured to take the abstract syntax tree as the basic unit of granularity, use a display template to reorganize and encapsulate the abstract syntax trees into a web page, and display the webpage content on the display terminal.
In the above technical solution, the new-application adding module further comprises: a classifier unit, configured to classify the content of the AST; and an application adding unit, configured to add an extended application to the corresponding AST according to the classifier's result.
The abstract syntax tree adaptation module further comprises: a text content adaptation unit; a picture, audio, and video adaptation unit; an index segmentation adaptation unit; and a new-application adding unit.
The system further comprises a cache module, configured to cache the ASTs corresponding to previously visited web pages.
The present invention maintains a profile document library or a performance parameter database that stores the performance parameters of terminal devices. Terminal characteristics include the terminal product model, screen size, screen resolution, color quality, terminal storage capacity, processor parameters, and so on. After they are first acquired, these parameters are stored in the profile document library or database as XML documents or data records, so that the calibrated parameters for a given terminal product model can later be parsed from the profile library or database. According to the terminal capability characteristics and the adaptation strategy, a set of related ASTs is generated from the DOM tree of the source webpage requested by the user, and the different kinds of content in the trees, such as text, pictures, audio, and video, are adapted and converted. A text classifier is trained by supervised learning and used to classify the content of each AST by topic, and new service content is added to the AST according to the classification result and the value-added service strategy. Taking the AST as the basic unit of granularity, a default or user-customized display template is used to reorganize and encapsulate the ASTs into a web page.
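As an illustration of storing terminal parameters as an XML profile document, the sketch below parses one such document with Python's `xml.etree.ElementTree`. The patent does not define the XML schema, so the element and attribute names (`profile`, `screen`, `storage`, `cpu`, and their attributes), the units, and the `TerminalProfile` class are all hypothetical.

```python
import xml.etree.ElementTree as ET
from dataclasses import dataclass


@dataclass
class TerminalProfile:
    """Terminal characteristic parameters (field names are assumptions)."""
    model: str
    screen_width: int
    screen_height: int
    color_depth: int
    storage_mb: int
    cpu_mhz: int


# A made-up profile document; the real schema is not given in the patent.
SAMPLE_PROFILE = """<profile model="TV-1000">
  <screen width="1280" height="720" colorDepth="24"/>
  <storage mb="512"/>
  <cpu mhz="800"/>
</profile>"""


def parse_profile(xml_text):
    """Parse a profile document into a TerminalProfile."""
    root = ET.fromstring(xml_text)
    screen = root.find("screen")
    return TerminalProfile(
        model=root.get("model"),
        screen_width=int(screen.get("width")),
        screen_height=int(screen.get("height")),
        color_depth=int(screen.get("colorDepth")),
        storage_mb=int(root.find("storage").get("mb")),
        cpu_mhz=int(root.find("cpu").get("mhz")),
    )
```

For example, `parse_profile(SAMPLE_PROFILE).screen_width` yields the stored width, which later adaptation steps could use to size content.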
The advantages of the present invention are as follows. The invention uses a set of ASTs to adapt and encapsulate WEB pages according to terminal capability characteristics and, taking the AST as the basic unit of granularity, applies a default display strategy suited to the characteristics of each kind of terminal, which solves the problem of Internet web pages displaying poorly on those terminals. In addition, machine learning can be used during AST adaptation to identify the topic category of each AST, which provides a way to add extended applications that match the webpage content.
Brief Description of the Drawings
FIG. 1 is a schematic diagram of the composition of the webpage content adaptation and encapsulation system of the present invention;
FIG. 2 is a block diagram of the functional composition of the webpage content adaptation and encapsulation system of the present invention;
FIG. 3 is a block diagram of the functional composition of the abstract syntax tree (AST) adaptation module of the present invention;
FIG. 4 is a flowchart of the webpage content adaptation and encapsulation method provided by the present invention and based on the above system;
FIG. 5 is a schematic flowchart of the steps for acquiring the source webpage and the terminal characteristic parameters according to the present invention;
FIG. 6 is a flowchart of the webpage content adaptation and encapsulation method after the source webpage has been acquired according to the present invention;
FIG. 7 is a schematic diagram, provided by the present invention, of a specific example of representing webpage content as an AST.
Detailed Description
The present invention is described in detail below with reference to the accompanying drawings and specific embodiments.
The present invention adapts and encapsulates webpage content between various display terminals and WEB servers. Using machine learning together with terminal capability characteristics, an adaptation strategy, and a value-added strategy, it generates a set of related ASTs from the DOM tree of the source webpage and adapts and converts each item of content in the trees, such as text, pictures, audio, and video. Taking the AST as the basic unit of granularity, it applies different default display strategies for the different display characteristics of mobile phone or television terminals and reorganizes and encapsulates the ASTs into a web page.
FIG. 1 is a schematic diagram of the composition of the webpage content adaptation and encapsulation system of the present invention. The system comprises:
a source webpage and parameter acquisition module 1, which acquires the source webpage and the terminal characteristic parameters in a suitable manner;
an abstract syntax tree generation module 3, configured to generate a set of related abstract syntax trees from the DOM tree of the source webpage according to the terminal characteristic parameters;
an adaptation module 4, configured to adapt the ASTs according to the different kinds of content in each region element; and
a webpage encapsulation module 6, configured to encapsulate the web page according to the template library and the output of the AST adaptation module.
FIG. 2 is a block diagram of a specific implementation of the functional modules of the webpage content adaptation and encapsulation system of the present invention. The system comprises:
a source webpage and terminal characteristic parameter acquisition module 1, configured to parse the terminal's request and thereby acquire the source webpage and the terminal characteristic parameters;
a webpage parsing and filtering module 2, configured to parse the acquired web page to obtain the DOM tree of the source webpage;
an abstract syntax tree set generation module 3, configured to generate a set of related abstract syntax trees from the DOM tree of the source webpage according to the terminal characteristic parameters;
an abstract syntax tree adaptation module 4, configured to adapt the ASTs according to the different kinds of content in each region element.
In addition, before the web page is encapsulated, a new-application adding stage may also be applied, which in turn comprises:
a classifier 8, configured to classify the content of the AST; and a new-application adding module 5, configured to add an extended application to the corresponding AST according to the classifier's result.
A webpage encapsulation module 6 is configured to encapsulate the web page according to the template library module 7 and the output of the AST adaptation module, and corresponds specifically to webpage encapsulation module 104.
The system of the present invention further comprises a cache module, configured to store the abstract syntax trees formed from a number of source webpages. This module speeds up the user terminal's access to those web pages.
FIG. 3 is a detailed structural block diagram of the abstract syntax tree (AST) adaptation module, showing the modules it further comprises. The AST adaptation strategies are described in detail as follows:
(1). Text content adaptation module
For regions where text is concentrated, or pages with a high proportion of text, first-sentence conversion or keyword conversion is used: the first sentence or the keywords are displayed in the web page as hypertext links, presented as a list, and the remaining content is returned to the terminal only when the user clicks a link. The resulting page is clean and can be operated quickly with the arrow keys of a remote control or a mobile phone keypad.
If the text is long and hierarchically structured, a strategy of first-sentence links organized as a hierarchical table of contents can be used.
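The first-sentence conversion can be sketched as follows. This is a simplified illustration, not the patented implementation: the sentence splitter is a naive regular expression that also stops at decimal points and abbreviations, and the `/block/<id>` link scheme and function names are hypothetical.

```python
import re
from html import escape


def first_sentence(text):
    """Return the text up to and including the first sentence-ending mark."""
    match = re.search(r"[.!?。！？]", text)
    return text[: match.end()] if match else text


def text_blocks_to_link_list(blocks):
    """Turn (block_id, text) pairs into an HTML list of first-sentence links.

    The full text of a block would be served only when its link is followed.
    """
    items = []
    for block_id, text in blocks:
        label = escape(first_sentence(text.strip()))
        items.append(f'<li><a href="/block/{escape(str(block_id))}">{label}</a></li>')
    return "<ul>" + "".join(items) + "</ul>"
```

Presenting the links as a plain list keeps the page navigable with arrow keys, which matches the remote-control and keypad use case described above.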
(2). Picture, audio, and video adaptation module
Pictures are handled either by displaying a compressed thumbnail by default or by a marked-link method. The first method by default compresses the original picture by 25% to 75%, and only the compressed thumbnail appears in the final page. The second method uses the picture's title or alt text as a hypertext link pointing to the picture, and the picture is sent to the terminal device for display only after the user clicks the link. Audio and video are presented as text links that use their titles.
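The two picture strategies can be sketched as below. The patent does not say whether the 25% to 75% compression applies to pixel dimensions or to file size; this sketch assumes it is a reduction of each pixel dimension, and the function names are hypothetical. Actual image resampling is omitted.

```python
from html import escape


def thumbnail_size(width, height, reduction):
    """Compute thumbnail dimensions for a reduction between 25% and 75%.

    Assumption: `reduction` is the fraction removed from each dimension,
    so 0.5 halves both width and height.
    """
    if not 0.25 <= reduction <= 0.75:
        raise ValueError("reduction must be between 0.25 and 0.75")
    scale = 1.0 - reduction
    return max(1, round(width * scale)), max(1, round(height * scale))


def image_as_link(src, alt, title=""):
    """Replace a picture with a text link labelled by its title or alt text."""
    label = title or alt or "image"
    return f'<a href="{escape(src, quote=True)}">{escape(label)}</a>'
```

The link form defers the download until the user asks for the picture, which is the behaviour the second method describes.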
(3). Index segmentation adaptation module
When the content structure of the source page is complex, the page is divided logically, in document order, at list tags, paragraphs, tables, and similar elements. The first-sentence link method is then applied so that each hypertext link points to the content of one segment.
If these sub-blocks cannot all be shown on a single terminal page, they can be subdivided further, and the content within each sub-block is split and linked by the same principle. When there are many sub-blocks, previous and next navigation is used to show links to the preceding and following sub-blocks.
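The previous/next navigation over sub-blocks amounts to simple pagination. The sketch below assumes a fixed number of sub-blocks per terminal page; in practice that number would presumably be derived from the terminal's screen parameters, which the patent leaves unspecified. The function name and page structure are hypothetical.

```python
def paginate(blocks, per_page):
    """Split sub-blocks into pages and record previous/next page indexes."""
    chunks = [blocks[i:i + per_page] for i in range(0, len(blocks), per_page)]
    pages = []
    for index, chunk in enumerate(chunks):
        pages.append({
            "blocks": chunk,
            # None marks the absence of a neighbouring page.
            "previous": index - 1 if index > 0 else None,
            "next": index + 1 if index < len(chunks) - 1 else None,
        })
    return pages
```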
(4). Application adding module
According to the classification result for the AST content, related extended services are added to the AST. For example, if the AST content is classified as education, advertisements related to the college entrance examination, the postgraduate entrance examination, and similar topics can be inserted into the tree.
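A minimal sketch of adding an extension node according to the classification result follows. The tree is modelled as a nested dictionary, and the category names, the `ad` node type, and the advertisement texts are invented for illustration.

```python
# Hypothetical mapping from topic category to the node that should be added.
EXTENSIONS_BY_CATEGORY = {
    "education": {"tag": "ad", "text": "Postgraduate entrance exam courses"},
    "travel": {"tag": "ad", "text": "Hotel and flight deals"},
}


def add_extension(tree, category):
    """Append the extension node for `category` to the tree, if one is defined."""
    extension = EXTENSIONS_BY_CATEGORY.get(category)
    if extension is not None:
        # Copy the template so later edits to one tree do not affect others.
        tree.setdefault("children", []).append(dict(extension))
    return tree
```

Trees whose category has no configured extension are returned unchanged.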
FIG. 4 shows the webpage content adaptation and encapsulation method provided by the present invention and based on the above system. The method comprises:
Step 101: acquire the source webpage and the terminal characteristic parameters;
Step 102: using the DOM (Document Object Model) tree of the source webpage, generate a set of related abstract syntax trees according to the terminal characteristic parameters;
Step 103: perform adaptation and conversion on each item of content in the abstract syntax trees;
Step 104: classify the content of each abstract syntax tree by topic, and add new nodes and content to the abstract syntax tree according to the classification result;
Step 105: taking the abstract syntax tree as the basic unit of granularity, use a display template to reorganize and encapsulate the abstract syntax trees into a web page, and display the webpage content on the display terminal.
The generated abstract syntax trees are also cached, which speeds up subsequent access by user terminals to the same source webpage.
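Steps 102 to 105, together with the cache, can be outlined as a single request handler. In this sketch every stage is passed in as a callable, because the patent does not define their interfaces; the `stages` dictionary keys and the function name are hypothetical. The cache key combines the URL with the terminal model, which is an assumption made here because the generated trees depend on the terminal parameters.

```python
def handle_request(url, profile, cache, stages):
    """Produce the adapted page for one URL, reusing cached trees if present."""
    key = (url, profile["model"])
    trees = cache.get(key)
    if trees is None:
        dom = stages["fetch_dom"](url)                          # fetch and parse source page
        trees = stages["build_trees"](dom, profile)             # step 102
        trees = [stages["adapt"](t, profile) for t in trees]    # step 103
        trees = [stages["extend"](t) for t in trees]            # step 104
        cache[key] = trees
    return stages["package"](trees, profile)                    # step 105
```

On a cache hit only the encapsulation stage runs, so the source server is not contacted again for the same page and terminal model.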
FIG. 5 shows the sub-steps further included in the step of acquiring the source webpage and the terminal characteristic parameters, described as follows:
The terminal user sends the proxy server a request that includes the destination URL and the performance parameters of the terminal device.
Step 201: on receiving the user's request, the proxy server parses out the requested URL and the product model of the terminal device;
Step 202: the proxy server looks up the corresponding profile document in the profile document library;
Step 203: if the document is found, it is parsed to obtain the parameters of the terminal device; if there is no profile document for that product model, the terminal is asked to send back its detailed parameters and product model, and a corresponding profile document is generated and stored.
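Steps 201 to 203 can be sketched as a request parser plus a lookup with a fallback. The request format (a query string with `url` and `model` parameters), the in-memory dictionary standing in for the profile library, and the `request_details` callback standing in for the round trip to the terminal are all assumptions for illustration.

```python
from urllib.parse import urlparse, parse_qs


def parse_request(request_uri):
    """Step 201: extract the target URL and product model from the request."""
    query = parse_qs(urlparse(request_uri).query)
    return query["url"][0], query["model"][0]


class ProfileStore:
    """In-memory stand-in for the profile document library or database."""

    def __init__(self):
        self._profiles = {}

    def resolve(self, model, request_details):
        """Steps 202-203: return the stored profile, or obtain and store one."""
        profile = self._profiles.get(model)
        if profile is None:
            # No profile for this model: ask the terminal for its parameters.
            profile = request_details(model)
            self._profiles[model] = profile
        return profile
```

The terminal is therefore queried for its detailed parameters only the first time a given product model is seen.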
After the URL requested by the user has been obtained, cache module 9 is queried first. If an AST corresponding to that web page already exists, it is encapsulated directly and sent back to the terminal; otherwise the WEB server is accessed to obtain the specified source webpage document.
FIG. 6 is a detailed flowchart of the webpage content adaptation and encapsulation method after the source webpage has been acquired:
Step 301: according to the processing capability, storage capability, and display capability parameters of the terminal device given in the profile document, obtain the ASTs of the different sub-blocks, and adapt the ASTs according to the different kinds of content in each region element, such as text, pictures, audio, and video;
Step 302: at the same time, use the trained classifier to classify the content of the ASTs;
Step 303: according to the classification result, extended applications, such as advertising content, can be added to the corresponding AST;
Step 304: check, according to the information in the profile document, whether the terminal's browser supports the HTML5.0 standard;
Step 305: if it does, encapsulate the web page using the HTML5.0 standard;
Step 306: otherwise, encapsulate the web page using the default standard; finally, use the default template or a user-customized template to encapsulate the adaptation result and obtain the final web page.
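Steps 304 to 306 can be sketched with Python's `string.Template`. The patent does not name the "default standard", so this sketch assumes HTML 4.01 Strict as the fallback document type; the template text and function name are likewise illustrative.

```python
from string import Template

# Document type chosen by HTML5 support; the fallback is an assumption.
DOCTYPES = {
    True: "<!DOCTYPE html>",
    False: '<!DOCTYPE HTML PUBLIC "-//W3C//DTD HTML 4.01//EN" '
           '"http://www.w3.org/TR/html4/strict.dtd">',
}

# Default display template; a user-customized Template can be passed instead.
DEFAULT_TEMPLATE = Template("$doctype\n<html><body>$body</body></html>")


def package(fragments, supports_html5, template=DEFAULT_TEMPLATE):
    """Wrap adapted HTML fragments in the chosen template and document type."""
    return template.substitute(
        doctype=DOCTYPES[bool(supports_html5)],
        body="\n".join(fragments),
    )
```

A user-customized layout is supplied by passing another `Template` that contains the `$doctype` and `$body` placeholders.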
Steps 304, 305, and 306 are one specific implementation of webpage encapsulation.
FIG. 7 is a schematic diagram of a specific embodiment of an AST, in which the left side shows the content of an ordinary page and the right side shows the corresponding AST. As for the classification of AST content, ASTs are divided by content into categories such as news, education, finance, sports, entertainment, technology, and lifestyle, and each category is divided into further levels. For example, education is divided into the college entrance examination, the senior high school entrance examination, adult education, the postgraduate entrance examination, and examinations for studying abroad; lifestyle can be divided into travel, shopping, weddings, childcare, and so on. A text classifier is trained by supervised learning and is used to classify AST content during AST adaptation in order to support extended applications.
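The patent does not name the learning algorithm. As one possible illustration of a supervised text classifier, the sketch below uses multinomial naive Bayes with add-one smoothing, which is a technique chosen here and not one stated in the source. Hierarchical categories are encoded as slash-separated labels, and tokenization is plain whitespace splitting, so Chinese text would need a word segmenter first. All names are hypothetical.

```python
import math
from collections import Counter, defaultdict


class NaiveBayesClassifier:
    """Multinomial naive Bayes over whitespace-separated words."""

    def __init__(self):
        self.word_counts = defaultdict(Counter)  # label -> word frequencies
        self.doc_counts = Counter()              # label -> number of samples
        self.vocabulary = set()

    def train(self, samples):
        """samples: iterable of (text, label) pairs."""
        for text, label in samples:
            words = text.lower().split()
            self.word_counts[label].update(words)
            self.doc_counts[label] += 1
            self.vocabulary.update(words)

    def classify(self, text):
        """Return the label with the highest log posterior probability."""
        words = text.lower().split()
        total_docs = sum(self.doc_counts.values())
        best_label, best_score = None, -math.inf
        for label, docs in self.doc_counts.items():
            counts = self.word_counts[label]
            denominator = sum(counts.values()) + len(self.vocabulary)
            score = math.log(docs / total_docs)
            for word in words:
                # Add-one smoothing keeps unseen words from zeroing the score.
                score += math.log((counts[word] + 1) / denominator)
            if score > best_score:
                best_label, best_score = label, score
        return best_label
```

A label such as `education/exam` carries both the category and its level, so the part before the slash can select a broad extension and the full label a more specific one.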
Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention and not to limit them. Although the present invention has been described in detail with reference to the embodiments, those of ordinary skill in the art should understand that modifications or equivalent replacements of the technical solutions of the present invention that do not depart from their spirit and scope shall all fall within the scope of the claims of the present invention.
Claims (7)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110071330.6A CN102693237B (en) | 2011-03-24 | 2011-03-24 | Webpage content adaptation and encapsulation system and method |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201110071330.6A CN102693237B (en) | 2011-03-24 | 2011-03-24 | Webpage content adaptation and encapsulation system and method |
Publications (2)
Publication Number | Publication Date |
---|---|
CN102693237A CN102693237A (en) | 2012-09-26 |
CN102693237B true CN102693237B (en) | 2014-09-10 |
Family
ID=46858694
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201110071330.6A Expired - Fee Related CN102693237B (en) | 2011-03-24 | 2011-03-24 | Webpage content adaptation and encapsulation system and method |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN102693237B (en) |
Families Citing this family (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN105138542B (en) * | 2015-07-09 | 2019-08-09 | 北京天河石科技有限责任公司 | A kind of method that the end PC WEB webpage is converted to mobile terminal WEB webpage |
CN105161095B (en) * | 2015-07-29 | 2017-03-22 | 百度在线网络技术(北京)有限公司 | Method and device for picture composition of speech recognition syntax tree |
CN107026820A (en) * | 2016-01-29 | 2017-08-08 | 苏格克莱姆公司 | Adaptive content balance in a network application environment |
CN111078519A (en) * | 2019-12-13 | 2020-04-28 | 杭州安恒信息技术股份有限公司 | Method and device for backtracking abnormal monitoring behaviors and electronic equipment |
CN112417338B (en) * | 2020-11-30 | 2022-12-20 | 北京博瑞彤芸科技股份有限公司 | Page adaptation method, system and equipment |
CN112463152A (en) * | 2020-12-09 | 2021-03-09 | 深圳赛安特技术服务有限公司 | Webpage adaptation method and device based on AST |
Citations (3)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US6973619B1 (en) * | 1998-06-30 | 2005-12-06 | International Business Machines Corporation | Method for generating display control information and computer |
CN101197849A (en) * | 2007-12-21 | 2008-06-11 | 腾讯科技(深圳)有限公司 | Method and device for commuting internet page into wireless application protocol page |
CN101815093A (en) * | 2010-03-11 | 2010-08-25 | 深圳市嘉讯软件有限公司 | Method for adapting webpage to mobile terminal and mobile terminal page adaptation device |
Also Published As
Publication number | Publication date |
---|---|
CN102693237A (en) | 2012-09-26 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
US11907237B2 (en) | Gathering and contributing content across diverse sources | |
US10235349B2 (en) | Systems and methods for automated content generation | |
CN102693237B (en) | Webpage content adaptation and encapsulation system and method | |
CN106503211B (en) | Method for automatic generation of mobile version of information publishing website | |
CN102065572B (en) | Mobile browser, gateway, browsing system and access method for internet page | |
CN111414560B (en) | Shared information processing method and device, related equipment and storage medium | |
US20110209046A1 (en) | Optimizing web content display on an electronic mobile reader | |
CN103336794B (en) | For providing the corresponding method and apparatus that information is presented in target pages | |
US10289747B2 (en) | Dynamic file concatenation | |
CN112417338B (en) | Page adaptation method, system and equipment | |
CN102207967B (en) | Method and system for automatically providing new browser plugin | |
CN103092834A (en) | Method and client-side device for browsing pictures of web pages | |
CN105045864A (en) | Personalized recommendation method of digital resources | |
CN106874502A (en) | A kind of method of video search, device and terminal | |
CN107229653A (en) | Pseudo- static Web page generation method and device | |
WO2020063448A1 (en) | Information blocking method, device and terminal | |
CN105488218A (en) | Method and device for loading waterfall flows based on search | |
CN101216822A (en) | Browsing method and system of embedded browser | |
CN113268686B (en) | Processing method for multiple browsing modes of form in information at APP (application) end | |
Gupta et al. | Mobile web: web manipulation for small displays using multi-level hierarchy page segmentation | |
US20090150759A1 (en) | Method and apparatus for browsing content-based documents | |
CN109766509B (en) | Internet periodical management system | |
Li et al. | Extracting main content of webpage to enhance adaptively rendering for small screen size terminals | |
CN103731393A (en) | Method for compressing Web resource data | |
TW530226B (en) | Wireless multimedia playing method and the platform thereof |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
C06 | Publication | ||
PB01 | Publication | ||
C10 | Entry into substantive examination | ||
SE01 | Entry into force of request for substantive examination | ||
C14 | Grant of patent or utility model | ||
GR01 | Patent grant | ||
CF01 | Termination of patent right due to non-payment of annual fee | ||
CF01 | Termination of patent right due to non-payment of annual fee |
Granted publication date: 20140910 Termination date: 20170324 |