
CN114003632A - Data acquisition method and device for content source, electronic equipment and medium - Google Patents

Data acquisition method and device for content source, electronic equipment and medium

Info

Publication number
CN114003632A
Authority
CN
China
Prior art keywords
search
content source
target
result
preset
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202111288912.XA
Other languages
Chinese (zh)
Inventor
郑伟
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shandong Kurui Technology Co ltd
Beijing Qury Technology Co ltd
Original Assignee
Shandong Kurui Technology Co ltd
Beijing Qury Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shandong Kurui Technology Co ltd, Beijing Qury Technology Co ltd filed Critical Shandong Kurui Technology Co ltd
Priority to CN202111288912.XA priority Critical patent/CN114003632A/en
Publication of CN114003632A publication Critical patent/CN114003632A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2457Query processing with adaptation to user needs
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24Querying
    • G06F16/245Query processing
    • G06F16/2453Query optimisation
    • G06F16/24534Query rewriting; Transformation
    • G06F16/24539Query rewriting; Transformation using cached or materialised query results

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present disclosure relates to a data acquisition method and apparatus for a content source, an electronic device, and a medium. The method includes: establishing a search interface of a content source, wherein at least one search request and a search result corresponding to the at least one search request are stored in the content source; determining a target search set according to the content source, wherein the target search set is used to describe the search requests in the content source; obtaining, based on the target search set, a target result set by using the search interface corresponding to the content source; and storing the target search set and the target result set in a preset database. Embodiments of the present disclosure can respond to a search request in one content source based on search results from multiple content sources, which expands the search scenarios for data in the content source and allows a user to obtain the desired valid content data directly in one content source.

Description

Data acquisition method and device for content source, electronic equipment and medium
Technical Field
The present disclosure relates to the field of data acquisition technologies, and in particular, to a data acquisition method and apparatus for a content source, an electronic device, and a medium.
Background
A content source is a content entity, e.g. a terminal device or an application in a terminal device, that forms specific data.
Each content source records or stores search results corresponding to a plurality of search requests, so that when a user searches for content in a content source, the stored search results are matched against the user's search request and the corresponding search results are returned.
However, when a user searches in one content source, if there is no corresponding search result, the user cannot obtain the desired search result.
Disclosure of Invention
To solve the above technical problem or at least partially solve the above technical problem, the present disclosure provides a data acquisition method, apparatus, electronic device, and medium for a content source.
In a first aspect, the present disclosure provides a data acquisition method for a content source, including:
establishing a search interface of a content source, wherein at least one search request and a search result corresponding to the at least one search request are stored in the content source;
determining a target search set according to the content source, wherein the target search set is used for describing a search request in the content source;
based on the target search set, obtaining a target result set by utilizing a search interface corresponding to the content source;
and storing the target search set and the target result set into a preset database.
Optionally, the determining a target search set according to the content source includes:
acquiring a first search set corresponding to the content source, wherein the first search set is used for describing historical search requests in the content source;
predicting a second search set from the first search set;
and determining a target search set according to the first search set and the second search set.
Optionally, the predicting the second search set according to the first search set includes:
predicting a second search set from the preset search set according to the association degree between the first search set and a preset search set and the association degree between the first search set and a preset result set;
and the search request in the preset search set corresponds to the search result in the preset result set.
Optionally, the obtaining a target result set by using a search interface corresponding to the content source based on the target search set includes:
sending each search request in the target search set to a content source corresponding to the search request;
and acquiring the search result of the search request based on the search interface corresponding to the content source to obtain a target result set.
Optionally, after the storing the target search set and the target result set in a preset database, the method further includes:
determining a result set with the association degree with the user being greater than a preset threshold value from the target result set;
updating the target result set based on the result set;
and updating the preset database based on the updated target result set.
Optionally, before storing the target search set and the target result set in a preset database, the method further includes:
acquiring a content source to which each search result in the target result set belongs;
after the target search set and the target result set are stored in a preset database, the method further includes:
establishing an association relationship between the search result and a content source to which the search result belongs, and establishing an association relationship between the content source and a search request corresponding to the search result;
and updating the association relationship between the search result and the content source to which the search result belongs and the association relationship between the content source and the search request corresponding to the search result into the preset database.
Optionally, the method further includes:
responding to a first search request carrying search characters, and acquiring at least one first search result of which the matching degree with the search characters is greater than a preset matching degree threshold value, or at least one first search result of which the matching degree with the search characters is greater than the preset matching degree threshold value and a content source related to the at least one first search result from a preset database;
and displaying the at least one first search result or the content source related to the at least one first search result and the at least one first search result.
In a second aspect, the present disclosure provides a data acquisition apparatus for a content source, comprising:
an establishing module, configured to establish a search interface of a content source, wherein at least one search request and a search result corresponding to the at least one search request are stored in the content source;
a determining module, configured to determine a target search set according to the content source, wherein the target search set is used to describe a search request in the content source;
the determining module being further configured to obtain a target result set by using the search interface corresponding to the content source based on the target search set;
and a storage module, configured to store the target search set and the target result set in a preset database.
Optionally, the determining module includes: an acquisition unit, a prediction unit, and a determination unit;
the acquisition unit is used for acquiring a first search set corresponding to the content source, wherein the first search set is used for describing historical search requests in the content source;
a prediction unit for predicting a second search set from the first search set;
and the determining unit is used for determining a target search set according to the first search set and the second search set.
Optionally, the prediction unit is specifically configured to:
predicting a second search set from the preset search set according to the association degree between the first search set and a preset search set and the association degree between the first search set and a preset result set;
and the search request in the preset search set corresponds to the search result in the preset result set.
Optionally, the determining module is specifically configured to:
sending each search request in the target search set to a content source corresponding to the search request;
and acquiring the search result of the search request based on the search interface corresponding to the content source to obtain a target result set.
Optionally, the apparatus further includes: an update module;
the determining module is further used for determining a result set of which the association degree with the user is greater than a preset threshold value from the target result set;
an update module to update the target result set based on the result set;
and the updating module is also used for updating the preset database based on the updated target result set.
Optionally, the apparatus further includes: an acquisition module;
the acquisition module is used for acquiring a content source to which each search result in the target result set belongs;
the establishing module is further used for establishing an association relationship between the search result and a content source to which the search result belongs and an association relationship between the content source and a search request corresponding to the search result;
and the updating module is further used for updating the association relationship between the search result and the content source to which the search result belongs and the association relationship between the content source and the search request corresponding to the search result into the preset database.
Optionally, the apparatus further includes: a display module;
the acquisition module is further used for responding to a first search request carrying search characters, and acquiring at least one first search result of which the matching degree with the search characters is greater than a preset matching degree threshold value, or at least one first search result of which the matching degree with the search characters is greater than the preset matching degree threshold value and a content source related to the at least one first search result from a preset database;
and the display module is used for displaying the at least one first search result or the at least one first search result and a content source related to the at least one first search result.
In a third aspect, the present disclosure also provides an electronic device, including:
one or more processors;
a storage device for storing one or more programs,
wherein the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the data acquisition method for the content source according to any one of the embodiments of the present invention.
In a fourth aspect, the present disclosure also provides a computer-readable storage medium, on which a computer program is stored, which when executed by a processor, implements the data acquisition method of the content source according to any one of the embodiments of the present invention.
Compared with the prior art, the technical solution provided by the embodiments of the present disclosure has the following advantages. A search interface is established for each content source, where the content source stores search results corresponding to at least one search request. A target search set, which is used to describe the search requests in the content source, is determined according to the content source. Based on the target search set, a target result set is obtained from the content source by using the search interface corresponding to the content source, and the target search set and the target result set are stored in a preset database. Data from a plurality of content sources is thereby stored in a unified manner, which facilitates integration of the content data of the plurality of content sources. Accordingly, a search request in one content source can be answered based on search results from a plurality of content sources, which expands the search scenarios for data in the content source and allows a user to obtain the desired valid content data directly in one content source.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the present disclosure and together with the description, serve to explain the principles of the disclosure.
In order to more clearly illustrate the embodiments of the present disclosure or the technical solutions in the prior art, the drawings used in the description of the embodiments or the prior art are briefly described below. It is obvious that those skilled in the art can obtain other drawings from these drawings without inventive effort.
Fig. 1 is a schematic flow chart of a data acquisition method for a content source according to an embodiment of the present disclosure;
Fig. 2 is a schematic flow chart of another data acquisition method for a content source according to an embodiment of the present disclosure;
Fig. 3 is a schematic flow chart of a data acquisition method for a content source according to an embodiment of the present disclosure;
Fig. 4 is a schematic structural diagram of a data acquisition apparatus for a content source according to an embodiment of the present disclosure;
Fig. 5 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure.
Detailed Description
In order that the above objects, features and advantages of the present disclosure may be more clearly understood, aspects of the present disclosure will be further described below. It should be noted that the embodiments and features of the embodiments of the present disclosure may be combined with each other without conflict.
In the following description, numerous specific details are set forth in order to provide a thorough understanding of the present disclosure, but the present disclosure may be practiced in other ways than those described herein; it is to be understood that the embodiments disclosed in the specification are only a few embodiments of the present disclosure, and not all embodiments.
When a user needs to search for related knowledge, the user can search for the related knowledge from original data sources such as an application program in the intelligent device, an applet in the application program or a website, so that the knowledge can be effectively acquired.
Some content aggregation platforms and intelligent devices integrate data of other data sources, but generally, the content and data of applications needing to be integrated are accessed into own databases in a manual docking mode and then presented to users through own systems.
When a user browses and searches through a single application or content aggregation platform, the result data fed back to the user comes from the single database corresponding to that application or platform. For example, if the user searches in an application through an intelligent device, the result fed back is looked up in the storage database of that application; if the storage database does not hold the result data the user is searching for, the result fed back to the user is empty, which greatly degrades the user experience.
Based on this, the present disclosure mines users' search term data and acquires the internal search content of a data source (referred to below as a content source) through a search interface. The content is analyzed and self-learned so that more candidate search requests are mined and broader search results are obtained. By collecting the search data completely, a complete and accurate data set covering each application and device is established.
Illustratively, the present disclosure provides a data acquisition method and apparatus for a content source, an electronic device, and a medium. A search interface is established for each content source, where the content source stores at least one search result corresponding to a search request. A target search set, which is used to describe the search requests in the content source, is determined according to the content source. Based on the target search set, a target result set is obtained from the content source by using the search interface corresponding to the content source, and the target search set and the target result set are stored in a preset database. Data from a plurality of content sources is thereby stored in a unified manner, which facilitates integration of the content data of the plurality of content sources. A search request in one content source can therefore be answered based on search results from a plurality of content sources, which expands the search scenarios for data in the content sources and allows a user to obtain the desired valid content data directly in one content source.
The data acquisition method of the content source is executed by the electronic equipment or a client installed in the electronic equipment. The electronic device may be a tablet computer, a mobile phone, a wearable device, an in-vehicle device, an Augmented Reality (AR)/Virtual Reality (VR) device, a notebook computer, an ultra-mobile personal computer (UMPC), a netbook, a Personal Digital Assistant (PDA), a smart television, a smart screen, a high definition television, a 4K television, a smart speaker, a smart projector, and the like, and the specific type of the electronic device is not limited in this disclosure.
The present disclosure does not limit the type of operating system of the electronic device. For example, an Android system, a Linux system, a Windows system, an iOS system, etc.
Please refer to Fig. 1.
Fig. 1 is a schematic flow chart of a data acquisition method for a content source according to an embodiment of the present disclosure. The method of this embodiment may be performed by a data acquisition apparatus for a content source. The apparatus may be implemented in hardware and/or software, may be configured in an electronic device, and can implement the data acquisition method for a content source according to any embodiment of the present application. As shown in Fig. 1, the method specifically includes the following steps:
S110: Establish a search interface of the content source.
The content source stores at least one search request and search results corresponding to the at least one search request.
Each content source has a search function. By analyzing the search function of each content source, the function can be decomposed into a configurable, machine-usable search interface, so that the search interface serves as a data interface through which external devices acquire data from the content source.
In addition, after the search interface of each content source is determined, the searches can be normalized: the search functions of the different content sources are unified into search templates that conform to the same specification, with each search template corresponding to one content source. The electronic device can then receive users' search requests, and the search results corresponding to those requests, through the search template corresponding to each content source.
It should be noted that, in the present disclosure, each content source may correspond to a unique search interface, or multiple content sources may correspond to a search interface, so as to implement uniform invocation of the interfaces.
When a plurality of content sources correspond to one search interface, a plurality of search channels are arranged under the search interface and each search channel corresponds to one content source. The search channel to use can be determined from the identifier of each content source, so that data of different content sources can be acquired effectively from different search channels under the same search interface.
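The single-interface, multi-channel arrangement described above can be sketched as follows. This is only an illustrative sketch: the class name `SearchInterface`, its methods, the helper `make_in_memory_channel`, and the in-memory stand-ins for content sources are all hypothetical and are not taken from the disclosure.

```python
from typing import Callable, Dict, List

# A search channel is modeled as a function from a query string to a list of results.
SearchChannel = Callable[[str], List[str]]


class SearchInterface:
    """One search interface exposing one channel per content source (hypothetical)."""

    def __init__(self) -> None:
        self._channels: Dict[str, SearchChannel] = {}

    def register(self, source_id: str, channel: SearchChannel) -> None:
        # The content source identifier selects the channel.
        self._channels[source_id] = channel

    def search(self, source_id: str, query: str) -> List[str]:
        if source_id not in self._channels:
            raise KeyError(f"no search channel for content source {source_id!r}")
        return self._channels[source_id](query)


def make_in_memory_channel(stored: Dict[str, List[str]]) -> SearchChannel:
    # Stand-in for a real content source: stored request -> stored results.
    def channel(query: str) -> List[str]:
        return list(stored.get(query, []))

    return channel


interface = SearchInterface()
interface.register("source_a", make_in_memory_channel({"tea": ["green tea guide"]}))
interface.register("source_b", make_in_memory_channel({"tea": ["tea shop list"]}))
```

Calling `interface.search("source_a", "tea")` and `interface.search("source_b", "tea")` goes through the same interface but reaches different channels, which is the uniform invocation the text describes.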
S120: Determine a target search set according to the content source.
Wherein the target search set is used to describe a search request in the content source.
The content source may include a plurality of search requests. The search requests can be obtained from the content source through the search interface corresponding to the content source, and the set formed by the search requests in the content source is determined as the target search set, so that the search requests are effectively summarized.
S130: Based on the target search set, obtain a target result set by using the search interface corresponding to the content source.
The search interface corresponding to a content source can be used to look up, in that content source, the search result corresponding to each search request in the target search set, and the set of these search results is determined as the target result set.
It should be noted that the target search set corresponds to the target result set: one search request in the target search set may correspond to one or more search results in the target result set, and one search result in the target result set may correspond to one or more search requests in the target search set.
In this embodiment, optionally, obtaining the target result set by using the search interface corresponding to the content source based on the target search set includes:
sending each search request in the target search set to a content source corresponding to the search request;
and acquiring a search result of the search request based on a search interface corresponding to the content source to obtain a target result set.
The set of actual search requests of users in each content source is acquired through the search interface, by means of a software module embedded in the content source and remote docking.
The search result corresponding to each search request in the target search set is then acquired through the search interface corresponding to each content source, yielding the target result set.
In this way, the search results corresponding to the search requests stored in different types of content sources are obtained effectively by remote acquisition, so that search results for a wide variety of search requests are collected and the resulting target result set has high data integrity.
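As a rough sketch of this step, the following Python fragment sends each search request to the content source it belongs to and gathers the returned results. The function name `collect_target_results`, the `(source_id, request)` pair representation, and the toy lambda "sources" are assumptions made for illustration only.

```python
from typing import Callable, Dict, List, Tuple

SearchFn = Callable[[str], List[str]]


def collect_target_results(
    target_search_set: List[Tuple[str, str]],
    interfaces: Dict[str, SearchFn],
) -> Dict[Tuple[str, str], List[str]]:
    """Send each (source_id, request) pair to its content source and gather the results."""
    target_result_set: Dict[Tuple[str, str], List[str]] = {}
    for source_id, request in target_search_set:
        search = interfaces.get(source_id)
        if search is None:
            continue  # no search interface was established for this source
        target_result_set[(source_id, request)] = search(request)
    return target_result_set


# Toy stand-ins for two content sources (hypothetical data).
stored_a = {"tea": ["green tea guide", "tea ceremony"]}
stored_b = {"tea": ["tea shop list"], "coffee": ["espresso basics"]}
interfaces = {
    "source_a": lambda q: list(stored_a.get(q, [])),
    "source_b": lambda q: list(stored_b.get(q, [])),
}
target_search_set = [("source_a", "tea"), ("source_b", "tea"), ("source_b", "coffee")]
target_result_set = collect_target_results(target_search_set, interfaces)
```

In a real deployment each entry in `interfaces` would wrap a remote call to the content source rather than a dictionary lookup.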
S140: Store the target search set and the target result set in a preset database.
The target search set and the target result set can be stored in the preset database based on the association relationship between the search requests in the target search set and the search results in the target result set.
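One possible way to keep the request-to-result association in a database is a link table, since a request may have several results and a result may answer several requests. The sketch below uses Python's standard `sqlite3` module with an in-memory database; the table and column names, and the functions `store_sets` and `results_for`, are hypothetical and the disclosure does not prescribe any particular schema.

```python
import sqlite3


def store_sets(conn, pairs):
    """Store (request, result) pairs; a link table keeps the many-to-many association."""
    conn.executescript(
        """
        CREATE TABLE IF NOT EXISTS search_request (id INTEGER PRIMARY KEY, text TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS search_result (id INTEGER PRIMARY KEY, text TEXT UNIQUE);
        CREATE TABLE IF NOT EXISTS request_result (
            request_id INTEGER REFERENCES search_request(id),
            result_id INTEGER REFERENCES search_result(id),
            PRIMARY KEY (request_id, result_id)
        );
        """
    )
    for request, result in pairs:
        conn.execute("INSERT OR IGNORE INTO search_request (text) VALUES (?)", (request,))
        conn.execute("INSERT OR IGNORE INTO search_result (text) VALUES (?)", (result,))
        # Link the two rows by looking up their ids from the stored text.
        conn.execute(
            """
            INSERT OR IGNORE INTO request_result (request_id, result_id)
            SELECT q.id, r.id FROM search_request q, search_result r
            WHERE q.text = ? AND r.text = ?
            """,
            (request, result),
        )
    conn.commit()


def results_for(conn, request):
    """Return the stored results associated with a request, sorted by text."""
    rows = conn.execute(
        """
        SELECT r.text FROM search_result r
        JOIN request_result l ON l.result_id = r.id
        JOIN search_request q ON q.id = l.request_id
        WHERE q.text = ? ORDER BY r.text
        """,
        (request,),
    )
    return [row[0] for row in rows]


conn = sqlite3.connect(":memory:")
store_sets(conn, [("tea", "green tea guide"), ("tea", "tea shop list"), ("drinks", "tea shop list")])
```

Here "tea shop list" is stored once but linked to both "tea" and "drinks", which mirrors the one-to-many correspondence in both directions noted earlier.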
The data acquisition method for a content source provided by this embodiment establishes a search interface for each content source, where the content source stores at least one search result corresponding to a search request. A target search set, which is used to describe the search requests in the content source, is determined according to the content source. Based on the target search set, a target result set is obtained from the content source by using the search interface corresponding to the content source, and the target search set and the target result set are stored in a preset database. Data from a plurality of content sources is thereby stored in a unified manner, which facilitates integration of the content data of the plurality of content sources. A search request in one content source can therefore be answered based on search results from a plurality of content sources, which expands the search scenarios for data in the content sources and allows a user to obtain the desired valid content data directly in one content source.
Based on the above description, in this embodiment, optionally, before storing the target search set and the target result set in the preset database, the method of this embodiment may further include:
acquiring a content source to which each search result in the target result set belongs;
after storing the target search set and the target result set in the preset database, the method of this embodiment may further include:
establishing an association relationship between the search result and a content source to which the search result belongs, and establishing an association relationship between the content source and a search request corresponding to the search result;
and updating the association relationship between the search result and the content source to which the search result belongs and the association relationship between the content source and the search request corresponding to the search result into the preset database.
Wherein each search result may correspond to a content source, i.e., the search result is searched from the content source based on a search request.
Therefore, on the basis of the original preset database, the content sources to which the search results belong are also associated, so that the preset database not only can display the search results corresponding to the search request, but also can display the sources of the search results, and the content display dimensionality of the search request is enlarged.
For example, suppose the search term in a first search request is "dueley" and the request is sent to a first content source for searching. Terms with high semantic relevance to "dueley", such as "Christmas" and "women's clothing", may be extracted from the search results fed back by the first content source as new search terms. A similarity is then calculated against the search results of each content source, and when the similarity corresponding to a certain content source is detected to be greater than a set threshold, an association relationship between the new search term and that content source is established, for example between "Christmas" and a second content source. New search terms and their associated content sources are thereby generated to expand the target search set.
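The thresholded association in this example might be sketched as below. The disclosure does not say how similarity is computed, so the word-level Jaccard measure used here is purely an assumption, as are the function names, the sample data, and the threshold value.

```python
from typing import Dict, List, Set, Tuple


def jaccard(a: Set[str], b: Set[str]) -> float:
    """Word-overlap similarity; a stand-in for whatever measure is actually used."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def associate_terms(
    new_terms: List[str],
    source_results: Dict[str, List[str]],
    threshold: float,
) -> List[Tuple[str, str]]:
    """Return (term, source_id) pairs whose best similarity exceeds the threshold."""
    associations = []
    for term in new_terms:
        term_tokens = set(term.lower().split())
        for source_id, results in source_results.items():
            best = max(
                (jaccard(term_tokens, set(r.lower().split())) for r in results),
                default=0.0,
            )
            if best > threshold:
                associations.append((term, source_id))
    return associations


# Hypothetical extracted terms and per-source search results.
new_terms = ["christmas", "women's clothing"]
source_results = {
    "source_1": ["christmas gift ideas"],
    "source_2": ["women's clothing sale", "christmas"],
}
associations = associate_terms(new_terms, source_results, threshold=0.5)
```

Each resulting pair is a new search term together with the content source it is associated with, and could be appended to the target search set.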
Fig. 2 is a schematic flow chart of another data acquisition method of a content source according to an embodiment of the present disclosure. The present embodiment is based on the above embodiments, wherein one possible implementation manner of S120 is as follows:
S1201: Obtain a first search set corresponding to the content source.
Wherein the first search set is used to describe historical search requests in the content source.
The historical search request of the user can be acquired from the content source through the search interface corresponding to the content source, and the historical search request is stored in the first search set.
S1202: Predict a second search set according to the first search set.
The second search set is predicted based on the search request in the first search set, that is, other search requests related to the search request can be predicted based on the search request in the first search set, so that the predicted search request is stored in the second search set.
It should be noted that the other search requests may be historical search requests of other users in the content source, or may be external search requests related to the search requests, which is not specifically limited in this disclosure.
In this embodiment, optionally, predicting the second search set according to the first search set includes:
predicting a second search set from the preset search set according to the association degree between the first search set and the preset search set and the association degree between the first search set and the preset result set;
and the search request in the preset search set corresponds to the search result in the preset result set.
The preset search set may consist of existing published search requests, such as user search data and content data published on the internet, and the preset result set is the set of search results corresponding to the preset search set.
Therefore, based on the first search set obtained by the historical search request of the user, the second search set with higher relevance to the first search set is predicted, and the request expansion of the search set is realized.
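A minimal sketch of this prediction step is given below, assuming that the "association degree" can be approximated by word overlap and that the two degrees are combined by a simple average. Neither assumption comes from the disclosure; the names `predict_second_set`, `overlap`, and `tokens` and the sample data are hypothetical.

```python
from typing import Dict, List, Set


def tokens(text: str) -> Set[str]:
    return set(text.lower().split())


def overlap(a: Set[str], b: Set[str]) -> float:
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def predict_second_set(
    first_set: List[str],
    preset: Dict[str, List[str]],
    threshold: float = 0.2,
) -> List[str]:
    """Pick preset requests associated with the historical requests.

    `preset` maps each preset search request to its preset search results.
    """
    history: Set[str] = set()
    for request in first_set:
        history |= tokens(request)
    second = []
    for request, results in preset.items():
        if request in first_set:
            continue  # already a historical request
        # Association between the first search set and the preset request.
        request_score = overlap(history, tokens(request))
        # Association between the first search set and the preset results.
        result_score = max((overlap(history, tokens(r)) for r in results), default=0.0)
        if (request_score + result_score) / 2 > threshold:
            second.append(request)
    return second


first_set = ["green tea", "tea pot"]
preset = {
    "black tea": ["tea pot reviews", "black tea brewing"],
    "running shoes": ["trail shoes"],
    "green tea": ["green tea guide"],
}
second_set = predict_second_set(first_set, preset)
```

With this toy data only "black tea" is selected: it shares words with the history both in the request itself and in its results, while "running shoes" shares none.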
S1203: Determine a target search set according to the first search set and the second search set.
The search requests in the first search set and the second search set can be cleaned, for example by deleting duplicate or low-relevance search requests, and the two processed search sets are then merged to obtain the target search set.
Therefore, on the basis of the first search set corresponding to each content source, the second search set corresponding to the first search set is predicted, and therefore the target search set is obtained through expansion on the basis of the first search set and the second search set, and content expansion of the search request is achieved.
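The cleaning and merging described above could look like the following sketch. How relevance is scored is not specified in the disclosure, so the `relevance` mapping, the default score of 1.0 for requests without a score, and the cut-off value are all illustrative assumptions.

```python
from typing import Dict, List


def build_target_set(
    first_set: List[str],
    second_set: List[str],
    relevance: Dict[str, float],
    min_relevance: float = 0.3,
) -> List[str]:
    """Merge two search sets, dropping duplicates and low-relevance requests."""
    seen = set()
    target = []
    for request in list(first_set) + list(second_set):
        # Normalize case and whitespace so trivially different duplicates collapse.
        key = " ".join(request.lower().split())
        if key in seen:
            continue
        # Requests without a score (e.g. historical ones) are assumed fully relevant.
        if relevance.get(key, 1.0) < min_relevance:
            continue
        seen.add(key)
        target.append(key)
    return target


first_set = ["Green Tea", "tea pot"]
second_set = ["green  tea", "black tea", "shoe rack"]
relevance = {"black tea": 0.6, "shoe rack": 0.1}
target_set = build_target_set(first_set, second_set, relevance)
```

The order of the merged set follows the input order, with historical requests first.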
In addition, a data extraction operation can be performed on the basis of the second search set, so that more search sets with higher relevance are found and added to the target search set.
For example, a second result set may be obtained through the second search set, a third search set is extracted from the second result set, and these operations are repeated until a preset convergence condition is satisfied, where the preset convergence condition is: the degree of difference between the newly extracted Nth search set and the previous 1st to (N-1)th search sets is smaller than a set threshold. At that point, the 1st to Nth search sets can be combined to determine the target search set.
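The iterative expansion described above might look like the following sketch. The degree of difference is assumed here to be the fraction of requests in the newest set that do not appear in any earlier set; `fetch_results` and `extract_requests` are hypothetical callables standing in for the search interface and the extraction step, and `max_rounds` is a safety limit added for the example.

```python
def expand_until_converged(first_set, fetch_results, extract_requests,
                           threshold=0.2, max_rounds=10):
    """Repeat 'search, then extract new requests' until little new appears."""
    sets = [set(first_set)]
    for _ in range(max_rounds):
        results = fetch_results(sets[-1])        # result set for the latest search set
        new_set = set(extract_requests(results)) # next search set extracted from results
        seen = set().union(*sets)                # all requests in sets 1..N-1
        # Assumed difference measure: share of unseen requests in the new set.
        difference = len(new_set - seen) / len(new_set) if new_set else 0.0
        sets.append(new_set)
        if difference < threshold:
            break  # convergence condition satisfied
    return set().union(*sets)                    # sets 1..N combined
```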
Fig. 3 is a schematic flow chart of a data acquisition method of a content source according to an embodiment of the present disclosure. On the basis of the foregoing embodiment, further after S140, the method of this embodiment may further include:
S150, determining, from the target result set, a result set whose degree of association with the user is greater than a preset threshold.
Information is extracted from the obtained target result set, and a candidate result set that the user is likely to select is mined from it, which addresses the problem of low user responsiveness caused by many useless results in the target result set.
It should be noted that the preset threshold may be determined based on actual comparison requirements, which is not specifically limited by the present disclosure.
And S160, updating the target result set based on the result set.
Here, the search results that the user is likely to accept are determined based on the result set mined from the target result set.
That is, information is extracted from the target result set to obtain the search results whose content is most recognized by the user, and the target result set is updated accordingly.
And S170, updating the preset database based on the updated target result set.
The preset database is updated based on the updated target result set and the association relationships among each search result in the target result set, the content source to which each result belongs, and the search request corresponding to each result.
Therefore, data updating is carried out on the preset database based on the new search request and the new search result, and the real-time performance of the data of the preset database is improved.
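Steps S150 to S170 can be sketched as a filter followed by a store update. In this illustration the preset database is modeled as a plain dictionary, and `user_score` is a hypothetical callable that returns the degree of association between a result and the user; neither is specified by the disclosure.

```python
def refine_and_store(target_results, user_score, database, threshold=0.5):
    """Keep results associated with the user above a threshold, then store them.

    target_results maps each search request to its list of search results.
    """
    refined = {}
    for request, results in target_results.items():
        # S150: results whose association with the user exceeds the threshold.
        kept = [r for r in results if user_score(r) > threshold]
        if kept:
            refined[request] = kept  # S160: updated target result set
    database.update(refined)         # S170: update the preset database
    return refined
```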
Based on the description of the foregoing embodiment, optionally, the method of this embodiment may further include:
responding to a first search request carrying search characters, and acquiring at least one first search result with the matching degree of the search characters being larger than a preset matching degree threshold value from a preset database, or acquiring a content source associated with the at least one first search result with the matching degree of the search characters being larger than the preset matching degree threshold value and the at least one first search result;
and displaying the at least one first search result or the content source associated with the at least one first search result and the at least one first search result.
Based on the first search request, one of two kinds of information can be presented to the user: at least one first search result whose matching degree with the search characters is greater than a preset matching degree threshold; or that at least one first search result together with the content source associated with it.
Illustratively, in response to a first search request carrying search characters, at least one first search result with a matching degree with the search characters larger than a preset matching degree threshold is obtained from a preset database, and the at least one first search result is displayed.
Alternatively, in response to a first search request carrying search characters, at least one first search result whose matching degree with the search characters is greater than the preset matching degree threshold, together with the content source associated with the at least one first search result, is obtained from the preset database, and both are displayed.
Therefore, the corresponding comprehensive search result can be displayed to the user based on the first search request of the user, or the user can directly know the content source of the search result while the search result is displayed.
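The two response modes can be illustrated as below. The matching degree is assumed to be the similarity ratio from Python's standard `difflib.SequenceMatcher`, which is only one possible measure; the record layout (`text`, `source`) and the `with_source` flag are hypothetical.

```python
from difflib import SequenceMatcher


def search(database, query, threshold=0.6, with_source=False):
    """Return stored results whose matching degree with the query exceeds the threshold.

    database is a list of records such as {"text": ..., "source": ...}.
    """
    hits = []
    for record in database:
        # Assumed matching degree: character-level similarity ratio in [0, 1].
        score = SequenceMatcher(None, query.lower(), record["text"].lower()).ratio()
        if score > threshold:
            hits.append((score, record))
    hits.sort(key=lambda pair: pair[0], reverse=True)  # best match first
    if with_source:
        # Second mode: each result together with its associated content source.
        return [(rec["text"], rec["source"]) for _, rec in hits]
    # First mode: the results only.
    return [rec["text"] for _, rec in hits]
```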
FIG. 4 is a schematic structural diagram of a data acquisition device of a content source according to an embodiment of the present disclosure; the device is configured in the electronic equipment, and can realize the data acquisition method of the content source in any embodiment of the application. The device specifically comprises:
an establishing module 410, configured to establish a search interface of a content source, where at least one search request and a search result corresponding to the at least one search request are stored in the content source;
a determining module 420, configured to determine a target search set according to the content source, where the target search set is used to describe a search request in the content source;
the determining module 420 is further configured to obtain a target result set based on the target search set by using a search interface corresponding to the content source;
the storage module 430 is configured to store the target search set and the target result set in a preset database.
In this embodiment, optionally, the determining module 420 includes: an acquisition unit, a prediction unit, and a determination unit;
the acquisition unit is used for acquiring a first search set corresponding to the content source, wherein the first search set is used for describing historical search requests in the content source;
a prediction unit for predicting a second search set from the first search set;
and the determining unit is used for determining a target search set according to the first search set and the second search set.
In this embodiment, optionally, the prediction unit is specifically configured to:
predicting a second search set from the preset search set according to the association degree between the first search set and a preset search set and the association degree between the first search set and a preset result set;
and the search request in the preset search set corresponds to the search result in the preset result set.
In this embodiment, optionally, the determining module 420 is specifically configured to:
sending each search request in the target search set to a content source corresponding to the search request;
and acquiring the search result of the search request based on the search interface corresponding to the content source to obtain a target result set.
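A minimal sketch of this dispatch, assuming each search interface is exposed as a callable keyed by a content source name. The `interfaces` mapping and the `(source, request)` pair layout are assumptions made for illustration.

```python
def collect_target_results(target_set, interfaces):
    """Send each request to its own content source and gather the results.

    target_set is an iterable of (source, request) pairs; interfaces maps a
    source name to a callable that takes a request and returns its results.
    """
    target_results = {}
    for source, request in target_set:
        search_fn = interfaces.get(source)
        if search_fn is None:
            continue  # no search interface established for this content source
        target_results[(source, request)] = search_fn(request)
    return target_results
```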
In this embodiment, optionally, the apparatus of this embodiment further includes: an update module;
the determining module 420 is further configured to determine, from the target result set, a result set with a degree of association with the user greater than a preset threshold;
an update module to update the target result set based on the result set;
and the updating module is also used for updating the preset database based on the updated target result set.
In this embodiment, optionally, the apparatus of this embodiment further includes: an acquisition module;
the acquisition module is used for acquiring a content source to which each search result in the target result set belongs;
the establishing module 410 is further configured to establish an association relationship between the search result and a content source to which the search result belongs, and an association relationship between the content source and a search request corresponding to the search result;
and the updating module is further used for updating the association relationship between the search result and the content source to which the search result belongs and the association relationship between the content source and the search request corresponding to the search result into the preset database.
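One way to persist the two association relationships is a pair of link tables, sketched here with Python's standard `sqlite3` module. The table and column names are hypothetical, and the disclosure does not prescribe any particular database or schema.

```python
import sqlite3


def store_associations(conn, rows):
    """Store result-to-source and source-to-request associations.

    rows is an iterable of (request, result, source) tuples.
    """
    conn.execute(
        "CREATE TABLE IF NOT EXISTS result_source "
        "(result TEXT, source TEXT, UNIQUE(result, source))"
    )
    conn.execute(
        "CREATE TABLE IF NOT EXISTS source_request "
        "(source TEXT, request TEXT, UNIQUE(source, request))"
    )
    for request, result, source in rows:
        # INSERT OR IGNORE keeps each association only once.
        conn.execute("INSERT OR IGNORE INTO result_source VALUES (?, ?)", (result, source))
        conn.execute("INSERT OR IGNORE INTO source_request VALUES (?, ?)", (source, request))
    conn.commit()
```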
In this embodiment, optionally, the apparatus of this embodiment further includes: a display module;
the acquisition module is further used for responding to a first search request carrying search characters, and acquiring at least one first search result of which the matching degree with the search characters is greater than a preset matching degree threshold value, or at least one first search result of which the matching degree with the search characters is greater than the preset matching degree threshold value and a content source related to the at least one first search result from a preset database;
and the display module is used for displaying the at least one first search result or the at least one first search result and a content source related to the at least one first search result.
With the data acquisition device of the content source of the embodiment of the present invention, a search interface is established for each content source, where at least one search request and the search result corresponding to the at least one search request are stored in the content source; a target search set, which describes the search requests in the content source, is determined according to the content source; based on the target search set, a target result set is obtained from the content source by using the search interface corresponding to the content source; and the target search set and the target result set are stored in a preset database. The data in a plurality of content sources is thus stored uniformly, which facilitates integrating the content data of the plurality of content sources. As a result, a search request in one content source can be answered based on the search results in a plurality of content sources, the search scenarios for data in the content sources are expanded, and the user can directly obtain the desired valid content data in one content source.
The data acquisition device of the content source provided by the embodiment of the invention can execute the data acquisition method of the content source provided by any embodiment of the invention, and has the corresponding functional modules and beneficial effects of the execution method.
Fig. 5 is a schematic structural diagram of an electronic device provided in an embodiment of the present disclosure. As shown in fig. 5, the electronic device includes a processor 510, a memory 520, an input device 530, and an output device 540; the number of the processors 510 in the electronic device may be one or more, and one processor 510 is taken as an example in fig. 5; the processor 510, the memory 520, the input device 530 and the output device 540 in the electronic apparatus may be connected by a bus or other means, and the connection by the bus is exemplified in fig. 5.
The memory 520 is a computer-readable storage medium and can be used for storing software programs, computer-executable programs, and modules, such as program instructions/modules corresponding to the data acquisition method of the content source in the embodiment of the present invention. The processor 510 executes various functional applications and data processing of the electronic device by executing the software programs, instructions and modules stored in the memory 520, so as to implement the data acquisition method of the content source provided by the embodiment of the present invention.
The memory 520 may mainly include a program storage area and a data storage area, wherein the program storage area may store an operating system and an application program required for at least one function; the data storage area may store data created according to the use of the terminal, and the like. Further, the memory 520 may include high speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, flash memory device, or other non-volatile solid state storage device. In some examples, memory 520 may further include memory located remotely from processor 510, which may be connected to an electronic device through a network. Examples of such networks include, but are not limited to, the internet, intranets, local area networks, mobile communication networks, and combinations thereof.
The input device 530 may be used to receive input numeric or character information and generate key signal inputs related to user settings and function control of the electronic apparatus, and may include a keyboard, a mouse, and the like. The output device 540 may include a display device such as a display screen.
The embodiment of the disclosure also provides a storage medium containing computer executable instructions, and the computer executable instructions are used for realizing the data acquisition method of the content source provided by the embodiment of the invention when being executed by a computer processor.
Of course, the storage medium provided by the embodiment of the present invention contains computer-executable instructions, and the computer-executable instructions are not limited to the operations of the method described above, and may also perform related operations in the data acquisition method of the content source provided by any embodiment of the present invention.
From the above description of the embodiments, it is obvious for those skilled in the art that the present invention can be implemented by software and necessary general hardware, and certainly, can also be implemented by hardware, but the former is a better embodiment in many cases. Based on such understanding, the technical solutions of the present invention may be embodied in the form of a software product, which can be stored in a computer-readable storage medium, such as a floppy disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a FLASH Memory (FLASH), a hard disk or an optical disk of a computer, and includes several instructions for enabling a computer device (which may be a personal computer, a server, or a network device) to execute the methods according to the embodiments of the present invention.
It should be noted that, in the embodiment of the above search apparatus, each included unit and module are merely divided according to functional logic, but are not limited to the above division as long as the corresponding functions can be implemented; in addition, specific names of the functional units are only for convenience of distinguishing from each other, and are not used for limiting the protection scope of the present invention.
It is noted that, in this document, relational terms such as "first" and "second," and the like, may be used solely to distinguish one entity or action from another entity or action without necessarily requiring or implying any actual such relationship or order between such entities or actions. Also, the terms "comprises," "comprising," or any other variation thereof, are intended to cover a non-exclusive inclusion, such that a process, method, article, or apparatus that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, method, article, or apparatus. Without further limitation, an element defined by the phrase "comprising an … …" does not exclude the presence of other identical elements in a process, method, article, or apparatus that comprises the element.
The foregoing are merely exemplary embodiments of the present disclosure, which enable those skilled in the art to understand or practice the present disclosure. Various modifications to these embodiments will be readily apparent to those skilled in the art, and the generic principles defined herein may be applied to other embodiments without departing from the spirit or scope of the disclosure. Thus, the present disclosure is not intended to be limited to the embodiments shown herein but is to be accorded the widest scope consistent with the principles and novel features disclosed herein.

Claims (10)

1. A data acquisition method for a content source, comprising:
establishing a search interface for a content source, wherein at least one search request and a search result corresponding to the at least one search request are stored in the content source;
determining a target search set according to the content source, the target search set being used to describe search requests in the content source;
based on the target search set, obtaining a target result set by using the search interface corresponding to the content source;
storing the target search set and the target result set in a preset database.
2. The method according to claim 1, wherein determining the target search set according to the content source comprises:
obtaining a first search set corresponding to the content source, the first search set being used to describe historical search requests in the content source;
predicting a second search set according to the first search set;
determining the target search set according to the first search set and the second search set.
3. The method according to claim 2, wherein predicting the second search set according to the first search set comprises:
predicting the second search set from a preset search set according to the degree of association between the first search set and the preset search set and the degree of association between the first search set and a preset result set;
wherein the search requests in the preset search set correspond to the search results in the preset result set.
4. The method according to claim 1, wherein obtaining the target result set by using the search interface corresponding to the content source based on the target search set comprises:
sending each search request in the target search set to the content source corresponding to the search request;
acquiring the search result of the search request based on the search interface corresponding to the content source, to obtain the target result set.
5. The method according to claim 1, wherein after storing the target search set and the target result set in the preset database, the method further comprises:
determining, from the target result set, a result set whose degree of association with the user is greater than a preset threshold;
updating the target result set based on the result set;
updating the preset database based on the updated target result set.
6. The method according to claim 4, wherein before storing the target search set and the target result set in the preset database, the method further comprises:
obtaining the content source to which each search result in the target result set belongs;
and after storing the target search set and the target result set in the preset database, the method further comprises:
establishing an association relationship between the search result and the content source to which the search result belongs, and an association relationship between the content source and the search request corresponding to the search result;
updating the association relationship between the search result and the content source to which the search result belongs, and the association relationship between the content source and the search request corresponding to the search result, into the preset database.
7. The method according to any one of claims 1 to 6, further comprising:
in response to a first search request carrying search characters, obtaining from the preset database at least one first search result whose matching degree with the search characters is greater than a preset matching degree threshold, or at least one first search result whose matching degree with the search characters is greater than the preset matching degree threshold and a content source associated with the at least one first search result;
displaying the at least one first search result, or the at least one first search result and the content source associated with the at least one first search result.
8. A data acquisition device for a content source, comprising:
an establishing module, configured to establish a search interface for a content source, wherein at least one search request and a search result corresponding to the at least one search request are stored in the content source;
a determining module, configured to determine a target search set according to the content source, the target search set being used to describe search requests in the content source;
the determining module being further configured to obtain a target result set by using the search interface corresponding to the content source based on the target search set;
a storage module, configured to store the target search set and the target result set in a preset database.
9. An electronic device, comprising:
one or more processors;
a storage device for storing one or more programs,
wherein when the one or more programs are executed by the one or more processors, the one or more processors implement the data acquisition method for a content source according to any one of claims 1 to 7.
10. A computer-readable storage medium on which a computer program is stored, wherein when the program is executed by a processor, the data acquisition method for a content source according to any one of claims 1 to 7 is implemented.
CN202111288912.XA 2021-11-02 2021-11-02 Data acquisition method and device for content source, electronic equipment and medium Pending CN114003632A (en)

Publications (1)

Publication Number Publication Date
CN114003632A true CN114003632A (en) 2022-02-01

Family

ID=79926484

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111288912.XA Pending CN114003632A (en) 2021-11-02 2021-11-02 Data acquisition method and device for content source, electronic equipment and medium

Country Status (1)

Country Link
CN (1) CN114003632A (en)

Patent Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN107077478A (en) * 2014-09-18 2017-08-18 微软技术许可有限责任公司 Multi-source is searched for
CN104462557A (en) * 2014-12-25 2015-03-25 北京奇虎科技有限公司 Instant searching method and device based on search history
CN106528590A (en) * 2016-09-18 2017-03-22 青岛海信电器股份有限公司 A query method and apparatus

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
于春雷 等: "一种个性化查询扩展方法", 《计算机工程与应用》, vol. 48, no. 2, 18 May 2011 (2011-05-18), pages 119 - 123 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination