CN118467851B - Artificial intelligent data searching and distributing method and system - Google Patents

Info

Publication number
CN118467851B
Authority
CN
China
Prior art keywords
search
user
recommendation
semantic
search results
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202410939629.6A
Other languages
Chinese (zh)
Other versions
CN118467851A (en
Inventor
徐杭
蒙婕
陈钢
任军
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Honeycomb Technology Co ltd
Original Assignee
Beijing Honeycomb Technology Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Honeycomb Technology Co ltd
Priority to CN202410939629.6A
Publication of CN118467851A
Application granted
Publication of CN118467851B

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/90 Details of database functions independent of the retrieved data types
    • G06F16/95 Retrieval from the web
    • G06F16/953 Querying, e.g. by the use of web search engines
    • G06F16/9535 Search customisation based on user profiles and personalisation
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/31 Indexing; Data structures therefor; Storage structures
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G06F16/3344 Query execution using natural language analysis
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/338 Presentation of query results
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G06N3/092 Reinforcement learning
    • Y GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D10/00 Energy efficient computing, e.g. low power processors, power management or thermal management

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Computing Systems (AREA)
  • Molecular Biology (AREA)
  • Evolutionary Computation (AREA)
  • Mathematical Physics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Health & Medical Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present invention provides an artificial intelligence data search and distribution method and system, relating to the field of artificial intelligence technology. The method includes: obtaining a search request input by a user, semantically expanding the search request to obtain target expanded search keywords, and performing a semantic search according to the target expanded search keywords and a pre-built multimodal semantic index to obtain preliminary search results; extracting and fusing features of the preliminary search results through a multimodal fusion neural network to obtain a multimodal fusion feature vector, adjusting its weights with an attention mechanism, calculating a relevance score between each item in the preliminary search results and the user's search intent, and sorting the preliminary search results by relevance score to obtain ordered search results; and inputting the ordered search results into a personalized recommendation system to generate a final recommendation result, adapting the presentation of the final recommendation result based on the terminal device and a user portrait, determining the distribution content, and pushing it to the terminal device.

Description

Artificial intelligent data searching and distributing method and system
Technical Field
The invention relates to the technical field of artificial intelligence, in particular to an artificial intelligence data searching and distributing method and system.
Background
Internet users are increasingly dependent on search engines and recommendation systems. Users want to obtain the information they need quickly and accurately, and they also want the system to provide personalized recommendations that match their interests and behavior habits. However, with the explosive growth of information volume, how to efficiently extract relevant information from massive data and distribute it accurately according to the specific needs and preferences of users has become a technical problem to be solved.
Traditional search engines rely primarily on keyword matching and rule-based ranking algorithms and have difficulty fully understanding the semantic needs and interest preferences of users, which results in poor relevance of search results and low user satisfaction. Meanwhile, traditional recommendation systems often depend on collaborative filtering or content-based recommendation methods. Although these methods can improve the recommendation effect to a certain extent, they still face problems such as data sparsity and cold start, and real-time optimization and dynamic adjustment are difficult to achieve. With its rapid development, artificial intelligence technology shows great potential in the fields of search and recommendation: it can improve the accuracy of search and recommendation, provide a richer semantic background, enhance the understanding capability of the system, dynamically adjust the recommendation strategy through continuous user interaction, and improve user satisfaction and the long-term benefits of the system.
In summary, many challenges still exist in practical applications. Multi-modal data needs to be fused effectively to improve the diversity and coverage of search and recommendation results; semantic expansion needs to be performed with a knowledge graph to improve the understanding of search requests and the recall rate of related results; and user portraits and recommendation models need to be updated dynamically through real-time feedback and incremental learning to provide more accurate personalized recommendations.
Disclosure of Invention
The embodiment of the invention provides an artificial intelligent data searching and distributing method and system, which can solve the problems in the prior art.
In a first aspect of an embodiment of the present invention, an artificial intelligence data searching and distributing method is provided, comprising:
obtaining a search request input by a user, carrying out semantic expansion on the search request based on a pre-constructed comprehensive knowledge graph to obtain target expanded search keywords, and carrying out semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-constructed multi-modal semantic index to obtain preliminary search results;
performing feature extraction and fusion on the preliminary search results through a multi-modal fusion neural network to obtain multi-modal fusion feature vectors, adjusting the weights of the multi-modal fusion feature vectors using an attention mechanism, calculating a relevance score between each item of data in the preliminary search results and the user's search intention according to the adjusted multi-modal fusion feature vectors, and sorting the preliminary search results by relevance score to obtain ordered search results;
inputting the ordered search results into a personalized recommendation system to generate a final recommendation result, adapting the presentation of the final recommendation result based on the terminal device and a user portrait, determining distribution content, and pushing the distribution content to the terminal device.
In an alternative embodiment of the present invention, obtaining a search request input by a user, carrying out semantic expansion on the search request based on a pre-constructed comprehensive knowledge graph to obtain target expanded search keywords, and carrying out semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-constructed multi-modal semantic index to obtain preliminary search results comprises:
acquiring a search request input by a user, preprocessing the search request to obtain a search request text, and converting the search request text into a semantic vector representation of the search request;
carrying out semantic expansion on the semantic vector representation using the comprehensive knowledge graph to obtain the target expanded search keywords;
searching a pre-constructed multi-modal semantic index according to the target expanded search keywords, wherein the multi-modal semantic index comprises a structured data index and an unstructured data index;
for the structured data, matching the target expanded search keywords against the entities and relationships in the multi-modal knowledge graph using a graph-based query language to obtain a first search result;
for the unstructured data, mapping the target expanded search keywords into a multi-modal semantic space using a pre-trained multi-modal representation learning model, and obtaining a second search result through vector similarity calculation;
carrying out semantic fusion on the first search result and the second search result, and sorting based on semantic relevance and importance to obtain the preliminary search results.
In an alternative embodiment of the present invention, carrying out semantic expansion on the semantic vector representation using the comprehensive knowledge graph to obtain the target expanded search keywords comprises:
vectorizing the entities and relationships in the comprehensive knowledge graph with a knowledge graph embedding model to obtain entity-relation vector representations, calculating the similarity between the semantic vector representation and the entity-relation vector representations, and selecting the entity with the highest similarity as the candidate expanded keyword;
taking the candidate expanded keyword as the center, performing context sampling in the comprehensive knowledge graph with a random walk algorithm, and calculating a node centrality measure and a node importance score from the node degree, centrality and clustering coefficient of the comprehensive knowledge graph to generate an expanded keyword sequence;
screening the expanded keyword sequence based on the node centrality measure and the node importance score to obtain the target expanded search keywords.
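The expansion steps above can be illustrated with a minimal Python sketch. It is only an illustration under stated assumptions, not the patented implementation: the toy graph, the two-dimensional embeddings and the function names (`pick_candidate`, `random_walk_counts`, `expand_keywords`) are hypothetical, and the importance score uses only visit frequency and degree centrality, whereas the text also mentions clustering coefficients.

```python
import math
import random


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def pick_candidate(query_vec, entity_vecs):
    """Return the entity whose embedding is most similar to the query vector."""
    return max(entity_vecs, key=lambda e: cosine(query_vec, entity_vecs[e]))


def random_walk_counts(graph, start, walks=200, length=3, seed=0):
    """Sample short random walks from `start` and count visits to other nodes."""
    rng = random.Random(seed)
    counts = {}
    for _ in range(walks):
        node = start
        for _step in range(length):
            neighbors = graph.get(node, [])
            if not neighbors:
                break
            node = rng.choice(neighbors)
            if node != start:
                counts[node] = counts.get(node, 0) + 1
    return counts


def expand_keywords(graph, start, top_k=3, seed=0):
    """Score visited nodes by visit count times degree centrality; keep the top_k."""
    counts = random_walk_counts(graph, start, seed=seed)
    n = max(len(graph) - 1, 1)
    scores = {node: c * (len(graph.get(node, [])) / n) for node, c in counts.items()}
    return sorted(scores, key=lambda k: (-scores[k], k))[:top_k]


# Toy undirected knowledge graph and embeddings (hypothetical data).
GRAPH = {
    "laptop": ["notebook", "battery", "ultrabook"],
    "notebook": ["laptop", "ultrabook"],
    "battery": ["laptop"],
    "ultrabook": ["laptop", "notebook"],
}
ENTITY_VECS = {"laptop": [0.9, 0.1], "battery": [0.1, 0.9], "notebook": [0.8, 0.3]}
```

Calling `pick_candidate([1.0, 0.0], ENTITY_VECS)` selects the candidate entity, and `expand_keywords(GRAPH, candidate)` returns the neighbors that the walks reach most often, weighted by how well connected they are.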
In an alternative embodiment of the present invention, performing feature extraction and fusion on the preliminary search results through a multi-modal fusion neural network to obtain multi-modal fusion feature vectors, adjusting the weights of the multi-modal fusion feature vectors using an attention mechanism, calculating a relevance score between each item of data in the preliminary search results and the user's search intention according to the adjusted multi-modal fusion feature vectors, and sorting the preliminary search results by relevance score to obtain ordered search results comprises:
for the structured data in the preliminary search results, extracting the key attributes and corresponding attribute values to determine dominant features; for the unstructured data in the preliminary search results, extracting semantic features with a pre-trained deep learning model to determine deep features;
inputting the dominant features and the deep features into a multi-modal fusion neural network, and carrying out feature fusion through multi-layer nonlinear transformation and interactive operation to obtain multi-modal fusion feature vectors;
based on an attention mechanism, performing interactive calculation between the semantic vector representation of the user's search intention and the multi-modal fusion feature vector to obtain an attention weight matrix, and adjusting the weight distribution across the dimensions of the multi-modal fusion feature vector through the attention weight matrix to determine a multi-modal weighted fusion feature vector;
calculating, with a similarity measure, a relevance score between the multi-modal weighted fusion feature vector and the semantic vector representation of the user's search intention, and sorting the preliminary search results by relevance score from high to low to determine the ordered search results.
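The last two steps (attention-based reweighting followed by similarity ranking) can be sketched as follows. This is a simplified illustration, assuming that the fused feature vector and the query vector share the same dimensionality and that the attention weights reduce to one softmax weight per dimension rather than a full weight matrix; the function names are hypothetical.

```python
import math


def softmax(xs):
    """Numerically stable softmax over a list of floats."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def attention_reweight(query, fused):
    """Scale each dimension of `fused` by a softmax weight derived from the
    element-wise interaction between the query and the fused vector."""
    weights = softmax([q * f for q, f in zip(query, fused)])
    return [w * f for w, f in zip(weights, fused)]


def rank_results(query, items):
    """items maps a result id to its fused feature vector.
    Returns (ids sorted from high to low score, score per id)."""
    scores = {i: cosine(attention_reweight(query, v), query) for i, v in items.items()}
    return sorted(scores, key=lambda i: (-scores[i], i)), scores


order, scores = rank_results(
    [1.0, 0.0, 0.0],
    {"a": [0.9, 0.1, 0.0], "b": [0.1, 0.9, 0.0]},
)
```

In this toy input, result `a` points in nearly the same direction as the query and is ranked ahead of `b`.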
In an alternative embodiment of the present invention, inputting the ordered search results into a personalized recommendation model to generate a final recommendation result, adapting the presentation of the final recommendation result based on the terminal device and the user portrait, determining distribution content, and pushing the distribution content to the terminal device comprises:
based on a pre-acquired user portrait, obtaining a recommendation candidate set from the ordered search results through a content recommendation algorithm, obtaining by reasoning potential information corresponding to the potential interests of the user, and adding the potential information to the recommendation candidate set to synthesize a recommendation result;
optimizing the recommendation result in real time through a reinforcement learning model according to the real-time feedback and behavior changes of the user, and dynamically adjusting the display and ordering of the recommendation result by dynamically adjusting priority ranking parameters in combination with the explicit and implicit feedback of the user, to generate a final recommendation result;
based on terminal device information and the user portrait, adapting the presentation of the final recommendation result, determining distribution content, and pushing the distribution content to the corresponding terminal device;
after the terminal device receives the final recommendation result, acquiring user interaction data in real time to generate data feedback, wherein the data feedback is returned through data reflow and is used to update the user portrait based on incremental update calculation and to iteratively update the personalized recommendation model.
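The incremental update of the user portrait in the last step can be illustrated with a small sketch. The text does not specify the update rule, so the exponential moving average below, the `rate` parameter and the representation of the portrait as a dictionary of interest weights are all assumptions made for illustration.

```python
def update_portrait(portrait, feedback, rate=0.2):
    """Blend new feedback signals into existing interest weights.

    portrait: dict mapping a feature name to its current interest weight.
    feedback: dict mapping a feature name to a new signal in [0, 1],
              for example derived from clicks or dwell time (hypothetical).
    rate:     how strongly new feedback overrides history (assumed value).
    Returns a new dict; the input portrait is left unchanged.
    """
    updated = dict(portrait)
    for feature, signal in feedback.items():
        old = updated.get(feature, 0.0)
        updated[feature] = (1 - rate) * old + rate * signal
    return updated


old_portrait = {"sports": 0.5}
new_portrait = update_portrait(old_portrait, {"sports": 1.0, "music": 0.5})
```

Because each update only touches the features present in the new feedback, the portrait can be refreshed after every interaction without reprocessing the full history.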
In an alternative embodiment of the present invention, obtaining the recommendation candidate set from the ordered search results through the content recommendation algorithm based on the pre-acquired user portrait, obtaining by reasoning the potential information corresponding to the potential interests of the user, and adding the potential information to the recommendation candidate set to synthesize the recommendation result comprises:
in the content recommendation algorithm, calculating the user's content preference by the following formula:
[Formula not reproduced in the source text.]
where $u$ denotes a user, $c$ denotes a content item, $p_{u,c}$ denotes the preference score of user $u$ for content $c$, $f$ denotes a feature of the content, $F$ denotes the set of all features, $w_{u,f}$ denotes the preference weight of user $u$ for feature $f$, $v_{c,f}$ denotes the value of content $c$ on feature $f$, $x_f$ denotes the importance weight of feature $f$, $y_{u,c,f}$ denotes the interaction strength of user $u$ and content $c$ on feature $f$, $\alpha$ denotes the parameter controlling the strength of the similarity term, $sim_{u,c}$ denotes the similarity of user $u$ and content $c$, $\beta$ denotes the parameter controlling the strength of the popularity term, and $pop_c$ denotes the popularity of content $c$.
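The formula itself does not survive in this text, so the following Python sketch is only a guess at how the listed quantities could be combined. It assumes that the per-feature terms are multiplied and summed over all features and that the similarity and popularity terms are added with weights alpha and beta; this additive form is an assumption, not the formula of the patent.

```python
def preference_score(w_u, v_c, x, y_uc, alpha, sim_uc, beta, pop_c):
    """Assumed form: sum over features f of w[u,f] * v[c,f] * x[f] * y[u,c,f],
    plus alpha * sim[u,c] plus beta * pop[c].

    w_u, v_c, x and y_uc are dicts keyed by feature name; the remaining
    arguments are floats. The combination rule is hypothetical.
    """
    feature_term = sum(w_u[f] * v_c[f] * x[f] * y_uc[f] for f in x)
    return feature_term + alpha * sim_uc + beta * pop_c


score = preference_score(
    w_u={"genre": 0.5, "length": 0.2},
    v_c={"genre": 1.0, "length": 0.5},
    x={"genre": 1.0, "length": 2.0},
    y_uc={"genre": 1.0, "length": 1.0},
    alpha=0.1, sim_uc=0.5,
    beta=0.2, pop_c=0.5,
)
```

Under this assumed form, alpha and beta let the system trade off feature-level preference against overall user-content similarity and general popularity.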
In a second aspect of an embodiment of the present invention, an artificial intelligence data searching and distributing system is provided, comprising:
a first unit for acquiring a search request input by a user, carrying out semantic expansion on the search request based on a pre-constructed comprehensive knowledge graph to obtain target expanded search keywords, and carrying out semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-constructed multi-modal semantic index to obtain preliminary search results;
a second unit for performing feature extraction and fusion on the preliminary search results through a multi-modal fusion neural network to obtain multi-modal fusion feature vectors, adjusting the weights of the multi-modal fusion feature vectors using an attention mechanism, calculating a relevance score between each item of data in the preliminary search results and the user's search intention according to the adjusted multi-modal fusion feature vectors, and sorting the preliminary search results by relevance score to obtain ordered search results;
a third unit for inputting the ordered search results into the personalized recommendation system to generate a final recommendation result, adapting the presentation of the final recommendation result based on the terminal device and the user portrait, determining distribution content, and pushing the distribution content to the terminal device.
In a third aspect of an embodiment of the present invention, an electronic device is provided, including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method described above.
In a fourth aspect of an embodiment of the present invention, a computer-readable storage medium is provided, having stored thereon computer program instructions which, when executed by a processor, implement the method described above.
In the embodiment of the invention, an accurate semantic vector representation of the search request is generated through preprocessing and a word embedding model. Semantic expansion based on the knowledge graph generates related expanded keywords, which improves the coverage and accuracy of the search results. Combining the multi-modal semantic indexes of structured and unstructured data improves the diversity and relevance of the search results. By extracting and fusing the features of structured and unstructured data, the search results no longer depend on a single modality and reflect the user's search intention more comprehensively. Generating the multi-modal weighted fusion feature vector and applying a similarity measure makes the ranking of the search results more accurate and personalized and improves user satisfaction. The final ordered search results meet general search requirements and can also be used for personalized recommendation, which improves the practicability and user stickiness of the search system. Because the content recommendation algorithm is combined with a reasoning algorithm, the user's historical behavior and interests are considered when the recommendation candidate set is generated, and potential information corresponding to the user's potential interests is obtained through reasoning, which expands the diversity and coverage of the recommendations and provides a more comprehensive recommendation result. The user's interaction data is returned to the recommendation system through a data reflow mechanism, and new user feedback data is combined with historical data based on incremental update calculation to update the user portrait and the personalized recommendation model. The personalized recommendation model is continuously and iteratively optimized with an incremental learning algorithm, so that the updated user portrait and model reflect the user's real-time interests and demands more accurately and provide a more accurate and satisfactory recommendation service.
Drawings
FIG. 1 is a flow chart of an artificial intelligence data searching and distributing method according to an embodiment of the invention;
FIG. 2 is a schematic diagram of an artificial intelligence data searching and distributing system according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
FIG. 1 is a schematic flow chart of an artificial intelligence data searching and distributing method according to an embodiment of the invention. As shown in FIG. 1, the method includes:
S101, acquiring a search request input by a user, carrying out semantic expansion on the search request based on a pre-constructed comprehensive knowledge graph to obtain a target expanded search keyword, and carrying out semantic search on structured data and unstructured data according to the target expanded search keyword and a pre-constructed multi-mode semantic index to obtain a preliminary search result;
The comprehensive knowledge graph is a tool that represents knowledge through a graph structure. It contains domain knowledge from a plurality of search-related fields, a large number of entities such as people, places and events, and relationships among the entities such as 'is a friend of someone' and 'is located somewhere'. The entities and relationships are embedded in a graph and displayed as nodes and edges. The knowledge graph aims to integrate multi-source heterogeneous data in a structured way and to provide a global view that supports complex query and reasoning.
Semantic expansion means performing semantic analysis on the input text, extracting the potential meanings behind the text, and using these meanings to generate a group of expanded keywords. The entities and relationships in the knowledge graph associate the original text with related context information and widen the semantic range of the text. The purpose of semantic expansion is to improve the accuracy and comprehensiveness of the search and to find content more relevant to the user's needs.
In the embodiment, the semantic expansion is performed on the search request through the comprehensive knowledge graph, potential meanings and contexts in user input can be captured, and more accurate expanded search keywords are generated, so that the relevance and accuracy of search results are improved; the semantic expansion can discover more related keywords, cover more aspects possibly concerned by a user, and enable search results to be more comprehensive and rich; the combination of the comprehensive knowledge graph and the multi-mode semantic index enables the search system to find related data more quickly, reduces waiting time of users and improves response speed of the system; through semantic expansion and multi-modal indexing, data of different types and different sources can be effectively associated and mined, potential links between the data are discovered, and more valuable information is provided.
In an alternative embodiment, obtaining a search request input by a user, carrying out semantic expansion on the search request based on a pre-constructed comprehensive knowledge graph to obtain target expanded search keywords, and carrying out semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-constructed multi-modal semantic index to obtain preliminary search results includes:
acquiring a search request input by a user, preprocessing the search request to obtain a search request text, and converting the search request text into a semantic vector representation of the search request;
carrying out semantic expansion on the semantic vector representation using the comprehensive knowledge graph to obtain the target expanded search keywords;
searching a pre-constructed multi-modal semantic index according to the target expanded search keywords, wherein the multi-modal semantic index comprises a structured data index and an unstructured data index;
for the structured data, matching the target expanded search keywords against the entities and relationships in the multi-modal knowledge graph using a graph-based query language to obtain a first search result;
for the unstructured data, mapping the target expanded search keywords into a multi-modal semantic space using a pre-trained multi-modal representation learning model, and obtaining a second search result through vector similarity calculation;
carrying out semantic fusion on the first search result and the second search result, and sorting based on semantic relevance and importance to obtain the preliminary search results.
The first search result is the search result generated from structured data. Such data is usually stored in a database or a knowledge graph and has a definite structure and relationships. Entities and relationships matching the target expanded search keywords are retrieved with the graph-based query language to obtain structured information related to the user's search request. The results typically include well-defined data such as specific entries in the database, associations, and attribute values.
The second search result is the search result generated from unstructured data, which comprises text, images, videos and other content without a fixed structure. The target expanded search keywords are mapped into a multi-modal semantic space through a pre-trained multi-modal representation learning model, and the content most relevant to the target expanded search keywords is retrieved from the unstructured data index through vector similarity calculation. The results generally include documents, pictures, video clips and similar information.
A search request input by a user is obtained and preprocessed, including word segmentation, stop-word removal, part-of-speech tagging and the like, to obtain a search request text. Each word in the search request text is mapped to a low-dimensional dense vector representation by a pre-trained word embedding model, preferably a GloVe model. Based on the attention mechanism, the word vectors in the search request text are aggregated to obtain the semantic vector representation of the search request.
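The preprocessing and attention-based aggregation described above can be sketched in a few lines of Python. This is a minimal illustration under several assumptions: whitespace splitting stands in for word segmentation, part-of-speech tagging is omitted, the stop-word list and the two-dimensional embedding table are hypothetical stand-ins for a pre-trained GloVe vocabulary, and attention is reduced to a softmax over dot products with a context vector.

```python
import math

STOP_WORDS = {"the", "a", "an", "of", "for", "in", "with"}  # hypothetical list


def preprocess(text):
    """Lowercase, split on whitespace, strip punctuation, drop stop words."""
    tokens = [t.strip(".,!?;:") for t in text.lower().split()]
    return [t for t in tokens if t and t not in STOP_WORDS]


def attention_pool(tokens, embeddings, context):
    """Softmax-weighted average of word vectors.

    The weight of each word comes from the dot product between its vector
    and `context`; words missing from `embeddings` are skipped.
    """
    vecs = [embeddings[t] for t in tokens if t in embeddings]
    if not vecs:
        return [0.0] * len(context)
    scores = [sum(c * x for c, x in zip(context, v)) for v in vecs]
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    return [sum(w * v[d] for w, v in zip(weights, vecs)) for d in range(len(context))]


# Toy embedding table (hypothetical values).
EMB = {"laptop": [1.0, 0.0], "travel": [0.0, 1.0]}
tokens = preprocess("The best laptop for travel!")
query_vec = attention_pool(tokens, EMB, context=[0.0, 0.0])
```

With a zero context vector every word receives the same weight, so the pooled vector is a plain average; a learned context vector would shift weight toward the more informative words.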
Carrying out semantic expansion on semantic vector representation of a search request by utilizing a pre-constructed comprehensive knowledge graph, finding out entity nodes most relevant to the semantic vector representation of the search request in the comprehensive knowledge graph, and acquiring attribute information and associated entities of the corresponding entity nodes; and generating an expanded keyword related to the original search request according to the attribute information of the entity node and the associated entity to form a target expanded search keyword set.
The pre-constructed multi-modal semantic index is searched according to the target expanded search keywords. For the structured data index, a graph-based query language, preferably the Cypher query language, is used to match the target expanded search keywords against the entities and relationships in the multi-modal knowledge graph, and the related structured data found forms the first search result. For the unstructured data index, a pre-trained multi-modal representation learning model, preferably a ViLBERT model, maps the target expanded search keywords into a multi-modal semantic space to obtain their semantic vector representation. The cosine similarity between this representation and the semantic vector representations of the items in the unstructured data index is then calculated, and the unstructured data most relevant to the target expanded search keywords forms the second search result.
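The cosine-similarity lookup against the unstructured data index can be sketched as below. The index contents are hand-written toy vectors; in the described system they would come from the multi-modal representation model, which is not reproduced here. The function name `search_unstructured` and the item ids are hypothetical.

```python
import math


def cosine(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def search_unstructured(query_vec, index, top_k=2):
    """Return the top_k (item_id, similarity) pairs, most similar first.

    index maps an item id (document, image, video clip) to its vector in
    the shared semantic space.
    """
    scored = [(cosine(query_vec, vec), item_id) for item_id, vec in index.items()]
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [(item_id, score) for score, item_id in scored[:top_k]]


# Toy index (hypothetical vectors).
INDEX = {"doc1": [1.0, 0.0], "img7": [0.6, 0.8], "vid3": [0.0, 1.0]}
hits = search_unstructured([1.0, 0.0], INDEX)
```

A brute-force scan like this is only practical for small indexes; a production system would typically use an approximate nearest-neighbor structure instead.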
Carrying out semantic fusion on the first search result, namely the structured data, and the second search result, namely the unstructured data; for the first search result, calculating a semantic relevance score for each result item according to the relevance and importance of the matched entity and relationship; for the second search result, calculating a semantic relevance score of each result item according to the semantic similarity with the target expanded search keyword; comprehensively considering semantic relevance scores of the first search result and the second search result, and sequencing all result items to obtain a preliminary search result list; and further adjusting and optimizing the preliminary search result list according to the importance of the result items to obtain final search result ordering.
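One way to picture the fusion and re-ordering step is the sketch below. The equal weighting of the two result sets, the multiplicative importance boost and all identifiers are assumptions made for illustration; the patent does not state how the scores are combined.

```python
def fuse_results(first, second, w_structured=0.5, w_unstructured=0.5):
    """Merge two scored result lists into one preliminary ranking.

    Items that appear in both lists accumulate both weighted scores.
    """
    merged = {}
    for item_id, score in first:
        merged[item_id] = merged.get(item_id, 0.0) + w_structured * score
    for item_id, score in second:
        merged[item_id] = merged.get(item_id, 0.0) + w_unstructured * score
    return sorted(merged.items(), key=lambda pair: pair[1], reverse=True)

def rerank_by_importance(ranked, importance, boost=0.2):
    """Adjust the preliminary ranking with a per-item importance in [0, 1]."""
    adjusted = [
        (item_id, score * (1.0 + boost * importance.get(item_id, 0.0)))
        for item_id, score in ranked
    ]
    return sorted(adjusted, key=lambda pair: pair[1], reverse=True)

first_result = [("kg:1", 0.9), ("kg:2", 0.4)]     # structured matches
second_result = [("doc:7", 0.8), ("kg:2", 0.6)]   # unstructured matches
preliminary = fuse_results(first_result, second_result)
final = rerank_by_importance(preliminary, {"doc:7": 1.0})
```

In this toy run the item found by both indexes ranks first, and the importance adjustment then lifts `doc:7` above `kg:1`.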
In the embodiment, accurate search request semantic vector representation is generated through preprocessing and word embedding models; based on semantic expansion of the knowledge graph, related expansion keywords are generated, and coverage and accuracy of search results are improved; the multi-mode semantic indexes of the structured and unstructured data are combined, so that the diversity and the relevance of search results are improved; through semantic fusion and correlation calculation, the search results are finely ordered and optimized, and the user search experience and satisfaction are improved.
In an alternative embodiment, using the comprehensive knowledge graph to semantically expand the semantic vector representation and obtain the target expanded search keyword includes:
Carrying out vectorization processing on the entities and the relations in the comprehensive knowledge graph by adopting a knowledge graph embedding model to obtain entity relation vector representation, and selecting an entity with the highest similarity as a candidate expansion keyword by calculating the similarity between semantic vector representation and entity relation vector representation;
Taking the candidate expanded keywords as the center, adopting a random walk algorithm to perform context sampling in the comprehensive knowledge graph, and calculating node centrality measurement and node importance score by determining node degree, centrality and clustering coefficient of the comprehensive knowledge graph to generate an expanded keyword sequence;
And screening the expanded keyword sequence based on the node centrality measurement and the node importance score to obtain a target expanded search keyword.
The node degree specifically refers to the number of edges connected to a node in the knowledge graph, that is, the number of other nodes with which the node is directly associated. In a directed graph, the node degree can be divided into out-degree and in-degree, namely the number of outgoing edges and the number of incoming edges; in an undirected graph, the node degree is the total number of edges connected to the node.
The centrality specifically refers to an index for measuring the relative importance of nodes in a graph, and several centrality measures exist. Betweenness centrality measures the number of shortest paths between other node pairs in the graph that pass through a node; nodes with high betweenness centrality have more control over the network. Closeness centrality measures the average shortest path length between a node and all other nodes; nodes with high closeness centrality can propagate information to other parts of the network faster. Degree centrality is measured directly by the degree of the node; the higher the degree, the greater the importance of the node in the network.
The node centrality measurement specifically refers to an index for comprehensively evaluating the importance and influence of a node in a knowledge graph, and based on the centrality concept, the relative importance of the node in the overall graph structure is calculated by combining the structure position, the connection quantity and the connection quality of the node in the network.
The node importance score specifically refers to a comprehensive score calculated according to the node centrality measurement and other related indexes, the comprehensive score reflects the global importance and the local importance of the nodes in the knowledge graph, and the higher the score, the more critical and the more influencing the nodes in the graph.
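The degree and closeness measures defined above can be computed on a small undirected graph as in the following sketch. The adjacency-dictionary representation and the star-shaped example graph are assumptions for illustration; betweenness centrality is omitted for brevity.

```python
from collections import deque

def degree(graph, node):
    """Number of edges incident to the node in an undirected graph."""
    return len(graph[node])

def closeness(graph, node):
    """Closeness centrality: reachable nodes divided by the sum of
    shortest-path distances to them (found by breadth-first search)."""
    dist = {node: 0}
    queue = deque([node])
    while queue:
        current = queue.popleft()
        for neighbor in graph[current]:
            if neighbor not in dist:
                dist[neighbor] = dist[current] + 1
                queue.append(neighbor)
    total = sum(dist.values())
    return (len(dist) - 1) / total if total > 0 else 0.0

# Star graph: "A" is the hub, "B", "C" and "D" are leaves.
graph = {"A": {"B", "C", "D"}, "B": {"A"}, "C": {"A"}, "D": {"A"}}
hub_closeness = closeness(graph, "A")
leaf_closeness = closeness(graph, "B")
```

The hub reaches every node in one step, so its closeness is higher than that of any leaf, which matches the intuition that central nodes propagate information faster.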
And carrying out vectorization processing on the entities and relations in the comprehensive knowledge graph by adopting a knowledge graph embedding model, preferably adopting a ComplEx embedding model, mapping each entity and relation in the knowledge graph into a low-dimensional dense vector space to obtain entity relation vector representation, for semantic vector representation of a search request, calculating similarity between the semantic vector representation of the search request and each entity relation vector representation in the knowledge graph, preferably calculating Euclidean distance, finding the entity most relevant to the semantic of the search request, and selecting the entity with the highest similarity as a candidate expansion keyword to form a candidate expansion keyword set.
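Selecting candidate expansion entities by Euclidean distance can be sketched as below. The example uses real-valued toy vectors and assumes the query vector and the entity vectors already live in the same space; ComplEx itself produces complex-valued embeddings, and the patent does not describe how the two spaces are aligned. All names and values are hypothetical.

```python
import math

def nearest_entities(query_vec, entity_vectors, k):
    """Return the k entity names with the smallest Euclidean distance
    to the query vector (smaller distance means higher similarity)."""
    ranked = sorted(
        entity_vectors.items(),
        key=lambda pair: math.dist(query_vec, pair[1]),
    )
    return [name for name, _ in ranked[:k]]

entity_vectors = {
    "Paris": [0.9, 0.1],
    "France": [0.8, 0.3],
    "Banana": [-0.5, 0.9],
}
candidates = nearest_entities([1.0, 0.0], entity_vectors, k=2)
```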
Taking a candidate expansion keyword as the center, a random walk algorithm is used to perform context sampling in the comprehensive knowledge graph: starting from the candidate expansion keyword, neighbor nodes are selected at random according to the entity relations in the knowledge graph, generating a node sequence that contains context information. The importance of the nodes in the knowledge graph is evaluated by counting the node degree, namely the number of edges connected to the node; centrality, such as betweenness centrality and closeness centrality; and the clustering coefficient, namely the proportion of triangles formed between the node and its neighbor nodes. According to the node degree, the centrality and the clustering coefficient, the centrality measurement of each node is calculated, reflecting the importance of the node in the knowledge graph; the importance score of each node is then calculated by combining the centrality measurement with the walk frequency of the node, yielding the expanded keyword sequence.
Based on the node centrality measurement and the node importance score, screening the expanded keyword sequence, setting a threshold value of the centrality measurement and the importance score, filtering out nodes lower than the threshold value, reserving nodes with high centrality and high importance, de-duplicating and sequencing the screened nodes to obtain a final target expanded search keyword set, wherein the target expanded search keyword set contains expanded keywords which are related to the original search request semanteme and have high importance in a knowledge graph, and can effectively expand the semantic range of the search request.
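The sampling, scoring and screening steps can be sketched together as follows. The equal 0.5 weights, the use of normalized degree plus clustering coefficient as the centrality measurement, the single combined threshold and the toy graph are all assumptions; the patent leaves the exact combination open.

```python
import random
from collections import Counter

def walk_visit_counts(graph, start, walk_length=4, num_walks=50, seed=0):
    """Count how often each node is visited by random walks from `start`."""
    rng = random.Random(seed)
    visits = Counter()
    for _ in range(num_walks):
        node = start
        for _ in range(walk_length):
            neighbors = sorted(graph[node])
            if not neighbors:
                break
            node = rng.choice(neighbors)
            visits[node] += 1
    return visits

def clustering_coefficient(graph, node):
    """Fraction of neighbor pairs of `node` that are themselves connected."""
    neighbors = sorted(graph[node])
    k = len(neighbors)
    if k < 2:
        return 0.0
    links = sum(
        1
        for i in range(k)
        for j in range(i + 1, k)
        if neighbors[j] in graph[neighbors[i]]
    )
    return 2.0 * links / (k * (k - 1))

def expand_keywords(graph, start, threshold=0.3):
    """Score visited nodes and keep those at or above the threshold."""
    visits = walk_visit_counts(graph, start)
    visits.pop(start, None)  # the seed keyword itself is not an expansion
    if not visits:
        return []
    max_visits = max(visits.values())
    max_degree = max(len(nbs) for nbs in graph.values())
    scored = []
    for node, count in visits.items():
        centrality = (0.5 * len(graph[node]) / max_degree
                      + 0.5 * clustering_coefficient(graph, node))
        importance = 0.5 * centrality + 0.5 * count / max_visits
        if importance >= threshold:
            scored.append((node, importance))
    # A dict keyed by node already de-duplicates; sort by score, then name.
    return sorted(scored, key=lambda pair: (-pair[1], pair[0]))

graph = {
    "python": {"numpy", "pandas", "java"},
    "numpy": {"python", "pandas"},
    "pandas": {"python", "numpy"},
    "java": {"python"},
}
keywords = expand_keywords(graph, "python", threshold=0.3)
```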
In the embodiment, the embedding model captures complex relations and semantic information in the knowledge graph and maps entities and relations to a low-dimensional space, making similarity calculation more efficient; calculating the Euclidean distance identifies the entities most relevant to the search request, so that the expanded keywords are semantically close to it; the random walk algorithm captures context information among entities in the knowledge graph and generates semantically rich node sequences; counting node degree, centrality and clustering coefficient evaluates node importance and thereby supports the quality of the expanded keywords; combining the centrality measurement with the walk frequency of each node yields importance scores that make the expanded keyword sequence more comprehensive; setting thresholds filters out nodes of low centrality and low importance and retains high-quality expanded keywords; de-duplicating and sorting the screened nodes further refines the expanded keyword set; and the final target expanded search keyword set effectively widens the semantic range of the search request, improving the coverage and precision of the search.
S102, carrying out feature extraction and fusion on the preliminary search results through a multi-modal fusion neural network to obtain multi-modal fusion feature vectors, carrying out weight adjustment on the multi-modal fusion feature vectors by using an attention mechanism, calculating relevance scores of all data in the preliminary search results and search intentions of users according to the adjusted multi-modal fusion feature vectors, and sequencing the preliminary search results according to the relevance scores to obtain ordered search results;
The user search intention specifically refers to the information or problem that the user wants to find or solve when entering a search request. It reflects the actual needs and purpose of the user and includes not only the surface keywords but also the underlying semantics and context; identifying and understanding the user search intention is key to improving the performance of the search system and user satisfaction;
The relevance score specifically refers to an index for quantifying the matching degree of the search result and the search intention of the user, reflects the matching degree of each data item in the preliminary search result and the search intention of the user, and is generally calculated through methods such as feature extraction, semantic analysis and the like, wherein the higher the relevance score is, the more the data item meets the requirements and expectations of the user.
In the embodiment, the multi-modal fusion neural network is utilized to comprehensively extract and fuse the features of the structured and unstructured data in the primary search result, so that a unified multi-modal fusion feature vector is generated, and the richness and the comprehensiveness of feature expression are improved; the multi-mode fusion feature vector is subjected to weight adjustment through an attention mechanism, so that the feature highly related to the search intention of the user can be highlighted, the influence of noise features is reduced, the calculation precision of the relevance score is improved, and the precision of the search result is improved; and calculating the relevance score of each data in the preliminary search results and the search intention of the user according to the adjusted multimodal fusion feature vector, and sequencing according to the scores to ensure that the most relevant search results are ranked in front and provide more valuable information for the user.
In an alternative embodiment, the feature extraction and fusion are performed on the preliminary search result through a multi-modal fusion neural network to obtain a multi-modal fusion feature vector, the weight of the multi-modal fusion feature vector is adjusted by using an attention mechanism, the relevance score of each data in the preliminary search result and the search intention of the user is calculated according to the adjusted multi-modal fusion feature vector, the preliminary search result is ordered according to the relevance score, and the obtaining of the ordered search result comprises:
Based on the structured data in the preliminary search result, extracting corresponding key attributes and corresponding attribute values, and determining explicit features; based on unstructured data in the preliminary search results, extracting semantic features by adopting a pre-trained deep learning model, and determining deep features;
Inputting the explicit features and the deep features into a multi-modal fusion neural network, and carrying out feature fusion through multi-layer nonlinear transformation and interactive operation to obtain multi-modal fusion feature vectors;
Based on an attention mechanism, carrying out interactive calculation on semantic vector representations corresponding to the user search intention and the multi-modal fusion feature vectors to obtain an attention weight matrix, and adjusting weight distribution of different dimensionalities in the multi-modal fusion feature vectors through the attention weight matrix to determine multi-modal weighted fusion feature vectors;
And calculating a correlation score between the multimodal weighted fusion feature vector and the semantic vector representation of the user search intention by adopting a similarity measurement method, sequencing the preliminary search results according to the correlation score from high to low, and determining the ordered search results.
Analyzing the structured data in the preliminary search results, and identifying key attributes thereof, such as title, category, timestamp, browsing amount, click amount and the like; extracting attribute values corresponding to each key attribute to form a structured feature vector; and taking the extracted structured feature vector as an explicit feature for subsequent feature fusion.
Preprocessing unstructured data in the primary search results, such as texts, images, videos and the like, such as text segmentation, image normalization and the like; performing feature extraction on unstructured data by adopting a pre-trained deep learning model, preferably ResNet; for text data, extracting semantic features of the text data by using a pre-trained language model to obtain text semantic vector representation; for image and video data, extracting visual characteristics of the image and video data by using a pre-trained convolutional neural network to obtain image characteristic vector representation and video characteristic vector representation; and taking the extracted unstructured data features as deep features for subsequent feature fusion.
Inputting the explicit features and the deep features into a multi-modal fusion neural network, fusing the features of different modalities through multi-layer nonlinear transformation and interaction operations, realizing nonlinear combination and interaction of the features by using fully connected layers, an attention mechanism, a gating mechanism and the like in the fusion process, and obtaining the multi-modal fusion feature vector through forward propagation of the multi-modal fusion neural network.
Using the attention mechanism, the semantic vector representation corresponding to the user search intention is interactively calculated with the multi-modal fusion feature vector: the correlation between the semantic vector representation of the user search intention and each dimension of the multi-modal fusion feature vector is calculated to obtain the attention weight matrix. The attention weight matrix is then used to adjust the weights of the different dimensions of the multi-modal fusion feature vector, highlighting the feature dimensions relevant to the user search intention; the weighted fusion yields the multi-modal weighted fusion feature vector, which incorporates the relevance information of the user search intention.
Calculating a correlation score between the multimodal weighted fusion feature vector and the semantic vector representation of the user search intention by adopting a similarity measurement method; the preliminary search results are ranked from high to low according to the relevance score, the characteristics of structured and unstructured data and the relevance to the search intention of the user are considered by the ranked search results, the search requirement of the user can be better met, and the ranked search results can be used as final ordered search results to be presented to the user or the search results can be further personalized.
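The attention re-weighting and similarity ranking can be illustrated with a minimal sketch. It assumes a per-dimension attention score given by the product of the intent vector and the fused feature vector, a softmax over dimensions, and cosine similarity as the similarity measurement; none of these choices is fixed by the patent, and the result identifiers and vectors are invented.

```python
import math

def softmax(scores):
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention_reweight(fused, intent):
    """Scale each dimension of the fused vector by an attention weight
    derived from its interaction with the intent vector."""
    weights = softmax([f * q for f, q in zip(fused, intent)])
    return [w * f for w, f in zip(weights, fused)]

def cosine(a, b):
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def rank_results(results, intent):
    """Order result ids by relevance score, highest first."""
    scored = [
        (rid, cosine(attention_reweight(vec, intent), intent))
        for rid, vec in results.items()
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)

intent = [1.0, 0.0, 0.5]
results = {
    "r1": [0.9, 0.1, 0.4],
    "r2": [0.1, 0.9, 0.0],
    "r3": [0.5, 0.5, 0.5],
}
ordered = rank_results(results, intent)
```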
In the embodiment, extracting and fusing features from both structured and unstructured data means the search results do not depend on a single modality and also take the user search intention into account; processing structured and unstructured data together lets the search results cover a wider range of content types, including text, images and video, to meet diverse information needs; adjusting feature weights according to the user search intention highlights strongly relevant feature dimensions and optimizes the ordering of the search results, so that users find the required information faster; because the attention mechanism adjusts the feature weights dynamically, the search results can reflect changes in user needs in a timely manner; the ordering considers both the features of the preliminary search results and their relevance to the user search intention, and can therefore adapt better to personalized needs; generating multi-modal weighted fusion feature vectors and applying a similarity measurement method makes the ordering more accurate and personalized and improves user satisfaction; and the final ordered search results both satisfy general search needs and can be used further for personalized recommendation, improving the practicability and user stickiness of the search system.
S103, inputting the ordered search results into a personalized recommendation system, generating final recommendation results, performing presentation adaptation on the final recommendation results based on terminal equipment and user portraits, determining distribution content, and pushing the distribution content to the terminal equipment.
The terminal device specifically refers to a hardware device used by a user to access, browse and interact with the recommendation system. Terminal devices include, but are not limited to, smartphones, tablet computers, personal computers, smart watches and smart televisions, each having a different screen size, resolution, operating system and mode of interaction.
The user portrait specifically refers to a comprehensive description of the characteristics, behaviors and preferences of the user, and comprises demographic information such as age, gender and occupation, interests and hobbies, historical behavior data such as browsing, click and purchase records, and the user's real-time feedback and interaction data. The user portrait helps the recommendation system better understand and predict the needs and preferences of the user and provide personalized recommendations.
The distribution specifically refers to a process of pushing the recommendation result to the user. It includes preparation, transmission and presentation of content on a terminal device. The distribution process takes network conditions, device performance, user preferences, behaviors, and other factors into account, and ensures that recommended content can be efficiently and accurately delivered to users and presented in a suitable form on the user terminals.
In the embodiment, the ordered search results are input into the personalized recommendation system, and final recommendation results which more accord with the interests and the demands of the user are generated by combining the user portraits, so that the personalized degree and the accuracy of recommendation are improved; according to the characteristics of different terminal equipment, the final recommendation result is presented and adapted, so that the content can be displayed in an optimal form on various kinds of equipment, and the user experience is improved; the system can dynamically adjust recommended content and distribution strategies according to real-time feedback and user portraits, improves flexibility and intelligence of the system, and can timely respond to changes of user behaviors.
In an alternative embodiment, the inputting the ordered search result into the personalized recommendation model, generating a final recommendation result, performing presentation adaptation on the final recommendation result based on the terminal device and the user portrait, determining the distribution content, and pushing to the terminal device includes:
Based on a pre-acquired user portrait, obtaining a recommendation candidate set from the ordered search results through a content recommendation algorithm, obtaining by inference potential information corresponding to potential interests of the user, and adding the potential information to the recommendation candidate set to synthesize a recommendation result;
Optimizing the recommendation result in real time through a reinforcement learning model according to the real-time feedback and behavior change of the user, dynamically adjusting the display and sequencing of the recommendation result by dynamically adjusting priority sequencing parameters and combining the explicit feedback and the implicit feedback of the user, and generating a final recommendation result;
Based on terminal equipment information and the user portrait, performing presentation adaptation on the final recommendation result, determining distribution content, and pushing the distribution content to corresponding terminal equipment;
After the terminal equipment receives the final recommendation result, collecting user interaction data in real time to generate data feedback, wherein the data feedback is returned through data backflow and used to update the user portrait based on incremental update calculation and, at the same time, to iteratively update the personalized recommendation model.
A recommendation candidate set is generated through a content recommendation algorithm based on pre-acquired user portrait information, such as demographic characteristics, interests and historical behaviors; content-based recommendation analyzes the similarity and relevance between the user and the items to find items matching the user's interests. When the recommendation candidate set is generated, potential information corresponding to the user's potential interests is obtained through an inference algorithm, such as knowledge graph inference, and added to the recommendation candidate set, which expands the diversity and coverage of recommendation and forms a more comprehensive recommendation result.
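Candidate generation with inferred interests can be sketched as below. The tag-overlap matching, the one-hop lookup in a `related_topics` mapping standing in for knowledge graph inference, and all data are assumptions for illustration only.

```python
def build_candidates(ranked_results, item_tags, user_interests, related_topics):
    """Return items matching known interests, followed by items that
    match interests inferred from related topics."""
    # Infer latent interests: topics one hop away from a known interest.
    inferred = set()
    for topic in user_interests:
        inferred |= related_topics.get(topic, set())
    inferred -= user_interests

    direct = [i for i in ranked_results if item_tags[i] & user_interests]
    latent = [
        i for i in ranked_results
        if i not in direct and item_tags[i] & inferred
    ]
    return direct + latent

ranked_results = ["r1", "r2", "r3", "r4"]
item_tags = {
    "r1": {"hiking"},
    "r2": {"cooking"},
    "r3": {"camping"},
    "r4": {"finance"},
}
user_interests = {"hiking"}
related_topics = {"hiking": {"camping", "climbing"}}
candidates = build_candidates(ranked_results, item_tags, user_interests, related_topics)
```

Here `r3` enters the candidate set only because camping is related to the user's known interest in hiking, which is the diversity effect the paragraph describes.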
According to the user's real-time feedback and behavior changes, the recommendation result is optimized in real time through a reinforcement learning model, which continuously learns and adjusts the recommendation strategy through interaction with the user so as to maximize user satisfaction and long-term benefit. Priority ranking parameters, such as the user's click rate, dwell time and conversion rate on recommended items, are adjusted dynamically, and the display and ordering of the recommendation result are adjusted in real time by combining the user's explicit feedback, such as ratings and likes, with implicit feedback, such as browsing time and click sequence. Through continuous interaction and optimization, a final recommendation result matching the user's real-time interests and preferences is generated.
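As a much-simplified stand-in for the reinforcement learning model, the sketch below uses an epsilon-greedy multi-armed bandit, which is a different and simpler technique than a full reinforcement learning policy. The class name, the reward convention (1.0 for a click, 0.0 for no click) and the item identifiers are hypothetical.

```python
import random

class EpsilonGreedyRanker:
    """Keeps a running mean reward per item and ranks items by it,
    occasionally shuffling to explore."""

    def __init__(self, items, epsilon=0.1, seed=0):
        self.values = {item: 0.0 for item in items}
        self.counts = {item: 0 for item in items}
        self.epsilon = epsilon
        self.rng = random.Random(seed)

    def rank(self):
        items = list(self.values)
        if self.rng.random() < self.epsilon:
            self.rng.shuffle(items)  # explore: random order
            return items
        # exploit: highest estimated reward first
        return sorted(items, key=lambda item: self.values[item], reverse=True)

    def update(self, item, reward):
        """Incrementally update the mean reward observed for an item."""
        self.counts[item] += 1
        self.values[item] += (reward - self.values[item]) / self.counts[item]

ranker = EpsilonGreedyRanker(["a", "b", "c"], epsilon=0.0)
ranker.update("b", 1.0)   # user clicked item "b"
ranker.update("a", 0.0)   # user ignored item "a"
ordering = ranker.rank()
```

Explicit feedback such as ratings could be mapped to larger rewards than implicit signals such as dwell time; that mapping is a design choice outside this sketch.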
Based on terminal device information, such as device type, screen size and network status, and on the user portrait, presentation adaptation is performed on the final recommendation result: the display form, layout and interaction mode of the recommended content are determined according to the characteristics of the terminal device and the preferences of the user, and the recommended content is rendered and typeset accordingly to provide a good user experience. The adapted content is then pushed to the corresponding terminal, such as a mobile app, a web page or smart hardware; during pushing, network conditions and device performance are taken into account, and a suitable transmission protocol and compression algorithm are used to ensure efficient transmission and display of the recommended content.
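Presentation adaptation can be pictured as a small rule table keyed on device information and user preference, as in the following sketch. The device categories, the 768-pixel breakpoint, the item limits and the field names are illustrative assumptions and are not taken from the patent.

```python
from dataclasses import dataclass

@dataclass
class DeviceInfo:
    kind: str           # e.g. "smartwatch", "phone", "desktop"
    screen_width: int   # logical pixels
    network: str        # e.g. "wifi", "cellular"

def adapt_presentation(recommendations, device, prefers_text=False):
    """Choose a layout and item count for the device, and decide
    whether images should be sent at all."""
    if device.kind == "smartwatch":
        layout, max_items = "list", 3
    elif device.screen_width < 768:
        layout, max_items = "single_column", 10
    else:
        layout, max_items = "grid", 20
    show_images = device.network == "wifi" and not prefers_text
    return {
        "layout": layout,
        "show_images": show_images,
        "items": recommendations[:max_items],
    }

recs = ["item1", "item2", "item3", "item4", "item5"]
watch_view = adapt_presentation(recs, DeviceInfo("smartwatch", 200, "wifi"))
phone_view = adapt_presentation(recs, DeviceInfo("phone", 390, "cellular"))
```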
After the terminal device receives the final recommendation result, the user's interaction data, such as clicks, views, favorites and comments, is collected in real time, packaged as data feedback and transmitted to the recommendation system through the data backflow mechanism. Based on incremental update calculation, the recommendation system merges the new feedback data with the historical data and updates the user portrait through an incremental learning algorithm to capture changes in user interests and preferences. At the same time, the new feedback data is used to iteratively update and optimize the personalized recommendation model, so that the updated user portrait and model reflect the user's real-time interests and needs more accurately and provide a more accurate and satisfactory recommendation service.
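An incremental update of the user portrait can be sketched with an exponential moving average over per-category interest scores. This is one simple incremental scheme chosen for illustration; the patent does not name the incremental learning algorithm, and the category names, signal values and learning rate are hypothetical.

```python
def update_portrait(portrait, interactions, learning_rate=0.2):
    """Blend new interaction signals into existing interest scores
    without recomputing from the full history."""
    updated = dict(portrait)
    for category, signal in interactions:
        old = updated.get(category, 0.0)
        updated[category] = (1.0 - learning_rate) * old + learning_rate * signal
    return updated

portrait = {"sports": 0.5}
# Each pair is (content category, interaction strength in [0, 1]).
feedback = [("sports", 1.0), ("music", 1.0)]
new_portrait = update_portrait(portrait, feedback)
```

Older behavior decays geometrically under this scheme, so recent interactions dominate the portrait while history is still retained.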
In the embodiment, collecting user interaction data on the terminal equipment and updating the user portrait and the personalized recommendation model in real time allows the recommendation results to be iterated and optimized continuously, realizing a real-time personalized recommendation service; the reinforcement learning model keeps adjusting the recommendation strategy through interactive learning with the user to maximize user satisfaction and long-term benefit, so that the recommendation results better match the user's real-time interests and needs; combining the content recommendation algorithm with the inference algorithm takes the user's historical behaviors and interests into account when the recommendation candidate set is generated and obtains, by inference, potential information corresponding to the user's potential interests, which expands the diversity and coverage of recommendation and provides a more comprehensive recommendation result; dynamically adjusting the display and ordering of the recommendation results according to the user's explicit and implicit feedback makes the results better meet user needs and preferences and improves user satisfaction; and the user's interaction data is transmitted to the recommendation system through the data backflow mechanism, where new feedback data is merged with historical data based on incremental update calculation and an incremental learning algorithm is used to update the user portrait and iteratively optimize the personalized recommendation model, so that both reflect the user's real-time interests and needs more accurately and a more accurate and satisfactory recommendation service is provided.
In an alternative embodiment, obtaining a recommendation candidate set from the ordered search results through a content recommendation algorithm based on the pre-acquired user portrait, obtaining by inference potential information corresponding to potential interests of the user, and adding the potential information to the recommendation candidate set to synthesize a recommendation result comprises:
in the content recommendation algorithm, user content preference is calculated, and the formula is as follows:
Where $u$ denotes the user and $c$ the content; $p_{u,c}$ represents the preference score of user $u$ for content $c$; $f$ represents a feature of the content and $F$ the set of all features; $w_{u,f}$ represents the preference weight of user $u$ for feature $f$; $v_{c,f}$ represents the value of content $c$ on feature $f$; $x_f$ represents the importance weight of feature $f$; $y_{u,c,f}$ represents the interaction strength of user $u$ and content $c$ on feature $f$; $\alpha$ represents the strength parameter controlling similarity; $\mathrm{sim}_{u,c}$ represents the similarity of user $u$ and content $c$; $\beta$ represents the strength parameter controlling popularity; and $\mathrm{pop}_c$ represents the popularity of content $c$.
Calculating the preference weight of the user on the content characteristics and multiplying the value of the content on the characteristics, and simultaneously considering the importance weight of the characteristics and the interaction strength of the user and the content on the characteristics; normalizing the weighted scores of all the features, and respectively calculating the square sum of the weighted scores of the features and the square root of the weighted score sum of the content on the features by the user for normalizing the feature scores; and on the basis of the normalized score, correcting the similarity between the user and the content and the popularity of the content by adjusting parameters, and finally obtaining the preference score of the user on the content.
According to the formula, the interests and the demands of the user can be reflected more accurately by comprehensively considering the preference weights of the user on the characteristics, so that the individuation degree of the recommended content is improved; the interest of the user to the specific content is effectively measured by using the feature importance weight and the interaction strength of the user and the content, so that the recommendation result is more in line with the real preference of the user; by combining the similarity between the user and the content and the popularity adjustment parameter of the content, the recommendation system not only recommends the content similar to the user, but also recommends some popular or high-quality content with low similarity to the user, and the diversity of recommendation results is increased; the algorithm can dynamically capture and adapt to the change of the user interests according to the interaction strength of the user and the content on the characteristics, so that the recommendation result can reflect the latest interests and requirements of the user in real time; and the factors in multiple aspects are comprehensively considered, and the generated recommended content is more fit with the personalized requirements of the user, so that the satisfaction degree and the use experience of the user on the recommended result are improved.
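The formula itself does not appear in this text, so the sketch below is only one possible way to combine the described terms and should not be read as the claimed formula. It assumes the per-feature products are summed, normalized by their Euclidean norm, and then corrected additively by the similarity and popularity terms; the function name, the default parameter values and the sample data are hypothetical.

```python
import math

def preference_score(w, v, x, y, sim, pop, alpha=0.5, beta=0.1):
    """Assumed form: normalized sum over features of w*v*x*y,
    plus alpha * similarity and beta * popularity.

    w: user preference weight per feature
    v: content value per feature
    x: importance weight per feature
    y: user-content interaction strength per feature
    """
    weighted = [w[f] * v[f] * x[f] * y[f] for f in w]
    norm = math.sqrt(sum(s * s for s in weighted))
    base = sum(weighted) / norm if norm > 0 else 0.0
    return base + alpha * sim + beta * pop

features = ["genre", "length"]
w = {"genre": 1.0, "length": 0.0}
v = {"genre": 1.0, "length": 1.0}
x = {"genre": 1.0, "length": 1.0}
y = {"genre": 1.0, "length": 1.0}
score = preference_score(w, v, x, y, sim=0.8, pop=0.5)
```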
FIG. 2 is a schematic structural diagram of an artificial intelligence data searching and distributing system according to an embodiment of the present invention. As shown in FIG. 2, the system includes:
a first unit, configured to acquire a search request input by a user, perform semantic expansion on the search request based on a pre-built comprehensive knowledge graph to obtain target expanded search keywords, and perform a semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-built multimodal semantic index to obtain preliminary search results;
a second unit, configured to perform feature extraction and fusion on the preliminary search results through a multimodal fusion neural network to obtain a multimodal fusion feature vector, adjust the weights of the multimodal fusion feature vector using an attention mechanism, calculate, from the adjusted multimodal fusion feature vector, a relevance score between each item of data in the preliminary search results and the user's search intent, and sort the preliminary search results by relevance score to obtain ordered search results;
and a third unit, configured to input the ordered search results into a personalized recommendation system to generate a final recommendation result, perform presentation adaptation on the final recommendation result based on the terminal device and the user portrait, determine distribution content, and push the distribution content to the terminal device.
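The re-ranking performed by the second unit can be sketched as follows. This is a simplified illustration under stated assumptions, not the patented network: the fused feature vectors are taken as given, the "attention mechanism" is reduced to a softmax over the element-wise product of the query vector and each fused vector, and cosine similarity stands in for the relevance score. The function names and the dictionary input are hypothetical.

```python
import math
from typing import Dict, List, Sequence, Tuple


def softmax(xs: Sequence[float]) -> List[float]:
    """Numerically stable softmax."""
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0


def rank_results(
    query_vec: Sequence[float],
    fused: Dict[str, Sequence[float]],
) -> List[Tuple[str, float]]:
    """Re-weight each fused vector by query-driven attention, then sort by relevance."""
    scored = []
    for doc_id, vec in fused.items():
        # Per-dimension attention weights from the query/document interaction.
        weights = softmax([q * v for q, v in zip(query_vec, vec)])
        weighted = [w * v for w, v in zip(weights, vec)]
        scored.append((doc_id, cosine(query_vec, weighted)))
    # Highest relevance score first.
    return sorted(scored, key=lambda pair: pair[1], reverse=True)
```

A learned attention layer would replace the fixed softmax interaction used here; the sort by descending score is the part that corresponds directly to the ordering step in the text.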
In a third aspect of an embodiment of the present invention, there is provided an electronic device including:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method described above.
In a fourth aspect of an embodiment of the present invention, there is provided a computer-readable storage medium having stored thereon computer program instructions which, when executed by a processor, implement the method described above.
The present invention may be a method, apparatus, system, and/or computer program product. The computer program product may include a computer readable storage medium having computer readable program instructions embodied thereon for performing various aspects of the present invention.
Finally, it should be noted that the above embodiments only illustrate the technical solution of the present invention and do not limit it. Although the invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art will understand that the technical solutions described in the foregoing embodiments can still be modified, or some or all of their technical features can be replaced by equivalents, and that such modifications and substitutions do not depart from the spirit of the invention.

Claims (8)

1. An artificial intelligence data search and distribution method, characterized by comprising:
acquiring a search request input by a user, performing semantic expansion on the search request based on a pre-built comprehensive knowledge graph to obtain target expanded search keywords, and performing a semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-built multimodal semantic index to obtain preliminary search results;
performing feature extraction and fusion on the preliminary search results through a multimodal fusion neural network to obtain a multimodal fusion feature vector, adjusting the weights of the multimodal fusion feature vector using an attention mechanism, calculating, from the adjusted multimodal fusion feature vector, a relevance score between each item of data in the preliminary search results and the user's search intent, and sorting the preliminary search results by relevance score to obtain ordered search results;
inputting the ordered search results into a personalized recommendation system to generate a final recommendation result, performing presentation adaptation on the final recommendation result based on the terminal device and a user portrait, determining distribution content, and pushing the distribution content to the terminal device;
wherein acquiring the search request input by the user, performing semantic expansion on the search request based on the pre-built comprehensive knowledge graph to obtain the target expanded search keywords, and performing the semantic search on structured data and unstructured data according to the target expanded search keywords and the pre-built multimodal semantic index to obtain the preliminary search results comprises:
acquiring the search request input by the user, preprocessing the search request to obtain a search request text, and converting the search request text into a semantic vector representation of the search request;
semantically expanding the semantic vector representation using the comprehensive knowledge graph to obtain the target expanded search keywords;
searching the pre-built multimodal semantic index according to the target expanded search keywords, the multimodal semantic index comprising a structured data index and an unstructured data index;
for the structured data, using a graph-based query language to match the target expanded search keywords against the entities and relationships in the multimodal semantic index to obtain a first search result;
for the unstructured data, using a pre-trained multimodal representation learning model to map the target expanded search keywords into a multimodal semantic space, and obtaining a second search result through vector similarity calculation;
semantically fusing the first search result and the second search result, and sorting by semantic relevance and importance to obtain the preliminary search results.

2. The method according to claim 1, characterized in that semantically expanding the semantic vector representation using the comprehensive knowledge graph to obtain the target expanded search keywords comprises:
vectorizing the entities and relationships in the comprehensive knowledge graph using a knowledge graph embedding model to obtain entity-relationship vector representations, calculating the similarity between the semantic vector representation and the entity-relationship vector representations, and selecting the entity with the highest similarity as a candidate expansion keyword;
taking the candidate expansion keyword as the center, performing context sampling in the comprehensive knowledge graph using a random walk algorithm, and calculating a node centrality measure and a node importance score by determining the node degree, centrality, and clustering coefficient of the comprehensive knowledge graph, to generate an expansion keyword sequence;
filtering the expansion keyword sequence based on the node centrality measure and the node importance score to obtain the target expanded search keywords.

3. The method according to claim 1, characterized in that performing feature extraction and fusion on the preliminary search results through the multimodal fusion neural network to obtain the multimodal fusion feature vector, adjusting the weights of the multimodal fusion feature vector using the attention mechanism, calculating, from the adjusted multimodal fusion feature vector, the relevance score between each item of data in the preliminary search results and the user's search intent, and sorting the preliminary search results by relevance score to obtain the ordered search results comprises:
for the structured data in the preliminary search results, extracting the key attributes and the corresponding attribute values to determine explicit features; for the unstructured data in the preliminary search results, extracting semantic features using a pre-trained deep learning model to determine deep features;
inputting the explicit features and the deep features into the multimodal fusion neural network, and performing feature fusion through multi-layer nonlinear transformations and interaction operations to obtain the multimodal fusion feature vector;
based on the attention mechanism, performing an interaction calculation between the semantic vector representation corresponding to the user's search intent and the multimodal fusion feature vector to obtain an attention weight matrix, and adjusting the weight distribution over the different dimensions of the multimodal fusion feature vector through the attention weight matrix to determine a multimodal weighted fusion feature vector;
calculating the relevance score between the multimodal weighted fusion feature vector and the semantic vector representation of the user's search intent using a similarity measure, and sorting the preliminary search results by relevance score from high to low to determine the ordered search results.

4. The method according to claim 1, characterized in that inputting the ordered search results into a personalized recommendation model to generate the final recommendation result, performing presentation adaptation on the final recommendation result based on the terminal device and the user portrait, determining the distribution content, and pushing the distribution content to the terminal device comprises:
based on a pre-acquired user portrait, obtaining a recommendation candidate set from the ordered search results through a content recommendation algorithm, inferring potential information corresponding to the user's potential interests, adding the potential information to the recommendation candidate set, and synthesizing a recommendation result;
optimizing the recommendation result in real time through a reinforcement learning model according to the user's real-time feedback and behavior changes, and dynamically adjusting the display and ordering of the recommendation result by dynamically adjusting priority ranking parameters in combination with the user's explicit feedback and implicit feedback, to generate the final recommendation result;
performing presentation adaptation on the final recommendation result based on terminal device information and the user portrait, determining the distribution content, and pushing the distribution content to the corresponding terminal device;
after the terminal device receives the final recommendation result, collecting user interaction data in real time to generate data feedback, the data feedback being returned through data backflow and used, by incremental update computation, to update the user portrait and to iteratively update the personalized recommendation model.

5. The method according to claim 4, characterized in that, based on the pre-acquired user portrait, obtaining the recommendation candidate set from the ordered search results through the content recommendation algorithm, inferring the potential information corresponding to the user's potential interests, adding the potential information to the recommendation candidate set, and synthesizing the recommendation result comprises:
in the content recommendation algorithm, calculating the user content preference according to a formula [formula image not reproduced in this text];
wherein $u$ denotes the user, $c$ denotes the content, $p_{u,c}$ denotes the preference score of user $u$ for content $c$, $f$ denotes a feature of the content, $F$ denotes the set of all features, $w_{u,f}$ denotes the preference weight of user $u$ for feature $f$, $v_{c,f}$ denotes the value of content $c$ on feature $f$, $x_f$ denotes the importance weight of feature $f$, $y_{u,c,f}$ denotes the interaction strength between user $u$ and content $c$ on feature $f$, $\alpha$ denotes the strength parameter controlling similarity, $sim_{u,c}$ denotes the similarity between user $u$ and content $c$, $\beta$ denotes the strength parameter controlling popularity, and $pop_c$ denotes the popularity of content $c$.

6. An artificial intelligence data search and distribution system for implementing the method according to any one of claims 1 to 5, characterized by comprising:
a first unit, configured to acquire a search request input by a user, perform semantic expansion on the search request based on a pre-built comprehensive knowledge graph to obtain target expanded search keywords, and perform a semantic search on structured data and unstructured data according to the target expanded search keywords and a pre-built multimodal semantic index to obtain preliminary search results;
a second unit, configured to perform feature extraction and fusion on the preliminary search results through a multimodal fusion neural network to obtain a multimodal fusion feature vector, adjust the weights of the multimodal fusion feature vector using an attention mechanism, calculate, from the adjusted multimodal fusion feature vector, a relevance score between each item of data in the preliminary search results and the user's search intent, and sort the preliminary search results by relevance score to obtain ordered search results;
a third unit, configured to input the ordered search results into a personalized recommendation system to generate a final recommendation result, perform presentation adaptation on the final recommendation result based on the terminal device and the user portrait, determine distribution content, and push the distribution content to the terminal device.

7. An electronic device, characterized by comprising:
a processor;
a memory for storing processor-executable instructions;
wherein the processor is configured to invoke the instructions stored in the memory to perform the method according to any one of claims 1 to 5.

8. A computer-readable storage medium having computer program instructions stored thereon, characterized in that the computer program instructions, when executed by a processor, implement the method according to any one of claims 1 to 5.
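The random-walk context sampling and filtering of claim 2 can be illustrated with a minimal sketch. Its assumptions go beyond the claim text: the knowledge graph is a plain adjacency dictionary, the candidate expansion keyword (`seed`) has already been selected, and node importance is approximated by visit count scaled by degree centrality, with no clustering coefficient. The function name, parameters, and scoring rule are hypothetical.

```python
import random
from collections import Counter
from typing import Dict, List


def expand_keywords(
    graph: Dict[str, List[str]],
    seed: str,
    num_walks: int = 200,
    walk_len: int = 3,
    top_k: int = 2,
    rng_seed: int = 0,
) -> List[str]:
    """Sample context nodes around `seed` by random walks and keep the top_k."""
    rng = random.Random(rng_seed)  # fixed seed for reproducible sampling
    visits: Counter = Counter()
    for _ in range(num_walks):
        node = seed
        for _ in range(walk_len):
            neighbours = graph.get(node, [])
            if not neighbours:
                break  # dead end: stop this walk
            node = rng.choice(neighbours)
            if node != seed:
                visits[node] += 1

    n = max(len(graph) - 1, 1)

    def score(node: str) -> float:
        # Assumed importance: visit frequency boosted by degree centrality.
        degree_centrality = len(graph.get(node, [])) / n
        return visits[node] * (1.0 + degree_centrality)

    # Sort by descending score; ties are broken alphabetically for determinism.
    ranked = sorted(visits, key=lambda nd: (-score(nd), nd))
    return ranked[:top_k]
```

Nodes that are unreachable from the seed are never visited and so cannot appear in the expanded keyword list, which is the filtering effect the walk is meant to provide.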
CN202410939629.6A 2024-07-15 2024-07-15 Artificial intelligent data searching and distributing method and system Active CN118467851B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202410939629.6A CN118467851B (en) 2024-07-15 2024-07-15 Artificial intelligent data searching and distributing method and system

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202410939629.6A CN118467851B (en) 2024-07-15 2024-07-15 Artificial intelligent data searching and distributing method and system

Publications (2)

Publication Number Publication Date
CN118467851A CN118467851A (en) 2024-08-09
CN118467851B true CN118467851B (en) 2024-10-25

Family

ID=92161854

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202410939629.6A Active CN118467851B (en) 2024-07-15 2024-07-15 Artificial intelligent data searching and distributing method and system

Country Status (1)

Country Link
CN (1) CN118467851B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN118626672B (en) * 2024-08-12 2024-11-05 山东浪潮科学研究院有限公司 Video retrieval method and system based on multi-mode information fusion
CN118690039B (en) * 2024-08-28 2025-01-21 济南泉方科技有限公司 A graphical display method for search engine retrieval results
CN119166842B (en) * 2024-09-03 2025-04-11 湖北锦绣人才科技集团有限公司 A file retrieval method and system based on AI images and key talent information
CN118733844B (en) * 2024-09-04 2025-02-25 浪潮通用软件有限公司 A general data fuzzy search and matching method based on FuzzyWuzzy algorithm
CN119128177B (en) * 2024-09-06 2025-03-25 上海迪塔班克数据科技有限公司 Method and system for recommending chemical and plastic products based on user needs
CN119127970B (en) * 2024-09-09 2025-05-16 坛墨质检科技股份有限公司 Standard substance standard substance retrieval and ordering method and system based on search engine
CN119128045B (en) * 2024-09-10 2025-08-08 山东财经大学 Optimization method for index retrieval of digital library
CN119089024A (en) * 2024-09-11 2024-12-06 江苏鑫源融信软件科技有限公司 A general intelligent search system based on objects
CN119415767B (en) * 2024-09-30 2025-10-28 广东歌捷信息科技有限公司 Intelligent screening method and system based on target search data
CN119046447B (en) * 2024-11-04 2025-07-01 杭州齐圣科技有限公司 LLM problem optimization method, medium and system combining enterprise portrait
CN119089398B (en) * 2024-11-07 2025-06-27 杭州惠民征信有限公司 Content recommendation method and system based on semantic recognition
CN119271722A (en) * 2024-12-12 2025-01-07 苏州元脑智能科技有限公司 Configuration option search method, device and apparatus, medium and computer program product
CN119415680B (en) * 2025-01-07 2025-04-11 上海卓辰信息科技有限公司 Data query result ordering method and system integrating large model and incremental learning
CN120317960A (en) * 2025-04-18 2025-07-15 湖北黄商集团股份有限公司 A rapid commodity search auxiliary system based on artificial intelligence
CN120086414B (en) * 2025-05-07 2025-06-27 杭州云嘉科技有限公司 File retrieval method and system based on cloud computing
CN120104859A (en) * 2025-05-07 2025-06-06 大连数晨科技有限公司 A search method based on multi-source information fusion
CN120635426B (en) * 2025-08-08 2025-10-21 中科边缘智慧信息科技(苏州)有限公司 Unmanned aerial vehicle based on multi-mode prompt and high-low altitude collaborative target searching method and system

Citations (1)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN117851444A (en) * 2024-03-07 2024-04-09 北京谷器数据科技有限公司 An advanced search method based on semantic understanding

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US8868535B1 (en) * 2000-02-24 2014-10-21 Richard Paiz Search engine optimizer
CN107169010A (en) * 2017-03-31 2017-09-15 北京奇艺世纪科技有限公司 A kind of determination method and device of recommendation search keyword
CN118051653B (en) * 2024-04-16 2024-07-05 广州云趣信息科技有限公司 Multi-mode data retrieval method, system and medium based on semantic association

Also Published As

Publication number Publication date
CN118467851A (en) 2024-08-09

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PE01 Entry into force of the registration of the contract for pledge of patent right

Denomination of invention: Artificial Intelligence Data Search and Distribution Method and System

Granted publication date: 20241025

Pledgee: Industrial Commercial Bank of China Ltd. Beijing Jiulong Mountain branch

Pledgor: Beijing honeycomb Technology Co.,Ltd.

Registration number: Y2025980042785