CN112015911B - A Method for Retrieval of Massive Knowledge Graph - Google Patents

Info

Publication number
CN112015911B
Authority
CN
China
Prior art keywords
knowledge graph
visual data
data matrix
knowledge
graph
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202010857339.9A
Other languages
Chinese (zh)
Other versions
CN112015911A (en
Inventor
樊星
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Original Assignee
Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Shanghai Squirrel Classroom Artificial Intelligence Technology Co Ltd
Priority to CN202010857339.9A
Publication of CN112015911A
Application granted
Publication of CN112015911B
Legal status: Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/36 Creation of semantic tools, e.g. ontology or thesauri
    • G06F16/367 Ontology
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/3331 Query processing
    • G06F16/334 Query execution
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/34 Browsing; Visualisation therefor

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Animal Behavior & Ethology (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention discloses a method for retrieving massive knowledge graphs, which addresses the problem that conventional knowledge graph retrieval methods cannot adequately return to the user the knowledge graphs related to the one the user searched for. The method comprises the following steps: constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set; calculating, according to a preset association degree algorithm, the association degree between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; and storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph. By calculating the association degree and the similarity between knowledge graphs, the method can accurately determine how the knowledge graphs are related, so that knowledge graphs strongly related to the one the user searched for can be pushed to the user, improving the user experience.

Description

Method for retrieving massive knowledge graphs
Technical Field
The invention relates to the technical field of knowledge graphs, and in particular to a method for retrieving massive knowledge graphs.
Background
A knowledge graph (Knowledge Graph), known in library and information science as knowledge domain visualization or knowledge domain mapping, is a series of different graphs that display the development process and the structural relationships of knowledge. With the rapid development of the theories and methods of applied disciplines such as mathematics, graphics, information visualization technology and information science, the number of knowledge graphs has grown explosively, the associations among knowledge graphs have grown ever stronger, and each knowledge graph is no longer an isolated island of information. How to let a user who searches for a certain knowledge graph also browse the related knowledge graphs, so that the user retrieves more useful information and has a better experience, has therefore become a focus of recent research. However, the matching accuracy of current methods for retrieving related knowledge graphs is low, and the user experience is poor.
Disclosure of Invention
The invention provides a method for retrieving massive knowledge graphs, which addresses the low matching accuracy and poor user experience of conventional knowledge graph retrieval methods. By calculating the association degree and the similarity between knowledge graphs, the method can accurately determine how the knowledge graphs are related, so that knowledge graphs strongly related to the one the user searched for can be pushed to the user, improving the user experience.
The invention provides a method for retrieving massive knowledge graphs, which comprises the following steps:
constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set;
calculating, according to a preset association degree algorithm, the association degree between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
receiving a retrieval request for a target knowledge graph;
and retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, after storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph, and before receiving a retrieval request for a target knowledge graph, the method further comprises:
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of each third knowledge graph and the visual data matrix of the first knowledge graph; a third knowledge graph is a knowledge graph associated with the first knowledge graph;
screening out the third knowledge graphs whose visual data matrices have a similarity with the visual data matrix of the first knowledge graph that reaches a preset similarity threshold, and obtaining and storing a recommended knowledge graph set corresponding to the first knowledge graph;
the retrieving of the target knowledge graph and the knowledge graphs associated with the target knowledge graph and the providing of the retrieval result to the user comprise:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, after storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph, and before receiving a retrieval request for a target knowledge graph, the method further comprises:
sorting the knowledge graphs associated with the first knowledge graph from high to low by the association degree between their visual data matrices and the visual data matrix of the first knowledge graph, to obtain a sorting result;
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph, wherein N is a positive integer with an initial value of 1;
judging whether the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold;
if the similarity reaches the preset similarity threshold, putting the Nth knowledge graph in the sorting result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter; wherein the initial count value of the counter is 0;
judging whether the count value of the counter is equal to a preset graph extension number;
if the count value of the counter is not equal to the preset graph extension number, judging whether N is equal to M; M is the number of knowledge graphs in the sorting result;
if N is not equal to M, setting N to N + 1 and returning to the step of calculating, according to the preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph;
if the count value of the counter is equal to the preset graph extension number, or N is equal to M, storing the current recommended knowledge graph set corresponding to the first knowledge graph;
if the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph does not reach the preset similarity threshold, executing the step of judging whether N is equal to M;
the retrieving of the target knowledge graph and the knowledge graphs associated with the target knowledge graph and the providing of the retrieval result to the user comprise:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In one embodiment, the preset association algorithm formula is as follows:
Figure BDA0002646882650000041
wherein X is the association degree between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the ratio of a circle's circumference to its diameter; i and j respectively denote the row number and the column number of the visual data matrices of the first and second knowledge graphs; a_ij is the data information of the visual data in row i, column j of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in row i, column j of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; e is a natural constant; α is the weak association weight of the two visual data matrices; β is the strong association weight of the two visual data matrices; and λ is a preset learning rate. The values of L, e, α, β and λ are all preset.
In one embodiment, L takes a value in the interval [0.1, 0.3], e takes the value 2.58, and λ takes a value in the interval [0, 1].
In one embodiment, the preset similarity is calculated by the following formula:
Figure BDA0002646882650000042
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain data information of the same visual data; i and j respectively denote the row number and the column number of the visual data matrices of the first and third knowledge graphs; S_m is the confidence of the m-th data information in the visual data matrix of the first knowledge graph; Y_m is the confidence of the m-th data information in the visual data matrix of the third knowledge graph; G_m is the evaluation score of the m-th data information in the visual data matrix of the first knowledge graph; C_m is the evaluation score of the m-th data information in the visual data matrix of the third knowledge graph; and ω is the useless-data influence factor of the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph. The values of θ and ω are preset.
In one embodiment, θ is 15%, and ω takes a value in the interval [0.05, 0.1].
In one embodiment, before constructing the visual data matrix for each knowledge graph in the pre-acquired massive knowledge graph set, the method further comprises:
acquiring massive knowledge graphs;
preprocessing each acquired knowledge graph; the preprocessing comprises: deleting the duplicate data and the garbled data of each knowledge graph;
combining the preprocessed knowledge graphs into the massive knowledge graph set;
the constructing of the visual data matrix for each knowledge graph in the pre-acquired massive knowledge graph set comprises:
parsing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph;
taking the data information of the visual data of each knowledge graph as matrix elements, and constructing the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relations between the keywords.
In the method for retrieving massive knowledge graphs provided by the invention, the association between knowledge graphs is first determined by calculating the association degree between them. This ensures that when a user searches for a certain knowledge graph, strongly associated knowledge graphs can also be recommended to the user, which expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, after the association degree between knowledge graphs is calculated, the similarity between the strongly associated knowledge graphs is calculated, so that similarity is ensured in addition to association degree. The association of each knowledge graph can thus be determined more accurately, which improves the accuracy of the determined association relations and greatly improves the user experience.
Additional features and advantages of the invention will be set forth in the description which follows, and in part will be obvious from the description, or may be learned by practice of the invention. The objectives and other advantages of the invention will be realized and attained by the structure particularly pointed out in the written description and claims hereof as well as the appended drawings.
The technical solution of the present invention is further described in detail by the accompanying drawings and embodiments.
Drawings
The accompanying drawings, which are included to provide a further understanding of the invention and are incorporated in and constitute a part of this specification, illustrate embodiments of the invention and together with the description serve to explain the principles of the invention and not to limit the invention. In the drawings:
FIG. 1 is a flowchart of a first embodiment of the method for retrieving massive knowledge graphs according to an embodiment of the present invention;
FIG. 2 is a flowchart of the steps performed before step S101 in FIG. 1;
FIG. 3 is a flowchart of a second embodiment of the method for retrieving massive knowledge graphs according to an embodiment of the present invention;
FIG. 4 is a flowchart of a third embodiment of the method for retrieving massive knowledge graphs according to an embodiment of the present invention.
Detailed Description
The preferred embodiments of the present invention will be described in conjunction with the accompanying drawings, and it will be understood that they are described herein for the purpose of illustration and explanation and not limitation.
Fig. 1 is a flowchart of a first embodiment of the method for retrieving massive knowledge graphs according to an embodiment of the present invention. As shown in fig. 1, the method comprises the following steps S101-S105:
S101: constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set;
In this embodiment, a visual data matrix is constructed for each of the massive knowledge graphs, so that the association degree between the visual data matrices can be calculated with a preset association degree algorithm. The association degree between two matrices is taken as the association degree between the knowledge graphs corresponding to those matrices.
As an optional manner, as shown in fig. 2, steps S201 to S203 may further be included before step S101:
S201: acquiring massive knowledge graphs;
In this embodiment, massive knowledge graphs may be obtained from target websites, for example Baidu Encyclopedia.
S202: preprocessing each acquired knowledge graph; the preprocessing comprises: deleting the duplicate data and the garbled data of each knowledge graph;
In this embodiment, the duplicate data and the garbled data of each knowledge graph are deleted so that the calculated association degree is more accurate.
S203: combining the preprocessed knowledge graphs into a massive knowledge graph set.
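The preprocessing in steps S201 to S203 can be illustrated with a minimal Python sketch. It assumes, purely for illustration, that each knowledge graph is available as a list of (head, relation, tail) string triples; the text does not specify a storage format. The `is_garbled` heuristic (Unicode replacement character or control characters) is likewise an assumption, since the text does not define how garbled data is detected.

```python
import unicodedata


def is_garbled(text: str) -> bool:
    """Heuristic (an assumption, not from the source): text counts as garbled
    if it holds the Unicode replacement character or a control character."""
    if "\ufffd" in text:
        return True
    return any(
        unicodedata.category(ch) == "Cc" and ch not in "\t\n\r" for ch in text
    )


def preprocess(triples):
    """S202 sketch: drop duplicate and garbled triples, keeping first-seen order."""
    seen = set()
    cleaned = []
    for triple in triples:
        if triple in seen or any(is_garbled(part) for part in triple):
            continue
        seen.add(triple)
        cleaned.append(triple)
    return cleaned


# Hypothetical raw data for one knowledge graph.
raw = [
    ("force", "causes", "acceleration"),
    ("force", "causes", "acceleration"),  # duplicate entry
    ("ma\ufffds", "relates_to", "force"),  # garbled entry
    ("mass", "relates_to", "force"),
]
cleaned = preprocess(raw)

# S203 sketch: the cleaned graphs are combined into one set keyed by graph id.
graph_set = {"physics": cleaned}
```

Keeping the first occurrence of each triple preserves the original ordering, which makes the cleaning step easy to audit.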
Step S101 comprises: parsing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph; and taking the data information of the visual data of each knowledge graph as matrix elements to construct the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relations between the keywords.
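One possible reading of step S101 is sketched below in Python. The text states only that keywords and their connection relations become matrix elements; the concrete encoding used here (a symmetric 0/1 matrix indexed by a shared keyword vocabulary, so that matrices of different graphs have the same shape) is an assumption, and the names `build_visual_data_matrix` and `vocabulary` are hypothetical.

```python
def build_visual_data_matrix(vocabulary, connections):
    """Encode keyword connections as a square 0/1 matrix.

    vocabulary:  ordered list of keywords shared by all graphs (assumption).
    connections: iterable of (keyword_a, keyword_b) pairs found in one graph.
    """
    index = {keyword: position for position, keyword in enumerate(vocabulary)}
    size = len(vocabulary)
    matrix = [[0] * size for _ in range(size)]
    for keyword_a, keyword_b in connections:
        i, j = index[keyword_a], index[keyword_b]
        matrix[i][j] = 1
        matrix[j][i] = 1  # connections are treated as undirected here
    return matrix


vocabulary = ["force", "mass", "acceleration"]
connections = [("force", "mass"), ("force", "acceleration")]
matrix = build_visual_data_matrix(vocabulary, connections)
```

Using one vocabulary for every graph keeps row i, column j comparable across matrices, which is what an element-wise comparison of two matrices would require.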
S102: calculating, according to a preset association degree algorithm, the association degree between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
In this embodiment, the preset association degree algorithm formula is as follows:
Figure BDA0002646882650000071
wherein X is the association degree between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the ratio of a circle's circumference to its diameter; i and j respectively denote the row number and the column number of the visual data matrices of the first and second knowledge graphs; a_ij is the data information of the visual data in row i, column j of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in row i, column j of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; e is a natural constant; α is the weak association weight of the two visual data matrices; β is the strong association weight of the two visual data matrices; and λ is a preset learning rate. The values of L, e, α, β and λ are all preset: L takes a value in the range [0.1, 0.3], e is 2.58, and λ takes a value in the range [0, 1], where λ increases as the amount of data information of the visual data increases.
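The preset parameters listed above can be collected and range-checked before any association degree is computed. The following Python sketch covers only the stated ranges; it does not implement the association formula itself, which is given in the figure. The class name, the field names, and the example values for α and β are hypothetical, since the text gives no ranges for α and β.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssociationParams:
    """Preset parameters of the association degree algorithm (names assumed)."""

    L: float       # association constraint coefficient, stated range [0.1, 0.3]
    alpha: float   # weak association weight (no range stated in the text)
    beta: float    # strong association weight (no range stated in the text)
    lam: float     # preset learning rate, stated range [0, 1]
    e: float = 2.58  # value as stated in the text

    def __post_init__(self):
        if not 0.1 <= self.L <= 0.3:
            raise ValueError("L must lie in [0.1, 0.3]")
        if not 0.0 <= self.lam <= 1.0:
            raise ValueError("lambda must lie in [0, 1]")


# Example values for alpha and beta are invented for illustration only.
params = AssociationParams(L=0.2, alpha=0.3, beta=0.7, lam=0.5)
```

Freezing the dataclass keeps the presets immutable once validated, which matches their role as fixed configuration.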
S103: storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
In this embodiment, as an optional manner, step S103 comprises: judging whether the association degree between the visual data matrix of the first knowledge graph and the visual data matrix of each second knowledge graph is greater than or equal to a preset association degree threshold; and if the association degree between the visual data matrix of the first knowledge graph and the visual data matrix of a second knowledge graph is greater than or equal to the preset association degree threshold, storing that second knowledge graph as a knowledge graph associated with the first knowledge graph.
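The pairwise computation of S102 and the thresholded storage of S103 can be sketched in Python as follows. The association function is passed in as a parameter; `overlap_association` is a simple stand-in (shared non-zero cells divided by all non-zero cells) used only to make the sketch runnable, and it is not the formula of the invention. All names and the threshold value are hypothetical.

```python
from typing import Callable, Dict, List, Tuple

Matrix = List[List[float]]


def overlap_association(a: Matrix, b: Matrix) -> float:
    """Stand-in measure, NOT the patent's formula: Jaccard overlap of non-zero cells."""
    both = either = 0
    for row_a, row_b in zip(a, b):
        for x, y in zip(row_a, row_b):
            if x and y:
                both += 1
            if x or y:
                either += 1
    return both / either if either else 0.0


def build_association_index(
    matrices: Dict[str, Matrix],
    associate: Callable[[Matrix, Matrix], float],
    threshold: float,
) -> Dict[str, List[Tuple[str, float]]]:
    """For every first graph, keep each second graph whose degree >= threshold."""
    index: Dict[str, List[Tuple[str, float]]] = {gid: [] for gid in matrices}
    for first_id, first in matrices.items():
        for second_id, second in matrices.items():
            if second_id == first_id:
                continue  # second graphs exclude the first graph itself
            degree = associate(first, second)
            if degree >= threshold:
                index[first_id].append((second_id, degree))
    return index


matrices = {
    "g1": [[0, 1, 1], [1, 0, 0], [1, 0, 0]],
    "g2": [[0, 1, 0], [1, 0, 0], [0, 0, 0]],
    "g3": [[0, 0, 0], [0, 0, 1], [0, 1, 0]],
}
index = build_association_index(matrices, overlap_association, threshold=0.5)
```

Storing the degree next to each associated graph id lets later steps sort or filter without recomputing it.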
S104: receiving a retrieval request for a target knowledge graph;
S105: retrieving the target knowledge graph and the knowledge graphs associated with the target knowledge graph, and providing the retrieval result to the user.
In the method for retrieving massive knowledge graphs described above, the association between knowledge graphs is determined by calculating the association degree between them. This ensures that when a user searches for a certain knowledge graph, the user can also browse strongly associated knowledge graphs, which expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience.
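Steps S104 and S105 amount to a lookup against the stored associations. A minimal Python sketch follows; it assumes the associations were stored as a mapping from graph id to a list of associated graph ids, which is one possible storage layout and not something the text prescribes. The function and variable names are hypothetical.

```python
from typing import Dict, List


def retrieve(
    target_id: str,
    graphs: Dict[str, dict],
    associated: Dict[str, List[str]],
) -> dict:
    """Return the target graph together with the graphs stored as associated with it."""
    if target_id not in graphs:
        raise KeyError(f"unknown knowledge graph: {target_id}")
    related_ids = associated.get(target_id, [])
    return {
        "target": graphs[target_id],
        "related": [graphs[gid] for gid in related_ids],
    }


# Hypothetical stored data.
graphs = {
    "mechanics": {"title": "Mechanics"},
    "kinematics": {"title": "Kinematics"},
    "optics": {"title": "Optics"},
}
associated = {
    "mechanics": ["kinematics"],
    "kinematics": ["mechanics"],
    "optics": [],
}
result = retrieve("mechanics", graphs, associated)
```

Because the associations are computed ahead of time, the request path does no matrix arithmetic at all.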
Fig. 3 is a flowchart of a second embodiment of the method for retrieving massive knowledge graphs according to the present invention. Referring to fig. 3, this embodiment of the method comprises the following steps:
S301: constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set;
S302: calculating, according to a preset association degree algorithm, the association degree between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
S303: storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
S304: calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of each third knowledge graph and the visual data matrix of the first knowledge graph; a third knowledge graph is a knowledge graph associated with the first knowledge graph;
In this embodiment, the preset similarity algorithm formula is as follows:
Figure BDA0002646882650000091
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain data information of the same visual data, and its value is 15%; i and j respectively denote the row number and the column number of the visual data matrices of the first and third knowledge graphs; S_m is the confidence of the m-th data information in the visual data matrix of the first knowledge graph, takes a value in [0, 1], and is the probability that the m-th data information appears in a certain region of the visual data matrix, which decreases as the area of the region increases; Y_m is the confidence of the m-th data information in the visual data matrix of the third knowledge graph, and its value is determined in the same way as S_m; G_m is the evaluation score of the m-th data information in the visual data matrix of the first knowledge graph, takes a value in [0, 1], and is the proportion of the m-th data information among all the data information in the first visual data matrix, increasing as that proportion increases; C_m is the evaluation score of the m-th data information in the visual data matrix of the third knowledge graph, and its value is determined in the same way as G_m; ω is the useless-data influence factor of the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph, and takes a value in [0.05, 0.1]: the larger the amount of useless data, the closer the value is to 0.1, and the smaller the amount of useless data, the closer the value is to 0.05.
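The text fixes θ at 15% and says only that ω moves from 0.05 toward 0.1 as the amount of useless data grows. One way to realize that monotone behavior is a linear interpolation over the fraction of useless data, sketched below in Python. The linear form, the notion of a "useless fraction" in [0, 1], and the names used are assumptions, not part of the source.

```python
THETA = 0.15  # preset probability stated in the text

OMEGA_MIN = 0.05  # lower end of the stated range for omega
OMEGA_MAX = 0.10  # upper end of the stated range for omega


def useless_data_factor(useless_fraction: float) -> float:
    """Map a useless-data fraction in [0, 1] to omega in [0.05, 0.1].

    Linear interpolation is an assumption; the text only requires that
    omega be closer to 0.1 when there is more useless data.
    """
    if not 0.0 <= useless_fraction <= 1.0:
        raise ValueError("useless_fraction must lie in [0, 1]")
    return OMEGA_MIN + (OMEGA_MAX - OMEGA_MIN) * useless_fraction


omega_clean = useless_data_factor(0.0)   # little useless data
omega_noisy = useless_data_factor(1.0)   # mostly useless data
```

Any other monotone mapping onto [0.05, 0.1] would satisfy the description equally well.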
S305: screening out the third knowledge graphs whose visual data matrices have a similarity with the visual data matrix of the first knowledge graph that reaches a preset similarity threshold, and obtaining and storing a recommended knowledge graph set corresponding to the first knowledge graph;
S306: receiving a retrieval request for a target knowledge graph;
S307: retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In the method for retrieving massive knowledge graphs described above, the association between knowledge graphs is determined by calculating the association degree between them. This ensures that when a user searches for a certain knowledge graph, the user can also browse strongly associated knowledge graphs, which expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, after the association degree between knowledge graphs is calculated, the similarity between them is calculated, so that similarity is ensured in addition to association degree. The association of each knowledge graph can thus be determined more accurately, which improves the accuracy of the determined association relations and greatly improves the user experience.
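The screening in steps S304 and S305 can be sketched in Python as a filter over the graphs already stored as associated with the first knowledge graph. The similarity measure is passed in as a callable; here it is backed by a table of made-up scores so the sketch runs, and it does not implement the similarity formula of the invention. All names, scores and the threshold are hypothetical.

```python
from typing import Callable, Dict, List, Tuple


def build_recommended_set(
    first_id: str,
    associated_ids: List[str],
    similarity: Callable[[str, str], float],
    threshold: float,
) -> List[str]:
    """Keep the associated (third) graphs whose similarity reaches the threshold."""
    return [
        third_id
        for third_id in associated_ids
        if similarity(first_id, third_id) >= threshold
    ]


# Invented similarity scores standing in for the preset similarity algorithm.
scores: Dict[Tuple[str, str], float] = {
    ("g1", "g2"): 0.82,
    ("g1", "g3"): 0.41,
    ("g1", "g4"): 0.77,
}


def lookup_similarity(first_id: str, third_id: str) -> float:
    return scores[(first_id, third_id)]


recommended = build_recommended_set(
    "g1", ["g2", "g3", "g4"], lookup_similarity, threshold=0.75
)
```

The result would then be stored per first knowledge graph and returned alongside the target graph at retrieval time (S306 and S307).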
Fig. 4 is a flowchart of a third embodiment of the method for retrieving massive knowledge graphs according to the present invention. Referring to fig. 4, this embodiment of the method comprises the following steps:
S401: constructing a visual data matrix for each knowledge graph in a pre-acquired massive knowledge graph set;
S402: calculating, according to a preset association degree algorithm, the association degree between the visual data matrix of a first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graphs are the knowledge graphs in the massive knowledge graph set other than the first knowledge graph;
S403: storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph associated with the first knowledge graph;
S404: sorting the knowledge graphs associated with the first knowledge graph from high to low by the association degree between their visual data matrices and the visual data matrix of the first knowledge graph, to obtain a sorting result;
S405: calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph, wherein N is a positive integer initialized to 1;
S406: judging whether the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold; if yes, executing step S407, otherwise executing step S409;
S407: putting the Nth knowledge graph in the sorting result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter; wherein the initial count value of the counter is 0;
S408: judging whether the count value of the counter is equal to a preset graph extension number; if yes, executing step S411, otherwise executing step S409;
S409: judging whether N is equal to M, wherein M is the number of knowledge graphs in the sorting result; if yes, executing step S411, otherwise executing step S410;
S410: setting N to N + 1, and returning to step S405;
S411: storing the current recommended knowledge graph set corresponding to the first knowledge graph;
S412: receiving a retrieval request for a target knowledge graph;
S413: retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
In the method for retrieving massive knowledge graphs described above, the association between knowledge graphs is determined by calculating the association degree between them. This ensures that when a user searches for a certain knowledge graph, the user can also browse strongly associated knowledge graphs, which expands the user's retrieval range, lets the user retrieve more useful information, and improves the user experience. Furthermore, the similarity of the associated knowledge graphs is calculated one by one in descending order of association degree, and the knowledge graphs whose similarity reaches the threshold are added to the recommended knowledge graph set until the size of the set reaches the preset graph extension number or all associated knowledge graphs have been traversed. Similarity is thus ensured in addition to association degree, so the association of each knowledge graph can be determined more accurately and the accuracy of the determined association relations is improved. At the same time, because a user's attention is limited, the number of extended knowledge graphs should not be too large; selecting the second knowledge graphs that are most similar to the first knowledge graph therefore better meets the user's needs.
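The control flow of steps S404 to S411 can be sketched in Python as follows. The loop mirrors the step numbering: sort by association degree, test each graph's similarity in turn, and stop once the counter reaches the preset graph extension number or the sorted list is exhausted. The similarity measure is supplied as a callable and is backed here by made-up scores; it does not implement the similarity formula of the invention. The function and parameter names are hypothetical.

```python
from typing import Callable, List, Tuple


def select_recommendations(
    associated: List[Tuple[str, float]],
    similarity: Callable[[str], float],
    sim_threshold: float,
    extension_number: int,
) -> List[str]:
    """associated holds (graph_id, association_degree) pairs for one first graph."""
    if extension_number < 1:
        raise ValueError("extension_number must be at least 1")
    # S404: sort from high to low association degree.
    ranked = sorted(associated, key=lambda pair: pair[1], reverse=True)
    recommended: List[str] = []
    counter = 0          # initial count value is 0
    m = len(ranked)      # M: number of graphs in the sorting result
    n = 1                # N: initial value is 1
    while n <= m:
        graph_id, _degree = ranked[n - 1]
        # S405 and S406: compute similarity and compare with the threshold.
        if similarity(graph_id) >= sim_threshold:
            recommended.append(graph_id)   # S407
            counter += 1
            if counter == extension_number:  # S408
                break
        if n == m:  # S409
            break
        n += 1      # S410
    return recommended  # S411: this set would be stored


# Invented data: association degrees and similarity scores for one first graph.
associated = [("g5", 0.6), ("g2", 0.9), ("g4", 0.7), ("g3", 0.8)]
sims = {"g2": 0.4, "g3": 0.9, "g4": 0.85, "g5": 0.95}
picked = select_recommendations(associated, sims.__getitem__, 0.8, 2)
```

Note that g5 has the highest similarity in this example but is never examined, because the extension number is reached first; the procedure prioritizes association degree and uses similarity only as a gate.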
As will be appreciated by one skilled in the art, embodiments of the present invention may be provided as a method, system, or computer program product. Accordingly, the present invention may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, the present invention may take the form of a computer program product embodied on one or more computer-usable storage media (including, but not limited to, disk storage, optical storage, and the like) having computer-usable program code embodied therein.
The present invention is described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems), and computer program products according to embodiments of the invention. It will be understood that each flow and/or block of the flow diagrams and/or block diagrams, and combinations of flows and/or blocks in the flow diagrams and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
It will be apparent to those skilled in the art that various changes and modifications may be made in the present invention without departing from the spirit and scope of the invention. Thus, if such modifications and variations of the present invention fall within the scope of the claims of the present invention and their equivalents, the present invention is also intended to include such modifications and variations.

Claims (7)

1. A method for retrieving massive knowledge graphs, characterized by comprising the following steps:
constructing a visual data matrix of each knowledge graph in a pre-acquired massive knowledge graph set;
calculating the association degree of the visual data matrix of the first knowledge graph and the visual data matrix of each second knowledge graph in the massive knowledge graph set according to a preset association degree algorithm; the first knowledge graph is any knowledge graph in the massive knowledge graph set, and the second knowledge graph is other knowledge graphs in the massive knowledge graph set except the first knowledge graph;
storing each second knowledge graph whose visual data matrix has an association degree with the visual data matrix of the first knowledge graph as a knowledge graph having an association degree with the first knowledge graph;
receiving a retrieval request about a target knowledge-graph;
retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph, and providing a retrieval result for a user;
wherein, after storing the second knowledge graph corresponding to the visual data matrix with the association degree with the visual data matrix of the first knowledge graph as the knowledge graph with the association degree with the first knowledge graph, before receiving the retrieval request of the target knowledge graph, the method further comprises:
sorting the knowledge graphs that have an association degree with the first knowledge graph from high to low by the association degree between their visual data matrices and the visual data matrix of the first knowledge graph, to obtain a sorting result;
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph, wherein N is a positive integer with an initial value of 1;
judging whether the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches a preset similarity threshold;
if the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph reaches the preset similarity threshold, putting the Nth knowledge graph in the sorting result into the recommended knowledge graph set corresponding to the first knowledge graph, and adding 1 to the count value of a preset counter, wherein the initial count value of the counter is 0;
judging whether the count value of the counter is equal to the preset graph extension number;
if the count value of the counter is not equal to the preset graph extension number, judging whether N is equal to M, wherein M is the number of knowledge graphs in the sorting result;
if N is not equal to M, setting N to N + 1 and returning to the step of calculating, according to the preset similarity algorithm, the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph;
if the count value of the counter is equal to the preset graph extension number or N is equal to M, storing the recommended knowledge graph set corresponding to the current first knowledge graph;
if the similarity between the visual data matrix of the Nth knowledge graph in the sorting result and the visual data matrix of the first knowledge graph does not reach the preset similarity threshold, executing the step of judging whether N is equal to M;
wherein the retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph, and providing the retrieval result to the user, comprises:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
2. The method for retrieving the massive knowledge graph according to claim 1, wherein after storing the second knowledge graph corresponding to the visual data matrix with the association degree with the visual data matrix of the first knowledge graph as the knowledge graph with the association degree with the first knowledge graph, and before receiving the retrieval request about the target knowledge graph, the method further comprises:
calculating, according to a preset similarity algorithm, the similarity between the visual data matrix of a third knowledge graph and the visual data matrix of the first knowledge graph, wherein the third knowledge graph is a knowledge graph having an association degree with the first knowledge graph;
screening out each third knowledge graph whose visual data matrix has a similarity with the visual data matrix of the first knowledge graph that reaches a preset similarity threshold, to obtain and store a recommended knowledge graph set corresponding to the first knowledge graph;
wherein the retrieving the target knowledge graph and the knowledge graph with the association degree with the target knowledge graph, and providing the retrieval result to the user, comprises:
retrieving the target knowledge graph and the recommended knowledge graph set corresponding to the target knowledge graph, and providing the retrieval result to the user.
3. The method for retrieving the massive knowledge graph according to any one of claims 1-2, wherein the formula of the preset association degree algorithm is as follows:
Figure FDA0003079772530000031
wherein X is the degree of association between the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; π is the ratio of a circle's circumference to its diameter; i and j denote the row number and the column number, respectively, in the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; a_ij is the data information of the visual data in row i, column j of the visual data matrix of the first knowledge graph; b_ij is the data information of the visual data in row i, column j of the visual data matrix of the second knowledge graph; L is the association constraint coefficient of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; e is a preset constant; α is the weak association weight of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; β is the strong association weight of the visual data matrix of the first knowledge graph and the visual data matrix of the second knowledge graph; λ is a preset learning rate; and the values of L, e, α, β, and λ are all preset values.
4. The method for retrieving the massive knowledge graph according to claim 3, wherein L takes a value in the interval [0.1, 0.3], e is 2.58, and λ takes a value in the interval [0, 1].
5. The method for retrieving the massive knowledge graph according to claim 2, wherein the formula of the preset similarity algorithm is as follows:
Figure FDA0003079772530000032
wherein simY is the similarity between the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; θ is the preset probability that the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph contain the data information of the same visual data; i and j denote the row number and the column number, respectively, in the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; S_m is the confidence of the mth data information in the visual data matrix of the first knowledge graph; Y_m is the confidence of the mth data information in the visual data matrix of the third knowledge graph; G_m is the evaluation score of the mth data information in the visual data matrix of the first knowledge graph; C_m is the evaluation score of the mth data information in the visual data matrix of the third knowledge graph; ω is the useless-data influence factor of the visual data matrix of the first knowledge graph and the visual data matrix of the third knowledge graph; and the values of θ and ω are preset values.
6. The method for retrieving the massive knowledge graph according to claim 5, wherein θ is 15% and ω takes a value in the interval [0.05, 0.1].
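The value ranges stated in claims 4 and 6 can be captured as a small validated configuration object. The following Python sketch is illustrative only: the class name `PresetParameters`, the field names, and the default values chosen inside each interval are assumptions, and α and β are left out because no value range is given for them.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PresetParameters:
    """Preset values with the ranges given in claims 4 and 6."""

    L: float = 0.2       # association constraint coefficient, interval [0.1, 0.3]
    e: float = 2.58      # fixed value given in claim 4
    lam: float = 0.5     # preset learning rate (lambda), interval [0, 1]
    theta: float = 0.15  # preset probability, 15%
    omega: float = 0.05  # useless-data influence factor, interval [0.05, 0.1]

    def __post_init__(self) -> None:
        # The defaults above are arbitrary picks inside each interval (assumption).
        if not 0.1 <= self.L <= 0.3:
            raise ValueError("L must lie in [0.1, 0.3]")
        if self.e != 2.58:
            raise ValueError("e must be 2.58")
        if not 0.0 <= self.lam <= 1.0:
            raise ValueError("lambda must lie in [0, 1]")
        if self.theta != 0.15:
            raise ValueError("theta must be 0.15")
        if not 0.05 <= self.omega <= 0.1:
            raise ValueError("omega must lie in [0.05, 0.1]")


if __name__ == "__main__":
    print(PresetParameters(L=0.1, omega=0.1))
```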
7. The method for retrieving the massive knowledge graph according to claim 4, wherein before constructing the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set, the method further comprises:
acquiring massive knowledge graphs;
preprocessing each acquired knowledge graph, the preprocessing comprising: deleting repeated data and garbled data from each knowledge graph;
combining the preprocessed knowledge graphs into the massive knowledge graph set;
wherein constructing the visual data matrix of each knowledge graph in the pre-acquired massive knowledge graph set comprises:
parsing each knowledge graph in the massive knowledge graph set to obtain the visual data of each knowledge graph;
taking the data information of the visual data of each knowledge graph as matrix elements to construct the visual data matrix of each knowledge graph, wherein the data information of the visual data comprises: the keywords in the visual data and the connection relationships between the keywords.
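As an illustration of the preprocessing and matrix construction steps in claim 7, the following Python sketch removes repeated and garbled entries and then builds a matrix from keywords and their connections. Several points are assumptions rather than part of the claim: a knowledge graph is represented as a list of (keyword, keyword) pairs, "garbled data" is detected as text containing the Unicode replacement character U+FFFD, and the matrix is encoded as a 0/1 adjacency matrix over the sorted keywords. The function names `preprocess` and `build_matrix` are hypothetical.

```python
from typing import Iterable, List, Tuple

Edge = Tuple[str, str]  # (keyword, connected keyword); assumed representation


def preprocess(edges: Iterable[Edge]) -> List[Edge]:
    """Delete repeated edges and edges with garbled text, keeping first-seen order."""
    seen = set()
    cleaned: List[Edge] = []
    for source, target in edges:
        # Assumption: garbled data is text containing U+FFFD.
        if "\ufffd" in source or "\ufffd" in target:
            continue
        if (source, target) in seen:
            continue  # repeated data
        seen.add((source, target))
        cleaned.append((source, target))
    return cleaned


def build_matrix(edges: Iterable[Edge]) -> Tuple[List[str], List[List[int]]]:
    """Return the sorted keywords and a 0/1 matrix of their connections."""
    edge_list = list(edges)
    keywords = sorted({keyword for edge in edge_list for keyword in edge})
    index = {keyword: position for position, keyword in enumerate(keywords)}
    matrix = [[0] * len(keywords) for _ in keywords]
    for source, target in edge_list:
        matrix[index[source]][index[target]] = 1  # connection source -> target
    return keywords, matrix


if __name__ == "__main__":
    raw = [
        ("force", "mass"),
        ("force", "mass"),           # repeated
        ("mass", "acc\ufffd"),       # garbled
        ("mass", "acceleration"),
    ]
    keywords, matrix = build_matrix(preprocess(raw))
    print(keywords)
    for row in matrix:
        print(row)
```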
CN202010857339.9A 2020-08-24 2020-08-24 A Method for Retrieval of Massive Knowledge Graph Active CN112015911B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010857339.9A CN112015911B (en) 2020-08-24 2020-08-24 A Method for Retrieval of Massive Knowledge Graph

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010857339.9A CN112015911B (en) 2020-08-24 2020-08-24 A Method for Retrieval of Massive Knowledge Graph

Publications (2)

Publication Number Publication Date
CN112015911A CN112015911A (en) 2020-12-01
CN112015911B CN112015911B (en) 2021-07-20

Family

ID=73505725

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010857339.9A Active CN112015911B (en) 2020-08-24 2020-08-24 A Method for Retrieval of Massive Knowledge Graph

Country Status (1)

Country Link
CN (1) CN112015911B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113377963B (en) 2021-06-28 2023-08-11 中国科学院地质与地球物理研究所 A method and device for processing well site test data based on knowledge graph
CN115858906A (en) * 2022-12-26 2023-03-28 中移动信息技术有限公司 Enterprise searching method, device, equipment, computer storage medium and program
CN116521656A (en) * 2023-03-17 2023-08-01 中国船舶集团有限公司第七二四研究所 Knowledge graph-based radar full-element data association method
CN118964642B (en) * 2024-10-18 2025-03-21 山东天华通信有限公司 An efficient retrieval method and system for structured data based on knowledge graph

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109857872A (en) * 2019-02-18 2019-06-07 浪潮软件集团有限公司 The information recommendation method and device of knowledge based map
CN111241241A (en) * 2020-01-08 2020-06-05 平安科技(深圳)有限公司 Case retrieval method, device and equipment based on knowledge graph and storage medium
CN111369318A (en) * 2020-02-28 2020-07-03 安徽农业大学 A recommendation method and system based on commodity knowledge graph feature learning

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109804364A (en) * 2016-10-18 2019-05-24 浙江核新同花顺网络信息股份有限公司 Knowledge graph construction system and method
US11693848B2 (en) * 2018-08-07 2023-07-04 Accenture Global Solutions Limited Approaches for knowledge graph pruning based on sampling and information gain theory

Also Published As

Publication number Publication date
CN112015911A (en) 2020-12-01

Similar Documents

Publication Publication Date Title
CN112015911B (en) A Method for Retrieval of Massive Knowledge Graph
CN108804641B (en) Text similarity calculation method, device, equipment and storage medium
US8559731B2 (en) Personalized tag ranking
US8150859B2 (en) Semantic table of contents for search results
US7853599B2 (en) Feature selection for ranking
EP3937029A2 (en) Method and apparatus for training search model, and method and apparatus for searching for target object
US8880548B2 (en) Dynamic search interaction
CN109299383B (en) Method and device for generating recommended word, electronic equipment and storage medium
JP6299596B2 (en) Query similarity evaluation system, evaluation method, and program
CN103514199A (en) Method and device for POI data processing and method and device for POI searching
CN110321437B (en) Corpus data processing method and device, electronic equipment and medium
CN108959550B (en) User interest mining method, apparatus, device and computer readable medium
US20110179013A1 (en) Search Log Online Analytic Processing
Wang et al. A distance matrix based algorithm for solving the traveling salesman problem
CN113918807A (en) Data recommendation method and device, computing equipment and computer-readable storage medium
CN111079035B (en) Domain searching and sorting method based on dynamic map link analysis
CN121166876A (en) A retrieval question answering method applied to large language models
CN104750692A (en) Information processing method, information retrieval method and corresponding device of information retrieval method
CN116701567B (en) Electronic book retrieval method and system based on artificial intelligence
CN111401055A (en) Method and apparatus for extracting context information from financial information
CN106778048B (en) Method and device for data processing
CN112015914A (en) Knowledge graph path searching method based on deep learning
CN111797183B (en) Method and device for mining road attribute of information point and electronic equipment
JP6577922B2 (en) Search apparatus, method, and program
JP7631430B2 (en) PROGRAM, TERMINAL DEVICE AND INFORMATION DISPLAY METHOD

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
PP01 Preservation of patent right

Effective date of registration: 20221020

Granted publication date: 20210720

PD01 Discharge of preservation of patent

Date of cancellation: 20241020

Granted publication date: 20210720