CN118093632B - Graph database query method and device based on large language model and graph structure - Google Patents
- Publication number
- CN118093632B (application CN202410458999.8A)
- Authority
- CN
- China
- Prior art keywords
- target
- query
- graph
- target graph
- attribute
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Active
Classifications
- G06F16/2423: Interactive query statement specification based on a database schema
- G06F16/243: Natural language query formulation
- G06F16/2433: Query languages
- G06F16/24522: Translation of natural language queries to structured queries
- G06F16/28: Databases characterised by their database models, e.g. relational or object models
- Y02D10/00: Energy efficient computing, e.g. low power processors, power management or thermal management
Landscapes
- Engineering & Computer Science (AREA)
- Theoretical Computer Science (AREA)
- Physics & Mathematics (AREA)
- Databases & Information Systems (AREA)
- Data Mining & Analysis (AREA)
- General Engineering & Computer Science (AREA)
- General Physics & Mathematics (AREA)
- Mathematical Physics (AREA)
- Computational Linguistics (AREA)
- Artificial Intelligence (AREA)
- Information Retrieval, Db Structures And Fs Structures Therefor (AREA)
Abstract
The invention provides a graph database query method and device based on a large language model and a graph structure, and relates to the technical field of graph databases and artificial intelligence, wherein the method comprises the following steps: obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship; determining a prompt sentence based on the prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship; inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database; and executing the query statement in the target graph database to determine the target query text. The invention can improve the universality and the query efficiency of the graph database.
Description
Technical Field
The invention relates to the technical field of graph databases and artificial intelligence, in particular to a graph database query method and device based on a large language model and a graph structure.
Background
The graph database is a non-relational database based on graph theory, the data storage structure of the graph database and the data query mode are based on the graph theory, and nodes and relations are mainly stored in the graph database. At present, graph databases present significant advantages in processing complex structured data, particularly in a data-intensive environment.
However, in conventional graph database systems, the query process often requires a specific query language and familiarity with the database structure. This places high demands on the expertise of users, so the usability and adoption of graph databases are low, and query efficiency is low as a result.
Disclosure of Invention
The invention provides a graph database query method and device based on a large language model and a graph structure, which are used for solving the defect of low query efficiency of a graph database in the prior art and improving the universality and the query efficiency of the graph database.
The invention provides a graph database query method based on a large language model and a graph structure, which comprises the following steps:
Obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship;
inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database;
and executing the query statement in the target graph database to determine target query text.
According to the graph database query method based on the large language model and the graph structure provided by the invention, the method for acquiring the target graph structure corresponding to the target graph database comprises the following steps:
acquiring metadata in the target graph database; the metadata comprises element labels, element types, target types and attribute names corresponding to the nodes or the edges respectively;
Filtering to obtain target elements from all metadata based on the element types and the target types;
and determining a target graph structure corresponding to the target graph database based on the element label and the attribute name corresponding to the target element.
According to the graph database query method based on the large language model and the graph structure, the target element comprises a first target node or a target edge;
The determining, based on the element label and the attribute name corresponding to the target element, a target graph structure corresponding to the target graph database includes:
Determining a target graph node attribute in the target graph structure based on the element label and the attribute name corresponding to the first target node;
determining a target graph edge attribute in the target graph structure based on the element label and the attribute name corresponding to the target edge;
And determining a target graph edge relation in the target graph structure based on the element label and the attribute name corresponding to the target edge and information of a second target node related to the target edge.
According to the graph database query method based on the large language model and the graph structure provided by the invention, the target element is obtained by filtering all metadata based on the element type and the target type, and the method comprises the following steps:
in the case that the element type is a node and the target type is not a relationship, filtering all metadata to obtain a first target node;
and in the case that the element type is a relationship and the target type is not a node, or the target type is a relationship, filtering all metadata to obtain a target edge.
According to the graph database query method based on the large language model and the graph structure provided by the invention, the determining of the prompt sentence based on the prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship comprises the following steps:
Determining target general prompt words in the prompt template; the target general prompt words comprise a first general prompt word, a second general prompt word and a third general prompt word;
updating the first general prompt word based on the target graph node attribute; updating the second general prompt word based on the target graph edge attribute; updating the third general prompt word based on the target graph edge relationship;
and determining the updated prompt template as the prompt statement.
According to the graph database query method based on the large language model and the graph structure provided by the invention, the query statement is executed in the target graph database, and the target query text is determined, which comprises the following steps:
Executing the query statement in the target graph database to obtain a first query result;
And inputting the first query result and the query question into the large language model to determine the target query text.
According to the graph database query method based on the large language model and the graph structure provided by the invention, the steps of inputting the first query result and the query question into the large language model and determining the target query text include:
carrying out semantic analysis on the query question and determining a semantic analysis result corresponding to the query question;
And if the semantic analysis result indicates that the target output format is included in the query question, performing format conversion on the first query result based on the target output format, and determining the target query text.
The invention also provides a graph database query device based on the large language model and the graph structure, which comprises:
the acquisition module is used for acquiring a target graph structure corresponding to the target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
The determining module is used for determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relation;
the output module is used for inputting the prompt sentences and the query questions into a large language model and outputting query sentences corresponding to the target graph database;
And the query module is used for executing the query statement in the target graph database and determining target query text.
The invention also provides an electronic device, which comprises a memory, a processor and a computer program stored on the memory and capable of running on the processor, wherein the processor realizes the graph database query method based on the large language model and the graph structure when executing the program.
The present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, implements a graph database query method based on a large language model and a graph structure as any one of the above.
According to the graph database query method and device based on the large language model and the graph structure provided by the invention, after the query question input by the user and the target graph structure corresponding to the target graph database are obtained, the target graph node attribute, the target graph edge attribute and the target graph edge relationship in the target graph structure are used to update the prompt template to obtain the prompt sentence. The prompt sentence and the query question input by the user are input into the large language model, so that the complex-structure processing capability of the graph database and the natural language processing capability of the large language model are combined to generate the query statement. The query statement is then executed in the target graph database, and a target query text that is easy for the user to understand is generated. In the whole query process, the construction of the query statement, which has higher professional requirements, is performed by the large language model, and the execution of the query statement is performed automatically. The only steps that involve user operation are inputting the query question and receiving the target query text. This greatly reduces the professional requirements on the user, improves the usability and adoption of the graph database, and thereby improves the query efficiency of the graph database.
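As a non-limiting illustration, the overall flow summarized above can be sketched in Python. The function name `answer_question` and its four injected callables are hypothetical and are not part of the disclosure; the two calls to `llm` correspond to generating the query statement and to turning the first query result into the target query text. The exact wording passed to the model in each call is an assumption.

```python
from typing import Any, Callable, Dict


def answer_question(
    question: str,
    fetch_graph_structure: Callable[[], Dict[str, Any]],
    prompt_template: str,
    llm: Callable[[str], str],
    run_query: Callable[[str], Any],
) -> str:
    """Hypothetical end-to-end sketch of the four method steps."""
    # Step 1: obtain the target graph structure (node attributes,
    # edge attributes, edge relationships) from the target graph database.
    structure = fetch_graph_structure()
    # Step 2: fill the prompt template to obtain the prompt sentence.
    prompt_sentence = prompt_template.format(**structure)
    # Step 3: the large language model produces the query statement.
    query_statement = llm(f"{prompt_sentence}\n\nQuestion: {question}")
    # Step 4a: execute the query statement to get the first query result.
    first_result = run_query(query_statement)
    # Step 4b: the model turns the result into the target query text.
    return llm(f"Question: {question}\nQuery result: {first_result}")
```

Passing the database access and the model as callables keeps the sketch independent of any particular graph database or large language model.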
Drawings
In order to more clearly illustrate the invention or the technical solutions of the prior art, the following description will briefly explain the drawings used in the embodiments or the description of the prior art, and it is obvious that the drawings in the following description are some embodiments of the invention, and other drawings can be obtained according to the drawings without inventive effort for a person skilled in the art.
FIG. 1 is a first flowchart of a graph database query method based on a large language model and a graph structure according to an embodiment of the present invention;
FIG. 2 is a second flowchart of a graph database query method based on a large language model and a graph structure according to an embodiment of the present invention;
FIG. 3 is a third flowchart of a graph database query method based on a large language model and a graph structure according to an embodiment of the present invention;
FIG. 4 is a schematic structural diagram of a graph database query device based on a large language model and a graph structure according to an embodiment of the present invention;
FIG. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the technical solutions of the present invention will be clearly and completely described below with reference to the accompanying drawings, and it is apparent that the described embodiments are some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
To address the low query efficiency of graph databases in the prior art, an embodiment of the present invention provides a graph database query method based on a large language model and a graph structure. FIG. 1 is a first flowchart of the graph database query method based on the large language model and the graph structure according to an embodiment of the present invention. As shown in FIG. 1, the method includes the following steps:
Step 110, obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship.
It should be noted that the target graph database may be any graph database. A graph database is a non-relational (NoSQL, Not Only SQL) database implemented on the basis of graph theory; that is, both its data storage structure and its data query mode are based on graph theory. In a graph database, the basic elements of a graph are nodes and edges: nodes can be understood as real-world entities, and edges as the relationships between those entities. A graph database is designed specifically for storing, querying and analyzing highly interconnected data, and because nodes and edges are stored in it in advance, it can respond quickly to complex association queries.
Optionally, the target graph database may be Neo4j, OrientDB, ArangoDB, JanusGraph or the like. Neo4j is a high-performance graph database that provides the flexible Cypher query language. OrientDB and ArangoDB are multi-model databases that support graph, document and key-value data models. JanusGraph is a distributed graph database that supports a variety of storage backends, including Cassandra, HBase and BerkeleyDB. The embodiment of the present invention does not limit the target graph database.
In addition, in response to the input operation of the user, the query question input by the user may be acquired, for example, the user may input the query question in an input box of a display interface of the electronic device, or the user may further upload the query question to the electronic device after inputting the query question in an input box of a display interface of another device communicatively connected to the electronic device, so as to acquire the query question input by the user. The embodiment of the invention does not limit the input operation of the user.
In the target graph structure, the target graph node attribute includes a plurality of attribute information related to nodes, the target graph edge attribute includes a plurality of attribute information related to edges, and the target graph edge relationship is used for representing the connection relationship and the directionality between the nodes.
Further, the obtaining the target graph structure corresponding to the target graph database includes:
acquiring metadata in the target graph database; the metadata comprises element labels, element types, target types and attribute names corresponding to the nodes or the edges respectively;
Filtering to obtain target elements from all metadata based on the element types and the target types;
and determining a target graph structure corresponding to the target graph database based on the element label and the attribute name corresponding to the target element.
Specifically, the core of the target graph structure corresponding to a target graph database consists of the target graph node attribute, the target graph edge attribute and the target graph edge relationship. To obtain them, metadata of the instance in the target graph database is first obtained by calling a plug-in command in the target graph database, which returns the element label, element type, target type and attribute name corresponding to each node or edge. Taking Neo4j as the target graph database as an example, the Cypher statement "CALL apoc.meta.data()" can be used to obtain the metadata corresponding to all nodes and edges in the Neo4j graph database, where "CALL apoc.meta.data()" is a command of the APOC plug-in used in the Neo4j graph database, and the metadata includes the number of nodes and edges, index information, the state of the storage engine, and the like. Then, when the target graph node attribute and the target graph edge attribute are obtained, the Cypher clause "YIELD label, elementType, type, property" is used so that the returned metadata contains four fields: the element label, the element type, the target type and the attribute name. Here, label is the label identifier in the database and is used to classify target elements; elementType indicates the element type, that is, whether the element is a node or a relationship; type indicates the target type; and property indicates the attribute name.
When the target graph edge relationship is obtained, the clause "YIELD label, elementType, type, property, other" can be used so that the returned metadata contains five fields: the element label, the element type, the target type, the attribute name and other related information. The other related information represents further information related to the target edge, for example the information corresponding to a second target node connected to the target edge.
And then, according to the combination of the element type and the target type, different filtering conditions are formed, and the target element is obtained from all metadata through filtering, wherein the target element comprises a first target node or a target edge. After filtering to obtain the target element, combining the corresponding key value pair according to the element label and the attribute name corresponding to the target element, and further obtaining a target graph structure corresponding to the target graph database. For example, when the target graph node attribute and the target graph edge attribute are acquired, the element label corresponding to the target element may be renamed, all attribute names corresponding to the target element are aggregated, all attribute names are renamed after the aggregation, and a key value pair corresponding to the element label and a key value pair corresponding to the attribute name are formed, so as to respectively form the target graph node attribute and the target graph edge attribute. When the target graph-edge relationship is obtained, after filtering to obtain the target element, respectively constructing a key value pair corresponding to the element label, a key value pair corresponding to the attribute name and a key value pair corresponding to the information of a second target node related to the target edge, and further constructing the target graph-edge relationship.
Further, filtering, based on the element type and the target type, target elements from all metadata, including:
in the case that the element type is a node and the target type is not a relationship, filtering all metadata to obtain a first target node;
and in the case that the element type is a relationship and the target type is not a node, or the target type is a relationship, filtering all metadata to obtain a target edge.
Specifically, after the metadata is obtained, the element type and the target type are combined to construct filter conditions, and the first target node or the target edge is obtained from the metadata by applying these conditions. That is, when the element type is a node and the target type confirms that the record is not a relationship, the first target node can be filtered out accurately; when the element type is a relationship and the target type confirms either that the record is not a node or that the record is a relationship, the target edge can be filtered out accurately from all metadata.
For example, taking Neo4j as the target graph database, when the target graph node attribute is obtained, records whose element type (elementType) is a node can be selected through the clause 'WHERE elementType = "node" AND type <> "relationship"', where the condition that the target type (type) is not a relationship ensures the accuracy of the records. When the target graph edge attribute is obtained, records whose element type (elementType) is a relationship can be selected through the clause 'WHERE elementType = "relationship" AND NOT type = "NODE"', where the condition that the target type (type) is not a node ensures the accuracy of the records. When the target graph edge relationship is obtained, records whose element type (elementType) is a relationship can be selected through the clause 'WHERE elementType = "relationship" AND type = "relationship"', where the condition that the target type (type) is a relationship further ensures the accuracy of the records and indicates that the query focuses on the relationship types defined in the target graph database.
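As a non-limiting illustration, the three filter conditions can be mirrored in Python over metadata rows that have already been fetched. The row layout (a dict with the keys label, elementType, type and property), the sample rows and the function names are hypothetical. Comparing the target type case-insensitively is an assumption made here because the literal casing used by a given database version may differ.

```python
from typing import Dict, List

Row = Dict[str, str]  # hypothetical row layout: label, elementType, type, property


def filter_first_target_nodes(rows: List[Row]) -> List[Row]:
    """Element type is a node and the target type is not a relationship."""
    return [r for r in rows
            if r["elementType"] == "node"
            and r["type"].lower() != "relationship"]


def filter_edges_for_attributes(rows: List[Row]) -> List[Row]:
    """Element type is a relationship and the target type is not a node."""
    return [r for r in rows
            if r["elementType"] == "relationship"
            and r["type"].lower() != "node"]


def filter_edges_for_relations(rows: List[Row]) -> List[Row]:
    """Element type is a relationship and the target type is a relationship."""
    return [r for r in rows
            if r["elementType"] == "relationship"
            and r["type"].lower() == "relationship"]


# Hypothetical sample rows, for illustration only.
SAMPLE_ROWS: List[Row] = [
    {"label": "Person", "elementType": "node", "type": "STRING", "property": "name"},
    {"label": "Person", "elementType": "node", "type": "RELATIONSHIP", "property": "KNOWS"},
    {"label": "KNOWS", "elementType": "relationship", "type": "INTEGER", "property": "since"},
    {"label": "KNOWS", "elementType": "relationship", "type": "RELATIONSHIP", "property": "KNOWS"},
]
```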
Further, the determining, based on the element label and the attribute name corresponding to the target element, a target graph structure corresponding to the target graph database includes:
Determining a target graph node attribute in the target graph structure based on the element label and the attribute name corresponding to the first target node;
determining a target graph edge attribute in the target graph structure based on the element label and the attribute name corresponding to the target edge;
And determining a target graph edge relation in the target graph structure based on the element label and the attribute name corresponding to the target edge and information of a second target node related to the target edge.
For example, taking Neo4j as the target graph database, when the target graph node attribute is obtained, the data is first converted and aggregated through the clause "WITH label AS nodeTypes, collect(property) AS nodeProperties"; that is, the element label (label) of each first target node obtained by filtering is renamed to nodeTypes, all attribute names of the first target nodes of each node type are collected, and the set of attribute names is renamed to nodeProperties. The target graph node attribute is then returned through the clause "RETURN {nodeLabels: nodeTypes, nodeProperties: nodeProperties} AS nodeData". It contains two key-value pairs, whose keys are the node type (nodeLabels) and the set of attribute names of that node type (nodeProperties), and the target graph node attribute is renamed to nodeData. When the target graph edge attribute is obtained, the data is converted and aggregated through the clause "WITH label AS relationshipTypes, collect(property) AS relationshipProperties"; that is, the element label (label) of each target edge obtained by filtering is renamed to relationshipTypes, all attribute names of each relationship type are collected, and the set of attribute names is renamed to relationshipProperties. The target graph edge attribute is then returned through the clause "RETURN {relationshipLabels: relationshipTypes, properties: relationshipProperties} AS relationshipData". It contains two key-value pairs, whose keys are the relationship type (relationshipLabels) and the set of attribute names of that relationship type (properties), and the target graph edge attribute is renamed to relationshipData.
When the target graph edge relationship is obtained, it is returned through the clause "RETURN {relationshipType: label, propertyKey: property, targetInfo: other} AS relationshipDetails". It contains three key-value pairs, whose keys are the relationship type (relationshipType), the relationship attribute (propertyKey) and the other related information (targetInfo), and the target graph edge relationship is renamed to relationshipDetails.
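As a non-limiting illustration, the aggregation performed by the WITH ... collect(property) and RETURN clauses above can be reproduced in Python on rows that have already been filtered. The helper names and the row layout (dicts with the keys label, property and, for edges, other) are hypothetical; the output keys are the ones named in the text.

```python
from collections import defaultdict
from typing import Any, Dict, List


def collect_properties(rows: List[Dict[str, Any]],
                       label_key: str, props_key: str) -> List[Dict[str, Any]]:
    """Group attribute names by element label, like collect(property)."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for row in rows:
        grouped[row["label"]].append(row["property"])
    return [{label_key: label, props_key: props}
            for label, props in grouped.items()]


def build_node_data(node_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Mirrors: RETURN {nodeLabels: ..., nodeProperties: ...} AS nodeData
    return collect_properties(node_rows, "nodeLabels", "nodeProperties")


def build_relationship_data(edge_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Mirrors: RETURN {relationshipLabels: ..., properties: ...} AS relationshipData
    return collect_properties(edge_rows, "relationshipLabels", "properties")


def build_relationship_details(edge_rows: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    # Mirrors: RETURN {relationshipType, propertyKey, targetInfo} AS relationshipDetails
    return [{"relationshipType": r["label"],
             "propertyKey": r["property"],
             "targetInfo": r.get("other")}
            for r in edge_rows]
```

Each function returns one entry per record that the corresponding Cypher statement would return, so the three results together form the target graph structure.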
It should be noted that this general method of extracting the target graph structure does not depend on any particular graph structure: the complete target graph structure can be determined by extracting the target graph node attribute, the target graph edge attribute and the target graph edge relationship, and the method can be extended to any graph database.
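As a non-limiting illustration for the Neo4j example, the clauses quoted in this embodiment can be assembled into three complete Cypher statements, held here as Python strings. The helper `build_statement` and the constant names are hypothetical, the field name elementType and the string literals in the filter conditions are taken as quoted in this embodiment, and the assembled statements have not been verified against a live database, so the literals may need adjusting for a given APOC version.

```python
from typing import List

METADATA_CALL = "CALL apoc.meta.data()"


def build_statement(yield_fields: List[str], where: str, tail: str) -> str:
    """Join the metadata call, YIELD, WHERE and the closing clauses."""
    return "\n".join([
        METADATA_CALL,
        "YIELD " + ", ".join(yield_fields),
        "WHERE " + where,
        tail,
    ])


# Target graph node attribute (nodeData).
NODE_ATTRIBUTE_QUERY = build_statement(
    ["label", "elementType", "type", "property"],
    'elementType = "node" AND type <> "relationship"',
    "WITH label AS nodeTypes, collect(property) AS nodeProperties\n"
    "RETURN {nodeLabels: nodeTypes, nodeProperties: nodeProperties} AS nodeData",
)

# Target graph edge attribute (relationshipData).
EDGE_ATTRIBUTE_QUERY = build_statement(
    ["label", "elementType", "type", "property"],
    'elementType = "relationship" AND NOT type = "NODE"',
    "WITH label AS relationshipTypes, collect(property) AS relationshipProperties\n"
    "RETURN {relationshipLabels: relationshipTypes, "
    "properties: relationshipProperties} AS relationshipData",
)

# Target graph edge relationship (relationshipDetails); also yields `other`.
EDGE_RELATION_QUERY = build_statement(
    ["label", "elementType", "type", "property", "other"],
    'elementType = "relationship" AND type = "relationship"',
    "RETURN {relationshipType: label, propertyKey: property, "
    "targetInfo: other} AS relationshipDetails",
)
```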
Step 120, determining a prompt sentence based on the prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship.
Specifically, after the target graph structure including the target graph node attribute, the target graph edge attribute and the target graph edge relationship is obtained, a prompt template can be further obtained by combining a prompt engineering (Prompt Engineering) technology, and the prompt template is updated by using the target graph node attribute, the target graph edge attribute and the target graph edge relationship to obtain an updated prompt sentence so as to ensure that the large language model can clearly know the storage structure of the target graph database.
Further, FIG. 2 is a second flowchart of the graph database query method based on a large language model and a graph structure according to an embodiment of the present invention. As shown in FIG. 2, determining the prompt sentence based on the prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship includes:
Determining target general prompt words in the prompt template; the target general prompt words comprise a first general prompt word, a second general prompt word and a third general prompt word;
updating the first general prompt word based on the target graph node attribute; updating the second general prompt word based on the target graph edge attribute; updating the third general prompt word based on the target graph edge relationship;
and determining the updated prompt template as the prompt statement.
Specifically, after the prompt template is obtained, the target general prompt words included in the prompt template are determined. The number of target general prompt words is the same as the number of core parts of the target graph structure: the target graph structure comprises the target graph node attribute, the target graph edge attribute and the target graph edge relationship, and the target general prompt words correspondingly comprise a first general prompt word associated with the target graph node attribute, a second general prompt word associated with the target graph edge attribute, and a third general prompt word associated with the target graph edge relationship. The first general prompt word is then replaced by the target graph node attribute, the second general prompt word by the target graph edge attribute, and the third general prompt word by the target graph edge relationship. Together with the other prompt text in the prompt template, this yields the updated prompt template, that is, the prompt sentence.
For example, taking Neo4j as the target graph database, the prompt text in the prompt template is:
"This is the schema representation of the Neo4j database:
Node properties are the following: {nodes_props};
Relationship properties are the following: {relations_props};
Relationship point from source to target nodes: {relations};
Make sure to respect relationship types and directions."
Here the first general prompt word is "nodes_props", the second general prompt word is "relations_props", and the third general prompt word is "relations". The first general prompt word nodes_props is replaced with the target graph node attribute nodeData, the second general prompt word relations_props is replaced with the target graph edge attribute relationshipData, and the third general prompt word relations is replaced with the target graph edge relationship relationshipDetails, which gives the prompt sentence:
"This is the schema representation of the Neo4j database:
Node properties are the following: {nodeData};
Relationship properties are the following: {relationshipData};
Relationship point from source to target nodes: {relationshipDetails};
Make sure to respect relationship types and directions."
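The replacement of the three general prompt words can be sketched in a few lines of Python. This is only a minimal illustration under stated assumptions: the function name build_prompt_sentence and the sample schema strings are hypothetical and do not come from the embodiment, and Python's built-in str.format is used as one possible way to perform the substitution.

```python
# Prompt template from the example above; the three placeholders play the
# role of the first, second and third general prompt words.
PROMPT_TEMPLATE = (
    "This is the schema representation of the Neo4j database:\n"
    "Node properties are the following: {nodes_props};\n"
    "Relationship properties are the following: {relations_props};\n"
    "Relationship point from source to target nodes: {relations};\n"
    "Make sure to respect relationship types and directions."
)


def build_prompt_sentence(node_data: str, relationship_data: str,
                          relationship_details: str) -> str:
    """Replace each general prompt word with the matching part of the
    target graph structure (hypothetical helper, for illustration only)."""
    return PROMPT_TEMPLATE.format(
        nodes_props=node_data,
        relations_props=relationship_data,
        relations=relationship_details,
    )


# Hypothetical sample values for a small movie graph.
node_data = '[{"labels": "Person", "properties": ["name", "born"]}]'
relationship_data = '[{"type": "ACTED_IN", "properties": ["roles"]}]'
relationship_details = "(:Person)-[:DIRECTED]->(:Movie)"

prompt_sentence = build_prompt_sentence(
    node_data, relationship_data, relationship_details
)
```

Because str.format only parses the template and not the substituted values, braces inside the schema strings are inserted unchanged.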
Step 130: inputting the prompt sentence and the query question into a large language model, and outputting the query statement corresponding to the target graph database.
Specifically, after the prompt sentence is determined, it is input into the large language model together with the query question input by the user. From the prompt sentence, the large language model learns the target graph structure of the target graph database. It then performs semantic analysis on the query question, identifies the entities in the question and the relationships between them, and converts these into query elements of the target graph structure: entities are mapped onto nodes in the target graph structure, and relationships between entities are mapped onto relationships in the target graph structure. The query statement corresponding to the target graph database is generated from this mapping result.
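Step 130 might be sketched as follows. The embodiment does not prescribe any particular model interface, so the large language model is represented here by an arbitrary callable that takes a prompt string and returns text; the function generate_query_statement, the instruction wording, and the stand-in model fake_llm are all assumptions made for illustration. The sketch also strips a Markdown code fence from the model output, which is a common practical precaution rather than something the embodiment describes.

```python
from typing import Callable

FENCE = "`" * 3  # Markdown code fence marker


def generate_query_statement(prompt_sentence: str, question: str,
                             llm: Callable[[str], str]) -> str:
    """Combine the prompt sentence and the query question, call the model,
    and return the bare query statement (hypothetical helper)."""
    full_prompt = (
        f"{prompt_sentence}\n\n"
        "Write a single Cypher statement that answers the question below. "
        "Return only the statement.\n"
        f"Question: {question}"
    )
    raw = llm(full_prompt).strip()
    # Drop a surrounding Markdown code fence if the model added one.
    if raw.startswith(FENCE):
        lines = raw.splitlines()[1:]
        if lines and lines[-1].strip() == FENCE:
            lines = lines[:-1]
        raw = "\n".join(lines).strip()
    return raw


def fake_llm(prompt: str) -> str:
    """Stand-in for a real model; always returns the same fenced query."""
    return (
        FENCE + "cypher\n"
        "MATCH (d:Person {name: 'A'})-[:DIRECTED]->(m:Movie) RETURN m.title\n"
        + FENCE
    )


query_statement = generate_query_statement(
    "This is the schema representation of the Neo4j database: ...",
    "Which movies did director A direct?",
    fake_llm,
)
```

In practice fake_llm would be replaced by a call to whichever online or proprietary model is chosen.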
Optionally, the selection of the large language model is flexible and various, and may be any online large language model or proprietary large language model, for example, a religion or a starfire cognition large model, etc., and the embodiment of the present invention does not limit the type of the large language model.
Step 140: executing the query statement in the target graph database to determine the target query text.
Specifically, once the query statement corresponding to the query question has been determined, it can be executed in the target graph database to retrieve the results related to the query question, from which the target query text to be fed back to the user is generated. The user therefore needs no specialist knowledge of graph databases and can obtain the desired content from the graph database simply by asking a question in natural language.
Further, Fig. 3 is a third schematic flowchart of the graph database query method based on a large language model and a graph structure according to an embodiment of the present invention. As shown in Fig. 3, executing the query statement in the target graph database to determine the target query text includes:
Executing the query statement in the target graph database to obtain a first query result;
And inputting the first query result and the query question into the large language model to determine the target query text.
Specifically, after the large language model outputs the query statement, the query statement can be executed in the target graph database to obtain the first query result related to the query question. However, the first query result of the target graph database is usually presented in a structured or graphical form, which can be difficult to understand for non-professional users or users without a graph database background. Therefore, to make the result easier to understand and to meet the user's needs, once the first query result is determined, it is input into the large language model together with the corresponding query question. The large language model re-derives the user's intention from the query question, adjusts the first query result accordingly, and converts it into a more natural and more readable target query text, which is fed back to the user.
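The two stages described here, executing the query statement and then passing the first query result back to the model together with the question, can be sketched as below. The database and the model are both represented by plain callables so that the sketch stays independent of any driver or vendor; determine_target_query_text, run_query, and the prompt wording are hypothetical names and text chosen for illustration.

```python
import json
from typing import Callable, Dict, List

Record = Dict[str, object]


def determine_target_query_text(
    question: str,
    query_statement: str,
    run_query: Callable[[str], List[Record]],
    llm: Callable[[str], str],
) -> str:
    """Execute the query statement, then ask the model to turn the first
    query result into readable text (hypothetical helper)."""
    # Stage 1: execute the query statement in the target graph database.
    first_query_result = run_query(query_statement)

    # Stage 2: give the model both the question and the structured result.
    answer_prompt = (
        "A user asked the question below, and the graph database returned "
        "the records shown. Answer the question in plain language using "
        "only these records. If the question asks for a specific output "
        "format, use that format.\n"
        f"Question: {question}\n"
        f"Records: {json.dumps(first_query_result, ensure_ascii=False)}"
    )
    return llm(answer_prompt)
```

With a real deployment, run_query would wrap the graph database client and llm would wrap the chosen model; neither is specified by the embodiment.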
Further, the inputting the first query result and the query question into the large language model, determining the target query text includes:
performing semantic analysis on the query question to determine the semantic analysis result corresponding to the query question;
And if the semantic analysis result indicates that the target output format is included in the query question, performing format conversion on the first query result based on the target output format, and determining the target query text.
Specifically, after the first query result and the query question are input together into the large language model, the model performs semantic analysis on the query question and determines the corresponding semantic analysis result, that is, it works out the user's intention. If the semantic analysis result includes a target output format indicated by the user, the large language model converts the first query result into that format, thereby determining a target query text that is easy for the user to understand, which is then fed back to the user. Take as an example a query question that asks for the films and television series of director A and requests output in Markdown format: after the query question is input into the large language model, semantic analysis of the question determines that the target output format indicated by the user is Markdown, and the first query result is converted into Markdown format to obtain the corresponding target query text.
In addition, if the user does not indicate a specific target output format in the query question, the large language model may adjust the first query result and convert it into a more natural and easier-to-understand target query text.
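To make the format conversion concrete, the following sketch renders a first query result as a Markdown table. In the embodiment both the semantic analysis and the conversion are carried out by the large language model; the code below is a hand-written stand-in that only illustrates the expected input and output. The keyword check in wants_markdown is a deliberately crude substitute for semantic analysis, and both function names are hypothetical.

```python
from typing import Dict, List

Record = Dict[str, object]


def wants_markdown(question: str) -> bool:
    """Crude keyword stand-in for the model's semantic analysis."""
    return "markdown" in question.lower()


def to_markdown_table(records: List[Record]) -> str:
    """Render query records as a Markdown table, one row per record."""
    if not records:
        return "No results."
    columns = list(records[0].keys())
    header = "| " + " | ".join(columns) + " |"
    divider = "| " + " | ".join("---" for _ in columns) + " |"
    rows = [
        "| " + " | ".join(str(record.get(col, "")) for col in columns) + " |"
        for record in records
    ]
    return "\n".join([header, divider, *rows])


# Hypothetical first query result for the "director A" example.
first_query_result = [
    {"title": "Film X", "year": 2001},
    {"title": "Series Y", "year": 2010},
]
question = "List the films and TV series of director A in MarkDown format"
if wants_markdown(question):
    target_query_text = to_markdown_table(first_query_result)
else:
    target_query_text = str(first_query_result)
```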
It should be noted that the graph database query method based on a large language model and a graph structure provided by the embodiment of the present invention can also be applied to other types of database, such as relational databases; in that case only the graph structure needs to be replaced by the table structure, so the method has broad application prospects.
With the graph database query method based on a large language model and a graph structure provided by the embodiment of the present invention, after the query question input by the user and the target graph structure corresponding to the target graph database are obtained, the target graph node attribute, the target graph edge attribute, and the target graph edge relationship in the target graph structure are used to update the prompt template and obtain the prompt sentence. The prompt sentence and the query question are then input into the large language model, so that the query statement is generated by combining the graph database's capacity for handling complex structures with the large language model's natural language processing capacity, and executing the query statement in the target graph database yields a target query text that is easy for the user to understand. Throughout the query process, the construction of the query statement, which demands the most expertise, is performed by the large language model, and the query statement is executed automatically; the only steps involving the user are inputting the query question and receiving the target query text. This greatly lowers the expertise required of the user, improves the usability and accessibility of the graph database, and in turn improves its query efficiency. In addition, the method is flexible and highly extensible with respect to the choice of large language model and graph database.
The graph database query device based on the large language model and the graph structure provided by the invention is described below, and the graph database query device based on the large language model and the graph structure described below and the graph database query method based on the large language model and the graph structure described above can be correspondingly referred to each other.
The embodiment of the invention also provides a graph database query device based on a large language model and a graph structure, and fig. 4 is a schematic structural diagram of the graph database query device based on the large language model and the graph structure provided by the embodiment of the invention, as shown in fig. 4, the graph database query device 400 based on the large language model and the graph structure includes: an acquisition module 410, a determination module 420, an output module 430, and a query module 440, wherein:
An acquisition module 410, configured to obtain a target graph structure corresponding to the target graph database, and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
A determining module 420, configured to determine a hint statement based on a hint template, the target graph node attribute, the target graph edge attribute, and the target graph edge relationship;
An output module 430, configured to input the prompt sentence and the query question into a large language model, and output a query sentence corresponding to the target graph database;
and a query module 440, configured to execute the query sentence in the target graph database, and determine a target query text.
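The division into four modules can be pictured as a simple composition in which each module hands its output to the next. The sketch below is an assumption about how such a device might be wired together in software; the class GraphQueryDevice, its field names, and the choice to pass the user's question in as an argument to run are illustrative and not taken from the embodiment.

```python
from dataclasses import dataclass
from typing import Callable, Tuple

# (node attribute, edge attribute, edge relationship), each as text.
GraphStructure = Tuple[str, str, str]


@dataclass
class GraphQueryDevice:
    """Hypothetical software counterpart of device 400."""
    acquire: Callable[[], GraphStructure]       # acquisition module 410
    determine: Callable[[GraphStructure], str]  # determination module 420
    output: Callable[[str, str], str]           # output module 430
    query: Callable[[str, str], str]            # query module 440

    def run(self, question: str) -> str:
        # 410: obtain the target graph structure (the question is passed in).
        structure = self.acquire()
        # 420: determine the prompt sentence from the graph structure.
        prompt_sentence = self.determine(structure)
        # 430: let the model produce the query statement.
        query_statement = self.output(prompt_sentence, question)
        # 440: execute the statement and determine the target query text.
        return self.query(query_statement, question)
```

Keeping each module behind a plain callable makes it straightforward to swap the model or the database without touching the other modules.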
With the graph database query device based on a large language model and a graph structure provided by the embodiment of the present invention, after the query question input by the user and the target graph structure corresponding to the target graph database are obtained, the target graph node attribute, the target graph edge attribute, and the target graph edge relationship in the target graph structure are used to update the prompt template and obtain the prompt sentence. The prompt sentence and the query question are then input into the large language model, so that the query statement is generated by combining the graph database's capacity for handling complex structures with the large language model's natural language processing capacity, and executing the query statement in the target graph database yields a target query text that is easy for the user to understand. Throughout the query process, the construction of the query statement, which demands the most expertise, is performed by the large language model, and the query statement is executed automatically; the only steps involving the user are inputting the query question and receiving the target query text. This greatly lowers the expertise required of the user, improves the usability and accessibility of the graph database, and in turn improves its query efficiency.
Optionally, the acquiring module 410 is specifically configured to:
acquiring metadata in the target graph database; the metadata comprises element labels, element types, target types and attribute names corresponding to the nodes or the edges respectively;
Filtering to obtain target elements from all metadata based on the element types and the target types;
and determining a target graph structure corresponding to the target graph database based on the element label and the attribute name corresponding to the target element.
Optionally, the target element includes a first target node or a target edge.
Optionally, the acquiring module 410 is specifically configured to:
Determining a target graph node attribute in the target graph structure based on the element label and the attribute name corresponding to the first target node;
determining a target graph edge attribute in the target graph structure based on the element label and the attribute name corresponding to the target edge;
And determining a target graph edge relation in the target graph structure based on the element label and the attribute name corresponding to the target edge and information of a second target node related to the target edge.
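One way to picture these three determinations is to group attribute names by element label and to write each edge as a directed pattern between two node labels. The sketch below assumes that each filtered element is a dictionary with the keys element_label and attribute_name, and that edge elements additionally carry source_label and target_label taken from the related node information; these key names and both helper functions are hypothetical.

```python
from collections import defaultdict
from typing import Dict, List

Element = Dict[str, str]


def group_attributes(elements: List[Element]) -> Dict[str, List[str]]:
    """Collect attribute names under their element label. Applied to the
    first target nodes this gives the node attributes; applied to the
    target edges it gives the edge attributes."""
    grouped: Dict[str, List[str]] = defaultdict(list)
    for element in elements:
        grouped[element["element_label"]].append(element["attribute_name"])
    return dict(grouped)


def describe_relationships(edges: List[Element]) -> List[str]:
    """Express each edge as a directed pattern between two node labels."""
    return [
        f"(:{e['source_label']})-[:{e['element_label']}]->(:{e['target_label']})"
        for e in edges
    ]


# Hypothetical filtered elements for a small movie graph.
node_elements = [
    {"element_label": "Person", "attribute_name": "name"},
    {"element_label": "Person", "attribute_name": "born"},
    {"element_label": "Movie", "attribute_name": "title"},
]
edge_elements = [
    {"element_label": "DIRECTED", "attribute_name": "year",
     "source_label": "Person", "target_label": "Movie"},
]

node_attributes = group_attributes(node_elements)
edge_attributes = group_attributes(edge_elements)
edge_relationships = describe_relationships(edge_elements)
```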
Optionally, the acquiring module 410 is specifically configured to:
under the condition that the element type is a node and the target type is not a relation, filtering from all metadata to obtain a first target node;
and, in the case where the element type is a relationship and the target type is not a node, or the target type is a relationship, filtering all the metadata to obtain a target edge.
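The two filtering conditions can be written as simple predicates over metadata rows. The sketch assumes each row is a dictionary with the keys element_type and target_type whose values are the strings "node", "relationship", or a data type name; the key names, the value strings, and the grouping of the second condition as "(relationship and not node) or relationship" are all assumptions about how the conditions are meant to be read.

```python
from typing import Dict, List, Tuple

Row = Dict[str, str]


def is_first_target_node(row: Row) -> bool:
    """Element type is a node and the target type is not a relationship."""
    return row["element_type"] == "node" and row["target_type"] != "relationship"


def is_target_edge(row: Row) -> bool:
    """Element type is a relationship and the target type is not a node,
    or the target type is a relationship."""
    return (
        row["element_type"] == "relationship" and row["target_type"] != "node"
    ) or row["target_type"] == "relationship"


def filter_target_elements(metadata: List[Row]) -> Tuple[List[Row], List[Row]]:
    """Split all metadata rows into first target nodes and target edges."""
    nodes = [row for row in metadata if is_first_target_node(row)]
    edges = [row for row in metadata if is_target_edge(row)]
    return nodes, edges


# Hypothetical metadata rows.
metadata = [
    {"element_label": "Person", "element_type": "node",
     "target_type": "string", "attribute_name": "name"},
    {"element_label": "ACTED_IN", "element_type": "relationship",
     "target_type": "list", "attribute_name": "roles"},
    {"element_label": "Person", "element_type": "node",
     "target_type": "relationship", "attribute_name": "DIRECTED"},
]
first_target_nodes, target_edges = filter_target_elements(metadata)
```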
Optionally, the determining module 420 is specifically configured to:
Determining target general prompt words in the prompt template; the target general prompt words comprise a first general prompt word, a second general prompt word and a third general prompt word;
updating the first general prompt word based on the target graph node attribute; updating the second general prompt word based on the target graph edge attribute; updating the third general prompt word based on the target graph edge relationship;
and determining the updated prompt template as the prompt statement.
Optionally, the query module 440 is specifically configured to:
Executing the query statement in the target graph database to obtain a first query result;
And inputting the first query result and the query question into the large language model to determine the target query text.
Optionally, the query module 440 is specifically configured to:
performing semantic analysis on the query question to determine the semantic analysis result corresponding to the query question;
And if the semantic analysis result indicates that the target output format is included in the query question, performing format conversion on the first query result based on the target output format, and determining the target query text.
Fig. 5 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, as shown in fig. 5, the electronic device may include: processor 510, communication interface (Communications Interface) 520, memory 530, and communication bus 540, wherein processor 510, communication interface 520, memory 530 complete communication with each other through communication bus 540. Processor 510 may invoke logic instructions in memory 530 to perform a graph database query method based on a large language model and graph structure, the method comprising:
Obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship;
inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database;
and executing the query statement in the target graph database to determine target query text.
Further, the logic instructions in the memory 530 described above may be implemented in the form of software functional units and, when sold or used as a stand-alone product, may be stored in a computer-readable storage medium. Based on this understanding, the technical solution of the present invention, in essence, or the part of it that contributes to the prior art, or a part of the technical solution, may be embodied in the form of a software product stored in a storage medium and comprising several instructions for causing a computer device (which may be a personal computer, a server, a network device, etc.) to perform all or part of the steps of the method according to the embodiments of the present invention. The aforementioned storage medium includes a USB flash drive, a removable hard disk, a Read-Only Memory (ROM), a Random Access Memory (RAM), a magnetic disk, an optical disk, or any other medium capable of storing program code.
In another aspect, the present invention also provides a computer program product, where the computer program product includes a computer program, where the computer program can be stored on a non-transitory computer readable storage medium, where the computer program when executed by a processor can perform a graph database query method based on a large language model and a graph structure provided by the above methods, and the method includes:
Obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship;
inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database;
and executing the query statement in the target graph database to determine target query text.
In yet another aspect, the present invention also provides a non-transitory computer readable storage medium having stored thereon a computer program which, when executed by a processor, is implemented to perform the graph database query method based on a large language model and a graph structure provided by the above methods, the method comprising:
Obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship;
inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database;
and executing the query statement in the target graph database to determine target query text.
The apparatus embodiments described above are merely illustrative. The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network units. Some or all of the modules may be selected according to actual needs to achieve the purpose of the solution of this embodiment. Those of ordinary skill in the art can understand and implement the invention without creative effort.
From the above description of the embodiments, it will be apparent to those skilled in the art that the embodiments may be implemented by means of software plus necessary general hardware platforms, or of course may be implemented by means of hardware. Based on this understanding, the foregoing technical solution may be embodied essentially or in a part contributing to the prior art in the form of a software product, which may be stored in a computer readable storage medium, such as ROM/RAM, a magnetic disk, an optical disk, etc., including several instructions for causing a computer device (which may be a personal computer, a server, or a network device, etc.) to execute the method described in the respective embodiments or some parts of the embodiments.
Finally, it should be noted that: the above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention.
Claims (10)
1. A graph database query method based on a large language model and a graph structure, comprising:
Obtaining a target graph structure corresponding to a target graph database and a query question input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relationship;
inputting the prompt sentences and the query questions into a large language model, and outputting query sentences corresponding to the target graph database;
Executing the query statement in the target graph database to determine a target query text;
In the target graph structure, the target graph node attribute comprises a plurality of attribute information related to nodes, the target graph edge attribute comprises a plurality of attribute information related to edges, and the target graph edge relationship is used for representing the connection relationship and the directivity between the nodes;
The large language model is used for carrying out semantic analysis on the query problem, identifying the entity and the relation between the entities in the query problem, converting the relation between the entities into query elements in the target graph structure, determining a mapping result, and generating a query sentence corresponding to the target graph database according to the mapping result, wherein the mapping result is used for representing the mapping relation between the entities and nodes in the target graph structure and the mapping relation between the entities and the relation in the target graph structure.
2. The graph database query method based on the large language model and the graph structure according to claim 1, wherein the obtaining the target graph structure corresponding to the target graph database includes:
acquiring metadata in the target graph database; the metadata comprises element labels, element types, target types and attribute names corresponding to the nodes or the edges respectively;
Filtering to obtain target elements from all metadata based on the element types and the target types;
and determining a target graph structure corresponding to the target graph database based on the element label and the attribute name corresponding to the target element.
3. The graph database query method based on a large language model and graph structure of claim 2, wherein the target element comprises a first target node or a target edge;
The determining, based on the element tag and the attribute name corresponding to the target element, a target graph structure corresponding to the target graph database includes:
Determining a target graph node attribute in the target graph structure based on the element label and the attribute name corresponding to the first target node;
determining a target graph edge attribute in the target graph structure based on the element label and the attribute name corresponding to the target edge;
And determining a target graph edge relation in the target graph structure based on the element label and the attribute name corresponding to the target edge and information of a second target node related to the target edge.
4. The method for querying a graph database based on a large language model and a graph structure according to claim 2, wherein filtering the target element from all metadata based on the element type and the target type comprises:
under the condition that the element type is a node and the target type is not a relation, filtering from all metadata to obtain a first target node;
and, in the case where the element type is a relationship and the target type is not a node, or the target type is a relationship, filtering all the metadata to obtain a target edge.
5. The method of query of a graph database based on a large language model and graph structure of any one of claims 1-4, wherein the determining a hint statement based on a hint template, the target graph node attribute, the target graph edge attribute, and the target graph edge relationship includes:
Determining target general prompt words in the prompt template; the target general prompt words comprise a first general prompt word, a second general prompt word and a third general prompt word;
updating the first general prompt word based on the target graph node attribute; updating the second general prompt word based on the target graph edge attribute; updating the third general prompt word based on the target graph edge relationship;
and determining the updated prompt template as the prompt statement.
6. The method for querying a graph database based on a large language model and graph structure as claimed in any one of claims 1-4, wherein executing the query statement in the target graph database to determine target query text comprises:
Executing the query statement in the target graph database to obtain a first query result;
And inputting the first query result and the query question into the large language model to determine the target query text.
7. The method of claim 6, wherein said entering the first query result and the query question into the large language model, determining the target query text, comprises:
carrying out semantic analysis on the query problem and determining a semantic analysis result corresponding to the query problem;
And if the semantic analysis result indicates that the target output format is included in the query question, performing format conversion on the first query result based on the target output format, and determining the target query text.
8. A graph database query device based on a large language model and a graph structure, comprising:
the acquisition module is used for acquiring a target graph structure corresponding to the target graph database and a query problem input by a user; the target graph structure comprises a target graph node attribute, a target graph edge attribute and a target graph edge relationship;
The determining module is used for determining a prompt sentence based on a prompt template, the target graph node attribute, the target graph edge attribute and the target graph edge relation;
the output module is used for inputting the prompt sentences and the query questions into a large language model and outputting query sentences corresponding to the target graph database;
The query module is used for executing the query statement in the target graph database and determining a target query text;
In the target graph structure, the target graph node attribute comprises a plurality of attribute information related to nodes, the target graph edge attribute comprises a plurality of attribute information related to edges, and the target graph edge relationship is used for representing the connection relationship and the directivity between the nodes;
The large language model is used for carrying out semantic analysis on the query problem, identifying the entity and the relation between the entities in the query problem, converting the relation between the entities into query elements in the target graph structure, determining a mapping result, and generating a query sentence corresponding to the target graph database according to the mapping result, wherein the mapping result is used for representing the mapping relation between the entities and nodes in the target graph structure and the mapping relation between the entities and the relation in the target graph structure.
9. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, wherein the processor implements the graph database query method based on a large language model and graph structure as claimed in any one of claims 1-7 when executing the program.
10. A non-transitory computer readable storage medium having stored thereon a computer program, wherein the computer program when executed by a processor implements the graph database query method based on a large language model and graph structure as claimed in any one of claims 1 to 7.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410458999.8A CN118093632B (en) | 2024-04-17 | 2024-04-17 | Graph database query method and device based on large language model and graph structure |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202410458999.8A CN118093632B (en) | 2024-04-17 | 2024-04-17 | Graph database query method and device based on large language model and graph structure |
Publications (2)
Publication Number | Publication Date |
---|---|
CN118093632A CN118093632A (en) | 2024-05-28 |
CN118093632B true CN118093632B (en) | 2024-08-30 |
Family
ID=91165366
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202410458999.8A Active CN118093632B (en) | 2024-04-17 | 2024-04-17 | Graph database query method and device based on large language model and graph structure |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN118093632B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN118332097B (en) * | 2024-06-12 | 2024-09-24 | 浙江口碑网络技术有限公司 | Information interaction method and device |
CN118779315A (en) * | 2024-06-14 | 2024-10-15 | 中国铁道科学研究院集团有限公司 | Database query method, device, electronic device and storage medium |
CN118733799B (en) * | 2024-06-14 | 2025-06-27 | 北京中科睿途科技有限公司 | Large model multi-mode output display method and device |
CN119623485B (en) * | 2025-02-13 | 2025-06-24 | 商飞智能技术有限公司 | User problem handling method and device |
Citations (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN117391192A (en) * | 2023-12-08 | 2024-01-12 | 杭州悦数科技有限公司 | Method and device for constructing knowledge graph from PDF by using LLM based on graph database |
Family Cites Families (6)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN115438070A (en) * | 2022-09-26 | 2022-12-06 | 支付宝(杭州)信息技术有限公司 | Method and device for automatically completing query sentence aiming at graph database |
CN117112590A (en) * | 2023-05-10 | 2023-11-24 | 深圳华为云计算技术有限公司 | Method for generating structural query language and data query equipment |
CN116340584B (en) * | 2023-05-24 | 2023-08-11 | 杭州悦数科技有限公司 | Implementation method for automatically generating complex graph database query statement service |
CN117493379A (en) * | 2023-11-09 | 2024-02-02 | 数据空间研究院 | Natural language-to-SQL interactive generation method based on large language model |
CN117371973A (en) * | 2023-12-06 | 2024-01-09 | 武汉科技大学 | Knowledge-graph-retrieval-based enhanced language model graduation service system |
CN117708161A (en) * | 2023-12-14 | 2024-03-15 | 以萨技术股份有限公司 | Data query method for converting natural language into SQL based on large language model |
Also Published As
Publication number | Publication date |
---|---|
CN118093632A (en) | 2024-05-28 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
CN118093632B (en) | | Graph database query method and device based on large language model and graph structure |
CN111753099B (en) | | Method and system for enhancing relevance of archive entity based on knowledge graph |
US9971967B2 (en) | | Generating a superset of question/answer action paths based on dynamically generated type sets |
CN107391677B (en) | | Method and device for generating Chinese general knowledge graph with entity relation attributes |
US9785725B2 (en) | | Method and system for visualizing relational data as RDF graphs with interactive response time |
US11334549B2 (en) | | Semantic, single-column identifiers for data entries |
CN111984745B (en) | | Database field dynamic expansion method, device, equipment and storage medium |
KR102345410B1 (en) | | Big data intelligent collecting method and device |
CN114880483A (en) | | A metadata knowledge graph construction method, storage medium and system |
CN113779349A (en) | | Data retrieval system, apparatus, electronic device, and readable storage medium |
JP7493195B1 (en) | | Program, method, information processing device, and system |
CN114090760A (en) | | Data processing method, electronic device and readable storage medium for form question answering |
CN118035405A (en) | | Knowledge base question-answering construction method and device based on large model |
CN117608652A (en) | | A SQL statement translation method based on high-level abstract syntax tree |
CN108733638B (en) | | The Structure Method of WORD Manuscript and the Structure Device of WORD Manuscript |
CN115878814A (en) | | Knowledge graph question-answering method and system based on machine reading understanding |
CN119557406A (en) | | A method, device, equipment and medium for answering user questions |
CN112214494B (en) | | Retrieval method and device |
CN113434658A (en) | | Thermal power generating unit operation question-answer generation method, system, equipment and readable storage medium |
CN112559550A (en) | | Multi-data-source NL2SQL system based on semantic rules and multi-dimensional model |
CN111143329B (en) | | Data processing method and device |
CN118210809A (en) | | Object definition method, system, equipment and medium based on ER information |
CN113127617A (en) | | Knowledge question answering method of general domain knowledge graph, terminal equipment and storage medium |
US9881055B1 (en) | | Language conversion based on S-expression tabular structure |
CN117951272A (en) | | Document generation method, system and medium based on large language model |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||