
CN112380238B - Database data query method and device, electronic equipment and storage medium - Google Patents


Info

Publication number
CN112380238B
Authority
CN
China
Prior art keywords
sql
keyword
column names
data
name
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202011281644.4A
Other languages
Chinese (zh)
Other versions
CN112380238A (en
Inventor
赵亮
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Technology Shenzhen Co Ltd
Original Assignee
Ping An Technology Shenzhen Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Technology Shenzhen Co Ltd
Priority to CN202011281644.4A (CN112380238B)
Publication of CN112380238A
Priority to PCT/CN2021/097071 (WO2022100067A1)
Application granted
Publication of CN112380238B
Active (current legal status)
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/20 Information retrieval; Database structures therefor; File system structures therefor of structured data, e.g. relational data
    • G06F16/24 Querying
    • G06F16/242 Query formulation
    • G06F16/2433 Query languages
    • G06F16/2438 Embedded query languages
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Data Mining & Analysis (AREA)
  • Databases & Information Systems (AREA)
  • Biomedical Technology (AREA)
  • General Health & Medical Sciences (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Artificial Intelligence (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The invention relates to data processing technology and provides a database data query method, a database data query device, electronic equipment and a storage medium. The method comprises: performing word segmentation on a query sentence to obtain target word segments; inputting the target word segments into a neural network model, which outputs an SQL sentence frame comprising a first keyword, a second keyword and a placeholder; generating a plurality of SQL column names according to the column names and the table name of each data table in a preset database; inputting the target word segments and the SQL column names into the pointer network corresponding to each first keyword to obtain the SQL column name, and from it the data table column name, of each first keyword; finding a target data table in the database according to the obtained SQL column names; adding the table name of the target data table at the corresponding position in the SQL sentence frame; and substituting the obtained column names for the corresponding placeholders to obtain the SQL sentence. The invention also relates to blockchain technology: the data related to the interface layer and the target data corresponding to the request can be stored in a blockchain node.

Description

Database data query method and device, electronic equipment and storage medium
Technical Field
The present invention relates to data processing technologies, and in particular, to a database data query method, a database data query device, an electronic device, and a storage medium.
Background
At present, NL2SQL (Natural Language to SQL) is a technology for converting a user's natural-language sentence into an executable SQL (Structured Query Language) statement, and it is of great significance for improving the way users interact with databases. Databases store large amounts of data generated by production and daily life, and querying and analysing that data has become an important means of extracting value from data in the big data era. However, querying a database requires a solid knowledge of SQL, which raises the threshold for data analysis. The goal of NL2SQL is to lower that threshold: the user describes the data to be retrieved in natural language, an AI algorithm converts the natural language into a correct SQL statement, the statement is run against the database, and the query result is returned to the user. Because NL2SQL greatly facilitates database querying and analysis, it has received extensive attention from industry, and many data providers treat it as one of their core competencies.
However, to improve the accuracy with which intelligent products parse natural language, existing NL2SQL technology restricts and guides the user's input, so that the user is forced to query with fixed sentence patterns.
Disclosure of Invention
In view of the above, the present invention provides a database data query method, a database data query device, an electronic device, and a storage medium, which aim to solve the technical problems that current approaches impose many usage restrictions, are error-prone, and have poor fault tolerance.
In order to achieve the above object, the present invention provides a database data query method, which includes:
receiving a query sentence input by a user, and performing word segmentation on the received query sentence to obtain one or more target word segments; inputting the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL sentence frame, the SQL sentence frame comprising at least one first keyword, at least one second keyword and at least one placeholder;
generating, according to the column names of each data table in a preset database and the table name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
obtaining, from the SQL column name corresponding to each first keyword, the data table column name corresponding to that first keyword; finding, in the preset database, a data table containing one or more of the data table column names according to the SQL column names corresponding to the first keywords; determining a target table name from the data table found; adding the determined target table name to the SQL sentence frame at the position corresponding to the second keyword; and replacing the placeholders of each first keyword in the SQL sentence frame with the data table column names corresponding to that first keyword to obtain the SQL sentence.
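The final substitution step can be illustrated with a minimal Python sketch. It assumes that the frame is a space-separated string, that the placeholder is written col, that the second keyword is from, and that each first keyword already has its selected SQL column names in table.column form. The function name fill_frame and its arguments are hypothetical and are not taken from the patent; separators between columns and the comparison value after an operator are outside what this sketch covers.

```python
from typing import Dict, List, Optional


def fill_frame(frame: str, columns: Dict[str, List[str]], table_name: str) -> str:
    """Fill an SQL sentence frame (hypothetical helper, illustration only).

    frame      -- space-separated frame such as "select col col from where col >"
    columns    -- SQL column names ("table.column") selected for each first keyword
    table_name -- target table name to place after the second keyword "from"
    """
    remaining = {keyword: list(names) for keyword, names in columns.items()}
    current: Optional[str] = None  # first keyword that governs the next placeholder
    output: List[str] = []
    for token in frame.split():
        if token == "from":
            # Second keyword: the target table name is added at this position.
            output += [token, table_name]
            current = None
        elif token == "col":
            # Placeholder: take the next SQL column name of the governing keyword
            # and keep only the data table column name (the part after the dot).
            sql_column_name = remaining[current].pop(0)
            output.append(sql_column_name.split(".", 1)[1])
        else:
            if token in remaining:
                current = token
            output.append(token)
    return " ".join(output)


filled = fill_frame(
    "select col col from where col >",
    {"select": ["A.name", "A.sex"], "where": ["A.age"]},
    "A",
)
print(filled)  # select name sex from A where age >
```

The sketch keeps the frame tokens in their original order, so the result inherits whatever grammar the frame itself encodes.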
In one embodiment, inputting the obtained target word segments into the pre-built deep neural network model of a preset type to output a corresponding SQL sentence frame includes:
inputting the obtained target word segments into an embedding layer of the pre-built deep neural network model of the preset type to obtain a word vector for each target word segment; inputting the word vectors of the target word segments into an encoding layer of the model to obtain encoded word vectors; and inputting the encoded word vectors into a decoding layer of the model, which outputs the SQL sentence frame.
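The data flow through the embedding, encoding and decoding layers can be sketched in pure Python. This is only an illustration of the flow under loud assumptions: the weights are random and untrained, so the emitted frame is arbitrary; the output vocabulary FRAME_VOCAB, the start and end symbols, the layer sizes and the greedy decoding loop are all hypothetical choices and are not specified by the patent.

```python
import math
import random
from typing import List, Tuple

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


class LSTMCell:
    """A single LSTM cell with randomly initialised (untrained) weights."""

    def __init__(self, input_size: int, hidden_size: int, rng: random.Random):
        width = input_size + hidden_size
        # One weight matrix and bias vector per gate:
        # i = input gate, f = forget gate, g = candidate, o = output gate.
        self.weights = {gate: [[rng.uniform(-0.1, 0.1) for _ in range(width)]
                               for _ in range(hidden_size)] for gate in "ifgo"}
        self.biases = {gate: [0.0] * hidden_size for gate in "ifgo"}

    def _linear(self, gate: str, z: Vector) -> Vector:
        return [sum(w * v for w, v in zip(row, z)) + b
                for row, b in zip(self.weights[gate], self.biases[gate])]

    def step(self, x: Vector, h: Vector, c: Vector) -> Tuple[Vector, Vector]:
        z = x + h  # concatenate the input with the previous hidden state
        i = [sigmoid(v) for v in self._linear("i", z)]
        f = [sigmoid(v) for v in self._linear("f", z)]
        g = [math.tanh(v) for v in self._linear("g", z)]
        o = [sigmoid(v) for v in self._linear("o", z)]
        c_new = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
        h_new = [oj * math.tanh(cj) for oj, cj in zip(o, c_new)]
        return h_new, c_new


# Hypothetical output vocabulary for the SQL sentence frame.
FRAME_VOCAB = ["select", "from", "where", "col", ">", "<eos>"]


def predict_frame(tokens: List[str], embed_size: int = 8, hidden_size: int = 8,
                  max_len: int = 10, seed: int = 0) -> List[str]:
    rng = random.Random(seed)

    def new_vector(n: int) -> Vector:
        return [rng.uniform(-0.1, 0.1) for _ in range(n)]

    # Embedding layer: one vector per input token and per output symbol.
    embed = {t: new_vector(embed_size) for t in list(tokens) + FRAME_VOCAB + ["<sos>"]}
    encoder = LSTMCell(embed_size, hidden_size, rng)   # encoding layer (first LSTM)
    decoder = LSTMCell(embed_size, hidden_size, rng)   # decoding layer (second LSTM)
    projection = [new_vector(hidden_size) for _ in FRAME_VOCAB]

    h, c = [0.0] * hidden_size, [0.0] * hidden_size
    for token in tokens:
        h, c = encoder.step(embed[token], h, c)

    frame: List[str] = []
    previous = "<sos>"
    for _ in range(max_len):  # greedy decoding, starting from the encoder state
        h, c = decoder.step(embed[previous], h, c)
        logits = [sum(w * v for w, v in zip(row, h)) for row in projection]
        previous = FRAME_VOCAB[logits.index(max(logits))]
        if previous == "<eos>":
            break
        frame.append(previous)
    return frame
```

A trained model would learn the embedding, gate and projection weights from pairs of query sentences and frames; the sketch only shows how the three layers hand their outputs to one another.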
In one embodiment, the first keyword comprises a first sub-keyword and/or a second sub-keyword and/or an operation vector symbol; the first sub-keyword corresponds to an operation vector symbol; the second sub-keyword does not correspond to any operation vector symbol.
In one embodiment, a first sub-keyword and the operation vector symbol corresponding to it share the same preset pointer network;
before all the target word segments and all the SQL column names are input as input data into the preset pointer network corresponding to each first keyword, the method comprises: inputting the operation vector symbol corresponding to each first sub-keyword, together with each target word segment, as input data into the preset pointer network corresponding to that first sub-keyword to obtain the calculated target word segments corresponding to each first sub-keyword;
the step of inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword comprises:
inputting all the calculated target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first sub-keyword to obtain the similarity between each calculated target word segment corresponding to that first sub-keyword and each SQL column name, and selecting the one or more SQL column names with the highest similarity as the SQL column names corresponding to that first sub-keyword.
In one embodiment, the step of inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword includes:
inputting each target word segment into the preset pointer network corresponding to each second sub-keyword to obtain the calculated target word segments corresponding to that second sub-keyword; and inputting all the calculated target word segments and all the SQL column names as input data into the preset pointer network corresponding to each second sub-keyword to obtain the similarity between each calculated target word segment corresponding to that second sub-keyword and each SQL column name, and selecting the one or more SQL column names with the highest similarity as the SQL column names corresponding to that second sub-keyword.
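The selection of the SQL column names with the highest similarity can be sketched as follows. This is a simplified stand-in and not the trained pointer network of the patent: it assumes that the calculated target word segments and the SQL column names are already available as vectors, scores each column by its best cosine similarity to any segment vector, and keeps the top k columns, where k would be the number of placeholders of the keyword. The toy vectors and the function names are hypothetical.

```python
import math
from typing import Dict, List, Sequence


def cosine(u: Sequence[float], v: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector is all zeros."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def select_columns(segment_vectors: List[Sequence[float]],
                   column_vectors: Dict[str, Sequence[float]],
                   k: int) -> List[str]:
    """Return the k SQL column names most similar to the segment vectors."""
    # Assumption: a column's score is its best match against any segment vector.
    scores = {name: max(cosine(segment, vector) for segment in segment_vectors)
              for name, vector in column_vectors.items()}
    ranked = sorted(scores, key=lambda name: scores[name], reverse=True)
    return ranked[:k]


# Toy vectors, chosen by hand for illustration only.
columns = {
    "A.name": [0.9, 0.1, 0.0],
    "A.sex": [0.1, 0.9, 0.0],
    "A.age": [0.0, 0.1, 0.9],
}
select_segments = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]  # stand-ins for "name", "sex"
where_segments = [[0.0, 0.0, 1.0]]                    # stand-in for "older"

print(select_columns(select_segments, columns, k=2))
print(select_columns(where_segments, columns, k=1))
```

Because each keyword has its own network in the described method, a fuller sketch would hold one scoring function per keyword while feeding all of them the same column vectors.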
In one embodiment, finding, in the preset database, a data table containing one or more of the data table column names according to the SQL column name corresponding to each first keyword, and determining the target table name from the data table found, includes:
searching the preset database for a data table that corresponds to the SQL column names of all the first keywords;
if such a data table is found, determining that it is the target data table and that its name is the target table name;
if no such data table is found, finding in the preset database, according to the SQL column name corresponding to each first keyword, the intermediate data table corresponding to that first keyword, and connecting the names of the intermediate data tables found according to a predetermined name connection algorithm to obtain the target table name.
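The two cases above can be sketched in Python. The schema is assumed to be a mapping from table name to column names, and the SQL column names are assumed to be in table.column form. The patent does not specify the predetermined name connection algorithm, so the comma-separated join used here is purely an assumption, as are the function name resolve_target_table and the example table B.

```python
from typing import Dict, List


def resolve_target_table(schema: Dict[str, List[str]],
                         sql_column_names: List[str],
                         joiner: str = ", ") -> str:
    """Determine the target table name for the selected SQL column names.

    The joiner is an assumed stand-in for the unspecified
    "predetermined name connection algorithm".
    """
    tables: List[str] = []
    for sql_column_name in sql_column_names:
        table, column = sql_column_name.split(".", 1)
        # Check the column really exists in that table of the preset database.
        if column not in schema.get(table, []):
            raise KeyError(f"unknown SQL column name: {sql_column_name}")
        if table not in tables:
            tables.append(table)
    if len(tables) == 1:
        # Case 1: one data table covers the columns of all first keywords.
        return tables[0]
    # Case 2: connect the names of the intermediate data tables.
    return joiner.join(tables)


schema = {"A": ["name", "sex", "age"], "B": ["id", "score"]}
print(resolve_target_table(schema, ["A.name", "A.sex", "A.age"]))  # A
print(resolve_target_table(schema, ["A.name", "B.score"]))         # A, B
```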
In one embodiment, the number of the SQL column names corresponding to the first keyword may be one or more, and the number of the SQL column names is equal to the number of the placeholders.
In order to achieve the above object, the present invention further provides a database data query apparatus, the apparatus comprising:
a receiving module, used for receiving a query sentence input by a user, and performing word segmentation on the received query sentence to obtain one or more target word segments; inputting the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL sentence frame, the SQL sentence frame comprising at least one first keyword, at least one second keyword and at least one placeholder;
a generating module, used for generating, according to the column names of each data table in a preset database and the table name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; and inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
a replacing module, used for obtaining, from the SQL column name corresponding to each first keyword, the data table column name corresponding to that first keyword; finding, in the preset database, a data table containing one or more of the data table column names according to the SQL column names corresponding to the first keywords; determining a target table name from the data table found; adding the determined target table name to the SQL sentence frame at the position corresponding to the second keyword; and replacing the placeholders of each first keyword in the SQL sentence frame with the data table column names corresponding to that first keyword to obtain the SQL sentence.
To achieve the above object, the present invention also provides an electronic device including:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the database data query method as described above.
To achieve the above object, the present invention also provides a computer-readable storage medium having stored therein a database data query program which, when executed by a processor, implements the steps of the database data query method as described above.
According to the database data query method, the database data query device, the electronic equipment and the storage medium of the present invention, a query sentence input by a user is received and word segmentation is performed on it to obtain one or more target word segments; the obtained target word segments are input into a pre-built deep neural network model of a preset type to output a corresponding SQL sentence frame, which comprises at least one first keyword, at least one second keyword and at least one placeholder; a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in a preset database are generated according to the column names of each data table in the preset database and the table name of the data table to which each column name belongs; all the target word segments and all the SQL column names are input as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword; the data table column name corresponding to each first keyword is obtained from its SQL column name; a data table containing one or more of the data table column names is found in the preset database according to the SQL column names corresponding to the first keywords; a target table name is determined from the data table found and added to the SQL sentence frame at the position corresponding to the second keyword; and the placeholders of each first keyword in the SQL sentence frame are replaced with the data table column names corresponding to that first keyword to obtain the SQL sentence. The invention thus allows the user to enter the query sentence in ordinary language, with fewer usage restrictions and better fault tolerance.
Drawings
FIG. 1 is a schematic diagram of an electronic device according to a preferred embodiment of the present invention;
FIG. 2 is a block diagram of a database data query device according to a preferred embodiment of the present invention;
FIG. 3 is a flowchart of a database data query method according to a preferred embodiment of the present invention;
The achievement of the objects, functional features and advantages of the present invention will be further described with reference to the accompanying drawings, in conjunction with the embodiments.
Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
Referring to fig. 1, a schematic diagram of a preferred embodiment of an electronic device 1 according to the present invention is shown.
The electronic device 1 includes, but is not limited to: memory 11, processor 12, display 13, and network interface 14. The electronic device 1 is connected to a network through the network interface 14 to obtain the original data. The network may be a wireless or wired network such as an intranet, the Internet, a Global System for Mobile Communications (GSM) network, a Wideband Code Division Multiple Access (WCDMA) network, a 4G network, a 5G network, Bluetooth, Wi-Fi, or a telephone network.
The memory 11 includes at least one type of readable storage medium, including flash memory, a hard disk, a multimedia card, card-type memory (e.g., SD or DX memory), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, a magnetic disk, an optical disk, etc. In some embodiments, the memory 11 may be an internal storage unit of the electronic device 1, such as a hard disk or internal memory of the electronic device 1. In other embodiments, the memory 11 may also be an external storage device of the electronic device 1, such as a plug-in hard disk, a Smart Media Card (SMC), a Secure Digital (SD) card, or a flash card provided on the electronic device 1. Of course, the memory 11 may also comprise both an internal storage unit of the electronic device 1 and an external storage device. In this embodiment, the memory 11 is typically used to store the operating system and the various application software installed on the electronic device 1, such as the program code of the database data query program 10. Further, the memory 11 may be used to temporarily store various types of data that have been output or are to be output.
In some embodiments, the processor 12 may be a central processing unit (CPU), a controller, a microcontroller, a microprocessor, or another data processing chip. The processor 12 is typically used to control the overall operation of the electronic device 1, for example to perform control and processing related to data interaction or communication. In this embodiment, the processor 12 is configured to run the program code stored in the memory 11 or to process data, for example to run the program code of the database data query program 10.
The display 13 may also be referred to as a display screen or a display unit. In some embodiments, the display 13 may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display 13 is used for displaying information processed in the electronic device 1 and for displaying a visual work interface, for example displaying the results of data statistics.
The network interface 14 may optionally comprise a standard wired interface and a wireless interface (such as a Wi-Fi interface); the network interface 14 is typically used for establishing a communication connection between the electronic device 1 and other electronic devices.
Fig. 1 shows only the electronic device 1 and the cloud database 2 with components 11-14 and the database data query program 10, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may alternatively be implemented.
Optionally, the electronic device 1 may further comprise a user interface, which may comprise a display, an input unit such as a keyboard, a standard wired interface, and a wireless interface. In some embodiments, the display may be an LED display, a liquid crystal display, a touch-sensitive liquid crystal display, an organic light-emitting diode (OLED) touch device, or the like. The display may also be referred to as a display screen or display unit, and is used for displaying information processed in the electronic device 1 and for displaying a visual user interface.
The electronic device 1 may further comprise Radio Frequency (RF) circuits, sensors and audio circuits etc., which are not described here.
In the above embodiment, the processor 12 may implement the following steps when executing the database data query program 10 stored in the memory 11:
receiving a query sentence input by a user, and performing word segmentation on the received query sentence to obtain one or more target word segments; inputting the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL sentence frame, the SQL sentence frame comprising at least one first keyword, at least one second keyword and at least one placeholder;
generating, according to the column names of each data table in a preset database and the table name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
obtaining, from the SQL column name corresponding to each first keyword, the data table column name corresponding to that first keyword; finding, in the preset database, a data table containing one or more of the data table column names according to the SQL column names corresponding to the first keywords; determining a target table name from the data table found; adding the determined target table name to the SQL sentence frame at the position corresponding to the second keyword; and replacing the placeholders of each first keyword in the SQL sentence frame with the data table column names corresponding to that first keyword to obtain the SQL sentence.
For a detailed description of the above steps, refer to the functional block diagram of an embodiment of the database data query device 100 in fig. 2 and the flowchart of an embodiment of the database data query method in fig. 3, both described below.
Referring to fig. 2, a functional block diagram of a database data query device 100 according to the present invention is shown.
The database data query apparatus 100 of the present invention may be installed in an electronic device. Depending on the functions implemented, the database data querying device 100 may include a receiving module 110, a generating module 120, and a replacing module 130. The modules in the present invention, which may also be referred to as units, refer to a series of computer program segments, which are stored in the memory of the electronic device, capable of being executed by the processor of the electronic device and of performing a fixed function.
In the present embodiment, the functions concerning the respective modules/units are as follows:
The receiving module 110 is configured to receive a query sentence input by a user, and perform word segmentation processing on the received query sentence to obtain one or more target word segments; inputting the obtained target word segmentation into a pre-built preset type deep neural network model to output a corresponding SQL sentence frame; the SQL statement frame includes at least one first keyword, at least one second keyword, and at least one placeholder.
In this embodiment, the target word segments are input into a pre-built deep neural network model of a preset type to obtain a corresponding Structured Query Language (SQL) sentence frame. Specifically, the target word segments of the original query sentence are input into the model; its embedding layer produces a word vector for each target word segment; the word vectors are input into the encoding layer (namely a first LSTM layer) to obtain encoded word vectors; and the encoded word vectors are input into the decoding layer (namely a second LSTM layer), which outputs the SQL sentence frame. The pre-built deep neural network model of the preset type is a long short-term memory (LSTM) model.
The first keyword and the second keyword are words that are fixed in an SQL statement (their value range is fixed). The first keyword is used to connect column names and may include words such as select, where, group, by, order, having, limit, asc and desc, as well as operation vector characters (for example ">"). The first keyword thus comprises a first sub-keyword and/or a second sub-keyword and/or an operation vector character. The first sub-keyword may include words that correspond to operation vector characters, such as where and having; that is, the first sub-keyword is bound to an operation vector character: whenever a first sub-keyword is present, its corresponding operation vector character is necessarily present, and when no first sub-keyword is present, no such operation vector character is present either (for example, where corresponds to ">"). The second sub-keyword may include words such as select, group, by, order, limit, asc and desc, which do not correspond to (that is, are independent of) the operation vector characters.
The second keyword is used to connect table names and may include from. Placeholders stand in for the words in the SQL statement that are variable (their value range is not fixed), such as column names and table names, and are written as col in this embodiment. The first keywords, the second keywords and the placeholders of the SQL sentence frame are arranged according to a preset SQL grammar.
For example, suppose the query sentence is "show me the name, sex of students who are older than 18". After word segmentation, the target word segments "show", "me", "the", "name", "sex", "of", "students", "who", "are", "older", "than" and "18" are obtained. These target word segments are input into the pre-built deep neural network model of the preset type, which outputs the SQL sentence frame "select col col from where col >". From existing SQL syntax it is readily apparent that the two col placeholders immediately following select (that is, the two between select and from) correspond to select, and the one col immediately following where (that is, the one between where and >) corresponds to where.
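The example can be reproduced with a short Python sketch. It takes the example sentence to end in "18", as the list of target word segments indicates. The regular-expression tokenizer is an assumption standing in for whatever word segmentation the method actually uses, and the keyword sets simply restate the lists given in this description; the function names are hypothetical.

```python
import re
from typing import Dict, List, Optional

# Keyword lists restated from the description; operators are handled separately.
FIRST_KEYWORDS = {"select", "where", "group", "by", "order", "having", "limit", "asc", "desc"}
SECOND_KEYWORDS = {"from"}
PLACEHOLDER = "col"


def segment(query: str) -> List[str]:
    """Assumed word segmentation: lower-case runs of letters and digits."""
    return re.findall(r"[a-z0-9]+", query.lower())


def placeholders_per_keyword(frame: str) -> Dict[str, int]:
    """Count the placeholders governed by each first keyword in a frame.

    Multi-word keywords such as "group by" are not merged in this sketch.
    """
    counts: Dict[str, int] = {}
    current: Optional[str] = None
    for token in frame.split():
        if token in FIRST_KEYWORDS:
            current = token
            counts.setdefault(current, 0)
        elif token in SECOND_KEYWORDS:
            current = None  # placeholders after "from" would be table names
        elif token == PLACEHOLDER and current is not None:
            counts[current] += 1
    return counts


print(segment("show me the name, sex of students who are older than 18"))
# ['show', 'me', 'the', 'name', 'sex', 'of', 'students', 'who', 'are', 'older', 'than', '18']
print(placeholders_per_keyword("select col col from where col >"))
# {'select': 2, 'where': 1}
```

The counts give the number of SQL column names that each keyword's pointer network would need to supply.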
The generating module 120 is configured to generate a plurality of SQL column names corresponding to all column names of all data tables in the preset database one by one according to a column name of each data table in the preset database and a table name belonging to one data table together with the column names; and respectively inputting all the target word segmentation and all the SQL column names serving as input data into a preset pointer network corresponding to each first keyword to respectively obtain the SQL column name corresponding to each first keyword.
In this embodiment, one or more data tables are stored in the preset database; each data table includes a plurality of column names, and each data table has a different table name. According to the column names of each data table in the preset database and the table name of the data table to which each column name belongs, SQL column names are generated in one-to-one correspondence with all column names of all data tables in the preset database. For example, if the table name of a data table in the preset database is A and its column names are name, sex and age, then three SQL column names, A.name, A.sex and A.age, are obtained from the column names and the table name; these three SQL column names correspond one-to-one to the three column names name, sex and age of the data table, and each also corresponds to the table name A.
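The construction of SQL column names can be written down directly. The sketch below assumes the schema of the preset database is available as a mapping from table name to column names; the function names are hypothetical.

```python
from typing import Dict, List, Tuple


def build_sql_column_names(schema: Dict[str, List[str]]) -> List[str]:
    """Return one "table.column" SQL column name per column of every table."""
    return [f"{table}.{column}"
            for table, columns in schema.items()
            for column in columns]


def split_sql_column_name(sql_column_name: str) -> Tuple[str, str]:
    """Recover the table name and the data table column name."""
    table, column = sql_column_name.split(".", 1)
    return table, column


schema = {"A": ["name", "sex", "age"]}
print(build_sql_column_names(schema))   # ['A.name', 'A.sex', 'A.age']
print(split_sql_column_name("A.age"))   # ('A', 'age')
```

Because table names are distinct, the prefixed form stays unique even when two tables share a column name, which is what makes the one-to-one correspondence possible.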
Before all the target word segments and all the SQL column names are input as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword, all the SQL column names of all data tables in the preset database are input into the embedding layer of the deep neural network model of the preset type, which outputs a word vector for each SQL column name. The step of inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword then comprises: inputting the word vectors of all the target word segments and the word vectors of all the SQL column names as input data into the preset pointer network corresponding to each first keyword (that is, the preset pointer network of every first keyword receives the word vectors of all target word segments and of all SQL column names, so every such network receives the same input) to obtain the SQL column name corresponding to each first keyword. Each first keyword may correspond to one or more SQL column names, and the number of SQL column names is equal to the number of placeholders.
It should be noted that a first sub-keyword and the operator (operation symbol, for example ">") bound to it correspond to the same preset pointer network.
Before all the target word segments and all the SQL column names are input into the preset pointer network corresponding to each first keyword, the following steps are performed: the operator corresponding to each first sub-keyword is input into the pre-built preset type deep neural network model, and the word vector of that operator is obtained through the embedding layer; the word vector of the operator corresponding to each first sub-keyword and the word vectors of the target word segments are then input together into the preset pointer network corresponding to that first sub-keyword to obtain the calculated target word segments corresponding to that first sub-keyword. For example, the word vector of the operator corresponding to a first sub-keyword and the word vectors of all the target word segments are input together into the preset pointer network corresponding to the same first sub-keyword, so as to obtain the calculated target word segments corresponding to that first sub-keyword.
Inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword respectively to obtain the SQL column name corresponding to each first keyword respectively, wherein the method comprises the following steps:
And respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each first sub-keyword to obtain the similarity between all calculated target words corresponding to each first sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each first sub-keyword to serve as the SQL column names corresponding to each first sub-keyword.
And/or inputting each target word into a preset pointer network corresponding to each second sub-keyword to obtain the calculated target word corresponding to the second sub-keyword; and respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each second sub-keyword to obtain the similarity between all calculated target words corresponding to each second sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each second sub-keyword to serve as the SQL column names corresponding to each second sub-keyword.
The similarity between the word vector of a calculated target word segment and the word vector of an SQL column name can be computed as the inner product of the two vectors. After the similarity between each calculated target word segment vector and each SQL column name vector has been computed for a given preset pointer network, a softmax function is applied to obtain a parameter value for every combination of one SQL column name and one calculated target word segment. The SQL column name in the combination with the largest parameter value is taken as the SQL column name corresponding to that preset pointer network.
Take as an example the foregoing query statement "show me the name, sex of students who are older than 18" and the SQL statement frame "select col col from where col >", where the SQL column names corresponding to all column names of all data tables in the preset database include A.age, A.name, A.sex and so on. The word vector of each target word segment is input into the preset pointer network corresponding to select to obtain the calculated target word segment vectors, and the similarity between the word vector of every calculated target word segment and the word vector of every SQL column name is computed. The similarity between the word vector of the SQL column name A.name and that of the calculated target word segment name, and the similarity between the word vector of the SQL column name A.sex and that of the calculated target word segment sex, are higher than the similarities obtained for the other column names, so A.name and A.sex are taken as the SQL column names corresponding to select. It will be appreciated that the number of SQL column names predicted by the pointer network of select equals the number of placeholders corresponding to select in the aforementioned SQL statement frame; likewise, the number of SQL column names predicted by the pointer network of any other first keyword equals the number of placeholders corresponding to that keyword in the SQL statement frame.
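As a non-limiting illustration, the inner-product similarity and softmax selection described above might be sketched as follows. The toy three-dimensional vectors, the function names, and the rule of keeping the `k` distinct column names with the largest softmax values (one per placeholder) are assumptions made for this example; in the embodiment the vectors come from the embedding layer and the pointer network.

```python
import math


def dot(u, v):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_columns(token_vectors, column_vectors, k):
    """Return the k column names whose best (token, column) pair scores highest."""
    pairs = [(i, name) for i in range(len(token_vectors)) for name in column_vectors]
    scores = [dot(token_vectors[i], column_vectors[name]) for i, name in pairs]
    probs = softmax(scores)  # one parameter value per (token, column) combination
    best = {}
    for (_, name), p in zip(pairs, probs):
        best[name] = max(best.get(name, 0.0), p)
    return sorted(best, key=best.get, reverse=True)[:k]


# Hypothetical calculated vectors for the target word segments "name" and "sex".
token_vectors = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0]]
# Hypothetical word vectors for the SQL column names.
column_vectors = {
    "A.name": [0.9, 0.1, 0.0],
    "A.sex": [0.1, 0.8, 0.0],
    "A.age": [0.0, 0.1, 0.9],
}
selected = pick_columns(token_vectors, column_vectors, k=2)
print(selected)
```

Here `k` plays the role of the number of placeholders that follow the keyword in the SQL statement frame.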
It should be noted that, because parameters of pointer networks corresponding to different first keywords are different, after word vectors of the same target word are input into the pointer networks corresponding to different first keywords, the calculated target word vectors are different, and therefore similarity between the column names and the target word obtained by calculation of the pointer networks corresponding to the first keywords is also different.
And the replacing module 130 is configured to obtain a data table column name corresponding to each first keyword according to the SQL column name corresponding to each first keyword, find a data table containing one or more data table column names from a preset database according to the SQL column name corresponding to each first keyword, determine a target table name according to the found data table, add the determined target table name to the position corresponding to the second keyword in the SQL sentence frame, and replace placeholders of the corresponding first keywords in the SQL sentence frame with the data table column names corresponding to each first keyword, thereby obtaining the SQL sentence.
In this embodiment, according to the corresponding relationship between each SQL column name and the table name of the data table, the corresponding one or more data tables are found out from the preset database. It will be appreciated that the resulting multiple SQL column names may be associated with the same data table (e.g., predicted SQL column names A.name, A.sex, and A.age, each associated with data table A), or each SQL column name may be associated with a different data table (e.g., predicted SQL column names A.name, B.sex, and C.age, each associated with the three data tables A, B, C).
Further, the step of finding out a data table containing one or more data table column names from a preset database according to the SQL column name corresponding to each first keyword, and the step of determining a target table name according to the found data table includes:
searching a data table corresponding to SQL column names corresponding to all the first keywords from a preset database;
If one data table corresponding to the SQL column names corresponding to all the first keywords is searched, determining that the searched data table is the target data table, and the name of the target data table is the target table name.
If the data tables corresponding to the SQL column names corresponding to all the first keywords are not searched, respectively finding out the intermediate data tables corresponding to the first keywords from a preset database according to the SQL column names corresponding to the first keywords, and connecting the found names of the intermediate data tables according to a predetermined name connection algorithm to obtain the target table names.
Specifically, when one data table is found out from a preset database, the data table corresponds to all SQL column names, the found data table contains all obtained column names, the data table is taken as a target data table, the table name of the data table is extracted, the table name of the found data table is taken as the target table name to be added into the SQL sentence frame at the position corresponding to the second keyword, and all placeholders in the SQL sentence frame are replaced by the column names corresponding to the first keywords respectively, so that an SQL sentence is obtained.
Take as an example the foregoing query statement "show me the name, sex of students who are older than 18" and the SQL statement frame "select col col from where col >", where the SQL column name corresponding to the first sub-keyword where is A.age and the SQL column names corresponding to the second sub-keyword select are A.name and A.sex. The column names name, sex and age are obtained from A.name, A.sex and A.age, and the data table with the table name A is found in the preset database according to A.name, A.sex and A.age. The table name A is extracted and added at the corresponding position in the SQL statement frame, and after the column names name, sex and age replace the corresponding col placeholders in the frame, the SQL statement "select name, sex from A where age >" is obtained.
When no single data table corresponding to the SQL column names of all the first keywords is found (i.e., no one data table corresponds to all the SQL column names), the intermediate data tables corresponding to the first keywords are found in the preset database according to the SQL column names corresponding to each first keyword. There are at least two intermediate data tables; each contains at least one of the obtained column names but not all of them, and the union of the column names of all the intermediate data tables contains all the obtained data table column names. The table names of all the found intermediate data tables are extracted and connected through a join operation to form the target table name, the target table name is added to the SQL statement frame at the position corresponding to the second keyword, and all placeholders in the SQL statement frame are replaced with the column names corresponding to the respective first keywords, thereby obtaining the SQL statement. It should be noted that each data table has a primary key and a foreign key, and the primary keys of the tables differ from one another, so tables can be connected through the primary key of one table and the foreign keys of other tables. For example, table A includes a primary key "name" and a foreign key "age", and table B includes a primary key "age" and a foreign key "gender"; since the foreign key "age" of table A is the same as the primary key "age" of table B, table A and table B can be connected.
Take as an example the foregoing query statement "show me the name, sex of students who are older than 18" and the SQL statement frame "select col col from where col >", where the SQL column name corresponding to the first sub-keyword where is A.age and the SQL column names corresponding to the second sub-keyword select are B.name and C.sex. The column names name, sex and age are obtained from B.name, C.sex and A.age, and three intermediate data tables with the table names A, B and C are found in the preset database according to B.name, C.sex and A.age: intermediate data table B contains the column name name, intermediate data table C contains the column name sex, and intermediate data table A contains the column name age. The table names A, B and C are extracted, connected through join, and added at the corresponding position in the SQL statement frame; after the column names name, sex and age replace the corresponding col placeholders in the frame, the SQL statement "select name, sex from A join B join C where age >" is obtained.
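As a non-limiting illustration, the determination of the target table name and the filling of the SQL statement frame described above might be sketched as follows. The function names, the alphabetical ordering of joined table names, and the restriction to the select and where keywords are assumptions made for this example; the embodiment only refers to a predetermined name connection algorithm and does not specify it.

```python
# Keywords that own "col" placeholders in this sketch (assumption: only these two).
COLUMN_KEYWORDS = {"select", "where"}


def resolve_target_table(sql_columns):
    """Derive the target table name from qualified 'table.column' names."""
    tables = sorted({name.split(".", 1)[0] for name in sql_columns})
    # One table: use its name directly. Several tables: connect them with join.
    return " join ".join(tables)


def fill_frame(frame, columns_by_keyword):
    """Replace each 'col' placeholder and insert the table name after 'from'."""
    queues = {kw: list(cols) for kw, cols in columns_by_keyword.items()}
    all_columns = [c for cols in columns_by_keyword.values() for c in cols]
    target = resolve_target_table(all_columns)
    out, current, prev = [], None, None
    for token in frame.split():
        if token == "col":
            # Take the next predicted column for the current keyword, unqualified.
            column = queues[current].pop(0).split(".", 1)[1]
            if prev == "col":
                out[-1] += ","  # separate consecutive columns with a comma
            out.append(column)
        elif token == "from":
            out.extend([token, target])
        else:
            if token in COLUMN_KEYWORDS:
                current = token
            out.append(token)
        prev = token
    return " ".join(out)


frame = "select col col from where col >"
sql_single = fill_frame(frame, {"select": ["A.name", "A.sex"], "where": ["A.age"]})
sql_multi = fill_frame(frame, {"select": ["B.name", "C.sex"], "where": ["A.age"]})
print(sql_single)
print(sql_multi)
```

The sketch reproduces the two worked examples from the text. Like those examples, it emits unqualified column names and no join condition.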
The database data query device provided by the invention receives query sentences input by a user, and performs word segmentation processing on the received query sentences to obtain one or more target word segments; inputting the obtained target word segmentation into a pre-built preset type deep neural network model to output a corresponding SQL sentence frame; the SQL statement frame comprises at least one first keyword, at least one second keyword and at least one placeholder; generating a plurality of SQL column names which are in one-to-one correspondence with all column names of all data tables in a preset database according to the column name of each data table in the preset database and the table name belonging to one data table with the column name; inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword respectively to obtain the SQL column name corresponding to each first keyword respectively; according to the SQL column names corresponding to each first keyword, obtaining the data table column names corresponding to each first keyword, finding out a data table containing one or more data table column names from a preset database according to the SQL column names corresponding to each first keyword, determining a target table name according to the found data table, adding the determined target table name into the SQL sentence frame at the position corresponding to the second keyword, and replacing placeholders of the corresponding first keywords in the SQL sentence frame with the data table column names corresponding to each first keyword respectively to obtain the SQL sentence. The device can support the user to input the query statement in a normal language mode, has less use limit and better fault tolerance.
In addition, the invention also provides a database data query method which is applied to the electronic equipment. Referring to fig. 3, a flowchart of a database data query method according to an embodiment of the present invention is shown. The processor 12 of the electronic device 1 implements the following steps of the database data query method when executing the database data query program 10 stored in the memory 11:
Step S10: receiving a query sentence input by a user, and performing word segmentation processing on the received query sentence to obtain one or more target word segments; inputting the obtained target word segmentation into a pre-built preset type deep neural network model to output a corresponding SQL sentence frame; the SQL statement frame includes at least one first keyword, at least one second keyword, and at least one placeholder.
In this embodiment, the target word segments are input into a pre-built deep neural network model of a preset type to obtain a corresponding Structured Query Language (SQL) statement frame. Specifically, the target word segments of the original query statement are input into the pre-built preset type deep neural network model; a word vector of each target word segment is obtained through the embedding layer of the model; the word vectors of the target word segments are input into the encoding layer (i.e., a first LSTM layer) of the model to obtain encoded word vectors; and the encoded word vectors are input into the decoding layer (i.e., a second LSTM layer) of the model, which outputs the SQL statement frame. The pre-built preset type deep neural network model is a long short-term memory (LSTM) model.
The first keyword and the second keyword are words that are fixed (i.e., have a fixed value range) in an SQL statement. The first keyword is used to connect column names and may include words such as select, where, group, by, order, having, limit, asc and desc, as well as operators (operation symbols); further, the first keyword comprises a first sub-keyword and/or a second sub-keyword and/or an operator. The first sub-keyword may include words that correspond to an operator, such as where and having. That is, the first sub-keyword has a binding relationship with its operator: whenever a first sub-keyword is present, its corresponding operator is necessarily present, and when no first sub-keyword is present, no corresponding operator is present; for example, where corresponds to ">". The second sub-keyword may include words that do not correspond to (i.e., are independent of) an operator, such as select, group, by, order, limit, asc and desc. The second keyword is used to connect table names and may include from. Placeholders stand in for the words in the SQL statement that are variable (i.e., have no fixed value range), such as column names and table names, and are represented as col in this embodiment. The first keywords, the second keywords and the placeholders of the SQL statement frame are arranged according to a preset SQL grammar.
For example, the query statement is "show me the name, sex of students who are older than 18". After word segmentation, the target word segments "show", "me", "the", "name", "sex", "of", "students", "who", "are", "older", "than" and "18" are obtained. These target word segments are input into the pre-built preset type deep neural network model, which outputs the SQL statement frame "select col col from where col >". From the existing SQL syntax it is readily apparent that the two col placeholders immediately following select (i.e., the two col between select and from) correspond to select, and the one col immediately following where (i.e., the col between where and >) corresponds to where.
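As a non-limiting illustration, the correspondence between placeholders and first keywords in an SQL statement frame might be derived as follows. The function name and the whitespace-based splitting of the frame are assumptions made for this example.

```python
def placeholders_per_keyword(frame, first_keywords=("select", "where")):
    """Count the 'col' placeholders that follow each first keyword in the frame."""
    counts = {}
    current = None
    for token in frame.split():
        if token in first_keywords:
            current = token
            counts[current] = 0
        elif token == "col" and current is not None:
            # A placeholder belongs to the most recent first keyword.
            counts[current] += 1
    return counts


frame = "select col col from where col >"
counts = placeholders_per_keyword(frame)
print(counts)
```

Each count gives the number of SQL column names that the pointer network of the corresponding keyword is expected to predict.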
Step S20: generating a plurality of SQL column names which are in one-to-one correspondence with all column names of all data tables in a preset database according to the column name of each data table in the preset database and the table name belonging to one data table with the column name; and respectively inputting all the target word segmentation and all the SQL column names serving as input data into a preset pointer network corresponding to each first keyword to respectively obtain the SQL column name corresponding to each first keyword.
In this embodiment, the preset database stores one or more data tables. Each data table includes a plurality of column names, and no two data tables share a table name. SQL column names are generated from the column names of each data table and the name of the table to which those columns belong, so that the SQL column names correspond one-to-one to all column names of all data tables in the preset database. For example, if a data table in the preset database has the table name A and the column names name, sex and age, the three SQL column names A.name, A.sex and A.age are obtained; they correspond one-to-one to the column names name, sex and age, and each corresponds to the table name A.
And before all the target word segmentation and all the SQL column names are used as input data and respectively input into preset pointer networks corresponding to the first keywords to respectively obtain the SQL column names corresponding to the first keywords, inputting all the SQL column names corresponding to all the data tables in a preset database into an embedded layer of a preset type depth neural network model, and outputting word vectors of the SQL column names. The step of respectively inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword to respectively obtain the SQL column names corresponding to each first keyword, comprising the following steps: and respectively inputting word vectors of all target word segmentation and word vectors of all SQL column names into preset pointer networks corresponding to the first keywords as input data (namely, the preset pointer network corresponding to each first keyword receives the word vectors of all target word segmentation and the word vectors of all SQL column names, and the input of the preset pointer network corresponding to each first keyword is the same), so as to obtain the SQL column name corresponding to each first keyword. The number of the SQL column names corresponding to the first key word can be one or more, and the number of the SQL column names is equal to the number of the placeholders.
It should be noted that a first sub-keyword and the operator (operation symbol, for example ">") bound to it correspond to the same preset pointer network.
Before all the target word segments and all the SQL column names are input into the preset pointer network corresponding to each first keyword, the following steps are performed: the operator corresponding to each first sub-keyword is input into the pre-built preset type deep neural network model, and the word vector of that operator is obtained through the embedding layer; the word vector of the operator corresponding to each first sub-keyword and the word vectors of the target word segments are then input together into the preset pointer network corresponding to that first sub-keyword to obtain the calculated target word segments corresponding to that first sub-keyword. For example, the word vector of the operator corresponding to a first sub-keyword and the word vectors of all the target word segments are input together into the preset pointer network corresponding to the same first sub-keyword, so as to obtain the calculated target word segments corresponding to that first sub-keyword.
Inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword respectively to obtain the SQL column name corresponding to each first keyword respectively, wherein the method comprises the following steps:
And respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each first sub-keyword to obtain the similarity between all calculated target words corresponding to each first sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each first sub-keyword to serve as the SQL column names corresponding to each first sub-keyword.
And/or inputting each target word into a preset pointer network corresponding to each second sub-keyword to obtain the calculated target word corresponding to the second sub-keyword; and respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each second sub-keyword to obtain the similarity between all calculated target words corresponding to each second sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each second sub-keyword to serve as the SQL column names corresponding to each second sub-keyword.
The similarity between the word vector of a calculated target word segment and the word vector of an SQL column name can be computed as the inner product of the two vectors. After the similarity between each calculated target word segment vector and each SQL column name vector has been computed for a given preset pointer network, a softmax function is applied to obtain a parameter value for every combination of one SQL column name and one calculated target word segment. The SQL column name in the combination with the largest parameter value is taken as the SQL column name corresponding to that preset pointer network.
Take as an example the foregoing query statement "show me the name, sex of students who are older than 18" and the SQL statement frame "select col col from where col >", where the SQL column names corresponding to all column names of all data tables in the preset database include A.age, A.name, A.sex and so on. The word vector of each target word segment is input into the preset pointer network corresponding to select to obtain the calculated target word segment vectors, and the similarity between the word vector of every calculated target word segment and the word vector of every SQL column name is computed. The similarity between the word vector of the SQL column name A.name and that of the calculated target word segment name, and the similarity between the word vector of the SQL column name A.sex and that of the calculated target word segment sex, are higher than the similarities obtained for the other column names, so A.name and A.sex are taken as the SQL column names corresponding to select. It will be appreciated that the number of SQL column names predicted by the pointer network of select equals the number of placeholders corresponding to select in the aforementioned SQL statement frame; likewise, the number of SQL column names predicted by the pointer network of any other first keyword equals the number of placeholders corresponding to that keyword in the SQL statement frame.
It should be noted that, because parameters of pointer networks corresponding to different first keywords are different, after word vectors of the same target word are input into the pointer networks corresponding to different first keywords, the calculated target word vectors are different, and therefore similarity between the column names and the target word obtained by calculation of the pointer networks corresponding to the first keywords is also different.
Step S30: according to the SQL column names corresponding to each first keyword, obtaining the data table column names corresponding to each first keyword, finding out a data table containing one or more data table column names from a preset database according to the SQL column names corresponding to each first keyword, determining a target table name according to the found data table, adding the determined target table name into the SQL sentence frame at the position corresponding to the second keyword, and replacing placeholders of the corresponding first keywords in the SQL sentence frame with the data table column names corresponding to each first keyword respectively to obtain the SQL sentence.
In this embodiment, according to the corresponding relationship between each SQL column name and the table name of the data table, the corresponding one or more data tables are found out from the preset database. It will be appreciated that the resulting multiple SQL column names may be associated with the same data table (e.g., predicted SQL column names A.name, A.sex, and A.age, each associated with data table A), or each SQL column name may be associated with a different data table (e.g., predicted SQL column names A.name, B.sex, and C.age, each associated with the three data tables A, B, C).
Further, the step of finding out a data table containing one or more data table column names from a preset database according to the SQL column name corresponding to each first keyword, and the step of determining a target table name according to the found data table includes:
searching a data table corresponding to SQL column names corresponding to all the first keywords from a preset database;
If one data table corresponding to the SQL column names corresponding to all the first keywords is searched, determining that the searched data table is the target data table, and the name of the target data table is the target table name.
If the data tables corresponding to the SQL column names corresponding to all the first keywords are not searched, respectively finding out the intermediate data tables corresponding to the first keywords from a preset database according to the SQL column names corresponding to the first keywords, and connecting the found names of the intermediate data tables according to a predetermined name connection algorithm to obtain the target table names.
Specifically, when one data table is found out from a preset database, the data table corresponds to all SQL column names, the found data table contains all obtained column names, the data table is taken as a target data table, the table name of the data table is extracted, the table name of the found data table is taken as the target table name to be added into the SQL sentence frame at the position corresponding to the second keyword, and all placeholders in the SQL sentence frame are replaced by the column names corresponding to the first keywords respectively, so that an SQL sentence is obtained.
Taking the foregoing query statement as "show ME THE NAME, sex of students who are older than 18", the SQL statement frame as "select col col from where col >", the SQL column name corresponding to the first sub-keyword where is a.age, and the SQL column names corresponding to the second sub-keyword select are a.name and a.sex as examples, obtaining the corresponding column names name, sex and age according to a.name, a.sex and a.age, finding the data table with the table name a from the preset database according to a.name, a.sex and a.age, extracting the table name a of the data table, adding the table name into the corresponding position in the SQL statement frame, and after the column names name, sex and age replace the corresponding col in the SQL statement frame, obtaining the SQL statement SELECT NAME, sex from A WHERE AGE.
When no data table corresponding to the SQL column names corresponding to all the first keywords is searched (i.e. one data table corresponding to all the SQL column names is not found), respectively finding out the middle data tables corresponding to the first keywords from a preset database according to the SQL column names corresponding to the first keywords, wherein the number of the middle data tables is at least two, connecting all the found middle data tables through join operation, wherein any one found data table at least comprises one obtained column name and does not comprise all the obtained column names, the union of the column names of all the middle data tables comprises all the obtained data table column names, extracting the table names of all the found middle data tables, connecting the extracted table names through join, adding the target table names into the positions corresponding to the second keywords in the SQL sentence, and respectively replacing all the placeholders in the SQL sentence with the column names corresponding to the first keywords, thereby obtaining the SQL sentence. It should be noted that, each data table has a main key and an external key, and the main keys of each table are different, so that the tables can be connected by the main key of one table and the external keys of other tables. For example, table a includes a main key "name" and a foreign key "age", table B includes a main key "age" and a foreign key "gender", and the foreign key "age" of table a is the same as the main key "age" of table B, so that table a and table B can be connected.
Take as an example the case where the query statement is "show me the name, sex of students who are older than 18", the SQL statement frame is "select col col from where col >", the SQL column name corresponding to the first sub-keyword where is A.age, and the SQL column names corresponding to the second sub-keyword select are B.name and C.sex. The column names name, sex and age are obtained from B.name, C.sex and A.age, and three intermediate data tables with the table names A, B and C are found in the preset database according to B.name, C.sex and A.age, where intermediate data table B contains the column name name, intermediate data table C contains the column name sex, and intermediate data table A contains the column name age. The table names A, B and C of the three intermediate data tables are extracted, connected by join, and added at the corresponding position in the SQL statement frame, and the column names name, sex and age replace the corresponding col placeholders in the SQL statement frame, giving the SQL statement "select name, sex from A join B join C where age >".
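The multi-table case can be sketched in the same way. This is a minimal illustration under stated assumptions: the function name `build_join_sql` is hypothetical, placeholders are assumed to be the literal token `col`, and the table names are joined in sorted order, which happens to reproduce the order used in the example.

```python
def build_join_sql(frame, select_sql_cols, where_sql_cols):
    """Fill an SQL statement frame when the columns live in several tables."""
    # Split each SQL column name "table.column" into its two parts.
    select_pairs = [c.split(".", 1) for c in select_sql_cols]
    where_pairs = [c.split(".", 1) for c in where_sql_cols]

    # Each distinct table prefix is one intermediate data table.
    tables = sorted({t for t, _ in select_pairs + where_pairs})
    # Connect the intermediate table names with "join" to form the target table name.
    target_table_name = " join ".join(tables)

    head, tail = frame.split(" from ", 1)
    if head.split().count("col") != len(select_pairs):
        raise ValueError("placeholder count does not match select columns")

    # Replace the placeholders after "where" in order of appearance.
    where_iter = iter(c for _, c in where_pairs)
    tail = " ".join(next(where_iter) if tok == "col" else tok for tok in tail.split())
    select_part = ", ".join(c for _, c in select_pairs)
    return f"select {select_part} from {target_table_name} {tail}"


sql = build_join_sql("select col col from where col >", ["B.name", "C.sex"], ["A.age"])
print(sql)
```

The sketch reproduces the statement exactly as written in the example. A query meant to run against a real database would normally also need join conditions, for instance on the primary and foreign keys described above, which the example leaves out.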
The database data query device provided by the invention receives a query statement input by a user and performs word segmentation on it to obtain one or more target word segments; inputs the target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL statement frame, where the SQL statement frame comprises at least one first keyword, at least one second keyword and at least one placeholder; generates, according to the column names of each data table in a preset database and the name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; inputs all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword; obtains the data table column name corresponding to each first keyword from the corresponding SQL column name; finds, in the preset database, a data table containing one or more of the data table column names according to the SQL column names corresponding to the first keywords; determines a target table name from the data table found; adds the target table name to the SQL statement frame at the position corresponding to the second keyword; and replaces the placeholder of each first keyword in the SQL statement frame with the corresponding data table column name to obtain the SQL statement. This approach allows the user to enter the query statement in ordinary natural language, with fewer usage restrictions and better fault tolerance.
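The two preparatory steps in this summary, word segmentation and the generation of SQL column names, can be sketched as below. This is an illustration only: the regular-expression tokenizer is a simple stand-in for whatever word segmentation the method actually uses, and the names `segment`, `generate_sql_column_names` and `schema` are hypothetical. The `table.column` form follows the examples a.name, a.sex and a.age given in the description.

```python
import re


def segment(query):
    """Very simple word segmentation: lowercase, then split on non-word characters."""
    return re.findall(r"\w+", query.lower())


def generate_sql_column_names(schema):
    """Build one 'table.column' name for every column of every table."""
    return [
        f"{table}.{column}"
        for table, columns in schema.items()
        for column in columns
    ]


# Hypothetical stand-in for the preset database: table name -> list of column names.
schema = {"a": ["name", "sex", "age"]}

tokens = segment("show me the name, sex of students who are older than 18")
sql_column_names = generate_sql_column_names(schema)
print(tokens)
print(sql_column_names)
```

Because every SQL column name carries its table name, the table that a selected column belongs to can be recovered later by splitting on the dot.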
In addition, an embodiment of the invention also provides a computer-readable storage medium, which may be any one of, or any combination of, a hard disk, a multimedia card, an SD card, a flash memory card, an SMC, a read-only memory (ROM), an erasable programmable read-only memory (EPROM), a portable compact disc read-only memory (CD-ROM), a USB memory and the like. The computer-readable storage medium includes a storage data area and a storage program area; the storage data area stores data created according to the use of the blockchain node, and the storage program area stores a database data query program 10 which, when executed by a processor, implements the following operations:
receiving a query statement input by a user, and performing word segmentation on the received query statement to obtain one or more target word segments; inputting the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL statement frame; the SQL statement frame comprises at least one first keyword, at least one second keyword and at least one placeholder;
generating, according to the column names of each data table in a preset database and the name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
obtaining the data table column name corresponding to each first keyword according to the SQL column name corresponding to that first keyword, finding a data table containing one or more of the data table column names in the preset database according to the SQL column names corresponding to the first keywords, determining a target table name according to the data table found, adding the determined target table name to the SQL statement frame at the position corresponding to the second keyword, and replacing the placeholder of each first keyword in the SQL statement frame with the corresponding data table column name to obtain the SQL statement.
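The column selection step in the operations above, in which a similarity is computed against every SQL column name and the highest-scoring names are kept, can be illustrated with a toy example. This is not a pointer network: cosine similarity over hand-made vectors is swapped in purely to show the selection logic, and all vectors and names (`cosine`, `top_k_columns`, `column_vecs`) are hypothetical. The number of names kept, `k`, is set to the number of placeholders that follow the keyword.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def top_k_columns(query_vec, column_vecs, k):
    """Return the k SQL column names most similar to the query vector."""
    ranked = sorted(
        column_vecs.items(),
        key=lambda item: cosine(query_vec, item[1]),
        reverse=True,
    )
    return [name for name, _ in ranked[:k]]


# Made-up vectors standing in for learned representations of the SQL column names.
column_vecs = {
    "a.name": [1.0, 0.0, 0.0],
    "a.sex": [0.0, 1.0, 0.0],
    "a.age": [0.0, 0.0, 1.0],
}

# Made-up vectors standing in for the calculated target words of each keyword.
where_query = [0.1, 0.0, 0.9]
select_query = [0.7, 0.6, 0.1]

where_cols = top_k_columns(where_query, column_vecs, k=1)    # one placeholder after where
select_cols = top_k_columns(select_query, column_vecs, k=2)  # two placeholders after select
print(where_cols, select_cols)
```

In the method itself the scores come from the trained pointer network of each keyword; only the final "keep the highest-scoring column names" step is mirrored here.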
It should be emphasized that the embodiments of the computer-readable storage medium according to the present invention are substantially the same as the embodiments of the database data query method described above, and are not repeated here.
In another embodiment, to further ensure the privacy and security of all the data involved in the database data query method provided by the present invention, all of the data may also be stored in a node of a blockchain. For example, knowledge graphs, text to be identified and the like may be stored in the blockchain node.
It should be noted that the blockchain referred to in the present invention is a novel application mode of computer technologies such as distributed data storage, point-to-point transmission, consensus mechanisms and encryption algorithms. A blockchain is essentially a decentralized database: a chain of data blocks generated in association with one another using cryptographic methods, where each data block contains information about a batch of network transactions and is used to verify the validity of that information (anti-counterfeiting) and to generate the next block. The blockchain may include a blockchain underlying platform, a platform product services layer, an application services layer, and the like.
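The idea of data blocks linked by cryptographic methods can be illustrated with a minimal hash chain. This is a generic sketch and says nothing about the blockchain platform actually used: the block layout and the names `block_hash`, `append_block` and `is_valid` are assumptions, and consensus, networking and transaction handling are left out.

```python
import hashlib
import json


def block_hash(block):
    """SHA-256 digest of a block's JSON form with sorted keys."""
    payload = json.dumps(block, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


def append_block(chain, data):
    """Append a block that stores the hash of the previous block."""
    prev_hash = block_hash(chain[-1]) if chain else "0" * 64
    chain.append({"index": len(chain), "data": data, "prev_hash": prev_hash})


def is_valid(chain):
    """Check that every block still points at the hash of its predecessor."""
    return all(
        chain[i]["prev_hash"] == block_hash(chain[i - 1])
        for i in range(1, len(chain))
    )


chain = []
append_block(chain, "query record 1")
append_block(chain, "query record 2")
valid_before = is_valid(chain)

chain[0]["data"] = "tampered"  # changing an earlier block breaks the link
valid_after = is_valid(chain)
print(valid_before, valid_after)
```

Because each block stores the hash of the one before it, changing stored data in an earlier block is detectable from the later blocks.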
The embodiment of the computer readable storage medium of the present invention is substantially the same as the embodiment of the database data query method described above, and will not be described herein.
It should be noted that the foregoing numbering of the embodiments of the present invention is for description only and does not represent the relative merits of the embodiments. The terms "comprises," "comprising," or any other variation thereof are intended to cover a non-exclusive inclusion, such that a process, apparatus, article, or method that comprises a list of elements does not include only those elements but may include other elements not expressly listed or inherent to such process, apparatus, article, or method. Without further limitation, an element defined by the phrase "comprising one … …" does not exclude the presence of other like elements in a process, apparatus, article, or method that comprises the element.
From the above description of the embodiments, it will be clear to those skilled in the art that the methods of the above embodiments may be implemented by software plus a necessary general-purpose hardware platform, and may of course also be implemented by hardware, although in many cases the former is the preferred implementation. Based on this understanding, the technical solution of the present invention, in essence or in the part that contributes over the prior art, may be embodied in the form of a software product stored in a storage medium as described above (e.g. ROM/RAM, magnetic disk, optical disk), comprising instructions for causing a terminal device (which may be a mobile phone, a computer, an electronic device, a network device, etc.) to perform the method according to the embodiments of the present invention.
The foregoing description is only of the preferred embodiments of the present invention, and is not intended to limit the scope of the invention, but rather is intended to cover any equivalents of the structures or equivalent processes disclosed herein or in the alternative, which may be employed directly or indirectly in other related arts.

Claims (6)

1. A method for querying database data, the method comprising:
receiving a query statement input by a user, and performing word segmentation on the received query statement to obtain one or more target word segments; inputting the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL statement frame; the SQL statement frame comprises at least one first keyword, at least one second keyword and at least one placeholder;
generating, according to the column names of each data table in a preset database and the name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database; inputting all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
obtaining the data table column name corresponding to each first keyword according to the SQL column name corresponding to that first keyword, finding a data table containing one or more of the data table column names in the preset database according to the SQL column names corresponding to the first keywords, determining a target table name according to the data table found, adding the determined target table name to the SQL statement frame at the position corresponding to the second keyword, and replacing the placeholder of each first keyword in the SQL statement frame with the corresponding data table column name to obtain an SQL statement;
wherein inputting the obtained target word segments into the pre-built deep neural network model of a preset type to output a corresponding SQL statement frame comprises: inputting the obtained target word segments into an embedding layer of the pre-built deep neural network model of a preset type to obtain a word vector for each target word segment, inputting the word vectors of the target word segments into an encoding layer of the model to obtain encoded word vectors, inputting the encoded word vectors into a decoding layer of the model, and outputting the SQL statement frame;
the first keyword comprises a first sub-keyword and/or a second sub-keyword and/or an operation vector symbol; the first sub-keyword corresponds to the operation vector symbol; the second sub-keyword does not correspond to the operation vector symbol;
The step of respectively inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword to respectively obtain the SQL column names corresponding to each first keyword, comprising the following steps: inputting each target word into a preset pointer network corresponding to each second sub-keyword to obtain calculated target words corresponding to the second sub-keywords; respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each second sub-keyword to obtain the similarity between all calculated target words corresponding to each second sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each second sub-keyword to serve as the SQL column names corresponding to each second sub-keyword;
The step of finding out a data table containing one or more data table column names from a preset database according to the SQL column names corresponding to each first keyword, and determining a target table name according to the found data table comprises the following steps: searching a data table corresponding to SQL column names corresponding to all the first keywords from a preset database; if the data table corresponding to the SQL column names corresponding to all the first keywords is searched, determining that the searched data table is the target data table, and the name of the target data table is the target table name; if the data tables corresponding to the SQL column names corresponding to all the first keywords are not searched, respectively finding out the intermediate data tables corresponding to the first keywords from a preset database according to the SQL column names corresponding to the first keywords, and connecting the found names of the intermediate data tables according to a predetermined name connection algorithm to obtain the target table names.
2. The database data query method of claim 1, wherein the first sub-keyword and the operation vector symbol that correspond to each other correspond to the same preset pointer network;
before all the target word segments and all the SQL column names are input as input data into the preset pointer network corresponding to each first keyword, the method comprises: inputting the operation vector symbol corresponding to each first sub-keyword and each target word as input data into the preset pointer network corresponding to that first sub-keyword to obtain the calculated target words corresponding to each first sub-keyword;
The step of respectively inputting all the target word segments and all the SQL column names as input data into a preset pointer network corresponding to each first keyword to respectively obtain the SQL column names corresponding to each first keyword, comprising the following steps:
And respectively inputting all calculated target words and all SQL column names serving as input data into a preset pointer network corresponding to each first sub-keyword to obtain the similarity between all calculated target words corresponding to each first sub-keyword and all SQL column names, and finding out one or more SQL column names with highest similarity corresponding to the preset pointer network corresponding to each first sub-keyword to serve as the SQL column names corresponding to each first sub-keyword.
3. The database data query method of claim 1, wherein the number of SQL column names corresponding to one first keyword is one or more, and the number of SQL column names is equal to the number of placeholders.
4. A database data query apparatus for implementing the database data query method as claimed in any one of claims 1 to 3, the apparatus comprising:
a receiving module, configured to receive a query statement input by a user and perform word segmentation on the received query statement to obtain one or more target word segments, and to input the obtained target word segments into a pre-built deep neural network model of a preset type to output a corresponding SQL statement frame; the SQL statement frame comprises at least one first keyword, at least one second keyword and at least one placeholder;
a generating module, configured to generate, according to the column names of each data table in a preset database and the name of the data table to which each column name belongs, a plurality of SQL column names in one-to-one correspondence with all column names of all data tables in the preset database, and to input all the target word segments and all the SQL column names as input data into the preset pointer network corresponding to each first keyword to obtain the SQL column name corresponding to each first keyword;
a replacing module, configured to obtain the data table column name corresponding to each first keyword according to the SQL column name corresponding to that first keyword, find a data table containing one or more of the data table column names in the preset database according to the SQL column names corresponding to the first keywords, determine a target table name according to the data table found, add the determined target table name to the SQL statement frame at the position corresponding to the second keyword, and replace the placeholder of each first keyword in the SQL statement frame with the corresponding data table column name to obtain the SQL statement.
5. An electronic device, the electronic device comprising:
At least one processor; and
A memory communicatively coupled to the at least one processor; wherein,
The memory stores instructions executable by the at least one processor to enable the at least one processor to perform the database data querying method according to any of claims 1 to 3.
6. A computer-readable storage medium, in which a database data query program is stored, which, when executed by a processor, implements the database data query method of any one of claims 1 to 3.
CN202011281644.4A 2020-11-16 2020-11-16 Database data query method and device, electronic equipment and storage medium Active CN112380238B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202011281644.4A CN112380238B (en) 2020-11-16 2020-11-16 Database data query method and device, electronic equipment and storage medium
PCT/CN2021/097071 WO2022100067A1 (en) 2020-11-16 2021-05-30 Method and apparatus for querying data in database, electronic device and storage medium

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011281644.4A CN112380238B (en) 2020-11-16 2020-11-16 Database data query method and device, electronic equipment and storage medium

Publications (2)

Publication Number Publication Date
CN112380238A CN112380238A (en) 2021-02-19
CN112380238B true CN112380238B (en) 2024-06-28

Family

ID=74585629

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011281644.4A Active CN112380238B (en) 2020-11-16 2020-11-16 Database data query method and device, electronic equipment and storage medium

Country Status (2)

Country Link
CN (1) CN112380238B (en)
WO (1) WO2022100067A1 (en)

Families Citing this family (13)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112380238B (en) * 2020-11-16 2024-06-28 平安科技(深圳)有限公司 Database data query method and device, electronic equipment and storage medium
CN114625748A (en) * 2021-04-23 2022-06-14 亚信科技(南京)有限公司 SQL query statement generation method and device, electronic equipment and readable storage medium
CN115687387B (en) * 2021-07-27 2026-02-03 中移(苏州)软件技术有限公司 SQL sentence generation method, device, equipment and storage medium
CN114490792A (en) * 2022-02-11 2022-05-13 中国工商银行股份有限公司 Data processing method and device
CN115168402B (en) * 2022-07-08 2025-11-28 支付宝(杭州)信息技术有限公司 Training sequence generation model method and device
CN115617841A (en) * 2022-11-10 2023-01-17 北京商银微芯科技有限公司 A method, system, device, and storage medium for generating data query statements
CN115905239A (en) * 2022-12-16 2023-04-04 中盈优创资讯科技有限公司 A highly multiplexed performance index data retrieval method and device
CN116841962A (en) * 2023-06-26 2023-10-03 中国建设银行股份有限公司 Document processing method, device and electronic equipment
CN116932839A (en) * 2023-07-14 2023-10-24 京东方科技集团股份有限公司 SQL interaction method of webpage end and electronic equipment
CN117290354A (en) * 2023-08-21 2023-12-26 中国银行股份有限公司 Data processing method, device, computer equipment and storage medium
CN117743506B (en) * 2023-09-04 2024-05-28 应急管理部大数据中心 Data association query method and system based on natural language
CN117056343B (en) * 2023-10-11 2024-01-23 湖北华中电力科技开发有限责任公司 Multi-source data management method and system in power grid field and electronic equipment
CN118689904B (en) * 2024-08-26 2024-11-15 珠海盈米基金销售有限公司 Order data aggregation method, system and equipment

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177180A (en) * 2019-12-11 2020-05-19 北京百分点信息科技有限公司 Data query method and device and electronic equipment
CN111177174A (en) * 2018-11-09 2020-05-19 百度在线网络技术(北京)有限公司 SQL statement generation method, device, equipment and computer readable storage medium

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10747761B2 (en) * 2017-05-18 2020-08-18 Salesforce.Com, Inc. Neural network based translation of natural language queries to database queries
CN109408526B (en) * 2018-10-12 2023-10-31 平安科技(深圳)有限公司 SQL sentence generation method, device, computer equipment and storage medium
US11789945B2 (en) * 2019-04-18 2023-10-17 Sap Se Clause-wise text-to-SQL generation
CN110825949B (en) * 2019-09-19 2024-09-13 平安科技(深圳)有限公司 Information retrieval method based on convolutional neural network and related equipment thereof
CN111274267A (en) * 2019-12-31 2020-06-12 杭州量之智能科技有限公司 Database query method and device and computer readable storage medium
CN111581229B (en) * 2020-03-25 2023-04-18 平安科技(深圳)有限公司 SQL statement generation method and device, computer equipment and storage medium
CN112380238B (en) * 2020-11-16 2024-06-28 平安科技(深圳)有限公司 Database data query method and device, electronic equipment and storage medium

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111177174A (en) * 2018-11-09 2020-05-19 百度在线网络技术(北京)有限公司 SQL statement generation method, device, equipment and computer readable storage medium
CN111177180A (en) * 2019-12-11 2020-05-19 北京百分点信息科技有限公司 Data query method and device and electronic equipment

Also Published As

Publication number Publication date
WO2022100067A1 (en) 2022-05-19
CN112380238A (en) 2021-02-19

Similar Documents

Publication Publication Date Title
CN112380238B (en) Database data query method and device, electronic equipment and storage medium
CN111814465B (en) Machine learning-based information extraction method, device, computer equipment, and medium
CN113707300B (en) Search intention recognition method, device, equipment and medium based on artificial intelligence
CN111814466B (en) Information extraction method based on machine reading understanding and related equipment thereof
CN111695439B (en) Image structured data extraction method, electronic device and storage medium
CN110502608B (en) Man-machine conversation method and man-machine conversation device based on knowledge graph
CN112447300B (en) Medical query method and device based on graph neural network, computer equipment and storage medium
WO2022105493A1 (en) Semantic recognition-based data query method and apparatus, device and storage medium
CN112287069B (en) Information retrieval method and device based on voice semantics and computer equipment
CN112650858B (en) Emergency assistance information acquisition method and device, computer equipment and medium
CN109299235B (en) Knowledge base searching method, device and computer readable storage medium
CN110134780B (en) Method, device, equipment and computer readable storage medium for generating document abstract
CN111414375A (en) Input recommendation method based on database query, electronic device and storage medium
CN113282763B (en) Text key information extraction, device, equipment and storage medium
CN103927330A (en) Method and device for determining characters with similar forms in search engine
CN111339166A (en) Thesaurus-based matching recommendation method, electronic device and storage medium
CN111857688A (en) SQL code automatic completion method, system and storage medium
CN113821622A (en) Answer retrieval method and device based on artificial intelligence, electronic equipment and medium
CN108776673A (en) Automatic switching method, device and the storage medium of relation schema
CN115062050B (en) Database query result generation method, device, equipment and storage medium
CN112084752A (en) Statement marking method, device, equipment and storage medium based on natural language
CN119719321A (en) Query statement generation method, device, equipment and storage medium
WO2021135103A1 (en) Method and apparatus for semantic analysis, computer device, and storage medium
CN119127921A (en) A method, system, device and medium for intelligent interaction of power equipment data
CN118093629A (en) Method, device, equipment and medium for generating database query statements

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant