Detailed Description
The present invention will be described in further detail with reference to the drawings and examples, in order to make the objects, technical solutions and advantages of the present invention more apparent. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention.
The terminology used in the embodiments of the invention is for the purpose of describing particular embodiments only and is not intended to be limiting of the invention. As used in this application and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, etc. may be used in embodiments of the present invention to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another.
According to the natural language processing method provided by the embodiments of the invention, the natural language information input by the user is identified and the user's intention information is determined; a tag matching the keyword of the user intention information is then determined in a knowledge base, and the matching answer corresponding to the determined tag is acquired and output to the user. On the one hand, this reduces the interference of traditional word segmentation techniques with language processing and greatly improves the accuracy of natural language understanding; on the other hand, the unique mosaic technology makes the human-computer interaction results friendlier and more natural, which helps serve customers better.
In order to further describe the technical means and effects adopted by the present invention for achieving the intended purpose, the following detailed description is given of the specific embodiments, structures, features and effects according to the present invention with reference to the accompanying drawings and preferred embodiments.
Fig. 1 shows the implementation flow of a natural language processing method according to the first embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown, detailed as follows:
In step S101, natural language information input by a user is acquired.
In the embodiment of the invention, the natural language information input by the user may be text-format natural language information entered directly by the user; for example, when the user searches for the location of a store using a mall guide board, the text entered by the user can be analyzed. Alternatively, it may be voice-format natural language information input by the user, which is recognized by a speech recognition model and converted into text-format natural language information; for example, when the user makes an inquiry by telephone, the user's voice information can be analyzed.
In step S102, the natural language information is identified and user intention information is determined.
In the embodiment of the invention, the user intention information may be determined by a preset intention recognition model. Because many types of intention recognition model already exist, the invention does not limit the specific model, and a person skilled in the art can select a suitable one according to actual requirements. For example, the intention recognition model may be a path-feature-pair intention recognition model constructed from a semantic parse tree, or an intention recognition model based on a convolutional neural network with an attention mechanism.
In the embodiment of the invention, identifying the natural language information and determining the user intention information may proceed as follows: after text recognition is performed on the natural language information, the professional industry field to which it belongs is determined, and the user intention information is then identified using the user intention recognition model of that field.
In step S103, keyword extraction is performed on the user intention information.
In step S104, a tag matching the keyword is determined in the knowledge base, and a matching answer corresponding to the determined tag is obtained and output to the user.
According to the natural language processing method provided by this embodiment of the invention, the natural language information input by the user is identified and the user's intention information is determined; a tag matching the keyword of the user intention information is then determined in a knowledge base, and the matching answer corresponding to the determined tag is acquired and output to the user. On the one hand, this reduces the interference of traditional word segmentation techniques with language processing and greatly improves the accuracy of natural language understanding; on the other hand, the unique mosaic technology makes the human-computer interaction results friendlier and more natural, which helps serve customers better.
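For illustration only, steps S101 to S104 can be sketched as a small pipeline. The rule-based recognizer, the keyword extractor, and the knowledge base entries below are hypothetical stand-ins chosen to keep the example self-contained; the specification does not prescribe any of them.

```python
from typing import Dict, List

# Hypothetical knowledge base mapping tags to answers (invented entries).
KNOWLEDGE_BASE: Dict[str, str] = {
    "store location": "The store is on the second floor, next to the east elevator.",
    "opening hours": "The mall is open from 10:00 to 22:00.",
}


def recognize_intent(text: str) -> str:
    """S102 stand-in: a trivial rule-based recognizer (an assumption, not the patent's model)."""
    lowered = text.lower()
    if "where" in lowered or "location" in lowered:
        return "ask store location"
    if "open" in lowered or "hours" in lowered:
        return "ask opening hours"
    return "unknown"


def extract_keywords(intent: str) -> List[str]:
    """S103 stand-in: drop the leading verb and keep the rest as a single keyword."""
    words = intent.split()
    return [" ".join(words[1:])] if len(words) > 1 else []


def answer(text: str) -> str:
    """Run S101-S104 on text-format input and return the matching answer."""
    intent = recognize_intent(text)            # S102
    for keyword in extract_keywords(intent):   # S103
        if keyword in KNOWLEDGE_BASE:          # S104: tag lookup
            return KNOWLEDGE_BASE[keyword]
    return "Sorry, no matching answer was found."


print(answer("Where is the shoe store?"))
```

In a real system each stand-in would be replaced by the corresponding trained model or service; only the order of the four steps is taken from the description.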
Fig. 2 shows the implementation flow of the natural language processing method according to the second embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment is similar to the first embodiment, the difference being that step S102 specifically includes:
In step S201, the natural language information is identified and a semantic representation is generated.
In the embodiment of the present invention, one way to identify the natural language information and generate the semantic representation is to preprocess the natural language information and convert it into a structured query language (SQL) statement that identifies the industry and its requirement information.
In step S202, a text classification is determined from the semantic representation.
In the embodiment of the present invention, one way to determine the text classification from the semantic representation is to determine, from the converted SQL statement, the technical field to which the natural language information belongs.
In step S203, user intention information is determined from the semantic representation and the text classification.
According to the natural language processing method provided by this embodiment of the invention, the natural language information is first identified and a semantic representation is generated, and professional-domain recognition is then performed on the semantic representation. Once the professional domain to which the natural language information belongs has been obtained, the intention recognition model corresponding to that domain can be located promptly for processing, which effectively improves the accuracy of user intention recognition.
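A minimal sketch of steps S201 to S203 follows, assuming a token list as the semantic representation and a vocabulary-overlap classifier. Both are simplifications chosen for illustration (the description itself mentions an SQL statement as one possible representation), and the domain names, vocabularies, and per-domain intent functions are hypothetical.

```python
from typing import Callable, Dict, List, Set, Tuple

# Hypothetical domain vocabularies used by the stand-in classifier.
DOMAIN_TERMS: Dict[str, Set[str]] = {
    "retail": {"store", "price", "discount"},
    "banking": {"account", "loan", "transfer"},
}


def build_semantic_representation(text: str) -> Dict[str, List[str]]:
    """S201 stand-in: represent the input as a list of normalized tokens."""
    tokens = [t.strip(".,?!").lower() for t in text.split()]
    return {"tokens": [t for t in tokens if t]}


def classify_text(rep: Dict[str, List[str]]) -> str:
    """S202 stand-in: pick the domain whose vocabulary overlaps the tokens most."""
    tokens = set(rep["tokens"])
    scores = {domain: len(tokens & terms) for domain, terms in DOMAIN_TERMS.items()}
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else "general"


# S203 stand-in: one (hypothetical) intent model per domain.
INTENT_MODELS: Dict[str, Callable[[Dict[str, List[str]]], str]] = {
    "retail": lambda rep: "find_store" if "store" in rep["tokens"] else "retail_other",
    "banking": lambda rep: "check_loan" if "loan" in rep["tokens"] else "banking_other",
    "general": lambda rep: "unknown",
}


def determine_intent(text: str) -> Tuple[str, str]:
    """Return (domain, intent) by routing to the domain-specific model."""
    rep = build_semantic_representation(text)
    domain = classify_text(rep)
    return domain, INTENT_MODELS[domain](rep)


print(determine_intent("Can I get a loan?"))
```

The point of the sketch is the routing: the classification result selects which intent model sees the semantic representation.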
Fig. 3 shows the implementation flow of the natural language processing method according to the third embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment differs from the second embodiment in that step S203 includes the following steps:
In step S301, the semantic representation is converted into a semantic vector based on the text classification.
In the embodiment of the present invention, converting the semantic representation into a semantic vector based on the text classification may be implemented as follows: after the corresponding professional field is promptly located, the semantic representation at each time step is converted into the word vector for that time step, the resulting word vectors are modeled by a long short-term memory (LSTM) network, and the state vector at the last time step is used as the representation of the whole sentence, that is, the sentence vector. In practice, each sentence input by the user becomes a word sequence after word segmentation; the LSTM extracts features of this sequence, and for each sentence the final state vector produced by the LSTM serves as the encoding of the whole sentence.
In the embodiment of the invention, the user's text input is converted into vector input. Typically, word2vec (word vector) training is performed on user chat logs, and the user input is then converted through an Embedding layer; that is, each word of the user input (after word segmentation) is converted into a vector by the Embedding layer. This is common practice in the industry and is not described further here.
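The idea of taking the last LSTM state as the sentence vector can be sketched in plain Python as below. This is a toy forward pass under stated assumptions: the weights are random and untrained, biases are omitted, and the three-dimensional embedding table is invented. A production system would instead use trained word2vec embeddings and a deep learning framework.

```python
import math
import random
from typing import Dict, List

Vector = List[float]


def sigmoid(x: float) -> float:
    return 1.0 / (1.0 + math.exp(-x))


def matvec(matrix: List[Vector], vec: Vector) -> Vector:
    """Multiply a matrix (list of rows) by a vector."""
    return [sum(w * x for w, x in zip(row, vec)) for row in matrix]


class TinyLSTM:
    """Toy single-layer LSTM with random, untrained weights and no biases."""

    def __init__(self, input_size: int, hidden_size: int, seed: int = 0) -> None:
        rng = random.Random(seed)
        self.hidden_size = hidden_size
        cols = input_size + hidden_size
        # One weight matrix per gate: input (i), forget (f), output (o), candidate (g).
        self.w: Dict[str, List[Vector]] = {
            gate: [[rng.uniform(-0.5, 0.5) for _ in range(cols)]
                   for _ in range(hidden_size)]
            for gate in "ifog"
        }

    def encode(self, word_vectors: List[Vector]) -> Vector:
        """Run over the word sequence and return the final hidden state."""
        h = [0.0] * self.hidden_size
        c = [0.0] * self.hidden_size
        for x in word_vectors:
            z = list(x) + h  # concatenate current input with previous hidden state
            i = [sigmoid(a) for a in matvec(self.w["i"], z)]
            f = [sigmoid(a) for a in matvec(self.w["f"], z)]
            o = [sigmoid(a) for a in matvec(self.w["o"], z)]
            g = [math.tanh(a) for a in matvec(self.w["g"], z)]
            c = [fj * cj + ij * gj for fj, cj, ij, gj in zip(f, c, i, g)]
            h = [oj * math.tanh(cj) for oj, cj in zip(o, c)]
        return h


# Hypothetical embedding table standing in for a trained Embedding layer.
EMBEDDINGS: Dict[str, Vector] = {
    "where": [0.1, 0.3, -0.2],
    "is": [0.0, 0.1, 0.1],
    "store": [0.4, -0.1, 0.2],
}
UNKNOWN: Vector = [0.0, 0.0, 0.0]


def sentence_vector(words: List[str], lstm: TinyLSTM) -> Vector:
    """Look up each segmented word and encode the sequence as one vector."""
    return lstm.encode([EMBEDDINGS.get(w, UNKNOWN) for w in words])


lstm = TinyLSTM(input_size=3, hidden_size=4)
print(sentence_vector(["where", "is", "store"], lstm))
```

Because the final hidden state depends on every earlier step, it summarizes the whole word sequence, which is why it can serve as the sentence encoding.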
In step S302, probability distribution information of the user intention is determined according to the semantic vector and a preset training model, and output.
In the embodiment of the invention, the preset training model may be a neural network in the general sense; a common architecture such as a recurrent neural network (RNN), a convolutional neural network (CNN), or a fully connected network may be used.
In step S303, the user intention with the highest probability in the probability distribution is determined as the user intention information.
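Steps S302 and S303 amount to turning model scores into a probability distribution and selecting the most probable intention. The sketch below assumes the preset training model emits one raw score per intention and that a softmax is used to normalize them; the intention labels and scores are invented for illustration.

```python
import math
from typing import Dict, List, Tuple


def softmax(scores: List[float]) -> List[float]:
    """Convert raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def pick_intent(labels: List[str], scores: List[float]) -> Tuple[str, Dict[str, float]]:
    """S302: build the distribution; S303: choose the most probable intention."""
    distribution = dict(zip(labels, softmax(scores)))
    best = max(distribution, key=distribution.get)
    return best, distribution


# Hypothetical labels and scores, as if produced by the preset training model.
best, distribution = pick_intent(
    ["find_store", "opening_hours", "complaint"], [2.0, 0.5, -1.0]
)
print(best, distribution)
```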
Fig. 4 shows the implementation flow of the natural language processing method according to the fourth embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment is similar to the first embodiment, the difference being that step S104 specifically includes the following steps:
In step S401, it is determined whether the matching degree between a tag in the knowledge base and the keyword exceeds a preset matching threshold; if so, the process proceeds to step S402; if not, the process proceeds to step S403.
In step S402, a matching answer corresponding to the label that is exactly matched with the keyword is obtained and output to the user.
In step S403, a matching answer corresponding to at least one tag that fuzzily matches the keyword is obtained, the matching answer is processed by calculation, and the processed matching answer is output to the user.
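One possible reading of steps S401 to S403 is sketched below. The matching degree is computed here with `difflib.SequenceMatcher`, and the "calculation" applied to fuzzy matches is assumed to be ranking candidates by score and keeping the top few; neither choice comes from the specification, and the threshold values and knowledge base entries are hypothetical.

```python
from difflib import SequenceMatcher
from typing import Dict, List

# Hypothetical knowledge base: tag -> answer.
KNOWLEDGE_BASE: Dict[str, str] = {
    "store location": "The store is on the second floor.",
    "store hours": "The store is open from 10:00 to 22:00.",
    "parking fee": "Parking costs 5 per hour.",
}

MATCH_THRESHOLD = 0.9  # assumed preset matching threshold (S401)


def similarity(a: str, b: str) -> float:
    """Matching degree between keyword and tag, in the range [0, 1]."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio()


def match_answers(keyword: str,
                  threshold: float = MATCH_THRESHOLD,
                  fuzzy_cutoff: float = 0.5,
                  top_k: int = 2) -> List[str]:
    """Return one answer for a confident match, else ranked fuzzy candidates."""
    scored = sorted(
        ((similarity(keyword, tag), tag) for tag in KNOWLEDGE_BASE), reverse=True
    )
    best_score, best_tag = scored[0]
    if best_score > threshold:          # S401 -> S402: confident match
        return [KNOWLEDGE_BASE[best_tag]]
    # S401 -> S403: keep the best fuzzy candidates above a lower cutoff
    return [KNOWLEDGE_BASE[tag] for score, tag in scored[:top_k] if score >= fuzzy_cutoff]


print(match_answers("store location"))
print(match_answers("store"))
```

With these settings, a keyword that equals a tag returns a single answer, while a partial keyword such as "store" returns the answers of both store-related tags.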
Fig. 5 shows the implementation flow of the natural language processing method according to the fifth embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment is similar to the first embodiment, the difference being that step S104 specifically includes the following steps:
In step S501, the keywords are analyzed and processed to obtain related words with the same meaning as the keywords.
In step S502, a tag matching the related word is determined in a knowledge base, and a matching answer corresponding to the determined tag is obtained and output to a user.
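Steps S501 and S502 can be illustrated with a synonym table: the keyword is expanded into related words, and each related word is tried against the knowledge base tags. The synonym table and knowledge base entries are hypothetical, and a real system might derive related words from a thesaurus or from embedding similarity instead.

```python
from typing import Dict, List, Optional

# Hypothetical synonym table and knowledge base.
SYNONYMS: Dict[str, List[str]] = {
    "shop": ["store"],
    "toilet": ["restroom", "washroom"],
}
KNOWLEDGE_BASE: Dict[str, str] = {
    "store": "Stores are listed on the guide board at each entrance.",
    "restroom": "Restrooms are at the north end of every floor.",
}


def related_words(keyword: str) -> List[str]:
    """S501 stand-in: the keyword itself plus words with the same meaning."""
    return [keyword] + SYNONYMS.get(keyword, [])


def answer_for(keyword: str) -> Optional[str]:
    """S502: return the answer of the first tag matched by any related word."""
    for word in related_words(keyword):
        if word in KNOWLEDGE_BASE:
            return KNOWLEDGE_BASE[word]
    return None


print(answer_for("toilet"))
```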
Fig. 6 shows the implementation flow of the natural language processing method according to the sixth embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment is similar to the foregoing embodiments, the difference being that the method further includes the following steps:
In step S601, when it is determined that the user is not satisfied with the matching answer, a reminder to add an answer manually is output, and the correct matching answer that is received is used as the training set.
In step S602, when it is determined that the user is satisfied with the matching answer, the matching answer is taken as a verification set.
In step S603, the preset training model is optimized at a preset optimization period, based on the training set and the verification set.
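The feedback loop of steps S601 to S603 can be sketched as a small collector that routes each interaction into a training set or a verification set and triggers optimization periodically. The period is counted here in feedback events, which is an assumption (the description does not say whether the period is time-based or count-based), and `optimize` is a placeholder rather than a real training routine.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple

Sample = Tuple[str, str]  # (user question, answer)


@dataclass
class FeedbackCollector:
    optimization_period: int = 3  # assumed: optimize after every 3 feedback events
    training_set: List[Sample] = field(default_factory=list)
    verification_set: List[Sample] = field(default_factory=list)
    optimizations_run: int = 0
    events_seen: int = 0

    def record(self, question: str, answer: str, satisfied: bool,
               corrected_answer: Optional[str] = None) -> Optional[str]:
        """Route one interaction and return a reminder when the user is dissatisfied."""
        reminder = None
        if satisfied:
            # S602: answers the user accepted go to the verification set
            self.verification_set.append((question, answer))
        else:
            # S601: ask for a manual answer; store the correction for training
            reminder = "Please add the correct answer manually."
            if corrected_answer is not None:
                self.training_set.append((question, corrected_answer))
        self.events_seen += 1
        if self.events_seen % self.optimization_period == 0:
            self.optimize()  # S603
        return reminder

    def optimize(self) -> None:
        """Placeholder for retraining the preset model on the collected sets."""
        self.optimizations_run += 1


collector = FeedbackCollector()
collector.record("Where is the shoe store?", "Second floor.", satisfied=True)
print(collector.record("Is parking free?", "Yes.", satisfied=False,
                       corrected_answer="Parking costs 5 per hour."))
```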
Fig. 7 shows the implementation flow of the natural language processing method according to the seventh embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown. This embodiment is similar to the foregoing embodiment, the difference being that the method further includes the following steps:
In step S701, the knowledge base and its tags are updated according to the training set and the verification set.
In the embodiment of the invention, incorrectly matched answers are corrected and used as the training set, correct answers continue to be used as the verification set, and the knowledge base and its tags are updated accordingly. The knowledge base can thus be optimized in return, which further improves the accuracy and stability of the matching results.
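As a rough sketch of step S701, the update can be pictured as merging both sets back into a tag-to-answer mapping. The merge policy below (verified answers fill in missing tags, corrected answers overwrite existing ones) is an assumption made for illustration; the description does not specify how the update is carried out.

```python
from typing import Dict, List, Tuple

Sample = Tuple[str, str]  # (tag, answer)


def update_knowledge_base(knowledge_base: Dict[str, str],
                          training_set: List[Sample],
                          verification_set: List[Sample]) -> Dict[str, str]:
    """Return a new knowledge base that reflects user feedback."""
    updated = dict(knowledge_base)
    for tag, answer in verification_set:
        updated.setdefault(tag, answer)  # verified answers only fill gaps
    for tag, corrected in training_set:
        updated[tag] = corrected         # corrections replace wrong answers
    return updated


kb = {"parking fee": "Parking is free."}
new_kb = update_knowledge_base(
    kb,
    training_set=[("parking fee", "Parking costs 5 per hour.")],
    verification_set=[("store hours", "Open from 10:00 to 22:00.")],
)
print(new_kb)
```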
Fig. 8 shows the structure of a natural language processing apparatus 800 according to an eighth embodiment of the present invention. For convenience of explanation, only the portions related to this embodiment are shown, detailed as follows:
The natural language processing apparatus 800 includes an acquisition unit 801, an intention determination unit 802, a keyword extraction unit 803, and an output unit 804.
The acquisition unit 801 is configured to acquire natural language information input by a user.
In the embodiment of the invention, the natural language information input by the user may be text-format natural language information entered directly by the user; for example, when the user searches for the location of a store using a mall guide board, the text entered by the user can be analyzed. Alternatively, it may be voice-format natural language information input by the user, which is recognized by a speech recognition model and converted into text-format natural language information; for example, when the user makes an inquiry by telephone, the user's voice information can be analyzed.
The intention determination unit 802 is configured to identify the natural language information and determine user intention information.
In the embodiment of the invention, the user intention information may be determined by a preset intention recognition model. Because many types of intention recognition model already exist, the invention does not limit the specific model, and a person skilled in the art can select a suitable one according to actual requirements. For example, the intention recognition model may be a path-feature-pair intention recognition model constructed from a semantic parse tree, or an intention recognition model based on a convolutional neural network with an attention mechanism.
In the embodiment of the invention, identifying the natural language information and determining the user intention information may proceed as follows: after text recognition is performed on the natural language information, the professional industry field to which it belongs is determined, and the user intention information is then identified using the user intention recognition model of that field.
The keyword extraction unit 803 is configured to perform keyword extraction on the user intention information.
The output unit 804 is configured to determine a tag matching the keyword in the knowledge base, obtain a matching answer corresponding to the determined tag, and output the matching answer to the user.
The natural language processing apparatus provided by this embodiment of the invention identifies the natural language information input by the user and determines the user intention information; a tag matching the keyword of the user intention information is then determined in a knowledge base, and the matching answer corresponding to the determined tag is acquired and output to the user. On the one hand, this reduces the interference of traditional word segmentation techniques with language processing and greatly improves the accuracy of natural language understanding; on the other hand, the unique mosaic technology makes the human-computer interaction results friendlier and more natural, which helps serve customers better.
Fig. 9 illustrates an internal block diagram of a computer device in one embodiment. The computer device may in particular be a terminal (or a server). As shown in Fig. 9, the computer device includes a processor, a memory, a network interface, an input device, and a display screen connected by a system bus. The memory includes a non-volatile storage medium and an internal memory. The non-volatile storage medium of the computer device stores an operating system, and may also store a computer program that, when executed by the processor, causes the processor to implement the natural language processing method. The internal memory may also store a computer program which, when executed by the processor, causes the processor to perform the natural language processing method. The display screen of the computer device may be a liquid crystal display or an electronic ink display. The input device of the computer device may be a touch layer covering the display screen; a key, trackball, or touch pad arranged on the housing of the computer device; or an external keyboard, touch pad, mouse, or the like.
It will be appreciated by those skilled in the art that the structure shown in fig. 9 is merely a block diagram of a portion of the structure associated with the present application and is not limiting of the computer device to which the present application applies, and that a particular computer device may include more or fewer components than shown, or may combine some of the components, or have a different arrangement of components.
In one embodiment, the natural language processing apparatus provided herein may be implemented in the form of a computer program that is executable on a computer device as shown in fig. 9. The memory of the computer device may store therein respective program modules constituting the natural language processing apparatus, such as an acquisition unit 801, an intention determination unit 802, a keyword extraction unit 803, and an output unit 804 shown in fig. 8. The computer program constituted by the respective program modules causes the processor to execute the steps in the natural language processing method of the respective embodiments of the present application described in the present specification.
For example, the computer device shown in Fig. 9 may perform step S101 through the acquisition unit 801 of the natural language processing apparatus shown in Fig. 8. The computer device may perform step S102 through the intention determination unit 802, step S103 through the keyword extraction unit 803, and step S104 through the output unit 804.
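The mapping between program modules and method steps described above can be pictured as a class that composes four callables, one per unit. The class name, field names, and the lambda stand-ins in the usage example are hypothetical; only the unit numbering and the step each unit performs are taken from the description.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class NaturalLanguageProcessingApparatus:
    acquisition_unit: Callable[[], str]                  # unit 801, performs S101
    intention_determination_unit: Callable[[str], str]   # unit 802, performs S102
    keyword_extraction_unit: Callable[[str], List[str]]  # unit 803, performs S103
    output_unit: Callable[[List[str]], str]              # unit 804, performs S104

    def run(self) -> str:
        """Execute the four units in the order of steps S101 to S104."""
        text = self.acquisition_unit()
        intent = self.intention_determination_unit(text)
        keywords = self.keyword_extraction_unit(intent)
        return self.output_unit(keywords)


# Usage with trivial stand-in units (all behavior here is invented for illustration).
apparatus = NaturalLanguageProcessingApparatus(
    acquisition_unit=lambda: "Where is the shoe store?",
    intention_determination_unit=lambda text: "ask store location",
    keyword_extraction_unit=lambda intent: intent.split()[1:],
    output_unit=lambda keywords: "Answer for: " + " ".join(keywords),
)
print(apparatus.run())
```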
In one embodiment, a computer device is presented, the computer device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the following steps when executing the computer program:
acquiring natural language information input by a user;
identifying the natural language information and determining user intention information;
extracting keywords from the user intention information;
and determining the label matched with the keyword in a knowledge base, acquiring a matching answer corresponding to the determined label, and outputting the matching answer to a user.
In one embodiment, a computer readable storage medium is provided, having a computer program stored thereon, which when executed by a processor causes the processor to perform the steps of:
acquiring natural language information input by a user;
identifying the natural language information and determining user intention information;
extracting keywords from the user intention information;
and determining the label matched with the keyword in a knowledge base, acquiring a matching answer corresponding to the determined label, and outputting the matching answer to a user.
It should be understood that, although the steps in the flowcharts of the embodiments of the present invention are shown in the order indicated by the arrows, these steps are not necessarily performed in that order. Unless explicitly stated herein, the execution of the steps is not strictly limited to this order, and the steps may be executed in other orders. Moreover, at least some of the steps in the various embodiments may include multiple sub-steps or stages. These sub-steps or stages are not necessarily performed at the same time and may be performed at different times; nor are they necessarily performed in sequence, and they may be performed in turn or alternately with other steps or with at least a portion of the sub-steps or stages of other steps.
Those skilled in the art will appreciate that all or part of the processes in the methods of the above embodiments may be implemented by a computer program instructing the relevant hardware, where the program may be stored in a non-volatile computer-readable storage medium and, when executed, may include the processes of the method embodiments described above. Any reference to memory, storage, database, or other medium used in the various embodiments provided herein may include non-volatile and/or volatile memory. Non-volatile memory can include read-only memory (ROM), programmable ROM (PROM), electrically programmable ROM (EPROM), electrically erasable programmable ROM (EEPROM), or flash memory. Volatile memory can include random access memory (RAM) or external cache memory. By way of illustration and not limitation, RAM is available in a variety of forms such as static RAM (SRAM), dynamic RAM (DRAM), synchronous DRAM (SDRAM), double data rate SDRAM (DDR SDRAM), enhanced SDRAM (ESDRAM), Synchlink DRAM (SLDRAM), Rambus direct RAM (RDRAM), direct Rambus dynamic RAM (DRDRAM), and Rambus dynamic RAM (RDRAM), among others.
The technical features of the above-described embodiments may be combined arbitrarily. For brevity, not all possible combinations of these technical features are described; however, as long as a combination of technical features involves no contradiction, it should be considered to fall within the scope of this description.
The foregoing examples illustrate only a few embodiments of the invention and are described in detail, but they should not thereby be construed as limiting the scope of the invention. It should be noted that those skilled in the art can make several variations and modifications without departing from the spirit of the invention, all of which fall within the scope of the invention. Accordingly, the scope of protection of the present invention is to be determined by the appended claims.
The foregoing description of the preferred embodiments of the invention is not intended to be limiting, but rather is intended to cover all modifications, equivalents, and alternatives falling within the spirit and principles of the invention.