Disclosure of Invention
Accordingly, an object of the embodiments of the present application is to provide an intention recognition method, apparatus, electronic device, and storage medium which, for a session initiated by a user, combine the historical sessions between the user and a chat robot with the user's habits, so as to accurately recognize the intention of the user.
In order to achieve the technical purpose, the application adopts the following technical scheme:
in a first aspect, an embodiment of the present application provides an intent recognition method, including:
acquiring a current session of a user, wherein the current session comprises a current query sent by the user to a chat robot;
acquiring a plurality of historical sessions of the user based on a time sequence, wherein the historical sessions comprise historical queries from the user within a preset time range and historical replies corresponding to the historical queries, and the historical replies comprise texts, images, voices, actions or expressions which are presented to the user by the chat robot and correspond to the historical queries;
and determining the intention of the user through a preset intention recognition model based on the current session and the historical session as an intention recognition result.
With reference to the first aspect, in some optional embodiments, before acquiring the current session of the user, the method further includes:
constructing an intention recognition model;
acquiring a plurality of historical sessions of the user based on a time sequence, wherein the historical sessions comprise historical queries from the user within a preset time range and historical replies corresponding to the historical queries, and the historical replies comprise texts, images, voices, actions or expressions which are presented to the user by the chat robot and correspond to the historical queries;
extracting features of the historical query and the historical reply to obtain feature vectors corresponding to the plurality of historical sessions, to form a feature set;
training the intention recognition model based on the feature set to obtain a trained intention recognition model serving as the preset intention recognition model.
With reference to the first aspect, in some optional embodiments, extracting features of the historical query and the historical reply to obtain feature vectors corresponding to the plurality of historical sessions, to form a feature set, includes:
when the initial input of the history query or the history reply is of a text type, extracting text features of the initial input through a preset word segmentation model to obtain the feature vectors corresponding to the plurality of history sessions, and forming the feature set;
when the initial input is of a voice type, converting the initial input into a text type through a preset voice recognition model, and extracting text features of the initial input through the preset word segmentation model to obtain the feature vectors corresponding to the plurality of historical conversations to form the feature set;
and when the initial input is of an action type, extracting action features of the initial input through a preset target prediction model to obtain feature vectors corresponding to the plurality of historical sessions to form the feature set, wherein the action features comprise a plurality of features of the trigger gesture of the user, which are extracted by the target prediction model based on the initial input.
With reference to the first aspect, in some optional embodiments, determining, based on the current session and the historical session, an intent of the user through a preset intent recognition model, as an intent recognition result, includes:
determining whether the historical queries are matched with the historical replies in the single historical session through the preset intention recognition model so as to obtain a first result which represents whether the user is satisfied with the historical replies in the single historical session;
determining, from the first result and by a preset intent update policy, one item characterizing that the user is satisfied with the historical reply in a single historical session, as the intent recognition result.
With reference to the first aspect, in some optional embodiments, determining, by the preset intent recognition model, whether the historical queries match the historical replies in the single historical session to obtain a first result that characterizes whether the user is satisfied with the historical replies in the single historical session includes:
when the user does not give a new historical query within a first preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
when the historical reply is not withdrawn by the user within a second preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
and when the user gives a new historical query within the first preset time period after the historical reply is made by the chat robot, and the new historical query is related to the historical reply, determining that the first result is that the user is satisfied with the historical reply.
With reference to the first aspect, in some optional embodiments, determining, from the first result and by a preset intent update policy, one item characterizing that the user is satisfied with the historical reply in a single historical session, as the intent recognition result, includes:
determining, as the intention recognition result, the historical session corresponding to the item in the first result that is closest in time to the current session.
With reference to the first aspect, in some optional embodiments, the method further includes: providing a current reply corresponding to the current session according to the intention recognition result.
In a second aspect, an embodiment of the present application further provides an intention recognition apparatus, including:
the first acquisition unit is used for acquiring a current session of a user, wherein the current session comprises a current query sent by the user to the chat robot;
a second obtaining unit, configured to obtain a plurality of history sessions based on a time sequence of the user, where the history sessions include history queries from the user within a preset time range, and history replies corresponding to the history queries, and the history replies include text, images, voices, actions, or expressions corresponding to the history queries that are presented to the user by the chat robot;
and the determining unit is used for determining the intention of the user through a preset intention recognition model based on the current session and the historical session as an intention recognition result.
In a third aspect, an embodiment of the present application further provides an electronic device, where the electronic device includes a processor and a memory coupled to each other, where the memory stores a computer program, and when the computer program is executed by the processor, causes the electronic device to perform the method described above.
In a fourth aspect, embodiments of the present application also provide a computer-readable storage medium having stored therein a computer program which, when run on a computer, causes the computer to perform the above-described method.
The application adopting the technical scheme has the following advantages:
In the technical scheme provided by the application, the current session of the user is first obtained, and a plurality of historical sessions of the user based on a time sequence are then obtained, the historical sessions comprising historical queries from the user within a preset time range and historical replies corresponding to the historical queries; the intention of the user is then determined through a preset intention recognition model based on the current session and the historical sessions and is used as an intention recognition result. In this way, for a session initiated by the user, the historical sessions between the user and the chat robot, the user's habits, and the like are taken into account, so that the intention of the user is recognized accurately.
Detailed Description
The present application will be described in detail below with reference to the drawings and the specific embodiments, wherein like or similar parts are designated by the same reference numerals throughout the drawings or the description, and implementations not shown or described in the drawings are in a form well known to those of ordinary skill in the art. In the description of the present application, the terms "first," "second," and the like are used merely to distinguish between descriptions and are not to be construed as indicating or implying relative importance.
Referring to fig. 1, an electronic device 100 according to an embodiment of the application may include a processor 101 and a memory 102. The memory 102 stores a computer program which, when executed by the processor 101, enables the electronic device 100 to perform the respective steps in the intent recognition method described below.
In this embodiment, the electronic device 100 may be a personal computer, a palmtop computer, a mobile phone, a cloud server, or the like. The electronic device 100 determines the intention of the user through a preset intention recognition model according to the current session and a plurality of historical sessions of the user.
Referring to fig. 2, the present application further provides an intent recognition method applied to the electronic device 100, and executed or implemented by the electronic device 100. The intention recognition method may include the steps of:
step 110, obtaining a current session of a user, wherein the current session comprises a current query sent by the user to a chat robot;
step 120, obtaining a plurality of historical sessions of the user based on time sequence, wherein the historical sessions comprise historical queries from the user within a preset time range and historical replies corresponding to the historical queries, and the historical replies comprise texts, images, voices, actions or expressions which are presented to the user by the chat robot and correspond to the historical queries;
and step 130, determining the intention of the user through a preset intention recognition model based on the current session and the historical session as an intention recognition result.
In the above-described embodiment, the current session of the user is first acquired, and a plurality of historical sessions of the user based on a time sequence are then acquired, the historical sessions including historical queries from the user within a preset time range and historical replies corresponding to the historical queries; the intention of the user is then determined as an intention recognition result through a preset intention recognition model based on the current session and the historical sessions. In this way, for a session initiated by the user, the historical sessions between the user and the chat robot, the user's habits, and the like are taken into account, so that the intention of the user is recognized accurately.
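As a rough illustration of how steps 110 to 130 fit together, the following minimal Python sketch chains the three steps. The names `Session`, `recognize_intent`, and `toy_model` are hypothetical and do not come from the application, and `toy_model` is only a stand-in for the preset intention recognition model.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Session:
    query: str        # query sent by the user to the chat robot
    reply: str = ""   # reply presented by the chat robot (empty for the current session)


def recognize_intent(
    current_query: str,
    load_history: Callable[[], List[Session]],
    model: Callable[[Session, List[Session]], str],
) -> str:
    current = Session(query=current_query)   # step 110: acquire the current session
    history = load_history()                 # step 120: acquire the historical sessions
    return model(current, history)           # step 130: determine the intention


def toy_model(current: Session, history: List[Session]) -> str:
    # Stand-in for the preset model (an assumption for illustration only):
    # report whether the current query already appears in the history.
    if any(past.query == current.query for past in history):
        return "repeated_query"
    return "new_query"
```

Passing the history loader and the model as parameters keeps the sketch independent of how the sessions are stored or how the model is trained.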
The steps of the intention recognition method will be described in detail as follows:
prior to step 110, the method may include:
constructing an intention recognition model;
acquiring a plurality of historical sessions of the user based on a time sequence, wherein the historical sessions comprise historical queries from the user within a preset time range and historical replies corresponding to the historical queries, and the historical replies comprise texts, images, voices, actions or expressions which are presented to the user by the chat robot and correspond to the historical queries;
extracting features of the historical query and the historical reply to obtain feature vectors corresponding to the plurality of historical sessions, to form a feature set;
training the intention recognition model based on the feature set to obtain a trained intention recognition model serving as the preset intention recognition model.
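The construct, extract, and train sequence above might be sketched as follows. The application does not specify the model architecture or the training labels, so this sketch assumes that each historical session carries an intent label and uses a simple word-overlap classifier as a stand-in; `extract_features` and `IntentModel` are hypothetical names.

```python
from collections import Counter, defaultdict
from typing import Dict, List, Tuple


def extract_features(query: str, reply: str) -> Counter:
    # Stand-in for feature extraction: lower-cased bag of words over query and reply.
    return Counter((query + " " + reply).lower().split())


class IntentModel:
    """Toy model: remembers which words were seen with each intent label."""

    def __init__(self) -> None:
        self.word_counts: Dict[str, Counter] = defaultdict(Counter)

    def train(self, feature_set: List[Tuple[Counter, str]]) -> "IntentModel":
        # Accumulate word counts per label (labels are an assumption of this sketch).
        for features, label in feature_set:
            self.word_counts[label].update(features)
        return self

    def predict(self, features: Counter) -> str:
        # Score each label by the word overlap between the input and that label's words.
        def score(label: str) -> int:
            counts = self.word_counts[label]
            return sum(min(n, counts[word]) for word, n in features.items())

        return max(self.word_counts, key=score)
```

The trained instance returned by `train` plays the role of the preset intention recognition model; `predict` raises `ValueError` if it is called before any training data has been seen.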
In this embodiment, extracting features of the historical query and the historical reply to obtain feature vectors corresponding to the plurality of historical sessions, to form a feature set may include:
when the initial input of the history query or the history reply is of a text type, extracting text features of the initial input through a preset word segmentation model to obtain the feature vectors corresponding to the plurality of history sessions, and forming the feature set;
when the initial input is of a voice type, converting the initial input into a text type through a preset voice recognition model, and extracting text features of the initial input through the preset word segmentation model to obtain the feature vectors corresponding to the plurality of historical conversations to form the feature set;
and when the initial input is of an action type, extracting action features of the initial input through a preset target prediction model to obtain feature vectors corresponding to the plurality of historical sessions to form the feature set, wherein the action features comprise a plurality of features of the trigger gesture of the user, which are extracted by the target prediction model based on the initial input.
Illustratively, when the initial input of the historical query or the historical reply is of a text type, for example, in any one of the historical queries the user enters the text "What day of the week is it today?" and the historical reply of the chat robot to that historical query is a text giving the day of the week, the text features of the initial input are extracted through the preset word segmentation model, and the feature vectors corresponding to the text features are obtained to form the feature set. The preset word segmentation model may be a deep network model, a convolutional network model, or the like with a keyword extraction function;
when the initial input of the historical query or the historical reply is of a voice type, for example, in any one of the historical queries the user inputs the voice "How is the weather today?" and the historical reply made by the chat robot to that historical query is a voice segment with the content "Cloudy turning sunny today", the two voice inputs are converted into text through the preset voice recognition model, the text features of the initial input are then extracted through the word segmentation model, and the feature vectors corresponding to the text features are obtained to form the feature set. The preset voice recognition model may be a deep network model, a convolutional network model, or the like with a voice-to-text function;
when the initial input of the historical query or the historical reply is of an action type, for example, in any one of the historical queries, with the electronic device 100 being a mobile phone, the user touches the mobile phone screen with three fingers simultaneously and slides downward, and the historical reply made by the chat robot to that historical query is a screenshot action, the action features of the initial input are extracted through the preset target prediction model to obtain the feature vectors corresponding to the plurality of historical sessions and form the feature set.
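The three branches above amount to a dispatch on the type of the initial input. The sketch below shows one way to express that dispatch in Python; the three models are passed in as callables, and `toy_segment`, `toy_transcribe`, and `toy_gesture` are hypothetical stand-ins that only make the example runnable, not the models described in the application.

```python
from dataclasses import dataclass
from typing import Any, Callable, List


@dataclass
class InitialInput:
    kind: str      # "text", "voice", or "action"
    payload: Any   # str for text, bytes for voice, touch tracks for action


def extract_feature_vector(
    item: InitialInput,
    segment: Callable[[str], List[float]],
    transcribe: Callable[[bytes], str],
    predict_gesture: Callable[[Any], List[float]],
) -> List[float]:
    if item.kind == "text":
        return segment(item.payload)               # word segmentation model
    if item.kind == "voice":
        return segment(transcribe(item.payload))   # voice recognition, then segmentation
    if item.kind == "action":
        return predict_gesture(item.payload)       # target prediction model
    raise ValueError(f"unsupported input type: {item.kind}")


def toy_segment(text: str) -> List[float]:
    return [float(len(text.split()))]              # feature: number of words


def toy_transcribe(audio: bytes) -> str:
    return audio.decode("utf-8")                   # pretend the audio bytes are text


def toy_gesture(tracks: List[tuple]) -> List[float]:
    # Each track is (x, y_start, y_end) in pixels for one finger.
    # Features: finger count and mean vertical displacement.
    moves = [y_end - y_start for _, y_start, y_end in tracks]
    return [float(len(tracks)), sum(moves) / len(moves)]
```

For a three-finger downward swipe such as `[(100, 200, 700), (300, 200, 700), (500, 200, 700)]`, the toy gesture extractor returns the finger count and the mean vertical displacement.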
In step 110, the user inputs the current query to the electronic device 100 via an input device (e.g., an electronic touch screen, a microphone, or an input keyboard), and the electronic device 100 obtains the current query. The current query may be acquired actively by the electronic device 100 in real time; for example, when the electronic device 100 is a mobile phone, the microphone of the mobile phone is kept in a normally open state and the voice information of the user is acquired in real time. The current query may also be acquired passively in response to an operation instruction of the user; for example, when the electronic device 100 is a mobile phone and the user long-presses the microphone button, the mobile phone acquires the voice information of the user while the microphone button is held down. That is, the manner of acquiring the current session is not particularly limited herein.
In this embodiment, the current query may be text information, voice information, pictures, motion gestures, dynamic expressions, etc. issued by the user.
In step 120, the time-series-based historical sessions stored in the memory 102 of the electronic device 100 are acquired, the historical sessions being ordered by their distance in time from the current moment, from the most recent to the earliest. The historical sessions include historical queries from the user within a preset time range and historical replies corresponding to the historical queries. A historical query may be text information, voice information, pictures, motion gestures, dynamic expressions, etc., issued by the user, and a historical reply may include text, images, voice, actions, or expressions presented to the user by the chat robot that correspond to the historical query.
In this embodiment, the preset time range may be flexibly set according to practical situations, for example, 30 days, 3 months, 6 months, 1 year, and the like.
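A minimal sketch of step 120 under these settings is given below: stored sessions are filtered to the preset time range and ordered from the most recent to the earliest. `HistorySession` and `load_history` are hypothetical names, and the 30-day default is just one of the example values mentioned above.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import List


@dataclass
class HistorySession:
    query: str           # historical query from the user
    reply: str           # historical reply presented by the chat robot
    asked_at: datetime   # time at which the historical query was issued


def load_history(
    stored: List[HistorySession],
    now: datetime,
    time_range: timedelta = timedelta(days=30),
) -> List[HistorySession]:
    # Keep only sessions whose age falls inside the preset time range.
    recent = [s for s in stored if timedelta(0) <= now - s.asked_at <= time_range]
    # Order by time distance from the current moment, shortest first.
    return sorted(recent, key=lambda s: now - s.asked_at)
```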
In step 130, determining the intention of the user through a preset intention recognition model based on the current session and the history session, as an intention recognition result, may include:
determining whether the historical queries are matched with the historical replies in the single historical session through the preset intention recognition model so as to obtain a first result which represents whether the user is satisfied with the historical replies in the single historical session;
determining, from the first result and by a preset intent update policy, one item characterizing that the user is satisfied with the historical reply in a single historical session, as the intent recognition result.
In this embodiment, determining, by the preset intent recognition model, whether the historical queries match the historical replies in the single historical session to obtain a first result that characterizes whether the user is satisfied with the historical replies in the single historical session may include:
when the user does not give a new historical query within a first preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
when the historical reply is not withdrawn by the user within a second preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
and when the user gives a new historical query within the first preset time period after the historical reply is made by the chat robot, and the new historical query is related to the historical reply, determining that the first result is that the user is satisfied with the historical reply.
The first preset duration and the second preset duration can be flexibly set according to actual conditions, such as 3 minutes, 5 minutes, 10 minutes and the like.
Illustratively, when the user does not give a new historical query within 3 minutes after the historical reply is made by the chat robot, it can be understood that the historical reply is accepted by the user, who remains on the historical reply interface and is reading it, so the first result is determined to be that the user is satisfied with the historical reply;
when the historical reply made by the chat robot can be withdrawn within two minutes (for example, the historical reply is sending an animated expression to contact A on the user's mobile phone, which can be withdrawn within two minutes) and the user does not withdraw the historical reply within those two minutes, the first result is determined to be that the user is satisfied with the historical reply;
when the user gives a new historical query within 3 minutes after the historical reply is made by the chat robot and the new historical query is related to the historical reply (for example, the historical reply is that the weather today is cloudy turning rainy, and the user gives a new historical query about the rain), that is, the user accepts the historical reply or a part of it and raises a new query on that basis, the first result is determined to be that the user is satisfied with the historical reply.
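The three satisfaction conditions can be sketched as a single check. The text does not specify how the conditions combine or how relatedness is measured, so this sketch assumes that a withdrawal inside the second preset duration means dissatisfaction, that an unrelated follow-up inside the first preset duration means dissatisfaction, and that relatedness is a shared word longer than three characters; `Observation`, `is_related`, and `is_satisfied` are hypothetical names.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional


@dataclass
class Observation:
    reply_text: str                           # historical reply made by the chat robot
    replied_at: datetime                      # when the reply was made
    next_query: Optional[str] = None          # the user's next query, if any
    next_query_at: Optional[datetime] = None
    withdrawn_at: Optional[datetime] = None   # when the user withdrew the reply, if ever


def is_related(query: str, reply: str) -> bool:
    # Assumed relatedness test: the texts share a word longer than three characters.
    def words(text: str) -> set:
        return {w for w in text.lower().split() if len(w) > 3}

    return bool(words(query) & words(reply))


def is_satisfied(
    obs: Observation,
    first: timedelta = timedelta(minutes=3),    # first preset duration
    second: timedelta = timedelta(minutes=2),   # second preset duration
) -> bool:
    # Withdrawal inside the second preset duration is treated as dissatisfaction.
    if obs.withdrawn_at is not None and obs.withdrawn_at - obs.replied_at <= second:
        return False
    # No new query inside the first preset duration: satisfied.
    follow_up = (
        obs.next_query_at is not None
        and obs.next_query_at - obs.replied_at <= first
    )
    if not follow_up:
        return True
    # A new query inside the first preset duration counts only if it is related.
    return is_related(obs.next_query or "", obs.reply_text)
```

The value returned for each historical session corresponds to one item of the first result.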
In this embodiment, determining, from the first result and by a preset intent update policy, one item characterizing that the user is satisfied with the historical reply in a single historical session, as the intent recognition result, may include:
determining, as the intention recognition result, the historical session corresponding to the item in the first result that is closest in time to the current session.
It can be appreciated that the electronic device 100 may not always be used by the same user, and the operating habits of different users differ greatly. Therefore, the historical session that is closest to the current session and with which the user was satisfied is determined as the intention recognition result, so that the result fits the operating habits of the current user as closely as possible.
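This update policy can be sketched as a selection over the judged sessions: keep those marked as satisfied and return the one closest in time to the current session. `JudgedSession` and `select_intent_result` are hypothetical names used only for this illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import List, Optional


@dataclass
class JudgedSession:
    query: str
    reply: str
    asked_at: datetime
    satisfied: bool   # this session's item in the first result


def select_intent_result(
    judged: List[JudgedSession], now: datetime
) -> Optional[JudgedSession]:
    # Keep sessions the user was satisfied with that precede the current session.
    candidates = [s for s in judged if s.satisfied and s.asked_at <= now]
    # The most recent candidate is the one closest in time to the current session.
    return max(candidates, key=lambda s: s.asked_at, default=None)
```

Returning `None` when no satisfied session exists is a choice made for this sketch; the text does not describe that case.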
As an alternative embodiment, the method may further comprise:
and providing a current answer corresponding to the current session according to the intention recognition result.
For example, when the intention recognition result is a search-type intention, specifically today's weather, the corresponding current reply made by the chat robot is "Today's weather is sunny turning cloudy"; when the intention recognition result is a chit-chat intention, specifically "Do you think I am attractive?", the corresponding current reply made by the chat robot is "Master is the most attractive girl I have ever seen"; when the intention recognition result is an action intention, specifically the user touching the electronic touch screen with three fingers simultaneously and sliding downward, that is, the user makes a screenshot gesture and intends to take a screenshot, the corresponding current reply made by the chat robot is the action of capturing the content displayed on the current screen.
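The mapping from an intention recognition result to a current reply can be sketched as a dispatch on the intention category. The category names, `IntentResult`, and the handler functions below are hypothetical; the handlers return placeholder strings instead of performing a real search or screenshot.

```python
from dataclasses import dataclass
from typing import Callable, Dict


@dataclass
class IntentResult:
    category: str   # "search", "chat", or "action" (names assumed for this sketch)
    detail: str     # e.g. "today's weather" or "screenshot"


def answer_search(detail: str) -> str:
    return f"[search result for: {detail}]"    # placeholder for a real lookup


def answer_chat(detail: str) -> str:
    return f"[chit-chat reply to: {detail}]"   # placeholder for a dialogue reply


def perform_action(detail: str) -> str:
    return f"[performed action: {detail}]"     # placeholder, e.g. for a screenshot


HANDLERS: Dict[str, Callable[[str], str]] = {
    "search": answer_search,
    "chat": answer_chat,
    "action": perform_action,
}


def make_current_reply(result: IntentResult) -> str:
    handler = HANDLERS.get(result.category)
    if handler is None:
        raise ValueError(f"unknown intention category: {result.category}")
    return handler(result.detail)
```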
Referring to fig. 3, the present application further provides an intention recognition device 200, where the intention recognition device 200 includes at least one software function module that may be stored in the memory 102 in the form of software or firmware, or built into an Operating System (OS) of the electronic device 100. The processor 101 is configured to execute executable modules stored in the memory 102, such as the software function modules and computer programs included in the intention recognition device 200.
The intention recognition apparatus 200 includes a first acquisition unit 210, a second acquisition unit 220, and a determination unit 230, and each unit has the following functions:
a first obtaining unit 210, configured to obtain a current session of a user, where the current session includes a current query sent by the user to a chat robot;
a second obtaining unit 220, configured to obtain a plurality of history sessions based on a time sequence of the user, where the history sessions include history queries from the user within a preset time range, and history replies corresponding to the history queries, and the history replies include text, images, voices, actions, or expressions corresponding to the history queries that are presented to the user by the chat robot;
a determining unit 230 for determining an intention of the user as an intention recognition result through a preset intention recognition model based on the current session and the history session.
Optionally, the intention recognition apparatus 200 may further include:
a construction unit for constructing an intention recognition model;
a third obtaining unit, configured to obtain a plurality of history sessions based on a time sequence of the user, where the history sessions include history queries from the user within a preset time range, and history replies corresponding to the history queries, and the history replies include text, images, voices, actions, or expressions corresponding to the history queries that are presented to the user by the chat robot;
the extraction unit is used for extracting features of the historical query and the historical reply to obtain feature vectors corresponding to the plurality of historical sessions, to form a feature set;
the training unit is used for training the intention recognition model based on the feature set to obtain a trained intention recognition model serving as the preset intention recognition model.
Optionally, the extraction unit is further configured to:
when the initial input of the history query or the history reply is of a text type, extracting text features of the initial input through a preset word segmentation model to obtain the feature vectors corresponding to the plurality of history sessions, and forming the feature set;
when the initial input is of a voice type, converting the initial input into a text type through a preset voice recognition model, and extracting text features of the initial input through the preset word segmentation model to obtain the feature vectors corresponding to the plurality of historical conversations to form the feature set;
and when the initial input is of an action type, extracting action features of the initial input through a preset target prediction model to obtain feature vectors corresponding to the plurality of historical sessions to form the feature set, wherein the action features comprise a plurality of features of the trigger gesture of the user, which are extracted by the target prediction model based on the initial input.
Optionally, the determining unit 230 is further configured to:
determining whether the historical queries are matched with the historical replies in the single historical session through the preset intention recognition model so as to obtain a first result which represents whether the user is satisfied with the historical replies in the single historical session;
determining, from the first result and by a preset intent update policy, one item characterizing that the user is satisfied with the historical reply in a single historical session, as the intent recognition result.
Optionally, the determining unit 230 is further configured to:
when the user does not give a new historical query within a first preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
when the historical reply is not withdrawn by the user within a second preset time period after the historical reply is made by the chat robot, determining that the first result is that the user is satisfied with the historical reply;
and when the user gives a new historical query within the first preset time period after the historical reply is made by the chat robot, and the new historical query is related to the historical reply, determining that the first result is that the user is satisfied with the historical reply.
Optionally, the determining unit 230 is further configured to:
determining, as the intention recognition result, the historical session corresponding to the item in the first result that is closest in time to the current session.
Optionally, the intention recognition apparatus 200 may further include:
and the replying unit is used for providing a current reply corresponding to the current session according to the intention recognition result.
In this embodiment, the processor 101 may be an integrated circuit chip with signal processing capability. The processor 101 may be a general-purpose processor, for example a central processing unit (Central Processing Unit, CPU); it may also be a digital signal processor (Digital Signal Processor, DSP), an application specific integrated circuit (Application Specific Integrated Circuit, ASIC), a field programmable gate array (Field-Programmable Gate Array, FPGA) or other programmable logic device, a discrete gate or transistor logic device, or a discrete hardware component, and may implement or perform the methods, steps, and logic blocks disclosed in the embodiments of the present application.
The memory 102 may be, but is not limited to, random access memory, read only memory, programmable read only memory, erasable programmable read only memory, electrically erasable programmable read only memory, and the like. In this embodiment, the memory 102 may be used to store a current session, a historical session, an intent recognition result, an intent recognition model, a feature set, a first result, a first preset duration, a second preset duration, a current answer, and the like. Of course, the memory 102 may also be used to store a program that the processor 101 executes after receiving the execution instruction.
It is understood that the electronic device 100 shown in fig. 1 is only a schematic structural diagram, and that the electronic device 100 may also include more components than those shown in fig. 1. The components shown in fig. 1 may be implemented in hardware, software, or a combination thereof.
It should be noted that, for convenience and brevity of description, specific working processes of the electronic device 100 described above may refer to corresponding processes of each step in the foregoing method, and will not be described in detail herein.
The embodiment of the application also provides a computer readable storage medium. The computer-readable storage medium has stored therein a computer program which, when run on a computer, causes the computer to execute the intention recognition method as described in the above embodiments.
From the foregoing description of the embodiments, it will be apparent to those skilled in the art that the present application may be implemented in hardware, or by means of software plus a necessary general hardware platform. Based on this understanding, the technical solution of the present application may be embodied in the form of a software product, which may be stored in a non-volatile storage medium (such as a CD-ROM, a USB flash drive, or a removable hard disk) and includes several instructions for causing a computer device (such as a personal computer, a server, or a network device) to execute the method described in the respective implementation scenarios of the present application.
In summary, the embodiments of the present application provide an intent recognition method, apparatus, electronic device 100 and storage medium. In the technical scheme, the current session of the user is first obtained, and a plurality of historical sessions of the user based on a time sequence are then obtained, the historical sessions comprising historical queries from the user within a preset time range and historical replies corresponding to the historical queries; the intention of the user is then determined through a preset intention recognition model based on the current session and the historical sessions and is used as an intention recognition result. In this way, for a session initiated by the user, the historical sessions between the user and the chat robot, the user's habits, and the like are taken into account, so that the intention of the user is recognized accurately.
In the embodiments provided in the present application, it should be understood that the disclosed apparatus, system and method may be implemented in other manners as well. The above-described apparatus, system, and method embodiments are merely illustrative, for example, flow charts and block diagrams in the figures illustrate the architecture, functionality, and operation of possible implementations of systems, methods and computer program products according to various embodiments of the present application. In this regard, each block in the flowchart or block diagrams may represent a module, segment, or portion of code, which comprises one or more executable instructions for implementing the specified logical function(s). It will also be noted that each block of the block diagrams and/or flowchart illustration, and combinations of blocks in the block diagrams and/or flowchart illustration, can be implemented by special purpose hardware-based systems which perform the specified functions or acts, or combinations of special purpose hardware and computer instructions. In addition, functional modules in the embodiments of the present application may be integrated together to form a single part, or each module may exist alone, or two or more modules may be integrated to form a single part.
The above description is only an example of the present application and is not intended to limit the scope of the present application, and various modifications and variations will be apparent to those skilled in the art. Any modification, equivalent replacement, improvement, etc. made within the spirit and principle of the present application should be included in the protection scope of the present application.