Disclosure of Invention
Embodiments of the present application aim to provide a large-model-based retrieval enhancement method and device, an intelligent customer service system, and a medium, which improve the accuracy with which the intelligent customer service replies to user questions by building an algorithm model.
In a first aspect, a large-model-based retrieval enhancement method is provided and applied to an intelligent customer service system, where the method may include:
acquiring user question content input by a current user and a corresponding user intent;
matching the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content; the recall model is obtained by iteratively training a target neural network model on positive sample pairs and negative sample pairs formed from historical user question content stored in the customer service system and standard questions in the standard question-answer library; each piece of standard question-answer information comprises a matched standard question and its corresponding standard reply;
analyzing the preset number of pieces of standard question-answer information with a pre-trained large model based on the user intent, to obtain an output probability for each piece of standard question-answer information; and
acquiring a reply result for the user question content based on the output probabilities of the standard question-answer information.
In one possible implementation, the positive sample pairs include a pair formed by any historical user question content and a first standard question having the same historical user intent as that question content, and a pair formed by two different historical user question contents having the same historical user intent;
the negative sample pairs include a pair formed by any historical user question content and a second standard question having a historical user intent different from that of the question content, and a pair formed by two different historical user question contents having different historical user intents.
In one possible implementation, the backbone network model of the recall model is a BERT-like architecture model or a modified-LLM-architecture double-tower model.
In one possible implementation, the training process of the recall model includes:
acquiring historical user question content stored by the intelligent customer service system and the corresponding historical user intents;
constructing training sample pairs, including the positive sample pairs and the negative sample pairs, based on the historical user intents and the standard question-answer library; and
inputting the training sample pairs into the target neural network model using an in-batch contrastive learning scheme, and converging the output of the target neural network model with a preset loss function until a preset convergence condition is met, to obtain the trained recall model.
In one possible implementation, the preset loss function is a masked contrastive learning loss function.
In one possible implementation, the recall model includes an input layer, a model layer, a representation aggregation layer, a representation output layer, a normalization layer, a matching layer and an output layer;
the input layer is used for feeding the user question content to the model layer at a first input port and feeding any standard question in the standard question-answer library to the model layer at a second input port;
the model layer is used for obtaining a first representation vector of the user question content and a second representation vector of the standard question;
the representation aggregation layer is used for aggregating the first representation vector and the second representation vector over the sequence dimension, to obtain a first sentence-granularity representation vector and a second sentence-granularity representation vector;
the representation output layer is used for outputting the sequence-aggregated first and second sentence-granularity representation vectors;
the normalization layer is used for normalizing the first and second sentence-granularity representation vectors, to obtain a first normalized representation vector and a second normalized representation vector;
the matching layer is used for matching the first normalized representation vector against the second normalized representation vector with a preset matching algorithm, to obtain a matching score; and
the output layer is used for outputting the preset number of standard questions with the highest matching scores and their corresponding standard replies.
In one possible implementation, obtaining a reply result for the user question content based on the output probabilities of the standard question-answer information includes:
if the output probability of any piece of standard question-answer information is the maximum and is greater than a probability threshold, determining the standard reply in that piece of standard question-answer information as the reply result for the user question content; or
if the output probability of at least one piece of standard question-answer information is the maximum and is not greater than the probability threshold, determining the standard replies in the at least one piece of standard question-answer information as the reply result for the user question content.
In a second aspect, a large-model-based retrieval enhancement device is provided and applied to an intelligent customer service system, where the device may include:
an acquiring unit, used for acquiring user question content input by a current user and a corresponding user intent;
a matching unit, used for matching the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content; the recall model is obtained by iteratively training a target neural network model on positive and negative sample pairs formed from historical user question content stored in the customer service system and the standard question-answer library; each piece of standard question-answer information comprises a matched standard question and its corresponding standard reply; and
an analysis unit, used for analyzing the preset number of pieces of standard question-answer information with a pre-trained large model based on the user intent, to obtain an output probability for each piece of standard question-answer information;
the acquiring unit being further configured to acquire a reply result for the user question content based on the output probability of each piece of standard question-answer information.
In a third aspect, an intelligent customer service system is provided, the system comprising a processor, a communication interface, a memory and a communication bus, wherein the processor, the communication interface and the memory are in communication with each other through the communication bus;
a memory for storing a computer program;
a processor for implementing the method steps of any implementation of the first aspect when executing the program stored in the memory.
In a fourth aspect, a computer-readable storage medium is provided, in which a computer program is stored which, when executed by a processor, carries out the method steps of any implementation of the first aspect.
After acquiring the user question content input by the current user and the corresponding user intent, the large-model-based retrieval enhancement method provided by the embodiments of the present application matches the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content; the recall model is obtained by iteratively training a target neural network model on positive and negative sample pairs formed from historical user question content stored in the customer service system and standard questions in the standard question-answer library; each piece of standard question-answer information comprises a matched standard question and its corresponding standard reply. Based on the user intent, the preset number of pieces of standard question-answer information are analyzed with a pre-trained large model to obtain an output probability for each piece, and a reply result for the user question content is then obtained based on these output probabilities. By building this algorithm model, the method improves the accuracy with which the intelligent customer service replies to user questions.
Detailed Description
The following description of the embodiments of the present application is made clearly and completely with reference to the accompanying drawings; it is apparent that the described embodiments are only some, not all, embodiments of the present application. All other embodiments obtained by those skilled in the art based on the embodiments of the application without inventive effort fall within the scope of the application. Unless defined otherwise, technical or scientific terms used herein have the ordinary meaning understood by one of ordinary skill in the art to which this application belongs. The terms "first," "second," and the like, as used herein, do not denote any order, quantity, or importance, but are used merely to distinguish one element from another. The word "comprising" or "comprises" means that the elements or items preceding the word include the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected" or "coupled" and the like are not restricted to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," etc. are used merely to indicate relative positional relationships, which may change when the absolute position of the described object changes.
Fig. 1 is a flow chart of a large-model-based retrieval enhancement method provided in an embodiment of the present application. The method is applicable to question-answer retrieval in an intelligent customer service system and may be performed by that system, which may be implemented by software and/or hardware and is generally integrated on an electronic device; in this embodiment, the electronic device includes, but is not limited to, a computer device. As shown in Fig. 1, the large-model-based retrieval enhancement method provided by an embodiment of the present application may include the following steps:
Step S110: obtaining user question content input by the current user and the corresponding user intent.
The intelligent customer service system receives a user question input by the current user in voice or text form and parses out its user question content;
intent recognition is then performed on the user question content using natural language processing technology, to obtain the user intent of the user question content. For example, if the user question content is "how do I operate to get a refund today", its user intent is "refund".
Step S120: matching the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content.
Before this step is performed, the recall model needs to be trained, which includes:
acquiring historical user question content stored by the intelligent customer service system and the corresponding historical user intents;
constructing training sample pairs, including positive sample pairs and negative sample pairs, based on the historical user intents and the standard question-answer library. Specifically, a pair formed by any historical user question content and a first standard question having the same or a similar historical user intent, and a pair formed by two different historical user question contents having the same or a similar historical user intent, are taken as positive sample pairs; a pair formed by any historical user question content and a second standard question having a different or dissimilar historical user intent, and a pair formed by two different historical user question contents having different historical user intents, are taken as negative sample pairs; and
inputting the training sample pairs into the target neural network model using an in-batch contrastive learning scheme, and converging the output of the target neural network model with a preset loss function until a preset convergence condition is met, to obtain the trained recall model.
That is, the pre-trained recall model is obtained by iteratively training the target neural network model on positive and negative sample pairs formed from the historical user question content stored in the customer service system and the standard questions in the standard question-answer library.
Further, the recall model may include an input layer, a model layer, a representation aggregation layer, a representation output layer, a normalization layer, a matching layer and an output layer;
the input layer is used for feeding the user question content to the model layer at a first input port and feeding any standard question in the standard question-answer library to the model layer at a second input port;
the model layer is used for obtaining a first representation vector of the user question content and a second representation vector of the standard question;
the representation aggregation layer is used for aggregating the first representation vector and the second representation vector over the sequence dimension, to obtain a first sentence-granularity representation vector and a second sentence-granularity representation vector;
the representation output layer is used for outputting the sequence-aggregated first and second sentence-granularity representation vectors;
the normalization layer is used for normalizing the first and second sentence-granularity representation vectors, to obtain a first normalized representation vector and a second normalized representation vector;
the matching layer is used for matching the first normalized representation vector against the second normalized representation vector with a preset matching algorithm, to obtain a matching score; the matching algorithm may be a similarity measure such as the Pearson correlation coefficient, Euclidean distance or cosine similarity; and
the output layer is used for outputting a preset number (e.g., top K) of standard questions with the highest matching scores (i.e., the standard questions to be processed) and their corresponding standard replies.
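As a minimal numeric sketch of the normalization, matching and output layers just described, assuming cosine similarity as the preset matching algorithm (the toy sentence-granularity vectors below are illustrative, standing in for the model layer's real output):

```python
import numpy as np

def l2_normalize(v):
    # Normalization layer: scale each vector to unit L2 norm
    return v / np.linalg.norm(v, axis=-1, keepdims=True)

def match_top_k(query_vec, standard_vecs, k=2):
    # Matching layer: cosine similarity between the normalized query vector
    # and every normalized standard-question vector; the output layer then
    # keeps the k highest-scoring standard questions.
    q = l2_normalize(query_vec)
    s = l2_normalize(standard_vecs)
    scores = s @ q                      # dot product of unit vectors = cosine
    top_k = np.argsort(-scores)[:k]
    return top_k, scores[top_k]

# Toy sentence-granularity vectors standing in for real model-layer output
query = np.array([1.0, 0.0, 0.0])
standards = np.array([
    [0.9, 0.1, 0.0],   # close to the query
    [0.0, 1.0, 0.0],   # orthogonal to the query
    [0.7, 0.0, 0.7],   # moderately close
])
idx, scores = match_top_k(query, standards, k=2)
```

The returned indices would then be used to look up the matched standard questions and their standard replies, forming the preset number of pieces of standard question-answer information.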
In some embodiments, the backbone network model (at the model layer) of the recall model has a double-tower structure, which may be a BERT-like architecture model (e.g., BGE bge-large-zh-v1.5) or a modified large language model (Large Language Model, LLM) architecture double-tower model (e.g., Qwen1.5).
Fig. 2 is a schematic structural diagram of the recall model when the backbone network model is a BERT-like architecture model, and Fig. 3 is a schematic structural diagram of the recall model when the backbone network model is a modified-LLM-architecture model. Compared with Fig. 2, Fig. 3 adds a mapping layer (a fully connected MLP layer) between the representation aggregation layer and the representation output layer.
Input layer: for ease of illustration, the inputs are denoted by the pair (Q, A), where Q is a user question and A is a standard question; during training of the model, A may also be a similar user question with the same user intent.
Model layer: the overall backbone network has a double-tower structure, with Q input on one side and A on the other. After Q or A is input into the backbone network, the output is a hidden-layer vector, denoted $H_Q$ or $H_A$, which is the token-granularity representation vector. Its dimension is $L \times h$, where $L$ is the input sequence length and $h$ is the hidden-layer dimension.
Representation aggregation layer: the aggregation layer (AGGR) aggregates the hidden-layer vector $H_Q$ or $H_A$ over the sequence dimension, yielding the sentence-granularity representation vector $E_Q$ or $E_A$. There are several ways to aggregate:
1) When a BERT architecture is selected for the backbone network, we directly take $E_Q = H_Q[0]$, i.e., the representation vector of the first (CLS) token of the sequence serves as the sentence representation; $E_A$ is computed in the same way.
2) When an LLM architecture is selected for the backbone network, $E_Q = \mathrm{MLP}\big([\mathrm{MaxPool}(H_Q);\ \mathrm{MeanPool}(H_Q)]\big)$. That is, the hidden-layer representation is aggregated over the sequence dimension with a max-pooling function and a mean-pooling function, the two pooled vectors are concatenated, and the result is mapped through the mapping layer (a fully connected MLP layer) to obtain the sentence representation; $E_A$ is computed in the same way.
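The pooling aggregation of approach 2) can be sketched as follows; the hidden-state values and the mapping-layer weights are random placeholders (a real mapping layer is a trained fully connected layer):

```python
import numpy as np

rng = np.random.default_rng(0)
L_seq, h = 6, 8                       # sequence length L and hidden size h
H = rng.normal(size=(L_seq, h))       # token-granularity hidden states

def aggregate_llm(H, W, b):
    # Max-pool and mean-pool over the sequence (token) dimension, concatenate
    # the two pooled vectors (length 2h), then map through one fully
    # connected layer to obtain the sentence-granularity representation.
    pooled = np.concatenate([H.max(axis=0), H.mean(axis=0)])  # shape (2h,)
    return W @ pooled + b                                     # shape (h,)

W = rng.normal(size=(h, 2 * h)) * 0.1  # placeholder MLP weights
b = np.zeros(h)
E_Q = aggregate_llm(H, W, b)           # sentence representation for one input
```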
Normalization layer: the vector $E_Q$ or $E_A$ is normalized; an optional normalization scheme is the L2 norm, giving $\hat{E}_Q$ or $\hat{E}_A$.
Matching layer: the matching score is computed as $s = f(\hat{E}_Q, \hat{E}_A)$, where $f(\cdot)$ is a matching function, such as a similarity measure; one possible definition is $f(\hat{E}_Q, \hat{E}_A) = (\hat{E}_Q \cdot \hat{E}_A)/\tau$, where $\tau$ is a temperature coefficient.
Further, the preset loss function is a masked contrastive learning loss function (mask-InfoNCE), which can be expressed as:

$\mathcal{L} = -\frac{1}{N}\sum_{i=1}^{N}\log\frac{\exp\big(s(Q_i, A_i)\big)}{\sum_{j=1}^{N}\exp\big(s(Q_i, A_j) - m(A_i, A_j)\big)}$

where $m(\cdot,\cdot)$ is a mask function: when $\mathrm{ID}(A_j) = \mathrm{ID}(A_i)$ and $j \neq i$, $m(A_i, A_j)$ takes the value $10^9$, so the corresponding in-batch duplicate is masked out of the denominator; otherwise it takes the value 0. Here $\mathrm{ID}(\cdot)$ denotes a numbering function: when its argument is a standard question, its value is the number of that standard question (its number in the standard question-answer library); when its argument is a similar question, its value is the number of the standard question corresponding to that similar question. The concrete number may be the actual number of the content in the business system, or may be obtained by taking a hash function modulo.
That is, during training, the masked contrastive learning loss is computed jointly from the model's output data, the label data and the mask data.
The masked contrastive learning loss addresses the following problem: another standard question in the batch may be identical to the "standard question" of a positive pair, so directly treating it as a negative sample is unreasonable. We avoid this by introducing a mask that zeroes out the negative-sample loss contribution of any in-batch standard question identical to the positive's standard question.
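The masking idea can be sketched for a single query as follows; the similarity scores, standard-question IDs and temperature are made-up values, and a production loss would be computed in a deep learning framework over the whole batch:

```python
import numpy as np

def mask_infonce(scores, pos_idx, ids, tau=0.05):
    # scores: similarity of one query against every candidate in the batch.
    # ids: standard-question number of each candidate. An in-batch candidate
    # whose ID equals the positive's ID is a duplicate of the positive, so
    # its logit is pushed to -1e9 instead of being treated as a negative.
    logits = scores / tau
    duplicate = (ids == ids[pos_idx])
    duplicate[pos_idx] = False                    # keep the positive itself
    logits = np.where(duplicate, -1e9, logits)
    log_probs = logits - np.log(np.exp(logits).sum())
    return -log_probs[pos_idx]

scores = np.array([0.90, 0.88, 0.20, 0.10])
ids = np.array([7, 7, 3, 5])   # candidate 1 shares the positive's ID
loss = mask_infonce(scores, pos_idx=0, ids=ids)
# without the mask, candidate 1 would wrongly inflate the loss
```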
Returning to step S120: after the pre-trained recall model matches the input user question content against the standard questions in the standard question-answer library, matching scores between the user question content and the different standard questions are obtained; the preset number of standard questions with the highest matching scores, together with their corresponding standard replies, are then determined as the preset number of pieces of standard question-answer information matched with the user question content.
Step S130: analyzing the preset number of pieces of standard question-answer information with a pre-trained large model based on the user intent, to obtain an output probability for each piece of standard question-answer information.
To train the large model, a prompt is constructed from the combination of the preset number of standard questions and their corresponding standard replies output by the recall model; taking the reply that satisfies the user intent among the preset number of standard replies as the label, a preset number of classification-task training samples are constructed, and a neural network model (e.g., Qwen1.5) is trained on them to obtain the trained large model.
The preset number of pieces of standard question-answer information are then input into the trained large model, which outputs the output probability of the standard reply in each category of standard question-answer information.
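One plausible shape for the prompt construction described above (the template wording and field labels are assumptions; the embodiment does not specify the exact prompt format):

```python
def build_rerank_prompt(user_question, user_intent, candidates):
    # candidates: list of (standard_question, standard_reply) pairs returned
    # by the recall model; the prompt asks the large model to judge which
    # candidate answers the user's question.
    lines = [
        f"User question: {user_question}",
        f"User intent: {user_intent}",
        "Candidate standard question-answer pairs:",
    ]
    for i, (q, a) in enumerate(candidates, 1):
        lines.append(f"{i}. Q: {q} A: {a}")
    lines.append("Which candidate best answers the user question?")
    return "\n".join(lines)

prompt = build_rerank_prompt(
    "how do I operate to get a refund today",
    "refund",
    [("How do I request a refund?", "Open Orders and tap Refund."),
     ("How do I change my address?", "Edit it in account settings.")],
)
```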
Step S140: obtaining a reply result for the user question content based on the output probability of each piece of standard question-answer information.
If the output probability of any piece of standard question-answer information is the maximum and is greater than the probability threshold, the standard reply in that piece of standard question-answer information is determined as the reply result for the user question content.
Alternatively, if the output probability of at least one piece of standard question-answer information is the maximum and is not greater than the probability threshold, the standard replies in the at least one piece of standard question-answer information are determined as the reply result for the user question content.
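The selection rule above can be sketched as follows (the threshold value is illustrative):

```python
def select_reply(replies, probs, threshold=0.5):
    # replies: standard replies in the same order as their output
    # probabilities. If a single reply has the maximum probability and it
    # clears the threshold, return that reply alone; otherwise return all
    # replies tied for the maximum probability.
    top = max(probs)
    winners = [r for r, p in zip(replies, probs) if p == top]
    if top > threshold and len(winners) == 1:
        return winners[0]
    return winners

single = select_reply(["reply A", "reply B"], [0.8, 0.2])      # one clear winner
multiple = select_reply(["reply A", "reply B", "reply C"],
                        [0.4, 0.4, 0.2])                       # tie below threshold
```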
In some embodiments, during training of the recall model, if the number of positive sample pairs and/or negative sample pairs is too small, the training accuracy of the model will be affected. Therefore, if the number of positive or negative sample pairs is smaller than a preset number threshold, those pairs belong to a minority class of samples, and sample expansion needs to be performed on the minority class, including:
classifying the minority-class samples (e.g., by question content, user intent, etc.) according to their neighbor distribution information using a preset selective-interpolation SMOTE algorithm, to obtain target samples of different classes; and interpolating the target samples of the different classes to obtain new samples corresponding to them.
In a specific implementation, obtaining the target samples of the different classes includes: computing the k nearest neighbors of a target minority-class sample among the minority-class samples to obtain a first neighbor sample set, where k is an integer greater than 0; and computing the k nearest neighbors of the target minority-class sample among all training sample pairs to obtain a second neighbor sample set; the target minority-class sample is any one of the minority-class samples. If the first and second neighbor sample sets share no common sample, the target minority-class sample is determined to be a noise sample; if they do share common samples, it is determined to be a target sample.
Then, the target samples of the different classes are interpolated, for example by triangular or linear interpolation, to obtain the corresponding new samples.
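A minimal sketch of this selective expansion, using Euclidean distance for the neighbor computation and linear interpolation for the new samples (the toy 2-D samples and the value of k are illustrative; real samples would be feature vectors derived from question content):

```python
import numpy as np

def expand_minority(minority, majority, k=2, rng=None):
    # Selective-SMOTE sketch: a minority sample whose k nearest neighbours
    # within the minority class share no member with its k nearest
    # neighbours over all samples is treated as noise and skipped; the
    # rest are interpolated with a random minority neighbour.
    rng = rng or np.random.default_rng(0)
    all_samples = np.vstack([minority, majority])
    new = []
    for i, x in enumerate(minority):
        d_min = np.linalg.norm(minority - x, axis=1)
        nn_min = set(np.argsort(d_min)[1:k + 1])     # neighbours in minority
        d_all = np.linalg.norm(all_samples - x, axis=1)
        nn_all = set(np.argsort(d_all)[1:k + 1])     # neighbours overall
        # rows 0..len(minority)-1 of all_samples are the minority rows,
        # so the two index sets are directly comparable for minority rows
        if nn_min & nn_all:                          # not a noise sample
            j = rng.choice(list(nn_min))
            lam = rng.random()
            new.append(x + lam * (minority[j] - x))  # linear interpolation
    return np.array(new)

minority = np.array([[0.0, 0.0], [0.2, 0.1], [0.1, 0.3]])
majority = np.array([[5.0, 5.0], [5.2, 4.8], [4.9, 5.1], [5.1, 5.0]])
new_samples = expand_minority(minority, majority)
```

Each new sample lies on a segment between two minority samples, so the expanded set stays inside the minority region.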
Another expansion approach may be: clustering the minority-class samples and majority-class samples with a fuzzy clustering algorithm, and determining the cluster centers and cluster radii of the different clusters; and generating new samples based on these cluster centers and radii, where a new sample may belong to the minority class or to the majority class.
In some embodiments, after the training sample pairs are obtained, an extreme learning machine may further be trained on them. Unlike a conventional single-hidden-layer feed-forward neural network (SLFN), the extreme learning machine randomly assigns its input weights and hidden-layer biases and does not adjust them through error back-propagation. The output weights of the network are determined directly by solving a linear model, so the training stage completes in a single iteration and is extremely fast. The network structure of the extreme learning machine comprises an input layer, a hidden layer and an output layer; the connection between the input layer and the hidden layer is established through the input weights $W$, and the connection between the hidden layer and the output layer is established through the output weights $\beta$.
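The single-pass training described above can be sketched with a least-squares solve via the Moore-Penrose pseudo-inverse (the toy regression target and layer sizes are illustrative assumptions):

```python
import numpy as np

def train_elm(X, Y, hidden=20, rng=None):
    # Extreme learning machine sketch: input weights and hidden biases are
    # random and never updated; only the output weights are solved in
    # closed form, so training takes one pass with no back-propagation.
    rng = rng or np.random.default_rng(0)
    W = rng.normal(size=(X.shape[1], hidden))   # random input weights
    b = rng.normal(size=hidden)                 # random hidden biases
    H = np.tanh(X @ W + b)                      # hidden-layer output
    beta = np.linalg.pinv(H) @ Y                # output weights, solved directly
    return W, b, beta

def elm_predict(X, W, b, beta):
    return np.tanh(X @ W + b) @ beta

# Toy regression: fit y = x0 + x1 on a small sample
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(80, 2))
Y = X[:, 0] + X[:, 1]
W, b, beta = train_elm(X, Y)
pred = elm_predict(X, W, b, beta)
```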
With the large-model-based retrieval enhancement method described above, using (user question, standard question) and (user question, similar user question) sample pairs as training samples together with the masked contrastive learning loss function significantly improves the recall quality of the recall model. Moreover, by feeding the recall model's results into the large model, the top-K standard questions it recalls are re-ranked, so that the genuinely relevant reply content can be ranked further toward the front.
Corresponding to the method above, an embodiment of the present application further provides a large-model-based retrieval enhancement device, as shown in Fig. 4, comprising:
an acquiring unit 410, used for acquiring user question content input by a current user and a corresponding user intent;
a matching unit 420, used for matching the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content; the recall model is obtained by iteratively training a target neural network model on positive and negative sample pairs formed from historical user question content stored in the customer service system and the standard question-answer library; each piece of standard question-answer information comprises a matched standard question and its corresponding standard reply; and
an analysis unit 430, used for analyzing the preset number of pieces of standard question-answer information with a pre-trained large model based on the user intent, to obtain the output probability of each piece of standard question-answer information;
the acquiring unit 410 being further configured to acquire a reply result for the user question content based on the output probability of each piece of standard question-answer information.
The functions of the functional units of the large-model-based retrieval enhancement device provided by this embodiment of the application can be realized through the method steps above, so the specific working process and beneficial effects of each unit are not repeated here.
The embodiment of the application also provides an intelligent customer service system, as shown in fig. 5, which comprises a processor 510, a communication interface 520, a memory 530 and a communication bus 540, wherein the processor 510, the communication interface 520 and the memory 530 are in communication with each other through the communication bus 540.
A memory 530 for storing a computer program;
The processor 510 is configured to execute the program stored in the memory 530, and implement the following steps:
acquiring user question content input by a current user and a corresponding user intent;
matching the input user question content against standard questions in a standard question-answer library using a pre-trained recall model, to obtain a preset number of pieces of standard question-answer information matched with the user question content; the recall model is obtained by iteratively training a target neural network model on positive and negative sample pairs formed from historical user question content stored in the customer service system and standard questions in the standard question-answer library; each piece of standard question-answer information comprises a matched standard question and its corresponding standard reply;
analyzing the preset number of pieces of standard question-answer information with a pre-trained large model based on the user intent, to obtain an output probability for each piece of standard question-answer information; and
acquiring a reply result for the user question content based on the output probabilities of the standard question-answer information.
The communication bus mentioned above may be a Peripheral Component Interconnect (PCI) bus, an Extended Industry Standard Architecture (EISA) bus, or the like, and may be divided into an address bus, a data bus, a control bus, etc. For ease of illustration, only one bold line is shown in the figure, but this does not mean there is only one bus or one type of bus.
The communication interface is used for communication between the electronic device and other devices.
The memory may include random access memory (RAM) or non-volatile memory (NVM), such as at least one disk memory. Optionally, the memory may also be at least one storage device located remotely from the aforementioned processor.
The processor may be a general-purpose processor, including a central processing unit (CPU), a network processor (NP), etc.; it may also be a digital signal processor (DSP), an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA) or other programmable logic device, a discrete gate or transistor logic device, or discrete hardware components.
Since the implementation and beneficial effects of each component of the above system may be understood with reference to the steps of the embodiment shown in Fig. 1, the specific working process and beneficial effects of the intelligent customer service system provided by this embodiment of the application are not repeated here.
In yet another embodiment of the present application, a computer readable storage medium is provided, in which instructions are stored, which when run on a computer, cause the computer to perform the large model based retrieval enhancement method according to any of the above embodiments.
In yet another embodiment of the present application, there is also provided a computer program product containing instructions that, when run on a computer, cause the computer to perform the large model-based retrieval enhancement method of any of the above embodiments.
It will be appreciated by those skilled in the art that embodiments of the present application may be provided as methods, systems, or computer program products. Accordingly, embodiments of the present application may take the form of an entirely hardware embodiment, an entirely software embodiment or an embodiment combining software and hardware aspects. Furthermore, embodiments of the present application may take the form of a computer program product on one or more computer-usable storage media (including, but not limited to, disk storage, CD-ROM, optical storage, etc.) having computer-usable program code embodied therein.
Embodiments of the present application are described with reference to flowchart illustrations and/or block diagrams of methods, apparatus (systems) and computer program products according to embodiments of the application. It will be understood that each flow and/or block of the flowchart illustrations and/or block diagrams, and combinations of flows and/or blocks in the flowchart illustrations and/or block diagrams, can be implemented by computer program instructions. These computer program instructions may be provided to a processor of a general purpose computer, special purpose computer, embedded processor, or other programmable data processing apparatus to produce a machine, such that the instructions, which execute via the processor of the computer or other programmable data processing apparatus, create means for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be stored in a computer-readable memory that can direct a computer or other programmable data processing apparatus to function in a particular manner, such that the instructions stored in the computer-readable memory produce an article of manufacture including instruction means which implement the function specified in the flowchart flow or flows and/or block diagram block or blocks.
These computer program instructions may also be loaded onto a computer or other programmable data processing apparatus to cause a series of operational steps to be performed on the computer or other programmable apparatus to produce a computer implemented process such that the instructions which execute on the computer or other programmable apparatus provide steps for implementing the functions specified in the flowchart flow or flows and/or block diagram block or blocks.
While preferred embodiments of the present application have been described, additional variations and modifications in those embodiments may occur to those skilled in the art once they learn of the basic inventive concepts. It is therefore intended that the following claims be interpreted as including the preferred embodiment and all such alterations and modifications as fall within the scope of the embodiments of the application.
It will be apparent to those skilled in the art that various modifications and variations can be made in the embodiments of the present application without departing from the spirit or scope of the embodiments of the application. Thus, if such modifications and variations of the embodiments of the present application fall within the scope of the claims and the equivalents thereof, it is intended that such modifications and variations be included in the embodiments of the present application.