
CN111738437B - Training method, text generation device and electronic equipment - Google Patents


Info

Publication number
CN111738437B
Authority
CN
China
Prior art keywords
length
text
control vector
length control
vector
Prior art date
Legal status
Active
Application number
CN202010689980.6A
Other languages
Chinese (zh)
Other versions
CN111738437A (en)
Inventor
梁忠平
温祖杰
Current Assignee
Alipay Hangzhou Information Technology Co Ltd
Original Assignee
Alipay Hangzhou Information Technology Co Ltd
Priority date
Filing date
Publication date
Application filed by Alipay Hangzhou Information Technology Co Ltd
Priority to CN202010689980.6A
Publication of CN111738437A
Application granted
Publication of CN111738437B
Legal status: Active

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/332 Query formulation
    • G06F 16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/33 Querying
    • G06F 16/3331 Query processing
    • G06F 16/334 Query execution
    • G06F 16/3347 Query execution using vector based model
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F 16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F 16/34 Browsing; Visualisation therefor
    • G06F 16/345 Summarisation for human users
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N 3/00 Computing arrangements based on biological models
    • G06N 3/02 Neural networks
    • G06N 3/04 Architecture, e.g. interconnection topology
    • G06N 3/045 Combinations of networks

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Databases & Information Systems (AREA)
  • Artificial Intelligence (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • General Health & Medical Sciences (AREA)
  • Evolutionary Computation (AREA)
  • Biophysics (AREA)
  • Software Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

One or more embodiments of the present specification provide a training method, a text generation method, an apparatus, and an electronic device. The training process adopts a Teacher-Student framework in which a teacher generation model and a student generation model are trained jointly. The teacher generation model is a general text generation model from which the student generation model learns how to generate text. The student generation model introduces a first length control vector for controlling the maximum length of the output text and a second length control vector for controlling the minimum length of the output text, and through these two vectors the student generation model is trained to control the length of its output text. Based on a reinforcement learning method, a return value yields a joint loss of the teacher generation model and the student generation model, and the student generation model is trained on this loss to obtain a generation model with controllable output text length, which is used to generate output text of controllable length.

Description

Training method, text generation device and electronic equipment
Technical Field
One or more embodiments of the present disclosure relate to the field of natural language processing technologies, and in particular, to a training method, a text generation method, an apparatus, and an electronic device.
Background
Text generation is a widely used natural language processing technology that underlies many tasks, such as question answering systems, chat systems, and article summarization systems. In some application scenarios, there are constraints on the length of the generated text. For example, in news summarization, the generated summary should be neither too short nor too long because of limits on page size and display layout: a summary that is too short leaves an unsightly blank area, while one that is too long cannot be displayed in full. As another example, in a chat system there is a length requirement on machine-generated responses: a response that is too long burdens the reader and degrades the product experience, while one that is too short loses information and makes the chat uninteresting. However, existing text generation methods do not take into account limiting the length of the generated text.
Therefore, how to effectively control the length of the generated text is a problem that needs to be solved in the technical field of natural language processing at present.
Disclosure of Invention
In view of this, an object of one or more embodiments of the present disclosure is to provide a training method, a text generation method and apparatus, and an electronic device.
In view of the above, one or more embodiments of the present specification provide a training method for a generation model with controllable output text length, including:
acquiring a sample input text;
inputting the sample input text into a teacher generation model to obtain a first probability distribution corresponding to the sample input text;
constructing a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
inputting the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text;
obtaining a return value for reinforcement learning according to the first probability distribution, the second probability distribution, the first length control vector and the second length control vector;
obtaining the loss of the student generation model according to the return value and the second probability distribution;
and training the student generated model by taking the minimum loss of the student generated model as a training target so as to obtain a generated model with controllable output text length when training is finished.
Based on the same inventive concept, one or more embodiments of the present specification further provide a text generation method, including:
acquiring an input text;
constructing a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
and inputting the input text, the first length control vector and the second length control vector into the generation model with controllable output text length obtained by training according to any one of the training methods to obtain the output text corresponding to the input text.
Based on the same inventive concept, one or more embodiments of the present specification further provide a training apparatus, including:
a first obtaining module configured to obtain a sample input text;
a first determining module configured to input the sample input text into a teacher generated model to obtain a first probability distribution corresponding to the sample input text;
a first construction module configured to construct a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
a second determining module configured to input the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text;
a reward value determination module configured to derive a reward value for reinforcement learning from the first probability distribution, the second probability distribution, the first length control vector, and the second length control vector;
a first loss determination module configured to derive a loss for the student-generated model from the reward value and the second probability distribution;
and the first training module is configured to train the student generated model by taking the minimum loss of the student generated model as a training target so as to obtain the generated model with controllable output text length at the end of training.
Based on the same inventive concept, one or more embodiments of the present specification further provide a text generation apparatus, including:
a second obtaining module configured to obtain an input text;
a second construction module configured to construct a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
and the generating module is configured to input the input text, the first length control vector and the second length control vector into the generation model with controllable output text length obtained by training according to any one of the above training methods, so as to obtain an output text corresponding to the input text.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored on the memory and executable on the processor, and the processor implements the method as described in any one of the above items when executing the program.
As can be seen from the above, in the training method, text generation method, apparatus, and electronic device provided in one or more embodiments of the present specification, the training process adopts a Teacher-Student framework in which a teacher generation model and a student generation model are trained jointly. The teacher generation model is a general text generation model from which the student generation model learns how to generate text. The student generation model introduces a first length control vector for controlling the maximum length of the output text and a second length control vector for controlling the minimum length of the output text, and through these two vectors the student generation model is trained to control the length of its output text. Based on a reinforcement learning method, the return value yields a joint loss of the teacher generation model and the student generation model, and the student generation model is trained on this loss to obtain a generation model with controllable output text length, which is used to generate output text of controllable length. The disclosed scheme effectively controls the length of the generated text and can meet the length requirements of specific text generation tasks.
Drawings
In order to more clearly illustrate one or more embodiments or prior art solutions of the present specification, the drawings that are needed in the description of the embodiments or prior art will be briefly described below, and it is obvious that the drawings in the following description are only one or more embodiments of the present specification, and that other drawings may be obtained by those skilled in the art without inventive effort from these drawings.
FIG. 1 is a flow diagram of a training method in accordance with one or more embodiments of the present disclosure;
FIG. 2 is a schematic diagram of the structure and operation process of a Transformer model;
FIG. 3 is a schematic diagram of the input of a student generated model in one or more embodiments of the present disclosure;
FIG. 4 is a flow diagram of a method for generating text in accordance with one or more embodiments of the disclosure;
FIG. 5 is a schematic diagram of a training apparatus according to one or more embodiments of the present disclosure;
FIG. 6 is a block diagram of a text generation apparatus according to one or more embodiments of the present disclosure;
fig. 7 is a schematic structural diagram of an electronic device according to one or more embodiments of the present disclosure.
Detailed Description
For the purpose of promoting a better understanding of the objects, aspects and advantages of the present disclosure, reference is made to the following detailed description taken in conjunction with the accompanying drawings.
It is to be noted that unless otherwise defined, technical or scientific terms used in one or more embodiments of the present specification should have the ordinary meaning as understood by those of ordinary skill in the art to which this disclosure belongs. The use of "first," "second," and similar terms in one or more embodiments of the specification is not intended to indicate any order, quantity, or importance, but rather is used to distinguish one element from another. The word "comprising" or "comprises", and the like, means that the element or item listed before the word covers the element or item listed after the word and its equivalents, but does not exclude other elements or items.
As described in the Background section, existing text generation methods generally do not consider limiting the length of the generated text. In natural language processing, a text generation model is usually obtained by machine learning to perform the text generation task. In the course of implementing the present disclosure, the applicant found that existing text generation models contain no parameter or processing step for controlling the length of their output text, in either the training or the generation process, so they cannot control the length of the text they output.
In view of the above, one or more embodiments of the present specification provide a training method, a text generation method, and related hardware. The training method aims to obtain a generation model with controllable output text length. The training process adopts a Teacher-Student (also called knowledge distillation) framework in which a teacher generation model and a student generation model are trained jointly. The teacher generation model is a general text generation model from which the student generation model learns how to generate text. The student generation model introduces a first length control vector for controlling the maximum length of the output text and a second length control vector for controlling the minimum length of the output text, and through these two vectors the student generation model is trained to control the length of its output text. Based on a reinforcement learning method, a return value reflects both the gap between the output text and the length requirements and the difference in prediction ability between the teacher generation model and the student generation model; the loss of the student generation model is calculated from this return value, so that it becomes a joint loss of the teacher generation model and the student generation model. The student generation model is trained with the goal of minimizing this loss, yielding, at the end of training, a generation model with controllable output text length. At generation time, the input text, the first length control vector, and the second length control vector are fed into this model to obtain the output text corresponding to the input text, whose length is controlled by the first length control vector and the second length control vector so as to meet the length requirements of the specific text generation task.
The technical solutions of one or more embodiments of the present specification are described in detail below with reference to specific embodiments.
One or more embodiments of the present specification provide a training method for a generation model with controllable output text length. Referring to fig. 1, the training method includes the following steps:
step S101, obtaining a sample input text;
step S102, inputting the sample input text into a teacher generation model to obtain a first probability distribution corresponding to the sample input text;
step S103, constructing a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
step S104, inputting the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text;
step S105, obtaining a return value for reinforcement learning according to the first probability distribution, the second probability distribution, the first length control vector and the second length control vector;
step S106, obtaining the loss of the student generation model according to the return value and the second probability distribution;
step S107, training the student generation model by taking the minimum loss of the student generation model as the training target, so as to obtain a generation model with controllable output text length when training is finished.
In this embodiment, a Teacher-Student training framework is adopted, and the teacher generation model and the student generation model are trained jointly. Both models are based on the Transformer architecture. Referring to fig. 2, which shows the structure of a Transformer model: the model includes a Transformer encoder and a Transformer decoder. The input text is first embedded to obtain word vectors, which are fed into the Transformer encoder; the encoder consists of several sequentially connected Transformer encoding modules, and the last encoding module outputs the encoding vector corresponding to the input text. The encoding vector is fed into the Transformer decoder, which consists of several sequentially connected Transformer decoding modules; the last decoding module outputs the probability distribution of the predicted output word. The Transformer decoder produces output words step by step to obtain the output text: at each step its input is the encoding vector produced by the Transformer encoder together with all output words produced so far, and its output is the output word of the current step.
Fig. 2 shows the structure and working process of the Transformer model only by way of example. It is to be understood that, when the training process is explained in the embodiments described later, the input to the Transformer model is the sample input text in the training sample.
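For illustration only, the stepwise decoding described above can be sketched in Python as follows. The encoder, decoder, bos_id and eos_id names are assumptions made for this sketch, and greedy decoding is shown for simplicity; the embodiment itself uses beam search.

```python
import torch

def decode_stepwise(encoder, decoder, input_ids, bos_id, eos_id, max_steps=50):
    """Sketch of the Transformer generation loop: at each step the decoder sees the
    encoder's encoding vector and all output words produced so far, and emits one word."""
    memory = encoder(input_ids)               # encoding vector of the input text
    output_ids = [bos_id]                     # start symbol
    for _ in range(max_steps):
        prev = torch.tensor([output_ids])     # all output words already produced
        logits = decoder(prev, memory)        # distribution over the next output word
        next_id = int(logits[0, -1].argmax())
        if next_id == eos_id:                 # stop at the end symbol
            break
        output_ids.append(next_id)
    return output_ids[1:]
```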
In this embodiment, a sample input text is first obtained. The sample input text comes from a sample set used to train the teacher generation model and the student generation model. Different corpora can be chosen for the sample set depending on the text generation task. For example, for a question answering task, the sample set may be the question-and-answer history between users and the robot; for a news summarization task, the sample set may be a number of news texts and their corresponding summary texts. For ease of presentation, a sample input text is denoted x_i and its corresponding sample output text is denoted y_i; together they form a training sample (x_i, y_i). Several training samples form a training set {(x_i, y_i)}, i = 1, ..., D, where D denotes the total number of training samples.
In this embodiment, after the sample input text x_i is obtained, x_i is input into the teacher generation model. The teacher generation model includes a first Transformer encoder and a first Transformer decoder. First, the sample input text is embedded: each word in the sample input text is encoded as a one-hot vector and then passed through word embedding to extract the features of each word, yielding a first word vector for each word in the sample input text. The word embedding algorithm can be chosen freely, for example Word2Vec or GloVe.
Then, a first position vector is generated for each word in the sample input text. Since the Transformer model uses global information and does not utilize sequential information of words, position embedding is required for each word in the sample input text to generate a corresponding first position vector to represent the position feature in the text. The specific method for generating the first position vector may adopt an existing arbitrary position embedding method, which is not limited in this embodiment.
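As one concrete possibility (the embodiment leaves the position-embedding method open), the fixed sinusoidal position embedding of the original Transformer could be used; the sketch below is illustrative, not the method required by the embodiment.

```python
import math
import torch

def sinusoidal_position_embeddings(seq_len: int, dim: int) -> torch.Tensor:
    """Return a (seq_len, dim) matrix of fixed sinusoidal position vectors (dim must be even)."""
    positions = torch.arange(seq_len, dtype=torch.float32).unsqueeze(1)              # (seq_len, 1)
    div = torch.exp(torch.arange(0, dim, 2, dtype=torch.float32) * (-math.log(10000.0) / dim))
    emb = torch.zeros(seq_len, dim)
    emb[:, 0::2] = torch.sin(positions * div)   # even dimensions
    emb[:, 1::2] = torch.cos(positions * div)   # odd dimensions
    return emb
```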
For the first Transformer encoder, the first word vector and the first position vector of each word in the sample input text are linearly combined, and the result is used as the representation vector of that word. In this embodiment the first word vector and the first position vector are simply added, and the sum is input into the first Transformer encoder; after passing through the several Transformer encoding modules of the first Transformer encoder in sequence, the first encoding vector of each word in the sample input text is obtained. Within each Transformer encoding module, the input passes in turn through a Multi-Head Attention layer, an Add & Norm layer, a Feed Forward layer, and another Add & Norm layer. The Multi-Head Attention layer consists of several Self-Attention heads and extracts and combines the semantic features of words in different semantic contexts. The Feed Forward layer is two fully connected hidden layers, and the Add & Norm layer performs residual connection and normalization. Since the method of this embodiment does not modify the structure or operation of these layers, they are not described in further detail here.
For the first Transformer decoder, the first encoding vector produced by the first Transformer encoder is input into the first Transformer decoder, which outputs a sequence of output words step by step. In the first step, the first encoding vector and a start symbol are input into the first Transformer decoder to obtain the output word of the first step. In each subsequent step, the first Transformer decoder generates the output word of the current step from the first encoding vector and all output words already produced. The output of the first Transformer decoder takes the form of a probability distribution; several candidate output word sequences and their probabilities are obtained, and a beam search algorithm then selects the final output word sequence as the output text. For ease of representation, the first probability distribution output by the first Transformer decoder can be written as

p(y_t | x_i, y_<t; θ)

where p is the first probability distribution; y_t is the output word of the teacher generation model at the current step; x_i is the sample input text; y_<t denotes all output words produced by the teacher generation model from the first step up to the step before the current step; and θ denotes the model parameters of the teacher generation model.
After the first probability distribution is obtained, the loss of the teacher generation model is calculated. Its loss function can be written as

L_p = − Σ_{t=1}^{T} log p(y_t | x_i, y_<t; θ)

where L_p is the loss of the teacher generation model and T is the actual length of the output text produced by the teacher generation model, i.e. the number of words in that output text, obtained from the beam search statistics. After the loss of the teacher generation model is obtained, the model parameters of the teacher generation model are updated with stochastic gradient descent, taking the minimization of this loss as the training target, thereby training the teacher generation model.
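A minimal sketch of a negative log-likelihood loss of this kind is shown below, assuming per-step decoder logits; treat the exact form as an assumption of this sketch rather than the embodiment's precise formula.

```python
import torch
import torch.nn.functional as F

def teacher_loss(logits: torch.Tensor, target_ids: torch.Tensor) -> torch.Tensor:
    """Assumed form of L_p: negative log-likelihood summed over the T output steps.
    logits:     (T, vocab_size) scores from the first Transformer decoder
    target_ids: (T,) indices of the T output words"""
    log_probs = F.log_softmax(logits, dim=-1)
    return -log_probs[torch.arange(target_ids.size(0)), target_ids].sum()
```

Minimizing such a loss with stochastic gradient descent corresponds to the parameter update described above.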
In this embodiment, the sample input text x_i is also input into the student generation model. The student generation model includes a second Transformer encoder and a second Transformer decoder. The input to the second Transformer encoder comprises the second word vectors obtained by embedding the sample input text and the second position vectors obtained by position embedding of the sample input text, and in addition a first length control vector for controlling the maximum length of the output text and a second length control vector for controlling the minimum length of the output text.
Referring to fig. 3, which shows the input of the student generation model. Each rectangular box in fig. 3 represents a vector; all vectors have the same dimension, and in the following description the characters, letters, or numbers inside a box denote the corresponding vector. In this embodiment the sample input text "I love Beijing Tian'an" is taken as an example: word embedding yields the second word vectors of its six words, and a start symbol <B> is added at the head of the sequence, giving all the second word vectors. (1) to (6) are the second position vectors representing the position of the corresponding second word vector in the text; for example, "me" corresponds to (1), indicating that "me" is the first word of the text, and "jing" corresponds to (4), indicating that "jing" is the fourth word of the text.
0 to 6 are the first length control vectors; each corresponds to one second word vector and indicates at most how many output words may still be produced when decoding uses the corresponding second word vector. Taken together, the first length control vectors control the maximum length of the output text. In this embodiment, the first length control vectors restrict the output text to at most six output words. For example, the start symbol "<B>", which is used to generate the output word of the first step during decoding, has first length control vector 6, indicating that at most six output words in total may be generated from the current step onward; the second word vector "jing" has first length control vector 2, indicating that when "jing" is used in decoding to generate an output word, at most two more output words may be generated. When constructing the first length control vectors 0 to 6, the values of their dimensions can be preset arbitrarily; it is only required that the first length control vectors 0 to 6 are mutually distinct. For example, every dimension of the first length control vector 6 may take the value 6.
0' to 4' are the second length control vectors; each corresponds to one second word vector and indicates at least how many output words still need to be produced when decoding uses the corresponding second word vector. Taken together, the second length control vectors control the minimum length of the output text. In this embodiment, the second length control vectors require the output text to contain at least four output words. For example, the start symbol "<B>", which is used to generate the output word of the first step during decoding, has second length control vector 4', indicating that at least four output words in total still need to be generated from the current step onward; the second word vector "jing" has second length control vector 0', indicating that by the time "jing" is used in decoding to generate an output word, the number of output words already generated satisfies the minimum length of the output text. The second length control vectors 0' to 4' can be constructed in the same way as the first length control vectors, that is, it is only necessary that the control vectors are mutually distinct.
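The countdown scheme of fig. 3 can be written down directly. The helper below is an illustrative sketch of one plausible reading of that scheme (integer countdowns from the maximum and minimum lengths, clipped at zero, later mapped to mutually distinct vectors); the function name is an assumption.

```python
def length_control_indices(seq_len, max_len, min_len):
    """Per-position countdown indices for the start symbol plus the input words.
    Fig. 3 example with seq_len=7, max_len=6, min_len=4:
      first  -> [6, 5, 4, 3, 2, 1, 0]   (at most this many output words remain)
      second -> [4, 3, 2, 1, 0, 0, 0]   (at least this many output words remain)"""
    first = [max(max_len - i, 0) for i in range(seq_len)]
    second = [max(min_len - i, 0) for i in range(seq_len)]
    return first, second
```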
For the second Transformer encoder, referring to fig. 3, the second word vector, the second position vector, the first length control vector, and the second length control vector of each word in the sample input text are linearly combined, and the result is the representation vector of that word. In this embodiment the second word vector, the second position vector, the first length control vector, and the second length control vector of each word are simply added; the sum is input into the second Transformer encoder and, after passing through its several Transformer encoding modules in sequence, yields the second encoding vector of each word in the sample input text. The second Transformer encoder works in the same way as the first Transformer encoder; see the description of the first Transformer encoder above.
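A sketch of that linear combination (direct addition of the four vectors) follows; the embedding tables and sizes here are assumptions used only to make the example self-contained.

```python
import torch
import torch.nn as nn

dim = 512
word_table    = nn.Embedding(30000, dim)   # second word vectors
pos_table     = nn.Embedding(512, dim)     # second position vectors
max_len_table = nn.Embedding(64, dim)      # first length control vectors (countdown indices)
min_len_table = nn.Embedding(64, dim)      # second length control vectors (countdown indices)

def student_encoder_input(word_ids, pos_ids, max_ids, min_ids):
    """Representation vector of each word = sum of its second word vector, second
    position vector, first length control vector and second length control vector;
    the result is fed into the second Transformer encoder."""
    return (word_table(word_ids) + pos_table(pos_ids)
            + max_len_table(max_ids) + min_len_table(min_ids))
```

All four index sequences are LongTensors of the same length (the start symbol plus the words of the sample input text).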
For the second Transformer decoder, the second encoding vector generated by the second Transformer encoder is input into the second Transformer decoder, which outputs the output words step by step. The second Transformer decoder works in the same way as the first Transformer decoder; see the description of the first Transformer decoder above.
For convenience of representation, the second probability distribution output by the second Transformer decoder can be written as

q(y_t | x_i, m, n, y_<t; θ')

where q is the second probability distribution; y_t is the output word of the student generation model at the current step; x_i is the sample input text; m is the first length control vector; n is the second length control vector; y_<t denotes all output words produced by the student generation model from the first step up to the step before the current step; and θ' denotes the model parameters of the student generation model.
In this embodiment, training is carried out with a reinforcement learning method. Specifically, the first probability distribution generated by the teacher generation model, the second probability distribution generated by the student generation model, the first length control vector, and the second length control vector obtained in the preceding steps are used to compute the return value for reinforcement learning. The return value r is a linear combination of three terms: (i) a first length difference, obtained from the maximum length and the actual length, which describes the gap between the actual length of the output text produced by the student generation model and the set maximum length; (ii) a second length difference, obtained from the minimum length and the actual length, which describes the gap between the actual length of the output text produced by the student generation model and the set minimum length; and (iii) the cross entropy of the first probability distribution and the second probability distribution, which describes the difference in prediction ability between the teacher generation model and the student generation model, so that during subsequent training the student generation model learns the teacher generation model's way of generating text. Here r is the return value; T' is the actual length of the output text produced by the student generation model, i.e. the number of words in that output text, obtained from the beam search statistics; m is the maximum length of the output text of the student generation model, determined from the first length control vector; n is the minimum length of the output text of the student generation model, determined from the second length control vector; and u and v are hyper-parameters that respectively control the smoothness of the return value and the weight of the target learning value. The return value is obtained by linearly combining the three terms above.
In this embodiment, after the return value for reinforcement learning is obtained, the loss of the student generation model is calculated from the return value and the second probability distribution generated by the student generation model. This loss is denoted L_q.
Thus the loss of the student generation model incorporates the return value; through the return value it becomes a joint loss of the teacher generation model and the student generation model, realizing the Teacher-Student training framework. Moreover, under the reinforcement learning method, introducing the return value lets the student generation model be continuously optimized during training according to both the text-length differences and the difference in prediction ability between the student generation model and the teacher generation model.
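Because the return value and the student loss are described above only qualitatively, the sketch below shows one plausible shape of the computation: the two length differences and the cross entropy of the two distributions are combined linearly into the return value, and a policy-gradient-style loss weights the negative log-likelihood of the student's output by that return value. All weightings, signs, and names here are assumptions for illustration, not the embodiment's exact formulas.

```python
import torch
import torch.nn.functional as F

def return_value(T_actual, m_max, n_min, p_logits, q_logits, u=1.0, v=1.0):
    """Assumed form of r: length differences w.r.t. the maximum length m and the
    minimum length n, plus the cross entropy of the teacher distribution p and
    the student distribution q, linearly combined with hyper-parameters u and v.
    p_logits, q_logits: (steps, vocab_size) per-step decoder scores (assumed aligned)."""
    over  = max(T_actual - m_max, 0)                 # output exceeds the maximum length
    under = max(n_min - T_actual, 0)                 # output falls short of the minimum length
    p = F.softmax(p_logits, dim=-1)
    cross_entropy = -(p * F.log_softmax(q_logits, dim=-1)).sum(dim=-1).mean()
    return -u * (over + under) - v * cross_entropy

def student_loss(r, q_logits, output_ids):
    """Assumed policy-gradient form of L_q: the return value weights the negative
    log-likelihood of the student's own output sequence (a baseline is often
    subtracted in practice; omitted here for brevity)."""
    output_ids = torch.as_tensor(output_ids)
    log_q = F.log_softmax(q_logits, dim=-1)
    nll = -log_q[torch.arange(output_ids.size(0)), output_ids].sum()
    return float(r) * nll                            # r is treated as a constant weight
```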
The student generation model is trained with the minimization of its loss as the training target; when training ends, the trained student generation model is the generation model with controllable output text length. When this model is used to generate text, given a preset first length control vector and a preset second length control vector it produces output text whose length meets the requirements corresponding to those two vectors.
It can be seen from the above embodiment that, in this training method for a generation model with controllable output text length, the training process adopts the Teacher-Student framework and jointly trains the teacher generation model and the student generation model. The teacher generation model is a general text generation model from which the student generation model learns how to generate text. The student generation model introduces a first length control vector for controlling the maximum length of the output text and a second length control vector for controlling the minimum length of the output text, and through these two vectors it is trained to control the length of its output text. Based on a reinforcement learning method, the return value yields a joint loss of the teacher generation model and the student generation model; training the student generation model on this loss produces a generation model with controllable output text length, which generates output text of controllable length, effectively controls the length of the generated text, and can meet the length requirements of specific text generation tasks.
It should be noted that the method of this embodiment does not modify the internal structure or algorithm flow of the Transformer encoder, the Transformer decoder, or the beam search algorithm used above, so their detailed principles and operation are not described further.
It should be noted that the method of one or more embodiments of the present disclosure may be performed by a single device, such as a computer or server. The method of the embodiment can also be applied to a distributed scene and completed by the mutual cooperation of a plurality of devices. In such a distributed scenario, one of the devices may perform only one or more steps of the method of one or more embodiments of the present disclosure, and the devices may interact with each other to complete the method.
It should be noted that the above description describes certain embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing may also be possible or may be advantageous.
Based on the same inventive concept, one or more embodiments of the present specification further provide a text generation method. Referring to fig. 4, the text generation method includes the following steps:
step S401, acquiring an input text;
step S402, constructing a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
step S403, inputting the input text, the first length control vector, and the second length control vector into the generation model with controllable output text length obtained by training according to the training method of any of the above embodiments, so as to obtain an output text corresponding to the input text.
In this embodiment, the first length control vector and the second length control vector are correspondingly constructed according to the length requirement of the output text required by the specific text generation task, and the construction manner may refer to any one of the embodiments of the training method described above, which is not described in detail in this embodiment.
When text is generated, the input text undergoes word embedding and position embedding, is linearly combined with the constructed first length control vector and second length control vector, and the result is input into the generation model with controllable output text length, which then outputs the output text corresponding to the input text. The generation model with controllable output text length is based on the Transformer model and uses a beam search algorithm to determine the output text; it works in the same way as described in the training method embodiments above, so the details are not repeated here.
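As a usage sketch only, inference with the trained model could look like the wrapper below. The model, tokenize, encode, and beam_search names are assumptions (the embodiment specifies the inputs and the beam search decoding, not a concrete API), and length_control_indices is the helper sketched earlier.

```python
def generate_with_length_control(model, tokenize, input_text, max_len, min_len):
    """Hypothetical wrapper: build the two length control index sequences for the
    desired bounds, combine them with the word and position inputs, and decode."""
    word_ids, pos_ids = tokenize(input_text)                       # start symbol + input words
    first, second = length_control_indices(len(word_ids), max_len, min_len)
    encoding = model.encode(word_ids, pos_ids, first, second)      # second Transformer encoder
    return model.beam_search(encoding)                             # output text, length within [min_len, max_len]
```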
Therefore, by applying the generation model with controllable output text length obtained by training with the training method of the foregoing embodiments, the output text produced by the model can meet the length requirements of the text generation task.
Based on the same inventive concept, one or more embodiments of the present specification further provide a training device. Referring to fig. 5, the training apparatus includes:
a first obtaining module 501 configured to obtain a sample input text;
a first determining module 502 configured to input the sample input text into a teacher generated model, resulting in a first probability distribution corresponding to the sample input text;
a first construction module 503 configured to construct a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
a second determining module 504, configured to input the sample input text, the first length control vector, and the second length control vector into a student generation model, so as to obtain a second probability distribution corresponding to the sample input text;
a reward value determination module 505 configured to obtain a reward value for reinforcement learning according to the first probability distribution, the second probability distribution, the first length control vector and the second length control vector;
a first loss determination module 506 configured to derive a loss of the student-generated model based on the reward value and the second probability distribution;
and the first training module 507 is configured to train the student generated model with the minimum loss of the student generated model as a training target so as to obtain a generated model with controllable output text length at the end of training.
As an optional embodiment, the training apparatus further includes: a second loss determination module configured to derive a loss for the teacher-generated model based on the first probability distribution; a second training module configured to train the teacher generated model with a minimum loss of the teacher generated model as a training target.
As an alternative embodiment, the teacher generated model includes: a first transform encoder and a first transform decoder; the first determining module is specifically configured to perform embedding processing on the sample input text to obtain first word vectors corresponding to words in the sample input text; respectively generating a first position vector for each word in the sample input text; linearly combining the first word vector and the first position vector and inputting the combined first word vector and the first position vector into the first Transformer encoder to obtain first encoding vectors corresponding to all words in the sample input text; and inputting the first coding vector into the first transform decoder to obtain a first probability distribution corresponding to the sample input text.
As an alternative embodiment, the student generated model comprises: a second transform encoder and a second transform decoder; the second determining module is specifically configured to perform embedding processing on the sample input text to obtain second word vectors corresponding to words in the sample input text; respectively generating a second position vector for each word in the sample input text; linearly combining the second word vector, the second position vector, the first length control vector and the second length control vector and inputting the combined second word vector, the second position vector, the first length control vector and the second length control vector into the second transform encoder to obtain second encoding vectors corresponding to all words in the sample input text; and inputting the second coding vector into the second transform decoder to obtain a second probability distribution corresponding to the sample input text.
As an optional embodiment, the reward value determination module is specifically configured to determine, according to the first length control vector and the second length control vector, a maximum length and a minimum length of an output text output by the student generation model, respectively; determine the actual length of an output text output by the student generation model; determine a first length difference according to the maximum length and the actual length; determine a second length difference according to the minimum length and the actual length; calculate the cross entropy of the first probability distribution and the second probability distribution; and linearly combine the first length difference, the second length difference, and the cross entropy to obtain the return value.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus of the foregoing embodiment is used to implement the corresponding training method in the foregoing embodiment, and has the beneficial effects of the corresponding training method embodiment, which are not described herein again.
Based on the same inventive concept, one or more embodiments of the present specification further provide a text generation apparatus. Referring to fig. 6, the text generation apparatus includes:
a second obtaining module 601 configured to obtain an input text;
a second construction module 602 configured to construct a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
the generating module 603 is configured to input the input text, the first length control vector, and the second length control vector into the generated model with controllable output text length, which is obtained by training according to the training method described in any of the above embodiments, so as to obtain an output text corresponding to the input text.
For convenience of description, the above devices are described as being divided into various modules by functions, and are described separately. Of course, the functionality of the modules may be implemented in the same one or more software and/or hardware implementations in implementing one or more embodiments of the present description.
The apparatus in the foregoing embodiment is used to implement the corresponding text generation method in the foregoing embodiment, and has the beneficial effects of the corresponding text generation method embodiment, which are not described herein again.
Based on the same inventive concept, one or more embodiments of the present specification further provide an electronic device, which includes a memory, a processor, and a computer program stored in the memory and executable on the processor, and when the processor executes the computer program, the processor implements the training method or the text generation method according to any one of the above embodiments.
Fig. 7 is a schematic diagram illustrating a more specific hardware structure of an electronic device according to this embodiment, where the electronic device may include: a processor 1010, a memory 1020, an input/output interface 1030, a communication interface 1040, and a bus 1050. Wherein the processor 1010, memory 1020, input/output interface 1030, and communication interface 1040 are communicatively coupled to each other within the device via bus 1050.
The processor 1010 may be implemented by a general-purpose CPU (Central Processing Unit), a microprocessor, an Application Specific Integrated Circuit (ASIC), or one or more Integrated circuits, and is configured to execute related programs to implement the technical solutions provided in the embodiments of the present disclosure.
The Memory 1020 may be implemented in the form of a ROM (Read Only Memory), a RAM (Random Access Memory), a static storage device, a dynamic storage device, or the like. The memory 1020 may store an operating system and other application programs, and when the technical solution provided by the embodiments of the present specification is implemented by software or firmware, the relevant program codes are stored in the memory 1020 and called to be executed by the processor 1010.
The input/output interface 1030 is used for connecting an input/output module to input and output information. The i/o module may be configured as a component in a device (not shown) or may be external to the device to provide a corresponding function. The input devices may include a keyboard, a mouse, a touch screen, a microphone, various sensors, etc., and the output devices may include a display, a speaker, a vibrator, an indicator light, etc.
The communication interface 1040 is used for connecting a communication module (not shown in the drawings) to implement communication interaction between the present apparatus and other apparatuses. The communication module can realize communication in a wired mode (such as USB, network cable and the like) and also can realize communication in a wireless mode (such as mobile network, WIFI, Bluetooth and the like).
Bus 1050 includes a path that transfers information between various components of the device, such as processor 1010, memory 1020, input/output interface 1030, and communication interface 1040.
It should be noted that although the above-mentioned device only shows the processor 1010, the memory 1020, the input/output interface 1030, the communication interface 1040 and the bus 1050, in a specific implementation, the device may also include other components necessary for normal operation. In addition, those skilled in the art will appreciate that the above-described apparatus may also include only those components necessary to implement the embodiments of the present description, and not necessarily all of the components shown in the figures.
Those of ordinary skill in the art will understand that: the discussion of any embodiment above is meant to be exemplary only, and is not intended to intimate that the scope of the disclosure, including the claims, is limited to these examples; within the spirit of the present disclosure, features from the above embodiments or from different embodiments may also be combined, steps may be implemented in any order, and there are many other variations of different aspects of one or more embodiments of the present description as described above, which are not provided in detail for the sake of brevity.
It is intended that the one or more embodiments of the present specification embrace all such alternatives, modifications and variations as fall within the broad scope of the appended claims. Therefore, any omissions, modifications, substitutions, improvements, and the like that may be made without departing from the spirit and principles of one or more embodiments of the present disclosure are intended to be included within the scope of the present disclosure.

Claims (13)

1. A training method for a generation model with controllable output text length, comprising the following steps:
acquiring a sample input text;
inputting the sample input text into a teacher generation model to obtain a first probability distribution corresponding to the sample input text;
constructing a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
inputting the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text;
obtaining a return value for reinforcement learning according to the first probability distribution, the second probability distribution, the first length control vector and the second length control vector;
obtaining the loss of the student generation model according to the return value and the second probability distribution;
and training the student generated model by taking the minimum loss of the student generated model as a training target so as to obtain a generated model with controllable output text length when training is finished.
2. The method of claim 1, further comprising:
obtaining the loss of the teacher generated model according to the first probability distribution;
and training the teacher generated model by taking the minimum loss of the teacher generated model as a training target.
3. The method of claim 1, the teacher generated model comprising: a first transform encoder and a first transform decoder;
the step of inputting the sample input text into a teacher generation model to obtain a first probability distribution corresponding to the sample input text specifically includes:
embedding the sample input text to obtain first word vectors corresponding to all words in the sample input text;
respectively generating a first position vector for each word in the sample input text;
linearly combining the first word vector and the first position vector, and inputting the combination into the first Transformer encoder to obtain first encoding vectors corresponding to all words in the sample input text;
and inputting the first encoding vectors into the first Transformer decoder to obtain a first probability distribution corresponding to the sample input text.
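For concreteness, the teacher forward pass of claim 3 (word embeddings plus position vectors, a first Transformer encoder, then a first Transformer decoder producing the first probability distribution) could be sketched as follows in PyTorch. The hyper-parameters, the use of learned position embeddings and the choice of a plain sum as the linear combination are assumptions of this sketch.

```python
import torch
import torch.nn as nn

class TeacherGenerator(nn.Module):
    """Sketch of the teacher generation model of claim 3 (illustrative only)."""

    def __init__(self, vocab_size=30000, d_model=512, nhead=8, num_layers=6, max_len=512):
        super().__init__()
        self.word_embedding = nn.Embedding(vocab_size, d_model)      # first word vectors
        self.pos_embedding = nn.Embedding(max_len, d_model)          # first position vectors
        enc_layer = nn.TransformerEncoderLayer(d_model, nhead, batch_first=True)
        dec_layer = nn.TransformerDecoderLayer(d_model, nhead, batch_first=True)
        self.encoder = nn.TransformerEncoder(enc_layer, num_layers)  # first Transformer encoder
        self.decoder = nn.TransformerDecoder(dec_layer, num_layers)  # first Transformer decoder
        self.output_proj = nn.Linear(d_model, vocab_size)

    def forward(self, input_ids, decoder_input_ids):
        # Linearly combine (here: sum) word vectors and position vectors.
        src_pos = torch.arange(input_ids.size(1), device=input_ids.device).unsqueeze(0)
        enc_in = self.word_embedding(input_ids) + self.pos_embedding(src_pos)
        # First encoding vectors for all words of the sample input text.
        memory = self.encoder(enc_in)
        tgt_pos = torch.arange(decoder_input_ids.size(1), device=decoder_input_ids.device).unsqueeze(0)
        dec_in = self.word_embedding(decoder_input_ids) + self.pos_embedding(tgt_pos)
        # Decode against the encoder memory (causal mask omitted for brevity).
        decoded = self.decoder(dec_in, memory)
        # First probability distribution over the vocabulary at every decoding step.
        return torch.softmax(self.output_proj(decoded), dim=-1)
```

The student generation model of claim 4 has the same shape, with a second encoder/decoder pair; only its encoder input differs, as sketched after claim 4.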
4. The method of claim 1, the student generation model comprising: a second Transformer encoder and a second Transformer decoder;
the step of inputting the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text specifically includes:
embedding the sample input text to obtain second word vectors corresponding to all words in the sample input text;
respectively generating a second position vector for each word in the sample input text;
linearly combining the second word vector, the second position vector, the first length control vector and the second length control vector, and inputting the combination into the second Transformer encoder to obtain second encoding vectors corresponding to all words in the sample input text;
and inputting the second encoding vectors into the second Transformer decoder to obtain a second probability distribution corresponding to the sample input text.
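The student generation model of claim 4 mirrors the teacher sketch above with a second Transformer encoder and a second Transformer decoder; the structural difference is its encoder input, into which the two length control vectors are also folded. An equal-weight sum is assumed here for the linear combination, and the names are illustrative.

```python
import torch

def student_encoder_input(second_word_vectors: torch.Tensor,          # (batch, seq_len, d_model)
                          second_position_vectors: torch.Tensor,      # (batch, seq_len, d_model)
                          first_length_control_vector: torch.Tensor,  # (d_model,)
                          second_length_control_vector: torch.Tensor  # (d_model,)
                          ) -> torch.Tensor:
    """Linearly combine word, position and length control vectors (claim 4).
    The two length control vectors are broadcast over every token position."""
    return (second_word_vectors
            + second_position_vectors
            + first_length_control_vector
            + second_length_control_vector)
```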
5. The method of claim 1, wherein obtaining the reward value for reinforcement learning according to the first probability distribution, the second probability distribution, the first length control vector and the second length control vector specifically comprises:
determining the maximum length and the minimum length of an output text output by the student generation model according to the first length control vector and the second length control vector;
determining the actual length of an output text output by the student generation model;
determining a first length difference according to the maximum length and the actual length;
determining a second length difference according to the minimum length and the actual length;
calculating the cross entropy between the first probability distribution and the second probability distribution;
and linearly combining the first length difference, the second length difference and the cross entropy to obtain the reward value.
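One possible instantiation of the reward of claim 5 is sketched below. The claim fixes only that the two length differences and the cross entropy are linearly combined; the signs of the differences, the weights alpha/beta/gamma, and how the maximum and minimum lengths are decoded from the length control vectors are assumptions of this sketch.

```python
import torch

def reward_value(first_probs: torch.Tensor,    # (seq_len, vocab_size) teacher distribution
                 second_probs: torch.Tensor,   # (seq_len, vocab_size) student distribution
                 max_length: int,              # decoded from the first length control vector
                 min_length: int,              # decoded from the second length control vector
                 actual_length: int,           # length of the text the student produced
                 alpha: float = 1.0, beta: float = 1.0, gamma: float = 1.0) -> float:
    """Hypothetical reward: stay below max_length, above min_length, and close
    to the teacher's first probability distribution."""
    first_length_difference = max_length - actual_length    # positive when not too long
    second_length_difference = actual_length - min_length   # positive when not too short
    # Cross entropy H(teacher, student), averaged over decoding steps.
    cross_entropy = -(first_probs * torch.log(second_probs + 1e-9)).sum(dim=-1).mean()
    return (alpha * first_length_difference
            + beta * second_length_difference
            - gamma * float(cross_entropy))
```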
6. A text generation method, comprising:
acquiring an input text;
constructing a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
inputting the input text, the first length control vector and the second length control vector into a generation model with a controllable output text length obtained by training according to the training method of any one of claims 1 to 5, to obtain an output text corresponding to the input text.
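At inference time (claim 6) the caller builds a length control vector for each bound and feeds them, together with the input text, to the trained length-controllable generation model. The claims do not fix how a scalar length is encoded as a vector; a sinusoidal-style encoding is assumed below, and `tokenizer` and `model.generate` are hypothetical helpers standing in for whatever tokenisation and decoding routine the trained model exposes.

```python
import torch

def length_to_control_vector(length: int, d_model: int = 512) -> torch.Tensor:
    """Hypothetical encoding of a scalar length bound as a d_model-dimensional vector."""
    i = torch.arange(d_model, dtype=torch.float32)
    angles = length / torch.pow(torch.tensor(10000.0), i / d_model)
    return torch.where(i % 2 == 0, torch.sin(angles), torch.cos(angles))

def generate_with_length_bounds(model, tokenizer, input_text: str,
                                min_length: int, max_length: int):
    """Claim 6: construct the two length control vectors and query the trained,
    length-controllable generation model (model.generate is assumed here)."""
    first_length_control_vector = length_to_control_vector(max_length)   # upper bound
    second_length_control_vector = length_to_control_vector(min_length)  # lower bound
    input_ids = tokenizer(input_text)
    return model.generate(input_ids,
                          first_length_control_vector,
                          second_length_control_vector)
```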
7. A training apparatus, comprising:
a first obtaining module configured to obtain a sample input text;
a first determining module configured to input the sample input text into a teacher generation model to obtain a first probability distribution corresponding to the sample input text;
a first construction module configured to construct a first length control vector and a second length control vector; wherein the first length control vector is used for controlling the maximum length of the output text, and the second length control vector is used for controlling the minimum length of the output text;
a second determining module configured to input the sample input text, the first length control vector and the second length control vector into a student generation model to obtain a second probability distribution corresponding to the sample input text;
a reward value determination module configured to derive a reward value for reinforcement learning from the first probability distribution, the second probability distribution, the first length control vector, and the second length control vector;
a first loss determination module configured to obtain the loss of the student generation model according to the reward value and the second probability distribution;
and a first training module configured to train the student generation model by taking the minimum loss of the student generation model as a training target, so as to obtain the generation model with a controllable output text length when training ends.
8. The apparatus of claim 7, further comprising:
a second loss determination module configured to obtain the loss of the teacher generation model according to the first probability distribution;
and a second training module configured to train the teacher generation model by taking the minimum loss of the teacher generation model as a training target.
9. The apparatus of claim 7, the teacher generation model comprising: a first Transformer encoder and a first Transformer decoder;
the first determining module is specifically configured to: perform embedding processing on the sample input text to obtain first word vectors corresponding to the words in the sample input text; generate a first position vector for each word in the sample input text; linearly combine the first word vector and the first position vector and input the combination into the first Transformer encoder to obtain first encoding vectors corresponding to all words in the sample input text; and input the first encoding vectors into the first Transformer decoder to obtain a first probability distribution corresponding to the sample input text.
10. The apparatus of claim 7, the student generation model comprising: a second Transformer encoder and a second Transformer decoder;
the second determining module is specifically configured to: perform embedding processing on the sample input text to obtain second word vectors corresponding to the words in the sample input text; generate a second position vector for each word in the sample input text; linearly combine the second word vector, the second position vector, the first length control vector and the second length control vector and input the combination into the second Transformer encoder to obtain second encoding vectors corresponding to all words in the sample input text; and input the second encoding vectors into the second Transformer decoder to obtain a second probability distribution corresponding to the sample input text.
11. The apparatus of claim 7, the reward value determination module being specifically configured to: determine the maximum length and the minimum length of an output text output by the student generation model according to the first length control vector and the second length control vector, respectively; determine the actual length of an output text output by the student generation model; determine a first length difference according to the maximum length and the actual length; determine a second length difference according to the minimum length and the actual length; calculate the cross entropy between the first probability distribution and the second probability distribution; and linearly combine the first length difference, the second length difference and the cross entropy to obtain the reward value.
12. A text generation apparatus comprising:
a second obtaining module configured to obtain an input text;
a second construction module configured to construct a first length control vector and a second length control vector; the first length control vector is used for controlling the maximum length of an output text corresponding to the input text, and the second length control vector is used for controlling the minimum length of the output text corresponding to the input text;
a generating module configured to input the input text, the first length control vector and the second length control vector into a generation model with a controllable output text length obtained by training according to the training method of any one of claims 1 to 5, so as to obtain an output text corresponding to the input text.
13. An electronic device comprising a memory, a processor and a computer program stored on the memory and executable on the processor, the processor implementing the method of any one of claims 1 to 6 when executing the program.
CN202010689980.6A 2020-07-17 2020-07-17 Training method, text generation device and electronic equipment Active CN111738437B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010689980.6A CN111738437B (en) 2020-07-17 2020-07-17 Training method, text generation device and electronic equipment

Publications (2)

Publication Number Publication Date
CN111738437A CN111738437A (en) 2020-10-02
CN111738437B (en) 2020-11-20

Family

ID=72654834

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010689980.6A Active CN111738437B (en) 2020-07-17 2020-07-17 Training method, text generation device and electronic equipment

Country Status (1)

Country Link
CN (1) CN111738437B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112527127B (en) * 2020-12-23 2022-01-28 北京百度网讯科技有限公司 Training method and device for input method long sentence prediction model, electronic equipment and medium
CN115130549B (en) * 2022-05-25 2025-05-13 清华大学 Information selection model training method, information selection method and device
CN117787241A (en) * 2023-12-27 2024-03-29 人民网股份有限公司 Method and device for controlling length of generated text based on large language model

Citations (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016180268A1 (en) * 2015-05-13 2016-11-17 阿里巴巴集团控股有限公司 Text aggregate method and device
CN110147442A (en) * 2019-04-15 2019-08-20 深圳智能思创科技有限公司 A kind of text snippet generation system and method for length-controllable
CN111143551A (en) * 2019-12-04 2020-05-12 支付宝(杭州)信息技术有限公司 Text preprocessing method, classification method, device and equipment

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2020027619A1 (en) * 2018-08-02 2020-02-06 네오사피엔스 주식회사 Method, device, and computer readable storage medium for text-to-speech synthesis using machine learning on basis of sequential prosody feature
CN110162630B (en) * 2019-05-09 2025-06-27 深圳市腾讯信息技术有限公司 A method, device and equipment for deduplication of text

Also Published As

Publication number Publication date
CN111738437A (en) 2020-10-02

Similar Documents

Publication Publication Date Title
CN111738437B (en) Training method, text generation device and electronic equipment
CN114722182B (en) A method and system for online course recommendation based on knowledge graph
WO2022033208A1 (en) Visual dialogue method and apparatus, model training method and apparatus, electronic device, and computer readable storage medium
CN117218498B (en) Multi-modal large language model training method and system based on multi-modal encoder
CN111931517B (en) Text translation method, device, electronic equipment and storage medium
CN110956018A (en) Training method of text processing model, text processing method, text processing device and storage medium
JP2022517971A (en) Language sequence labeling methods, equipment, programs and computing equipment
CN105632251A (en) 3D virtual teacher system having voice function and method thereof
CN106354701A (en) Chinese character processing method and device
CN116993963B (en) Image processing method, device, equipment and storage medium
CN113704419A (en) Conversation processing method and device
CN115424013B (en) Model training methods, image processing methods and equipment, and media
CN114297220B (en) A data processing method, device, computer equipment and storage medium
CN111260516A (en) Data processing method, computer storage medium and related equipment
CN109992785B (en) Content calculation method, device and equipment based on machine learning
CN117332112A (en) Multimodal retrieval model training, multimodal retrieval method, electronic device, and storage medium
CN116958738A (en) Training method and device of picture recognition model, storage medium and electronic equipment
CN117539985A (en) Question-answering method and device based on language style, electronic equipment and storage medium
CN113177393A (en) Method and apparatus for improving pre-trained language model for web page structure understanding
CN117113268B (en) Multi-scale data fusion method, device, medium and electronic equipment
CN115359786B (en) Multi-intent semantic understanding model training and use method and device
CN110377915B (en) Text emotion analysis method and device, storage medium and equipment
KR20120004719A (en) Group Learning Support Method, System, Device and Terminal
Al Ka'bi Proposed Artificial Intelligence Algorithm for Developing Higher Education
CN118887493B (en) Training and reasoning method, device, equipment and medium of CLIP model

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant