Disclosure of Invention
The present application aims to solve the technical problem of accurately determining the effective information of a text, and provides a text classification method and device.
In a first aspect, an embodiment of the present application provides a text classification method, where the method includes:
Acquiring a text to be classified;
inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and the text classification model is trained by the following modes:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relation between the i-th tag and the (i+1)-th tag;
and at the (t-1)-th decoding moment, the feature reflecting the association relation between the i-th tag and the (i+1)-th tag is determined based on the embedded information corresponding to all tags in the tag dictionary and the probability of occurrence of the i-th tag determined at the (t-1)-th decoding moment.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding based on the first coding vector, the tag relation feature and at least one auxiliary vector to obtain the decoding result, wherein the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a subject involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
In a second aspect, an embodiment of the present application provides a text classification apparatus, including:
the acquisition unit is used for acquiring the text to be classified;
the determining unit is used for inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and the text classification model is obtained by training in the following way:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relation between the i-th tag and the (i+1)-th tag;
and at the (t-1)-th decoding moment, the feature reflecting the association relation between the i-th tag and the (i+1)-th tag is determined based on the embedded information corresponding to all tags in the tag dictionary and the probability of occurrence of the i-th tag determined at the (t-1)-th decoding moment.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding based on the first coding vector, the tag relation feature and at least one auxiliary vector to obtain the decoding result, wherein the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a subject involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
Compared with the prior art, the embodiment of the application has the following advantages:
The embodiment of the application provides a text classification method, which comprises the steps of obtaining a text to be classified and then inputting the text to be classified into a text classification model, wherein the text classification model is used for determining at least one category of a text. Thus, after the text to be classified is input into the text classification model, at least one category of the text to be classified may be obtained. The text classification model is trained based on a training text and at least one label corresponding to the training text. When the text classification model is trained, the training text may first be encoded to obtain a first coding vector. Then, considering all the labels used by the text classification model for text classification, there are certain association relations among the labels, and these association relations have a certain influence on the text classification result. Therefore, in the embodiment of the application, after the first coding vector is obtained, decoding may be performed based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between the tags in the tag dictionary, and the tag dictionary comprises all the tags used by the text classification model for text classification. Further, the parameters of the text classification model are updated based on the decoding result and the at least one label corresponding to the training text. Therefore, in the embodiment of the application, the association relations between the labels in the label dictionary are considered when the text classification model is trained, so that the classification result obtained when the trained text classification model classifies the text to be classified is more accurate. Consequently, by utilizing this scheme, at least one label corresponding to the text to be classified can be accurately determined, and the labels of the text to be classified can embody the effective information of the text to be classified. In addition, in the embodiment of the application, the tag relation feature can embody the global association relations among all tags in the tag dictionary. Moreover, because the tag relation features are decoupled from the training text, the probability of overfitting can be reduced when the text classification model is trained.
Detailed Description
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The inventor of the present application has found that some texts have guiding significance for users' practical processes. For example, comment websites, forums, blogs and the like generate a large number of online comments, establishing a new communication bridge between practitioners and consumers. Among the vast, continuously emerging comments, there are some that can give guidance for the future actions of practitioners. For example, these comments may provide specific details telling "who" should present a service relating to a "particular aspect", and "in what manner".
At present, effective information in comment texts can be extracted by manual analysis, so that the effective information can subsequently be used to guide the practice process. However, in this way, the accuracy of the extracted effective information depends on subjective factors such as the experience and knowledge level of the analyst, and accuracy cannot be guaranteed.
In order to solve the above problems, the embodiment of the application provides a text classification method and a text classification device.
Various non-limiting embodiments of the present application are described in detail below with reference to the attached drawing figures.
Exemplary method
Referring to fig. 1, the flow chart of a text classification method according to an embodiment of the present application is shown. In this embodiment, the method may be performed by a terminal device or may be performed by a server, and the embodiment of the present application is not limited specifically.
In one example, the text classification method shown in FIG. 1 may include, for example, the steps S101-S102.
S101, acquiring a text to be classified.
The text to be classified is not particularly limited, and may be any text that can be used to guide a practical process.
In one example, the text to be classified may be comment text. For this case, in a specific implementation, S101 may obtain comment texts from web platforms such as comment websites, forums and blogs, then filter the comment texts to obtain the aforementioned part of comment texts that can be used to guide a practice process, and further determine the text to be classified from that part of comment texts. For example, one or more comment texts of that part may be preprocessed to obtain the text to be classified. Preprocessing the comment text includes, but is not limited to, supplementing missing commas and periods, removing irrelevant special characters, and replacing all punctuation marks with consistent strings.
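As an illustration only, the character-cleaning and punctuation-normalization steps above might look like the following Python sketch; the function name and the `<PUNCT>` placeholder are hypothetical, and the supplementation of missing commas and periods is omitted because it depends on language-specific heuristics:

```python
import re

def preprocess_comment(text: str) -> str:
    """Illustrative preprocessing of a raw comment (names are hypothetical)."""
    # Drop characters that are neither word characters, whitespace,
    # nor common punctuation ("irrelevant special characters").
    text = re.sub(r"[^\w\s.,!?;:]", "", text)
    # Replace every punctuation mark with one consistent string.
    text = re.sub(r"[.,!?;:]+", " <PUNCT> ", text)
    # Collapse repeated whitespace.
    return re.sub(r"\s+", " ", text).strip()

print(preprocess_comment("Great service!!! Would come again :)"))
# -> "Great service <PUNCT> Would come again <PUNCT>"
```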
S102, inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text.
Different types of texts have different guiding significance for practical behaviors; for example, some texts have a promoting effect, some texts have a correcting effect, and some texts have an encouraging effect. Thus, for a text, the category to which the text corresponds may be the effective information of the text. Accordingly, the category to which the text to be classified corresponds may be determined, so that the practice process can be guided based on the category of the text in the subsequent process.
In one example, after the text to be classified is obtained, the text to be classified may be input into a text classification model, where the text classification model is used for determining at least one category of a text. In other words, since the text classification model is used for determining at least one category of a text, after the text to be classified is input into the text classification model, the at least one category of the text to be classified can be obtained.
In one example, the text classification model may be pre-trained. Next, the training mode of the text classification model will be described with reference to fig. 2. Fig. 2 is a flow chart of a training method of a text classification model according to an embodiment of the present application.
It should be noted that, the model training process is a process of multiple iterative computations, each iteration can adjust parameters of the model, and the adjusted parameters participate in the next iteration computation.
Fig. 2 illustrates a round of iterative process in a training text classification model, taking a training text as an example. It will be appreciated that there are many sets of training text used to train the text classification model, and that each set of training text is treated similarly when training the text classification model. After training the plurality of groups of training texts, a text classification model with the accuracy meeting the requirement can be obtained.
The method shown in fig. 2 may include the following steps S201-S202.
S201, acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category.
Similar to the text to be classified, the training text may also be any text that can be used to guide a practical process. In one example, the training text may be comment text. For a specific implementation of acquiring the training text in this case, reference may be made to the related description section for acquiring the text to be classified in S101, and the description will not be repeated here.
In the embodiment of the application, the at least one label corresponding to the training text can be manually annotated. In one example, to ensure the accuracy of the at least one label corresponding to the training text, multiple researchers in the natural language processing field may be invited to annotate the training text. Before annotation begins, the annotators study the annotation guide file, which presents key concepts, definitions, criteria and examples. In addition, before formal annotation, the annotators may complete annotation exercises.
In addition, since multiple annotators annotate the training texts, the labels given by the annotators may not be completely consistent. In this case, the annotators may discuss the annotation results to determine the final annotation result.
It should be noted that the at least one label corresponding to the training text may be at least one label included in a label dictionary, where the label dictionary includes all labels used by the text classification model for classifying text. For comment text, for example, the label dictionary may include three labels, namely negative-preventable, positive-regular, and positive-model. Wherein:
negative-preventable means that the text conveys a negative emotion; positive-regular means that the text conveys conventional experiences with positive emotions, where those experiences are conventional behaviors exhibited by the subject; positive-model means that the text conveys innovative experiences with positive emotions, which reflect sporadic behavior in a few subjects.
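Purely for illustration, such a label dictionary can be represented as a simple index-to-name mapping; the structure below is an assumption, with the label names taken from the example above:

```python
# Hypothetical representation of the label dictionary: one label per
# category; the integer indices can later index label embeddings.
TAG_DICTIONARY = {
    0: "negative-preventable",  # the text conveys a negative emotion
    1: "positive-regular",      # positive emotion, conventional behavior
    2: "positive-model",        # positive emotion, sporadic/innovative behavior
}
```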
S202, training the text classification model by utilizing the training text and at least one label corresponding to the training text.
S202, in a specific implementation, may include S2021-S2023 as follows.
And S2021, coding the training text to obtain a first coding vector.
In one example, the text classification model may include an encoder for encoding the training text. Specifically, the encoder may encode the word embedding vectors of the training text to obtain the first coding vector. In one example, the encoder may be a Bi-directional Long Short-Term Memory (BiLSTM) network based on an attention mechanism.
Wherein, the hidden state of each word in the training text is obtained by concatenating the BiLSTM states from the two directions. Moreover, because not all words are equally important for classification prediction, an attention mechanism is adopted so that the important words receive more focus in classification prediction. In one example:
the attention mechanism assigns a weight to the i-th context vector at time step t as follows:

$$e_{ti} = v^{\top}\tanh(W_s s_t + U_c h_i + b_s) \qquad (1)$$

$$\gamma_{ti} = \frac{\exp(e_{ti})}{\sum_{j=1}^{n}\exp(e_{tj})} \qquad (2)$$

In formula (1) and formula (2):
$e_{ti}$ is the weight score;
$v$ is a weight parameter;
$W_s \in R^{d \times d}$, where $R^{d \times d}$ is a matrix of d x d;
$s_t$ is the current hidden state;
$U_c \in R^{d \times d}$;
$h_i$ is the hidden state of the i-th word;
$b_s$ is the bias vector;
$\gamma_{ti}$ is the attention weight.
The final context text representation $c_t$ at step t is calculated as shown in equation (3) below:

$$c_t = \sum_{i=1}^{n} \gamma_{ti} h_i \qquad (3)$$
Referring to fig. 3a, fig. 3a is a schematic structural diagram of an encoder according to an embodiment of the present application.
As shown in fig. 3a, the encoder may process the words $\{w_1, w_2, \ldots, w_n\}$ constituting the training text X to obtain the context representations $\{h_1, h_2, \ldots, h_n\}$; these context representations are then weighted by the attention module to obtain the final context text representation $c_t$ at step t.
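For concreteness, the following PyTorch sketch shows one way an attention-based BiLSTM encoder matching formulas (1)-(3) could be implemented; the class name, the layer sizes, and the use of `nn.Linear` layers for $W_s$, $U_c$, $b_s$ and $v$ are assumptions, not details fixed by the application:

```python
import torch
import torch.nn as nn

class AttentiveBiLSTMEncoder(nn.Module):
    """Sketch of a BiLSTM encoder with additive attention (formulas (1)-(3)).

    All names and sizes are illustrative. The bidirectional hidden size
    is d // 2 so the concatenated forward/backward states have width d,
    matching s_t and h_i in formula (1).
    """

    def __init__(self, vocab_size: int, d: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        self.W_s = nn.Linear(d, d, bias=False)  # W_s in formula (1)
        self.U_c = nn.Linear(d, d, bias=True)   # U_c plus bias b_s
        self.v = nn.Linear(d, 1, bias=False)    # weight vector v

    def forward(self, tokens, s_t):
        # tokens: (batch, n) word ids; s_t: (batch, d) current decoder state
        h, _ = self.bilstm(self.embed(tokens))                            # (batch, n, d)
        e = self.v(torch.tanh(self.W_s(s_t).unsqueeze(1) + self.U_c(h)))  # formula (1)
        gamma = torch.softmax(e, dim=1)                                   # formula (2)
        c_t = (gamma * h).sum(dim=1)                                      # formula (3)
        return c_t, h
```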
S2022, decoding is carried out based on the first coding vector and the label relation feature, so that a decoding result is obtained, and the label relation feature is used for indicating the association relation between labels in a label dictionary.
In the embodiment of the application, considering all the labels used by the text classification model for text classification, there are certain association relations among the labels, and these association relations have a certain influence on the text classification result. Therefore, in the embodiment of the present application, after the first coding vector is obtained, a tag relation feature is introduced, and decoding is performed based on the first coding vector and the tag relation feature to obtain a decoding result. The tag relation feature is used for indicating the association relation between tags in the tag dictionary. In one example, the text classification model may include a decoder that decodes based on the first coding vector and the tag relation feature to obtain the decoding result.
Regarding the tag relationship feature, it should be noted that:
in one example, the tag relationship feature may be a relationship feature of the i-th tag and the (i+1)-th tag, that is, a feature that reflects the association relation between the i-th tag and the (i+1)-th tag.
In one example, at the (t-1)-th decoding moment, the tag relationship feature may be determined based on the embedded information corresponding to all tags in the tag dictionary and the probability of occurrence of the i-th tag determined at the (t-1)-th decoding moment. In this case, the tag relation feature can be expressed as the influence of the probability of the preceding tag (the i-th tag) on the following tag (the (i+1)-th tag) at the (t-1)-th decoding moment.
As one example, the semantic information of a tag's name is critical for distinguishing categories and establishing connections between categories. Therefore, in the embodiment of the present application, the embedded information corresponding to all tags in the tag dictionary may be obtained from each tag name in the tag dictionary. In one example, the word embeddings corresponding to a tag name are used to initialize the tag embedding; thus, the initialization vectors of the word embeddings of the i-th tag name are $Z_i^L = \{z_{i,1}, z_{i,2}, \ldots, z_{i,k}\}$. The embedded information $x_i^L$ corresponding to the tag can be extracted by a pooling function, as shown in equation (4):

$$x_i^L = \mathrm{pooling}(z_{i,1}, z_{i,2}, \ldots, z_{i,k}) \qquad (4)$$
Further, the context features of the tag meanings may be obtained using a BiLSTM, where $h_i^L$ represents the hidden state of the i-th tag and may be calculated by the following formula (5):

$$h_i^L = \mathrm{BiLSTM}(x_i^L, h_{i-1}^L) \qquad (5)$$
At the (t-1)-th decoding moment, the probability of occurrence of the i-th tag is $y_{t-1}$, and the relationship feature between the i-th tag and the (i+1)-th tag can be represented by the following formula (6):

$$f(y_{t-1}) = g(W_y y_{t-1} + W_1 h_i^L + W_2 h_{i+1}^L) \qquad (6)$$

In formula (6):
$f(y_{t-1})$ represents the relation feature of the i-th tag and the (i+1)-th tag at the (t-1)-th decoding moment;
$y_{t-1}$ is the probability of occurrence of the i-th tag at the (t-1)-th decoding moment, which is the output of the decoder at the (t-1)-th decoding moment;
$W_y \in R^{d \times d}$;
$W_y$, $W_1$ and $W_2$ are all parameter matrices;
$h_i^L$ and $h_{i+1}^L$ respectively represent the hidden states corresponding to the i-th tag and the (i+1)-th tag;
$g(\cdot)$ is an activation function, e.g., a ReLU function may be selected. In one example, the tag relation feature may be obtained by a tag relation extraction module.
A specific implementation of the tag relation feature obtained by the tag relation extraction module is described next in connection with fig. 3 b. Fig. 3b is a schematic structural diagram of a label relation extracting module according to an embodiment of the present application.
As shown in fig. 3b, the tag relation extraction module obtains the hidden state of each tag based on the embedded information $(x_1^L, x_2^L, \ldots, x_m^L)$ corresponding to each tag, where the hidden state of the i-th tag is $h_i^L$. Then, based on the hidden state of each tag and the probability $y_{t-1}$ of the i-th tag at the (t-1)-th decoding moment, the tag relation feature $f(y_{t-1})$ is obtained.
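A minimal sketch of such a tag relation extraction module under formulas (4)-(6) might look as follows; all names are illustrative, mean pooling is assumed for formula (4), and $W_y$ is given input size `num_tags` here because $y_{t-1}$ is the decoder's probability vector over tags (the application's $W_y \in R^{d \times d}$ presumes a d-dimensional $y_{t-1}$):

```python
import torch
import torch.nn as nn

class TagRelationExtractor(nn.Module):
    """Sketch of the tag relation extraction module (formulas (4)-(6)).

    Tag-name word embeddings are mean-pooled into x_i^L (formula (4)),
    contextualized by a BiLSTM into h_i^L (formula (5)), and combined
    with the previous decoding output y_{t-1} as in formula (6).
    """

    def __init__(self, d: int, num_tags: int):
        super().__init__()
        self.bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        # W_1 and W_2 stand for the two parameter matrices of formula (6).
        self.W_y = nn.Linear(num_tags, d, bias=False)
        self.W_1 = nn.Linear(d, d, bias=False)
        self.W_2 = nn.Linear(d, d, bias=False)

    def forward(self, tag_name_embeds, y_prev, i):
        # tag_name_embeds: (num_tags, k, d) word embeddings of each tag name
        x_L = tag_name_embeds.mean(dim=1)        # formula (4): mean pooling
        h_L, _ = self.bilstm(x_L.unsqueeze(0))   # formula (5)
        h_L = h_L.squeeze(0)                     # (num_tags, d)
        # formula (6): relation feature between tag i and tag i+1, ReLU as g
        return torch.relu(self.W_y(y_prev) + self.W_1(h_L[i]) + self.W_2(h_L[i + 1]))
```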
In one example, consider that the information literally carried by a training text is limited, so in some scenarios assistance from external knowledge is needed to fully understand the training text. In view of this, in one example of an embodiment of the present application, S2022 may decode based on the first coding vector, the tag relation feature, and at least one auxiliary vector to obtain a decoding result, where the auxiliary vector is obtained based on the training text. In one example, the auxiliary vector learns the effective information of the training text in one or several dimensions.
Regarding the at least one auxiliary vector, it should be noted that:
in one example, when the at least one tag includes content related to emotion polarity, the at least one auxiliary vector includes a first auxiliary vector, which is a vector related to emotion polarity.
In one example, the first auxiliary vector may be derived based on an emotion classification model. Specifically, the training text may be input into the emotion classification model, which is used to determine the emotion polarity of a text. When the emotion classification model outputs the emotion polarity corresponding to the training text, an intermediate network of the emotion classification model can output a first hidden layer vector, and in the embodiment of the application, the first hidden layer vector can be determined as the first auxiliary vector. It will be appreciated that the first hidden layer vector extracts the effective information in the training text that relates to emotion polarity. Correspondingly, when decoding is performed based on the first hidden layer vector, more information related to emotion polarity can be obtained, so that the decoding result output by the decoder is more accurate.
In one example, in order to enable the first hidden layer vector to embody more and more accurate information related to emotion polarity, the emotion classification model may be a classification model based on an attention mechanism, so that when determining emotion polarity corresponding to the training text, the emotion classification model can pay more attention to content related to emotion polarity in the training text, and accordingly, the first hidden layer vector can embody more and more accurate information related to emotion polarity.
Regarding the emotion classification model, reference may be made to fig. 3c, and fig. 3c is a schematic structural diagram of an emotion classification model according to an embodiment of the present application.
As shown in FIG. 3c, for a training text including n word segments $\{x_1, x_2, \ldots, x_n\}$, the emotion classification model may classify the training text to obtain the emotion polarity (i.e., the emotion classification) of the training text. Accordingly, the attention module of the emotion classification model may output a first hidden layer vector $v_s$.
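As a hedged illustration, an attention-based classifier of the kind fig. 3c depicts might be sketched as below; the architecture and names are assumptions, the only fixed point being that the attention-pooled vector doubles as the hidden layer vector $v_s$:

```python
import torch
import torch.nn as nn

class AttentionClassifier(nn.Module):
    """Illustrative attention-based classifier whose attention output
    also serves as the auxiliary hidden layer vector."""

    def __init__(self, vocab_size: int, d: int, num_classes: int):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, d)
        self.bilstm = nn.LSTM(d, d // 2, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(d, 1)
        self.out = nn.Linear(d, num_classes)

    def forward(self, tokens):
        h, _ = self.bilstm(self.embed(tokens))       # (batch, n, d)
        alpha = torch.softmax(self.attn(h), dim=1)   # attention weights
        v = (alpha * h).sum(dim=1)                   # hidden layer vector
        return self.out(v), v                        # (class logits, v_s)
```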
In one example, when the at least one tag includes content related to the attribute, the at least one auxiliary vector includes a second auxiliary vector, the second auxiliary vector being a vector related to the attribute. Wherein "attributes" may also be referred to as "aspects".
In one example, the second auxiliary vector may be derived based on an attribute classification model. Specifically, the training text may be input into the attribute classification model, which is used to determine the attributes of subjects involved in a text. When the attribute classification model outputs the attribute corresponding to the training text, an intermediate network of the attribute classification model may output a second hidden layer vector, and in the embodiment of the present application, the second hidden layer vector may be determined as the second auxiliary vector. It will be appreciated that the second hidden layer vector extracts the effective information in the training text that is related to the attribute. Correspondingly, when decoding is performed based on the second hidden layer vector, more information related to the attribute can be obtained, so that the decoding result output by the decoder is more accurate.
In one example, in order to enable the second hidden layer vector to embody more accurate information related to the attribute, the attribute classification model may be a classification model based on an attention mechanism, so that the attribute classification model can pay more attention to content related to the attribute in the training text when determining the attribute corresponding to the training text, and accordingly, the second hidden layer vector can embody more accurate information related to the attribute.
Regarding the attribute classification model, reference may be made to fig. 3d, and fig. 3d is a schematic structural diagram of an attribute classification model according to an embodiment of the present application.
As shown in fig. 3d, for a training text including n word segments $\{x_1, x_2, \ldots, x_n\}$, the attribute classification model may classify the training text to obtain the attribute classification (i.e., the aspect classification) of the subject involved in the training text. Accordingly, the attention module of the attribute classification model may output a second hidden layer vector $v_a$.
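Continuing the assumption-laden sketch above, the attribute classification model can reuse the same `AttentionClassifier` class with a different label set; the vocabulary size, dimensions, and class counts below are arbitrary placeholders:

```python
import torch

# Hypothetical instantiation: same architecture, different label sets.
emotion_model = AttentionClassifier(vocab_size=30000, d=256, num_classes=2)
aspect_model = AttentionClassifier(vocab_size=30000, d=256, num_classes=8)

tokens = torch.randint(0, 30000, (1, 40))  # one tokenized training text
_, v_s = emotion_model(tokens)             # first auxiliary vector
_, v_a = aspect_model(tokens)              # second auxiliary vector
```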
When the at least one auxiliary vector includes a first auxiliary vector and a second auxiliary vector, the decoder, when decoding, may decode based on the first encoded vector, the tag relationship feature, and the first auxiliary vector and the second auxiliary vector. For this case:
At the t-th decoding time, hidden layer information of the node can be expressed by the following formula (7).
$$s_t = \mathrm{LSTM}(s_{t-1}, [f(y_{t-1}); v_s; v_a; c_{t-1}]) \qquad (7)$$
In formula (7):
$s_t$ is the hidden layer information of the node at the t-th decoding moment;
$s_{t-1}$ is the hidden layer information of the node at the (t-1)-th decoding moment;
$f(y_{t-1})$ is the tag relation feature, for which reference may be made to the relevant description section above;
$v_s$ is the first auxiliary vector;
$v_a$ is the second auxiliary vector;
$c_{t-1}$ is the first coding vector.
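One possible realization of the decoding step in formula (7) is sketched below in PyTorch; the class name and the use of `nn.LSTMCell` are assumptions, and the input size 4d presumes that $f(y_{t-1})$, $v_s$, $v_a$ and $c_{t-1}$ are all d-dimensional:

```python
import torch
import torch.nn as nn

class LabelSequenceDecoder(nn.Module):
    """Sketch of the decoding step in formula (7); names illustrative."""

    def __init__(self, d: int, num_tags: int):
        super().__init__()
        # Input width 4 * d: f(y_{t-1}), v_s, v_a and c_{t-1} concatenated.
        self.cell = nn.LSTMCell(4 * d, d)
        self.out = nn.Linear(d, num_tags)

    def step(self, s_prev, cell_prev, f_y_prev, v_s, v_a, c_prev):
        # [f(y_{t-1}); v_s; v_a; c_{t-1}] as in formula (7)
        x_t = torch.cat([f_y_prev, v_s, v_a, c_prev], dim=-1)
        s_t, cell_t = self.cell(x_t, (s_prev, cell_prev))   # formula (7)
        y_t = torch.softmax(self.out(s_t), dim=-1)          # tag probabilities
        return s_t, cell_t, y_t
```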
And S2023, updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Because the label corresponding to the training text is used for indicating at least one category corresponding to the training text, and the decoding result is a category prediction result corresponding to the training text, the parameters of the text classification model can be updated based on the decoding result and the at least one label corresponding to the training text. In the subsequent training process, the category prediction result of the text classification model after parameter adjustment can thereby come closer to the real labels corresponding to the training text (for example, the at least one pre-annotated label corresponding to the training text).
In one example, a loss function of the text classification model may be calculated based on the decoding result and the at least one label corresponding to the training text, and then the parameters of the text classification model may be adjusted based on the loss function. The loss function mentioned here may be, for example, a cross-entropy function.
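A single training step under these assumptions might look like the following sketch; `model` is a hypothetical module bundling the encoder, tag relation extractor, and decoder sketched above, and binary cross entropy with logits stands in for the cross-entropy-style loss since a text may carry several labels:

```python
import torch
import torch.nn.functional as F

def train_step(model, optimizer, tokens, target_labels):
    """One hypothetical parameter update (model interface assumed)."""
    optimizer.zero_grad()
    logits = model(tokens)            # decoding result: (batch, num_tags)
    # Multi-label cross-entropy variant: each tag is scored independently.
    loss = F.binary_cross_entropy_with_logits(logits, target_labels.float())
    loss.backward()                   # gradients for all model parameters
    optimizer.step()                  # update the text classification model
    return loss.item()
```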
Next, the training process of the text classification model will be described with reference to fig. 3 e. Referring to fig. 3e, fig. 3e is a schematic diagram related to model training according to an embodiment of the present application.
As shown in fig. 3e, the text classification model includes an encoder 301 and a decoder 302.
The input of the encoder is training text;
the output of the encoder is a first encoded vector;
the inputs to the decoder are the first encoded vector, a tag relationship feature, a first auxiliary vector, and a second auxiliary vector.
The first auxiliary vector is obtained by processing the training text by using an emotion classification model, and with respect to the emotion classification model, reference is made to the related description section above, and description is not repeated here;
The second auxiliary vector is obtained by processing the training text with an attribute classification model; with respect to the attribute classification model, reference may be made to the related description section above, and the description is not repeated here;
The tag relation feature is obtained by the tag relation extraction module based on all tags in the tag dictionary and the probability $y_{t-1}$ of occurrence of the i-th tag at the (t-1)-th decoding moment; for details of the tag relation feature, reference may be made to the related description section above, and the description is not repeated here.
The output of the decoder is a decoding result, which is used to indicate the label prediction result of the training text.
As described above, in some examples, when the text classification model is trained, the first auxiliary vector, the second auxiliary vector and the tag relationship feature are introduced, so that the trained model has the capability of learning the association relationship between tags and extracting more effective information related to emotion polarity and aspect features (attributes), and therefore, the trained text classification model can accurately classify the text to be classified.
When the text to be classified is classified by using the text classification model, the text to be classified can be firstly encoded to obtain a second encoding vector, and then the second encoding vector is decoded to obtain at least one category of the text to be classified. Specifically, the encoder may encode the text to be classified to obtain a second encoded vector, and then the decoder decodes the second encoded vector to obtain at least one category of the text to be classified.
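A hypothetical end-to-end inference helper consistent with this description is sketched below; the 0.5 threshold for turning per-tag probabilities into categories is an assumption, as the application does not specify how the decoding result is mapped to the final label set:

```python
import torch

@torch.no_grad()
def classify(model, tokens, tag_names, threshold=0.5):
    """Return the predicted categories of a text to be classified.

    `model` encodes the text into the second coding vector and decodes
    it into one probability per tag (interface assumed, per the sketches
    above); tags above `threshold` become the predicted categories.
    """
    probs = torch.sigmoid(model(tokens))[0]   # per-tag probabilities
    return [name for name, p in zip(tag_names, probs) if p > threshold]

# e.g. classify(model, tokens,
#               ["negative-preventable", "positive-regular", "positive-model"])
```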
Exemplary apparatus
Based on the method provided by the embodiment, the embodiment of the application also provides a device, and the device is described below with reference to the accompanying drawings.
Referring to fig. 4, the structure of a text classification device according to an embodiment of the present application is shown. The apparatus 400 may specifically comprise, for example, an acquisition unit 401 and a determination unit 402.
An obtaining unit 401, configured to obtain a text to be classified;
a determining unit 402, configured to input the text to be classified into a text classification model, to obtain at least one category of the text to be classified, where the text classification model is used to determine at least one category of the text, and the text classification model is trained by:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relation between the i-th tag and the (i+1)-th tag;
and at the (t-1)-th decoding moment, the feature reflecting the association relation between the i-th tag and the (i+1)-th tag is determined based on the embedded information corresponding to all tags in the tag dictionary and the probability of occurrence of the i-th tag determined at the (t-1)-th decoding moment.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding based on the first coding vector, the tag relation feature and at least one auxiliary vector to obtain the decoding result, wherein the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a subject involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
Since the apparatus 400 is an apparatus corresponding to the method provided in the above method embodiment, the specific implementation of each unit of the apparatus 400 is the same as the above method embodiment, and therefore, with respect to the specific implementation of each unit of the apparatus 400, reference may be made to the description part of the above method embodiment, and details are not repeated herein.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.