
CN114997165B - A text classification method and device - Google Patents


Info

Publication number
CN114997165B
Authority
CN
China
Prior art keywords
text
classification model
vector
label
training
Prior art date
Legal status
Active
Application number
CN202210623512.8A
Other languages
Chinese (zh)
Other versions
CN114997165A (en)
Inventor
商丽丽
唐华云
王延昭
华娇娇
孙爽
黄鑫玉
Current Assignee
China Bond Jinke Information Technology Co ltd
Original Assignee
China Bond Jinke Information Technology Co ltd
Priority date
Filing date
Publication date
Application filed by China Bond Jinke Information Technology Co ltd
Priority to CN202210623512.8A
Publication of CN114997165A
Application granted
Publication of CN114997165B
Legal status: Active
Anticipated expiration

Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06F: ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00: Handling natural language data
    • G06F40/20: Natural language analysis
    • G06F40/279: Recognition of textual entities
    • G06F40/289: Phrasal analysis, e.g. finite state techniques or chunking
    • G06F40/237: Lexical tools
    • G06F40/242: Dictionaries
    • G06N: COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00: Computing arrangements based on biological models
    • G06N3/02: Neural networks
    • G06N3/08: Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Biomedical Technology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Evolutionary Computation (AREA)
  • Data Mining & Analysis (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Biophysics (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The present application discloses a text classification method that includes obtaining a text to be classified and inputting it into a text classification model to obtain at least one category of the text, thereby obtaining the effective information of the text embodied by that category. When the text classification model is trained, the training text is encoded to obtain a first coding vector, and decoding is performed based on the first coding vector together with a tag relation feature, which indicates the association relationships between the tags in the tag dictionary, to obtain a decoding result. The parameters of the text classification model are then updated based on the decoding result and the at least one label corresponding to the training text. Because the relationships between labels are modeled during training, the trained model classifies texts to be classified more accurately; and since the labels of a text embody its effective information, the scheme can accurately determine the effective information of the text to be classified.

Description

Text classification method and device
Technical Field
The present application relates to the field of natural language processing, and in particular, to a text classification method and apparatus.
Background
Some texts have guiding significance for the reader's subsequent practice. In particular, the effective information of a text can be used to guide later practical processes, so quickly obtaining the effective information of a text is particularly important.
At present, the effective information of a text can be determined by manual analysis. With that approach, however, the accuracy of the extracted information depends on subjective factors such as the experience and knowledge level of the analyst, and cannot be guaranteed.
Therefore, a scheme is urgently needed to accurately determine the effective information of a text.
Disclosure of Invention
The application aims to solve the technical problem of accurately determining effective information of texts and provides a text classification method and device.
In a first aspect, an embodiment of the present application provides a text classification method, where the method includes:
Acquiring a text to be classified;
inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and the text classification model is trained by the following modes:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and the at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relationship between the i-th label and the (i+1)-th label;
wherein, at the t-1 decoding time, the feature reflecting the association relationship between the i-th label and the (i+1)-th label is determined based on the embedded information corresponding to all the labels in the label dictionary and the probability of occurrence of the i-th label determined at the t-1 decoding time.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding is carried out based on the first coding vector, the label relation feature and at least one auxiliary vector, a decoding result is obtained, and the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a main body involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
In a second aspect, an embodiment of the present application provides a text classification apparatus, including:
the acquisition unit is used for acquiring the text to be classified;
the determining unit is used for inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and is obtained by training in the following way:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and the at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relationship between the i-th label and the (i+1)-th label;
wherein, at the t-1 decoding time, the feature reflecting the association relationship between the i-th label and the (i+1)-th label is determined based on the embedded information corresponding to all the labels in the label dictionary and the probability of occurrence of the i-th label determined at the t-1 decoding time.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding is carried out based on the first coding vector, the label relation feature and at least one auxiliary vector, a decoding result is obtained, and the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a main body involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
Compared with the prior art, the embodiment of the application has the following advantages:
The embodiment of the application provides a text classification method: a text to be classified is obtained and then input into a text classification model that is used to determine at least one category of a text, so at least one category of the text to be classified is obtained. The text classification model is trained based on a training text and at least one label corresponding to the training text. During training, the training text is first encoded to obtain a first coding vector. Considering that the labels used by the text classification model bear certain association relationships with one another, and that these relationships influence the classification result, decoding is then performed based on the first coding vector and a tag relation feature to obtain a decoding result; the tag relation feature indicates the association relationships between the tags in the tag dictionary, which comprises all tags the model uses for classification. Further, the parameters of the text classification model are updated based on the decoding result and the at least one label corresponding to the training text. Because the association relationships between the labels in the label dictionary are taken into account during training, the trained text classification model classifies texts to be classified more accurately; by this scheme, at least one label corresponding to the text to be classified can therefore be accurately determined, and that label embodies the effective information of the text. In addition, in the embodiment of the application, the tag relation feature embodies the global association relationships among all tags in the tag dictionary, and because it is decoupled from the training text, the probability of overfitting during training of the text classification model can be reduced.
Drawings
In order to more clearly illustrate the embodiments of the present application or the technical solutions in the prior art, the drawings that are required to be used in the embodiments or the description of the prior art will be briefly described below, and it is obvious that the drawings in the following description are only some embodiments described in the present application, and other drawings may be obtained according to the drawings without inventive effort to those skilled in the art.
Fig. 1 is a schematic flow chart of a text classification method according to an embodiment of the present application;
FIG. 2 is a flow chart of a training method of a text classification model according to an embodiment of the present application;
FIG. 3a is a schematic diagram illustrating an encoder according to an embodiment of the present application;
fig. 3b is a schematic structural diagram of a label relation extracting module according to an embodiment of the present application;
FIG. 3c is a schematic structural diagram of an emotion classification model according to an embodiment of the present application;
FIG. 3d is a schematic structural diagram of an attribute classification model according to an embodiment of the present application;
FIG. 3e is a schematic diagram of model training according to an embodiment of the present application;
fig. 4 is a schematic structural diagram of a text classification device according to an embodiment of the present application.
Detailed Description
In order to make the present application better understood by those skilled in the art, the following description will clearly and completely describe the technical solutions in the embodiments of the present application with reference to the accompanying drawings, and it is apparent that the described embodiments are only some embodiments of the present application, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the application without making any inventive effort, are intended to be within the scope of the application.
The inventor of the present application has found that some texts have guiding significance for a user's practical process. For example, comment websites, forums and blogs generate large numbers of online comments, building a new communication bridge between practitioners and consumers. Among these vast, continuously emerging comments are some that can guide the future actions of a practitioner; for example, a comment may provide specific details telling "who" should deliver a "particular aspect" of service and "in what manner".
At present, the effective information in such comment texts can be extracted by manual analysis so that it can later guide practice. With that approach, however, the accuracy of the extracted information depends on subjective factors such as the experience and knowledge level of the analyst, and cannot be guaranteed.
In order to solve the above problems, the embodiment of the application provides a text classification method and a text classification device.
Various non-limiting embodiments of the present application are described in detail below with reference to the attached drawing figures.
Exemplary method
Referring to fig. 1, the flow chart of a text classification method according to an embodiment of the present application is shown. In this embodiment, the method may be performed by a terminal device or may be performed by a server, and the embodiment of the present application is not limited specifically.
In one example, the text classification method shown in FIG. 1 may include, for example, steps S101-S102.
S101, acquiring a text to be classified.
The text to be classified is not particularly limited, and may be any text that can be used to guide a practical process.
In one example, the text to be classified may be comment text. For this case, in a specific implementation, S101 may obtain comment text from a web platform such as a comment website, forum, blog, etc., and then filter the comment text to obtain the foregoing part of comment text that can be used to guide a practice process, and further determine the text to be classified from the part of comment text. For example, preprocessing one or more comment texts in the part of comment texts, so as to obtain the text to be classified. The comment text is preprocessed, including but not limited to, supplementing missing commas and periods, removing irrelevant special characters, and replacing all punctuation marks with consistent strings.
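As an illustration only, the cleaning steps just described might look like the following Python sketch; the concrete rules (which characters count as irrelevant, what the consistent replacement string is) and the helper name preprocess_comment are assumptions, since the embodiment does not fix them.

```python
import re

def preprocess_comment(text: str, sep: str = ",") -> str:
    """Clean a raw comment before classification (illustrative rules only).

    Mirrors the three preprocessing steps named in the text: strip
    irrelevant special characters, replace all punctuation with one
    consistent string, and supplement a missing terminator.
    """
    # Drop everything outside word characters, whitespace and common punctuation;
    # the exact whitelist is an assumption, not specified by the patent.
    text = re.sub(r"[^\w\s,.!?;:]", "", text)
    # Replace every punctuation mark with the same separator string.
    text = re.sub(r"\s*[,.!?;:]+", sep + " ", text).strip()
    # Supplement a missing terminator at the end of the comment.
    if not text.endswith(sep):
        text += sep
    return text

print(preprocess_comment("Great service!!! Would come again"))  # Great service, Would come again,
```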
S102, inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text.
Different categories of text have different guiding significance for practical behavior: some texts promote, some correct, and some encourage. The category to which a text corresponds can therefore serve as the text's effective information. Accordingly, the category corresponding to the text to be classified can be determined so that practice can be guided by that category in subsequent processes.
In one example, after obtaining the text to be classified, the text may be input into a text classification model that is used to determine at least one category of a text. In other words, since the text classification model determines at least one category of a text, inputting the text to be classified into the model yields at least one category of the text to be classified.
In one example, the text classification model may be pre-trained. Next, the training mode of the text classification model will be described with reference to fig. 2. Fig. 2 is a flow chart of a training method of a text classification model according to an embodiment of the present application.
It should be noted that, the model training process is a process of multiple iterative computations, each iteration can adjust parameters of the model, and the adjusted parameters participate in the next iteration computation.
Fig. 2 illustrates a round of iterative process in a training text classification model, taking a training text as an example. It will be appreciated that there are many sets of training text used to train the text classification model, and that each set of training text is treated similarly when training the text classification model. After training the plurality of groups of training texts, a text classification model with the accuracy meeting the requirement can be obtained.
The method shown in fig. 2 may include the following steps S201-S202.
S201, acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category.
Similar to the text to be classified, the training text may also be any text that can be used to guide a practical process. In one example, the training text may be comment text. For a specific implementation of acquiring the training text in this case, reference may be made to the related description section for acquiring the text to be classified in S101, and the description will not be repeated here.
In the embodiment of the application, the at least one label corresponding to the training text can be annotated manually. In one example, to ensure the accuracy of the labels, multiple researchers in the natural language processing field may be invited to annotate the training text. Before annotation begins, the annotators scrutinize an annotation guide that presents key concepts, definitions, criteria and examples. In addition, before formal annotation, the annotators may complete annotation exercises.
In addition, since multiple annotators annotate the training texts, their labels may not be completely consistent; in that case, the annotators may discuss the annotation results to determine the final annotation.
It should be noted that the at least one label corresponding to the training sample may be at least one label included in a label dictionary, where the label dictionary includes all labels used by the text classification model for classifying text. For comment text, for example, the label dictionary can include three labels: negative and preventable, positive and regular, and positive and model. Wherein:
Negative and preventable: the text conveys a negative emotion. Positive and regular: the text conveys a conventional experience with a positive emotion; such experiences are routine behaviors exhibited by the subject. Positive and model: the text conveys an innovative experience with a positive emotion, reflecting exemplary behavior shown by only a few subjects.
S202, training the text classification model by utilizing the training text and at least one label corresponding to the training text.
S202, in a specific implementation, may include S2021-S2023 as follows.
And S2021, coding the training text to obtain a first coding vector.
In one example, the text classification model may include an encoder (encoder) for encoding the training text. Specifically, the encoder may encode a word embedded vector of the training text to obtain the first encoded vector. In one example, the encoder may be a Bi-directional Long Short-Term Memory (BiLSTM) network based on the attention (attention) mechanism.
Wherein the hidden state $h_i$ of each word in the training text is obtained by concatenating the BiLSTM states from the two directions. Moreover, because not all words are equally important for classification prediction, an attention mechanism is adopted so that the important tokens receive more focus during prediction. In one example:
The attention mechanism assigns a weight to the i-th context vector at time step t as follows:

$$e_{ti} = V^\top \tanh(W_s s_t + U_c h_i + b_s) \qquad (1)$$

$$\gamma_{ti} = \frac{\exp(e_{ti})}{\sum_{j=1}^{n} \exp(e_{tj})} \qquad (2)$$

In formula (1) and formula (2):
$e_{ti}$ is the weight score;
$V$ is a weight parameter;
$W_s \in R^{d \times d}$, where $R^{d \times d}$ is a d x d matrix;
$s_t$ is the current hidden state;
$U_c \in R^{d \times d}$;
$h_i$ is the hidden state of the i-th token;
$b_s$ is the bias vector;
$\gamma_{ti}$ is the attention weight.
The final context text representation $c_t$ at step t is calculated as shown in formula (3) below:

$$c_t = \sum_{i=1}^{n} \gamma_{ti} h_i \qquad (3)$$
Referring to fig. 3a, fig. 3a is a schematic structural diagram of an encoder according to an embodiment of the present application.
As shown in fig. 3a, the encoder processes the words $\{w_1, w_2, \ldots, w_n\}$ constituting the training text X to obtain the context representations $\{h_1, h_2, \ldots, h_n\}$, which are then weighted by the attention module to obtain the final context text representation $c_t$ at step t.
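The encoder of fig. 3a and formulas (1) to (3) can be sketched in PyTorch as below. This is a reconstruction under stated assumptions: the hidden size d, the embedding layer, the class name AttentiveBiLSTMEncoder, and folding the bias $b_s$ into the U_c linear layer are all illustrative choices, not details taken from the patent.

```python
import torch
import torch.nn as nn

class AttentiveBiLSTMEncoder(nn.Module):
    """BiLSTM encoder with additive attention, after formulas (1)-(3)."""

    def __init__(self, vocab_size: int, emb_dim: int = 128, d: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        # Forward/backward hidden states are concatenated to dimension d.
        self.bilstm = nn.LSTM(emb_dim, d // 2, bidirectional=True, batch_first=True)
        self.W_s = nn.Linear(d, d, bias=False)  # applied to the decoder state s_t
        self.U_c = nn.Linear(d, d, bias=True)   # applied to h_i; its bias plays b_s
        self.v = nn.Linear(d, 1, bias=False)    # the weight parameter V

    def forward(self, tokens: torch.Tensor, s_t: torch.Tensor):
        """tokens: (batch, n) token ids; s_t: (batch, d) current decoder state."""
        h, _ = self.bilstm(self.embed(tokens))                      # (batch, n, d)
        # e_ti = V^T tanh(W_s s_t + U_c h_i + b_s)                  -- formula (1)
        e = self.v(torch.tanh(self.W_s(s_t).unsqueeze(1) + self.U_c(h)))
        gamma = torch.softmax(e, dim=1)                             # formula (2)
        c_t = (gamma * h).sum(dim=1)                                # formula (3)
        return c_t, h

enc = AttentiveBiLSTMEncoder(vocab_size=10000)
c_t, h = enc(torch.randint(0, 10000, (2, 12)), torch.zeros(2, 256))
print(c_t.shape)  # torch.Size([2, 256])
```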
S2022, decoding is carried out based on the first coding vector and the label relation feature, so that a decoding result is obtained, and the label relation feature is used for indicating the association relation between labels in a label dictionary.
In the embodiment of the application, the labels used by the text classification model bear certain association relationships with one another, and these relationships influence the text classification result. Therefore, after the first coding vector is obtained, a tag relation feature is introduced, and decoding is performed based on the first coding vector and the tag relation feature to obtain a decoding result. The tag relation feature indicates the association relationships between the labels in the label dictionary. In one example, the text classification model may include a decoder that decodes based on the first coding vector and the tag relation feature to obtain the decoding result.
Regarding the tag relationship feature, it should be noted that:
in one example, the tag relationship feature may be a relationship feature of the ith tag and the (i+1) th tag, that is, a feature that reflects an association relationship between the ith tag and the (i+1) th tag.
In one example, at t-1 decoding time, the tag relationship feature may be determined based on embedded information corresponding to all tags in the tag dictionary and the probability of occurrence of the ith tag determined at t-1 decoding time. In this case, the tag relation feature can be expressed as an influence of the probability of the preceding tag (i-th tag) on the following tag (i+1-th tag) at t-1 decoding time.
As one example, the semantic information carried by tag names is critical for distinguishing categories and establishing connections between categories. Therefore, in the embodiment of the present application, the embedded information corresponding to all the tags in the tag dictionary may be obtained from the tag names. In one example, the word embeddings of a tag name are used to initialize the tag embedding, so the initialization vector of the i-th tag name is $Z_i^L = \{z_{i,1}, z_{i,2}, \ldots, z_{i,k}\}$. The embedded information $x_i^L$ corresponding to the tag can then be extracted by a pooling function, as in formula (4):

$$x_i^L = \mathrm{pool}(z_{i,1}, z_{i,2}, \ldots, z_{i,k}) \qquad (4)$$
Further, the contextual features of the label semantics may be obtained using a BiLSTM, where $h_i^L$ represents the hidden state of the i-th label, computed as in formula (5):

$$h_i^L = \mathrm{BiLSTM}(h_{i-1}^L, x_i^L) \qquad (5)$$
At the t-1 decoding time, the probability of occurrence of the i-th tag is $y_{t-1}$, and the relation feature between the i-th tag and the (i+1)-th tag can be expressed by formula (6):

$$f(y_{t-1}) = g\big(y_{t-1} W_y h_i^L + \widetilde{W}_y h_{i+1}^L\big) \qquad (6)$$

In formula (6):
$f(y_{t-1})$ represents the relation feature of the i-th label and the (i+1)-th label at the t-1 decoding time;
$y_{t-1}$ is the probability of occurrence of the i-th tag at the t-1 decoding time, which is the output of the decoder at that time;
$W_y \in R^{d \times d}$; $W_y$ and $\widetilde{W}_y$ are both parameter matrices;
$h_i^L$ and $h_{i+1}^L$ respectively represent the hidden states corresponding to the i-th label and the (i+1)-th label;
$g(\cdot)$ is an activation function, e.g., a ReLU function may be selected.
In one example, the tag relation feature may be obtained by a tag relation extraction module.
A specific implementation of the tag relation feature obtained by the tag relation extraction module is described next in connection with fig. 3 b. Fig. 3b is a schematic structural diagram of a label relation extracting module according to an embodiment of the present application.
As shown in fig. 3b, the tag relation extraction module obtains the hidden state of each tag based on the embedded information $(x_1^L, x_2^L, \ldots, x_m^L)$ corresponding to the tags, where the hidden state of the i-th tag is $h_i^L$. Then, based on the hidden state of each tag and the probability $y_{t-1}$ of the i-th tag at the t-1 decoding time, the tag relation feature $f(y_{t-1})$ is obtained.
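Continuing the sketch, the tag relation extraction module of fig. 3b might be realized as follows. Assumptions are flagged in the comments: mean pooling stands in for the unspecified pooling function of formula (4), the BiLSTM of formula (5) runs over the label sequence, and $y_{t-1}$ is treated as a scalar probability scaling the previous label's hidden state, as in the reconstructed formula (6).

```python
import torch
import torch.nn as nn

class LabelRelationExtractor(nn.Module):
    """Relation feature f(y_{t-1}) between labels i and i+1 (formulas (4)-(6))."""

    def __init__(self, label_emb: torch.Tensor, d: int = 256):
        super().__init__()
        # label_emb: (m labels, k words per label name, emb_dim) name embeddings.
        emb_dim = label_emb.shape[-1]
        # Mean pooling as the pooling function of formula (4) -- an assumption.
        self.x_L = label_emb.mean(dim=1)                     # (m, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, d // 2, bidirectional=True, batch_first=True)
        self.W_y = nn.Linear(d, d, bias=False)               # parameter matrix W_y
        self.W_y2 = nn.Linear(d, d, bias=False)              # second matrix in (6)
        self.g = nn.ReLU()                                   # activation g(.)

    def forward(self, i: int, y_prev: torch.Tensor) -> torch.Tensor:
        """y_prev: scalar probability of label i at the previous decoding time."""
        h_L, _ = self.bilstm(self.x_L.unsqueeze(0))          # (1, m, d), formula (5)
        h_i, h_next = h_L[0, i], h_L[0, i + 1]
        # f(y_{t-1}) = g(y_{t-1} W_y h_i^L + W~_y h_{i+1}^L)  -- formula (6)
        return self.g(y_prev * self.W_y(h_i) + self.W_y2(h_next))

extractor = LabelRelationExtractor(torch.randn(3, 4, 128))   # 3 labels, 4-word names
f = extractor(0, torch.tensor(0.8))
print(f.shape)  # torch.Size([256])
```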
In one example, consider that in some scenarios the literal text carries only limited information, so assistance from external knowledge is often needed to fully understand the training text. In view of this, in one example of an embodiment of the present application, S2022 may decode based on the first coding vector, the tag relation feature, and at least one auxiliary vector to obtain a decoding result, where the auxiliary vector is derived from the training text. In one example, the auxiliary vector captures the effective information of the training text in one or several dimensions.
Regarding the at least one auxiliary vector, it should be noted that:
in one example, when the at least one tag includes content related to emotion polarity, the at least one auxiliary vector includes a first auxiliary vector, which is a vector related to emotion polarity.
In one example, the first auxiliary vector may be derived based on an emotion classification model. In particular, the training text may be input to the emotion classification model, which is used to determine emotion polarity of the text. When the emotion classification model outputs emotion polarities corresponding to the training text, the intermediate network of the emotion classification model can output a first hidden layer vector, and in the embodiment of the application, the first hidden layer vector can be determined to be the first auxiliary vector. It will be appreciated that the first hidden vector extracts valid information in the training text that relates to emotion polarity. Correspondingly, when decoding is performed based on the first hidden vector, more information related to emotion polarity can be obtained, so that a decoding result output by the decoder is more accurate.
In one example, in order to enable the first hidden layer vector to embody more and more accurate information related to emotion polarity, the emotion classification model may be a classification model based on an attention mechanism, so that when determining emotion polarity corresponding to the training text, the emotion classification model can pay more attention to content related to emotion polarity in the training text, and accordingly, the first hidden layer vector can embody more and more accurate information related to emotion polarity.
Regarding the emotion classification model, reference may be made to fig. 3c, and fig. 3c is a schematic structural diagram of an emotion classification model according to an embodiment of the present application.
As shown in FIG. 3c, for a training text comprising n tokens $\{x_1, x_2, \ldots, x_n\}$, the emotion classification model classifies the training text to obtain its emotion polarity (i.e., emotion classification). Accordingly, the attention module of the emotion classification model outputs a first hidden layer vector $v_s$.
In one example, when the at least one tag includes content related to the attribute, the at least one auxiliary vector includes a second auxiliary vector, the second auxiliary vector being a vector related to the attribute. Wherein "attributes" may also be referred to as "aspects".
In one example, the second auxiliary vector may be derived based on an attribute classification model. In particular, the training text may be input to the attribute classification model, which is used to determine attributes of subjects involved in the text. When the attribute classification model outputs the attribute corresponding to the training text, the intermediate network of the attribute classification model may output a second hidden layer vector, and in the embodiment of the present application, the second hidden layer vector may be determined as the second auxiliary vector. It will be appreciated that the second hidden vector extracts valid information in the training text that is related to the attribute. Correspondingly, when decoding is performed based on the second hidden vector, more information related to the attribute can be obtained, so that the decoding result output by the decoder is more accurate.
In one example, in order to enable the second hidden layer vector to embody more accurate information related to the attribute, the attribute classification model may be a classification model based on an attention mechanism, so that the attribute classification model can pay more attention to content related to the attribute in the training text when determining the attribute corresponding to the training text, and accordingly, the second hidden layer vector can embody more accurate information related to the attribute.
Regarding the attribute classification model, reference may be made to fig. 3d, and fig. 3d is a schematic structural diagram of an attribute classification model according to an embodiment of the present application.
As shown in fig. 3d, for a training text comprising n tokens $\{x_1, x_2, \ldots, x_n\}$, the attribute classification model classifies the training text to obtain the attribute classification (i.e., aspect feature classification) of the subject involved. Accordingly, the attention module of the attribute classification model outputs a second hidden layer vector $v_a$.
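Figs. 3c and 3d share one shape: an attention-based classifier whose pooled attention vector is exposed as the auxiliary vector. The sketch below is one plausible realization of that shape; the BiLSTM backbone, the single linear attention scorer, and the layer sizes are assumptions rather than details fixed by the patent.

```python
import torch
import torch.nn as nn

class AuxiliaryClassifier(nn.Module):
    """Attention-based auxiliary classifier (emotion polarity or attribute)."""

    def __init__(self, vocab_size: int, num_classes: int,
                 emb_dim: int = 128, d: int = 256):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.bilstm = nn.LSTM(emb_dim, d // 2, bidirectional=True, batch_first=True)
        self.attn = nn.Linear(d, 1, bias=False)  # scores each token
        self.head = nn.Linear(d, num_classes)    # emotion / attribute prediction

    def forward(self, tokens: torch.Tensor):
        h, _ = self.bilstm(self.embed(tokens))      # (batch, n, d)
        alpha = torch.softmax(self.attn(h), dim=1)  # attention over the n tokens
        v = (alpha * h).sum(dim=1)                  # hidden layer vector v_s or v_a
        return self.head(v), v                      # logits and the auxiliary vector

# One instance per auxiliary task: v_s from the emotion model, v_a from the
# attribute model; the class counts here are placeholders.
emotion_model = AuxiliaryClassifier(vocab_size=10000, num_classes=2)
attribute_model = AuxiliaryClassifier(vocab_size=10000, num_classes=5)
_, v_s = emotion_model(torch.randint(0, 10000, (2, 12)))
_, v_a = attribute_model(torch.randint(0, 10000, (2, 12)))
```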
When the at least one auxiliary vector includes a first auxiliary vector and a second auxiliary vector, the decoder, when decoding, may decode based on the first encoded vector, the tag relationship feature, and the first auxiliary vector and the second auxiliary vector. For this case:
At the t-th decoding time, the hidden layer information of the node can be expressed by formula (7):

$$s_t = \mathrm{LSTM}\big(s_{t-1}, [f(y_{t-1}); v_s; v_a; c_{t-1}]\big) \qquad (7)$$

In formula (7):
$s_t$ is the hidden layer information of the node at the t-th decoding time;
$s_{t-1}$ is the hidden layer information of the node at the t-1 decoding time;
$f(y_{t-1})$ is the tag relation feature; reference is made to the relevant description above for details;
$v_s$ is the first auxiliary vector;
$v_a$ is the second auxiliary vector;
$c_{t-1}$ is the first coding vector.
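Formula (7) maps directly onto an LSTM cell whose input is the concatenation of the four vectors. In the sketch below, giving all four vectors the same dimensionality d is our assumption; the patent does not fix the sizes.

```python
import torch
import torch.nn as nn

class LabelDecoderStep(nn.Module):
    """One decoding step: s_t = LSTM(s_{t-1}, [f(y_{t-1}); v_s; v_a; c_{t-1}])."""

    def __init__(self, d: int = 256, num_labels: int = 3):
        super().__init__()
        # Input size 4*d assumes f(y_{t-1}), v_s, v_a and c_{t-1} are all d-dim.
        self.cell = nn.LSTMCell(4 * d, d)
        self.out = nn.Linear(d, num_labels)

    def forward(self, s_prev, c_prev_cell, f_y_prev, v_s, v_a, c_prev):
        x = torch.cat([f_y_prev, v_s, v_a, c_prev], dim=-1)   # formula (7) input
        s_t, c_t_cell = self.cell(x, (s_prev, c_prev_cell))   # hidden layer info s_t
        y_t = torch.softmax(self.out(s_t), dim=-1)            # label probabilities
        return s_t, c_t_cell, y_t

step = LabelDecoderStep()
s = c = torch.zeros(2, 256)
parts = [torch.randn(2, 256) for _ in range(4)]
s, c, y = step(s, c, *parts)
print(y.shape)  # torch.Size([2, 3])
```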
And S2023, updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Because the label corresponding to the training text is used for indicating at least one category corresponding to the training text, and the decoding result is a category prediction result corresponding to the training text, the parameters of the text classification model can be updated based on the decoding result and the at least one label corresponding to the training text. In the subsequent training process, the category prediction result of the text classification model after parameter adjustment can be more similar to the real label corresponding to the training text (for example, at least one label corresponding to the pre-labeled training text).
In one example, a loss function of the text classification model may be calculated based on the decoding result and the at least one label corresponding to the training text, and the parameters of the text classification model may then be adjusted based on the loss. The loss function mentioned here may be, for example, a cross entropy function.
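A minimal parameter-update sketch follows. It assumes the model returns one probability distribution per decoding step and that the annotated labels are given as index tensors; the cross entropy loss (here NLLLoss applied to log-probabilities) is the example the description mentions, not a mandated choice.

```python
import torch
import torch.nn as nn

def train_step(model, optimizer, tokens, gold_labels):
    """One iteration: decode, compare with the annotated labels, update."""
    criterion = nn.NLLLoss()
    optimizer.zero_grad()
    step_probs = model(tokens)  # assumed: list of (batch, num_labels) distributions
    loss = sum(criterion(torch.log(p + 1e-9), g)        # cross entropy per step
               for p, g in zip(step_probs, gold_labels))
    loss.backward()     # gradients of the decoding loss
    optimizer.step()    # adjust the text classification model's parameters
    return float(loss)
```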
Next, the training process of the text classification model will be described with reference to fig. 3 e. Referring to fig. 3e, fig. 3e is a schematic diagram related to model training according to an embodiment of the present application.
As shown in fig. 3e, the text classification model includes an encoder 301 and a decoder 302.
The input of the encoder is training text;
the output of the encoder is a first encoded vector;
the inputs to the decoder are the first encoded vector, a tag relationship feature, a first auxiliary vector, and a second auxiliary vector.
The first auxiliary vector is obtained by processing the training text by using an emotion classification model, and with respect to the emotion classification model, reference is made to the related description section above, and description is not repeated here;
The second auxiliary vector is obtained by processing the training text with an attribute classification model; with respect to the attribute classification model, reference may be made to the related description above, and the description is not repeated here;
The tag relation feature is obtained by the tag relation extraction module based on all tags in the tag dictionary and the probability $y_{t-1}$ of the i-th tag at the t-1 decoding time; for details, refer to the related description above, which is not repeated here.
The output of the decoder is a decoding result, which is used to indicate the label prediction result of the training text.
As described above, in some examples, when the text classification model is trained, the first auxiliary vector, the second auxiliary vector and the tag relationship feature are introduced, so that the trained model has the capability of learning the association relationship between tags and extracting more effective information related to emotion polarity and aspect features (attributes), and therefore, the trained text classification model can accurately classify the text to be classified.
When the text to be classified is classified by using the text classification model, the text to be classified can be firstly encoded to obtain a second encoding vector, and then the second encoding vector is decoded to obtain at least one category of the text to be classified. Specifically, the encoder may encode the text to be classified to obtain a second encoded vector, and then the decoder decodes the second encoded vector to obtain at least one category of the text to be classified.
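At inference time, a greedy loop over the trained encoder and decoder suffices, for instance as below. The encode and decode_step callables stand in for the trained components; the fixed step budget and the confidence threshold used to stop emitting labels are assumptions, since the description only specifies encode-then-decode.

```python
import torch

@torch.no_grad()
def classify(encode, decode_step, tokens, max_steps=3, threshold=0.5):
    """Encode the text once, then emit labels step by step (greedy sketch)."""
    state, c = encode(tokens)          # c is the second coding vector
    labels = []
    for _ in range(max_steps):
        state, probs = decode_step(state, c)
        if float(probs.max()) < threshold:   # stop when no label is confident
            break
        labels.append(int(probs.argmax()))   # one predicted category per step
    return labels
```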
Exemplary apparatus
Based on the method provided by the embodiment, the embodiment of the application also provides a device, and the device is described below with reference to the accompanying drawings.
Referring to fig. 4, the structure of a text classification device according to an embodiment of the present application is shown. The apparatus 400 may specifically comprise, for example, an acquisition unit 401 and a determination unit 402.
An obtaining unit 401, configured to obtain a text to be classified;
a determining unit 402, configured to input the text to be classified into a text classification model, to obtain at least one category of the text to be classified, where the text classification model is used to determine at least one category of the text, and the text classification model is trained by:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and the at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
and updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text.
Optionally, the tag relationship feature includes:
a feature reflecting the association relationship between the i-th label and the (i+1)-th label;
wherein, at the t-1 decoding time, the feature reflecting the association relationship between the i-th label and the (i+1)-th label is determined based on the embedded information corresponding to all the labels in the label dictionary and the probability of occurrence of the i-th label determined at the t-1 decoding time.
Optionally, the decoding based on the first coding vector and the tag relation feature to obtain a decoding result includes:
decoding is carried out based on the first coding vector, the label relation feature and at least one auxiliary vector, a decoding result is obtained, and the auxiliary vector is obtained based on the training text.
Optionally, the at least one auxiliary vector includes a first auxiliary vector, where the first auxiliary vector is obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
Optionally, the emotion classification model is a classification model based on an attention mechanism.
Optionally, the at least one auxiliary vector includes a second auxiliary vector, where the second auxiliary vector is obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a main body involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
Optionally, the attribute classification model is a classification model based on an attention mechanism.
Optionally, the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
Optionally, the text classification model obtains at least one category of the text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
Optionally, the text to be classified and the training text are comment texts.
Since the apparatus 400 is an apparatus corresponding to the method provided in the above method embodiment, the specific implementation of each unit of the apparatus 400 is the same as the above method embodiment, and therefore, with respect to the specific implementation of each unit of the apparatus 400, reference may be made to the description part of the above method embodiment, and details are not repeated herein.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the application disclosed herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.
The foregoing description of the preferred embodiments of the application is not intended to limit the application to the precise form disclosed, and any such modifications, equivalents, and alternatives falling within the spirit and scope of the application are intended to be included within the scope of the application.

Claims (18)

1. A method of text classification, the method comprising:
Acquiring a text to be classified;
inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and the text classification model is trained by the following modes:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and the at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text;
the label relation features comprise features for reflecting the association relation between the ith label and the (i+1) th label;
obtaining the hidden state of each label based on the embedded information corresponding to all the labels in the label dictionary, wherein the hidden state of the i-th label is $h_i^L$; and obtaining the label relation feature $f(y_{t-1})$ based on the hidden state of each label and the probability $y_{t-1}$ of the occurrence of the i-th label at the t-1 decoding time, wherein $f(y_{t-1}) = g\big(y_{t-1} W_y h_i^L + \widetilde{W}_y h_{i+1}^L\big)$; $f(y_{t-1})$ represents the relation feature of the i-th label and the (i+1)-th label at the t-1 decoding time; $y_{t-1}$ is the probability of the occurrence of the i-th label at the t-1 decoding time and is the output of the decoder at the t-1 decoding time; $W_y \in R^{d \times d}$, where $R^{d \times d}$ is a d x d matrix; $W_y$ and $\widetilde{W}_y$ are both parameter matrices; $h_i^L$ and $h_{i+1}^L$ respectively represent the hidden states corresponding to the i-th label and the (i+1)-th label; and $g(\cdot)$ is an activation function.
2. The method of claim 1, wherein decoding based on the first encoded vector and the tag relationship feature results in a decoded result, comprising:
decoding is carried out based on the first coding vector, the label relation feature and at least one auxiliary vector, a decoding result is obtained, and the auxiliary vector is obtained based on the training text.
3. The method according to claim 2, wherein the at least one assistance vector comprises a first assistance vector, the first assistance vector being obtained by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
4. A method according to claim 3, wherein the emotion classification model is a classification model based on an attention mechanism.
5. The method according to claim 2, wherein the at least one assistance vector comprises a second assistance vector, the second assistance vector being obtained by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a main body involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
6. The method of claim 5, wherein the attribute classification model is an attention-mechanism-based classification model.
7. The method of claim 1, wherein the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
8. The method of claim 1, wherein the text classification model obtains the at least one category of text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
9. The method of claim 1, wherein the text to be classified and the training text are comment text.
10. A text classification device, the device comprising:
the acquisition unit is used for acquiring the text to be classified;
the determining unit is used for inputting the text to be classified into a text classification model to obtain at least one category of the text to be classified, wherein the text classification model is used for determining at least one category of the text, and is obtained by training in the following way:
acquiring a training text and at least one label corresponding to the training text, wherein the at least one label corresponding to the training text is used for indicating at least one category of the training text, and one label corresponds to one category;
training to obtain the text classification model by using the training text and the at least one label corresponding to the training text, wherein the training comprises the following steps:
coding the training text to obtain a first coding vector;
Decoding based on the first coding vector and the tag relation feature to obtain a decoding result, wherein the tag relation feature is used for indicating the association relation between tags in a tag dictionary, and the tag dictionary comprises all tags used by the text classification model for classifying the text;
updating parameters of the text classification model based on the decoding result and at least one label corresponding to the training text;
the label relation features comprise features for reflecting the association relation between the ith label and the (i+1) th label;
obtaining the hidden state of each label based on the embedded information corresponding to all the labels in the label dictionary, wherein the hidden state of the i-th label is $h_i^L$; and obtaining the label relation feature $f(y_{t-1})$ based on the hidden state of each label and the probability $y_{t-1}$ of the occurrence of the i-th label at the t-1 decoding time, wherein $f(y_{t-1}) = g\big(y_{t-1} W_y h_i^L + \widetilde{W}_y h_{i+1}^L\big)$; $f(y_{t-1})$ represents the relation feature of the i-th label and the (i+1)-th label at the t-1 decoding time; $y_{t-1}$ is the probability of the occurrence of the i-th label at the t-1 decoding time and is the output of the decoder at the t-1 decoding time; $W_y \in R^{d \times d}$, where $R^{d \times d}$ is a d x d matrix; $W_y$ and $\widetilde{W}_y$ are both parameter matrices; $h_i^L$ and $h_{i+1}^L$ respectively represent the hidden states corresponding to the i-th label and the (i+1)-th label; and $g(\cdot)$ is an activation function.
11. The apparatus of claim 10, wherein the decoding based on the first encoded vector and the tag relationship feature to obtain a decoding result comprises:
decoding is carried out based on the first coding vector, the label relation feature and at least one auxiliary vector, a decoding result is obtained, and the auxiliary vector is obtained based on the training text.
12. The apparatus of claim 11, wherein the at least one assistance vector comprises a first assistance vector, the first assistance vector being derived by:
Inputting the training text into an emotion classification model to obtain a first hidden layer vector output by the emotion classification model, wherein the emotion classification model is used for determining emotion polarity of the text;
the first hidden layer vector is determined as the first auxiliary vector.
13. The apparatus of claim 12, wherein the emotion classification model is a classification model based on an attention mechanism.
14. The apparatus of claim 11, wherein the at least one assistance vector comprises a second assistance vector, the second assistance vector being derived by:
Inputting the training text into an attribute classification model to obtain a second hidden layer vector output by the attribute classification model, wherein the attribute classification model is used for determining the attribute of a main body involved in the text;
And determining the second hidden layer vector as the second auxiliary vector.
15. The apparatus of claim 14, wherein the attribute classification model is an attention-mechanism-based classification model.
16. The apparatus of claim 10, wherein the text classification model comprises an encoder and a decoder;
The encoder is used for encoding the training text to obtain a first encoding vector;
the decoder is used for decoding based on the first coding vector and the label relation characteristic to obtain a decoding result.
17. The apparatus of claim 10, wherein the text classification model obtains the at least one category of text to be classified by:
coding the text to be classified to obtain a second coding vector;
And decoding the second coding vector to obtain at least one category of the text to be classified.
18. The apparatus of claim 10, wherein the text to be categorized and the training text are comment text.
CN202210623512.8A 2022-06-02 2022-06-02 A text classification method and device Active CN114997165B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210623512.8A CN114997165B (en) 2022-06-02 2022-06-02 A text classification method and device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210623512.8A CN114997165B (en) 2022-06-02 2022-06-02 A text classification method and device

Publications (2)

Publication Number Publication Date
CN114997165A CN114997165A (en) 2022-09-02
CN114997165B 2025-03-07

Family

ID=83031273

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210623512.8A Active CN114997165B (en) 2022-06-02 2022-06-02 A text classification method and device

Country Status (1)

Country Link
CN (1) CN114997165B (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362684A (en) * 2019-06-27 2019-10-22 腾讯科技(深圳)有限公司 A kind of file classification method, device and computer equipment
CN112364169A (en) * 2021-01-13 2021-02-12 北京云真信科技有限公司 Nlp-based wifi identification method, electronic device and medium

Family Cites Families (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US10515138B2 (en) * 2014-04-25 2019-12-24 Mayo Foundation For Medical Education And Research Enhancing reading accuracy, efficiency and retention
CN109299262B (en) * 2018-10-09 2022-04-15 中山大学 A textual entailment relation recognition method fused with multi-granularity information
CN110442707B (en) * 2019-06-21 2022-06-17 电子科技大学 Seq2 seq-based multi-label text classification method
CN111859978B (en) * 2020-06-11 2023-06-20 南京邮电大学 A method for generating emotional text based on deep learning

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110362684A (en) * 2019-06-27 2019-10-22 腾讯科技(深圳)有限公司 A kind of file classification method, device and computer equipment
CN112364169A (en) * 2021-01-13 2021-02-12 北京云真信科技有限公司 Nlp-based wifi identification method, electronic device and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
刘思琴 et al., 基于BERT的文本情感分析 (Text Sentiment Analysis Based on BERT), 信息安全研究 (Information Security Research), 2020-03-05, pp. 220-227 *

Also Published As

Publication number Publication date
CN114997165A (en) 2022-09-02

Similar Documents

Publication Publication Date Title
WO2019153737A1 (en) Comment assessing method, device, equipment and storage medium
CN110162749A (en) Information extracting method, device, computer equipment and computer readable storage medium
KR20190125153A (en) An apparatus for predicting the status of user's psychology and a method thereof
CN111858944A (en) A Entity Aspect-Level Sentiment Analysis Method Based on Attention Mechanism
CN113128227A (en) Entity extraction method and device
CN109376222A (en) Question and answer matching degree calculation method, question and answer automatic matching method and device
CN115599901B (en) Machine Question Answering Method, Device, Equipment and Storage Medium Based on Semantic Prompts
CN111079433B (en) Event extraction method and device and electronic equipment
CN106682387A (en) Method and device used for outputting information
CN112071429A (en) Medical automatic question-answering system construction method based on knowledge graph
CN111324739B (en) Text emotion analysis method and system
CN113590945B (en) Book recommendation method and device based on user borrowing behavior-interest prediction
CN112395391B (en) Concept graph construction method, device, computer equipment and storage medium
Pröllochs et al. Negation scope detection for sentiment analysis: A reinforcement learning framework for replicating human interpretations
CN110298038B (en) Text scoring method and device
Dedeepya et al. Detecting cyber bullying on twitter using support vector machine
CN118227769A (en) Knowledge graph enhancement-based large language model question-answer generation method
CN111291550B (en) Chinese entity extraction method and device
CN114492661B (en) Text data classification method and device, computer equipment and storage medium
CN114997165B (en) A text classification method and device
CN117056512A (en) Social media comment irony detection method based on topic context
CN115906824A (en) Text fine-grained emotion analysis method, system, medium and computing equipment
Driss LSTM-Based QoE Evaluation for Web Microservices’ Reputation Scoring
CN117932487B (en) Risk classification model training and risk classification method and device
Lv et al. Personagan: Personalized response generation via generative adversarial networks

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant