
CN114780727B - Text classification method, device, computer equipment and medium based on reinforcement learning - Google Patents

Info

Publication number
CN114780727B
Authority
CN
China
Prior art keywords
classification
semantic
model
clustering
training
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202210433355.4A
Other languages
Chinese (zh)
Other versions
CN114780727A (en
Inventor
王伟
张黔
陈焕坤
郑毅
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
China Resources Digital Technology Co Ltd
Original Assignee
China Resources Digital Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by China Resources Digital Technology Co Ltd
Priority to CN202210433355.4A
Publication of CN114780727A
Application granted
Publication of CN114780727B
Active
Anticipated expiration

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35 Clustering; Classification
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/216 Parsing using statistical methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Artificial Intelligence (AREA)
  • Computational Linguistics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Probability & Statistics with Applications (AREA)
  • Databases & Information Systems (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Biomedical Technology (AREA)
  • Biophysics (AREA)
  • Evolutionary Computation (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Software Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The embodiments of the application belong to the technical field of artificial intelligence and relate to a text classification method based on reinforcement learning. The method comprises: obtaining a training text corpus and extracting semantic features of the training text corpus to obtain semantic feature vectors; inputting the semantic feature vectors into a trained clustering model and outputting semantic clusters; extracting keywords from all the semantic clusters and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords; selecting keywords from each semantic feature queue as target keywords and generating word semantic vectors based on the target keywords; inputting the word semantic vectors into a pre-built initial classification model for training to obtain a trained target classification model; and obtaining a text to be classified, inputting it into the target classification model, and outputting a text classification result. The application also provides a text classification device, computer equipment and medium based on reinforcement learning. The application can improve the accuracy of text classification.

Description

Text classification method, device, computer equipment and medium based on reinforcement learning
Technical Field
The present application relates to the field of artificial intelligence, and in particular, to a text classification method, apparatus, computer device, and medium based on reinforcement learning.
Background
Text classification is a common task in the field of natural language understanding. Many approaches have been developed, and they generally fall into two categories: supervised learning and unsupervised learning. In supervised learning, feature information that represents text semantics is extracted and a model is trained on it to complete the classification. In unsupervised learning, features of the texts are learned autonomously through clustering and similar methods, so that texts with similar features form clusters, which completes the classification.
However, current text classification models ignore sample imbalance. The model tends to learn the features of the dominant categories and to neglect the features of the less dominant categories, so the learned model is prone to overfitting and the text classification is inaccurate.
Disclosure of Invention
The embodiments of the application aim to provide a text classification method, a text classification device, computer equipment and a medium based on reinforcement learning, so as to solve the technical problem in the related art that text classification is inaccurate because of sample imbalance.
In order to solve the above technical problems, the embodiments of the present application provide a text classification method based on reinforcement learning, which adopts the following technical scheme:
acquiring a training text corpus, and extracting semantic features of the training text corpus to obtain a semantic feature vector;
inputting the semantic feature vectors into a trained clustering model, and outputting semantic clusters;
Extracting keywords from all the semantic clusters, and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords;
Selecting keywords from each semantic feature queue as target keywords, and generating word sense vectors based on the target keywords;
inputting the word sense vector into a pre-constructed initial classification model for training to obtain a trained target classification model;
and obtaining a text to be classified, inputting the text to be classified into the target classification model, and outputting a text classification result.
Further, before the step of inputting the semantic feature vector into the trained cluster model and outputting the semantic cluster, the method further comprises:
Inputting the semantic feature vector into a pre-constructed neural network model, and outputting a clustering result;
determining a clustering loss function according to the clustering result;
adjusting model parameters of the neural network model based on the cluster loss function;
and when the iteration ending condition is met, generating a clustering model according to the model parameters.
Further, the step of determining a cluster loss function according to the clustering result includes:
calculating the contour coefficient of each cluster in the clustering result;
Obtaining training rewards according to the contour coefficients;
And obtaining the clustering loss function based on the clustering result and the training reward score.
Further, the step of forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords includes:
calculating the similarity between the keywords of each semantic cluster;
sorting the keywords according to the similarity to obtain a sorting result;
And generating a semantic feature queue corresponding to each semantic cluster based on the sequencing result.
Further, the step of generating a word sense vector based on the target keyword includes:
Extracting features of the target keywords to obtain keyword vectors;
and splicing the keyword vector and the semantic feature vector to obtain a word semantic vector.
Further, the step of inputting the word sense vector into a pre-constructed initial classification model to train and obtaining a trained target classification model includes:
inputting the word sense vector into a pre-constructed initial classification model to obtain a prediction classification result;
determining a classification loss function according to the prediction classification result;
adjusting model parameters of the initial classification model according to the classification loss function;
and when the iteration ending condition is met, generating a target classification model based on the model parameters.
Further, the step of determining a classification loss function according to the prediction classification result includes:
calculating to obtain a classified rewarding value according to the prediction classification result;
And obtaining a classification loss function based on the classification reward value and the prediction classification result.
In order to solve the technical problems, the embodiment of the application also provides a text classification device based on reinforcement learning, which adopts the following technical scheme:
the semantic feature extraction module is used for obtaining training text corpus, and extracting semantic features of the training text corpus to obtain semantic feature vectors;
The clustering module is used for inputting the semantic feature vectors into the trained clustering model and outputting semantic clusters;
The keyword extraction module is used for extracting keywords from all the semantic clusters and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords;
The vector generation module is used for selecting keywords from each semantic feature queue as target keywords and generating word sense vectors based on the target keywords;
the training module is used for inputting the word meaning vector into a pre-constructed initial classification model for training to obtain a trained target classification model;
The classification module is used for acquiring texts to be classified, inputting the texts to be classified into the target classification model, and outputting text classification results.
In order to solve the above technical problems, the embodiment of the present application further provides a computer device, which adopts the following technical schemes:
The computer device includes a memory having stored therein computer readable instructions which when executed implement the steps of the reinforcement learning based text classification method described above.
In order to solve the above technical problems, an embodiment of the present application further provides a computer readable storage medium, which adopts the following technical schemes:
The computer readable storage medium has stored thereon computer readable instructions which when executed by a processor implement the steps of the reinforcement learning based text classification method as described above.
Compared with the prior art, the embodiment of the application has the following main beneficial effects:
The method obtains a training text corpus and extracts its semantic features to obtain semantic feature vectors; inputs the semantic feature vectors into a trained clustering model to output semantic clusters; extracts keywords from all the semantic clusters and forms a semantic feature queue corresponding to each semantic cluster from the extracted keywords; selects keywords from each semantic feature queue as target keywords and generates word semantic vectors based on the target keywords; inputs the word semantic vectors into a pre-built initial classification model for training to obtain a trained target classification model; and obtains a text to be classified, inputs it into the target classification model, and outputs a text classification result. Because semantic features are extracted from the training text corpus, the extracted semantic feature vectors are clustered into semantic clusters of different categories, and the classification model is trained with these semantic clusters, the classification model learns the semantic features of the different categories in the training text corpus, which can improve text classification accuracy.
Drawings
In order to more clearly illustrate the solution of the present application, a brief description will be given below of the drawings required for the description of the embodiments of the present application, it being apparent that the drawings in the following description are some embodiments of the present application, and that other drawings may be obtained from these drawings without the exercise of inventive effort for a person of ordinary skill in the art.
FIG. 1 is an exemplary system architecture diagram in which the present application may be applied;
FIG. 2 is a flow chart of one embodiment of a reinforcement learning based text classification method in accordance with the present application;
FIG. 3 is a schematic diagram of an embodiment of a reinforcement learning based text classification device in accordance with the present application;
FIG. 4 is a schematic structural diagram of one embodiment of a computer device in accordance with the present application.
Detailed Description
Unless defined otherwise, all technical and scientific terms used herein have the same meaning as commonly understood by one of ordinary skill in the art to which this application belongs, the terms used in the description herein are used for the purpose of describing particular embodiments only and are not intended to limit the application, and the terms "comprising" and "having" and any variations thereof in the description of the application and the claims and the above description of the drawings are intended to cover non-exclusive inclusions. The terms first, second and the like in the description and in the claims or in the above-described figures, are used for distinguishing between different objects and not necessarily for describing a sequential or chronological order.
Reference herein to "an embodiment" means that a particular feature, structure, or characteristic described in connection with the embodiment may be included in at least one embodiment of the application. The appearances of such phrases in various places in the specification are not necessarily all referring to the same embodiment, nor are separate or alternative embodiments mutually exclusive of other embodiments. Those of skill in the art will explicitly and implicitly appreciate that the embodiments described herein may be combined with other embodiments.
In order to make the person skilled in the art better understand the solution of the present application, the technical solution of the embodiment of the present application will be clearly and completely described below with reference to the accompanying drawings.
The application provides a text classification method based on reinforcement learning, which relates to artificial intelligence and can be applied to a system architecture 100 shown in fig. 1, wherein the system architecture 100 can comprise terminal equipment 101, 102 and 103, a network 104 and a server 105. The network 104 is used as a medium to provide communication links between the terminal devices 101, 102, 103 and the server 105. The network 104 may include various connection types, such as wired, wireless communication links, or fiber optic cables, among others.
The user may interact with the server 105 via the network 104 using the terminal devices 101, 102, 103 to receive or send messages or the like. Various communication client applications, such as a web browser application, a shopping class application, a search class application, an instant messaging tool, a mailbox client, social platform software, etc., may be installed on the terminal devices 101, 102, 103.
The terminal devices 101, 102, 103 may be various electronic devices having a display screen and supporting web browsing, including but not limited to smartphones, tablet computers, electronic book readers, MP3 (Moving Picture Experts Group Audio Layer III) players, MP4 (Moving Picture Experts Group Audio Layer IV) players, laptop computers, desktop computers, and the like.
The server 105 may be a server providing various services, such as a background server providing support for pages displayed on the terminal devices 101, 102, 103.
It should be noted that, the text classification method based on reinforcement learning provided by the embodiment of the present application is generally executed by a server/terminal device, and accordingly, the text classification device based on reinforcement learning is generally disposed in the server/terminal device.
It should be understood that the number of terminal devices, networks and servers in fig. 1 is merely illustrative. There may be any number of terminal devices, networks, and servers, as desired for implementation.
With continued reference to fig. 2, a flow chart of one embodiment of a reinforcement learning based text classification method according to the present application is shown, comprising the steps of:
Step S201, obtaining a training text corpus, and extracting semantic features of the training text corpus to obtain semantic feature vectors.
The training text corpus may be obtained from public data sets including, but not limited to, a Chinese news data set, the THUCNews data set, the online_shopping_10_cats data set, and the like. An original text corpus is obtained from a public data set and preprocessed, for example by word segmentation and stop word removal. The preprocessed corpus is then randomly divided into a training set and a test set according to a preset proportion. The training set, which is a set of texts, is the training text corpus.
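The random division into a training set and a test set can be sketched as follows. This is a minimal illustration only: the function name `split_corpus`, the 0.8 ratio and the fixed seed are assumptions made for the example, and word segmentation and stop word removal are assumed to have been done already.

```python
import random

def split_corpus(texts, train_ratio=0.8, seed=0):
    """Shuffle preprocessed texts and split them by a preset proportion."""
    shuffled = list(texts)                  # copy so the caller's list is untouched
    random.Random(seed).shuffle(shuffled)   # seeded shuffle for reproducibility
    cut = int(len(shuffled) * train_ratio)  # size of the training set
    return shuffled[:cut], shuffled[cut:]

# Hypothetical preprocessed corpus of ten texts.
corpus = [f"text_{i}" for i in range(10)]
train, test = split_corpus(corpus, train_ratio=0.8)
print(len(train), len(test))
```

The training portion plays the role of the training text corpus; the test portion is held out for evaluation.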
In this embodiment, semantic features are extracted from the training text corpus using a semantic feature extraction model. Such models include, but are not limited to, a CNN (Convolutional Neural Network) model, an RNN (Recurrent Neural Network) model, an LSTM (Long Short-Term Memory) network model, a BERT (Bidirectional Encoder Representations from Transformers) model, and the like.
As a specific implementation, the training text corpus may be input to a BERT-based pre-training language model for semantic feature extraction.
Step S202, inputting the semantic feature vectors into a trained clustering model, and outputting semantic clusters.
In this embodiment, the semantic feature vectors may be clustered by a trained clustering model, where the clustering algorithm used by the clustering model includes, but is not limited to, a K-means algorithm, and a Single-Pass algorithm.
Taking a Single-Pass algorithm as an example, the clustering process is described in detail, and the steps comprise:
Step A, selecting any semantic feature vector as the cluster center of a first class cluster;
Step B, selecting any semantic feature vector from the remaining unprocessed semantic feature vectors, calculating the similarity value between this semantic feature vector and every existing class cluster, selecting the class cluster with the largest similarity value as the nearest class cluster of the semantic feature vector, and recording the similarity value of the nearest class cluster.
It should be appreciated that when the algorithm starts there is only one class cluster, namely the class cluster generated in step A, and at that point "all existing class clusters" means that single class cluster. As the algorithm runs, each semantic feature vector is either assigned to an existing class cluster or used to create a new one, so the number of class clusters grows, and "all existing class clusters" refers to all class clusters generated so far.
Step C, comparing the similarity value of the nearest class cluster with a similarity threshold: if the similarity value is greater than the similarity threshold, assigning the semantic feature vector selected in step B to the nearest class cluster and updating the center of the nearest class cluster; otherwise, taking the semantic feature vector selected in step B as the cluster center of a new class cluster.
In this embodiment, the center of the class cluster is the average of the semantic feature vectors in the class cluster. Specifically:
$$C = \frac{1}{n}\sum_{i=1}^{n} d_i$$
where $C$ represents the centroid vector of the class cluster, $n$ represents the number of semantic feature vectors in the class cluster, and $d_i$ represents the $i$-th semantic feature vector in the class cluster.
Updating the center means recalculating the average of the semantic feature vectors of the class cluster after one semantic feature vector has been added.
Step D, judging whether all semantic feature vectors in the set of semantic feature vectors to be processed have been processed: if not, returning to step B to select the next unprocessed semantic feature vector; otherwise, outputting the clustering result.
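The Single-Pass procedure in steps A to D can be sketched as below. This is a minimal illustration under stated assumptions, not the patented implementation: cosine similarity is assumed as the similarity measure between a vector and a cluster center, vectors are processed in input order rather than picked arbitrarily, and the names `cosine` and `single_pass` are hypothetical.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def single_pass(vectors, threshold):
    """Cluster vectors in one pass; returns (labels, centers)."""
    centers, members, labels = [], [], []
    for vec in vectors:
        # Step B: find the most similar existing cluster, if any exists.
        best, best_sim = -1, float("-inf")
        for idx, center in enumerate(centers):
            sim = cosine(vec, center)
            if sim > best_sim:
                best, best_sim = idx, sim
        if best >= 0 and best_sim > threshold:
            # Step C: join the nearest cluster and recompute its mean center.
            members[best].append(vec)
            n = len(members[best])
            centers[best] = [sum(col) / n for col in zip(*members[best])]
            labels.append(best)
        else:
            # Step A (first vector) or step C (no cluster is close enough):
            # the vector becomes the center of a new cluster.
            centers.append(list(vec))
            members.append([vec])
            labels.append(len(centers) - 1)
    # Step D: all vectors processed, output the clustering result.
    return labels, centers

vectors = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9]]
labels, centers = single_pass(vectors, threshold=0.8)
print(labels)
```

Because the result depends on the processing order and on the threshold, both would need tuning on real semantic feature vectors.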
Step S203, extracting keywords from all the semantic clusters, and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords.
In this embodiment, a preset number of keywords are extracted for each semantic cluster, and the method for extracting keywords includes, but is not limited to, TF-IDF (word frequency-inverse document frequency) algorithm, LDA algorithm, and the like.
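As an illustration of TF-IDF keyword extraction per semantic cluster, the sketch below treats the merged tokens of each cluster as one document. That treatment, the function name `top_keywords`, the unsmoothed `log(n / df)` weighting and the toy tokens are all assumptions for the example.

```python
import math
from collections import Counter

def top_keywords(clusters, k):
    """clusters: one token list per semantic cluster. Returns top-k keywords per cluster."""
    n = len(clusters)
    # Document frequency: in how many clusters each token appears.
    df = Counter(tok for toks in clusters for tok in set(toks))
    result = []
    for toks in clusters:
        tf = Counter(toks)
        scores = {t: (c / len(toks)) * math.log(n / df[t]) for t, c in tf.items()}
        # Highest score first; ties broken alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        result.append(ranked[:k])
    return result

clusters = [
    ["refund", "refund", "order", "price"],
    ["delivery", "delivery", "order", "courier"],
]
keywords = top_keywords(clusters, k=2)
print(keywords)
```

A token such as "order" that occurs in every cluster receives a score of zero and is therefore never preferred as a keyword.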
The extracted keywords are then sorted by similarity to form a semantic feature queue. Specifically, the similarity between the keywords of each semantic cluster is calculated, the keywords are ranked according to the similarity to obtain a ranking result, and a semantic feature queue corresponding to each semantic cluster is generated based on the ranking result.
The similarity between the keywords of each semantic cluster may be computed with a similarity algorithm including, but not limited to, cosine similarity, Levenshtein distance, and the like. After the similarity between the keywords has been calculated, the keywords are sorted in descending order of similarity, forming a semantic feature queue for the semantic cluster in which similarity runs from high to low.
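One possible reading of the ranking step is sketched below. The text does not say how pairwise similarities are reduced to a single ranking, so scoring each keyword by its mean cosine similarity to the other keywords of the cluster is an assumption, as are the function name `build_feature_queue` and the toy keyword vectors.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def build_feature_queue(keyword_vectors):
    """keyword_vectors: dict keyword -> vector. Returns keywords, most similar first."""
    scores = {}
    for word, vec in keyword_vectors.items():
        others = [cosine(vec, v) for w, v in keyword_vectors.items() if w != word]
        scores[word] = sum(others) / len(others) if others else 0.0
    # Queue ordered from high to low similarity.
    return sorted(scores, key=scores.get, reverse=True)

cluster_keywords = {
    "refund": [1.0, 0.0],
    "return": [0.9, 0.1],
    "delivery": [0.0, 1.0],
}
queue = build_feature_queue(cluster_keywords)
print(queue)
```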
Step S204, selecting keywords from each semantic feature queue as target keywords, and generating word sense vectors based on the target keywords.
In the present embodiment, the text classification model is trained by reinforcement learning. Before training, the following definitions are made:
Before each training round, a keyword is randomly selected from each semantic feature queue, its corresponding vector is obtained with the semantic feature extraction model, and the vector is multiplied by a preset coefficient.
The classification reward is defined as the predicted label class value divided by the correct label value of the sample, multiplied by the reciprocal of the proportion of that class in the whole sample set. For example, assume the samples have 5 categories in total, where category 1 accounts for 1/10, category 2 for 1/5, category 3 for 1/4, category 4 for 1/3 and category 5 for 7/60. The reward coefficient is then 10 for category 1, 5 for category 2, 4 for category 3, 3 for category 4, and 60/7 (about 8.57) for category 5.
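The reward coefficients in this example can be checked with a few lines of exact arithmetic. The sketch covers only the inverse-proportion coefficient; the function name `reward_coefficients` is hypothetical.

```python
from fractions import Fraction

def reward_coefficients(proportions):
    """Reward coefficient of each class = reciprocal of its share of the samples."""
    return {label: 1 / share for label, share in proportions.items()}

proportions = {
    1: Fraction(1, 10),
    2: Fraction(1, 5),
    3: Fraction(1, 4),
    4: Fraction(1, 3),
    5: Fraction(7, 60),
}
assert sum(proportions.values()) == 1  # the five shares cover the whole sample set

coefficients = reward_coefficients(proportions)
for label, coeff in coefficients.items():
    print(label, round(float(coeff), 2))
```

Rarer classes receive larger coefficients, which is how the reward counteracts sample imbalance.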
Before each round of training, a preset number of keywords are randomly selected from each semantic feature queue to serve as target keywords. In this embodiment, one keyword is randomly selected from each queue.
Features are then extracted from the target keywords: the vector corresponding to each target keyword is obtained with the semantic feature extraction model and multiplied by a preset coefficient, giving the keyword vector corresponding to that target keyword.
It should be noted that the preset coefficient is the reciprocal of the absolute value of the contour coefficient of the semantic cluster in which the target keyword is located.
The keyword vector is spliced with the semantic feature vector obtained through the semantic feature extraction model to obtain a word semantic vector.
In the embodiment, the keyword vectors and the semantic feature vectors selected from different semantic clusters are spliced, and the obtained word semantic vectors are used for training the classification model, so that the classification model can learn different semantic features in the training sample, and the accuracy of model classification is improved.
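The scaling and splicing described above can be sketched as follows. The function name `word_semantic_vector`, the concatenation order (keyword vector first) and the error raised for a zero contour coefficient are assumptions for the example.

```python
def word_semantic_vector(keyword_vec, feature_vec, contour_coefficient):
    """Scale the keyword vector by 1/|s| and splice it with the semantic feature vector."""
    if contour_coefficient == 0:
        # Assumption: the reciprocal is undefined at zero, so reject it explicitly.
        raise ValueError("contour coefficient must be non-zero")
    coeff = 1.0 / abs(contour_coefficient)    # preset coefficient
    scaled = [coeff * x for x in keyword_vec]  # keyword vector
    return scaled + list(feature_vec)          # spliced word semantic vector

vec = word_semantic_vector([0.2, 0.4], [1.0, 2.0, 3.0], contour_coefficient=-0.5)
print(vec)
```

Keywords from poorly separated clusters (small absolute contour coefficient) are amplified, so they contribute more strongly to the input of the classification model.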
Step S205, inputting the word sense vector into a pre-constructed initial classification model for training to obtain a trained target classification model.
In this embodiment, the pre-built initial classification model is a multi-layer neural network model with N actions corresponding to the classification, the word sense vector is input into the multi-layer neural network model, and training and updating are performed on the multi-layer neural network model according to the classification reward value, so that the classification reward value is maximized, where N is a natural number greater than zero.
As a specific implementation, the multi-layer neural network model comprises an input layer, a first hidden layer, a second hidden layer and an output layer. The input layer receives the word sense vector $v$. The first hidden layer has weight matrix $w_1$, bias $b_1$ and a ReLU activation function, so its output is $o_1 = \mathrm{relu}(w_1 v + b_1)$. The second hidden layer has weight matrix $w_2$, bias $b_2$ and a ReLU activation function, so its output is $o_2 = \mathrm{relu}(w_2 o_1 + b_2)$. The output layer is a softmax layer: $o_2$ is input into the softmax layer to obtain $o_3$, which is the class probability $Pa$ predicted in each training round.
In order to achieve better text classification, more hidden layers can be set according to the actual situation.
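The forward pass of this two-hidden-layer network can be sketched in plain Python. The weights below are toy values chosen for illustration, and the helper names `affine`, `relu`, `softmax` and `forward` are hypothetical. Because the softmax layer is applied directly to the second hidden layer's output, that layer is assumed to have one unit per class.

```python
import math

def affine(w, x, b):
    """Compute w @ x + b for a row-major weight matrix w."""
    return [sum(wi * xi for wi, xi in zip(row, x)) + bi for row, bi in zip(w, b)]

def relu(xs):
    return [max(0.0, x) for x in xs]

def softmax(xs):
    m = max(xs)  # subtract the maximum for numerical stability
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def forward(v, w1, b1, w2, b2):
    o1 = relu(affine(w1, v, b1))   # first hidden layer
    o2 = relu(affine(w2, o1, b2))  # second hidden layer, one unit per class
    return softmax(o2)             # o3: predicted class probabilities

# Toy parameters: 2 inputs, 2 hidden units, 3 classes.
w1, b1 = [[1.0, 0.0], [0.0, 1.0]], [0.0, 0.0]
w2, b2 = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]], [0.0, 0.0, 0.0]
probs = forward([1.0, -1.0], w1, b1, w2, b2)
print(probs)
```

Training would adjust `w1`, `b1`, `w2` and `b2` from the reward-weighted loss; that step is not shown here.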
Step S206, obtaining a text to be classified, inputting the text to be classified into a target classification model, and outputting a text classification result.
And acquiring the text to be classified, inputting the text to be classified into the target classification model, and outputting a text classification result.
According to the method, the semantic feature extraction is carried out on the training text corpus, the extracted semantic feature vectors are clustered to obtain different semantic clusters, and the classification model is trained according to the different semantic clusters, so that the classification model learns the semantic features of different categories in the training text corpus, and the text classification accuracy can be improved.
In some optional implementations of this embodiment, before the step of inputting the semantic feature vector into the trained clustering model and outputting the semantic cluster, the method further includes:
inputting the semantic feature vector into a pre-constructed neural network model, and outputting a clustering result;
determining a clustering loss function according to the clustering result;
model parameters of the neural network model are adjusted based on the clustering loss function;
And when the iteration ending condition is met, generating a clustering model according to the model parameters.
In this embodiment, the pre-built neural network model may have the same structure as the classification model, comprising an input layer, a first hidden layer, a second hidden layer and an output layer. The input layer receives the semantic feature vector $x$. The first hidden layer has weight matrix $w_1$, bias $b_1$ and a ReLU activation function, so its output is $o_1 = \mathrm{relu}(w_1 x + b_1)$. The second hidden layer has weight matrix $w_2$, bias $b_2$ and a ReLU activation function, so its output is $o_2 = \mathrm{relu}(w_2 o_1 + b_2)$. The output layer is a softmax layer: $o_2$ is input into the softmax layer to obtain $o_3$, which is the probability $Pc$ of each action. A different number of hidden layers can be set according to actual needs.
In this embodiment, model parameters of the neural network model are adjusted based on the loss function, and when the iteration end condition is satisfied, a cluster model is generated according to the model parameters.
Specifically, the model parameters of the neural network model are adjusted based on the value of the loss function, and iterative training continues. Once the model has been trained sufficiently, its performance reaches an optimal state and the loss function value hardly changes, that is, the model has converged. Meeting the iteration end condition means that the model has converged; after convergence, the final neural network model with the last adjusted model parameters is output as the clustering model.
According to the method, the neural network model pre-constructed through reinforcement learning training is used as a clustering model, so that the clustering precision can be improved, and the text classification efficiency can be improved.
In some optional implementations, the step of determining the cluster loss function according to the clustering result includes:
Calculating the contour coefficient of each cluster in the clustering result;
obtaining training reward points according to the contour coefficients;
and obtaining a clustering loss function based on the clustering result and the training reward score.
The contour coefficient (commonly known as the silhouette coefficient) is an index used in clustering algorithms to measure the quality of a clustering result.
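For each semantic feature vector $o$ in cluster $D$, the average distance $a(o)$ between $o$ and the other objects in the cluster to which $o$ belongs is calculated as follows:

$$a(o) = \frac{\sum_{o' \in D,\ o' \neq o} \mathrm{dist}(o, o')}{|D| - 1}$$

$b(o)$ is the minimum average distance from $o$ to all clusters $D'$ not containing $o$, and the formula is as follows:

$$b(o) = \min_{D' \neq D} \left\{ \frac{\sum_{o' \in D'} \mathrm{dist}(o, o')}{|D'|} \right\}$$

The contour coefficient is:

$$s(o) = \frac{b(o) - a(o)}{\max\{a(o),\ b(o)\}}$$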
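As a non-limiting illustration, the contour coefficient of a single semantic feature vector can be computed as sketched below. The sketch assumes the standard silhouette definition s(o) = (b(o) - a(o)) / max{a(o), b(o)} and Euclidean distance; the function names, the toy two-dimensional vectors, and the convention of returning 0 for a singleton cluster are assumptions made for the example.

```python
import math

def dist(p, q):
    # Euclidean distance between two vectors (assumed distance measure).
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))

def contour_coefficient(index, own_cluster, other_clusters):
    # own_cluster[index] is the vector o; other_clusters are non-empty
    # clusters that do not contain o.
    o = own_cluster[index]
    rest = [p for i, p in enumerate(own_cluster) if i != index]
    if not rest or not other_clusters:
        return 0.0  # assumed convention when a(o) or b(o) is undefined
    a = sum(dist(o, p) for p in rest) / len(rest)
    b = min(sum(dist(o, p) for p in c) / len(c) for c in other_clusters)
    denom = max(a, b)
    return (b - a) / denom if denom > 0 else 0.0

# Hypothetical toy clusters of 2-dimensional feature vectors.
cluster_a = [(0.0, 0.0), (0.0, 1.0)]
cluster_b = [(10.0, 0.0), (10.0, 1.0)]
s = contour_coefficient(0, cluster_a, [cluster_b])
```

For these well-separated toy clusters, `s` is close to 1, which indicates that the vector is much nearer to its own cluster than to the other one.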
In the present embodiment, the clustering model is trained by reinforcement learning, and the following definitions are made prior to training:
The reward value is defined as follows: when the contour coefficient is less than the preset threshold $Tg$, the reward value is $1 + 1/|s(o)|$; otherwise, the reward value is $-|s(o)|$.
At the end of each training period, the training reward score $S_1$ accumulated up to that point in time is calculated by the following formula:
Wherein $\gamma$ is a gain attenuation coefficient, $n$ is the number of training periods, $i = 1$ to $(n-1)$, and $S_t$ is the reward score obtained in the $t$-th training period, that is, $S_t = 1 + 1/|s(o)|$ when the contour coefficient is smaller than the preset threshold $Tg$, and $S_t = -|s(o)|$ otherwise.
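As a non-limiting illustration, the per-period reward rule stated above can be written as a small function. The function name `period_reward`, the example threshold, and the small constant `EPS` that guards against division by zero when the contour coefficient is exactly 0 are assumptions added for the sketch; the accumulation of the rewards into the training reward score is not shown.

```python
EPS = 1e-8  # assumed guard against division by zero; not part of the text

def period_reward(s, tg):
    # Reward as stated: 1 + 1/|s(o)| if s(o) < Tg, otherwise -|s(o)|.
    if s < tg:
        return 1.0 + 1.0 / max(abs(s), EPS)
    return -abs(s)

# Hypothetical threshold Tg = 0.8.
reward_below = period_reward(0.5, 0.8)  # 1 + 1/0.5
reward_above = period_reward(0.9, 0.8)  # -|0.9|
```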
In an embodiment, obtaining the clustering loss function based on the clustering result and the training reward score specifically includes:
Calculating the logarithm of the clustering result, multiplying the logarithm by the training reward score to obtain a product, and taking the negative of the product as the clustering loss function. The clustering loss function is therefore calculated as follows:

$$\mathrm{Loss} = -S_1 \times \log Pc_t$$

where $Pc_t$ represents the probability of the action in the $t$-th round.
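As a non-limiting illustration, the clustering loss can be evaluated as sketched below. The function name `clustering_loss` is hypothetical, and the natural logarithm is an assumption, since the base of the logarithm is not stated.

```python
import math

def clustering_loss(reward_score, action_prob):
    # Loss = -S1 * log(Pc_t), using the natural logarithm (assumed).
    return -reward_score * math.log(action_prob)

# With a positive reward score, a more probable action gives a lower loss.
loss_low_prob = clustering_loss(2.0, 0.2)
loss_high_prob = clustering_loss(2.0, 0.9)
```

Minimizing this loss therefore raises the probability of actions that received a positive training reward score and lowers the probability of actions that received a negative one.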
In this embodiment, the model is iteratively updated using the clustering loss function obtained from the clustering result and the training reward score, so that the training reward score is maximized and the accuracy of the model is ensured.
In some optional implementations of this embodiment, the step of inputting the word sense vector into the pre-constructed initial classification model to perform training, and obtaining the trained target classification model includes:
inputting the word sense vector into a pre-constructed initial classification model to obtain a prediction classification result;
determining a classification loss function according to the prediction classification result;
adjusting model parameters of the initial classification model according to the classification loss function;
and when the iteration end condition is met, generating a target classification model based on the model parameters.
Specifically, the model parameters of the initial classification model are adjusted based on the value of the classification loss function, and iterative training continues. Once the model has been trained to the point where its performance is optimal and the loss function value hardly changes, the model has converged. Model convergence is reached when the iteration end condition is met; after convergence, the classification model with the finally adjusted model parameters is output as the target classification model.
In this embodiment, the classification model is trained with the word sense vectors obtained by splicing the keyword vectors and the semantic feature vectors of different semantic clusters, so that the classification model can learn the different categories and the implicit semantic features in the training text corpus, which further improves the accuracy of text classification.
In this embodiment, the step of determining the classification loss function according to the prediction classification result includes:
calculating a classification reward value according to the prediction classification result;
obtaining a classification loss function based on the classification reward value and the prediction classification result.
Specifically, the logarithm of the prediction classification result $Pa$ is calculated, the logarithm is multiplied by the classification reward value $S_2$ to obtain a product, and the negative of the product is taken as the classification loss function. The classification loss function is therefore calculated as follows:

$$\mathrm{Loss} = -S_2 \times \log Pa_t$$

where $Pa_t$ represents the classification label probability output by training in the $t$-th round.
In this embodiment, the model is iteratively updated using the classification loss function obtained from the prediction classification result and the classification reward value, so that the classification reward value is maximized, which ensures the accuracy of the classification model and improves the accuracy of text classification.
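As a non-limiting illustration of why minimizing this loss tends to maximize the reward, the sketch below applies one gradient descent step to the values fed into a softmax layer. For Loss = -S2 × log Pa_t with Pa = softmax(z), the gradient with respect to z_j is S2 × (Pa_j - 1) for the output label and S2 × Pa_j for the other labels. Updating the softmax inputs directly (rather than the weights of a network), the learning rate, and the names `softmax` and `reward_weighted_step` are assumptions made for the example.

```python
import math

def softmax(z):
    m = max(z)
    exps = [math.exp(v - m) for v in z]
    total = sum(exps)
    return [e / total for e in exps]

def reward_weighted_step(z, label, reward, lr=0.5):
    # Gradient of -reward * log(softmax(z)[label]) with respect to z.
    probs = softmax(z)
    grad = [reward * (p - (1.0 if j == label else 0.0))
            for j, p in enumerate(probs)]
    return [v - lr * g for v, g in zip(z, grad)]

z = [0.0, 0.0, 0.0]  # hypothetical softmax inputs for three labels
before = softmax(z)[1]
after_pos = softmax(reward_weighted_step(z, label=1, reward=1.0))[1]
after_neg = softmax(reward_weighted_step(z, label=1, reward=-1.0))[1]
```

A positive classification reward value increases the probability of the output label on the next round, and a negative one decreases it.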
The application is operational with numerous general purpose or special purpose computer system environments or configurations, for example: personal computers, server computers, hand-held or portable devices, tablet devices, multiprocessor systems, microprocessor-based systems, set top boxes, programmable consumer electronics, network PCs, minicomputers, mainframe computers, distributed computing environments that include any of the above systems or devices, and the like. The application may be described in the general context of computer-executable instructions, such as program modules, being executed by a computer. Generally, program modules include routines, programs, objects, components, data structures, etc. that perform particular tasks or implement particular abstract data types. The application may also be practiced in distributed computing environments where tasks are performed by remote processing devices that are linked through a communications network. In a distributed computing environment, program modules may be located in both local and remote computer storage media including memory storage devices.
Those skilled in the art will appreciate that all or part of the above-described methods may be implemented by computer readable instructions stored in a computer readable storage medium, and that the instructions, when executed, may comprise the steps of the embodiments of the methods described above. The storage medium may be a nonvolatile storage medium such as a magnetic disk, an optical disk or a read-only memory (ROM), or may be a random access memory (RAM).
It should be understood that, although the steps in the flowcharts of the figures are shown in order as indicated by the arrows, these steps are not necessarily performed in order as indicated by the arrows. The steps are not strictly limited in order and may be performed in other orders, unless explicitly stated herein. Moreover, at least some of the steps in the flowcharts of the figures may include a plurality of sub-steps or stages that are not necessarily performed at the same time, but may be performed at different times, the order of their execution not necessarily being sequential, but may be performed in turn or alternately with other steps or at least a portion of the other steps or stages.
With further reference to fig. 3, as an implementation of the method shown in fig. 2, the present application provides an embodiment of a text classification apparatus based on reinforcement learning, where the embodiment of the apparatus corresponds to the embodiment of the method shown in fig. 2, and the apparatus is particularly applicable to various electronic devices.
As shown in fig. 3, the text classification device 300 based on reinforcement learning according to the present embodiment includes a semantic feature extraction module 301, a clustering module 302, a keyword extraction module 303, a vector generation module 304, a training module 305, and a classification module 306. Wherein:
The semantic feature extraction module 301 is configured to obtain a training text corpus, and perform semantic feature extraction on the training text corpus to obtain a semantic feature vector;
The clustering module 302 is configured to input the semantic feature vector into a trained clustering model, and output a semantic cluster;
the keyword extraction module 303 is configured to extract keywords from all the semantic clusters, and form a semantic feature queue corresponding to each semantic cluster according to the extracted keywords;
the vector generation module 304 is configured to select a keyword from each semantic feature queue as a target keyword, and generate a word sense vector based on the target keyword;
The training module 305 is configured to input the word sense vector into a pre-constructed initial classification model for training, so as to obtain a trained target classification model;
the classification module 306 is configured to obtain a text to be classified, input the text to be classified into the target classification model, and output a text classification result.
According to the text classification device based on reinforcement learning, semantic feature extraction is performed on the training text corpus, the extracted semantic feature vectors are clustered to obtain semantic clusters of different categories, and the classification model is trained by using the semantic clusters of different categories, so that the classification model learns the semantic features of different categories in the training text corpus, and the text classification accuracy can be improved.
In some optional implementations of the present embodiment, the reinforcement learning-based text classification device 300 further includes a cluster training module including a clustering sub-module, a computing sub-module, an adjusting sub-module, and a generating sub-module, wherein:
the clustering sub-module is used for inputting the semantic feature vector into a pre-constructed neural network model and outputting a clustering result;
the calculation sub-module is used for determining a clustering loss function according to the clustering result;
the adjustment submodule is used for adjusting model parameters of the neural network model based on the clustering loss function;
And the generation submodule is used for generating a clustering model according to the model parameters when the iteration ending condition is met.
In this embodiment, the pre-constructed neural network model trained through reinforcement learning is used as the clustering model, which can improve the clustering precision and, in turn, the text classification efficiency.
In this embodiment, the calculation submodule is further configured to:
calculating the contour coefficient of each cluster in the clustering result;
obtaining a training reward score according to the contour coefficients;
and obtaining the clustering loss function based on the clustering result and the training reward score.
In this embodiment, the model is iteratively updated using the clustering loss function obtained from the clustering result and the training reward score, so that the training reward score is maximized and the accuracy of the model is ensured.
In some optional implementations of the present embodiment, the keyword extraction module 303 includes a similarity calculation sub-module, a sorting sub-module, and a generation sub-module, where:
The similarity calculation submodule is used for calculating the similarity between the keywords of each semantic cluster;
the sorting sub-module is used for sorting the keywords according to the similarity to obtain a sorting result;
And the generation submodule is used for generating a semantic feature queue corresponding to each semantic cluster based on the sequencing result.
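As a non-limiting illustration, the three sub-modules can be sketched as a similarity computation, a sort, and the construction of a queue. This section does not fix the similarity measure, so the sketch assumes that each keyword has a vector and scores each keyword by its mean cosine similarity to the other keywords of the same semantic cluster; the function names, the example keywords and their vectors are hypothetical.

```python
import math
from collections import deque

def cosine(u, v):
    # Cosine similarity; returns 0.0 if either vector has zero length.
    dot = sum(a * b for a, b in zip(u, v))
    nu = math.sqrt(sum(a * a for a in u))
    nv = math.sqrt(sum(b * b for b in v))
    return dot / (nu * nv) if nu and nv else 0.0

def build_feature_queue(keyword_vectors):
    # keyword_vectors maps each keyword of one semantic cluster to a vector.
    scores = {}
    for word, vec in keyword_vectors.items():
        others = [v for w, v in keyword_vectors.items() if w != word]
        scores[word] = (sum(cosine(vec, v) for v in others) / len(others)
                        if others else 0.0)
    # Sort by similarity, highest first, and store the result as a queue.
    ranked = sorted(scores, key=scores.get, reverse=True)
    return deque(ranked)

# Hypothetical keywords of one semantic cluster with toy 2-d vectors.
cluster_keywords = {
    "refund": [1.0, 0.0],
    "return": [0.9, 0.1],
    "weather": [0.0, 1.0],
}
queue = build_feature_queue(cluster_keywords)
```

With this ordering, the keywords most representative of the cluster sit at the head of the semantic feature queue and are the first candidates when a target keyword is selected.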
In this embodiment, the vector generation module 304 includes an extraction sub-module and a stitching sub-module, where:
The extraction submodule is used for extracting the characteristics of the target keywords to obtain keyword vectors;
And the splicing sub-module is used for splicing the keyword vector and the semantic feature vector to obtain a word sense vector.
In this embodiment, the keyword vectors selected from different semantic clusters are spliced with the semantic feature vectors, and the resulting word sense vectors are used for training the classification model, so that the classification model can learn different semantic features in the training samples, which improves the accuracy of model classification.
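As a non-limiting illustration, and assuming that splicing means end-to-end concatenation of the two vectors, the operation can be sketched as follows. The function name `splice` and the toy vectors are hypothetical.

```python
def splice(keyword_vector, semantic_feature_vector):
    # Concatenate the keyword vector and the semantic feature vector.
    return list(keyword_vector) + list(semantic_feature_vector)

keyword_vector = [0.2, 0.7]
semantic_feature_vector = [0.1, 0.4, 0.5]
word_sense_vector = splice(keyword_vector, semantic_feature_vector)
```

Under this assumption, the dimension of the word sense vector is the sum of the dimensions of the two input vectors, which determines the input size of the classification model.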
In some alternative implementations of the present embodiment, the training module 305 includes a classification sub-module, a calculation sub-module, an adjustment sub-module, and an output sub-module, where:
the classification submodule is used for inputting the word sense vector into a pre-constructed initial classification model to obtain a prediction classification result;
The calculation sub-module is used for determining a classification loss function according to the prediction classification result;
the adjustment submodule is used for adjusting model parameters of the initial classification model according to the classification loss function;
And the output submodule is used for generating a target classification model based on the model parameters when the iteration ending condition is met.
In this embodiment, the classification model is trained with the word sense vectors obtained by splicing the keyword vectors and the semantic feature vectors of different semantic clusters, so that the classification model can learn the different categories and the implicit semantic features in the training text corpus, which further improves the accuracy of text classification.
In this embodiment, the calculation submodule is further configured to:
calculating a classification reward value according to the prediction classification result;
and obtaining a classification loss function based on the classification reward value and the prediction classification result.
In this embodiment, the model is iteratively updated using the classification loss function obtained from the prediction classification result and the classification reward value, so that the classification reward value is maximized, which ensures the accuracy of the classification model and improves the accuracy of text classification.
In order to solve the technical problems, the embodiment of the application also provides computer equipment. Referring specifically to fig. 4, fig. 4 is a basic structural block diagram of a computer device according to the present embodiment.
The computer device 4 comprises a memory 41, a processor 42 and a network interface 43 that are communicatively connected to each other via a system bus. It should be noted that only the computer device 4 having components 41-43 is shown in the figure, but it should be understood that not all of the illustrated components are required to be implemented, and that more or fewer components may be implemented instead. It will be appreciated by those skilled in the art that the computer device herein is a device capable of automatically performing numerical calculation and/or information processing according to preset or stored instructions, and that its hardware includes, but is not limited to, a microprocessor, an application-specific integrated circuit (ASIC), a field-programmable gate array (FPGA), a digital signal processor (DSP), an embedded device, and the like.
The computer equipment can be a desktop computer, a notebook computer, a palm computer, a cloud server and other computing equipment. The computer equipment can perform man-machine interaction with a user through a keyboard, a mouse, a remote controller, a touch pad or voice control equipment and the like.
The memory 41 includes at least one type of readable storage medium, including flash memory, hard disk, multimedia card, card memory (e.g., SD or DX memory, etc.), random access memory (RAM), static random access memory (SRAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), programmable read-only memory (PROM), magnetic memory, magnetic disk, optical disk, etc. In some embodiments, the memory 41 may be an internal storage unit of the computer device 4, such as a hard disk or an internal memory of the computer device 4. In other embodiments, the memory 41 may also be an external storage device of the computer device 4, such as a plug-in hard disk, a smart media card (SMC), a secure digital (SD) card, a flash card or the like provided on the computer device 4. Of course, the memory 41 may also comprise both an internal storage unit of the computer device 4 and an external storage device. In this embodiment, the memory 41 is typically used to store an operating system and various application software installed on the computer device 4, such as computer readable instructions of the text classification method based on reinforcement learning. Further, the memory 41 may be used to temporarily store various types of data that have been output or are to be output.
The processor 42 may be a central processing unit (Central Processing Unit, CPU), controller, microcontroller, microprocessor, or other data processing chip in some embodiments. The processor 42 is typically used to control the overall operation of the computer device 4. In this embodiment, the processor 42 is configured to execute computer readable instructions stored in the memory 41 or process data, for example, execute computer readable instructions of the text classification method based on reinforcement learning.
The network interface 43 may comprise a wireless network interface or a wired network interface, which network interface 43 is typically used for establishing a communication connection between the computer device 4 and other electronic devices.
In this embodiment, when the processor executes the computer readable instructions stored in the memory, the steps of the text classification method based on reinforcement learning in the above embodiment are implemented: semantic feature extraction is performed on the training text corpus, the extracted semantic feature vectors are clustered to obtain different semantic clusters, and the classification model is trained according to the different semantic clusters, so that the accuracy of text classification can be improved.
The application also provides another embodiment, namely a computer readable storage medium. The computer readable storage medium stores computer readable instructions that can be executed by at least one processor, so that the at least one processor executes the steps of the text classification method based on reinforcement learning described above. By extracting semantic features from the training text corpus, clustering the extracted semantic feature vectors to obtain different semantic clusters, and training the classification model according to the different semantic clusters, the accuracy of text classification can be improved.
From the above description of the embodiments, it will be clear to those skilled in the art that the above-described embodiment method may be implemented by means of software plus a necessary general hardware platform, but of course may also be implemented by means of hardware, but in many cases the former is a preferred embodiment. Based on such understanding, the technical solution of the present application may be embodied essentially or in a part contributing to the prior art in the form of a software product stored in a storage medium (e.g. ROM/RAM, magnetic disk, optical disk) comprising instructions for causing a terminal device (which may be a mobile phone, a computer, a server, an air conditioner, or a network device, etc.) to perform the method according to the embodiments of the present application.
It is apparent that the above-described embodiments are only some, not all, of the embodiments of the present application. The preferred embodiments of the present application are shown in the drawings, which do not limit the scope of the patent claims. This application may be embodied in many different forms and is not limited to the embodiments described herein; rather, these embodiments are provided so that the present disclosure will be understood thoroughly and completely. Although the application has been described in detail with reference to the foregoing embodiments, it will be apparent to those skilled in the art that modifications may be made to the embodiments described above, or that equivalents may be substituted for some of their elements. All equivalent structures made using the content of the specification and the drawings of the application, whether applied directly or indirectly in other related technical fields, likewise fall within the scope of the application.

Claims (9)

1. A text classification method based on reinforcement learning, comprising the steps of:
acquiring a training text corpus, and extracting semantic features of the training text corpus to obtain a semantic feature vector;
inputting the semantic feature vectors into a trained clustering model, and outputting semantic clusters;
Extracting keywords from all the semantic clusters, and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords;
Selecting keywords from each semantic feature queue as target keywords, and generating word sense vectors based on the target keywords;
inputting the word sense vector into a pre-constructed initial classification model for training to obtain a trained target classification model;
Obtaining a text to be classified, inputting the text to be classified into the target classification model, and outputting a text classification result;
wherein, before the step of inputting the semantic feature vector into the trained cluster model and outputting the semantic cluster, the method further comprises the following steps:
Inputting the semantic feature vector into a pre-constructed neural network model, and outputting a clustering result;
determining a clustering loss function according to the clustering result;
the step of determining a clustering loss function according to the clustering result comprises the following steps:
calculating the contour coefficient of each cluster in the clustering result;
obtaining a training reward score according to the contour coefficients;
And obtaining the clustering loss function based on the clustering result and the training reward score.
2. The reinforcement learning based text classification method of claim 1, further comprising, prior to said step of inputting said semantic feature vectors into a trained cluster model, outputting semantic clusters:
adjusting model parameters of the neural network model based on the cluster loss function;
and when the iteration ending condition is met, generating a clustering model according to the model parameters.
3. The reinforcement learning-based text classification method of claim 1, wherein said step of forming a semantic feature queue corresponding to each of said semantic clusters according to the extracted keywords comprises:
calculating the similarity between the keywords of each semantic cluster;
sorting the keywords according to the similarity to obtain a sorting result;
And generating a semantic feature queue corresponding to each semantic cluster based on the sequencing result.
4. The reinforcement learning based text classification method of claim 1, wherein said step of generating a word sense vector based on said target keyword comprises:
Extracting features of the target keywords to obtain keyword vectors;
and splicing the keyword vector and the semantic feature vector to obtain the word sense vector.
5. The reinforcement learning based text classification method of claim 1, wherein said step of inputting said word sense vector into a pre-constructed initial classification model for training to obtain a trained target classification model comprises:
inputting the word sense vector into a pre-constructed initial classification model to obtain a prediction classification result;
determining a classification loss function according to the prediction classification result;
adjusting model parameters of the initial classification model according to the classification loss function;
and when the iteration ending condition is met, generating a target classification model based on the model parameters.
6. The reinforcement learning based text classification method of claim 5, wherein said step of determining a classification loss function based on said predictive classification result comprises:
calculating a classification reward value according to the prediction classification result;
and obtaining a classification loss function based on the classification reward value and the prediction classification result.
7. A reinforcement learning-based text classification device, comprising:
the semantic feature extraction module is used for obtaining training text corpus, and extracting semantic features of the training text corpus to obtain semantic feature vectors;
The clustering module is used for inputting the semantic feature vectors into the trained clustering model and outputting semantic clusters;
The keyword extraction module is used for extracting keywords from all the semantic clusters and forming a semantic feature queue corresponding to each semantic cluster according to the extracted keywords;
The vector generation module is used for selecting keywords from each semantic feature queue as target keywords and generating word sense vectors based on the target keywords;
the training module is used for inputting the word sense vector into a pre-constructed initial classification model for training to obtain a trained target classification model;
the classification module is used for acquiring texts to be classified, inputting the texts to be classified into the target classification model and outputting text classification results;
The text classification device based on reinforcement learning further comprises a clustering training module, wherein the clustering training module comprises a clustering sub-module and a calculation sub-module, wherein:
The clustering sub-module is used for inputting the semantic feature vector into a pre-constructed neural network model and outputting a clustering result;
The calculation submodule is used for determining a clustering loss function according to the clustering result;
The computation submodule is further to:
calculating the contour coefficient of each cluster in the clustering result;
obtaining a training reward score according to the contour coefficients;
And obtaining the clustering loss function based on the clustering result and the training reward score.
8. A computer device comprising a memory having stored therein computer readable instructions which when executed implement the steps of the reinforcement learning based text classification method of any of claims 1 to 6.
9. A computer readable storage medium having stored thereon computer readable instructions which when executed by a processor implement the steps of the reinforcement learning based text classification method of any of claims 1 to 6.
CN202210433355.4A 2022-04-24 2022-04-24 Text classification method, device, computer equipment and medium based on reinforcement learning Active CN114780727B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210433355.4A CN114780727B (en) 2022-04-24 2022-04-24 Text classification method, device, computer equipment and medium based on reinforcement learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210433355.4A CN114780727B (en) 2022-04-24 2022-04-24 Text classification method, device, computer equipment and medium based on reinforcement learning

Publications (2)

Publication Number Publication Date
CN114780727A CN114780727A (en) 2022-07-22
CN114780727B true CN114780727B (en) 2025-02-25

Family

ID=82432505

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210433355.4A Active CN114780727B (en) 2022-04-24 2022-04-24 Text classification method, device, computer equipment and medium based on reinforcement learning

Country Status (1)

Country Link
CN (1) CN114780727B (en)

Families Citing this family (14)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115062602B (en) * 2022-08-17 2022-11-11 杭州火石数智科技有限公司 Sample construction method and device for contrast learning and computer equipment
CN115131589B (en) * 2022-08-31 2022-11-22 天津艺点意创科技有限公司 Image generation method for intelligent design of Internet literary works
CN116150698B (en) * 2022-09-08 2023-08-22 天津大学 A DRG automatic grouping method and system based on semantic information fusion
CN115730237B (en) * 2022-11-28 2024-04-23 智慧眼科技股份有限公司 Junk mail detection method, device, computer equipment and storage medium
CN116128438B (en) * 2022-12-27 2024-07-05 江苏巨楷科技发展有限公司 Intelligent community management system based on big data record information
CN115687944B (en) * 2022-12-27 2023-09-15 荣耀终端有限公司 A short message collection method and related equipment
CN116303949B (en) * 2023-02-24 2024-03-19 科讯嘉联信息技术有限公司 Dialogue processing method, dialogue processing system, storage medium and terminal
CN116339799B (en) * 2023-04-06 2023-11-28 山景智能(北京)科技有限公司 Method, system, terminal equipment and storage medium for intelligent data interface management
CN116167336B (en) * 2023-04-22 2023-07-07 拓普思传感器(太仓)有限公司 Sensor data processing method based on cloud computing, cloud server and medium
CN116882408B (en) * 2023-09-07 2024-02-27 南方电网数字电网研究院有限公司 Construction method and device of transformer graph model, computer equipment and storage medium
CN117496126B (en) * 2023-11-13 2024-04-30 浙江飞图影像科技有限公司 Automatic image positioning system and method based on keywords
CN118093882B (en) * 2024-04-24 2024-11-29 北京邮电大学 Aesthetic-guidance-based text-to-graphic model optimization method, device, equipment and medium
CN119150148A (en) * 2024-11-14 2024-12-17 中电科新型智慧城市研究院有限公司 Event classification method, electronic device, and computer-readable storage medium
CN119271819B (en) * 2024-12-04 2025-03-21 中国电子科技集团公司第二十八研究所 A patent topic clustering method based on reinforcement learning fine-tuning semantic vector model

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110059316A (en) * 2019-04-16 2019-07-26 广东省科技基础条件平台中心 A kind of dynamic scientific and technological resources semantic analysis based on data perception
CN112364937A (en) * 2020-11-30 2021-02-12 腾讯科技(深圳)有限公司 User category determination method and device, recommended content determination method and electronic equipment

Family Cites Families (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20200125639A1 (en) * 2018-10-22 2020-04-23 Ca, Inc. Generating training data from a machine learning model to identify offensive language
CN112749281B (en) * 2021-01-19 2023-04-07 青岛科技大学 Restful type Web service clustering method fusing service cooperation relationship
CN113486670B (en) * 2021-07-23 2023-08-29 平安科技(深圳)有限公司 Text classification method, device, equipment and storage medium based on target semantics

Also Published As

Publication number Publication date
CN114780727A (en) 2022-07-22

CN116911304B (en) Text recommendation method and device
US20240290095A1 (en) Method, electronic device, and computer program product for extracting target frame
CN119250081A (en) Intent recognition method, recognition model training method and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Country or region after: China

Address after: Room 801, building 2, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen, Guangdong 518000

Applicant after: China Resources Digital Technology Co.,Ltd.

Address before: Room 801, building 2, Shenzhen new generation industrial park, 136 Zhongkang Road, Meidu community, Meilin street, Futian District, Shenzhen, Guangdong 518000

Applicant before: Runlian software system (Shenzhen) Co.,Ltd.

Country or region before: China

GR01 Patent grant