CN110851593B - Complex value word vector construction method based on position and semantics - Google Patents
- Publication number: CN110851593B (application CN201910898057.0A)
- Authority: CN (China)
- Prior art keywords: word, vector, complex, neural network, valued
- Prior art date: 2019-09-23
- Legal status: Active (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06F16/35: Information retrieval of unstructured textual data; clustering; classification
- G06F16/383: Retrieval characterised by using metadata automatically derived from the content
- G06F16/387: Retrieval characterised by using metadata, using geographical or spatial information, e.g. location
- G06F18/214: Pattern recognition; generating training patterns; bootstrap methods, e.g. bagging or boosting
- G06N3/044: Neural networks; recurrent networks, e.g. Hopfield networks
- G06N3/045: Neural networks; combinations of networks
- G06N3/084: Neural network learning methods; backpropagation, e.g. using gradient descent
Abstract
The invention discloses a complex-valued word vector construction method based on position and semantics, which comprises the following steps: collecting a text classification corpus and dividing it into a training set, a verification set and a test set; preprocessing the text in the corpus (removing stop words); constructing sentence representations from relative position information and global semantic information; inputting the word vectors of the training corpus into a complex-valued neural network and training a semantic classification model; inputting the text word vectors of the verification set into the complex-valued neural network model to calculate the prediction probability of each sample; and testing the model selected on the verification set on the test set. The invention overcomes the current shortage of text classification corpora, extracts the feature information of the text (in particular position information) more fully, fuses the position information and the global semantic information of the text, and applies the complex-valued word vectors to a complex-valued neural network, giving the neural network models stronger discrimination capability.
Description
Technical Field
The invention relates to the technical field of text classification, in particular to a complex-valued word vector construction method based on position and semantics.
Background
With the rapid development of science and technology, and particularly of the internet and social networks in recent years, the internet is filled with information of all kinds, including the comments and opinions that users post on social platforms, and it has become one of the main sources of information in people's daily lives. People can acquire a large amount of data through the internet, but how to manage this data reasonably and effectively has become a problem of growing concern. A very common way of managing large amounts of information is classification, so text classification carries great social value. The invention mainly studies the emotion or category of sentences and chapters.
Classification tasks play an important role in natural language processing. They can be roughly divided into binary classification (e.g. spam detection) and multi-class classification (e.g. the emotional state of a text), and their development is of great interest to both industry and academia. The invention not only discusses the emotion of a sentence but also judges the category to which the sentence belongs, which is a fine-grained task in the text classification field. For example, in the review sentence "this makes the serial killer Jeffrey Dahmer boring", the emotion is negative, determined mainly by the word "boring".
The neural network classification method based on complex-valued vector representations of position and semantics aims to distinguish the emotion polarity or category of a given sentence. Industry and academia are certainly aware of the importance of word emotion information in sentences and have attempted to distinguish it better by designing a series of classification models. However, current methods usually ignore the importance of word position information in a sentence and cannot tell whether a word matters because of its semantics or because of its position; when the word order changes but the words do not, the model should recognize that the sentence semantics have changed. The invention therefore focuses on the relation between word order and semantics in sentences, constructs complex-valued word vector encodings of the position and semantic information between words, and extracts classification information through a complex-valued neural network model.
Complex-valued neural network models have already been used by researchers to model some natural language processing tasks, with considerable success. However, current methods only use complex vectors and do not exploit complex-valued word vectors to mine the relative position information of words in sentences.
Disclosure of Invention
The invention aims to overcome the defects of the prior art by providing a text emotion or category classification method based on a neural network model with complex-valued word vectors: a large text corpus is constructed; word vectors and relative position information are constructed for the texts; a text classification model is trained with a complex-valued neural network model, using back propagation and the stochastic gradient descent method Adam to train the network; the prediction result of the optimal model on the test set is then obtained, yielding a more accurate classification result.
The aim of the invention is realized by the following technical scheme:
a complex-valued word vector construction method based on position and semantics comprises the following steps:
(1) Adopting the jieba word segmentation tool to segment chapters and sentences, thereby constructing a multi-modal classification corpus;
(2) Randomly selecting 80% of the N samples from the multi-modal classification corpus constructed in step (1) as a training set, dividing 10% into a verification set and the remaining 10% into a test set, and preprocessing the training set, verification set and test set respectively;
(3) Selecting the preprocessed sentences in the multi-modal classification corpus to construct a complex-valued neural network model, and further constructing the loss function of the complex-valued neural network model:

L = -Σ_i y_i·log(ŷ_i)

where: y_i denotes the true class label and ŷ_i denotes the prediction result;
(4) Training the complex-valued neural network model on a training set to obtain a semantic classification model;
(5) Performing effect verification of the trained neural network model on the verification set, and recording and storing the model parameters when the effect reaches the optimum;
(6) Testing the samples of the test set with the optimal model parameters stored in the previous step, obtaining the prediction result of each test sample, comparing with the test labels, and calculating the classification accuracy.
The complex-valued neural network model construction in the step (3) comprises the following steps:
3.1, constructing an absolute position index for each word according to its position in chapters and sentences, namely:

p_i = e^(iβ·POS)

where p_i denotes the position vector of the current word, POS is its absolute position, β is its initialized period, and i is the imaginary unit;
3.2, obtaining the 300-dimensional word vector w_i of each word in each chapter or sentence with the GloVe tool, establishing at the same time a position-information matrix in which each position index corresponds to a 300-dimensional position vector p_i, and further constructing the relative position index of each word, namely:

x_i = w_i·e^(iβ·POS)
3.3, the word vector of each word of the text, together with its relative position information vector, is input in sentence order into the FastText, LSTM and CNN networks; the specific calculation formulas are as follows:

FastText:

Z_C = σ(Ax - By + b_r) + iσ(Bx - Ay + b_i)

CNN:

Z_C = (Ax - By) + i(Bx + Ay)

where σ denotes the sigmoid activation function, A and B denote the real and imaginary parts of the weights, and x and y denote the real and imaginary parts of the input features;

RNN:

where x_t denotes the input of each word and h_t denotes the resulting network hidden-layer state;
3.4, the final network hidden-layer output h from the above steps is input into a nonlinear fully-connected neural network layer to obtain the neural network representation vector x, and the representation x is input into a softmax classification layer to output the final class y:

y = softmax(W_s·x + b_s)
the relative position index of each word in step 3.2 is a vector representation of the chapter or sentence obtained by the following formula:
relative position between word dimensions:
where: x represents the vector representation of the word, w_{j,k} represents the semantic vector of a word, n represents the word space, and pos, j, k are as defined above;
relative position between words:
where: p and k represent the k-th dimension of the word at the different positions.
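Why this construction encodes relative position: since x_i = w_i·e^(iβ·pos), the Hermitian product of two word representations depends only on the position difference, conj(x_j)·x_k = w_j·w_k·e^(iβ(pos_k - pos_j)). A small sketch verifying this for a single dimension (the numeric values are illustrative, not from the patent):

```python
import numpy as np

beta = 0.7                # illustrative period for one dimension
w_j, w_k = 1.3, -0.4      # illustrative real semantic components

def embed(w, pos):
    return w * np.exp(1j * beta * pos)

# conj(x_j) * x_k depends only on pos_k - pos_j, not on absolute positions
a = np.conj(embed(w_j, 2)) * embed(w_k, 5)    # position difference 3
b = np.conj(embed(w_j, 7)) * embed(w_k, 10)   # position difference 3
print(np.allclose(a, b))                      # True
```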
The beneficial effects of the invention are as follows:
(1) An effective entity text corpus is built, overcoming the current shortage of text classification corpora;
(2) A complex-valued neural network model framework based on position information is developed, using the relative position information of words in a text or sentence as a feature, and applied to classification tasks.
Drawings
FIG. 1 is a flow chart of the method of the present invention;
FIG. 2 is a 3-dimensional complex word vector distribution diagram;
FIG. 3 is a run-time comparison of classification models based on different word vectors.
FIG. 4 is a graph of similarity comparisons of different words under the same sentence for different word vectors.
Detailed Description
The technical solution of the present invention will be described in further detail with reference to the accompanying drawings, but the scope of the present invention is not limited to the following description. FIG. 1 shows the flow of the neural network classification method based on complex-valued vector representations of position and semantics proposed by this method; FIG. 2 shows a possible distribution of a 3-dimensional complex word vector; FIG. 3 shows the run-time comparison of classification models based on different word vectors; FIG. 4 compares the similarity of different words in the same sentence under different word vectors.
the traditional extraction, arrangement and processing of the webpage text content containing the keywords are completely manual, labor and effort are wasted, and the efficiency is low due to the fact that the explosion type of the webpage data is increased and the method is purely manual.
Based on the acquired web-page data, a multi-modal data set is established; the specific steps are as follows:
(1): and (3) segmenting the chapters and sentences by using a jieba segmentation tool, and removing stop words, useless punctuation marks and the like after segmentation, so as to construct a multi-mode classification corpus. The multi-modal classification corpus is a text classification corpus, the total number of samples of the corpus is N, and each sample comprises a text;
(2): randomly selecting 80% of texts from the multi-mode classification corpus constructed in the step (1) as a training set, dividing 10% of texts into a verification set, dividing the rest 10% of texts into a test set, and preprocessing the training set, the verification set and the test set respectively;
(3): for the preprocessed chapters or sentences, according to the absolute position information characteristics of the words in the position construction of the chapters or sentences, the absolute position information characteristics are respectively input into complex-valued Fastext and LSTM, CNN neural network models, and the application method is as follows:
3.1: Construct an absolute position index for each word based on its position in the chapter or sentence. Assuming a word appears at the beginning of a sentence, its position index is marked 1 with period β, and the position indexes of the other words in the sentence increase in turn:

p_i = e^(iβ·POS)

where p_i denotes the position vector of the current word, POS is its absolute position, β is its initialized period, and i is the imaginary unit.
3.2: Obtain the 300-dimensional word vector w_i of each word in each chapter or sentence with the GloVe tool, and at the same time establish a position-information matrix in which each position index corresponds to a 300-dimensional position vector p_i. In the model initialization stage the parameter matrix is initialized from a uniform distribution and updated during model training. At this step we obtain a representation of each word in the chapters and sentences: x_i = w_i·e^(iβ·POS).
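Steps 3.1 and 3.2 can be illustrated with the following sketch. The per-dimension period β is drawn from a uniform distribution, matching the uniform initialization described above, although the exact range is an assumption, and the GloVe vector is replaced by a random stand-in:

```python
import numpy as np

dim = 300
rng = np.random.default_rng(0)
beta = rng.uniform(0.0, 2 * np.pi, size=dim)  # one period per dimension (assumed range)

def complex_embed(w, pos):
    # x_i = w_i * e^{i * beta * POS}: semantics in the magnitude, position in the phase
    return w * np.exp(1j * beta * pos)

w = rng.standard_normal(dim)                  # stand-in for a 300-dim GloVe vector
x1, x5 = complex_embed(w, 1), complex_embed(w, 5)
# the same word at two positions differs only by a phase rotation,
# so its magnitude (the semantic part) is unchanged
print(np.allclose(np.abs(x1), np.abs(x5)))    # True
```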
The relative position index of each word in step 3.2 is a vector representation of the chapter or sentence obtained by the following formula:
the relative positions between word dimensions are derived as follows:
the relative positions between words are derived as follows:
3.3: The word vector of each word of the text, together with its position information vector, is input in sentence order into the FastText, LSTM and CNN networks; the specific calculation formulas are as follows:

FastText:

Z_C = σ(Ax - By + b_r) + iσ(Bx - Ay + b_i)

CNN:

Z_C = (Ax - By) + i(Bx + Ay)

RNN:

where σ denotes the sigmoid activation function, A and B denote the real and imaginary parts of the weights, x and y denote the real and imaginary parts of the input features, and the weights and inputs in the RNN formula are complex-valued. Using the above formulas, the 300-dimensional representation x_t of each word is input and a 128-dimensional network hidden-layer representation h_t is obtained.
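The complex-valued affine map shared by these formulas can be sketched as follows, using the CNN form Z_C = (Ax - By) + i(Bx + Ay); the 300-to-128 dimensions follow the text above, while the weight scale is an illustrative choice:

```python
import numpy as np

rng = np.random.default_rng(0)
A = 0.01 * rng.standard_normal((128, 300))  # real part of the weight matrix
B = 0.01 * rng.standard_normal((128, 300))  # imaginary part of the weight matrix

def complex_linear(x, y):
    # (A + iB)(x + iy) = (Ax - By) + i(Bx + Ay)
    return A @ x - B @ y, B @ x + A @ y

x = rng.standard_normal(300)                # real part of a word feature
y = rng.standard_normal(300)                # imaginary part of a word feature
h_real, h_imag = complex_linear(x, y)
print(h_real.shape, h_imag.shape)           # (128,) (128,)
```

For the FastText branch, a sigmoid is applied separately to the real and imaginary parts, as in the formula above.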
3.4: The network hidden-state representation h_t of each word in the text obtained in the previous step is input into a nonlinear fully-connected neural network layer to obtain the neural network representation vector x, and the representation x is input into a softmax classification layer to output the final class y:

y = softmax(W_s·x + b_s)
defining a complex-valued neural network model loss function as:
L = -Σ_i y_i·log(ŷ_i)

where y_i denotes the true class label and ŷ_i denotes the predicted outcome. The model is trained by the back-propagation algorithm with the mini-batch (mini-batch = 32) Adam gradient descent method.
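A minimal sketch of the loss just defined, assuming one-hot true labels and softmax outputs:

```python
import numpy as np

def cross_entropy(y_true, y_pred, eps=1e-12):
    # L = -sum_i y_i * log(y_hat_i), averaged over the mini-batch
    return -np.mean(np.sum(y_true * np.log(y_pred + eps), axis=1))

y_true = np.array([[1.0, 0.0], [0.0, 1.0]])   # one-hot labels
y_pred = np.array([[0.9, 0.1], [0.2, 0.8]])   # softmax outputs
print(cross_entropy(y_true, y_pred))          # ~0.164
```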
Train the model on the training set, verify the model effect on the verification set every 100 batches, and record and store the model parameters when the effect on the verification set reaches the optimum.
The samples of the test set are tested with the optimal model stored in the previous step; the prediction result of each test sample is obtained, the test labels are compared, and the classification accuracy is calculated. The FastText model, the convolutional neural network model and the LSTM model are compared with the complex-valued model and the running times are tabulated; the improvement in model effect can be observed clearly, as shown in FIG. 3. To further demonstrate the effectiveness of the method, we randomly choose a sentence from the dataset and calculate similarity scores between its different 3-grams. As FIG. 4 shows, when word embeddings are used directly, the score between "good excepted" and "exact good" is very large (greenish in color); with the Transformer position-encoding construction, the scores obtained are substantially fixed, i.e. the position vectors carry very heavy weight, which severely affects the final result (albeit with some regularity); finally, FIG. 4(c), using our complex-valued word vectors, yields small scores (yellowish in color), because our word vectors smooth the word vectors and position vectors together, thereby reducing the parameters trained in the network.
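The similarity comparison behind FIG. 4 can be sketched as follows, assuming the similarity of two complex word vectors is scored by the normalized real part of their Hermitian inner product; the patent does not spell out the exact scoring function:

```python
import numpy as np

def complex_similarity(u, v):
    # real part of <u, v> = sum(conj(u) * v), normalized by the vector norms
    num = np.real(np.vdot(u, v))
    den = np.linalg.norm(u) * np.linalg.norm(v) + 1e-12
    return num / den

rng = np.random.default_rng(0)
u = rng.standard_normal(300) + 1j * rng.standard_normal(300)
print(complex_similarity(u, u))   # 1.0 for identical vectors
```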
The technical means disclosed by the scheme of the invention are not limited to those disclosed in the embodiment, and also include technical schemes formed by any combination of the technical features. It should be noted that modifications and adaptations may occur to those skilled in the art without departing from the principles of the present invention, and these are also intended to be within the scope of the present invention.
Claims (1)
1. The complex-valued word vector construction method based on the position and the semantics is characterized by comprising the following steps:
(1) Adopting the jieba word segmentation tool to segment chapters and sentences, thereby constructing a multi-modal classification corpus;
(2) Randomly selecting 80% of the N samples from the multi-modal classification corpus constructed in step (1) as a training set, dividing 10% into a verification set and the remaining 10% into a test set, and preprocessing the training set, verification set and test set respectively;
(3) Selecting the preprocessed sentences in the multi-modal classification corpus to construct a complex-valued neural network model, and further constructing the complex-valued neural network model loss function:

L = -Σ_i y_i·log(ŷ_i)

where: y_i denotes the true class label and ŷ_i denotes the prediction result; wherein:
3.1, constructing an absolute position index for each word according to its position in chapters and sentences, namely:

p_i = e^(iβ·POS)

where: p_i denotes the position vector of the current word, POS is its absolute position, β is its initialized period, and i is the imaginary unit;

3.2, obtaining the 300-dimensional word vector w_i of each word in each chapter or sentence with the GloVe tool, establishing at the same time a position-information matrix in which each position index corresponds to a 300-dimensional position vector p_i, and further constructing the relative position index of each word, i.e. the representation of each word in chapters and sentences:

x_i = w_i·e^(iβ·POS)
the relative position index of each word in step 3.2 is a vector representation of the chapter or sentence obtained by the following formula:
relative position between word dimensions:
where: x represents the vector representation of the word, w_{j,k} represents the semantic vector of a word, and n represents the word interval;
relative position between words:
where: p and k represent the k-th dimension of words at different positions;
3.3, the word vector of each word of the text, together with its relative position information vector, is input in sentence order into the FastText, LSTM and CNN networks; the specific calculation formulas are as follows:

FastText:

Z_C = σ(Ax - By + b_r) + iσ(Bx - Ay + b_i)

CNN:

Z_C = (Ax - By) + i(Bx + Ay)

where σ denotes the sigmoid activation function, A and B denote the real and imaginary parts of the weights, and x and y denote the real and imaginary parts of the input features;

RNN:

where x_t denotes the input of each word and h_t denotes the resulting network hidden-layer state; the weights and inputs in the RNN formula are complex-valued; using the above formulas, the 300-dimensional representation x_t of each word is input and a 128-dimensional network hidden-layer representation h_t is obtained;
3.4, the final network hidden-layer output h from the above steps is input into a nonlinear fully-connected neural network layer to obtain the neural network representation vector x, and the representation x is input into a softmax classification layer to output the final class y:

y = softmax(W_s·x + b_s);
(4) Training the complex-valued neural network model on a training set to obtain a semantic classification model;
(5) Performing effect verification of the trained neural network model on the verification set, and recording and storing the model parameters when the effect reaches the optimum;
(6) Testing the samples of the test set with the optimal model parameters stored in the previous step, obtaining the prediction result of each test sample, comparing with the test labels, and calculating the classification accuracy.
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910898057.0A CN110851593B (en) | 2019-09-23 | 2019-09-23 | Complex value word vector construction method based on position and semantics |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN201910898057.0A CN110851593B (en) | 2019-09-23 | 2019-09-23 | Complex value word vector construction method based on position and semantics |
Publications (2)
Publication Number | Publication Date |
---|---|
CN110851593A CN110851593A (en) | 2020-02-28 |
CN110851593B true CN110851593B (en) | 2024-01-05 |
Family
ID=69596047
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN201910898057.0A Active CN110851593B (en) | 2019-09-23 | 2019-09-23 | Complex value word vector construction method based on position and semantics |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN110851593B (en) |
Families Citing this family (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN111897961A (en) * | 2020-07-22 | 2020-11-06 | 深圳大学 | A text classification method with a wide neural network model and related components |
CN114970497B (en) * | 2022-06-02 | 2023-05-16 | 中南大学 | Text classification method and word sense disambiguation method based on pre-trained feature embedding |
CN115934752B (en) * | 2022-12-09 | 2023-07-14 | 北京中科闻歌科技股份有限公司 | Method for constructing retrieval model, electronic equipment and storage medium |
CN116112320B (en) * | 2023-04-12 | 2023-08-01 | 广东致盛技术有限公司 | Method and device for constructing edge computing intelligent gateway based on object model |
Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106776581A (en) * | 2017-02-21 | 2017-05-31 | 浙江工商大学 | Subjective texts sentiment analysis method based on deep learning |
CN108363714A (en) * | 2017-12-21 | 2018-08-03 | 北京至信普林科技有限公司 | A kind of method and system for the ensemble machine learning for facilitating data analyst to use |
CN108363769A (en) * | 2018-02-07 | 2018-08-03 | 大连大学 | The method for building up of semantic-based music retrieval data set |
CN109522548A (en) * | 2018-10-26 | 2019-03-26 | 天津大学 | A kind of text emotion analysis method based on two-way interactive neural network |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
US7403890B2 (en) * | 2002-05-13 | 2008-07-22 | Roushar Joseph C | Multi-dimensional method and apparatus for automated language interpretation |
- 2019-09-23: CN application CN201910898057.0A granted as patent CN110851593B (status: Active)
Patent Citations (4)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106776581A (en) * | 2017-02-21 | 2017-05-31 | 浙江工商大学 | Subjective texts sentiment analysis method based on deep learning |
CN108363714A (en) * | 2017-12-21 | 2018-08-03 | 北京至信普林科技有限公司 | A kind of method and system for the ensemble machine learning for facilitating data analyst to use |
CN108363769A (en) * | 2018-02-07 | 2018-08-03 | 大连大学 | The method for building up of semantic-based music retrieval data set |
CN109522548A (en) * | 2018-10-26 | 2019-03-26 | 天津大学 | A kind of text emotion analysis method based on two-way interactive neural network |
Non-Patent Citations (2)
Title |
---|
胡朝举; 赵晓伟. Sentiment analysis based on word vector technology and hybrid neural networks. Application Research of Computers, 2017, No. 12. *
郭文姣; 欧阳昭连; 李阳; 郭柯磊; 杜然然; 池慧. Applying co-word analysis to reveal research topics in the field of biomedical engineering. Chinese Journal of Biomedical Engineering, 2012, No. 4. *
Legal Events
Date | Code | Title | Description
---|---|---|---
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |