CN117251569A - Classification methods, devices, terminals and storage media - Google Patents

Classification methods, devices, terminals and storage media

Info

Publication number
CN117251569A
Authority
CN
China
Prior art keywords
text
emotion
similarity
voice
score
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202311117045.2A
Other languages
Chinese (zh)
Inventor
王佳茜
曹斌
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Neusoft Reach Automotive Technology Shenyang Co Ltd
Original Assignee
Neusoft Reach Automotive Technology Shenyang Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Neusoft Reach Automotive Technology Shenyang Co Ltd filed Critical Neusoft Reach Automotive Technology Shenyang Co Ltd
Priority to CN202311117045.2A priority Critical patent/CN117251569A/en
Publication of CN117251569A publication Critical patent/CN117251569A/en
Pending legal-status Critical Current

Classifications

    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/35Clustering; Classification
    • G06F16/353Clustering; Classification into predefined classes
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3344Query execution using natural language analysis
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33Querying
    • G06F16/3331Query processing
    • G06F16/334Query execution
    • G06F16/3346Query execution using probabilistic model
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/10Text processing
    • G06F40/12Use of codes for handling textual entities
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/205Parsing
    • G06F40/216Parsing using statistical methods
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/237Lexical tools
    • G06F40/242Dictionaries
    • GPHYSICS
    • G06COMPUTING OR CALCULATING; COUNTING
    • G06FELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00Handling natural language data
    • G06F40/20Natural language analysis
    • G06F40/279Recognition of textual entities
    • G06F40/289Phrasal analysis, e.g. finite state techniques or chunking

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • General Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • Health & Medical Sciences (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • General Health & Medical Sciences (AREA)
  • Databases & Information Systems (AREA)
  • Data Mining & Analysis (AREA)
  • Probability & Statistics with Applications (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

This application discloses a classification method, device, terminal and storage medium. The method includes: obtaining the voice texts of different car owners; calculating the emotion scores, text similarities and weights of those voice texts; and, based on these emotion scores, text similarities and weights, calculating the emotional similarity between different car owners, so as to determine the types of the different car owners from that similarity. By combining similarity calculation with emotion, the invention computes the emotional similarity of different car owners and thereby improves their classification accuracy.

Description

Classification method, classification device, classification terminal and storage medium
Technical Field
The present application relates to the field of computer technologies, and in particular, to a classification method, apparatus, terminal, and storage medium.
Background
A terminal installed in a vehicle has a voice acquisition function. By collecting the voices of different vehicle owners through the terminal and recognizing them, characteristics such as each owner's appeals and personality can be obtained, so that different vehicle owners can be classified.
At present, different vehicle owners are classified mainly by obtaining their voices through a terminal, performing similarity calculation directly on those voices, and then classifying the owners according to the calculated similarity.
However, classifying different vehicle owners in this way can yield high similarity scores while still producing inaccurate classifications.
Disclosure of Invention
The main purpose of the present application is to provide a classification method, a device, a terminal and a storage medium, so as to solve the problem in the related art that, when different vehicle owners are classified, the computed similarity is high but the resulting classification is inaccurate.
To achieve the above object, in a first aspect, the present application provides a classification method, including:
acquiring voice texts of different vehicle owners;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners;
and calculating the emotion similarity among different car owners based on emotion scores, text similarity and weights of voice texts of the different car owners so as to determine the types of the different car owners based on the emotion similarity among the different car owners.
In one possible implementation, the different owners include at least a first owner and a second owner, and the voice text of the different owners includes at least a first voice text of the first owner and a second voice text of the second owner;
the calculating of the emotion scores, text similarity and weights of the voice texts of different vehicle owners includes:
preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text;
respectively calculating a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector by adopting a fine-granularity emotion dictionary;
calculating the similarity of the first voice text and the second voice text to obtain text similarity;
and calculating a first weight corresponding to the first owner and a second weight corresponding to the second owner based on the first emotion score, the second emotion score and the text similarity.
In one possible implementation manner, preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text, including:
word segmentation is carried out on the first voice text and the second voice text respectively, so that the segmented first voice text and segmented second voice text are obtained;
and respectively carrying out vectorization processing on the first voice text after word segmentation and the second voice text after word segmentation by adopting a TF-IDF algorithm to obtain a first word vector and a second word vector.
In one possible implementation manner, calculating a first emotion score corresponding to a first word vector and a second emotion score corresponding to a second word vector by using a fine-granularity emotion dictionary includes:
comparing each word vector in the first word vector and the second word vector with a preset word vector in a fine granularity emotion dictionary, and determining an emotion score of each word vector in the first word vector and an emotion score of each word vector in the second word vector;
the first emotion score is calculated based on the emotion score of each of the first word vectors, and the second emotion score is calculated based on the emotion score of each of the second word vectors.
In one possible implementation, calculating a first weight corresponding to the first owner and a second weight corresponding to the second owner based on the first emotion score, the second emotion score, and the text similarity includes:
normalizing the first emotion score, the second emotion score and the text similarity to obtain a normalized first emotion score, a normalized second emotion score and a normalized text similarity;
calculating a first information entropy based on the normalized first emotion score and the normalized text similarity, and calculating a second information entropy based on the normalized second emotion score and the normalized text similarity;
the first weight is calculated based on the first information entropy and the text encoding of the first speech text, and the second weight is calculated based on the second information entropy and the text encoding of the second speech text.
In one possible implementation, the emotional similarity between the different owners includes at least emotional similarity of the first owner and the second owner;
based on emotion scores, text similarity and weights of voice texts of different vehicle owners, the emotion similarity among the different vehicle owners is calculated, and the method comprises the following steps:
and calculating emotion similarity of the first vehicle owner and the second vehicle owner based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
In one possible implementation, the calculation formula of the emotion similarity of the first vehicle owner and the second vehicle owner is:
score = w_i * EscoreA * EscoreB + w_j * similarity(A, B)
where score represents the emotional similarity of the first and second vehicle owners, w_i represents the first weight, w_j the second weight, EscoreA the first emotion score, EscoreB the second emotion score, and similarity(A, B) the text similarity.
In a second aspect, an embodiment of the present invention provides a classification apparatus, including:
the acquisition module is used for acquiring voice texts of different vehicle owners;
the computing module is used for computing emotion scores, text similarity and weights of voice texts of different vehicle owners;
the classification module is used for calculating emotion similarity among different car owners based on emotion scores, text similarity and weights of voice texts of the different car owners so as to determine types of the different car owners based on the emotion similarity among the different car owners.
In a third aspect, an embodiment of the present invention provides a terminal, including a memory, a processor, and a computer program stored in the memory and executable on the processor, the processor implementing the steps of any of the classification methods described above when executing the computer program.
In a fourth aspect, embodiments of the present invention provide a computer readable storage medium storing a computer program which, when executed by a processor, performs the steps of any of the classification methods described above.
The embodiment of the invention provides a classification method, a classification device, a classification terminal and a storage medium, wherein the classification method comprises the following steps: firstly, voice texts of different vehicle owners are obtained, emotion scores, text similarity and weights of the voice texts of the different vehicle owners are calculated, and then emotion similarity among the different vehicle owners is calculated based on the emotion scores, the text similarity and the weights of the voice texts of the different vehicle owners, so that types of the different vehicle owners are determined based on the emotion similarity among the different vehicle owners. According to the method, the similarity calculation is combined with emotion, so that the emotion similarity of different car owners is calculated, and the classification accuracy of the different car owners is improved.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this application, are included to provide a further understanding of the application with regard to its other features, objects and advantages. The drawings of the illustrative embodiments of the present application and their descriptions serve to illustrate the present application and are not to be construed as unduly limiting it. In the drawings:
FIG. 1 is a flow chart of an implementation of a classification method according to an embodiment of the present invention;
fig. 2 is a schematic structural diagram of a classification device according to an embodiment of the present invention;
fig. 3 is a schematic diagram of a terminal according to an embodiment of the present invention.
Detailed Description
For the purpose of making the objects, technical solutions and advantages of the embodiments of the present invention more apparent, the technical solutions of the embodiments of the present invention will be clearly and completely described below with reference to the accompanying drawings in the embodiments of the present invention, and it is apparent that the described embodiments are only some embodiments of the present invention, not all embodiments. All other embodiments, which can be made by those skilled in the art based on the embodiments of the invention without making any inventive effort, are intended to be within the scope of the invention.
The terms "first," "second," "third," "fourth" and the like in the description and in the claims and in the above drawings, if any, are used for distinguishing between similar objects and not necessarily for describing a particular sequential or chronological order. It is to be understood that the data so used may be interchanged where appropriate such that the embodiments of the invention described herein may be implemented in sequences other than those illustrated or otherwise described herein.
It should be understood that, in the various embodiments of the present invention, the sequence number of a process does not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not constitute any limitation on the implementation of the embodiments of the present invention.
It should be understood that in the present invention, "comprising" and "having" and any variations thereof are intended to cover non-exclusive inclusion, such that a process, method, system, article, or apparatus that comprises a list of steps or elements is not necessarily limited to those steps or elements that are expressly listed or inherent to such process, method, article, or apparatus.
It should be understood that in the present invention, "plurality" means two or more. "And/or" merely describes an association between objects and indicates that three relationships may exist; for example, "A and/or B" may mean: A exists alone, A and B exist together, or B exists alone. The character "/" generally indicates an "or" relationship between the surrounding objects. "Comprising A, B and C" and "comprising A, B, C" mean that all three of A, B and C are comprised; "comprising A, B or C" means that one of A, B and C is comprised; and "comprising A, B and/or C" means that any one, any two, or all three of A, B and C are comprised.
It should be understood that in the present invention, "B corresponding to A" or "A corresponding to B" means that B is associated with A, and that B can be determined from A. Determining B from A does not mean determining B from A alone; B may also be determined from A and/or other information. A and B match when the similarity of A and B is greater than or equal to a preset threshold.
As used herein, "if" may be interpreted as "when" or "upon" or "in response to determining" or "in response to detecting", depending on the context.
The technical scheme of the invention is described in detail below by specific examples. The following embodiments may be combined with each other, and some embodiments may not be repeated for the same or similar concepts or processes.
For the purpose of making the objects, technical solutions and advantages of the present invention more apparent, the following description will be made by way of specific embodiments with reference to the accompanying drawings.
In one embodiment, as shown in FIG. 1, a classification method is provided, comprising the steps of:
step S101: and acquiring voice texts of different vehicle owners.
The vehicle driven by the vehicle owner is provided with equipment with a voice acquisition function, and voice texts of different vehicle owners can be directly acquired by acquiring voices of different vehicle owners. The device includes, but is not limited to, a microphone, a mobile phone, a collector, etc. arranged on the vehicle.
Step S102: and calculating emotion scores, text similarity and weights of voice texts of different vehicle owners.
The number of different vehicle owners is not limited, and can be set according to specific situations, such as 2, 3, 10 and the like.
When the different vehicle owners include at least a first owner and a second owner, their voice texts include at least a first voice text of the first owner and a second voice text of the second owner. Calculating the emotion scores, text similarity and weights then proceeds as follows: first, the first and second voice texts are each preprocessed to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text; next, a fine-grained emotion dictionary is used to calculate a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector; the similarity of the two voice texts is calculated to obtain the text similarity; and finally, a first weight corresponding to the first owner and a second weight corresponding to the second owner are calculated based on the first emotion score, the second emotion score and the text similarity.
To preprocess the first and second voice texts into their word vectors, word segmentation is first performed on each text, yielding a segmented first voice text and a segmented second voice text; the segmented texts are then vectorized with the TF-IDF (term frequency-inverse document frequency) algorithm to obtain the first word vector and the second word vector.
For example, in the case of two owners, suppose the first owner's first voice text is "this navigation is too good" and the second owner's second voice text is "this navigation is too bad". Word segmentation splits each text into its constituent words, giving, roughly, "this / navigation / is / too / good" for the first voice text and "this / navigation / is / too / bad" for the second. The segmented first and second voice texts are then vectorized to obtain the first word vector and the second word vector; the manner of vectorization is not particularly limited.
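The segmentation-then-TF-IDF step can be sketched as follows. This is a minimal illustration, not the patent's implementation: the patent does not fix a particular TF-IDF variant, so a smoothed IDF is assumed here, and the English token lists stand in for the segmented Chinese texts.

```python
import math
from collections import Counter

def tfidf_vectors(docs):
    """Compute TF-IDF vectors for pre-segmented documents.

    `docs` is a list of token lists; returns one {term: weight} dict per
    document. Uses term frequency times a smoothed inverse document
    frequency, log((1+N)/(1+df)) + 1 (an assumed variant).
    """
    n = len(docs)
    df = Counter()                      # document frequency per term
    for doc in docs:
        df.update(set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vec = {
            term: (count / len(doc)) * (math.log((1 + n) / (1 + df[term])) + 1)
            for term, count in tf.items()
        }
        vectors.append(vec)
    return vectors

# The two segmented example texts from above (English glosses):
v1, v2 = tfidf_vectors([["this", "navigation", "too", "good"],
                        ["this", "navigation", "too", "bad"]])
```

Words shared by both texts ("navigation") receive lower weights than words unique to one text ("good", "bad"), which is what lets the similarity step below discriminate between the two owners.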
After the first and second word vectors are obtained, a fine-grained emotion dictionary is used to calculate the first emotion score corresponding to the first word vector and the second emotion score corresponding to the second word vector. Specifically, each word vector in the first word vector and in the second word vector is compared with the preset word vectors in the fine-grained emotion dictionary to determine its emotion score; the first emotion score is then calculated from the emotion scores of the word vectors in the first word vector, and the second emotion score from those in the second word vector.
The fine-grained emotion dictionary holds preset word vectors, each associated with the score of an emotion category. For example, all emotions may be divided into six categories, namely positive, negative, neutral, favorite, aversion and anger, with respective scores of 1, -1, 0, 2, -2 and -3.
By comparing and matching each word vector in the first and second word vectors against the scores corresponding to these emotion categories, the emotion score of each word vector in the first word vector and of each word vector in the second word vector can be obtained.
The emotion scores of the word vectors in the first word vector are then summed to obtain the first emotion score, and the emotion scores of the word vectors in the second word vector are summed to obtain the second emotion score.
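The dictionary lookup and summation described above can be sketched like this. The dictionary entries are hypothetical (the patent does not disclose its actual fine-grained emotion dictionary); only the six category scores from the example above are reused.

```python
# Hypothetical fine-grained emotion dictionary mapping words to the
# score of their emotion category (positive=1, negative=-1, neutral=0,
# favorite=2, aversion=-2, anger=-3), mirroring the example categories.
EMOTION_DICT = {"good": 1, "bad": -1, "love": 2, "hate": -2, "furious": -3}

def emotion_score(tokens):
    """Look up each word's emotion score and sum them; words absent
    from the dictionary are treated as neutral (score 0)."""
    return sum(EMOTION_DICT.get(tok, 0) for tok in tokens)
```

With the two example texts, the first owner's text scores +1 (from "good") and the second owner's scores -1 (from "bad").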
Meanwhile, the text similarity between the first voice text and the second voice text needs to be calculated, and the calculation formula is as follows:
where similarity(A, B) represents the text similarity between the first voice text and the second voice text, A represents the first voice text, and B represents the second voice text.
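The patent's own similarity formula is given only as an image; as an illustrative stand-in, cosine similarity over the TF-IDF word vectors is one conventional choice:

```python
import math

def cosine_similarity(vec_a, vec_b):
    """Cosine similarity between two sparse {term: weight} vectors.

    An assumed instantiation of similarity(A, B) -- the patent text
    does not reproduce its exact formula.
    """
    dot = sum(w * vec_b.get(t, 0.0) for t, w in vec_a.items())
    norm_a = math.sqrt(sum(w * w for w in vec_a.values()))
    norm_b = math.sqrt(sum(w * w for w in vec_b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0
```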
After the first emotion score, the second emotion score and the text similarity are obtained, the first weight corresponding to the first owner and the second weight corresponding to the second owner are calculated from them. Specifically, the first emotion score, the second emotion score and the text similarity are normalized; a first information entropy is calculated from the normalized first emotion score and normalized text similarity, and a second information entropy from the normalized second emotion score and normalized text similarity; the first weight is then calculated from the first information entropy and the text code of the first voice text, and the second weight from the second information entropy and the text code of the second voice text.
The normalization process includes, but is not limited to, a process mode such as a range method.
After normalizing the first emotion score EscoreA, the second emotion score EscoreB and the text similarity similarity(A, B), the entropy weight method can be used to compute the first information entropy E_i from the normalized first emotion score and the normalized text similarity, and the second information entropy E_j from the normalized second emotion score and the normalized text similarity.
Based on the first information entropy E_i and the text code i of the first voice text, the first weight w_i is calculated; the calculation formula is as follows:
where i = 1, 2, ..., n.
Similarly, based on the second information entropy E_j and the text code j of the second voice text, the second weight w_j is calculated; the calculation formula is as follows:
where j = 1, 2, ..., n.
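The entropy weight method referenced above can be sketched generically. This is an assumption-laden sketch: the patent's exact weight formulas appear only as images, so the standard entropy-weight construction (entropy per indicator, then weights proportional to 1 - entropy) is used here.

```python
import math

def entropy_weights(matrix):
    """Entropy-weight method over an n-samples x m-indicators matrix of
    non-negative normalized values. Returns (entropies, weights), one
    entry per indicator: low-entropy (more discriminative) indicators
    receive higher weight.
    """
    n, m = len(matrix), len(matrix[0])
    k = 1.0 / math.log(n) if n > 1 else 1.0   # normalizing constant
    entropies = []
    for j in range(m):
        col = [row[j] for row in matrix]
        total = sum(col)
        # proportion of each sample within indicator j
        probs = [v / total for v in col] if total else [0.0] * n
        entropies.append(-k * sum(p * math.log(p) for p in probs if p > 0))
    redundancy = [1.0 - e for e in entropies]
    total_red = sum(redundancy)
    weights = [r / total_red if total_red else 1.0 / m for r in redundancy]
    return entropies, weights
```

An indicator whose values are identical across samples carries maximal entropy and contributes nothing to the weights, while an indicator that varies sharply dominates them.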
Step S103: and calculating the emotion similarity among different car owners based on emotion scores, text similarity and weights of voice texts of the different car owners so as to determine the types of the different car owners based on the emotion similarity among the different car owners.
When the emotional similarity between different vehicle owners includes at least the emotional similarity of the first and second owners, calculating it from the emotion scores, text similarity and weights of the voice texts amounts to calculating the emotional similarity of the first and second owners based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
Specifically, the calculation formula of the emotion similarity of the first vehicle owner and the second vehicle owner is as follows:
score = w_i * EscoreA * EscoreB + w_j * similarity(A, B)
where score represents the emotional similarity of the first and second vehicle owners, w_i represents the first weight, w_j the second weight, EscoreA the first emotion score, EscoreB the second emotion score, and similarity(A, B) the text similarity.
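The scoring formula above is a direct weighted combination and can be transcribed verbatim:

```python
def emotion_similarity(w_i, w_j, escore_a, escore_b, text_sim):
    """Transcription of: score = w_i * EscoreA * EscoreB
                                 + w_j * similarity(A, B).
    The product EscoreA * EscoreB is positive when both owners'
    emotions point the same way, and negative when they oppose."""
    return w_i * escore_a * escore_b + w_j * text_sim
```

Note that with the example texts above, a high text similarity can be offset by opposed emotion scores (EscoreA = 1, EscoreB = -1), which is precisely the high-similarity-but-wrong-class case the method is meant to correct.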
The embodiment of the invention provides a classification method, which comprises the following steps: firstly, voice texts of different vehicle owners are obtained, emotion scores, text similarity and weights of the voice texts of the different vehicle owners are calculated, and then emotion similarity among the different vehicle owners is calculated based on the emotion scores, the text similarity and the weights of the voice texts of the different vehicle owners, so that types of the different vehicle owners are determined based on the emotion similarity among the different vehicle owners. According to the method, the similarity calculation is combined with emotion, so that the emotion similarity of different car owners is calculated, and the classification accuracy of the different car owners is improved.
It should be understood that the sequence numbers of the steps in the foregoing embodiment do not imply an order of execution; the execution order of the processes should be determined by their functions and internal logic, and the sequence numbers should not limit the implementation of the embodiments of the present invention.
The following are device embodiments of the invention, for details not described in detail therein, reference may be made to the corresponding method embodiments described above.
Fig. 2 shows a schematic structural diagram of a classification device according to an embodiment of the present invention, and for convenience of explanation, only a portion related to the embodiment of the present invention is shown, and the classification device includes an obtaining module 201, a calculating module 202, and a classification module 203, which are specifically as follows:
an acquisition module 201, configured to acquire voice texts of different vehicle owners;
the calculating module 202 is used for calculating emotion scores, text similarity and weights of voice texts of different vehicle owners;
the classification module 203 is configured to calculate emotion similarity between different vehicle owners based on emotion scores, text similarity and weights of voice texts of the different vehicle owners, so as to determine types of the different vehicle owners based on the emotion similarity between the different vehicle owners.
In one possible implementation, the different owners include at least a first owner and a second owner, and the voice text of the different owners includes at least a first voice text of the first owner and a second voice text of the second owner;
the computing module 202 is further configured to pre-process the first voice text and the second voice text, respectively, to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text;
respectively calculating a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector by adopting a fine-granularity emotion dictionary;
calculating the similarity of the first voice text and the second voice text to obtain text similarity;
and calculating a first weight corresponding to the first owner and a second weight corresponding to the second owner based on the first emotion score, the second emotion score and the text similarity.
In one possible implementation manner, the computing module 202 is further configured to perform word segmentation processing on the first voice text and the second voice text, so as to obtain a segmented first voice text and a segmented second voice text;
and respectively carrying out vectorization processing on the first voice text after word segmentation and the second voice text after word segmentation by adopting a TF-IDF algorithm to obtain a first word vector and a second word vector.
In a possible implementation manner, the computing module 202 is further configured to compare each of the first word vector and the second word vector with a word vector preset in the fine granularity emotion dictionary, and determine an emotion score of each of the first word vector and an emotion score of each of the second word vector;
the first emotion score is calculated based on the emotion score of each of the first word vectors, and the second emotion score is calculated based on the emotion score of each of the second word vectors.
In one possible implementation manner, the computing module 202 is further configured to normalize the first emotion score, the second emotion score, and the text similarity to obtain a normalized first emotion score, a normalized second emotion score, and a normalized text similarity;
calculating a first information entropy based on the normalized first emotion score and the normalized text similarity, and calculating a second information entropy based on the normalized second emotion score and the normalized text similarity;
the first weight is calculated based on the first information entropy and the text encoding of the first speech text, and the second weight is calculated based on the second information entropy and the text encoding of the second speech text.
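The normalize, compute-entropy, derive-weight pipeline above reads like the classical entropy weight method, and the sketch below follows that method. How the "text encoding" of each voice text enters the final weight is not specified, so it is omitted here — every formula below is an illustrative assumption, not the patent's exact computation:

```python
import math

def minmax_normalize(values):
    """Scale a list of indicator values into [0, 1]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        return [0.5] * len(values)  # degenerate case: all values equal
    return [(v - lo) / (hi - lo) for v in values]

def normalized_entropy(values, eps=1e-12):
    """Shannon entropy of non-negative values, divided by ln(n)
    so the result lies in [0, 1] (entropy-weight-method convention)."""
    total = sum(values) + eps * len(values)
    probs = [(v + eps) / total for v in values]
    h = -sum(p * math.log(p) for p in probs)
    return h / math.log(len(values))

def entropy_weights(entropies):
    """Lower entropy means more discriminating information,
    hence a larger weight; the weights sum to 1."""
    diffs = [1.0 - e for e in entropies]
    s = sum(diffs)
    return [d / s for d in diffs] if s else [1.0 / len(diffs)] * len(diffs)

norm = minmax_normalize([0.2, -0.35, 0.7])  # e.g. scores plus text similarity
w1, w2 = entropy_weights([0.4, 0.9])        # hypothetical per-owner entropies
```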
In one possible implementation, the emotion similarity between the different vehicle owners includes at least the emotion similarity of the first vehicle owner and the second vehicle owner;
the classification module 203 is further configured to calculate the emotion similarity of the first vehicle owner and the second vehicle owner based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
In one possible implementation, the calculation formula of the emotion similarity of the first vehicle owner and the second vehicle owner is:
score = wi * EscoreA * EscoreB + wj * similarity(A, B)
wherein score represents the emotion similarity of the first vehicle owner and the second vehicle owner, wi represents the first weight, wj represents the second weight, EscoreA represents the first emotion score, EscoreB represents the second emotion score, and similarity(A, B) represents the text similarity.
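The formula above translates directly into code. The numeric values below are made up for illustration; in the pipeline, wi and wj come from the entropy-based weighting step and EscoreA/EscoreB from the emotion dictionary:

```python
def emotion_similarity(wi, wj, escore_a, escore_b, text_similarity):
    """score = wi * EscoreA * EscoreB + wj * similarity(A, B),
    as given in the embodiment."""
    return wi * escore_a * escore_b + wj * text_similarity

score = emotion_similarity(0.6, 0.4, 0.8, 0.7, 0.5)
# 0.6 * 0.8 * 0.7 + 0.4 * 0.5 = 0.536
```

Note that the product EscoreA * EscoreB is large and positive when both owners' texts carry strong emotion of the same sign, which is what lets the score separate emotionally similar owners.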
The embodiment of the invention provides a classification device that can be used to acquire the voice texts of different vehicle owners; to calculate emotion scores, text similarity and weights for those voice texts; and to calculate the emotion similarity among the different vehicle owners from the emotion scores, text similarity and weights, so as to determine the types of the different vehicle owners based on that emotion similarity. By combining similarity calculation with emotion, the method computes the emotion similarity of different vehicle owners and improves the accuracy with which they are classified.
Fig. 3 is a schematic diagram of a terminal according to an embodiment of the present invention. As shown in fig. 3, the terminal 3 of this embodiment includes: a processor 301, a memory 302 and a computer program 303 stored in the memory 302 and executable on the processor 301. When the processor 301 executes the computer program 303, the steps of the classification method embodiments described above are implemented, such as steps 101-103 shown in fig. 1. Alternatively, when executing the computer program 303, the processor 301 performs the functions of the modules/units of the classification apparatus embodiments described above, such as the functions of modules 201-203 shown in fig. 2.
The present invention also provides a readable storage medium having a computer program stored therein which, when executed by a processor, implements the classification method provided in the above embodiments, including:
acquiring voice texts of different vehicle owners;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners;
and calculating the emotion similarity among the different vehicle owners based on the emotion scores, text similarity and weights of the voice texts of the different vehicle owners, so as to determine the types of the different vehicle owners based on that emotion similarity.
In one possible implementation, the different vehicle owners include at least a first vehicle owner and a second vehicle owner, and the voice texts of the different vehicle owners include at least a first voice text of the first vehicle owner and a second voice text of the second vehicle owner;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners, wherein the emotion scores, the text similarity and the weights comprise:
preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text;
respectively calculating a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector by adopting a fine-granularity emotion dictionary;
calculating the similarity of the first voice text and the second voice text to obtain text similarity;
and calculating a first weight corresponding to the first vehicle owner and a second weight corresponding to the second vehicle owner based on the first emotion score, the second emotion score and the text similarity.
In one possible implementation manner, preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text, including:
word segmentation is carried out on the first voice text and the second voice text respectively, so that the segmented first voice text and segmented second voice text are obtained;
and respectively carrying out vectorization processing on the first voice text after word segmentation and the second voice text after word segmentation by adopting a TF-IDF algorithm to obtain a first word vector and a second word vector.
In one possible implementation manner, calculating a first emotion score corresponding to a first word vector and a second emotion score corresponding to a second word vector by using a fine-granularity emotion dictionary includes:
comparing each word vector in the first word vector and the second word vector with a preset word vector in a fine granularity emotion dictionary, and determining an emotion score of each word vector in the first word vector and an emotion score of each word vector in the second word vector;
the first emotion score is calculated based on the emotion score of each of the first word vectors, and the second emotion score is calculated based on the emotion score of each of the second word vectors.
In one possible implementation, calculating a first weight corresponding to the first vehicle owner and a second weight corresponding to the second vehicle owner based on the first emotion score, the second emotion score, and the text similarity includes:
normalizing the first emotion score, the second emotion score and the text similarity to obtain a normalized first emotion score, a normalized second emotion score and a normalized text similarity;
calculating a first information entropy based on the normalized first emotion score and the normalized text similarity, and calculating a second information entropy based on the normalized second emotion score and the normalized text similarity;
the first weight is calculated based on the first information entropy and the text encoding of the first speech text, and the second weight is calculated based on the second information entropy and the text encoding of the second speech text.
In one possible implementation, the emotion similarity between the different vehicle owners includes at least the emotion similarity of the first vehicle owner and the second vehicle owner;
calculating the emotion similarity among the different vehicle owners based on the emotion scores, text similarity and weights of the voice texts of the different vehicle owners comprises:
and calculating emotion similarity of the first vehicle owner and the second vehicle owner based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
In one possible implementation, the calculation formula of the emotion similarity of the first vehicle owner and the second vehicle owner is:
score = wi * EscoreA * EscoreB + wj * similarity(A, B)
wherein score represents the emotion similarity of the first vehicle owner and the second vehicle owner, wi represents the first weight, wj represents the second weight, EscoreA represents the first emotion score, EscoreB represents the second emotion score, and similarity(A, B) represents the text similarity.
The readable storage medium may be a computer storage medium or a communication medium. Communication media includes any medium that facilitates transfer of a computer program from one place to another. Computer storage media can be any available media that can be accessed by a general purpose or special purpose computer. For example, a readable storage medium is coupled to the processor such that the processor can read information from, and write information to, the readable storage medium. In the alternative, the readable storage medium may be integral to the processor. The processor and the readable storage medium may reside in an application specific integrated circuit (Application Specific Integrated Circuits, ASIC). In addition, the ASIC may reside in a user device. The processor and the readable storage medium may reside as discrete components in a communication device. The readable storage medium may be read-only memory (ROM), random-access memory (RAM), CD-ROMs, magnetic tape, floppy disk, optical data storage device, etc.
The present invention also provides a program product comprising execution instructions stored in a readable storage medium. At least one processor of the apparatus may read the execution instructions from the readable storage medium, the execution instructions being executed by the at least one processor to cause the apparatus to implement a classification method provided by the various embodiments described above, comprising:
acquiring voice texts of different vehicle owners;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners;
and calculating the emotion similarity among the different vehicle owners based on the emotion scores, text similarity and weights of the voice texts of the different vehicle owners, so as to determine the types of the different vehicle owners based on that emotion similarity.
In one possible implementation, the different vehicle owners include at least a first vehicle owner and a second vehicle owner, and the voice texts of the different vehicle owners include at least a first voice text of the first vehicle owner and a second voice text of the second vehicle owner;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners, wherein the emotion scores, the text similarity and the weights comprise:
preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text;
respectively calculating a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector by adopting a fine-granularity emotion dictionary;
calculating the similarity of the first voice text and the second voice text to obtain text similarity;
and calculating a first weight corresponding to the first vehicle owner and a second weight corresponding to the second vehicle owner based on the first emotion score, the second emotion score and the text similarity.
In one possible implementation manner, preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text, including:
word segmentation is carried out on the first voice text and the second voice text respectively, so that the segmented first voice text and segmented second voice text are obtained;
and respectively carrying out vectorization processing on the first voice text after word segmentation and the second voice text after word segmentation by adopting a TF-IDF algorithm to obtain a first word vector and a second word vector.
In one possible implementation manner, calculating a first emotion score corresponding to a first word vector and a second emotion score corresponding to a second word vector by using a fine-granularity emotion dictionary includes:
comparing each word vector in the first word vector and the second word vector with a preset word vector in a fine granularity emotion dictionary, and determining an emotion score of each word vector in the first word vector and an emotion score of each word vector in the second word vector;
the first emotion score is calculated based on the emotion score of each of the first word vectors, and the second emotion score is calculated based on the emotion score of each of the second word vectors.
In one possible implementation, calculating a first weight corresponding to the first vehicle owner and a second weight corresponding to the second vehicle owner based on the first emotion score, the second emotion score, and the text similarity includes:
normalizing the first emotion score, the second emotion score and the text similarity to obtain a normalized first emotion score, a normalized second emotion score and a normalized text similarity;
calculating a first information entropy based on the normalized first emotion score and the normalized text similarity, and calculating a second information entropy based on the normalized second emotion score and the normalized text similarity;
the first weight is calculated based on the first information entropy and the text encoding of the first speech text, and the second weight is calculated based on the second information entropy and the text encoding of the second speech text.
In one possible implementation, the emotion similarity between the different vehicle owners includes at least the emotion similarity of the first vehicle owner and the second vehicle owner;
calculating the emotion similarity among the different vehicle owners based on the emotion scores, text similarity and weights of the voice texts of the different vehicle owners comprises:
and calculating emotion similarity of the first vehicle owner and the second vehicle owner based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
In one possible implementation, the calculation formula of the emotion similarity of the first vehicle owner and the second vehicle owner is:
score = wi * EscoreA * EscoreB + wj * similarity(A, B)
wherein score represents the emotion similarity of the first vehicle owner and the second vehicle owner, wi represents the first weight, wj represents the second weight, EscoreA represents the first emotion score, EscoreB represents the second emotion score, and similarity(A, B) represents the text similarity.
In the above described embodiments of the apparatus, it is understood that the processor may be a central processing unit (Central Processing Unit, CPU), but may also be other general purpose processors, digital signal processors (Digital Signal Processor, DSP), application specific integrated circuits (Application Specific Integrated Circuit, ASIC), etc. A general purpose processor may be a microprocessor or the processor may be any conventional processor or the like. The steps of a method disclosed in connection with the present invention may be embodied directly in a hardware processor for execution, or in a combination of hardware and software modules in a processor for execution.
The above embodiments are only for illustrating the technical solution of the present invention, and are not limiting; although the invention has been described in detail with reference to the foregoing embodiments, it will be understood by those of ordinary skill in the art that: the technical scheme described in the foregoing embodiments can be modified or some technical features thereof can be replaced by equivalents; such modifications and substitutions do not depart from the spirit and scope of the technical solutions of the embodiments of the present invention, and are intended to be included in the scope of the present invention.

Claims (10)

1. A method of classification, comprising:
acquiring voice texts of different vehicle owners;
calculating emotion scores, text similarity and weights of voice texts of different vehicle owners;
and calculating the emotion similarity among different vehicle owners based on the emotion scores, the text similarity and the weights of the voice texts of the different vehicle owners so as to determine the types of the different vehicle owners based on the emotion similarity among the different vehicle owners.
2. The classification method of claim 1, wherein the different vehicle owners comprise at least a first vehicle owner and a second vehicle owner, and wherein the different vehicle owners' voice text comprises at least a first voice text of the first vehicle owner and a second voice text of the second vehicle owner;
the calculating of emotion scores, text similarity and weights of voice texts of different vehicle owners comprises:
preprocessing the first voice text and the second voice text respectively to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text;
respectively calculating a first emotion score corresponding to the first word vector and a second emotion score corresponding to the second word vector by adopting a fine-granularity emotion dictionary;
calculating the similarity of the first voice text and the second voice text to obtain text similarity;
and calculating a first weight corresponding to the first owner and a second weight corresponding to the second owner based on the first emotion score, the second emotion score and the text similarity.
3. The classification method of claim 2, wherein the preprocessing of the first voice text and the second voice text to obtain a first word vector corresponding to the first voice text and a second word vector corresponding to the second voice text comprises:
word segmentation is carried out on the first voice text and the second voice text respectively, so that the segmented first voice text and segmented second voice text are obtained;
and respectively carrying out vectorization processing on the first voice text after word segmentation and the second voice text after word segmentation by adopting a TF-IDF algorithm to obtain a first word vector and a second word vector.
4. The classification method of claim 2, wherein the calculating the first emotion score corresponding to the first word vector and the second emotion score corresponding to the second word vector using a fine granularity emotion dictionary comprises:
comparing each word vector in the first word vector and the second word vector with a preset word vector in the fine granularity emotion dictionary respectively, and determining an emotion score of each word vector in the first word vector and an emotion score of each word vector in the second word vector;
the first emotion score is calculated based on the emotion score of each of the first word vectors, and the second emotion score is calculated based on the emotion score of each of the second word vectors.
5. The classification method of claim 2, wherein calculating a first weight corresponding to the first vehicle owner and a second weight corresponding to the second vehicle owner based on the first emotion score, the second emotion score, and the text similarity comprises:
normalizing the first emotion score, the second emotion score and the text similarity to obtain a normalized first emotion score, a normalized second emotion score and a normalized text similarity;
calculating a first information entropy based on the normalized first emotion score and the normalized text similarity, and calculating a second information entropy based on the normalized second emotion score and the normalized text similarity;
the first weight is calculated based on the first information entropy and the text encoding of the first speech text, and the second weight is calculated based on the second information entropy and the text encoding of the second speech text.
6. The classification method of claim 2, wherein the emotion similarity between the different vehicle owners comprises at least the emotion similarity of the first vehicle owner and the second vehicle owner;
the calculating the emotion similarity between different vehicle owners based on the emotion scores, the text similarity and the weights of the voice texts of the different vehicle owners comprises the following steps:
and calculating emotion similarity of the first vehicle owner and the second vehicle owner based on the first weight, the second weight, the first emotion score, the second emotion score and the text similarity.
7. The classification method of claim 6, wherein the emotion similarity of the first vehicle owner and the second vehicle owner is calculated by the formula:
score = wi * EscoreA * EscoreB + wj * similarity(A, B)
wherein score represents the emotion similarity of the first vehicle owner and the second vehicle owner, wi represents the first weight, wj represents the second weight, EscoreA represents the first emotion score, EscoreB represents the second emotion score, and similarity(A, B) represents the text similarity.
8. A classification apparatus, comprising:
the acquisition module is used for acquiring voice texts of different vehicle owners;
the computing module is used for computing emotion scores, text similarity and weights of voice texts of different vehicle owners;
and the classification module is used for calculating the emotion similarity among the different vehicle owners based on the emotion scores, the text similarity and the weights of the voice texts of the different vehicle owners, so as to determine the types of the different vehicle owners based on that emotion similarity.
9. A terminal comprising a memory, a processor and a computer program stored in the memory and executable on the processor, characterized in that the processor implements the steps of the classification method according to any of claims 1 to 7 when the computer program is executed.
10. A computer-readable storage medium storing a computer program, characterized in that the computer program realizes the steps of the classification method according to any one of claims 1 to 7 when the computer program is executed by a processor.
CN202311117045.2A 2023-08-31 2023-08-31 Classification methods, devices, terminals and storage media Pending CN117251569A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202311117045.2A CN117251569A (en) 2023-08-31 2023-08-31 Classification methods, devices, terminals and storage media

Publications (1)

Publication Number Publication Date
CN117251569A true CN117251569A (en) 2023-12-19

Family

ID=89130415

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311117045.2A Pending CN117251569A (en) 2023-08-31 2023-08-31 Classification methods, devices, terminals and storage media

Country Status (1)

Country Link
CN (1) CN117251569A (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20100325135A1 (en) * 2009-06-23 2010-12-23 Gracenote, Inc. Methods and apparatus for determining a mood profile associated with media data
CN108470065A (en) * 2018-03-22 2018-08-31 北京航空航天大学 A kind of determination method and device of exception comment text
CN110070889A (en) * 2019-03-15 2019-07-30 深圳壹账通智能科技有限公司 Vehicle monitoring method, device and storage medium, server
CN111898377A (en) * 2020-07-07 2020-11-06 苏宁金融科技(南京)有限公司 Emotion recognition method and device, computer equipment and storage medium
CN111984793A (en) * 2020-09-03 2020-11-24 平安国际智慧城市科技股份有限公司 Text sentiment classification model training method, device, computer equipment and medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
LIU Xiaofang, YAO Chunjiang, ZHOU Yongtao: "Theory and Technology of Performance and Quality State Evaluation and Prediction for Missile Equipment", 30 June 2019, National Defense Industry Press, pages 65-68 *

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination