
CN110858099B - Candidate word generation method and device - Google Patents


Info

Publication number
CN110858099B
CN110858099B (application CN201810948159.4A)
Authority
CN
China
Prior art keywords
candidate
word
user
candidate words
information
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810948159.4A
Other languages
Chinese (zh)
Other versions
CN110858099A (en)
Inventor
姚波怀
张扬
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Sogou Technology Development Co Ltd
Original Assignee
Beijing Sogou Technology Development Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Sogou Technology Development Co Ltd filed Critical Beijing Sogou Technology Development Co Ltd
Priority to CN201810948159.4A priority Critical patent/CN110858099B/en
Publication of CN110858099A publication Critical patent/CN110858099A/en
Application granted granted Critical
Publication of CN110858099B publication Critical patent/CN110858099B/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0236 Character input methods using selection techniques to select from displayed items
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G06F18/2411 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches based on the proximity to a decision surface, e.g. support vector machines
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F3/00 Input arrangements for transferring data to be processed into a form capable of being handled by the computer; Output arrangements for transferring data from processing unit to output unit, e.g. interface arrangements
    • G06F3/01 Input arrangements or combined input and output arrangements for interaction between user and computer
    • G06F3/02 Input arrangements using manually operated switches, e.g. using keyboards or dials
    • G06F3/023 Arrangements for converting discrete items of information into a coded form, e.g. arrangements for interpreting keyboard generated codes as alphanumeric codes, operand codes or instruction codes
    • G06F3/0233 Character input methods
    • G06F3/0237 Character input methods using prediction or retrieval techniques


Abstract

The invention discloses a candidate word generation method and device. The method comprises: acquiring a preceding context in real time and obtaining candidate words; determining a user emotion category according to the preceding context; and screening the candidate words according to the user emotion category to obtain candidate words to be output. The method and device improve the accuracy of candidate words and enhance the user input experience.

Description

Candidate word generation method and device

Technical Field

The present invention relates to the field of input methods, and in particular to a candidate word generation method and device.

Background Art

An input method is an encoding method for entering various symbols into electronic devices, and an indispensable tool for humans to interact with them. To speed up input, most existing input methods provide an associative-memory function: after a Chinese character or word is entered, the input method automatically recommends candidate characters or words.

Some existing input methods perform word prediction during input, predicting the next word the user will type and offering it to the user for convenience. Current prediction methods mainly use large-scale corpus data to learn a language model, either a traditional n-gram statistical model or a deep learning model, and then use the model to find the most probable next word based on the preceding context, the input environment, and other information.

Although this approach facilitates input to some extent, current models have difficulty perceiving a long preceding context and therefore often produce out-of-context predictions. For example, when a user inputs "色香味俱全，你做的菜" ("perfect in color, aroma and taste, the dishes you make"), most current input methods consider only the immediately preceding "你做的菜" ("the dishes you make") and offer candidates such as "不好吃" (not tasty) and "好吃" (tasty). Clearly, "不好吃" is not a reasonable candidate here; such candidates not only crowd out candidates expressing appreciation, such as "很赞" (wonderful) and "很棒" (great), but also give the user a poor experience.

Summary of the Invention

The embodiments of the present invention provide a candidate word generation method and device to improve the accuracy of candidate words and enhance the user input experience.

To this end, the present invention provides the following technical solutions:

A candidate word generation method, the method comprising:

acquiring a preceding context in real time and obtaining candidate words;

determining a user emotion category according to the preceding context;

screening the candidate words according to the user emotion category to obtain candidate words to be output.

Preferably, the preceding context is any one of the following: text, speech, or a picture.

Preferably, the method further comprises: pre-constructing an emotion recognition model;

and determining the user emotion category according to the preceding context comprises:

extracting text information from the preceding context;

obtaining the user emotion category by using the text information and the emotion recognition model.

Preferably, the emotion recognition model is a deep learning model, and extracting the text information from the preceding context comprises: obtaining a word sequence corresponding to the preceding context and determining a word vector for each word in the word sequence; or

the emotion recognition model is an SVM or a decision tree, and extracting the text information from the preceding context comprises: obtaining a word sequence corresponding to the preceding context and determining an ID for each word in the word sequence.

Preferably, determining the user emotion category according to the preceding context further comprises:

acquiring auxiliary information, the auxiliary information comprising any one or more of the following: current environment information, location information, and user body information;

obtaining the user emotion category by using the text information, the auxiliary information, and the emotion recognition model.

Preferably, the method further comprises:

obtaining a candidate score for each candidate word;

and screening the candidate words according to the user emotion category to obtain the candidate words to be output comprises:

adjusting the candidate scores of the candidate words according to the user emotion category to obtain final scores of the candidate words;

determining the candidate words to be output according to the final scores.

Preferably, adjusting the candidate scores of the candidate words according to the user emotion category to obtain the final scores comprises:

determining a sentiment score for each candidate word according to the user emotion category;

computing a weighted sum of the candidate score and the sentiment score of each candidate word to obtain its final score.

Preferably, adjusting the candidate scores of the candidate words according to the user emotion category to obtain the final scores comprises:

determining a weight for the candidate score of each candidate word according to the user emotion category;

computing the final score of each candidate word from its candidate score and the weight.

Preferably, determining the candidate words to be output according to the final scores comprises:

selecting a set number of candidate words in descending order of final score as the candidate words to be output; or

selecting candidate words whose final scores are greater than a set threshold as the candidate words to be output.

Preferably, screening the candidate words according to the user emotion category to obtain the candidate words to be output comprises:

selecting, from the candidate words, the candidate words corresponding to the emotion category as the candidate words to be output.

Preferably, the method further comprises:

pre-establishing candidate word lists corresponding to different emotion categories;

and selecting the candidate words corresponding to the emotion category comprises:

selecting the candidate words corresponding to the emotion category according to the lists.

Preferably, the method further comprises:

performing personalized training of the emotion recognition model according to historical input information to update the emotion recognition model.

A candidate word generation device, the device comprising:

a context acquisition module, configured to acquire a preceding context in real time;

a candidate word acquisition module, configured to obtain candidate words;

an emotion recognition module, configured to determine a user emotion category according to the preceding context;

a screening module, configured to screen the candidate words according to the user emotion category to obtain candidate words to be output.

Preferably, the preceding context is any one of the following: text, speech, or a picture.

Preferably, the device further comprises: a model construction module, configured to pre-construct an emotion recognition model;

and the emotion recognition module comprises:

a text processing unit, configured to extract text information from the preceding context;

a recognition unit, configured to obtain the user emotion category by using the text information and the emotion recognition model.

Preferably, the emotion recognition model is a deep learning model, and the text processing unit is specifically configured to obtain a word sequence corresponding to the preceding context and determine a word vector for each word in the word sequence; or

the emotion recognition model is a classification model, and the text processing unit is specifically configured to obtain a word sequence corresponding to the preceding context and determine an ID for each word in the word sequence.

Preferably, the emotion recognition module further comprises:

an auxiliary information acquisition unit, configured to acquire auxiliary information, the auxiliary information comprising current environment information and/or location information;

and the recognition unit is specifically configured to obtain the user emotion category by using the text information, the auxiliary information, and the emotion recognition model.

Preferably, the candidate word acquisition module is further configured to obtain a candidate score for each candidate word;

and the screening module comprises:

a score adjustment module, configured to adjust the candidate scores of the candidate words according to the user emotion category to obtain final scores of the candidate words;

a candidate word output module, configured to determine the candidate words to be output according to the final scores.

Preferably, the score adjustment module comprises:

a sentiment score determination unit, configured to determine a sentiment score for each candidate word according to the user emotion category;

a first calculation unit, configured to compute a weighted sum of the candidate score and the sentiment score of each candidate word to obtain its final score.

Preferably, the score adjustment module comprises:

a weight determination unit, configured to determine a weight for the candidate score of each candidate word according to the user emotion category;

a second calculation unit, configured to compute the final score of each candidate word from its candidate score and the weight.

Preferably, the candidate word output module is specifically configured to select a set number of candidate words in descending order of final score as the candidate words to be output, or to select candidate words whose final scores are greater than a set threshold as the candidate words to be output.

Preferably, the screening module is specifically configured to select, from the candidate words, the candidate words corresponding to the emotion category as the candidate words to be output.

Preferably, the device further comprises:

a candidate word list establishment module, configured to pre-establish candidate word lists corresponding to different emotion categories;

wherein the screening module selects the candidate words corresponding to the emotion category according to the lists.

Preferably, the device further comprises:

an information recording module, configured to record historical input information;

a model updating module, configured to perform personalized training of the emotion recognition model using the historical input information to update the emotion recognition model.

A computer device, comprising: one or more processors and a memory;

wherein the memory is configured to store computer-executable instructions, and the processors are configured to execute the computer-executable instructions to implement the foregoing method.

A readable storage medium having instructions stored thereon, the instructions being executed to implement the foregoing method.

The candidate word generation method and device provided by the embodiments of the present invention recognize the user's emotion based on the preceding context, screen the candidate words according to the recognized user emotion category, and preferentially provide the user with candidates that match the user's current mood, so that the candidates provided are more accurate, thereby improving input efficiency and the user input experience.

Brief Description of the Drawings

To explain the embodiments of the present application or the technical solutions in the prior art more clearly, the drawings needed in the embodiments are briefly introduced below. Obviously, the drawings described below illustrate only some embodiments recorded in the present invention; a person of ordinary skill in the art could derive other drawings from them.

FIG. 1 is a flow chart of a candidate word generation method according to an embodiment of the present invention;

FIG. 2 is a schematic structural diagram of a candidate word generation device according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of a specific application structure of a candidate word generation device according to an embodiment of the present invention;

FIG. 4 is a block diagram of an apparatus for a candidate word generation method according to an exemplary embodiment;

FIG. 5 is a schematic structural diagram of a server in an embodiment of the present invention.

Detailed Description

To enable persons skilled in the art to better understand the solutions of the embodiments of the present invention, the embodiments are further described in detail below with reference to the accompanying drawings and implementations.

To address the problem that existing input methods, when generating candidate words, produce association candidates unrelated to what the preceding context actually expresses, the embodiments of the present invention provide a candidate word generation method and device that analyze the preceding context to obtain a user emotion category and use that category to screen the candidate words, obtaining the candidates to be output.

FIG. 1 is a flow chart of a candidate word generation method according to an embodiment of the present invention, comprising the following steps:

Step 101: acquire the preceding context in real time and obtain candidate words.

The form of the preceding context may differ with the input device and input method; for example, it may be text, speech, or a picture. In addition, depending on the application scenario, the preceding context may be user input, existing context, context from the interacting peer, and so on.

The candidate words may be those generated by an existing input method. Different input methods may use different generation methods and rules, which are not limited in the embodiments of the present invention. Whatever input method generated the candidate words, further screening by the subsequent scheme of the present invention yields candidates that better match the preceding context.

Step 102: determine the user emotion category according to the preceding context.

Specifically, a model-based approach may be used, in which an emotion recognition model is constructed in advance. The emotion recognition model may be a deep learning model, such as a DNN (Deep Neural Network) or a CNN (Convolutional Neural Network), or another traditional classification model, such as an SVM (Support Vector Machine) or a decision tree. The model can be trained with conventional techniques, which are not described in detail here.

When using the emotion recognition model to recognize the user's emotion from the preceding context, the context is first preprocessed to extract its text information; the text information is then fed into the emotion recognition model, and the user emotion category is obtained from the model's output. For example, when the emotion recognition model is a deep learning model, the word sequence of the preceding context is obtained first, the word vector of each word in the sequence is determined, and the word vectors are input into the deep learning model; the user emotion category is read from the model's output. As another example, when another classification model is used, the word sequence is likewise obtained first, the ID of each word is determined to form an ID sequence, and the ID sequence is input into the classification model to obtain the user emotion category.
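As a toy illustration of this pipeline (not the patented model itself: the words, vector values, and averaging rule below are all invented for demonstration), the word-vector branch can be sketched in Python:

```python
# Toy 2-d "word vectors": (positive strength, negative strength).
# The lexicon and all values are invented for illustration.
WORD_VECTORS = {
    "喜欢": (1.0, 0.1),   # "like"
    "很棒": (0.9, 0.0),   # "great"
    "难过": (0.0, 1.0),   # "sad"
    "生气": (0.1, 0.9),   # "angry"
}

def classify_emotion(word_seq):
    """Average the vectors of known context words and pick the stronger axis.

    This stands in for the trained DNN/CNN described above; only the
    interface (word sequence in, emotion category out) matches the text.
    """
    vecs = [WORD_VECTORS[w] for w in word_seq if w in WORD_VECTORS]
    if not vecs:
        return "other"          # no emotional signal in the context
    pos = sum(v[0] for v in vecs) / len(vecs)
    neg = sum(v[1] for v in vecs) / len(vecs)
    return "positive" if pos >= neg else "negative"
```

A production system would replace the averaged-vector heuristic with the trained deep learning model; the "other" fallback mirrors the third emotion category mentioned below.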

The user emotions may be divided into three categories: positive, negative, and other. They may of course be further subdivided; for example, positive emotions include happiness and liking, while negative emotions include sadness, anxiety, anger, and so on.

In practice, the word sequence of the preceding context may be obtained by segmenting the context text, for example with string-matching-based, understanding-based, or statistics-based segmentation methods, or by recording each word the user has input. For context in a non-text form, the text information is obtained first and then segmented into the corresponding word sequence: for context in image form, image recognition can be used to obtain the text; for context in speech form, speech recognition can be used.
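The word-to-ID mapping used by the SVM/decision-tree variant can be sketched as follows (the vocabulary-building scheme and the convention of reserving ID 0 for unknown words are assumptions, not stated in the text):

```python
def build_vocab(corpus_words):
    """Assign a stable integer ID to each distinct word; 0 is reserved
    for unknown words (a common convention, assumed here)."""
    vocab = {"<unk>": 0}
    for w in corpus_words:
        vocab.setdefault(w, len(vocab))
    return vocab

def to_id_sequence(word_seq, vocab):
    """Map a segmented preceding context to the ID sequence that would
    be fed to an SVM or decision tree classifier."""
    return [vocab.get(w, 0) for w in word_seq]
```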

Step 103: screen the candidate words according to the user emotion category to obtain the candidate words to be output.

The candidate words may be screened in a number of ways.

For example, the candidate words corresponding to the recognized emotion category may be selected as the candidate words to be output. Specifically, candidate word lists corresponding to different emotion categories can be established in advance, and the candidates matching the recognized category are selected according to those lists.
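A minimal sketch of this list-based screening mode, with invented per-category word lists standing in for the pre-established ones:

```python
# Invented per-category whitelists standing in for the pre-built lists.
EMOTION_LISTS = {
    "positive": {"很棒", "喜欢", "开心"},
    "negative": {"难过", "伤心", "生气"},
}

def filter_by_emotion(candidates, emotion):
    """Keep only candidates listed under the recognized emotion category;
    if no list exists for the category (e.g. "other"), keep everything."""
    allowed = EMOTION_LISTS.get(emotion)
    if allowed is None:
        return list(candidates)
    return [c for c in candidates if c in allowed]
```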

As another example, the candidate score of each candidate word for the current input may be obtained, the candidate scores adjusted according to the user emotion category to obtain final scores, and the candidate words to be output then determined from the final scores.

The candidate scores may be adjusted in several ways, for example:

One way: first determine a sentiment score for each candidate word according to the user emotion category; for example, the sentiment score can be determined by how strongly each candidate corresponds to the recognized emotion category, with a stronger correspondence giving a higher score. Then compute a weighted sum of the candidate score and the sentiment score of each candidate word to obtain its final score.
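This weighted-sum adjustment can be written as a one-liner; the mixing weights `alpha` and `beta` are illustrative defaults, since the text does not fix their values:

```python
def final_score_weighted_sum(candidate_score, sentiment_score,
                             alpha=0.7, beta=0.3):
    """Final score as a weighted sum of the input method's candidate
    score and the emotion-derived sentiment score. alpha and beta are
    illustrative; the text leaves the weights unspecified."""
    return alpha * candidate_score + beta * sentiment_score
```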

Another way: first determine a weight for each candidate word's candidate score according to the user emotion category, then compute the final score from the candidate score and its weight. For example, for the recognized emotion category, the weight of candidates related to that category may be set to 1, and the weight of the other candidates to 0.5.
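A sketch of this second mode, using the example weights from the text (1 for candidates related to the recognized emotion, 0.5 otherwise):

```python
def final_score_reweighted(candidate_score, matches_emotion):
    """Second adjustment mode: weight 1.0 for candidates related to the
    recognized emotion category, 0.5 for the rest (the example weights
    given in the text)."""
    return candidate_score * (1.0 if matches_emotion else 0.5)
```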

After the final score of each candidate word is computed, a set number of candidate words can be selected in descending order of final score as the candidates to be output, or the candidates whose final scores exceed a set threshold can be selected as the candidates to be output.
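Both selection rules can be sketched together (the helper and its parameter names are hypothetical; the set number and threshold are left as parameters):

```python
def select_candidates(scored, top_n=None, threshold=None):
    """Apply either output rule from the text to (word, final_score)
    pairs: take the top-N by score, or take all words scoring above a
    threshold. Both rules may be combined."""
    ranked = sorted(scored, key=lambda ws: ws[1], reverse=True)
    if threshold is not None:
        ranked = [(w, s) for w, s in ranked if s > threshold]
    if top_n is not None:
        ranked = ranked[:top_n]
    return [w for w, _ in ranked]
```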

The candidate word generation method provided by the embodiments of the present invention recognizes the user's emotion from the preceding context and screens the candidate words according to the recognized emotion category, thereby providing the user with more accurate candidates and improving input efficiency and the user input experience.

For example, when the user inputs "他老是做错，我很" ("he always gets it wrong, I am very"), an existing input method would generate candidates such as "开心" (happy), "喜欢" (like), and "伤心" (sad). With the method of this embodiment, the acquired context is first used to judge that the user is expressing sadness rather than happiness, so after screening, candidates such as "伤心" (sad) and "难过" (upset) are ranked higher, while association candidates such as "开心" (happy) and "喜欢" (like) are filtered out or ranked lower.

As another example, in a scenario where user A converses with user B through a chat tool, when user A inputs by voice "你好可爱，我好" ("you are so cute, I really"), an existing input method would generate candidates such as "怕怕" (scared), "伤心" (sad), and "喜欢" (like). With the method of this embodiment, speech recognition is first performed on the voice input to obtain the context text, which is used to judge that the user is expressing a positive rather than a negative emotion; after screening, candidates expressing a positive emotion, such as "喜欢" (like), are ranked higher, while "怕怕" (scared) and "伤心" (sad) are filtered out or ranked lower.

When the user is typing, factors such as the input environment may also influence the input. Therefore, in another embodiment of the method, other factors may be taken into account as auxiliary information when constructing the emotion recognition model and using it for recognition. The auxiliary information may include any one or more of: current environment information, location information, and user body information. The environment information may include, for example, temperature and weather; the user body information may include, for example, body temperature, motion state, and current input speed. Specifically, the auxiliary information is acquired, and the user emotion category is obtained using the text information, the auxiliary information, and the emotion recognition model. The auxiliary information can be obtained by calling an application programming interface provided by the input application, or through a third-party app.

由于不同用户可能会有不同的输入习惯,因此,在本发明方法另一实施例中,还可以记录历史输入信息,比如用户每次选择的候选词,利用记录的历史输入信息对所述情绪识别模型进行个性化训练,更新所述情绪识别模型。具体地,可以将每次记录的历史输入信息作为一个训练样本,在训练样本达到一定数量后,在原有情绪识别模型参数的基础上,利用新的样本重新训练,从而得到与该用户更加匹配的个性化的情绪识别模型。在后续的输入中,利用更新后的情绪识别模型对用户情绪进行识别,可以使识别结果更准确,进而可以使向用户提供的候选词具有与上文文本更高的匹配度,进一步提高候选词的准确性,提高用户输入效率。Since different users may have different input habits, in another embodiment of the method of the present invention, historical input information may also be recorded, such as the candidate word the user selects each time, and the recorded historical input information may be used to train the emotion recognition model in a personalized manner, updating the emotion recognition model. Specifically, each piece of recorded historical input information may serve as one training sample; after the number of training samples reaches a certain amount, the model is retrained with the new samples on the basis of the original emotion recognition model parameters, thereby obtaining a personalized emotion recognition model that better matches the user. In subsequent input, using the updated emotion recognition model to recognize the user's emotion makes the recognition result more accurate, which in turn allows the candidate words provided to the user to match the context text more closely, further improving the accuracy of the candidate words and the user's input efficiency.
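The record-then-retrain loop above can be sketched as follows. The class, its parameter dictionary, and the retraining threshold are hypothetical scaffolding; the `_retrain` body is a placeholder for fine-tuning that, as the text specifies, starts from the existing model parameters rather than from scratch.

```python
class PersonalizedModel:
    """Sketch of incremental personalization of an emotion recognition
    model from recorded input history (illustrative, not the patent's
    actual implementation)."""

    def __init__(self, base_params, retrain_threshold=100):
        self.params = dict(base_params)   # start from the generic model
        self.samples = []
        self.retrain_threshold = retrain_threshold

    def record_choice(self, context, chosen_word):
        # Each selected candidate together with its context is one sample.
        self.samples.append((context, chosen_word))
        if len(self.samples) >= self.retrain_threshold:
            self._retrain()

    def _retrain(self):
        # Placeholder: a real system would continue training the emotion
        # recognition model from its current weights using the new samples.
        self.params["personalized_rounds"] = self.params.get("personalized_rounds", 0) + 1
        self.samples.clear()

model = PersonalizedModel({"lr": 0.01}, retrain_threshold=2)
model.record_choice("你好可爱,我好", "喜欢")
model.record_choice("今天真开心", "哈哈")  # reaches the threshold, triggers retraining
```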

相应地,本发明实施例还提供一种候选词生成装置,本发明实施例的候选词生成装置,可以集成在用户设备中,所述用户设备可以是笔记本、计算机、PAD、手机等。用户在进行输入操作时,可以使用实体键盘,也可以使用用户设备触摸屏上的虚拟键盘。Accordingly, an embodiment of the present invention further provides a candidate word generating apparatus, which can be integrated in a user device such as a notebook, a computer, a tablet (PAD), or a mobile phone. When performing input operations, the user may use a physical keyboard or a virtual keyboard on the touch screen of the user device.

如图2所示,是本发明实施例候选词生成装置的一种结构示意图。FIG. 2 is a schematic structural diagram of a candidate word generating apparatus according to an embodiment of the present invention.

在该实施例中,所述装置包括:In this embodiment, the apparatus comprises:

上文获取模块201,用于实时获取上文;A context acquisition module 201, used to acquire the context in real time;

候选词获取模块202,用于获得候选词;A candidate word acquisition module 202 is used to obtain candidate words;

情绪识别模块203,用于根据所述上文确定用户情绪类别;An emotion recognition module 203, used to determine the user's emotion category according to the above text;

筛选模块204,用于根据所述用户情绪类别对所述候选词进行筛选,得到待输出的候选词。The screening module 204 is used to screen the candidate words according to the user emotion category to obtain the candidate words to be output.

针对不同的输入设备及输入方式,所述上文的形式也可以不同,比如,所述上文可以是文本、语音、或者图片等形式。另外,根据应用场景不同,所述上文可以是用户输入、已有上文、对端交互的上文等。For different input devices and input methods, the form of the above context may also be different, for example, the above context may be in the form of text, voice, or picture, etc. In addition, according to different application scenarios, the above context may be user input, existing context, context of peer interaction, etc.

所述候选词可以是现有的输入法生成的候选词,对于不同的输入法,其生成候选词的方法及规则等可能会有所不同,此处本发明实施例不做限定。不论基于何种输入法生成的候选词,通过后续本发明方案对这些候选词做进一步筛选,均能得到与上文更匹配的候选词。The candidate words may be generated by an existing input method. Different input methods may have different methods and rules for generating candidate words, which are not limited in the embodiments of the present invention. Regardless of the candidate words generated by the input method, further screening of the candidate words by the subsequent solution of the present invention can obtain candidate words that better match the above.

上述情绪识别模块203具体可以采用基于模型的方式识别用户情绪类别。The emotion recognition module 203 may specifically recognize the user emotion category in a model-based manner.

所述模型可以由模型构建模块预先构建,所述模型构建模块可以作为独立的模块,也可以集成于该装置,作为本发明装置的一部分。The model can be pre-constructed by a model construction module, and the model construction module can be used as an independent module or integrated into the device as a part of the device of the present invention.

在实际应用中,所述情绪识别模型可以采用深度学习模型,比如DNN、CNN等,或者采用传统的分类模型,比如,SVM、决策树等。模型的训练过程可以采用常规技术,在此不再详细描述。In practical applications, the emotion recognition model may adopt a deep learning model, such as DNN, CNN, etc., or a traditional classification model, such as SVM, decision tree, etc. The training process of the model may adopt conventional techniques and will not be described in detail here.

基于预先构建的情绪识别模型,所述情绪识别模块203基于上文确定用户情绪类别时,需要先对所述上文进行预处理,提取所述上文的文本信息,然后将所述文本信息输入到所述情绪识别模型,根据模型的输出得到用户情绪类别。相应地,所述情绪识别模块203的一个具体结构可以包括:文本处理单元和识别单元。其中:Based on the pre-built emotion recognition model, when the emotion recognition module 203 determines the user emotion category based on the above text, it is necessary to pre-process the above text, extract the text information of the above text, and then input the text information into the emotion recognition model to obtain the user emotion category according to the output of the model. Accordingly, a specific structure of the emotion recognition module 203 may include: a text processing unit and a recognition unit. Among them:

所述文本处理单元,用于提取所述上文的文本信息;The text processing unit is used to extract the text information of the above text;

所述识别单元,用于利用所述文本信息及所述情绪识别模型,得到用户情绪类别。The recognition unit is used to obtain the user emotion category by using the text information and the emotion recognition model.

对于不同的模型,所述文本处理单元需要得到不同的文本信息,比如,所述情绪识别模型为深度学习模型时,所述文本处理单元需要获取所述上文对应的词序列,确定所述词序列中各词的词向量;所述情绪识别模型为传统分类模型时,所述文本处理单元需要获取所述上文对应的词序列,确定所述词序列中各词的ID。相应地,所述识别单元需要将所述词序列中各词的词向量输入深度学习模型得到用户情绪类别,或者将所述词序列中各词的ID输入所述传统分类模型得到用户情绪类别。For different models, the text processing unit needs to obtain different text information. For example, when the emotion recognition model is a deep learning model, the text processing unit needs to obtain the word sequence corresponding to the above text and determine the word vector of each word in the word sequence; when the emotion recognition model is a traditional classification model, the text processing unit needs to obtain the word sequence corresponding to the above text and determine the ID of each word in the word sequence. Accordingly, the recognition unit needs to input the word vector of each word in the word sequence into the deep learning model to obtain the user emotion category, or input the ID of each word in the word sequence into the traditional classification model to obtain the user emotion category.
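The two preprocessing paths described here can be sketched side by side. The vocabulary and embedding table below are invented placeholders; a real input method would load them from its trained emotion recognition model.

```python
# Illustrative vocabulary and embedding table (assumed, not from the patent).
VOCAB = {"你好": 0, "可爱": 1, "我": 2, "好": 3}
EMBEDDINGS = {0: [0.1, 0.2], 1: [0.9, 0.8], 2: [0.5, 0.5], 3: [0.6, 0.7]}

def words_to_ids(words):
    """ID features, as consumed by a traditional classifier (e.g. an SVM)."""
    return [VOCAB[w] for w in words if w in VOCAB]

def words_to_vectors(words):
    """Word-vector features, as consumed by a deep model (e.g. DNN/CNN)."""
    return [EMBEDDINGS[i] for i in words_to_ids(words)]

ids = words_to_ids(["你好", "可爱"])
vectors = words_to_vectors(["你好", "可爱"])
```

The word sequence itself would come from the segmentation step described in the next paragraph.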

对于文本形式的上文,所述文本处理单元可以通过对所述上文进行分词处理,得到所述上文文本对应的词序列;对于其它形式的上文,可以先利用相应的识别技术,得到对应的文本,然后再进行分词处理,得到词序列。如果是用户输入的上文,也可以通过记录用户输入的各词,得到所述上文对应的词序列。For the textual context, the text processing unit can obtain the word sequence corresponding to the textual context by performing word segmentation on the context; for other forms of context, the corresponding text can be obtained by first using corresponding recognition technology, and then word segmentation can be performed to obtain the word sequence. If the context is user input, the word sequence corresponding to the context can also be obtained by recording each word input by the user.

进一步地,所述情绪识别模块203还可包括:辅助信息获取单元,用于获取辅助信息,所述辅助信息包括以下任意一种或多种:当前环境信息、位置信息、用户身体信息等。相应地,所述识别单元可以利用所述文本信息、所述辅助信息及所述情绪识别模型,得到用户情绪类别。需要说明的是,在这种情况下,所述情绪识别模型构建时,也需要考虑上述辅助信息。Furthermore, the emotion recognition module 203 may also include: an auxiliary information acquisition unit for acquiring auxiliary information, wherein the auxiliary information includes any one or more of the following: current environment information, location information, user body information, etc. Accordingly, the recognition unit can use the text information, the auxiliary information, and the emotion recognition model to obtain the user emotion category. It should be noted that in this case, the auxiliary information also needs to be considered when constructing the emotion recognition model.

在实际应用中,所述筛选模块204在对所述候选词进行筛选时,可以有多种筛选方式。In actual applications, the screening module 204 may use a variety of screening methods when screening the candidate words.

比如,在一个具体实施例中,可以选择所述候选词中词性与所述下一个词的词性信息相同的候选词作为待输出的候选词。For example, in a specific embodiment, candidate words whose part of speech is the same as the part-of-speech information of the next word may be selected from the candidate words as the candidates to be output.

再比如,如图3所示,在另一个具体实施例中,所述候选词获取模块202还用于获取各候选词的候选得分。For another example, as shown in FIG. 3 , in another specific embodiment, the candidate word acquisition module 202 is further configured to acquire a candidate score for each candidate word.

相应地,在该实施例中,所述筛选模块204包括:得分调整模块241和候选词输出模块242。其中:Accordingly, in this embodiment, the screening module 204 includes: a score adjustment module 241 and a candidate word output module 242. Among them:

所述得分调整模块241用于根据所述用户情绪类别调整所述候选词的候选得分,得到所述候选词的最终得分;The score adjustment module 241 is used to adjust the candidate score of the candidate word according to the user emotion category to obtain the final score of the candidate word;

所述候选词输出模块242用于根据所述最终得分确定待输出的候选词。The candidate word output module 242 is used to determine the candidate word to be output according to the final score.

在实际应用中,所述得分调整模块241也可以采用多种方式对候选词的候选得分进行调整。In practical applications, the score adjustment module 241 may also use a variety of methods to adjust the candidate scores of the candidate words.

比如,所述得分调整模块241的一种具体实现可以包括:情感得分确定单元和第一计算单元。其中:所述情感得分确定单元用于根据所述用户情绪类别确定各候选词的情感得分;所述第一计算单元用于将所述候选词的候选得分与所述情感得分进行加权求和,得到所述候选词的最终得分。For example, a specific implementation of the score adjustment module 241 may include: a sentiment score determination unit and a first calculation unit. The sentiment score determination unit is used to determine the sentiment score of each candidate word according to the user emotion category; the first calculation unit is used to perform a weighted summation of the candidate score of the candidate word and the sentiment score to obtain the final score of the candidate word.
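This weighted-sum adjustment is a one-line computation. The weights and scores below are illustrative values, not parameters fixed by the disclosure.

```python
def final_score(candidate_score, sentiment_score, w_cand=0.7, w_sent=0.3):
    """Weighted sum of the input method's own candidate score and the
    emotion-match (sentiment) score. Weight values are illustrative."""
    return w_cand * candidate_score + w_sent * sentiment_score

# A perfect emotion match (sentiment_score = 1.0) lifts a 0.8 candidate score.
score = final_score(0.8, 1.0)
```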

再比如,所述得分调整模块241的另一种具体实现可以包括:权重确定单元和第二计算单元。其中:所述权重确定单元用于根据所述用户情绪类别确定各候选词的候选得分的权重;所述第二计算单元用于根据所述候选词的候选得分及其权重计算得到所述候选词的最终得分。For another example, another specific implementation of the score adjustment module 241 may include: a weight determination unit and a second calculation unit. The weight determination unit is used to determine the weight of the candidate score of each candidate word according to the user emotion category; the second calculation unit is used to calculate the final score of the candidate word according to the candidate score of the candidate word and its weight.

上述候选词输出模块242具体可以依照最终得分从高到低的顺序选取设定数量的候选词作为待输出的候选词;或者选取最终得分大于设定阈值的候选词作为待输出的候选词。The candidate word output module 242 may specifically select a set number of candidate words as candidate words to be output in descending order of final scores; or select candidate words whose final scores are greater than a set threshold as candidate words to be output.
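Both selection strategies of the candidate word output module 242 can be sketched in one helper. The candidate words, scores, and cutoff values are illustrative only.

```python
def select_candidates(scored, top_n=None, threshold=None):
    """scored: list of (word, final_score) pairs. Applies the threshold
    rule, the top-N rule, or both, mirroring the two strategies above."""
    ordered = sorted(scored, key=lambda pair: pair[1], reverse=True)
    if threshold is not None:
        ordered = [pair for pair in ordered if pair[1] > threshold]
    if top_n is not None:
        ordered = ordered[:top_n]
    return [word for word, _ in ordered]

# Top-2 selection: the low-scoring negative candidate is dropped.
out = select_candidates([("喜欢", 0.9), ("怕怕", 0.2), ("伤心", 0.4)], top_n=2)
```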

本发明实施例提供的候选词生成装置,基于获取的上文信息确定用户情绪类别,根据用户情绪类别对各候选词进行筛选,从而提供给用户更准确的候选词,进而提高用户输入效率,提升用户输入体验。The candidate word generation device provided in the embodiment of the present invention determines the user emotion category based on the acquired context information, and screens each candidate word according to the user emotion category, thereby providing the user with more accurate candidate words, thereby improving the user input efficiency and enhancing the user input experience.

由于不同用户可能会有不同的输入习惯,因此,在本发明装置另一实施例中,还可以进一步包括:信息记录模块和模型更新模块(未图示)。其中,所述信息记录模块用于记录历史输入信息;所述模型更新模块用于利用所述信息记录模块记录的历史输入信息对所述情绪识别模型进行个性化训练,更新所述情绪识别模型。具体地,可以将每次记录的历史输入信息作为一个训练样本,在训练样本达到一定数量后,在原有情绪识别模型参数的基础上,利用新的样本重新训练,从而得到与该用户更加匹配的个性化的情绪识别模型。Since different users may have different input habits, in another embodiment of the device of the present invention, it may further include: an information recording module and a model updating module (not shown). The information recording module is used to record historical input information; the model updating module is used to use the historical input information recorded by the information recording module to perform personalized training on the emotion recognition model and update the emotion recognition model. Specifically, each recorded historical input information can be used as a training sample. After the number of training samples reaches a certain number, the new samples can be used to retrain on the basis of the original emotion recognition model parameters, so as to obtain a personalized emotion recognition model that better matches the user.

相应地,在后续用户的输入过程中,利用更新后的情绪识别模型对用户情绪进行识别,可以使识别结果更准确,进而可以使向用户提供的候选词具有与上文文本更高的匹配度,进一步提高候选词的准确性,提高用户输入效率。Accordingly, in the subsequent user input process, using the updated emotion recognition model to identify user emotions can make the recognition results more accurate, and thus can make the candidate words provided to the user have a higher match with the previous text, further improving the accuracy of the candidate words and improving user input efficiency.

需要说明的是,在实际应用中,可以将本发明方法及装置应用于各种不同的输入法中,而且不论是采用拼音输入、五笔输入、语音输入还是其它方式的输入,都能够适用。It should be noted that, in practical applications, the method and device of the present invention can be applied to various input methods, and can be applicable to input methods such as pinyin input, Wubi input, voice input or other input methods.

图4是根据一示例性实施例示出的一种用于候选词生成方法的装置800的框图。例如,装置800可以是移动电话,计算机,数字广播终端,消息收发设备,游戏控制台,平板设备,医疗设备,健身设备,个人数字助理等。Fig. 4 is a block diagram of an apparatus 800 for generating a candidate word according to an exemplary embodiment. For example, the apparatus 800 may be a mobile phone, a computer, a digital broadcast terminal, a messaging device, a game console, a tablet device, a medical device, a fitness device, a personal digital assistant, etc.

参照图4,装置800可以包括以下一个或多个组件:处理组件802,存储器804,电源组件806,多媒体组件808,音频组件810,输入/输出(I/O)的接口812,传感器组件814,以及通信组件816。4 , the device 800 may include one or more of the following components: a processing component 802 , a memory 804 , a power component 806 , a multimedia component 808 , an audio component 810 , an input/output (I/O) interface 812 , a sensor component 814 , and a communication component 816 .

处理组件802通常控制装置800的整体操作,诸如与显示,电话呼叫,数据通信,相机操作和记录操作相关联的操作。处理元件802可以包括一个或多个处理器820来执行指令,以完成上述的方法的全部或部分步骤。此外,处理组件802可以包括一个或多个模块,便于处理组件802和其他组件之间的交互。例如,处理部件802可以包括多媒体模块,以方便多媒体组件808和处理组件802之间的交互。The processing component 802 generally controls the overall operation of the device 800, such as operations associated with display, phone calls, data communications, camera operations, and recording operations. The processing component 802 may include one or more processors 820 to execute instructions to complete all or part of the steps of the above-mentioned method. In addition, the processing component 802 may include one or more modules to facilitate the interaction between the processing component 802 and other components. For example, the processing component 802 may include a multimedia module to facilitate the interaction between the multimedia component 808 and the processing component 802.

存储器804被配置为存储各种类型的数据以支持在设备800的操作。这些数据的示例包括用于在装置800上操作的任何应用程序或方法的指令,联系人数据,电话簿数据,消息,图片,视频等。存储器804可以由任何类型的易失性或非易失性存储设备或者它们的组合实现,如静态随机存取存储器(SRAM),电可擦除可编程只读存储器(EEPROM),可擦除可编程只读存储器(EPROM),可编程只读存储器(PROM),只读存储器(ROM),磁存储器,快闪存储器,磁盘或光盘。The memory 804 is configured to store various types of data to support operations on the device 800. Examples of such data include instructions for any application or method operating on the device 800, contact data, phone book data, messages, pictures, videos, etc. The memory 804 can be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, magnetic disk or optical disk.

电力组件806为装置800的各种组件提供电力。电力组件806可以包括电源管理系统,一个或多个电源,及其他与为装置800生成、管理和分配电力相关联的组件。The power component 806 provides power to the various components of the device 800. The power component 806 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the device 800.

多媒体组件808包括在所述装置800和用户之间的提供一个输出接口的屏幕。在一些实施例中,屏幕可以包括液晶显示器(LCD)和触摸面板(TP)。如果屏幕包括触摸面板,屏幕可以被实现为触摸屏,以接收来自用户的输入信号。触摸面板包括一个或多个触摸传感器以感测触摸、滑动和触摸面板上的手势。所述触摸传感器可以不仅感测触摸或滑动动作的边界,而且还检测与所述触摸或滑动操作相关的持续时间和压力。在一些实施例中,多媒体组件808包括一个前置摄像头和/或后置摄像头。当设备800处于操作模式,如拍摄模式或视频模式时,前置摄像头和/或后置摄像头可以接收外部的多媒体数据。每个前置摄像头和后置摄像头可以是一个固定的光学透镜系统或具有焦距和光学变焦能力。The multimedia component 808 includes a screen that provides an output interface between the device 800 and the user. In some embodiments, the screen may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from the user. The touch panel includes one or more touch sensors to sense touch, slide, and gestures on the touch panel. The touch sensor may not only sense the boundaries of the touch or slide action, but also detect the duration and pressure associated with the touch or slide operation. In some embodiments, the multimedia component 808 includes a front camera and/or a rear camera. When the device 800 is in an operating mode, such as a shooting mode or a video mode, the front camera and/or the rear camera may receive external multimedia data. Each front camera and rear camera may be a fixed optical lens system or have a focal length and optical zoom capability.

音频组件810被配置为输出和/或输入音频信号。例如,音频组件810包括一个麦克风(MIC),当装置800处于操作模式,如呼叫模式、记录模式和语音识别模式时,麦克风被配置为接收外部音频信号。所接收的音频信号可以被进一步存储在存储器804或经由通信组件816发送。在一些实施例中,音频组件810还包括一个扬声器,用于输出音频信号。The audio component 810 is configured to output and/or input audio signals. For example, the audio component 810 includes a microphone (MIC), and when the device 800 is in an operating mode, such as a call mode, a recording mode, and a speech recognition mode, the microphone is configured to receive an external audio signal. The received audio signal can be further stored in the memory 804 or sent via the communication component 816. In some embodiments, the audio component 810 also includes a speaker for outputting audio signals.

I/O接口812为处理组件802和外围接口模块之间提供接口,上述外围接口模块可以是键盘,点击轮,按钮等。这些按钮可包括但不限于:主页按钮、音量按钮、启动按钮和锁定按钮。I/O interface 812 provides an interface between processing component 802 and peripheral interface modules, such as keyboards, click wheels, buttons, etc. These buttons may include but are not limited to: home button, volume button, start button, and lock button.

传感器组件814包括一个或多个传感器,用于为装置800提供各个方面的状态评估。例如,传感器组件814可以检测到设备800的打开/关闭状态,组件的相对定位,例如所述组件为装置800的显示器和小键盘,传感器组件814还可以检测装置800或装置800一个组件的位置改变,用户与装置800接触的存在或不存在,装置800方位或加速/减速和装置800的温度变化。传感器组件814可以包括接近传感器,被配置用来在没有任何的物理接触时检测附近物体的存在。传感器组件814还可以包括光传感器,如CMOS或CCD图像传感器,用于在成像应用中使用。在一些实施例中,该传感器组件814还可以包括加速度传感器,陀螺仪传感器,磁传感器,压力传感器或温度传感器。The sensor assembly 814 includes one or more sensors for providing various aspects of status assessment for the device 800. For example, the sensor assembly 814 can detect the open/closed state of the device 800, the relative positioning of components, such as the display and keypad of the device 800, and the sensor assembly 814 can also detect the position change of the device 800 or a component of the device 800, the presence or absence of user contact with the device 800, the orientation or acceleration/deceleration of the device 800, and the temperature change of the device 800. The sensor assembly 814 may include a proximity sensor configured to detect the presence of nearby objects without any physical contact. The sensor assembly 814 may also include an optical sensor, such as a CMOS or CCD image sensor, for use in imaging applications. In some embodiments, the sensor assembly 814 may also include an accelerometer, a gyroscope sensor, a magnetic sensor, a pressure sensor, or a temperature sensor.

通信组件816被配置为便于装置800和其他设备之间有线或无线方式的通信。装置800可以接入基于通信标准的无线网络,如WiFi,2G或3G,或它们的组合。在一个示例性实施例中,通信部件816经由广播信道接收来自外部广播管理系统的广播信号或广播相关信息。在一个示例性实施例中,所述通信部件816还包括近场通信(NFC)模块,以促进短程通信。例如,在NFC模块可基于射频识别(RFID)技术,红外数据协会(IrDA)技术,超宽带(UWB)技术,蓝牙(BT)技术和其他技术来实现。The communication component 816 is configured to facilitate wired or wireless communication between the device 800 and other devices. The device 800 can access a wireless network based on a communication standard, such as WiFi, 2G or 3G, or a combination thereof. In an exemplary embodiment, the communication component 816 receives a broadcast signal or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 816 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module can be implemented based on radio frequency identification (RFID) technology, infrared data association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology and other technologies.

在示例性实施例中,装置800可以被一个或多个应用专用集成电路(ASIC)、数字信号处理器(DSP)、数字信号处理设备(DSPD)、可编程逻辑器件(PLD)、现场可编程门阵列(FPGA)、控制器、微控制器、微处理器或其他电子元件实现,用于执行上述方法。In an exemplary embodiment, the apparatus 800 may be implemented by one or more application specific integrated circuits (ASICs), digital signal processors (DSPs), digital signal processing devices (DSPDs), programmable logic devices (PLDs), field programmable gate arrays (FPGAs), controllers, microcontrollers, microprocessors or other electronic components to perform the above method.

在示例性实施例中,还提供了一种包括指令的非临时性计算机可读存储介质,例如包括指令的存储器804,上述指令可由装置800的处理器820执行以完成上述候选词生成方法。例如,所述非临时性计算机可读存储介质可以是ROM、随机存取存储器(RAM)、CD-ROM、磁带、软盘和光数据存储设备等。In an exemplary embodiment, a non-transitory computer-readable storage medium including instructions is also provided, such as the memory 804 including instructions, and the instructions can be executed by the processor 820 of the device 800 to complete the above-mentioned candidate word generation method. For example, the non-transitory computer-readable storage medium can be a ROM, a random access memory (RAM), a CD-ROM, a magnetic tape, a floppy disk, an optical data storage device, etc.

本发明还提供一种非临时性计算机可读存储介质,当所述存储介质中的指令由移动终端的处理器执行时,使得移动终端能够执行上述本发明方法实施例中的全部或部分步骤。The present invention also provides a non-temporary computer-readable storage medium. When the instructions in the storage medium are executed by a processor of a mobile terminal, the mobile terminal is enabled to execute all or part of the steps in the above-mentioned method embodiment of the present invention.

图5是本发明实施例中服务器的结构示意图。该服务器1900可因配置或性能不同而产生比较大的差异,可以包括一个或一个以上中央处理器(Central Processing Units,CPU)1922(例如,一个或一个以上处理器)和存储器1932,一个或一个以上存储应用程序1942或数据1944的存储介质1930(例如一个或一个以上海量存储设备)。其中,存储器1932和存储介质1930可以是短暂存储或持久存储。存储在存储介质1930的程序可以包括一个或一个以上模块(图示没标出),每个模块可以包括对服务器中的一系列指令操作。更进一步地,中央处理器1922可以设置为与存储介质1930通信,在服务器1900上执行存储介质1930中的一系列指令操作。FIG5 is a schematic diagram of the structure of a server in an embodiment of the present invention. The server 1900 may have relatively large differences due to different configurations or performances, and may include one or more central processing units (CPU) 1922 (for example, one or more processors) and a memory 1932, and one or more storage media 1930 (for example, one or more mass storage devices) storing application programs 1942 or data 1944. Among them, the memory 1932 and the storage medium 1930 may be short-term storage or permanent storage. The program stored in the storage medium 1930 may include one or more modules (not shown in the figure), and each module may include a series of instruction operations in the server. Furthermore, the central processing unit 1922 may be configured to communicate with the storage medium 1930 and execute a series of instruction operations in the storage medium 1930 on the server 1900.

服务器1900还可以包括一个或一个以上电源1926,一个或一个以上有线或无线网络接口1950,一个或一个以上输入输出接口1958,一个或一个以上键盘1956,和/或,一个或一个以上操作系统1941,例如Windows Server™,Mac OS X™,Unix™,Linux™,FreeBSD™等等。The server 1900 may also include one or more power supplies 1926, one or more wired or wireless network interfaces 1950, one or more input and output interfaces 1958, one or more keyboards 1956, and/or one or more operating systems 1941, such as Windows Server™, Mac OS X™, Unix™, Linux™, FreeBSD™, etc.

本领域技术人员在考虑说明书及实践这里公开的发明后,将容易想到本发明的其它实施方案。本发明旨在涵盖本发明的任何变型、用途或者适应性变化,这些变型、用途或者适应性变化遵循本发明的一般性原理并包括本公开未公开的本技术领域中的公知常识或惯用技术手段。说明书和实施例仅被视为示例性的,本发明的真正范围和精神由下面的权利要求指出。Those skilled in the art will readily appreciate other embodiments of the present invention after considering the specification and practicing the invention disclosed herein. The present invention is intended to cover any variations, uses or adaptations of the present invention that follow the general principles of the present invention and include common knowledge or customary techniques in the art that are not disclosed in this disclosure. The description and examples are to be considered exemplary only, and the true scope and spirit of the present invention are indicated by the following claims.

应当理解的是,本发明并不局限于上面已经描述并在附图中示出的精确结构,并且可以在不脱离其范围进行各种修改和改变。本发明的范围仅由所附的权利要求来限制。It should be understood that the present invention is not limited to the exact construction that has been described above and shown in the drawings and that various modifications and changes may be made without departing from the scope thereof. The scope of the present invention is limited only by the appended claims.

以上所述仅为本发明的较佳实施例,并不用以限制本发明,凡在本发明的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本发明的保护范围之内。The above description is only a preferred embodiment of the present invention and is not intended to limit the present invention. Any modifications, equivalent substitutions, improvements, etc. made within the spirit and principle of the present invention should be included in the protection scope of the present invention.

Claims (10)

1. A method of generating a candidate word, the method comprising:
acquiring the above text in real time and obtaining candidate words;
extracting the text information; acquiring auxiliary information, wherein the auxiliary information comprises any one or more of the following: current environmental information, location information, user body information;
pre-constructing an emotion recognition model, recording historical input information, wherein the historical input information comprises candidate words selected by a user each time, taking the recorded historical input information each time as a training sample, and re-training by using the training sample on the basis of pre-constructed emotion recognition model parameters after the training sample reaches a certain number to obtain a personalized emotion recognition model matched with the user;
obtaining emotion categories of the user by using the text information, the auxiliary information and the personalized emotion recognition model corresponding to the user;
obtaining candidate scores of candidate words;
determining part-of-speech scores of the candidate words according to the strength of the corresponding relation between each candidate word and the identified emotion category of the user, wherein the part-of-speech score is higher when the corresponding relation is stronger;
weighting and summing the candidate score of the candidate word and the part-of-speech score to obtain a final score of the candidate word;
and determining candidate words to be output according to the final score.
2. The method according to claim 1, wherein the above is any one of the following: text, speech, pictures.
3. The method of claim 1, wherein:
the emotion recognition model is a deep learning model; the extracting the text information includes: acquiring the word sequence corresponding to the above, and determining word vectors of words in the word sequence; or alternatively
The emotion recognition model is an SVM or a decision tree; the extracting the text information includes: acquiring the word sequence corresponding to the above, and determining the ID of each word in the word sequence.
4. The method of claim 1, wherein the determining candidate words to be output according to the final score comprises:
selecting a set number of candidate words as candidate words to be output according to the sequence from high to low of the final score; or alternatively
And selecting the candidate words with the final score larger than the set threshold value as candidate words to be output.
5. A candidate word generation device, the device comprising:
the above acquisition module is used for acquiring the above in real time and obtaining candidate words;
the candidate word acquisition module is used for acquiring candidate words;
the emotion recognition module is used for extracting the text information; acquiring auxiliary information, wherein the auxiliary information comprises any one or more of the following: current environmental information, location information, user body information; pre-constructing an emotion recognition model, recording historical input information, wherein the historical input information comprises candidate words selected by a user each time, taking the recorded historical input information each time as a training sample, and re-training by using the training sample on the basis of pre-constructed emotion recognition model parameters after the training sample reaches a certain number to obtain a personalized emotion recognition model matched with the user; obtaining emotion categories of the user by using the text information, the auxiliary information and the personalized emotion recognition model corresponding to the user;
the screening module is used for determining part-of-speech scores of the candidate words according to the strength of the corresponding relation between each candidate word and the identified emotion category of the user, wherein the part-of-speech score is higher when the corresponding relation is stronger; weighting and summing the candidate score of the candidate word and the part-of-speech score to obtain a final score of the candidate word; and determining candidate words to be output according to the final score.
6. The apparatus of claim 5, wherein the foregoing is any one of: text, speech, pictures.
7. The apparatus of claim 5, wherein:
the emotion recognition model is a deep learning model, and the text processing unit is specifically used for acquiring the word sequence corresponding to the text and determining word vectors of words in the word sequence; or alternatively
The emotion recognition model is a classification model; the text processing unit is specifically configured to obtain the word sequence corresponding to the above, and determine an ID of each word in the word sequence.
8. The apparatus of claim 5, wherein:
the candidate word output module is specifically configured to select a set number of candidate words as candidate words to be output according to a sequence from high to low of the final score; or selecting the candidate words with the final scores larger than the set threshold value as candidate words to be output.
9. A computer device, comprising: one or more processors, memory;
the memory is for storing computer executable instructions and the processor is for executing the computer executable instructions to implement the method of any one of claims 1 to 4.
10. A readable storage medium having stored thereon instructions that are executed to implement the method of any of claims 1 to 4.
CN201810948159.4A 2018-08-20 2018-08-20 Candidate word generation method and device Active CN110858099B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810948159.4A CN110858099B (en) 2018-08-20 2018-08-20 Candidate word generation method and device


Publications (2)

Publication Number Publication Date
CN110858099A CN110858099A (en) 2020-03-03
CN110858099B true CN110858099B (en) 2024-04-12

Family

ID=69635827

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810948159.4A Active CN110858099B (en) 2018-08-20 2018-08-20 Candidate word generation method and device

Country Status (1)

Country Link
CN (1) CN110858099B (en)

Families Citing this family (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113589953B (en) * 2020-04-30 2024-07-26 北京搜狗科技发展有限公司 Information display method and device and electronic equipment
CN112818841B (en) * 2021-01-29 2024-10-29 北京搜狗科技发展有限公司 Method and related device for identifying emotion of user
CN115437510A (en) * 2022-09-23 2022-12-06 联想(北京)有限公司 Data display method and device

Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101581978A (en) * 2008-05-12 2009-11-18 欧姆龙株式会社 Character input program, character input device, and character input method
CN102915731A (en) * 2012-10-10 2013-02-06 百度在线网络技术(北京)有限公司 Method and device for recognizing personalized speeches
CN103076892A (en) * 2012-12-31 2013-05-01 百度在线网络技术(北京)有限公司 Method and equipment for providing input candidate items corresponding to input character string
CN103544246A (en) * 2013-10-10 2014-01-29 清华大学 Method and system for constructing multi-emotion dictionary for internet
KR101534141B1 (en) * 2014-08-05 2015-07-07 성균관대학교산학협력단 Rationale word extraction method and apparatus using genetic algorithm, and sentiment classification method and apparatus using said rationale word
CN104866091A (en) * 2015-03-25 2015-08-26 百度在线网络技术(北京)有限公司 Method and device for outputting audio-effect information in computer equipment
CN106527752A (en) * 2016-09-23 2017-03-22 百度在线网络技术(北京)有限公司 Method and device for providing input candidate items
CN107807920A (en) * 2017-11-17 2018-03-16 新华网股份有限公司 Construction method, device and the server of mood dictionary based on big data
CN107943789A (en) * 2017-11-17 2018-04-20 新华网股份有限公司 Mood analysis method, device and the server of topic information
CN108255316A (en) * 2018-01-23 2018-07-06 广东欧珀移动通信有限公司 Dynamic adjusts method, electronic device and the computer readable storage medium of emoticon

Family Cites Families (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2016004425A1 (en) * 2014-07-04 2016-01-07 Intelligent Digital Avatars, Inc. Systems and methods for assessing, verifying and adjusting the affective state of a user
CN108125673B (en) * 2016-12-01 2023-03-14 松下知识产权经营株式会社 Biological information detection device


Non-Patent Citations (2)

* Cited by examiner, † Cited by third party
Title
Actual Emotion and False Emotion Classification by Physiological Signal; Chi Jung Kim; 2015 8th International Conference on Signal Processing, Image Processing and Pattern Recognition (SIP); 2016-03-14; pp. 21-24 *
Research on a Chaoshan-Language Input Method and Language Information Analysis; Xin Liping; China Master's Theses Full-text Database (Information Science and Technology); 2006-12-15; vol. 2006, no. 12; F084-306 *

Also Published As

Publication number Publication date
CN110858099A (en) 2020-03-03

Similar Documents

Publication Publication Date Title
WO2021077529A1 (en) Neural network model compressing method, corpus translation method and device thereof
CN107291690B (en) Punctuation adding method and device and punctuation adding device
EP3852044A1 (en) Method and device for commenting on multimedia resource
WO2021128880A1 (en) Speech recognition method, device, and device for speech recognition
CN107291704B (en) Processing method and device for processing
CN107564526B (en) Processing method, apparatus and machine-readable medium
CN111753917B (en) Data processing method, device and storage medium
CN109961791A (en) Voice information processing method, device and electronic equipment
CN108803890B (en) Input method, input device and input device
CN111210844B (en) Method, device and equipment for determining speech emotion recognition model and storage medium
CN110858099B (en) Candidate word generation method and device
CN108399914A (en) Speech recognition method and apparatus
CN110389667A (en) Input method and device
CN107274903A (en) Text handling method and device, the device for text-processing
CN108628813A (en) Treating method and apparatus, the device for processing
CN110968246A (en) Intelligent Chinese handwriting input recognition method and device
CN112631435B (en) Input method, device, equipment and storage medium
CN112036174B (en) Punctuation marking method and device
CN111381685B (en) A sentence association method and device
CN110908523B (en) Input method and device
CN112579767A (en) Search processing method and device for search processing
CN111400443B (en) Information processing method, device and storage medium
CN115146633A (en) Keyword identification method and device, electronic equipment and storage medium
CN113515618B (en) Voice processing method, device and medium
CN110781270B (en) Method and device for constructing non-keyword model in decoding network

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TG01 Patent term adjustment