CN115858747A - Clustering-combined Prompt structure intention identification method, device, equipment and storage medium - Google Patents
Clustering-combined Prompt structure intention identification method, device, equipment and storage medium
- Publication number
- CN115858747A (application CN202211475785.9A)
- Authority
- CN
- China
- Prior art keywords
- text
- clustering
- prompt
- center
- intention
- Prior art date
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Landscapes
- Machine Translation (AREA)
Description
Technical Field
The present invention relates to the field of computer technology, and in particular to a clustering-combined Prompt structure intent recognition method, apparatus, device, and storage medium.
Background
Many methods have been proposed for representing the text of each dialogue turn in human-machine dialogue. The text representation methods in common use today typically train a language model without supervision on a large-scale corpus and then fine-tune the parameters of the trained model to obtain text representations for the intent classification task. Pre-training plus parameter fine-tuning requires a large amount of labeled data for the downstream task, which is costly in time and labor. Combining a pre-trained language model with a Prompt structure makes it possible to obtain the user's intent in a zero-shot manner, but building the Prompt structure depends heavily on manual experience, which can cause the intent prediction results to fluctuate.
Summary of the Invention
The object of the present invention is to provide a clustering-combined Prompt structure intent recognition method, apparatus, device, and storage medium.
The present invention provides a clustering-combined Prompt structure intent recognition method, comprising the steps of:
acquiring dialogue text, and performing unsupervised model training on the dialogue text to obtain a pre-trained language model;
clustering the dialogue text, obtaining the cluster centers from the clustering result, and taking the text closest to each cluster center as the intent center text;
constructing a Prompt template based on the intent center text, filling the dialogue text into the slot of the Prompt template, and outputting the Prompt constructed text;
feeding the Prompt constructed text to the pre-trained language model to judge whether the intent of the input dialogue text is consistent with that of the intent center text.
As a further improvement of the present invention, acquiring the dialogue text and performing unsupervised model training on it to obtain a pre-trained language model further comprises:
constructing sentence-pattern templates from the dialogue text and automatically generating a training corpus;
training on the dialogue text and the training corpus to obtain the pre-trained language model.
As a further improvement of the present invention, performing unsupervised model training on the dialogue text to obtain a pre-trained language model specifically comprises:
training on the dialogue text with a BERT, ELECTRA, or GPT model to obtain the pre-trained language model.
As a further improvement of the present invention, clustering the dialogue text specifically comprises:
clustering the dialogue text with the K-means clustering algorithm, selecting the number of cluster centers K, and partitioning the text into clusters;
selecting, in each text cluster, the text closest to the cluster center as the representative text of that cluster, and combining the representative texts to generate the intent center text.
As a further improvement of the present invention, selecting the number of cluster centers K specifically comprises:
computing the silhouette coefficient of the clustering and selecting the optimal number of cluster centers K based on the magnitude of the silhouette coefficient.
As a further improvement of the present invention, clustering the dialogue text further comprises:
for the representative texts of the text clusters, determining the text category manually to form the intent center text.
The present invention also provides a clustering-combined Prompt structure intent recognition apparatus, comprising:
a model training module configured to acquire dialogue text and perform unsupervised model training on the dialogue text to obtain a pre-trained language model;
a clustering module configured to cluster the dialogue text, obtain the cluster centers from the clustering result, and take the text closest to each cluster center as the intent center text;
a Prompt template construction module configured to construct a Prompt template based on the intent center text, fill the dialogue text into the slot of the Prompt template, and output the Prompt constructed text;
a judging module configured to feed the Prompt constructed text to the pre-trained language model to judge whether the intent of the input dialogue text is consistent with that of the intent center text.
The present invention also provides an electrical appliance, comprising:
a memory for storing executable instructions;
a processor that, when running the executable instructions stored in the memory, implements the clustering-combined Prompt structure intent recognition method described above.
The present invention also provides a refrigerator, comprising:
a memory for storing executable instructions;
a processor that, when running the executable instructions stored in the memory, implements the clustering-combined Prompt structure intent recognition method described above.
The present invention also provides a computer-readable storage medium storing executable instructions which, when executed by a processor, implement the clustering-combined Prompt structure intent recognition method described above.
The beneficial effects of the present invention are as follows. By constructing a zero-shot Prompt structure, the present invention can recognize the intent of a text without labeling large amounts of text with intent tags, which reduces the cost of dialogue intent recognition. In addition, while the Prompt structure is being built, clustering is used to discover the intent center texts, so the Prompt structure needed for intent recognition is constructed automatically. This avoids the fluctuation in intent prediction results that arises when the Prompt structure is constructed by hand.
Brief Description of the Drawings
Fig. 1 is a schematic diagram of the steps of the clustering-combined Prompt structure intent recognition method in an embodiment of the present invention.
Fig. 2 is a schematic diagram of the steps for clustering the dialogue text in an embodiment of the present invention.
Detailed Description
To make the objects, technical solutions, and advantages of the present invention clearer, the technical solutions are described clearly and completely below with reference to specific embodiments and the corresponding drawings. The described embodiments are evidently only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art from the embodiments herein without creative effort fall within the scope of protection of the present invention.
Embodiments of the present invention are described in detail below, with examples shown in the drawings, in which identical or similar reference numerals throughout denote identical or similar elements or elements with identical or similar functions. The embodiments described below with reference to the drawings are exemplary, serve only to explain the present invention, and are not to be construed as limiting it.
This embodiment provides a clustering-combined Prompt structure intent recognition method. Prompt refers to prompt learning: without significantly changing the structure or parameters of the pre-trained language model, a Prompt template is constructed so that additional text is added at the input of the language model, supplying hint information for performing intent recognition as the downstream task. In this embodiment, the dialogue text is clustered to generate intent center texts automatically, and semantically appropriate Prompt templates are constructed automatically from them, which avoids the fluctuation in intent prediction results that arises when the Prompt structure is built by hand. The method is described here using a smart refrigerator as an example; its application to other devices can follow this embodiment.
In smart applications of electrical appliances such as refrigerators, text classification is a common natural language processing task that requires the intent of the input text to be judged correctly. Intent judgment means identifying the user's specific usage intent from the input corpus data. For example, the input corpus can be divided as a whole into usage intents belonging to several broad domains, and each intent category has a corresponding training corpus for training an intent classification model. In this embodiment, the usage intents of a smart refrigerator may include recipe lookup, reminders about food in the refrigerator, music playback, news broadcasts, and so on.
As shown in Fig. 1, the clustering-combined Prompt structure intent recognition method comprises the steps of:
S1: Acquire dialogue text, and perform unsupervised model training on the dialogue text to obtain a pre-trained language model.
S2: Cluster the dialogue text, obtain the cluster centers from the clustering result, and take the text closest to each cluster center as the intent center text.
S3: Construct a Prompt template based on the intent center text, fill the dialogue text into the slot of the Prompt template, and output the Prompt constructed text.
S4: Feed the Prompt constructed text to the pre-trained language model to judge whether the intent of the input dialogue text is consistent with that of the intent center text.
Step S1 specifically comprises:
training on the dialogue text with a BERT, ELECTRA, or GPT model to obtain the pre-trained language model.
The dialogue text here refers to text transcribed from the interrogative or imperative utterances that the user speaks to the smart electronic device, or to a client terminal communicatively connected to it. In this embodiment, for example, the user may ask questions such as "What vegetables are in the refrigerator today?" or "What recipes are recommended today?", or issue commands such as "Remind me about the yogurt in the refrigerator that is about to expire" or "List the fruits in season". Based on this information, the processor of the smart refrigerator performs speech recognition and then determines the user's usage intent with the method provided by the present invention.
Unsupervised model training means that the model learns, from unlabeled data, the ability to extract, represent, and predict data features, which indirectly also serves as data augmentation. Unsupervised pre-training on a large-scale text corpus learns general language representations well, which benefits downstream tasks. An unsupervised language model also provides a better model initialization, yielding a model that generalizes better and converges faster on the target task.
Depending on the language model used, different text preprocessing steps are carried out before model training. For example, when a BERT model is used, the dialogue text is first converted into text data in the BERT input format, and the text data is then padded so that all sequences have the same length.
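The padding step can be illustrated with a minimal sketch. It assumes the dialogue text has already been converted to lists of integer token IDs; the IDs, the padding ID of 0, and the function name `pad_batch` are illustrative assumptions, not details taken from the patent or from any particular tokenizer.

```python
from typing import List, Tuple

PAD_ID = 0  # assumed padding ID; a real tokenizer defines its own


def pad_batch(token_ids: List[List[int]], pad_id: int = PAD_ID) -> Tuple[List[List[int]], List[List[int]]]:
    """Right-pad every sequence to the longest length in the batch.

    Returns the padded sequences and matching attention masks
    (1 for real tokens, 0 for padding).
    """
    max_len = max((len(seq) for seq in token_ids), default=0)
    padded, masks = [], []
    for seq in token_ids:
        gap = max_len - len(seq)
        padded.append(seq + [pad_id] * gap)
        masks.append([1] * len(seq) + [0] * gap)
    return padded, masks


# Hypothetical token IDs for two utterances of different lengths.
batch = [[101, 2769, 102], [101, 2769, 4263, 872, 102]]
padded, masks = pad_batch(batch)
```

The attention mask is returned alongside the padded IDs because models of this kind normally need to know which positions are padding.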
In other embodiments of the present invention, other common unsupervised pre-training models may also be used, with the text preprocessing steps adjusted to the model type.
Further, in some embodiments of the present invention, step S1 also comprises:
constructing sentence-pattern templates from the dialogue text and automatically generating a training corpus;
training on the dialogue text and the training corpus to obtain the pre-trained language model.
In actual use, the amount of real user dialogue text is small, or the corpus is unevenly distributed across intent categories, so corpus data needs to be generated automatically to supplement the text data for model training. The corpus generation method may follow the prior art and is not described further here.
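The patent leaves corpus generation to the prior art. One simple possibility is slot filling over sentence-pattern templates, sketched below. The templates, slot values, and intent names are hypothetical and serve only to show how a small set of patterns can be expanded into a larger, more balanced corpus.

```python
from itertools import product
from typing import Dict, List, Tuple

# Hypothetical sentence-pattern templates, grouped by intent category.
TEMPLATES: Dict[str, List[str]] = {
    "add_food": [
        "put the {food} into the refrigerator",
        "add {food} to the refrigerator",
    ],
    "query_food": ["is there any {food} in the refrigerator"],
}

# Hypothetical slot values.
SLOTS: Dict[str, List[str]] = {"food": ["eggplant", "potato", "yogurt"]}


def generate_corpus(templates: Dict[str, List[str]], slots: Dict[str, List[str]]) -> List[Tuple[str, str]]:
    """Expand every template with every slot value.

    Returns (intent, sentence) pairs; the intent tag is kept only so the
    generated corpus can be balanced across categories.
    """
    corpus = []
    for intent, patterns in templates.items():
        for pattern, food in product(patterns, slots["food"]):
            corpus.append((intent, pattern.format(food=food)))
    return corpus


corpus = generate_corpus(TEMPLATES, SLOTS)
```

For unsupervised pre-training only the sentences would be used; the tags make it easy to check how many sentences each category contributes.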
By training the language model on both the actual user dialogue text and the large-scale corpus generated from it, it is possible to [...]
As shown in Fig. 2, step S2 specifically comprises:
S21: Cluster the dialogue text with the K-means clustering algorithm, select the optimal number of cluster centers K using the silhouette coefficient, and partition the text into clusters.
S22: In each text cluster, select the text closest to the cluster center as the representative text of that cluster, and combine the representative texts to generate the intent center text.
The intent center text here is the constructed text that best reflects the intent of a class of dialogue text; it can be regarded as representing the intent of that class.
By clustering the dialogue text and taking the text closest to each cluster center as the most representative text of its cluster, intent center texts that reflect the intent of the dialogue text can be generated automatically from a large amount of text data. The generation method is simple, efficient, and accurate. In the subsequent process the intent center text is built into a Prompt template, which effectively avoids the fluctuation in intent prediction results that arises when Prompt templates are constructed manually.
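Selecting the representative text of each cluster can be sketched as follows. This is a minimal pure-Python K-means over toy two-dimensional vectors, assuming each text has already been mapped to a vector. The vectors, the evenly spaced initialization, and the function names are illustrative assumptions; the patent does not specify them.

```python
import math
from typing import Dict, List, Sequence, Tuple


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def kmeans(points: List[Sequence[float]], k: int, iters: int = 20) -> Tuple[List[List[float]], List[int]]:
    """Plain K-means with a deterministic, evenly spaced initialization (an assumption)."""
    centers = [list(points[i * len(points) // k]) for i in range(k)]
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: each point goes to its nearest center.
        labels = [min(range(k), key=lambda c: dist(p, centers[c])) for p in points]
        # Update step: each center moves to the mean of its members.
        for c in range(k):
            members = [p for p, l in zip(points, labels) if l == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return centers, labels


def representatives(texts: List[str], points, centers, labels) -> Dict[int, str]:
    """For each cluster, return the text whose vector is closest to the center."""
    reps = {}
    for c, center in enumerate(centers):
        members = [i for i, l in enumerate(labels) if l == c]
        if members:
            reps[c] = texts[min(members, key=lambda i: dist(points[i], center))]
    return reps


texts = [
    "put the eggplant into the refrigerator",
    "add potatoes to the refrigerator",
    "store the yogurt in the refrigerator",
    "play some music",
    "play a song",
    "play the next track",
]
# Toy 2-D vectors standing in for real text features.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
centers, labels = kmeans(points, k=2)
reps = representatives(texts, points, centers, labels)
```

Because a cluster center is a mean vector and usually does not coincide with any real utterance, the nearest actual text is what ends up in the Prompt template.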
Specifically, step S21 comprises: preprocessing the dialogue text, including word segmentation and stop-word removal; computing text features and building a vector space model; and clustering with the K-means algorithm.
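The preprocessing and vector space model can be sketched with TF-IDF features, one common choice that the patent does not name explicitly. The sketch splits English text on whitespace as a stand-in for Chinese word segmentation, which would normally require a dedicated segmenter; the stop-word list and the smoothed IDF formula are likewise assumptions.

```python
import math
from collections import Counter
from typing import List, Tuple

# Illustrative stop-word list (assumption).
STOP_WORDS = {"the", "to", "in", "is", "any", "there"}


def tokenize(text: str) -> List[str]:
    """Whitespace tokenization plus stop-word removal (stand-in for word segmentation)."""
    return [w for w in text.lower().split() if w not in STOP_WORDS]


def tfidf_vectors(texts: List[str]) -> Tuple[List[str], List[List[float]]]:
    """Build L2-normalized TF-IDF vectors over a shared vocabulary."""
    docs = [tokenize(t) for t in texts]
    vocab = sorted({w for d in docs for w in d})
    n = len(docs)
    # Document frequency: number of documents containing each word.
    df = Counter(w for d in docs for w in set(d))
    # Smoothed inverse document frequency.
    idf = {w: math.log((1 + n) / (1 + df[w])) + 1.0 for w in vocab}
    vectors = []
    for d in docs:
        tf = Counter(d)
        vec = [tf[w] * idf[w] for w in vocab]
        norm = math.sqrt(sum(x * x for x in vec)) or 1.0
        vectors.append([x / norm for x in vec])
    return vocab, vectors


vocab, vectors = tfidf_vectors([
    "add potato to the refrigerator",
    "put the eggplant in the refrigerator",
])
```

The resulting vectors can be passed directly to a K-means implementation; normalizing them to unit length keeps long utterances from dominating the distance computation.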
Further, in step S21, the silhouette coefficient of the clustering is computed for the candidate values of K, and the value of K at which the silhouette coefficient reaches its maximum is selected as the optimal number of cluster centers.
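For reference, the silhouette coefficient of a point is s = (b - a) / max(a, b), where a is its mean distance to the other members of its own cluster and b is the smallest mean distance to the members of any other cluster; the score of a clustering is the mean of s over all points. The sketch below computes this for two hand-written candidate labelings of toy vectors and picks the K with the highest score. The vectors and labelings are illustrative assumptions; in practice each labeling would come from running K-means with that K.

```python
import math
from typing import Dict, List, Sequence


def dist(a: Sequence[float], b: Sequence[float]) -> float:
    """Euclidean distance between two vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def silhouette_score(points: List[Sequence[float]], labels: List[int]) -> float:
    """Mean silhouette coefficient; points in singleton clusters contribute 0."""
    clusters: Dict[int, List[int]] = {}
    for i, l in enumerate(labels):
        clusters.setdefault(l, []).append(i)
    if len(clusters) < 2:
        return 0.0
    total = 0.0
    for i, l in enumerate(labels):
        same = [j for j in clusters[l] if j != i]
        if not same:
            continue  # singleton cluster: s = 0 by convention
        a = sum(dist(points[i], points[j]) for j in same) / len(same)
        b = min(
            sum(dist(points[i], points[j]) for j in idx) / len(idx)
            for other, idx in clusters.items()
            if other != l
        )
        if max(a, b) > 0:
            total += (b - a) / max(a, b)
    return total / len(points)


# Toy 2-D vectors: two well-separated groups of three points each.
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.3), (5.0, 5.1), (5.2, 4.9), (4.9, 5.3)]
# Hypothetical labelings for two candidate values of K.
candidates = {2: [0, 0, 0, 1, 1, 1], 3: [0, 0, 2, 1, 1, 1]}
best_k = max(candidates, key=lambda k: silhouette_score(points, candidates[k]))
```

Splitting the first group into two clusters lowers the score, so the two-cluster labeling is selected.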
Further, in step S2, clustering the dialogue text also comprises:
for the representative texts of the text clusters, determining the text category manually to form the intent center text.
For the clustered text, the text category is determined manually to further improve the accuracy of the intent center text.
In other embodiments of the present invention, other common clustering algorithms may also be used to cluster the dialogue text.
In step S3, a Prompt template is constructed from the intent center text; a concrete example is given here. If the intent center text obtained in step S2 is "put the eggplant into the refrigerator", the constructed Prompt template is: Is [X] consistent with "put the eggplant into the refrigerator"? [MASK]. Here [X] is the input dialogue text, and [MASK] marks the position of the prediction result, i.e., consistent or inconsistent.
For the dialogue text "add potatoes to the refrigerator", filling it into the above Prompt template yields the Prompt constructed text: Is "add potatoes to the refrigerator" consistent with "put the eggplant into the refrigerator"? [MASK]. The Prompt constructed text is fed to the pre-trained language model for sentence-intent matching, which outputs consistent or inconsistent; the intent of the dialogue text is thereby determined against the intent center text.
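The template construction and the final judgment can be sketched as follows. The English template wording, the function names, and the word-to-verdict mapping are illustrative assumptions. The scores passed to `judge` stand in for whatever a masked language model would assign to candidate words at the [MASK] position; no real model is called here.

```python
from typing import Dict, Optional

MASK = "[MASK]"


def build_template(center_text: str) -> str:
    """Build a Prompt template around one intent center text; [X] is the slot."""
    return f'Is [X] consistent with "{center_text}"? {MASK}'


def fill_template(template: str, dialogue_text: str) -> str:
    """Fill the [X] slot with the input dialogue text."""
    return template.replace("[X]", f'"{dialogue_text}"', 1)


def judge(mask_scores: Dict[str, float], verbalizer: Optional[Dict[str, bool]] = None) -> bool:
    """Map the best-scoring candidate word at [MASK] to consistent/inconsistent.

    mask_scores is a hypothetical word -> score mapping produced by a
    language model; the default verbalizer is an assumption.
    """
    verbalizer = verbalizer or {"yes": True, "no": False}
    best = max(verbalizer, key=lambda w: mask_scores.get(w, float("-inf")))
    return verbalizer[best]


template = build_template("put the eggplant into the refrigerator")
prompt = fill_template(template, "add potatoes to the refrigerator")
# Hypothetical scores; a real system would read them from the model output.
is_consistent = judge({"yes": 0.8, "no": 0.2})
```

Restricting the decision to a small set of candidate words keeps the output binary even though the model's vocabulary at the [MASK] position is much larger.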
Based on the same inventive concept, the present invention also provides a clustering-combined Prompt structure intent recognition apparatus, comprising:
a model training module configured to acquire dialogue text and perform unsupervised model training on the dialogue text to obtain a pre-trained language model;
a clustering module configured to cluster the dialogue text, obtain the cluster centers from the clustering result, and take the text closest to each cluster center as the intent center text;
a Prompt template construction module configured to construct a Prompt template based on the intent center text, fill the dialogue text into the slot of the Prompt template, and output the Prompt constructed text;
a judging module configured to feed the Prompt constructed text to the pre-trained language model to judge whether the intent of the input dialogue text is consistent with that of the intent center text.
Based on the same inventive concept, the present invention also provides an electrical appliance, comprising:
a memory for storing executable instructions;
a processor that, when running the executable instructions stored in the memory, implements the clustering-combined Prompt structure intent recognition method described above.
Based on the same inventive concept, the present invention also provides a refrigerator, comprising:
a memory for storing executable instructions;
a processor that, when running the executable instructions stored in the memory, implements the clustering-combined Prompt structure intent recognition method described above.
Based on the same inventive concept, the present invention also provides a computer-readable storage medium storing executable instructions, characterized in that the executable instructions, when executed by a processor, implement the clustering-combined Prompt structure intent recognition method described above.
In summary, by constructing a zero-shot Prompt structure, this embodiment can recognize the intent of a text without labeling large amounts of text with intent tags, which reduces the cost of dialogue intent recognition. In addition, while the Prompt structure is being built, clustering is used to discover the intent center texts, so the Prompt structure needed for intent recognition is constructed automatically. This avoids the fluctuation in intent prediction results that arises when the Prompt structure is constructed by hand.
It should be understood that although this specification is organized by embodiment, an embodiment does not necessarily contain only one independent technical solution. The specification is written this way only for clarity; those skilled in the art should treat it as a whole, and the technical solutions of the embodiments may be combined as appropriate to form other embodiments that those skilled in the art can understand.
The detailed descriptions above are only specific descriptions of feasible embodiments of the present invention and are not intended to limit its scope of protection. Any equivalent embodiment or modification that does not depart from the technical spirit of the present invention shall fall within its scope of protection.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211475785.9A CN115858747A (en) | 2022-11-23 | 2022-11-23 | Clustering-combined Prompt structure intention identification method, device, equipment and storage medium |
Applications Claiming Priority (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202211475785.9A CN115858747A (en) | 2022-11-23 | 2022-11-23 | Clustering-combined Prompt structure intention identification method, device, equipment and storage medium |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115858747A (en) | 2023-03-28 |
Family
ID=85665440
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202211475785.9A Pending CN115858747A (en) | 2022-11-23 | 2022-11-23 | Clustering-combined Prompt structure intention identification method, device, equipment and storage medium |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115858747A (en) |
Patent Citations (8)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
JPH05174019A (en) * | 1991-12-20 | 1993-07-13 | Mitsubishi Electric Corp | Sentence evaluation system |
CN110377911A (en) * | 2019-07-23 | 2019-10-25 | 中国工商银行股份有限公司 | Intension recognizing method and device under dialogue frame |
US20210319178A1 (en) * | 2020-04-12 | 2021-10-14 | Salesforce.Com, Inc. | Autocomplete of user entered text |
WO2022198750A1 (en) * | 2021-03-26 | 2022-09-29 | 南京邮电大学 | Semantic recognition method |
CN113704429A (en) * | 2021-08-31 | 2021-11-26 | 平安普惠企业管理有限公司 | Semi-supervised learning-based intention identification method, device, equipment and medium |
CN114492363A (en) * | 2022-04-15 | 2022-05-13 | 苏州浪潮智能科技有限公司 | A small sample fine-tuning method, system and related device |
CN114970523A (en) * | 2022-05-20 | 2022-08-30 | 浙江省科技信息研究院 | Topic prompting type keyword extraction method based on text semantic enhancement |
CN114913953A (en) * | 2022-07-19 | 2022-08-16 | 北京惠每云科技有限公司 | Medical entity relationship identification method and device, electronic equipment and storage medium |
Non-Patent Citations (1)
Title |
---|
李学宁: "《形容词修饰语语义计算理论及其在对外汉语学习词典编纂中的应用》", 31 May 2012, 世界图书上海出版公司, pages: 96 * |
Cited By (2)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN116580408A (en) * | 2023-06-06 | 2023-08-11 | 上海任意门科技有限公司 | Image generation method and device, electronic equipment and storage medium |
CN116580408B (en) * | 2023-06-06 | 2023-11-03 | 上海任意门科技有限公司 | Image generation method and device, electronic equipment and storage medium |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||