
CN114519094A - Method and device for conversational recommendation based on random state and electronic equipment - Google Patents


Info

Publication number
CN114519094A
Authority
CN
China
Prior art keywords
sentence
score
statement
voice information
dialect
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202210143900.6A
Other languages
Chinese (zh)
Inventor
沈越
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ping An Puhui Enterprise Management Co Ltd
Original Assignee
Ping An Puhui Enterprise Management Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ping An Puhui Enterprise Management Co Ltd filed Critical Ping An Puhui Enterprise Management Co Ltd
Priority to CN202210143900.6A priority Critical patent/CN114519094A/en
Publication of CN114519094A publication Critical patent/CN114519094A/en

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F16/00 Information retrieval; Database structures therefor; File system structures therefor
    • G06F16/30 Information retrieval; Database structures therefor; File system structures therefor of unstructured textual data
    • G06F16/33 Querying
    • G06F16/332 Query formulation
    • G06F16/3329 Natural language query formulation
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/20 Natural language analysis
    • G06F40/205 Parsing
    • G06F40/211 Syntactic parsing, e.g. based on context-free grammar [CFG] or unification grammars
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F40/00 Handling natural language data
    • G06F40/30 Semantic analysis
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Computational Linguistics (AREA)
  • Artificial Intelligence (AREA)
  • General Physics & Mathematics (AREA)
  • General Engineering & Computer Science (AREA)
  • Mathematical Physics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Data Mining & Analysis (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Software Systems (AREA)
  • Evolutionary Computation (AREA)
  • Audiology, Speech & Language Pathology (AREA)
  • Human Computer Interaction (AREA)
  • Databases & Information Systems (AREA)
  • Information Retrieval, Db Structures And Fs Structures Therefor (AREA)

Abstract

The application discloses a random-state-based dialogue script recommendation method, apparatus, and electronic device. The method includes: performing text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information; querying historical dialogue data according to the first sentence to obtain a second sentence, where the second sentence occurs earlier than the first sentence and the absolute value of the difference between their occurrence times is the smallest; performing intention extraction on the second sentence to obtain a first intention feature; generating a state tracking label for the first sentence according to the first intention feature; inputting the first sentence, the state tracking label, and at least one dialogue script into a scoring model to obtain at least one first score; multiplying each of the at least one first score by a random function to obtain at least one second score; and recommending the dialogue script corresponding to the largest of the at least one second score to a reply device.

Description

Random-state-based dialogue script recommendation method, apparatus, and electronic device

Technical Field

The present invention relates to the technical field of artificial intelligence, and in particular to a random-state-based dialogue script recommendation method, apparatus, and electronic device.

Background

At present, conventional dialogue script recommendation models mostly recommend from a top-N list produced by collaborative filtering or by a ranking model. For most scenarios, such models meet the requirements. In human-facing dialogue, however, both collaborative filtering and ranking models reply by extracting the high-frequency scripts that correspond to high-frequency intentions. As a result, whenever a customer asks a question, even within the same conversation, the replies are the same few high-frequency scripts repeated over and over as long as the intentions are similar. This creates a vicious circle: high-frequency scripts are always recommended, which pushes their frequency even higher, increases their weight in later retrievals, and makes them still more likely to be selected again. The whole conversation therefore feels rigid, lacks interest and novelty, and gives a poor user experience.

Summary of the Invention

To solve the above problems in the prior art, embodiments of the present application provide a random-state-based dialogue script recommendation method, apparatus, and electronic device that can make human-facing dialogue more engaging and creative, thereby improving the user experience.

In a first aspect, an embodiment of the present application provides a random-state-based dialogue script recommendation method, including:

performing text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information;

querying historical dialogue data according to the first sentence to obtain a second sentence, where the historical dialogue data records the dialogue data generated before the current moment in the dialogue event to which the first sentence belongs, the second sentence occurs earlier than the first sentence, and the absolute value of the difference between the occurrence time of the second sentence and that of the first sentence is the smallest;

performing intention extraction on the second sentence to obtain a first intention feature;

generating a state tracking label for the first sentence according to the first intention feature, where the state tracking label identifies the direction of the user's intention and the strength of the user's demand at the time the voice information was spoken;

inputting the first sentence, the state tracking label, and at least one dialogue script into a scoring model to obtain at least one first score, where the scoring model scores each of the at least one dialogue script and the at least one first score corresponds one-to-one to the at least one dialogue script;

multiplying each of the at least one first score by a random function to obtain at least one second score, where the at least one second score corresponds one-to-one to the at least one first score;

recommending the dialogue script corresponding to the largest of the at least one second score to a reply device, so that the reply device generates a reply sentence from the recommended script to answer the user's current voice information.
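Two of the steps above, finding the second sentence and perturbing the scores, can be sketched in a few lines of Python. This is a minimal illustration rather than the applicant's implementation: the `Utterance` type and both function names are hypothetical, the first scores are assumed to come from a scoring model that is not shown, and the "random function" is assumed here to be a uniform draw in [0, 1), which this part of the specification does not state.

```python
import random
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    timestamp: float  # seconds since the start of the dialogue event


def previous_utterance(history, current):
    """Return the latest utterance that occurred before `current`, or None."""
    earlier = [u for u in history if u.timestamp < current.timestamp]
    # Among earlier utterances, the latest one minimises |t_second - t_first|.
    return max(earlier, key=lambda u: u.timestamp) if earlier else None


def recommend_script(first_scores, rng=random):
    """Perturb each first score and return (best_script, second_scores).

    first_scores: non-empty dict mapping script text -> first score.
    rng: any object with a random() method returning a float in [0, 1)
         (assumed form of the "random function").
    """
    second_scores = {script: score * rng.random()
                     for script, score in first_scores.items()}
    # The script with the largest second score is the one recommended.
    best = max(second_scores, key=second_scores.get)
    return best, second_scores
```

Passing a seeded `random.Random` instance as `rng` makes a run reproducible, which is convenient when checking that lower-scored scripts are occasionally selected.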

In a second aspect, an embodiment of the present application provides a random-state-based dialogue script recommendation apparatus, including:

an analysis module, configured to perform text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information;

a query module, configured to query historical dialogue data according to the first sentence to obtain a second sentence, where the historical dialogue data records the dialogue data generated before the current moment in the dialogue event to which the first sentence belongs, the second sentence occurs earlier than the first sentence, and the absolute value of the difference between the occurrence time of the second sentence and that of the first sentence is the smallest;

a processing module, configured to perform intention extraction on the second sentence to obtain a first intention feature, and to generate a state tracking label for the first sentence according to the first intention feature, where the state tracking label identifies the direction of the user's intention and the strength of the user's demand at the time the voice information was spoken;

a scoring module, configured to input the first sentence, the state tracking label, and at least one dialogue script into a scoring model to obtain at least one first score, where the scoring model scores each of the at least one dialogue script and the at least one first score corresponds one-to-one to the at least one dialogue script, and to multiply each of the at least one first score by a random function to obtain at least one second score, where the at least one second score corresponds one-to-one to the at least one first score;

a recommendation module, configured to recommend the dialogue script corresponding to the largest of the at least one second score to a reply device, so that the reply device generates a reply sentence from the recommended script to answer the user's current voice information.

In a third aspect, an embodiment of the present application provides an electronic device, including a processor connected to a memory, where the memory stores a computer program and the processor executes the computer program stored in the memory so that the electronic device performs the method of the first aspect.

In a fourth aspect, an embodiment of the present application provides a computer-readable storage medium that stores a computer program, where the computer program causes a computer to perform the method of the first aspect.

In a fifth aspect, an embodiment of the present application provides a computer program product that includes a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform the method of the first aspect.

Implementing the embodiments of the present application has the following beneficial effects:

In the embodiments of the present application, the intention of the sentence (the second sentence) that immediately precedes the sentence the user is currently speaking (the first sentence) in the current dialogue is analyzed and extracted, and a state tracking label for the first sentence is generated from that intention. The first sentence and the state tracking label are then used together to score the at least one matched dialogue script, the scores are perturbed by a random function, and the script with the highest perturbed score is recommended. Because of the state tracking label, every recommendation takes into account not only the current utterance but also the intention expressed at the previous moment. This gives the reply process more initiative: the direction of the recommendation can be steered by assigning different state tracking labels, which introduces logical judgment into the model and makes its output logically interpretable. At the same time, perturbing the model output with a random function adds interest and creativity to the question-and-answer exchange and improves the customer experience.

Brief Description of the Drawings

To describe the technical solutions in the embodiments of the present application more clearly, the drawings needed for the description of the embodiments are briefly introduced below. The drawings described below show only some embodiments of the present application, and a person of ordinary skill in the art can derive other drawings from them without creative effort.

FIG. 1 is a schematic diagram of the hardware structure of a random-state-based dialogue script recommendation apparatus provided by an embodiment of the present application;

FIG. 2 is a schematic flowchart of a random-state-based dialogue script recommendation method provided by an embodiment of the present application;

FIG. 3 is a schematic flowchart of a method, provided by an embodiment of the present application, for performing text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information;

FIG. 4 is a schematic flowchart of a method, provided by an embodiment of the present application, for determining the dialect category of voice information according to acoustic features;

FIG. 5 is a block diagram of the functional modules of a random-state-based dialogue script recommendation apparatus provided by an embodiment of the present application;

FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application.

Detailed Description

The technical solutions in the embodiments of the present application are described clearly and completely below with reference to the drawings in the embodiments. The described embodiments are some, not all, of the embodiments of the present application. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort fall within the protection scope of the present application.

The terms "first", "second", "third", and "fourth" in the specification, claims, and drawings of the present application are used to distinguish different objects rather than to describe a particular order. In addition, the terms "include" and "have" and any variations of them are intended to cover non-exclusive inclusion. For example, a process, method, system, product, or device that includes a series of steps or units is not limited to the listed steps or units, but optionally also includes steps or units that are not listed, or optionally also includes other steps or units inherent to the process, method, product, or device.

A reference herein to an "embodiment" means that a particular feature, result, or characteristic described in connection with the embodiment may be included in at least one embodiment of the present application. Occurrences of the phrase at various places in the specification do not necessarily all refer to the same embodiment, nor to separate or alternative embodiments that are mutually exclusive with other embodiments. Those skilled in the art understand, explicitly and implicitly, that the embodiments described herein may be combined with other embodiments.

First, refer to FIG. 1, which is a schematic diagram of the hardware structure of a random-state-based dialogue script recommendation apparatus provided by an embodiment of the present application. The random-state-based dialogue script recommendation apparatus 100 includes at least one processor 101, a communication line 102, a memory 103, and at least one communication interface 104.

In this embodiment, the processor 101 may be a general-purpose central processing unit (CPU), a microprocessor, an application-specific integrated circuit (ASIC), or one or more integrated circuits for controlling execution of the program of the present application.

The communication line 102 may include a path for transferring information between the above components.

The communication interface 104 may be any transceiver-type device (such as an antenna) used to communicate with other devices or communication networks, such as Ethernet, a RAN, or a wireless local area network (WLAN).

The memory 103 may be a read-only memory (ROM) or another type of static storage device that can store static information and instructions; a random access memory (RAM) or another type of dynamic storage device that can store information and instructions; an electrically erasable programmable read-only memory (EEPROM); a compact disc read-only memory (CD-ROM) or other optical disc storage (including compact discs, laser discs, optical discs, digital versatile discs, and Blu-ray discs); a magnetic disk storage medium or other magnetic storage device; or any other medium that can carry or store desired program code in the form of instructions or data structures and can be accessed by a computer, but is not limited to these.

In this embodiment, the memory 103 may exist independently and be connected to the processor 101 through the communication line 102, or it may be integrated with the processor 101. The memory 103 provided in the embodiments of the present application is generally non-volatile. The memory 103 stores the computer-executable instructions for carrying out the solution of the present application, and their execution is controlled by the processor 101. The processor 101 executes the computer-executable instructions stored in the memory 103 to implement the methods provided in the following embodiments of the present application.

In an optional embodiment, the computer-executable instructions may also be called application program code, which is not specifically limited in the present application.

In an optional embodiment, the processor 101 may include one or more CPUs, such as CPU0 and CPU1 in FIG. 1.

In an optional embodiment, the random-state-based dialogue script recommendation apparatus 100 may include multiple processors, such as the processor 101 and the processor 107 in FIG. 1. Each of these processors may be a single-core (single-CPU) processor or a multi-core (multi-CPU) processor. A processor here may refer to one or more devices, circuits, and/or processing cores for processing data (for example, computer program instructions).

In an optional embodiment, the random-state-based dialogue script recommendation apparatus 100 may be a server, for example an independent server or a cloud server that provides basic cloud computing services such as cloud services, cloud databases, cloud computing, cloud functions, cloud storage, network services, cloud communication, middleware services, domain name services, security services, a content delivery network (CDN), and big data and artificial intelligence platforms. In that case, the apparatus 100 may further include an output device 105 and an input device 106. The output device 105 communicates with the processor 101 and can display information in a variety of ways. For example, the output device 105 may be a liquid crystal display (LCD), a light emitting diode (LED) display device, a cathode ray tube (CRT) display device, or a projector. The input device 106 communicates with the processor 101 and can receive user input in a variety of ways. For example, the input device 106 may be a mouse, a keyboard, a touch screen device, or a sensing device.

The random-state-based dialogue script recommendation apparatus 100 may be a general-purpose device or a special-purpose device. The embodiments of the present application do not limit the type of the apparatus 100.

Second, it should be noted that the embodiments disclosed in the present application may acquire and process the relevant data based on artificial intelligence technology. Artificial intelligence (AI) is the theory, methods, technology, and application systems that use digital computers, or machines controlled by digital computers, to simulate, extend, and expand human intelligence, perceive the environment, acquire knowledge, and use that knowledge to obtain the best results.

Basic artificial intelligence technologies generally include sensors, dedicated AI chips, cloud computing, distributed storage, big data processing, operation/interaction systems, and mechatronics. AI software technologies mainly cover computer vision, robotics, biometrics, speech processing, natural language processing, and machine learning/deep learning.

Finally, the random-state-based dialogue script recommendation method of the present application can be applied to scenarios such as e-commerce sales, offline in-store sales, business promotion, AI outbound calls, and social platform promotion. The present application mainly uses the AI outbound call scenario as an example; the method is implemented similarly in the other scenarios, which are not described again here.

The random-state-based dialogue script recommendation method disclosed in the present application is described below:

Refer to FIG. 2, which is a schematic flowchart of a random-state-based dialogue script recommendation method provided by an embodiment of the present application. The method includes the following steps:

201: Perform text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information.

China's territory is vast and its terrain varied, including much complex terrain that is hard to cross. Before transportation was well developed, such terrain made it difficult for people to travel and communicate, so diverse social, historical, and cultural environments took shape, and with them a wide variety of dialects.

In this embodiment, the voice information may be voice information that the user inputs through an audio capture device. In an AI outbound call scenario, for example, it may be a sentence the user speaks through the audio capture device of a communication device to reply to, or ask a question about, speech uttered by the AI. Given the diversity and widespread use of dialects in China, practical applications therefore need to handle users who communicate in dialect.

On this basis, this embodiment provides a method for performing text conversion on the user's current voice information to obtain the first sentence corresponding to the voice information. As shown in FIG. 3, the method includes:

301: Obtain the acoustic features of the voice information.

In this embodiment, an acoustic model, such as a multi-layer long short-term memory network or a multi-layer convolutional neural network, may be trained in advance. The speech to be recognized is input into the acoustic model to extract its acoustic features. For example, the acoustic features may include the feature sequence of the speech to be recognized, the posterior probability distribution of the phonemes in the speech to be recognized, and the acoustic vector of the speech to be recognized.

Specifically, the output of a low-level network in the acoustic model may be used as the feature sequence of the speech to be recognized, and the output of a high-level network as its acoustic vector. The posterior probability distribution of the phonemes refers to the probability that each phoneme in the speech to be recognized is recognized as each of the different phonemes.
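The relationship between the three acoustic features can be illustrated with a toy two-layer network. This is a sketch under strong assumptions, not the acoustic model of the application: plain dense layers with tanh stand in for the LSTM or CNN layers, the weights are supplied by the caller, a softmax over a hypothetical phoneme projection produces the posteriors, and the high-level outputs are mean-pooled into one utterance-level acoustic vector. All names are hypothetical.

```python
import math


def dense(rows, weights, activation=math.tanh):
    """Apply activation(x @ W) to every frame x; W is a list of input rows."""
    out_dim = len(weights[0])
    return [[activation(sum(x[i] * weights[i][j] for i in range(len(x))))
             for j in range(out_dim)]
            for x in rows]


def softmax(values):
    """Numerically stable softmax over one list of logits."""
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def extract_acoustic_features(frames, w_low, w_high, w_phone):
    """Return (feature_sequence, phoneme_posteriors, acoustic_vector).

    frames: non-empty list of per-frame input vectors.
    """
    low = dense(frames, w_low)    # low-level output: per-frame feature sequence
    high = dense(low, w_high)     # high-level output, one vector per frame
    logits = dense(high, w_phone, activation=lambda v: v)
    posteriors = [softmax(row) for row in logits]  # per-frame phoneme posteriors
    # Assumption: mean-pool the high-level outputs into one acoustic vector.
    acoustic_vector = [sum(col) / len(high) for col in zip(*high)]
    return low, posteriors, acoustic_vector
```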

302: Determine the dialect category of the voice information according to the acoustic features.

This embodiment provides a method for determining the dialect category of the voice information according to the acoustic features. As shown in FIG. 4, the method includes:

401: Determine the energy distribution, prosody distribution, fundamental frequency, and average speech power of the voice information according to the acoustic features.

Specifically, the posterior probability distribution of phonemes and the acoustic vector in the acoustic features may be analyzed to obtain the energy distribution and prosody distribution of the voice information; the feature sequence in the acoustic features may be analyzed to obtain the average speech power of the voice information; and the fundamental frequency may be obtained by analyzing the pronunciation characteristics of the voice information.
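For readers unfamiliar with the two scalar quantities in step 401, the sketch below computes them directly from a raw waveform. Note that this swaps in a different technique from the one described: the application derives these values from the acoustic model's features, whereas this illustration uses the mean squared amplitude for average power and a simple autocorrelation peak search for the fundamental frequency. The function names and the 60 to 400 Hz search range are assumptions.

```python
import math


def average_power(samples):
    """Mean squared amplitude of the waveform."""
    return sum(s * s for s in samples) / len(samples)


def estimate_f0(samples, sample_rate, fmin=60.0, fmax=400.0):
    """Estimate the fundamental frequency (Hz) from the autocorrelation peak."""
    mean = sum(samples) / len(samples)
    x = [s - mean for s in samples]  # remove any DC offset
    lag_min = int(sample_rate / fmax)
    lag_max = min(int(sample_rate / fmin), len(x) - 1)

    def autocorr(lag):
        return sum(x[n] * x[n + lag] for n in range(len(x) - lag))

    # The lag with the strongest self-similarity approximates one pitch period.
    best_lag = max(range(lag_min, lag_max + 1), key=autocorr)
    return sample_rate / best_lag
```

The autocorrelation search is quadratic in the window length, so it is suited to short frames rather than whole recordings.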

402: Determine the dialect area corresponding to the voice information according to the fundamental frequency and the average speech power.

Specifically, although a dialect is used only within a certain region, it is a complete system in its own right. Every dialect has a phonetic system, a lexical system, and a grammatical system, and can meet the needs of social communication in its region. In short, the dialects within one dialect area tend to show "differences within similarity and similarity within differences". The similarity usually appears in the low-level characteristics of the sound, such as pronunciation frequency and speaking rate. In other words, for each dialect, the fundamental frequency of its pronunciation and its average speech power show a certain regional commonality.

Therefore, in this embodiment, the fundamental frequency feature and average speech power feature of the dialects in each dialect area can be extracted and stored in advance. The fundamental frequency and average speech power of the voice information are then matched against the stored features of each dialect area, for example by computing a similarity or a Euclidean distance, to determine the dialect area corresponding to the voice information.
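The matching in step 402 can be sketched as a nearest-neighbor lookup. The following is a minimal illustration only: the area names, the stored values, and the function name `match_dialect_area` are hypothetical, and the text does not say how the two features are scaled, so raw values are compared directly here.

```python
import math

# Hypothetical reference table: dialect area -> (fundamental frequency, average speech power).
# The numbers are invented purely for illustration.
AREA_PROFILES = {
    "area_a": (180.0, 62.0),
    "area_b": (210.0, 58.0),
    "area_c": (150.0, 66.0),
}

def match_dialect_area(f0, power, profiles=AREA_PROFILES):
    """Return the dialect area whose stored (f0, power) pair is closest in Euclidean distance."""
    return min(profiles, key=lambda name: math.dist((f0, power), profiles[name]))
```

In practice the two features would probably need to be normalized before distances are compared, since they are measured in different units; that step is an assumption and is left out of the sketch.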

403: Encode the energy distribution and the prosody distribution separately to obtain an energy distribution vector and a prosody distribution vector.

In this embodiment, the energy distribution reflects changes in loudness in the voice information, and the prosody distribution reflects changes in tone. Specifically, the energy distribution vector can be obtained by acquiring the energy spectrum of the voice information and applying graph embedding to it. Likewise, the prosody distribution vector can be obtained by acquiring the frequency distribution map of the voice information and applying graph embedding to it.

404: Concatenate the energy distribution vector and the prosody distribution vector vertically to obtain a timbre distribution vector.

In an optional embodiment, the energy distribution vector and the prosody distribution vector may instead be summed and averaged, and the resulting mean vector used as the timbre distribution vector.

405: Match the timbre distribution vector against the timbre library corresponding to the dialect area, and determine the dialect category of the voice information according to the matching result.

Specifically, as described in step 402, the dialects within one dialect area show "differences within similarity and similarity within differences". The differences usually appear in the mid- and high-level characteristics of the sound, such as tonal movement and sentence-final sounds.

On this basis, in this embodiment, the timbre distribution features of each dialect in each dialect area can be extracted and stored in advance. The timbre distribution vector of the voice information is then matched against the stored timbre distribution features of the dialects in the corresponding dialect area, for example by computing a similarity or a Euclidean distance, to determine the dialect category corresponding to the voice information.
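Steps 404 and 405 can be sketched together as follows. This is an illustrative sketch under stated assumptions: the vectors are plain Python lists, concatenation is done end to end, cosine similarity is chosen as the similarity measure (the text also allows Euclidean distance), and the names `timbre_vector`, `cosine_similarity`, and `match_dialect` are hypothetical.

```python
import math

def timbre_vector(energy_vec, prosody_vec):
    """Step 404: concatenate the two encoded vectors into one timbre distribution vector."""
    return list(energy_vec) + list(prosody_vec)

def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors; 0.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0

def match_dialect(timbre, library):
    """Step 405: pick the dialect in the area's timbre library with the highest similarity."""
    return max(library, key=lambda name: cosine_similarity(timbre, library[name]))
```

Here `library` stands for the timbre library of the dialect area already selected in step 402, so the search runs only over the dialects of that area rather than over all dialects.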

303: Obtain the audio transposition formula corresponding to the dialect category, and convert the voice information into standard speech using the audio transposition formula.

In this embodiment, the audio transposition formula characterizes the conversion between the pronunciation of the corresponding dialect and Mandarin pronunciation. Specifically, the audio transposition formula can convert dialect speech into the corresponding Mandarin speech, which is the standard speech referred to in this application.

In this embodiment, a large amount of text with the same content in different dialects can be collected, and the differences and regularities between each dialect and Mandarin can be determined through training, for example differences and regularities in pronunciation and in tone of voice, and correspondences between dialect-specific words. These form the audio transposition formulas for converting the different dialects into Mandarin.

304: Obtain the pinyin text of the voice information according to the standard speech.

In this embodiment, the audio features can be obtained by performing feature extraction on the standard speech, for example by spectrum conversion, nonlinear spectrum conversion, and feature coefficient conversion. The audio feature may be a mapping of the standard speech onto an auditory critical-band scale, for example its mapping in the Bark domain or in the Equivalent Rectangular Bandwidth (ERB) domain. The audio features give a quantized representation of the standard speech.

Then, the audio feature can be matched in a preset neural network to obtain the pinyin text that matches it. Specifically, the pinyin text may consist of at least one first pinyin element text, where a first pinyin element text is any single initial or final.

305: Match the pinyin text against a preset vocabulary library to obtain the first sentence.

Specifically, after the pinyin text is obtained, each of the at least one first pinyin element text in it can be matched against the preset vocabulary library to obtain at least one first character in one-to-one correspondence with the at least one first pinyin element text. Then, according to that correspondence, the first characters are arranged in the order in which their first pinyin element texts appear in the pinyin text, which yields the first sentence.

202: Query the historical dialogue data according to the first sentence to obtain a second sentence.

In this embodiment, the historical dialogue data records the dialogue data produced before the current moment by the dialogue event to which the first sentence belongs. The second sentence occurred earlier than the first sentence, and the absolute value of the difference between their occurrence times is the smallest. Put simply, the second sentence is the sentence the user said immediately before the current one.

Specifically, taking an AI outbound call as an example, the historical dialogue data can hold two associated sentence queues: one stores the user sentences uttered by the user, and the other stores the AI sentences uttered by the AI. Every user sentence and every AI sentence carries a dialogue identifier and a dialogue occurrence time. A user sentence and an AI sentence with the same dialogue identifier form a question-answer pair: either the user sentence is a reply to the AI sentence, or the AI sentence is a reply to the user sentence. This preserves the question-answer logic of the historical dialogue data while storing the user and AI sentences separately, which makes lookup easier.

Therefore, in this embodiment, the user sentence queue can be queried for the sentence whose dialogue occurrence time is earlier than that of the first sentence and differs from it by the smallest absolute value; this sentence is taken as the second sentence.

In an optional embodiment, the user's historical dialogue may be stored in order of dialogue occurrence time. In that case, the second sentence is simply the sentence immediately preceding the first sentence in the user sentence queue.
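The lookup of the second sentence can be sketched as below. This is a minimal sketch, not the implementation described in the application: the `Utterance` record, its field names, and the function `find_second_sentence` are hypothetical, and the occurrence time is assumed to be a plain number.

```python
from dataclasses import dataclass
from typing import List, Optional

@dataclass
class Utterance:
    dialogue_id: str   # shared by a user sentence and the AI sentence it pairs with
    timestamp: float   # dialogue occurrence time (assumed numeric)
    text: str

def find_second_sentence(user_queue: List[Utterance], first_time: float) -> Optional[Utterance]:
    """Return the user sentence that occurred most recently before first_time, or None."""
    earlier = [u for u in user_queue if u.timestamp < first_time]
    return max(earlier, key=lambda u: u.timestamp) if earlier else None
```

Returning `None` corresponds to the case where the first sentence opens the dialogue and no previous user sentence exists. If the queue is already kept in time order, the same result can be read off as the element just before the first sentence.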

203: Perform intent extraction on the second sentence to obtain a first intent feature.

In this embodiment, semantic extraction can be performed on the second sentence, and the resulting semantic vector matched against a preset intent library to determine the first intent feature of the second sentence. Specifically, the intent features stored in the intent library are set according to the field of application. For example, for an AI outbound call, the preset intent features may include strongly relevant items such as the interest amount, the credit limit amount, the repayment time, and the dividend ratio. During matching, the intent feature most similar to the semantic vector of the second sentence, as determined by computing similarity, is taken as the first intent feature.

204: Generate a state tracking label for the first sentence according to the first intent feature.

In this embodiment, the state tracking label identifies the direction of the user's intent and the strength of the user's demand at the time the current voice information is spoken. It can be used to judge how interested the user is, and in what direction, in the service being followed up or recommended in the current outbound call. Using it to guide the choice of subsequent replies can raise the order conversion rate of the call.

Specifically, the first demand score corresponding to the second sentence is obtained to determine the strength of the user's demand in the previous round of dialogue. Intent extraction is then performed on the first sentence, and the first demand score is updated according to the resulting second intent feature to obtain the second demand score corresponding to the first sentence, which reflects the strength of the user's current demand. Finally, the second demand score and the first intent feature are combined into a state tracking label that reflects both the direction of the user's intent and the strength of demand, for example: "previous item_first intent feature_second demand score".

For example, when the first sentence is the opening sentence of the dialogue, no previous sentence (second sentence) exists. The second sentence can then be treated as empty, and its intent as empty as well. In this case the first demand score corresponding to the second sentence can be determined by analyzing the user's historical business data, according to the user's order conversion rate in that data; assume here that it is 6. The state tracking label of the first sentence is then "previous item_0_6". When the first sentence is not the opening sentence, a previous sentence necessarily exists, and the first demand score corresponding to it can be determined by analyzing its semantics and sentiment. If the intent feature of the previous sentence is "credit limit amount" and the corresponding second demand score is 8, the state tracking label of the first sentence is "previous item_credit limit amount_8".
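The label format in the examples above can be sketched as a small helper. The function name `build_state_label` is hypothetical, and the prefix string "previous item" is only an English rendering of the label prefix used in the text; an empty intent is written as "0", as in the opening-sentence example.

```python
from typing import Optional

def build_state_label(prev_intent: Optional[str], demand_score: int,
                      prefix: str = "previous item") -> str:
    """Join prefix, previous-sentence intent feature, and demand score with underscores.

    A missing previous intent (opening sentence of the dialogue) is encoded as "0".
    """
    intent = prev_intent if prev_intent else "0"
    return f"{prefix}_{intent}_{demand_score}"
```

For instance, `build_state_label(None, 6)` yields the opening-sentence label, and `build_state_label("credit limit amount", 8)` yields the label for a sentence whose predecessor asked about the credit limit.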

This embodiment also provides a method for updating the first demand score according to the second intent feature to obtain the second demand score corresponding to the first sentence. Specifically, the second intent feature is encoded to obtain a first character string. The call parameters of the call to which the first sentence belongs are obtained, and the missing values in them are filled in to obtain the target parameters. The target parameters, the first demand score, and the first character string are concatenated into a second character string, which is input into a logistic regression model to obtain the second demand score.

In this embodiment, a missing value can be filled in one of three ways to obtain the target parameters: by replacing it with a preset substitute value, for example a specific number such as 999; or by obtaining the historical data set corresponding to the data type of the missing value and replacing the missing value with the median of the at least one data value in that data set; or by replacing the missing value with the mean of the at least one data value in the historical data set.
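The three ways of filling missing values can be sketched as follows. This is an illustration under assumptions: call parameters are held in a dictionary with `None` marking a missing value, the historical data set is a dictionary of lists keyed by parameter name, and the names `fill_missing`, `strategy`, and `preset` are hypothetical.

```python
from statistics import mean, median

def fill_missing(params, history, strategy="preset", preset=999):
    """Return a copy of params with every None replaced according to the chosen strategy.

    strategy: "preset" (fixed substitute value), "median", or "mean" of the
    historical values recorded for the same parameter.
    """
    filled = {}
    for key, value in params.items():
        if value is not None:
            filled[key] = value
        elif strategy == "preset":
            filled[key] = preset
        elif strategy == "median":
            filled[key] = median(history[key])
        elif strategy == "mean":
            filled[key] = mean(history[key])
        else:
            raise ValueError(f"unknown strategy: {strategy}")
    return filled
```

A preset value such as 999 keeps the fact that the value was missing visible to the downstream model, whereas the median or mean hides it but keeps the value within the usual range of the parameter.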

In an optional embodiment, the second intent feature may also be used directly as the state tracking label of the first sentence, that is, the intent feature of the previous sentence is used as the state tracking label of the current sentence. For example, when the first sentence is the opening sentence of the dialogue, no previous sentence exists, so the state tracking label of the first sentence is "previous item_0", which indicates that the previous sentence is empty. When the first sentence is not the opening sentence, a previous sentence necessarily exists, and its intent feature can be used as the state tracking label of the first sentence. For example, if the intent feature of the previous sentence is "credit limit amount", the state tracking label of the first sentence is "previous item_credit limit amount".

205: Input the first sentence, the state tracking label, and at least one script into a scoring model to obtain at least one first score in one-to-one correspondence with the at least one script.

In this embodiment, the scoring model scores each of the at least one script, and a rankingbert model may be chosen as the scoring model. Specifically, the first sentence is analyzed, and the analysis result and the state tracking label are arranged as follows to form the input data:

[cls, customer intent, sep, script, sep] + [has state tracking] + [state tracking label]

Here, "customer intent" is the intent feature of the first sentence; it is extracted in the same way as the intent feature of the second sentence in step 203, which is not repeated here. "Script" is the script type to which the first sentence belongs.

Thus, compared with the conventional input sequence of the rankingbert model, [cls, customer intent, sep, script, sep], the input sequence in this embodiment adds the state tracking label. This introduces a logical judgment into the model, so that the model's output has an interpretable logic.
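The arrangement of the input data can be sketched as a list of fields. This is a rough sketch only: the text does not specify how the fields are tokenized or how the state-tracking flag is encoded, so the "1"/"0" flag, the "[CLS]"/"[SEP]" marker strings, and the function name `build_input` are all assumptions.

```python
def build_input(customer_intent, script, state_label=None):
    """Arrange [cls, customer intent, sep, script, sep] + [has state tracking] + [label]."""
    fields = ["[CLS]", customer_intent, "[SEP]", script, "[SEP]"]
    has_state = state_label is not None
    fields.append("1" if has_state else "0")   # assumed encoding of the flag
    if has_state:
        fields.append(state_label)
    return fields
```

One such sequence would be built per candidate script, so that the scoring model returns one first score for each script.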

206: Multiply each of the at least one first score by a random function to obtain at least one second score in one-to-one correspondence with the at least one first score.

This embodiment provides a random function, as shown in formula ①:

[Formula ①: image BDA0003506610200000131, not reproduced in this text]

where $x$ denotes the script type to which the first sentence belongs, $C_x$ denotes the number of dialogues in the historical dialogue data that closed an order using that script type, $S_x$ denotes the number of dialogues in the historical dialogue data that used that script type, $a$ denotes a weight coefficient, $\mathrm{random}([1, n_x])$ is a random number related to $n_x$, and $n_x$ is the number of available scripts matched by the intent feature of the first sentence.

The first score, the second score, and the random function thus satisfy formula ②:

$y_i = z_i \times p(x)$ ......... ②

where $y_i$ denotes the $i$-th of the at least one second score, $z_i$ denotes the $i$-th of the at least one first score, and $i$ is an integer greater than or equal to 1.

207: Recommend the script corresponding to the largest of the at least one second score to the reply device as the recommended script, so that the reply device generates a reply sentence according to the recommended script and replies to the user's current voice information.

In this embodiment, the five largest first scores may be selected, each multiplied by the random function, and the script corresponding to the largest resulting score recommended. This reduces the amount of computation and improves recommendation efficiency.
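The top-five selection followed by random perturbation can be sketched as below. Because formula ① is available only as an image, the random function is passed in as a callable `p`; the placeholder `stand_in_p` is an assumption used purely to make the sketch runnable and is not the formula of the application. The function names are hypothetical.

```python
import heapq
import random

def recommend(first_scores, p, top_k=5):
    """Keep the top_k first scores, multiply each by p(script), return the best script.

    first_scores: dict mapping script -> first score z_i from the scoring model.
    p: callable returning the multiplier for a script (stands in for formula ①).
    """
    top = heapq.nlargest(top_k, first_scores.items(), key=lambda kv: kv[1])
    second_scores = {script: z * p(script) for script, z in top}   # y_i = z_i * p(x)
    return max(second_scores, key=second_scores.get)

def stand_in_p(script):
    """Placeholder multiplier, NOT formula ①: a uniform random factor for illustration."""
    return random.uniform(0.5, 1.5)
```

Calling `recommend(scores, stand_in_p)` evaluates the multiplier only for the five best-scored scripts, which is the saving in computation that the paragraph above describes; scripts outside the top five can never be recommended, however large their multiplier would have been.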

In summary, in the random-state-based script recommendation method provided by the present invention, the intent of the sentence (second sentence) preceding the sentence the user is currently speaking (first sentence) in the current dialogue is analyzed and extracted, and the state tracking label of the first sentence is generated from the intent of the second sentence. The first sentence and the state tracking label are then used together to score the at least one matched script, the scores are perturbed by a random function, and the script with the highest perturbed score is recommended. Through the state tracking label, each script recommendation takes into account not only the current dialogue but also the dialogue intent of the previous moment. This makes the reply process more proactive: the direction of script recommendation can be influenced by assigning different state tracking labels, which introduces a logical judgment into the model and gives its output an interpretable logic. In addition, perturbing the model's output with the random function can make the question-and-answer exchange more engaging and creative and improve the customer experience.

Referring to FIG. 5, FIG. 5 is a block diagram of the functional modules of a random-state-based script recommendation apparatus provided by an embodiment of the present application. As shown in FIG. 5, the random-state-based script recommendation apparatus 500 includes:

an analysis module 501, configured to perform text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information;

a query module 502, configured to query historical dialogue data according to the first sentence to obtain a second sentence, where the historical dialogue data records the dialogue data produced before the current moment by the dialogue event to which the first sentence belongs, the second sentence occurred earlier than the first sentence, and the absolute value of the difference between the occurrence time of the second sentence and that of the first sentence is the smallest;

a processing module 503, configured to perform intent extraction on the second sentence to obtain a first intent feature, and to generate a state tracking label for the first sentence according to the first intent feature, where the state tracking label identifies the direction of the user's intent and the strength of the user's demand at the time the voice information is spoken;

a scoring module 504, configured to input the first sentence, the state tracking label, and at least one script into a scoring model to obtain at least one first score, where the scoring model scores each of the at least one script and the at least one first score is in one-to-one correspondence with the at least one script, and to multiply each of the at least one first score by a random function to obtain at least one second score, where the at least one second score is in one-to-one correspondence with the at least one first score;

a recommendation module 505, configured to recommend the script corresponding to the largest of the at least one second score to the reply device, so that the reply device generates a reply sentence according to the recommended script and replies to the user's current voice information.

In an embodiment of the present invention, in generating the state tracking label of the first sentence according to the first intent feature, the processing module 503 is specifically configured to:

obtain the first demand score corresponding to the second sentence;

perform intent extraction on the first sentence to obtain a second intent feature;

update the first demand score according to the second intent feature to obtain a second demand score corresponding to the first sentence;

combine the second demand score and the first intent feature to obtain the state tracking label.

In an embodiment of the present invention, in updating the first demand score according to the second intent feature to obtain the second demand score corresponding to the first sentence, the processing module 503 is specifically configured to:

encode the second intent feature to obtain a first character string;

obtain the call parameters of the call to which the first sentence belongs, and fill in the missing values in the call parameters to obtain the target parameters;

concatenate the target parameters, the first demand score, and the first character string to obtain a second character string;

input the second character string into a logistic regression model to obtain the second demand score.

In an embodiment of the present invention, in filling in the missing values in the call parameters to obtain the target parameters, the processing module 503 is specifically configured to:

replace the missing value with a preset substitute value to obtain the target parameters;

or, according to the data type of the missing value, obtain the historical data set corresponding to that data type and replace the missing value with the median of the at least one data value included in the historical data set to obtain the target parameters;

or, replace the missing value with the mean of the at least one data value included in the historical data set to obtain the target parameters.

In an embodiment of the present invention, the random function is as shown in formula ③:

[Formula ③: image BDA0003506610200000151, not reproduced in this text]

where $x$ denotes the script type to which the first sentence belongs, $C_x$ denotes the number of dialogues in the historical dialogue data that closed an order using that script type, $S_x$ denotes the number of dialogues in the historical dialogue data that used that script type, $a$ denotes a weight coefficient, $\mathrm{random}([1, n_x])$ is a random number related to $n_x$, and $n_x$ is the number of available scripts matched by the intent feature of the first sentence.

In an embodiment of the present invention, in performing text conversion on the user's current voice information to obtain the first sentence corresponding to the voice information, the analysis module 501 is specifically configured to:

obtain the acoustic features of the voice information;

determine the dialect category of the voice information according to the acoustic features;

obtain the audio transposition formula corresponding to the dialect category, and convert the voice information into standard speech using the audio transposition formula, where the audio transposition formula characterizes the conversion between the pronunciation of the corresponding dialect and Mandarin pronunciation;

obtain the pinyin text of the voice information according to the standard speech;

match the pinyin text against a preset vocabulary library to obtain the first sentence.

In an embodiment of the present invention, in determining the dialect category of the voice information according to the acoustic features, the analysis module 501 is specifically configured to:

determine the energy distribution, prosody distribution, fundamental frequency, and average speech power of the voice information according to the acoustic features;

determine the dialect area corresponding to the voice information according to the fundamental frequency and the average speech power;

encode the energy distribution and the prosody distribution separately to obtain an energy distribution vector and a prosody distribution vector;

concatenate the energy distribution vector and the prosody distribution vector vertically to obtain a timbre distribution vector;

match the timbre distribution vector against the timbre library corresponding to the dialect area, and determine the dialect category of the voice information according to the matching result.

Referring to FIG. 6, FIG. 6 is a schematic structural diagram of an electronic device provided by an embodiment of the present application. As shown in FIG. 6, the electronic device 600 includes a transceiver 601, a processor 602, and a memory 603, which are connected by a bus 604. The memory 603 stores computer programs and data, and can transmit the data it stores to the processor 602.

The processor 602 is configured to read the computer program in the memory 603 and perform the following operations:

Perform text conversion on the user's current voice information to obtain a first sentence corresponding to the voice information;

Query historical dialogue data according to the first sentence to obtain a second sentence, where the historical dialogue data records the dialogue data generated before the current moment in the dialogue event to which the first sentence belongs, the second sentence occurs earlier than the first sentence, and the absolute value of the difference between the occurrence times of the second sentence and the first sentence is the smallest;

Perform intent extraction on the second sentence to obtain a first intent feature;

Generate a state tracking label for the first sentence according to the first intent feature, where the state tracking label identifies the direction of the user's intent and the strength of the user's demand at the time the voice information is spoken;

Input the first sentence, the state tracking label, and at least one script into a scoring model to obtain at least one first score, where the scoring model scores each of the at least one script, and the at least one first score corresponds one-to-one with the at least one script;

Multiply each of the at least one first score by a random function to obtain at least one second score, where the at least one second score corresponds one-to-one with the at least one first score;

Recommend the script corresponding to the largest of the at least one second score to a reply device as the recommended script, so that the reply device generates a reply sentence according to the recommended script and replies to the user's current voice information.
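The query step and the final selection step above can be sketched in Python as follows. This is only an illustration under stated assumptions: the `Utterance` type, the function names, the example scores, and the uniform random factor are all hypothetical, and the uniform factor is a stand-in that does not reproduce the random function of formula ④.

```python
import random
from dataclasses import dataclass


@dataclass
class Utterance:
    text: str
    timestamp: float  # seconds since the start of the dialogue event


def previous_utterance(history, current):
    """Return the utterance that occurred most recently before `current`."""
    earlier = [u for u in history if u.timestamp < current.timestamp]
    if not earlier:
        return None
    # Among earlier utterances, the latest one has the smallest time gap.
    return max(earlier, key=lambda u: u.timestamp)


def recommend(first_scores, random_factor):
    """Multiply each first score by a random factor and return the top script."""
    second_scores = {s: score * random_factor(s) for s, score in first_scores.items()}
    return max(second_scores, key=second_scores.get)


history = [Utterance("Hello", 1.0), Utterance("What is the rate?", 5.0)]
current = Utterance("That sounds expensive", 9.0)
second = previous_utterance(history, current)

rng = random.Random(0)
first_scores = {"script_a": 0.9, "script_b": 0.7}  # illustrative scoring-model output
# Hypothetical stand-in for the random function: a uniform factor in [0.5, 1.5].
best = recommend(first_scores, lambda _script: rng.uniform(0.5, 1.5))
```

Because the factor is random, a lower-scored script can occasionally be recommended, which is the exploratory behaviour the multiplication step is meant to introduce.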

In an embodiment of the present invention, for generating the state tracking label of the first sentence according to the first intent feature, the processor 602 is specifically configured to perform the following operations:

Obtain a first demand score corresponding to the second sentence;

Perform intent extraction on the first sentence to obtain a second intent feature;

Update the first demand score according to the second intent feature to obtain a second demand score corresponding to the first sentence;

Combine the second demand score and the first intent feature to obtain the state tracking label.

In an embodiment of the present invention, for updating the first demand score according to the second intent feature to obtain the second demand score corresponding to the first sentence, the processor 602 is specifically configured to perform the following operations:

Encode the second intent feature to obtain a first character string;

Obtain the call parameters of the call to which the first sentence belongs, and fill in the missing values in the call parameters to obtain target parameters;

Concatenate the target parameters, the first demand score, and the first character string to obtain a second character string;

Input the second character string into a logistic regression model to obtain the second demand score.
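A minimal sketch of this update, assuming the encoded inputs are numeric so that the concatenated "second character string" can be treated as a feature vector. The one-hot encoding, the intent labels, the call parameters, and the weights are hypothetical and untrained; they only show the data flow into a logistic regression.

```python
import math


def encode_intent(intent, vocabulary):
    """One-hot encode an intent label (stand-in for the first character string)."""
    return [1.0 if intent == label else 0.0 for label in vocabulary]


def update_demand_score(target_params, first_score, intent_code, weights, bias):
    """Concatenate the inputs and apply a logistic regression model."""
    features = list(target_params) + [first_score] + list(intent_code)
    if len(features) != len(weights):
        raise ValueError("feature and weight lengths differ")
    z = bias + sum(w * x for w, x in zip(weights, features))
    return 1.0 / (1.0 + math.exp(-z))


INTENTS = ["interested", "hesitant", "refusing"]  # hypothetical label set
params = [0.5, 1.0]  # hypothetical completed call parameters
weights = [0.2, 0.1, 1.5, 2.0, 0.0, -2.0]  # illustrative, not trained

second_score = update_demand_score(
    params, 0.4, encode_intent("interested", INTENTS), weights, bias=-1.0
)
refusing_score = update_demand_score(
    params, 0.4, encode_intent("refusing", INTENTS), weights, bias=-1.0
)
```

With these illustrative weights, an "interested" intent raises the demand score and a "refusing" intent lowers it, while the sigmoid keeps the result between 0 and 1.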

In an embodiment of the present invention, for filling in the missing values in the call parameters to obtain the target parameters, the processor 602 is specifically configured to perform the following operations:

Replace the missing value with a preset substitute value to obtain the target parameters;

Or, according to the data type of the missing value, obtain the historical data set corresponding to that data type, and replace the missing value with the median of the at least one data value included in the historical data set to obtain the target parameters;

Or, replace the missing value with the mean of the at least one data value included in the historical data set to obtain the target parameters.
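The three alternatives above can be sketched as one helper. The function name, the use of `None` to mark a missing value, and the single shared historical data set are simplifying assumptions; the text describes one historical data set per data type.

```python
from statistics import mean, median


def fill_missing(params, strategy="preset", preset=0.0, history=None):
    """Replace None entries in `params` using one of three strategies."""
    if strategy == "preset":
        fill = preset
    elif strategy in ("median", "mean"):
        if not history:
            raise ValueError("a non-empty historical data set is required")
        fill = median(history) if strategy == "median" else mean(history)
    else:
        raise ValueError(f"unknown strategy: {strategy}")
    return [fill if value is None else value for value in params]


by_preset = fill_missing([1.0, None], "preset", preset=-1.0)
by_median = fill_missing([None, 2.0], "median", history=[1.0, 3.0, 10.0])
by_mean = fill_missing([None, 2.0], "mean", history=[1.0, 2.0, 3.0])
```

The median is less sensitive to outliers in the historical data than the mean, which is one reason to offer both.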

In an embodiment of the present invention, the random function is given by formula ④:

[Formula ④: image BDA0003506610200000171, not reproduced in the text]

where x denotes the script type to which the first sentence belongs, C_x denotes the number of dialogues in the historical dialogue data that closed a deal using that script type, S_x denotes the number of dialogues in the historical dialogue data that used that script type, a denotes a weight coefficient, random([1, n_x]) is a random number related to n_x, and n_x is the number of available scripts matched by the intent feature of the first sentence.

In an embodiment of the present invention, for performing text conversion on the user's current voice information to obtain the first sentence corresponding to the voice information, the processor 602 is specifically configured to perform the following operations:

Obtain the acoustic features of the voice information;

Determine the dialect category of the voice information according to the acoustic features;

Obtain the audio transposition formula corresponding to the dialect category, and convert the voice information into standard speech using that formula, where the audio transposition formula characterizes the conversion between the pronunciation of the corresponding dialect and Mandarin pronunciation;

Obtain the pinyin text of the voice information from the standard speech;

Match the pinyin text against a preset vocabulary library to obtain the first sentence.
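The last step, matching pinyin text against a vocabulary library, could look like the sketch below. The text does not specify a matching strategy, so greedy longest-match is an assumption, and the vocabulary entries and function name are hypothetical.

```python
def match_pinyin(syllables, vocabulary, max_len=4):
    """Greedy longest-match of pinyin syllables against a vocabulary library."""
    words, i = [], 0
    while i < len(syllables):
        for size in range(min(max_len, len(syllables) - i), 0, -1):
            key = " ".join(syllables[i:i + size])
            if key in vocabulary:
                words.append(vocabulary[key])
                i += size
                break
        else:
            # No entry found: keep the raw syllable so nothing is silently dropped.
            words.append(syllables[i])
            i += 1
    return "".join(words)


# Hypothetical vocabulary library keyed by space-separated pinyin.
VOCAB = {"dai kuan": "贷款", "li lv": "利率", "duo shao": "多少"}
sentence = match_pinyin(["dai", "kuan", "li", "lv", "duo", "shao"], VOCAB)
partial = match_pinyin(["ni", "dai", "kuan"], VOCAB)
```

A production system would also need to resolve homophones, which a plain dictionary lookup cannot do.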

In an embodiment of the present invention, for determining the dialect category of the voice information according to the acoustic features, the processor 602 is specifically configured to perform the following operations:

Determine the energy distribution, prosody distribution, fundamental frequency, and average speech power of the voice information according to the acoustic features;

Determine the dialect area corresponding to the voice information according to the fundamental frequency and the average speech power;

Encode the energy distribution and the prosody distribution separately to obtain an energy distribution vector and a prosody distribution vector;

Concatenate the energy distribution vector and the prosody distribution vector vertically to obtain a timbre distribution vector;

Match the timbre distribution vector against the timbre library corresponding to the dialect area, and determine the dialect category of the voice information according to the matching result.
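The concatenation and matching steps can be illustrated as follows. Two assumptions are made here: vertical concatenation is read as joining the two vectors end to end, and cosine similarity is used as the matching criterion, which the text does not name. The library contents and dialect labels are hypothetical.

```python
import math


def cosine(u, v):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def classify_dialect(energy_vec, prosody_vec, timbre_library):
    """Concatenate the two vectors and return the closest dialect label."""
    timbre_vec = list(energy_vec) + list(prosody_vec)
    return max(timbre_library, key=lambda name: cosine(timbre_vec, timbre_library[name]))


# Hypothetical timbre library for one dialect area.
LIBRARY = {
    "dialect_a": [1.0, 0.0, 0.5, 0.5],
    "dialect_b": [0.0, 1.0, 0.9, 0.1],
}
label = classify_dialect([0.9, 0.1], [0.4, 0.6], LIBRARY)
```

Restricting the search to the library of a single dialect area, chosen beforehand from the fundamental frequency and average speech power, keeps the number of comparisons small.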

It should be understood that the random-state-based script recommendation apparatus in the present application may include a smartphone (such as an Android phone, an iOS phone, or a Windows Phone device), a tablet computer, a handheld computer, a notebook computer, a mobile Internet device (MID), a robot, or a wearable device. These apparatuses are examples only and the list is not exhaustive. In practical applications, the apparatus may also include an intelligent vehicle-mounted terminal, a computer device, and the like.

From the description of the above embodiments, those skilled in the art can clearly understand that the present invention can be implemented by software combined with a hardware platform. Based on this understanding, all or part of the contribution that the technical solution of the present invention makes over the background art may be embodied in the form of a software product. The computer software product may be stored in a storage medium, such as a ROM/RAM, a magnetic disk, or an optical disc, and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform the methods described in the embodiments of the present invention or in certain parts of the embodiments.

Accordingly, an embodiment of the present application also provides a computer-readable storage medium storing a computer program, where the computer program is executed by a processor to implement some or all of the steps of any random-state-based script recommendation method described in the above method embodiments. For example, the storage medium may include a hard disk, a floppy disk, an optical disc, a magnetic tape, a magnetic disk, a USB flash drive, a flash memory, and the like.

An embodiment of the present application also provides a computer program product comprising a non-transitory computer-readable storage medium storing a computer program, where the computer program is operable to cause a computer to perform some or all of the steps of any random-state-based script recommendation method described in the above method embodiments.

It should be noted that, for simplicity of description, the foregoing method embodiments are each expressed as a series of combined actions. Those skilled in the art should understand, however, that the present application is not limited by the described order of actions, because according to the present application certain steps may be performed in other orders or simultaneously. Those skilled in the art should also understand that the embodiments described in the specification are all optional embodiments, and the actions and modules involved are not necessarily required by the present application.

In the above embodiments, the description of each embodiment has its own emphasis. For parts not described in detail in one embodiment, refer to the relevant descriptions of the other embodiments.

In the several embodiments provided in the present application, it should be understood that the disclosed apparatus may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into units is only a division by logical function, and other divisions are possible in an actual implementation; for example, multiple units or components may be combined or integrated into another system, or some features may be omitted or not performed. In addition, the mutual coupling, direct coupling, or communication connection shown or discussed may be an indirect coupling or communication connection through some interfaces, apparatuses, or units, and may be electrical or take other forms.

The units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; that is, they may be located in one place or distributed across multiple network units. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution of this embodiment.

In addition, the functional units in the embodiments of the present application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software program module.

If the integrated unit is implemented in the form of a software program module and sold or used as an independent product, it may be stored in a computer-readable memory. Based on this understanding, the technical solution of the present application in essence, or the part that contributes to the prior art, or all or part of the technical solution, may be embodied in the form of a software product. The computer software product is stored in a memory and includes several instructions that cause a computer device (which may be a personal computer, a server, a network device, or the like) to perform all or some of the steps of the methods described in the embodiments of the present application. The aforementioned memory includes various media that can store program code, such as a USB flash drive, a read-only memory (ROM), a random access memory (RAM), a removable hard disk, a magnetic disk, or an optical disc.

Those of ordinary skill in the art can understand that all or some of the steps in the methods of the above embodiments may be completed by a program instructing the relevant hardware. The program may be stored in a computer-readable memory, and the memory may include a flash drive, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, an optical disc, or the like.

The embodiments of the present application have been described in detail above. Specific examples are used herein to explain the principles and implementations of the present application, and the description of the above embodiments is intended only to help in understanding the method of the present application and its core idea. At the same time, those of ordinary skill in the art may, based on the idea of the present application, make changes to the specific implementations and the scope of application. In summary, the content of this specification should not be construed as limiting the present application.

Claims (10)

1. A method for random state based conversational recommendation, the method comprising:
performing text conversion processing on current voice information of a user to obtain a first sentence corresponding to the voice information;
querying historical dialogue data according to the first sentence to obtain a second sentence, wherein the historical dialogue data is used for recording dialogue data generated before the current moment in a dialogue event to which the first sentence belongs, the occurrence time of the second sentence is earlier than that of the first sentence, and the absolute value of the difference between the occurrence time of the second sentence and the occurrence time of the first sentence is the smallest;
performing intent extraction on the second sentence to obtain a first intent feature;
generating a state tracking label of the first sentence according to the first intent feature, wherein the state tracking label is used for identifying the intent direction and demand strength of the user when the user speaks the voice information;
inputting the first sentence, the state tracking label and at least one script into a scoring model to obtain at least one first score, wherein the scoring model is used for scoring each of the at least one script, and the at least one first score is in one-to-one correspondence with the at least one script;
multiplying each first score in the at least one first score by a random function to obtain at least one second score, wherein the at least one second score is in one-to-one correspondence with the at least one first score;
recommending the script corresponding to the largest second score in the at least one second score to a reply device as a recommended script, so that the reply device generates a reply sentence according to the recommended script to reply to the current voice information of the user.
2. The method of claim 1, wherein generating the state tracking label of the first sentence according to the first intent feature comprises:
acquiring a first demand score corresponding to the second sentence;
performing intent extraction on the first sentence to obtain a second intent feature;
updating the first demand score according to the second intent feature to obtain a second demand score corresponding to the first sentence;
and combining the second demand score and the first intent feature to obtain the state tracking label.
3. The method of claim 2, wherein updating the first demand score according to the second intent feature to obtain the second demand score corresponding to the first sentence comprises:
encoding the second intent feature to obtain a first character string;
acquiring call parameters of a call to which the first sentence belongs, and completing missing values in the call parameters to obtain target parameters;
concatenating the target parameters, the first demand score and the first character string to obtain a second character string;
and inputting the second character string into a logistic regression model to obtain the second demand score.
4. The method of claim 3, wherein completing missing values in the call parameters to obtain target parameters comprises:
replacing the missing value with a preset substitute value to obtain the target parameter;
or acquiring a historical data set corresponding to the data type according to the data type of the missing value, and replacing the missing value with a median of at least one data value included in the historical data set to obtain the target parameter;
or replacing the missing value with a mean value of at least one data value included in the historical data set to obtain the target parameter.
5. The method of claim 1, wherein the random function is as follows:
[Formula: image FDA0003506610190000021, not reproduced in the text]
wherein x represents the script type to which the first sentence belongs, C_x represents the number of dialogues in the historical dialogue data that closed a deal using the script type, S_x represents the number of dialogues in the historical dialogue data that used the script type, a represents a weight coefficient, random([1, n_x]) is a random number related to n_x, and n_x is the number of available scripts matched by the intent feature of the first sentence.
6. The method of claim 1, wherein performing text conversion processing on the current speech information of the user to obtain a first sentence corresponding to the speech information comprises:
acquiring acoustic features of the voice information;
determining the dialect category of the voice information according to the acoustic features;
acquiring an audio transposition formula corresponding to the dialect category, and converting the voice information into standard voice through the audio transposition formula, wherein the audio transposition formula is used for identifying conversion characteristics between corresponding dialect pronunciation and mandarin pronunciation;
acquiring a pinyin text of the voice information according to the standard voice;
and matching the pinyin text in a preset vocabulary library to obtain the first sentence.
7. The method of claim 6, wherein determining the dialect class of the speech information based on the acoustic features comprises:
determining the energy distribution, prosody distribution, fundamental frequency and average speech power of the voice information according to the acoustic features;
determining a dialect area corresponding to the voice information according to the fundamental frequency and the average speech power;
respectively encoding the energy distribution and the prosody distribution to obtain an energy distribution vector and a prosody distribution vector;
vertically concatenating the energy distribution vector and the prosody distribution vector to obtain a timbre distribution vector;
and matching in a timbre library corresponding to the dialect area according to the timbre distribution vector, and determining the dialect category of the voice information according to a matching result.
8. A device for random state based conversational recommendation, the device comprising:
an analysis module, configured to perform text conversion processing on current voice information of a user to obtain a first sentence corresponding to the voice information;
a query module, configured to query historical dialogue data according to the first sentence to obtain a second sentence, wherein the historical dialogue data is used for recording dialogue data generated before the current moment in a dialogue event to which the first sentence belongs, the occurrence time of the second sentence is earlier than that of the first sentence, and the absolute value of the difference between the occurrence time of the second sentence and the occurrence time of the first sentence is the smallest;
a processing module, configured to perform intent extraction on the second sentence to obtain a first intent feature, and generate a state tracking label of the first sentence according to the first intent feature, wherein the state tracking label is used for identifying the intent direction and demand strength of the user when the user speaks the voice information;
a scoring module, configured to input the first sentence, the state tracking label, and at least one script into a scoring model to obtain at least one first score, wherein the scoring model is configured to score each of the at least one script, the at least one first score is in one-to-one correspondence with the at least one script, and to multiply each of the at least one first score by a random function to obtain at least one second score, wherein the at least one second score is in one-to-one correspondence with the at least one first score;
and a recommending module, configured to recommend the script corresponding to the largest second score in the at least one second score to a reply device as a recommended script, so that the reply device generates a reply sentence according to the recommended script to reply to the current voice information of the user.
9. An electronic device comprising a processor, a memory, a communication interface, and one or more programs, wherein the one or more programs are stored in the memory and configured to be executed by the processor, the one or more programs including instructions for performing the steps in the method of any of claims 1-7.
10. A computer-readable storage medium, characterized in that the computer-readable storage medium stores a computer program which is executed by a processor to implement the method according to any one of claims 1-7.
CN202210143900.6A 2022-02-16 2022-02-16 Method and device for conversational recommendation based on random state and electronic equipment Pending CN114519094A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202210143900.6A CN114519094A (en) 2022-02-16 2022-02-16 Method and device for conversational recommendation based on random state and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202210143900.6A CN114519094A (en) 2022-02-16 2022-02-16 Method and device for conversational recommendation based on random state and electronic equipment

Publications (1)

Publication Number Publication Date
CN114519094A true CN114519094A (en) 2022-05-20

Family

ID=81599647

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202210143900.6A Pending CN114519094A (en) 2022-02-16 2022-02-16 Method and device for conversational recommendation based on random state and electronic equipment

Country Status (1)

Country Link
CN (1) CN114519094A (en)

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112328761A (en) * 2020-11-03 2021-02-05 中国平安财产保险股份有限公司 Intention label setting method and device, computer equipment and storage medium
CN112735374A (en) * 2020-12-29 2021-04-30 北京三快在线科技有限公司 Automatic voice interaction method and device
CN113535925A (en) * 2021-07-27 2021-10-22 平安科技(深圳)有限公司 Voice broadcasting method, device, equipment and storage medium
CN113656547A (en) * 2021-08-17 2021-11-16 平安科技(深圳)有限公司 Text matching method, device, equipment and storage medium
CN113821625A (en) * 2021-10-11 2021-12-21 中国平安人寿保险股份有限公司 Artificial intelligence based tactical recommendation method, device, equipment and medium
CN113886531A (en) * 2021-10-28 2022-01-04 中国平安人寿保险股份有限公司 Intelligent question and answer determining method and device, computer equipment and storage medium

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN115424624A (en) * 2022-11-04 2022-12-02 深圳市人马互动科技有限公司 Man-machine interaction service processing method and device and related equipment
CN115424624B (en) * 2022-11-04 2023-01-24 深圳市人马互动科技有限公司 Man-machine interaction service processing method and device and related equipment

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20220520