CN113223510B - Refrigerator and equipment voice interaction method and computer readable storage medium thereof - Google Patents
- Publication number: CN113223510B
- Application number: CN202010070740.8A
- Authority: CN (China)
- Legal status: Active (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Classifications
- G10L15/22: Speech recognition; procedures used during a speech recognition process, e.g. man-machine dialogue
- F25D29/00: Refrigerators, cold rooms, ice-boxes; arrangement or mounting of control or safety devices
- G10L15/1822: Speech classification or search using natural language modelling; parsing for meaning understanding
- H04L12/282: Home automation networks; controlling appliance services by calling their functionalities based on user interaction within the home
- G10L2015/223: Procedures used during a speech recognition process; execution procedure of a spoken command
Description
Technical Field
The present invention relates to a refrigerator, a device voice interaction method thereof, and a computer-readable storage medium.
Background
As living standards continue to rise, people expect more intelligence from their home appliances. Smart home products, and smart refrigerators in particular, often require the user to interact with the appliance by voice.
To use the voice interaction function, the user usually first wakes the device with a customized command word, which starts the device's speech recognition function so that subsequent instructions can be recognized. In full-duplex natural communication, the device does not close the interaction immediately after executing an instruction. Instead, it waits for a period of time to see whether a follow-up instruction arrives. If one does, the device executes it; if none does, the device closes the interaction and waits for the user's next wake-up.
However, other noise or other instructions may occur during full-duplex natural communication. The kitchen in particular is a noisy environment in which other sounds are hard to avoid. These sounds interfere with the device's speech recognition and feedback and disrupt the interaction between the user and the device, resulting in a poor user experience.
Therefore, a new refrigerator, device voice interaction method, and computer-readable storage medium must be designed.
Summary of the Invention
To solve the above problems, the present invention proposes a device voice interaction method, comprising:
recognizing the audio signal of the current environment to obtain current voice information;
performing semantic recognition on the current voice information to obtain a current semantic state;
obtaining the current state of the device and, when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, performing no function control on the device.
As a further improvement of the present invention, the step "when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic" includes:
if the current semantic state is a preset semantic, determining whether the current semantic state is a shared semantic;
if the current semantic state is a shared semantic, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
As a further improvement of the present invention, the step "when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic" includes:
if the current semantic state is a preset semantic, determining whether the current semantic state is a full-duplex semantic;
if the current semantic state is a full-duplex semantic, determining whether the current semantic state is consistent with the full-duplex state;
if they are consistent, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction;
if they are inconsistent, performing no function control on the device.
As a further improvement of the present invention, when the device is in a full-duplex state, the device exits the full-duplex state if no operation instruction is obtained within a time T.
As a further improvement of the present invention, the step "performing no function control on the device" specifically includes:
displaying the voice information on the screen in real time.
As a further improvement of the present invention, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, ignoring the current voice information and performing no function control on the device.
As a further improvement of the present invention, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, recognizing the voice information and determining whether an operation instruction is obtained, and if an operation instruction is obtained, performing function control on the device according to the operation instruction.
As a further improvement of the present invention, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is a preset semantic, determining whether the current semantic state is a shared semantic;
if the current semantic state is a shared semantic, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
As a further improvement of the present invention, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is a preset semantic, determining whether the current semantic state is a full-duplex semantic;
if the current semantic state is a full-duplex semantic, setting the current state of the device to the full-duplex state, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
As a further improvement of the present invention, the "full-duplex state" includes: an ingredient management scene state, a recipe management scene state, an audio-visual scene state, a takeaway scene state, and a function adjustment scene state.
To solve the above problems, the present invention proposes a refrigerator comprising a memory and a processor, the memory storing a computer program executable on the processor, wherein the processor, when executing the program, implements the steps of the device voice interaction method described above.
To solve the above problems, the present invention proposes a computer-readable storage medium storing a computer program, wherein the computer program, when executed by a processor, implements the steps of the device voice interaction method described above.
Beneficial effects of the present invention: the present invention evaluates both the current state of the device and the semantic information. When the device is in a full-duplex state and the semantic state is not a preset semantic, the semantic information is treated as irrelevant and the device performs no corresponding function control. In the full-duplex working state, this prevents the device from responding to noise and other irrelevant voice information, which would otherwise reduce its working efficiency.
Description of Drawings
FIG. 1 is a schematic flowchart of the device voice interaction method of the present invention.
Detailed Description
To help those skilled in the art better understand the technical solutions of the present invention, the technical solutions in the embodiments are described clearly and completely below with reference to the accompanying drawings. The described embodiments are only some, not all, of the embodiments of the present invention. All other embodiments obtained by a person of ordinary skill in the art based on these embodiments without creative effort shall fall within the protection scope of the present invention.
In addition, the present invention uses a refrigerator as the specific embodiment. The voice interaction described here can also be applied to other appliances or devices, all of which shall fall within the protection scope of the present invention.
As shown in FIG. 1, the present invention provides a device voice interaction method, comprising:
recognizing the audio signal of the current environment to obtain current voice information;
performing semantic recognition on the current voice information to obtain a current semantic state;
obtaining the current state of the device and, when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, performing no function control on the device.
The present invention therefore evaluates both the current state of the device and the semantic information. When the device is in a full-duplex state and the semantic state is not a preset semantic, the semantic information is treated as irrelevant and the device performs no corresponding function control. In the full-duplex working state, this prevents the device from responding to noise and other irrelevant voice information, which would otherwise reduce its working efficiency.
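The four steps above can be sketched roughly as follows. This is a minimal illustration, not the patented implementation: the function name `first_gate`, the two injected callables, and the string return values are all assumptions introduced here, and a `None` result from the classifier stands for "not a preset semantic".

```python
from typing import Callable, Optional


def first_gate(
    audio: bytes,
    recognize_speech: Callable[[bytes], str],           # hypothetical speech recognizer
    classify_semantic: Callable[[str], Optional[str]],  # hypothetical; None = not a preset semantic
    device_is_full_duplex: bool,
) -> str:
    """Return "no_control" for irrelevant speech in the full-duplex state, else "continue"."""
    text = recognize_speech(audio)       # step 1: audio signal -> current voice information
    semantic = classify_semantic(text)   # step 2: voice information -> current semantic state
    # steps 3 and 4: in the full-duplex state, a non-preset semantic triggers no function control
    if device_is_full_duplex and semantic is None:
        return "no_control"
    # every other combination is handled by the further branches of the method
    return "continue"
```

Passing the recognizer and classifier in as parameters keeps the sketch independent of any particular speech engine.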
In a specific embodiment of the present invention, the device is a refrigerator, and the "full-duplex state" includes an ingredient management scene state, a recipe management scene state, an audio-visual scene state, a takeaway scene state, a function adjustment scene state, and so on. That is, the refrigerator is preconfigured with several such scenes. In these scenes, in the full-duplex state, the user and the refrigerator can interact in natural language. After one interaction is completed, the device does not exit the interaction but waits directly for the user's next voice information, so the user does not need to wake the device again. In the full-duplex working state, if the refrigerator's speech recognition device recognizes voice information that is not a preset semantic, the device is not controlled.
Furthermore, in the step "performing no function control on the device", the voice information is displayed on the screen in real time. If the current semantic state is not a preset semantic, the device does not recognize the semantic information and therefore obtains no operation instruction. It does, however, display the voice information on the screen in real time to tell the user that this type of voice information cannot be recognized in the full-duplex state.
A preset semantic is either a shared semantic or a full-duplex semantic. A full-duplex semantic is a semantic that corresponds to one of the full-duplex states above. For example, since the "full-duplex state" includes the ingredient management scene state, recipe management scene state, audio-visual scene state, takeaway scene state, function adjustment scene state, and so on, the "full-duplex semantic" correspondingly includes an ingredient management scene semantic, a recipe management scene semantic, an audio-visual scene semantic, a takeaway scene semantic, a function adjustment scene semantic, and so on. A shared semantic is a semantic that the device can recognize in every full-duplex state, such as adjusting the volume or sound quality, adjusting the screen brightness, or adjusting the font or font size.
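One way to picture this taxonomy is as two enumerations, shown in the sketch below. The type and member names are assumptions made for illustration. Because each full-duplex semantic corresponds to one full-duplex state, the sketch reuses a single `Scene` type for both.

```python
from enum import Enum, auto


class Scene(Enum):
    """Full-duplex scene states; each also names the matching full-duplex semantic."""
    INGREDIENT_MANAGEMENT = auto()
    RECIPE_MANAGEMENT = auto()
    AUDIO_VISUAL = auto()
    TAKEAWAY = auto()
    FUNCTION_ADJUSTMENT = auto()


class SharedSemantic(Enum):
    """Semantics the device can recognize in every full-duplex state."""
    VOLUME_OR_SOUND_QUALITY = auto()
    SCREEN_BRIGHTNESS = auto()
    FONT_OR_FONT_SIZE = auto()


def is_preset(semantic: object) -> bool:
    """A preset semantic is either a shared semantic or a full-duplex semantic."""
    return isinstance(semantic, (Scene, SharedSemantic))
```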
The case in which the current semantic is not a preset semantic has been described above. The case in which it is a preset semantic is analyzed below. Specifically, in one aspect, the step "when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic" includes:
if the current semantic state is a preset semantic, determining whether the current semantic state is a shared semantic;
if the current semantic state is a shared semantic, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
That is, once the semantic state is determined to be a preset semantic, its category is determined next. If the semantic state is a shared semantic, the voice information is recognized as an operation instruction and function control is performed. For example, in the recipe management scene state, the voice information "increase the volume" is a shared semantic, so the device recognizes it, obtains the operation instruction "increase the volume", and raises its volume accordingly.
In another aspect, the step "when the device is in a full-duplex state, determining whether the current semantic state is a preset semantic" includes:
if the current semantic state is a preset semantic, determining whether the current semantic state is a full-duplex semantic;
if the current semantic state is a full-duplex semantic, determining whether the current semantic state is consistent with the full-duplex state;
if they are consistent, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction;
if they are inconsistent, performing no function control on the device.
As described above, there is more than one full-duplex state. After confirming that the current semantic state is a full-duplex semantic, the device therefore also needs to determine whether the current semantic state is consistent with the current full-duplex state. If it is, the user has issued new voice information related to the current full-duplex state during the ongoing natural voice interaction, and the device is controlled according to the corresponding operation instruction. If it is not, the voice information is unrelated and the device performs no function control.
As in the case above, in the step "performing no function control on the device", the voice information is displayed on the screen in real time. If the current semantic state is inconsistent with the full-duplex state, the device does not recognize the semantic information and therefore obtains no operation instruction. It does, however, display the voice information on the screen in real time to tell the user that this type of voice information cannot be recognized because it does not match the current full-duplex state.
For example, suppose the current full-duplex state is the ingredient management scene state. If the current voice information is "view today's recommended recipes", the device determines that this voice information cannot be used in the ingredient management scene state and performs no function control. If the current voice information is "check the number of remaining potatoes", the device determines that it can be used in the ingredient management scene state. If the current voice information is "turn up the volume", the device determines that its semantic state is a shared semantic, so it can also be recognized, and the refrigerator raises its volume.
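The decision made in the full-duplex state can be sketched as a small function. This is an illustrative sketch that assumes semantic classification has already happened. The string labels, the `SHARED` constant, and the return values `"execute"` and `"display_only"` are names invented here, not terms from the patent.

```python
from typing import Optional

SHARED = "shared"  # hypothetical label for a shared semantic


def decide_in_full_duplex(current_scene: str, semantic: Optional[str]) -> str:
    """Decide how a device that is already in a full-duplex scene treats an utterance.

    semantic is None for a non-preset semantic, SHARED for a shared semantic,
    or the name of the scene that a full-duplex semantic belongs to.
    """
    if semantic is None:
        return "display_only"   # not preset: show the text on screen, no function control
    if semantic == SHARED:
        return "execute"        # shared semantics work in every full-duplex state
    if semantic == current_scene:
        return "execute"        # full-duplex semantic consistent with the current state
    return "display_only"       # full-duplex semantic of a different scene
```

With the ingredient management scene active, the three utterances from the example map to a recipe semantic (rejected), an ingredient semantic (executed), and a shared semantic (executed).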
In this embodiment, therefore, different full-duplex states are defined, and a continuous dialogue is possible only within the same full-duplex state. This lets the device recognize voice information and perform function control quickly, which greatly reduces the device's recognition and feedback time, improves interaction efficiency, and solves the problem of users waiting too long during voice interaction.
Of course, in another embodiment of the present invention, the full-duplex states need not be classified. Instead, whenever the device is in the full-duplex state and the semantic state is a full-duplex semantic, the voice information can be recognized and acted on, which also achieves the object of the present invention.
In addition, when the device is in a full-duplex state, the device exits the full-duplex state if no operation instruction is obtained within a time T. In the full-duplex state, the device typically has a waiting time T. If no shared semantic, and no full-duplex semantic consistent with the current full-duplex state, arrives within the waiting time T, no operation instruction is obtained and the device exits the full-duplex state. Other ways of exiting the full-duplex state, such as a dedicated voice command or key command, can also achieve the object of the present invention.
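The waiting time T can be sketched as a small session object. This is a sketch under stated assumptions: the class name, the method names, the polling-style `tick`, and the injectable clock are all illustrative choices, and the patent does not give a value for T.

```python
import time
from typing import Callable


class FullDuplexSession:
    """Tracks whether the device should stay in the full-duplex state."""

    def __init__(self, timeout_s: float, clock: Callable[[], float] = time.monotonic):
        self.timeout_s = timeout_s            # the waiting time T, in seconds
        self._clock = clock                   # injectable so the logic can be tested
        self.active = True
        self._last_instruction = clock()

    def on_instruction(self) -> None:
        """Call whenever an operation instruction is obtained; restarts the wait."""
        self._last_instruction = self._clock()

    def exit_now(self) -> None:
        """Alternative exit path, e.g. a dedicated voice or key command."""
        self.active = False

    def tick(self) -> bool:
        """Exit once T has elapsed without an instruction; return whether still active."""
        if self.active and self._clock() - self._last_instruction >= self.timeout_s:
            self.active = False
        return self.active
```

Using `time.monotonic` as the default clock avoids false timeouts when the wall clock is adjusted.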
It should be noted that, in this specific embodiment, the semantic state of the voice information can be determined by identifying corresponding keywords in the voice information. For example, if keywords such as "ingredients", "food", "potatoes", or "onions" are identified in the voice information, the current semantic state can be determined to be a full-duplex semantic, specifically one consistent with the ingredient management scene state. If keywords related to "volume", "sound", and the like are identified, the current semantic state can be determined to be a shared semantic. Other ways of determining the current semantic state can also achieve the object of the present invention.
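A keyword-based classifier of the kind described here might look like the sketch below. The keyword table uses only the English renderings of the examples in the text. The label strings, the substring matching, and the first-match-wins ordering are assumptions for illustration, and a real device would presumably match keywords in the original language.

```python
from typing import Optional

# Hypothetical keyword table built from the examples in the text.
KEYWORDS = {
    "ingredient_management": ("ingredient", "food", "potato", "onion"),
    "shared": ("volume", "sound"),
}


def classify(text: str) -> Optional[str]:
    """Return a semantic label for the utterance, or None if no keyword matches."""
    lowered = text.lower()
    for label, words in KEYWORDS.items():
        # substring matching also catches plural forms such as "potatoes"
        if any(word in lowered for word in words):
            return label
    return None  # not a preset semantic
```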
The case in which the device is in the full-duplex state has been described above. The case in which the device is not in the full-duplex state is described below. Likewise, when the device is not in the full-duplex state, the current semantic is either a preset semantic or not a preset semantic. Both cases are described in detail below.
In one aspect, if the current semantic is not a preset semantic, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, ignoring the current voice information and performing no function control on the device.
That is, when the device is not in the full-duplex state and the current semantic is not a preset semantic, the device does not react at all and simply ignores the current voice information. This filters out noise and irrelevant information.
As described above, the full-duplex state is a state of natural voice interaction in which the user does not need to wake the device again. The present invention also provides for ordinary, non-natural voice interaction, in which the device must be woken each time the user issues voice information or a user instruction. This mode suits simpler voice interaction that does not need separately configured full-duplex states.
Thus, in another embodiment, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is not a preset semantic, recognizing the voice information and determining whether an operation instruction is obtained, and if an operation instruction is obtained, performing function control on the device according to the operation instruction.
Clearly, if no operation instruction can be obtained after the voice information is recognized, the voice information is pure noise and no corresponding function control is performed.
In this case the device is not in the full-duplex state. After the user wakes the device, issues voice information, and the device performs the corresponding function control, the device exits the interaction and waits for the user's next wake-up.
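The two embodiments for a non-preset semantic outside the full-duplex state differ only in whether recognition is attempted at all. The sketch below shows that difference. The `one_shot_mode` flag, the function name, and the injected `parse_instruction` callable are assumptions introduced for illustration.

```python
from typing import Callable, Optional


def handle_idle_non_preset(
    text: str,
    parse_instruction: Callable[[str], Optional[str]],  # hypothetical instruction parser
    one_shot_mode: bool,
) -> Optional[str]:
    """Handle a non-preset utterance while the device is not in a full-duplex state.

    Returns the operation instruction to execute, or None when nothing is done.
    """
    if not one_shot_mode:
        return None  # first embodiment: ignore the voice information entirely
    # second embodiment: try to recognize; a None result means the input was noise
    return parse_instruction(text)
```

In the second embodiment, the caller would execute the returned instruction once and then wait for the next wake-up.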
In another aspect, if the current semantic is a preset semantic, the device also needs to determine whether it is a shared semantic or a full-duplex semantic.
Specifically, in one embodiment, the step "obtaining the current state of the device" includes:
when the device is not in a full-duplex state, determining whether the current semantic state is a preset semantic;
if the current semantic state is a preset semantic, determining whether the current semantic state is a shared semantic;
if the current semantic state is a shared semantic, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
Even when the device is not in the full-duplex state, it can still recognize the voice information and obtain an operation instruction if the voice information issued by the user is a shared semantic. However, because the device is not in the full-duplex state, it performs only a single voice interaction, then exits the interaction and waits for the user's next wake-up.
In another embodiment, the step "obtaining the current state of the device" includes:
when the device is not in the full-duplex state, determining whether the current semantic state is preset semantics;
if the current semantic state is preset semantics, determining whether the current semantic state is full-duplex semantics;
if the current semantic state is full-duplex semantics, setting the current state of the device to the full-duplex state, recognizing the voice information to obtain an operation instruction, and performing function control on the device according to the operation instruction.
That is, if the device is not in the full-duplex state but the current semantic state is full-duplex semantics, the current state of the device can be set to the full-duplex state, and function control can be performed on the device according to the operation instruction. In other words, when the user issues voice information to the device for the first time after waking it, the device is necessarily not yet in the full-duplex state. In the subsequent step, therefore, the current state of the device is set to the full-duplex state and, further, may be set to one particular full-duplex state, such as the food management scene state described above, so that subsequent voice interaction can proceed.
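The state transition described above can be sketched as follows. The mapping `FULL_DUPLEX_SCENES`, the semantic name `manage_ingredients`, the scene name `food_management`, and the `DeviceState` fields are hypothetical and chosen only for illustration; the source does not specify how scenes are keyed.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional

# Hypothetical mapping from a full-duplex semantic to the scene state it opens.
FULL_DUPLEX_SCENES: Dict[str, str] = {
    "manage_ingredients": "food_management",
}


@dataclass
class DeviceState:
    full_duplex: bool = False
    scene: Optional[str] = None
    executed: List[str] = field(default_factory=list)


def handle_full_duplex_semantic(state: DeviceState, semantic: str) -> bool:
    """Enter the full-duplex state if the semantic is a full-duplex semantic.

    Returns True when the device switched state and executed the instruction.
    """
    if state.full_duplex:
        return False  # already in full-duplex; not the case covered here
    scene = FULL_DUPLEX_SCENES.get(semantic)
    if scene is None:
        return False  # not a full-duplex semantic
    state.full_duplex = True         # set the current state to full-duplex
    state.scene = scene              # optionally a specific full-duplex scene state
    state.executed.append(semantic)  # function control according to the instruction
    return True
```

Unlike the shared-semantics case, the device here stays in the interaction after the first command, so later utterances need no new wake-up.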
In summary, the present invention provides a device voice interaction method in which the current state of the device and the semantic information are both evaluated. When the device is in the full-duplex state and the semantic state is not preset semantics, the semantic information is treated as irrelevant and the device performs no corresponding function control. This prevents the device, while operating in the full-duplex state, from responding to noise and other irrelevant voice information, which would otherwise reduce its working efficiency.
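The filtering behaviour summarized above can be sketched as follows, assuming each utterance has already been reduced to a semantic label. The function name `respond_in_full_duplex` and the example labels are hypothetical.

```python
from typing import Iterable, List, Set


def respond_in_full_duplex(semantics: Iterable[str], preset_semantics: Set[str]) -> List[str]:
    """Return the semantics the device acts on while in the full-duplex state.

    Anything outside the preset semantics is treated as irrelevant
    (noise, background conversation) and triggers no function control.
    """
    handled: List[str] = []
    for semantic in semantics:
        if semantic not in preset_semantics:
            continue  # irrelevant information: no feedback, no function control
        handled.append(semantic)
    return handled
```

For example, with preset semantics `{"add_potatoes", "remove_onions"}`, the stream `["add_potatoes", "tv_chatter", "remove_onions"]` yields function control only for the first and last entries.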
The present invention also provides a refrigerator comprising a memory and a processor. The memory stores a computer program executable on the processor, and the processor, when executing the program, implements the steps of the device voice interaction method described above; that is, it implements the steps of any one of the technical solutions of that method.
The present invention also provides a computer-readable storage medium on which a computer program is stored. When executed by a processor, the computer program implements the steps of the device voice interaction method described above; that is, the processor, when executing the computer program, implements the steps of any one of the technical solutions of that method.
It should be understood that, although this specification is organized by embodiment, each embodiment does not necessarily contain only one independent technical solution. The specification is written this way solely for clarity; those skilled in the art should treat it as a whole, and the technical solutions of the embodiments may be suitably combined to form other embodiments that those skilled in the art can understand.
The detailed descriptions set out above are merely specific descriptions of feasible embodiments of the present invention and are not intended to limit its scope of protection. Any equivalent embodiment or modification that does not depart from the technical spirit of the present invention shall fall within its scope of protection.
Claims (11)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010070740.8A CN113223510B (en) | 2020-01-21 | 2020-01-21 | Refrigerator and equipment voice interaction method and computer readable storage medium thereof |
Publications (2)
Publication Number | Publication Date |
---|---|
CN113223510A (en) | 2021-08-06 |
CN113223510B (en) | 2022-09-20 |
Family
ID=77085451
Family Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010070740.8A Active CN113223510B (en) | 2020-01-21 | 2020-01-21 | Refrigerator and equipment voice interaction method and computer readable storage medium thereof |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN113223510B (en) |
Families Citing this family (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN113223510B (en) * | 2020-01-21 | 2022-09-20 | 青岛海尔电冰箱有限公司 | Refrigerator and equipment voice interaction method and computer readable storage medium thereof |
Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CA2467220A1 (en) * | 2003-05-29 | 2004-11-29 | Microsoft Corporation | Semantic object synchronous understanding implemented with speech application language tags |
US8060371B1 (en) * | 2007-05-09 | 2011-11-15 | Nextel Communications Inc. | System and method for voice interaction with non-voice enabled web pages |
CN105719649A (en) * | 2016-01-19 | 2016-06-29 | 百度在线网络技术(北京)有限公司 | Voice recognition method and device |
CN107316643A (en) * | 2017-07-04 | 2017-11-03 | 科大讯飞股份有限公司 | Voice interactive method and device |
CN108093350A (en) * | 2017-12-21 | 2018-05-29 | 广东小天才科技有限公司 | Microphone control method and microphone |
CN108337362A (en) * | 2017-12-26 | 2018-07-27 | 百度在线网络技术(北京)有限公司 | Voice interactive method, device, equipment and storage medium |
WO2019015435A1 (en) * | 2017-07-19 | 2019-01-24 | 腾讯科技(深圳)有限公司 | Speech recognition method and apparatus, and storage medium |
CN109920413A (en) * | 2018-12-28 | 2019-06-21 | 广州索答信息科技有限公司 | A kind of implementation method and storage medium of kitchen scene touch screen voice dialogue |
CN110634486A (en) * | 2018-06-21 | 2019-12-31 | 阿里巴巴集团控股有限公司 | Voice processing method and device |
CN113223510A (en) * | 2020-01-21 | 2021-08-06 | 青岛海尔电冰箱有限公司 | Refrigerator and equipment voice interaction method and computer readable storage medium thereof |
Family Cites Families (1)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
KR102463806B1 (en) * | 2017-11-09 | 2022-11-07 | 삼성전자주식회사 | Electronic device capable of moving and method for operating thereof |
2020-01-21: CN application CN202010070740.8A, patent CN113223510B (en), status Active
Also Published As
Publication number | Publication date |
---|---|
CN113223510A (en) | 2021-08-06 |
Similar Documents
Publication | Publication Date | Title |
---|---|---|
WO2020048222A1 (en) | Sound effect adjustment method and apparatus, electronic device and storage medium | |
WO2020168571A1 (en) | Device control method, apparatus, system, electronic device and cloud server | |
CN106297781B (en) | Control method and controller | |
US20200075018A1 (en) | Control method of multi voice assistants | |
KR100679043B1 (en) | Voice chat interface device and method | |
KR20220024557A (en) | Detection and/or registration of hot commands to trigger response actions by automated assistants | |
CN113168304A (en) | Conditionally assign various auto-assistant functions to interactions with peripheral assistant-controlled devices | |
JP2023015054A (en) | Dynamic and/or context-specific hot word for calling automation assistant | |
CN107704169B (en) | Virtual human state management method and system | |
CN118200349A (en) | Method and system for generating IoT-based notifications and providing commands | |
KR102439144B1 (en) | Auto-assistant action recommendations for inclusion in auto-assistant routines | |
CN111627436B (en) | Voice control method and device | |
JP2016012340A (en) | Action control system and program | |
JP2003526120A (en) | Dialogue processing method with consumer electronic equipment system | |
KR20160132748A (en) | Electronic apparatus and the controlling method thereof | |
CN115327932A (en) | Scene creation method and device, electronic equipment and storage medium | |
CN109271129B (en) | Sound effect adjustment method, device, electronic device and storage medium | |
WO2021196617A1 (en) | Voice interaction method and apparatus, electronic device and storage medium | |
CN112015365A (en) | Volume adjustment method and device and electronic equipment | |
CN106486118A (en) | A kind of sound control method of application and device | |
WO2016082344A1 (en) | Voice control method and apparatus, and storage medium | |
CN113223510B (en) | Refrigerator and equipment voice interaction method and computer readable storage medium thereof | |
CN110164426A (en) | Sound control method and computer storage medium | |
CN109658924B (en) | Session message processing method and device and intelligent equipment | |
CN113777944B (en) | Intelligent equipment control preference configuration method and system in intelligent home scene |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||
GR01 | Patent grant | ||