CN115206308A - Man-machine interaction method and electronic equipment
- Publication number
- CN115206308A (application CN202110381295.1A)
- Authority
- CN
- China
- Prior art keywords
- voice command
- electronic device
- user
- wake
- word
- Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
- Pending
Classifications
- G—PHYSICS
- G10—MUSICAL INSTRUMENTS; ACOUSTICS
- G10L—SPEECH ANALYSIS TECHNIQUES OR SPEECH SYNTHESIS; SPEECH RECOGNITION; SPEECH OR VOICE PROCESSING TECHNIQUES; SPEECH OR AUDIO CODING OR DECODING
- G10L15/00—Speech recognition
- G10L15/22—Procedures used during a speech recognition process, e.g. man-machine dialogue
- G10L2015/223—Execution procedure of a spoken command
Landscapes
- Engineering & Computer Science (AREA)
- Computational Linguistics (AREA)
- Health & Medical Sciences (AREA)
- Audiology, Speech & Language Pathology (AREA)
- Human Computer Interaction (AREA)
- Physics & Mathematics (AREA)
- Acoustics & Sound (AREA)
- Multimedia (AREA)
- Telephone Function (AREA)
Abstract
The present application provides a human-computer interaction method and an electronic device. The method can be applied to electronic devices with speech recognition and voice interaction functions, such as mobile phones and tablet computers. If a voice command issued by the user, or the user's reply to the electronic device, includes the wake-up word, the method can accurately determine the position of the wake-up word within the voice command and prevent that wake-up word from interrupting the current interaction flow. This avoids interrupting the task currently being executed and keeps the human-computer dialogue coherent. In addition, for devices such as robots that have a sound source localization function and/or an image acquisition function, the method can determine whether the device should turn according to the sound source direction of the voice command, and can estimate the user's willingness to interact from captured images and the like, so that voice interaction with the user is more accurate and the user experience is improved.
Description
Technical Field
The present application relates to the field of electronic technology, and in particular to a human-computer interaction method and an electronic device.
Background
With the development of technology, more and more electronic devices support "human-computer interaction", also called "voice interaction". Human-computer interaction has gradually become a way for users to convey their intentions and control electronic devices. It mainly controls the electronic device through the user's voice commands, which frees the user's hands and makes the device easier to operate.
Before a user interacts with an electronic device, the device can generally be woken up by a "wake-up word". After being woken up, the device can give the user a response indicating that the wake-up succeeded, then start collecting the user's voice commands and performing automatic speech recognition (ASR). During speech recognition after wake-up, if an acquired voice command includes the wake-up word, the wake-up word may interrupt the current human-computer interaction and cause the device to start collecting and recognizing the user's voice commands again. This interruption may not be what the user expects: the wake-up word directly interrupts the task currently being executed, so the electronic device restarts collecting the user's voice commands. The result is an incoherent human-computer dialogue that disrupts the user's workflow and degrades the interaction experience.
Summary of the Invention
The present application provides a human-computer interaction method and an electronic device. The electronic device may be a mobile phone, robot, tablet, computer, or other device with a speech recognition function. The method can provide the user with a coherent, immersive experience and improve the user's visual experience.
In a first aspect, a human-computer interaction method is provided. The method includes: receiving a wake-up word uttered by a user and, in response to the wake-up word, enabling the speech recognition function of an electronic device; acquiring a first voice command of the user and, when the first voice command is detected to include the wake-up word, determining the first time period that the wake-up word occupies within the time period corresponding to the first voice command; removing the wake-up word in the first time period and recognizing the target voice command, that is, the part of the first voice command other than the wake-up word; and responding to the target voice command.
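The removal step of the first aspect can be illustrated with a minimal sketch. It assumes, beyond anything stated in the application, that the recognizer yields word-level timestamps; the `Token` type and the `strip_wake_word` helper are hypothetical names introduced only for illustration, and only the first occurrence of the wake-up word is removed.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass
class Token:
    """One recognized word with its timing (hypothetical structure)."""
    text: str
    start: float  # seconds, relative to a start time chosen by the caller
    end: float


def strip_wake_word(
    tokens: List[Token], wake_word: str
) -> Tuple[str, Optional[Tuple[float, float]]]:
    """Return (target command text, time span of the wake-up word or None)."""
    wake = wake_word.lower().split()
    n = len(wake)
    if n == 0:
        return " ".join(t.text for t in tokens), None
    for i in range(len(tokens) - n + 1):
        window = [t.text.lower() for t in tokens[i:i + n]]
        if window == wake:
            # The "first time period" occupied by the wake-up word.
            span = (tokens[i].start, tokens[i + n - 1].end)
            # Everything outside that period is the target voice command.
            rest = tokens[:i] + tokens[i + n:]
            return " ".join(t.text for t in rest), span
    return " ".join(t.text for t in tokens), None
```

A command with no wake-up word passes through unchanged and the returned span is `None`, so the caller can tell the two cases apart.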
In one possible scenario, take as an example a user who wakes up a mobile phone with the wake-up word "Xiaoyi Xiaoyi". After being woken up, the phone enters a state of listening for the user's voice commands. If a voice command issued by the user again includes the wake-up word "Xiaoyi Xiaoyi", the wake-up word can interrupt the current human-computer interaction and start a new one. This may not be what the user expects: the wake-up word directly interrupts the task currently being executed, so the phone has to restart collecting the user's voice commands. The result is an incoherent human-computer dialogue that disrupts the user's workflow and degrades the interaction experience.
With the above method, during voice interaction between the user and the electronic device, after the user has woken up the device with the wake-up word, if a voice command issued by the user again includes the wake-up word, the method prevents that wake-up word from interrupting the current interaction flow. The device therefore does not directly interrupt the task it is executing and restart the collection of the user's voice commands, which keeps the human-computer dialogue coherent and improves the user experience.
It should be understood that the phone's automatic speech recognition (ASR) module is not always on and working. When the user issues a voice command, the phone's ASR module is off; or, when the phone is answering the user, the ASR module is off so that the phone does not capture its own speech and interfere with the collection and recognition of the user's voice commands. After being woken up by the wake-up word, the phone can first check whether the ASR module is on. If the ASR module is dormant or otherwise off, the phone can trigger it to turn on, that is, enable the speech recognition function of the electronic device.
Optionally, when the phone acquires and recognizes the wake-up word "Xiaoyi Xiaoyi" for the first time, if it determines that the ASR module is currently on, it can ignore this wake-up and continue the current dialogue flow.
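The ASR check described in the two preceding paragraphs might be sketched as follows. The `AsrGate` class and its return labels are hypothetical; the sketch covers only the "enable if off, otherwise ignore" behaviour and none of the later interval-based logic.

```python
class AsrGate:
    """Tracks whether the ASR module is on (hypothetical helper)."""

    def __init__(self) -> None:
        self.asr_on = False  # ASR is assumed to start in the off state

    def on_wake_word(self) -> str:
        if self.asr_on:
            # ASR already running: ignore this wake-up, keep the dialogue going.
            return "ignored"
        # ASR dormant or off: enable speech recognition.
        self.asr_on = True
        return "asr_enabled"
```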
With reference to the first aspect, in some implementations of the first aspect, the first time period is the end period, the middle period, or the start period of the time period corresponding to the first voice command.
After being woken up, the phone monitors the user's first voice command. When it detects that the first voice command again includes the wake-up word "Xiaoyi Xiaoyi", it can first determine the position of the wake-up word within the first voice command. The position is mainly one of the following: the beginning, the middle, or the end of the first voice command. For example, a first voice command that includes the wake-up word might be "Imitate the sound of a cow, Xiaoyi Xiaoyi" (wake-up word at the end), "Imitate an animal sound, Xiaoyi Xiaoyi, imitate the sound of a cow" (wake-up word in the middle), or "Xiaoyi Xiaoyi, imitate the sound of a cow" (wake-up word at the beginning).
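One way to picture the position check is to compare the time span of the wake-up word with the time span of the whole command. The function name `wake_word_position` and the tolerance `tol` are assumptions made for this sketch; the application does not specify how the comparison is performed.

```python
def wake_word_position(
    cmd_start: float,
    cmd_end: float,
    wake_start: float,
    wake_end: float,
    tol: float = 0.05,  # assumed tolerance in seconds, not from the application
) -> str:
    """Classify the wake-up word as at the 'start', 'end', or 'middle'."""
    if wake_start - cmd_start <= tol:
        return "start"
    if cmd_end - wake_end <= tol:
        return "end"
    return "middle"
```

If the wake-up word is the entire command, this sketch reports "start"; a real implementation would likely treat that case separately.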
With reference to the first aspect and the above implementations, in some implementations of the first aspect, when the first time period is the end period of the time period corresponding to the first voice command, the method further includes: detecting the time interval between the wake-up word and the voice command closest to it in the first voice command; and, when the time interval is greater than or equal to a first preset value, pausing the current dialogue flow and, in response to the wake-up word, re-enabling the speech recognition function of the electronic device so that the electronic device acquires a second voice command.
It should be understood that the first preset value can be used to determine whether the user wishes to interrupt the current dialogue flow. For example, when the first voice command is "Imitate the sound of a cow, Xiaoyi Xiaoyi", the wake-up word is at the end of the voice command, and the voice command closest to the wake-up word is "Imitate the sound of a cow". The phone can infer the user's purpose in uttering the wake-up word from the time interval between "Imitate the sound of a cow" and "Xiaoyi Xiaoyi". When the interval between the last syllable of "Imitate the sound of a cow" and the first syllable of "Xiaoyi Xiaoyi" is less than the first preset value, the phone can judge that the user may simply be using the wake-up word as a verbal habit and wishes to continue the current dialogue flow rather than switch to a new one.
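The interval test for a trailing wake-up word reduces to a single comparison. In the sketch below the function name and the default `first_preset` of 1.0 second are assumptions; the application does not give a value for the first preset value.

```python
def should_restart_dialogue(
    prev_speech_end: float,
    wake_start: float,
    first_preset: float = 1.0,  # assumed value in seconds
) -> bool:
    """True if the pause before a trailing wake-up word is long enough to be
    treated as a deliberate new wake-up; False if the wake-up word is treated
    as part of the current command (for example, a verbal habit)."""
    interval = wake_start - prev_speech_end
    return interval >= first_preset
```

With these assumed numbers, a wake-up word that follows the command after 0.1 s keeps the current dialogue, while one that follows after 1.6 s pauses it and restarts speech recognition.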
Optionally, the phone can record, from the first voice command, the time information of the wake-up word "Xiaoyi Xiaoyi" within that command. The embodiments of the present application do not limit the rules for recording and marking time information. For example, if the time at which the initial wake-up word woke up the phone is taken as the start time, the period in which the wake-up word reappears in the first voice command is t1-t2; or, taking the time at which the initial wake-up word woke up the phone as the start time, the period in which the wake-up word reappears in the first voice command is T1-T2. The position of the wake-up word in the first voice command can be determined from this time information.
In a second aspect, a human-computer interaction method is provided. The method includes: acquiring a first voice command of a user and detecting the sound source direction of the first voice command from that command; determining a first angle between the sound source direction of the first voice command and the first line-of-sight direction that the electronic device is currently facing; when the first angle is greater than or equal to a first preset angle, determining a second angle between the sound source direction of the first voice command and the sound source direction of a second voice command, where the second voice command is the voice command that the user issued before the first voice command and that is closest to it; and, when the second angle is less than or equal to a second preset angle, responding, by the electronic device, to the first voice command.
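The angle checks of the second aspect can be sketched by treating each direction as an azimuth in degrees. Everything concrete here is an assumption for illustration: the azimuth representation, the function names, the default preset angles, and the labels returned for the two branches that the paragraph above leaves unspecified.

```python
def angle_between(a_deg: float, b_deg: float) -> float:
    """Smallest angle in degrees between two azimuths, in [0, 180]."""
    d = abs(a_deg - b_deg) % 360.0
    return 360.0 - d if d > 180.0 else d


def classify_command(
    source_dir: float,        # sound source direction of the first command
    facing_dir: float,        # direction the device currently faces
    prev_source_dir: float,   # sound source direction of the previous command
    first_preset_angle: float = 30.0,   # assumed value
    second_preset_angle: float = 15.0,  # assumed value
) -> str:
    first_angle = angle_between(source_dir, facing_dir)
    if first_angle < first_preset_angle:
        # Not covered by the claim text; label only (assumption).
        return "within_view"
    second_angle = angle_between(source_dir, prev_source_dir)
    if second_angle <= second_preset_angle:
        # Same direction as the previous command: respond to the command.
        return "respond"
    # Handled by other conditions (time interval, image checks).
    return "other"
```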
In another possible scenario, some electronic devices, such as robots, may have sound source localization capability or a camera-based image acquisition function. After a robot is woken up by the wake-up word, it can determine the direction of the user by sound source localization and rotate its camera directly toward the direction or position of the user so located. In this process, the estimated direction of the user may contain a large error, for example because the sound is reflected by a wall. When such an error occurs, the device ends up not facing the person after it turns.
The above method makes the robot's wake-up behaviour better match what people expect. When the angle θ between the sound source direction of the user's voice command and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle and the user's willingness to interact is strong, the robot can decide to turn toward the user automatically. When the angle θ is greater than or equal to the first preset angle and the user's willingness to interact is low, the robot can also turn back. The interaction flow between the user and the robot is not interrupted in the process, which gives the user a better human-computer interaction experience.
It should be understood that when the angle θ between the sound source direction of the first voice command and the line-of-sight direction is greater than or equal to the first preset angle, the user who issued the voice command and the robot can be considered not to be face to face. In other words, the user who issued the voice command is not within the central area of the image captured by the robot. The embodiments of the present application do not limit the range corresponding to the central area.
It should also be understood that the "previous voice command" here is the voice command closest to, and preceding, the first voice command. Optionally, the "previous voice command" may be the user's wake-up word command, for example "Xiaoyi Xiaoyi", or it may be another voice command issued after the wake-up word, for example "Please imitate the sound of a cow". The embodiments of the present application do not limit this.
With reference to the second aspect, in some implementations of the second aspect, the method further includes: detecting the time interval between the first voice command and the second voice command; and, when the time interval is less than or equal to a second preset value, calling a steering execution function to turn the electronic device so that it faces, or comes as close as possible to facing, the sound source direction of the first voice command.
Optionally, the conditions "the angle between the sound source direction of the first voice command and the sound source direction of the previous voice command is greater than or equal to the second preset angle" and "the time interval between the two voice commands is greater than or equal to the second preset value" may be satisfied individually or together before the steering execution function is called to change the robot's direction. The embodiments of the present application do not limit this.
With reference to the second aspect and the above implementations, in some implementations of the second aspect, the method further includes: capturing a first image of the electronic device in the first line-of-sight direction; and, when the first image includes the user and a third angle between the user's line-of-sight direction and the sound source direction of the first voice command is less than or equal to a third preset angle, calling the steering execution function to turn the electronic device so that it faces, or comes as close as possible to facing, the sound source direction of the first voice command.
Optionally, the robot can also capture images with a camera and estimate the user's willingness to interact by detecting, in the captured images, the direction in which the user's eyes are looking.
With reference to the second aspect and the above implementations, in some implementations of the second aspect, the method further includes: capturing, by the electronic device, a second image in the direction that faces, or is as close as possible to facing, the sound source direction of the first voice command; and, when the second image does not include the user, or a fourth angle between the user's line-of-sight direction and the current second line-of-sight direction of the electronic device is greater than a fourth preset angle, turning the electronic device back to the first line-of-sight direction.
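The two image-based checks above, turning toward the user and turning back, can be sketched as a pair of predicates. The sketch assumes that an upstream vision component has already reported whether the user appears in the image and, if so, the relevant gaze angle in degrees; the function names and the default preset angles are hypothetical.

```python
from typing import Optional


def should_turn(
    user_in_first_image: bool,
    gaze_to_source_angle: Optional[float],  # the "third angle"; None if no user
    third_preset_angle: float = 20.0,       # assumed value
) -> bool:
    """Turn toward the sound source if the user is visible and the third
    angle is within the third preset angle."""
    if not user_in_first_image or gaze_to_source_angle is None:
        return False
    return gaze_to_source_angle <= third_preset_angle


def should_turn_back(
    user_in_second_image: bool,
    gaze_to_device_angle: Optional[float],  # the "fourth angle"; None if no user
    fourth_preset_angle: float = 20.0,      # assumed value
) -> bool:
    """Return to the first line-of-sight direction if, after turning, the
    user is absent or the fourth angle exceeds the fourth preset angle."""
    if not user_in_second_image or gaze_to_device_angle is None:
        return True
    return gaze_to_device_angle > fourth_preset_angle
```

Keeping the two predicates separate mirrors the two implementations in the text: the first decides whether to call the steering execution function, and the second decides whether to undo the turn.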
In summary, during voice interaction between the user and the electronic device, after the user has woken up the device with the wake-up word, if a voice command issued by the user, or the user's reply to the device, again includes the wake-up word, the method prevents that wake-up word from interrupting the current interaction flow. The device therefore does not directly interrupt the task it is executing and restart the collection of the user's voice commands, which keeps the human-computer dialogue coherent and improves the user experience.
In addition, for electronic devices such as robots that have sound source localization capability, the method provided by the embodiments of the present application can determine whether the device should turn according to the sound source direction of the voice command, and can estimate the user's willingness to interact from captured images and the like, so that voice interaction with the user is more accurate. Specifically, when the angle θ between the sound source direction of the user's voice command and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle and the user's willingness to interact is strong, the robot can decide to turn toward the user automatically. When the angle θ is greater than or equal to the first preset angle and the user's willingness to interact is low, the robot can also turn back. The interaction flow between the user and the robot is not interrupted in the process, which gives the user a better human-computer interaction experience.
In a third aspect, an electronic device is provided, including: one or more processors; one or more memories; and a module on which multiple application programs are installed. The memory stores one or more programs that, when executed by the processor, cause the electronic device to perform the following steps: receiving a wake-up word uttered by a user and, in response to the wake-up word, enabling the speech recognition function of the electronic device; acquiring a first voice command of the user and, when the first voice command is detected to include the wake-up word, determining the first time period that the wake-up word occupies within the time period corresponding to the first voice command; removing the wake-up word in the first time period and recognizing the target voice command, that is, the part of the first voice command other than the wake-up word; and responding to the target voice command.
With reference to the third aspect, in some implementations of the third aspect, the first time period is the end period, the middle period, or the start period of the time period corresponding to the first voice command.
With reference to the third aspect and the above implementations, in some implementations of the third aspect, when the first time period is the end period of the time period corresponding to the first voice command, the electronic device can further perform the following steps: detecting the time interval between the wake-up word and the voice command closest to it in the first voice command; and, when the time interval is greater than or equal to a first preset value, pausing the current dialogue flow and, in response to the wake-up word, re-enabling the speech recognition function of the electronic device so that the electronic device acquires a second voice command.
In a fourth aspect, an electronic device is provided, including: a camera; one or more processors; one or more memories; and a module on which multiple application programs are installed. The memory stores one or more programs that, when executed by the processor, cause the electronic device to perform the following steps: acquiring a first voice command of a user and detecting the sound source direction of the first voice command from that command; determining a first angle between the sound source direction of the first voice command and the first line-of-sight direction that the electronic device is currently facing; when the first angle is greater than or equal to a first preset angle, determining a second angle between the sound source direction of the first voice command and the sound source direction of a second voice command, where the second voice command is the voice command that the user issued before the first voice command and that is closest to it; and, when the second angle is less than or equal to a second preset angle, responding to the first voice command.
With reference to the fourth aspect, in some implementations of the fourth aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the following steps: detecting the time interval between the first voice command and the second voice command; and, when the time interval is less than or equal to a second preset value, calling a steering execution function to turn the electronic device so that it faces, or comes as close as possible to facing, the sound source direction of the first voice command.
With reference to the fourth aspect and the above implementations, in some implementations of the fourth aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the following steps: capturing a first image of the electronic device in the first line-of-sight direction; and, when the first image includes the user and a third angle between the user's line-of-sight direction and the sound source direction of the first voice command is less than or equal to a third preset angle, calling the steering execution function to turn the electronic device so that it faces, or comes as close as possible to facing, the sound source direction of the first voice command.
With reference to the fourth aspect and the above implementations, in some implementations of the fourth aspect, the one or more programs, when executed by the processor, cause the electronic device to perform the following steps: capturing a second image in the direction that faces, or is as close as possible to facing, the sound source direction of the first voice command; and, when the second image does not include the user, or a fourth angle between the user's line-of-sight direction and the current second line-of-sight direction of the electronic device is greater than a fourth preset angle, turning the electronic device back to the first line-of-sight direction.
第五方面,本申请提供了一种装置,该装置包含在电子设备中,该装置具有实现上述方面及上述方面的可能实现方式中电子设备行为的功能。功能可以通过硬件实现,也可以通过硬件执行相应的软件实现。硬件或软件包括一个或多个与上述功能相对应的模块或单元。例如,显示模块或单元、检测模块或单元、处理模块或单元等。In a fifth aspect, the present application provides an apparatus, the apparatus is included in an electronic device, and the apparatus has a function of implementing the behavior of the electronic device in the above-mentioned aspect and possible implementation manners of the above-mentioned aspect. The functions can be implemented by hardware, or by executing corresponding software by hardware. The hardware or software includes one or more modules or units corresponding to the above functions. For example, a display module or unit, a detection module or unit, a processing module or unit, and the like.
第六方面,本申请提供了一种电子设备,包括:触摸显示屏,其中,触摸显示屏包括触敏表面和显示器;一个或多个音频设备;摄像头;一个或多个处理器;存储器;多个应用程序;以及一个或多个计算机程序。其中,一个或多个计算机程序被存储在存储器中,一个或多个计算机程序包括指令。当指令被电子设备执行时,使得电子设备执行上述任一方面任一项可能的实现中的人机交互的方法。In a sixth aspect, the present application provides an electronic device, comprising: a touch display screen, wherein the touch display screen includes a touch-sensitive surface and a display; one or more audio devices; a camera; one or more processors; a memory; an application program; and one or more computer programs. Wherein, one or more computer programs are stored in the memory, the one or more computer programs comprising instructions. When the instructions are executed by the electronic device, the electronic device is caused to perform the method of human-computer interaction in any of the possible implementations of any of the above aspects.
第七方面,本申请提供了一种电子设备,包括一个或多个处理器和一个或多个存储器。该一个或多个存储器与一个或多个处理器耦合,一个或多个存储器用于存储计算机程序代码,计算机程序代码包括计算机指令,当一个或多个处理器执行计算机指令时,使得电子设备执行上述任一方面任一项可能的实现中的人机交互的方法。In a seventh aspect, the present application provides an electronic device including one or more processors and one or more memories. The one or more memories are coupled to the one or more processors for storing computer program code, the computer program code comprising computer instructions that, when executed by the one or more processors, cause the electronic device to perform A method for human-computer interaction in any possible implementation of any of the above aspects.
第八方面,本申请提供了一种计算机可读存储介质,包括计算机指令,当计算机指令在电子设备上运行时,使得电子设备执行上述任一方面任一项可能的人机交互的方法。In an eighth aspect, the present application provides a computer-readable storage medium, including computer instructions, when the computer instructions are executed on an electronic device, the electronic device performs any of the possible human-computer interaction methods in any of the foregoing aspects.
In a ninth aspect, the present application provides a computer program product. When the computer program product runs on an electronic device, the electronic device is caused to perform the human-computer interaction method in any possible implementation of any of the foregoing aspects.
Brief Description of the Drawings
FIG. 1 is a schematic structural diagram of an example of an electronic device provided by an embodiment of the present application.
FIG. 2 is a block diagram of the software structure of an electronic device according to an embodiment of the present application.
FIG. 3 is a schematic diagram of a graphical user interface in an example of a human-computer interaction process.
FIG. 4 is a schematic flowchart of an example of a human-computer interaction method provided by an embodiment of the present application.
FIG. 5 is a schematic diagram of an example of a human-computer interaction scenario provided by an embodiment of the present application.
FIG. 6 is a schematic flowchart of an example of a human-computer interaction method provided by an embodiment of the present application.
Detailed Description
The technical solutions in the embodiments of the present application are described below with reference to the accompanying drawings. In the description of the embodiments, unless otherwise stated, "/" means "or"; for example, A/B may mean A or B. The term "and/or" merely describes an association between associated objects and indicates that three relationships are possible; for example, "A and/or B" may mean that A exists alone, that both A and B exist, or that B exists alone. In addition, in the description of the embodiments, "a plurality of" means two or more.
Hereinafter, the terms "first" and "second" are used for descriptive purposes only and should not be construed as indicating or implying relative importance or as implicitly indicating the number of the technical features referred to. Thus, a feature qualified by "first" or "second" may explicitly or implicitly include one or more of that feature.
The human-computer interaction method provided by the embodiments of the present application can be applied to electronic devices such as mobile phones, tablet computers, wearable devices, vehicle-mounted devices, augmented reality (AR)/virtual reality (VR) devices, notebook computers, ultra-mobile personal computers (UMPC), netbooks, and personal digital assistants (PDA). The embodiments of the present application place no restriction on the specific type of the electronic device.
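For example, FIG. 1 is a schematic structural diagram of an example of an electronic device provided by an embodiment of the present application. The electronic device 100 may include a processor 110, an external memory interface 120, an internal memory 121, a universal serial bus (USB) interface 130, a charging management module 140, a power management module 141, a battery 142, an antenna 1, an antenna 2, a mobile communication module 150, a wireless communication module 160, an audio module 170, a speaker 170A, a receiver 170B, a microphone 170C, a headset jack 170D, a sensor module 180, keys 190, a motor 191, an indicator 192, a camera 193, a display screen 194, a subscriber identification module (SIM) card interface 195, and the like. The sensor module 180 may include a pressure sensor 180A, a gyroscope sensor 180B, an air pressure sensor 180C, a magnetic sensor 180D, an acceleration sensor 180E, a distance sensor 180F, a proximity light sensor 180G, a fingerprint sensor 180H, a temperature sensor 180J, a touch sensor 180K, an ambient light sensor 180L, a bone conduction sensor 180M, and the like.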
It can be understood that the structure illustrated in the embodiments of the present application does not constitute a specific limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may include more or fewer components than shown, combine some components, split some components, or arrange the components differently. The illustrated components may be implemented in hardware, software, or a combination of software and hardware.
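The processor 110 may include one or more processing units. For example, the processor 110 may include an application processor (AP), a modem processor, a graphics processing unit (GPU), an image signal processor (ISP), a controller, a memory, a video codec, a digital signal processor (DSP), a baseband processor, and/or a neural-network processing unit (NPU). Different processing units may be independent devices or may be integrated in one or more processors.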
The controller may be the nerve center and command center of the electronic device 100. The controller can generate operation control signals according to the instruction operation code and timing signals, and thereby control instruction fetching and instruction execution.
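A memory may also be provided in the processor 110 for storing instructions and data. In some embodiments, the memory in the processor 110 is a cache. This memory can hold instructions or data that the processor 110 has just used or uses repeatedly. If the processor 110 needs the instruction or data again, it can fetch it directly from this memory. This avoids repeated accesses and reduces the waiting time of the processor 110, thereby improving system efficiency.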
In some embodiments, the processor 110 may include one or more interfaces. The interfaces may include an inter-integrated circuit (I2C) interface, an inter-integrated circuit sound (I2S) interface, a pulse code modulation (PCM) interface, a universal asynchronous receiver/transmitter (UART) interface, a mobile industry processor interface (MIPI), a general-purpose input/output (GPIO) interface, a subscriber identity module (SIM) interface, and/or a universal serial bus (USB) interface.
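The I2C interface is a bidirectional synchronous serial bus that includes a serial data line (SDA) and a serial clock line (SCL). In some embodiments, the processor 110 may contain multiple sets of I2C buses. The processor 110 may be coupled to the touch sensor 180K, a charger, a flash, the camera 193, and the like through different I2C bus interfaces. For example, the processor 110 may be coupled to the touch sensor 180K through an I2C interface, so that the processor 110 and the touch sensor 180K communicate over the I2C bus interface to implement the touch function of the electronic device 100.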
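The I2S interface can be used for audio communication. In some embodiments, the processor 110 may contain multiple sets of I2S buses. The processor 110 may be coupled to the audio module 170 through an I2S bus to implement communication between the processor 110 and the audio module 170. In some embodiments, the audio module 170 can transmit audio signals to the wireless communication module 160 through the I2S interface, so that calls can be answered through a Bluetooth headset.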
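The PCM interface can also be used for audio communication, in which an analog signal is sampled, quantized, and encoded. In some embodiments, the audio module 170 and the wireless communication module 160 may be coupled through a PCM bus interface. In some embodiments, the audio module 170 can also transmit audio signals to the wireless communication module 160 through the PCM interface, so that calls can be answered through a Bluetooth headset. Both the I2S interface and the PCM interface can be used for audio communication.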
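The sample, quantize, and encode steps named above can be illustrated with a minimal sketch. The 8 kHz sample rate, the 16-bit little-endian encoding, and the names `pcm_encode` and `tone` are assumptions chosen for illustration; the application does not specify them.

```python
import math

def pcm_encode(signal, n_samples, sample_rate_hz=8000, bits=16):
    """Sample a continuous signal, quantize it, and encode it as PCM bytes.

    signal: function mapping time in seconds to an amplitude in [-1.0, 1.0].
    The sample rate and bit depth are illustrative assumptions.
    """
    max_code = 2 ** (bits - 1) - 1          # largest positive code, e.g. 32767
    out = bytearray()
    for n in range(n_samples):
        t = n / sample_rate_hz               # sampling: pick discrete instants
        x = max(-1.0, min(1.0, signal(t)))   # clip out-of-range amplitudes
        code = round(x * max_code)           # quantization: nearest integer level
        # encoding: signed little-endian integer of the chosen width
        out += code.to_bytes(bits // 8, "little", signed=True)
    return bytes(out)

def tone(t):
    """A 440 Hz sine wave standing in for the analog input."""
    return math.sin(2 * math.pi * 440 * t)

frame = pcm_encode(tone, 80)  # 10 ms of audio at 8 kHz
```

Each sample occupies two bytes here, so 80 samples yield a 160-byte frame.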
The UART interface is a universal serial data bus used for asynchronous communication. The bus may be a bidirectional communication bus. It converts the data to be transmitted between serial and parallel form. In some embodiments, the UART interface is typically used to connect the processor 110 and the wireless communication module 160. For example, the processor 110 communicates with the Bluetooth module in the wireless communication module 160 through the UART interface to implement the Bluetooth function. In some embodiments, the audio module 170 can transmit audio signals to the wireless communication module 160 through the UART interface, so that music can be played through a Bluetooth headset.
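The MIPI interface can be used to connect the processor 110 to peripheral devices such as the display screen 194 and the camera 193. The MIPI interface includes a camera serial interface (CSI), a display serial interface (DSI), and the like. In some embodiments, the processor 110 and the camera 193 communicate through the CSI interface to implement the shooting function of the electronic device 100. The processor 110 and the display screen 194 communicate through the DSI interface to implement the display function of the electronic device 100.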
The GPIO interface can be configured by software. The GPIO interface can be configured to carry control signals or data signals. In some embodiments, the GPIO interface may be used to connect the processor 110 to the camera 193, the display screen 194, the wireless communication module 160, the audio module 170, the sensor module 180, and the like. The GPIO interface can also be configured as an I2C interface, an I2S interface, a UART interface, a MIPI interface, and the like.
The USB interface 130 is an interface that conforms to the USB standard specification, and may specifically be a Mini USB interface, a Micro USB interface, a USB Type C interface, or the like. The USB interface 130 can be used to connect a charger to charge the electronic device 100, and can also be used to transfer data between the electronic device 100 and peripheral devices. It can also be used to connect a headset and play audio through the headset. The interface can further be used to connect other electronic devices, such as AR devices.
It can be understood that the interface connections between the modules illustrated in the embodiments of the present application are merely schematic and do not constitute a structural limitation on the electronic device 100. In other embodiments of the present application, the electronic device 100 may also use interface connection manners different from those in the foregoing embodiments, or a combination of multiple interface connection manners.
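The charging management module 140 is used to receive charging input from a charger. The charger may be a wireless charger or a wired charger. In some wired charging embodiments, the charging management module 140 may receive the charging input of a wired charger through the USB interface 130. In some wireless charging embodiments, the charging management module 140 may receive wireless charging input through a wireless charging coil of the electronic device 100. While charging the battery 142, the charging management module 140 can also supply power to the electronic device through the power management module 141.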
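The power management module 141 is used to connect the battery 142, the charging management module 140, and the processor 110. The power management module 141 receives input from the battery 142 and/or the charging management module 140, and supplies power to the processor 110, the internal memory 121, the external memory, the display screen 194, the camera 193, the wireless communication module 160, and the like. The power management module 141 can also be used to monitor parameters such as battery capacity, battery cycle count, and battery health status (leakage, impedance). In some other embodiments, the power management module 141 may also be provided in the processor 110. In still other embodiments, the power management module 141 and the charging management module 140 may be provided in the same device.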
The wireless communication function of the electronic device 100 may be implemented by the antenna 1, the antenna 2, the mobile communication module 150, the wireless communication module 160, the modem processor, the baseband processor, and the like.
The antenna 1 and the antenna 2 are used to transmit and receive electromagnetic wave signals. Each antenna in the electronic device 100 may be used to cover one or more communication frequency bands. Different antennas can also be multiplexed to improve antenna utilization. For example, the antenna 1 can be multiplexed as a diversity antenna of the wireless local area network. In some other embodiments, an antenna may be used in combination with a tuning switch.
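The mobile communication module 150 can provide wireless communication solutions, including 2G/3G/4G/5G, that are applied to the electronic device 100. The mobile communication module 150 may include at least one filter, a switch, a power amplifier, a low noise amplifier (LNA), and the like. The mobile communication module 150 can receive electromagnetic waves through the antenna 1, filter and amplify the received electromagnetic waves, and transmit them to the modem processor for demodulation. The mobile communication module 150 can also amplify the signal modulated by the modem processor and convert it into electromagnetic waves that are radiated through the antenna 1. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the processor 110. In some embodiments, at least some functional modules of the mobile communication module 150 may be provided in the same device as at least some modules of the processor 110.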
The modem processor may include a modulator and a demodulator. The modulator is used to modulate the low-frequency baseband signal to be sent into a medium- or high-frequency signal. The demodulator is used to demodulate the received electromagnetic wave signal into a low-frequency baseband signal. The demodulator then transmits the demodulated low-frequency baseband signal to the baseband processor for processing. After being processed by the baseband processor, the low-frequency baseband signal is passed to the application processor. The application processor outputs sound signals through audio devices (not limited to the speaker 170A, the receiver 170B, and the like), or displays images or video through the display screen 194. In some embodiments, the modem processor may be an independent device. In other embodiments, the modem processor may be independent of the processor 110 and provided in the same device as the mobile communication module 150 or other functional modules.
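The wireless communication module 160 can provide wireless communication solutions applied to the electronic device 100, including wireless local area networks (WLAN) (such as wireless fidelity (Wi-Fi) networks), Bluetooth (BT), global navigation satellite system (GNSS), frequency modulation (FM), near field communication (NFC), and infrared (IR) technology. The wireless communication module 160 may be one or more devices integrating at least one communication processing module. The wireless communication module 160 receives electromagnetic waves via the antenna 2, performs frequency modulation and filtering on the electromagnetic wave signals, and sends the processed signals to the processor 110. The wireless communication module 160 can also receive signals to be sent from the processor 110, perform frequency modulation and amplification on them, and convert them into electromagnetic waves that are radiated through the antenna 2.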
In some embodiments, the antenna 1 of the electronic device 100 is coupled to the mobile communication module 150, and the antenna 2 is coupled to the wireless communication module 160, so that the electronic device 100 can communicate with networks and other devices through wireless communication technologies. The wireless communication technologies may include global system for mobile communications (GSM), general packet radio service (GPRS), code division multiple access (CDMA), wideband code division multiple access (WCDMA), time-division synchronous code division multiple access (TD-SCDMA), long term evolution (LTE), BT, GNSS, WLAN, NFC, FM, and/or IR technologies. The GNSS may include the global positioning system (GPS), the global navigation satellite system (GLONASS), the BeiDou navigation satellite system (BDS), the quasi-zenith satellite system (QZSS), and/or satellite based augmentation systems (SBAS).
The electronic device 100 implements the display function through the GPU, the display screen 194, the application processor, and the like. The GPU is a microprocessor for image processing and connects the display screen 194 and the application processor. The GPU is used to perform mathematical and geometric calculations for graphics rendering. The processor 110 may include one or more GPUs that execute program instructions to generate or change display information.
The display screen 194 is used to display images, video, and the like. The display screen 194 includes a display panel. The display panel may be a liquid crystal display (LCD), an organic light-emitting diode (OLED), an active-matrix organic light-emitting diode (AMOLED), a flexible light-emitting diode (FLED), a MiniLED, a MicroLED, a Micro-OLED, a quantum dot light-emitting diode (QLED), or the like. In some embodiments, the electronic device 100 may include 1 or N display screens 194, where N is a positive integer greater than 1.
The electronic device 100 may implement the shooting function through the ISP, the camera 193, the video codec, the GPU, the display screen 194, the application processor, and the like.
The ISP is used to process the data fed back by the camera 193. For example, when a photo is taken, the shutter opens, light passes through the lens onto the photosensitive element of the camera, and the optical signal is converted into an electrical signal. The photosensitive element passes the electrical signal to the ISP, which processes it into an image visible to the naked eye. The ISP can also algorithmically optimize the noise, brightness, and skin tone of the image, and can optimize parameters such as the exposure and color temperature of the shooting scene. In some embodiments, the ISP may be provided in the camera 193.
The camera 193 is used to capture still images or video. An object generates an optical image through the lens, and the image is projected onto the photosensitive element. The photosensitive element may be a charge coupled device (CCD) or a complementary metal-oxide-semiconductor (CMOS) phototransistor. The photosensitive element converts the optical signal into an electrical signal and passes it to the ISP, which converts it into a digital image signal. The ISP outputs the digital image signal to the DSP for processing. The DSP converts the digital image signal into an image signal in a standard format such as RGB or YUV. In some embodiments, the electronic device 100 may include 1 or N cameras 193, where N is a positive integer greater than 1.
The digital signal processor is used to process digital signals. In addition to digital image signals, it can process other digital signals. For example, when the electronic device 100 selects a frequency point, the digital signal processor is used to perform a Fourier transform or similar operation on the frequency point energy.
The video codec is used to compress or decompress digital video. The electronic device 100 may support one or more video codecs. In this way, the electronic device 100 can play or record video in multiple encoding formats, for example, moving picture experts group (MPEG) 1, MPEG2, MPEG3, and MPEG4.
The NPU is a neural-network (NN) computing processor. By drawing on the structure of biological neural networks, for example the way signals are transferred between neurons in the human brain, it processes input information quickly and can also learn continuously. Applications such as intelligent cognition of the electronic device 100 can be implemented through the NPU, for example, image recognition, face recognition, speech recognition, and text understanding.
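The external memory interface 120 can be used to connect an external memory card, such as a Micro SD card, to expand the storage capacity of the electronic device 100. The external memory card communicates with the processor 110 through the external memory interface 120 to implement the data storage function, for example, saving files such as music and video on the external memory card.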
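The internal memory 121 may be used to store computer-executable program code, which includes instructions. The processor 110 executes the various functional applications and data processing of the electronic device 100 by running the instructions stored in the internal memory 121. The internal memory 121 may include a program storage area and a data storage area. The program storage area can store the operating system and the application programs required for at least one function (such as a sound playback function or an image playback function). The data storage area can store data created during use of the electronic device 100 (such as audio data and a phone book). In addition, the internal memory 121 may include high-speed random access memory, and may also include non-volatile memory, such as at least one magnetic disk storage device, a flash memory device, or universal flash storage (UFS).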
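The electronic device 100 may implement audio functions, such as music playback and recording, through the audio module 170, the speaker 170A, the receiver 170B, the microphone 170C, the headset jack 170D, the application processor, and the like.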
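The audio module 170 is used to convert digital audio information into an analog audio signal for output, and to convert analog audio input into a digital audio signal. The audio module 170 can also be used to encode and decode audio signals. In some embodiments, the audio module 170 may be provided in the processor 110, or some functional modules of the audio module 170 may be provided in the processor 110.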
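The speaker 170A, also called a "loudspeaker", is used to convert audio electrical signals into sound signals. The electronic device 100 can play music or hands-free calls through the speaker 170A.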
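The receiver 170B, also called an "earpiece", is used to convert audio electrical signals into sound signals. When the electronic device 100 answers a call or plays a voice message, the voice can be heard by placing the receiver 170B close to the ear.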
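The microphone 170C, also called a "mic", is used to convert sound signals into electrical signals. When making a call or sending a voice message, the user can speak with the mouth close to the microphone 170C to input the sound signal into the microphone 170C. The electronic device 100 may be provided with at least one microphone 170C. In other embodiments, the electronic device 100 may be provided with two microphones 170C, which can implement noise reduction in addition to collecting sound signals. In still other embodiments, the electronic device 100 may be provided with three, four, or more microphones 170C to collect sound signals, reduce noise, identify the sound source, and implement functions such as directional recording.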
The headset jack 170D is used to connect a wired headset. The headset jack 170D may be the USB interface 130, a 3.5 mm open mobile terminal platform (OMTP) standard interface, or a cellular telecommunications industry association of the USA (CTIA) standard interface.
The pressure sensor 180A is used to sense pressure signals and can convert them into electrical signals. In some embodiments, the pressure sensor 180A may be provided on the display screen 194. There are many types of pressure sensor 180A, such as resistive, inductive, and capacitive pressure sensors. A capacitive pressure sensor may include at least two parallel plates of conductive material. When a force acts on the pressure sensor 180A, the capacitance between the electrodes changes, and the electronic device 100 determines the intensity of the pressure from the change in capacitance. When a touch operation acts on the display screen 194, the electronic device 100 detects the intensity of the touch operation through the pressure sensor 180A. The electronic device 100 can also calculate the touch position from the detection signal of the pressure sensor 180A. In some embodiments, touch operations that act on the same touch position but have different intensities may correspond to different operation instructions. For example, when a touch operation whose intensity is less than a first pressure threshold acts on the short message application icon, an instruction to view short messages is executed. When a touch operation whose intensity is greater than or equal to the first pressure threshold acts on the short message application icon, an instruction to create a new short message is executed.
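The short message example above amounts to a threshold comparison. A minimal sketch follows, assuming a normalized intensity scale and a hypothetical threshold value of 0.5; the function name and the returned instruction labels are also illustrative and do not come from the application.

```python
# Hypothetical first pressure threshold on a normalized 0.0-1.0 scale.
FIRST_PRESSURE_THRESHOLD = 0.5

def sms_icon_instruction(intensity, threshold=FIRST_PRESSURE_THRESHOLD):
    """Map the intensity of a touch on the short message icon to an instruction.

    Below the threshold the messages are viewed; at or above it a new
    message is created, matching the two cases in the description.
    """
    if intensity < threshold:
        return "view_short_message"
    return "create_short_message"
```

Note that a touch exactly at the threshold takes the "create" branch, because the description uses "greater than or equal to" for that case.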
The gyroscope sensor 180B may be used to determine the motion attitude of the electronic device 100. In some embodiments, the angular velocity of the electronic device 100 about three axes (that is, the x, y, and z axes) may be determined through the gyroscope sensor 180B. The gyroscope sensor 180B can be used for image stabilization during shooting. For example, when the shutter is pressed, the gyroscope sensor 180B detects the angle by which the electronic device 100 shakes, the distance that the lens module needs to compensate is calculated from that angle, and the lens counteracts the shake of the electronic device 100 through reverse motion, thereby achieving stabilization. The gyroscope sensor 180B can also be used in navigation and motion-sensing game scenarios.
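The application does not say how the compensation distance is derived from the shake angle. One common simplification, used here purely as an assumption, is a pinhole model in which the image shifts by the focal length times the tangent of the angle. The function name and the 4 mm default focal length are hypothetical.

```python
import math

def lens_compensation_mm(shake_angle_deg, focal_length_mm=4.0):
    """Estimate the lens shift (in mm) that counters a shake of the given angle.

    Assumes a pinhole model: image displacement = f * tan(angle).
    The negative sign moves the lens opposite to the shake direction.
    """
    return -focal_length_mm * math.tan(math.radians(shake_angle_deg))
```

Under these assumptions a shake of 0.5 degrees calls for a shift of roughly 0.035 mm in the opposite direction.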
The air pressure sensor 180C is used to measure air pressure. In some embodiments, the electronic device 100 calculates the altitude from the air pressure value measured by the air pressure sensor 180C to assist positioning and navigation.
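The application does not state which conversion is used. As an illustration only, the widely used international barometric formula can turn a pressure reading into an approximate altitude; the function name and the standard sea-level reference of 1013.25 hPa are assumptions.

```python
def altitude_m(pressure_hpa, sea_level_hpa=1013.25):
    """Approximate altitude in meters from air pressure in hPa.

    Uses the international barometric formula, which assumes a standard
    atmosphere; real devices would calibrate the sea-level reference.
    """
    return 44330.0 * (1.0 - (pressure_hpa / sea_level_hpa) ** (1.0 / 5.255))
```

A reading equal to the reference pressure maps to 0 m, and lower pressures map to higher altitudes.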
The magnetic sensor 180D includes a Hall sensor. The electronic device 100 can use the magnetic sensor 180D to detect the opening and closing of a flip cover case. In some embodiments, when the electronic device 100 is a flip phone, the electronic device 100 can detect the opening and closing of the flip cover through the magnetic sensor 180D, and then set features such as automatic unlocking on flip-open according to the detected open or closed state of the case or the flip cover.
The acceleration sensor 180E can detect the magnitude of the acceleration of the electronic device 100 in various directions (generally along three axes). When the electronic device 100 is stationary, the magnitude and direction of gravity can be detected. The sensor can also be used to identify the attitude of the electronic device, and is applied in landscape/portrait switching, pedometers, and similar applications.
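Landscape/portrait switching can be sketched as a comparison of the gravity components reported while the device is roughly stationary. The axis convention (y along the long edge of the screen, z out of the screen), the 0.8 flatness threshold, and the function name are assumptions made for illustration.

```python
import math

def screen_orientation(ax, ay, az, flat_threshold=0.8):
    """Classify device orientation from a three-axis gravity reading.

    Assumed axes: x along the short edge, y along the long edge,
    z perpendicular to the screen. Units cancel out, so g or m/s^2 both work.
    """
    g = math.sqrt(ax * ax + ay * ay + az * az)
    if g == 0:
        return "unknown"                  # no usable gravity vector
    if abs(az) / g > flat_threshold:
        return "flat"                     # lying on a table: keep current layout
    return "portrait" if abs(ay) >= abs(ax) else "landscape"
```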
The distance sensor 180F is used to measure distance. The electronic device 100 can measure distance by infrared or laser. In some embodiments, when shooting a scene, the electronic device 100 can use the distance sensor 180F to measure distance and achieve fast focusing.
The proximity light sensor 180G may include, for example, a light-emitting diode (LED) and a light detector such as a photodiode. The light-emitting diode may be an infrared light-emitting diode. The electronic device 100 emits infrared light outward through the light-emitting diode and uses the photodiode to detect infrared light reflected from nearby objects. When sufficient reflected light is detected, the electronic device 100 can determine that there is an object nearby. When insufficient reflected light is detected, the electronic device 100 can determine that there is no object nearby. The electronic device 100 can use the proximity light sensor 180G to detect that the user is holding the electronic device 100 close to the ear during a call, so as to turn off the screen automatically and save power. The proximity light sensor 180G can also be used for automatic unlocking and screen locking in holster mode and pocket mode.
The ambient light sensor 180L is used to sense the ambient light brightness. The electronic device 100 can adaptively adjust the brightness of the display screen 194 according to the sensed ambient light brightness. The ambient light sensor 180L can also be used to adjust the white balance automatically when taking photos. The ambient light sensor 180L can further cooperate with the proximity light sensor 180G to detect whether the electronic device 100 is in a pocket, so as to prevent accidental touches.
The fingerprint sensor 180H is used to collect fingerprints. The electronic device 100 can use the characteristics of the collected fingerprint to implement fingerprint unlocking, application lock access, fingerprint-triggered photography, fingerprint-based call answering, and the like.
The temperature sensor 180J is used to detect temperature. In some embodiments, the electronic device 100 uses the temperature detected by the temperature sensor 180J to execute a temperature handling policy. For example, when the temperature reported by the temperature sensor 180J exceeds a threshold, the electronic device 100 reduces the performance of a processor located near the temperature sensor 180J, so as to lower power consumption and implement thermal protection. In other embodiments, when the temperature is lower than another threshold, the electronic device 100 heats the battery 142 to prevent the low temperature from causing the electronic device 100 to shut down abnormally. In still other embodiments, when the temperature is lower than yet another threshold, the electronic device 100 boosts the output voltage of the battery 142 to avoid an abnormal shutdown caused by low temperature.
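The three temperature responses above are described as separate embodiments. The sketch below combines them into one policy only for illustration; the threshold values, the function name, and the action labels are all hypothetical.

```python
def thermal_actions(temp_c, high_c=45.0, heat_below_c=0.0, boost_below_c=-10.0):
    """Return the list of actions a temperature handling policy might take.

    All thresholds are illustrative assumptions, not values from the application.
    """
    actions = []
    if temp_c > high_c:
        actions.append("throttle_nearby_processor")   # thermal protection
    if temp_c < heat_below_c:
        actions.append("heat_battery")                # avoid cold shutdown
    if temp_c < boost_below_c:
        actions.append("boost_battery_voltage")       # keep output voltage up
    return actions
```

With these example thresholds, a reading of 20 °C triggers nothing, while -20 °C triggers both low-temperature actions.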
The touch sensor 180K is also called a "touch panel". The touch sensor 180K may be provided on the display screen 194, and the touch sensor 180K and the display screen 194 together form a touchscreen. The touch sensor 180K is used to detect touch operations on or near it. The touch sensor can pass the detected touch operation to the application processor to determine the type of touch event. Visual output related to the touch operation can be provided through the display screen 194. In other embodiments, the touch sensor 180K may also be provided on the surface of the electronic device 100 at a position different from that of the display screen 194.
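The bone conduction sensor 180M can acquire vibration signals. In some embodiments, the bone conduction sensor 180M can acquire the vibration signal of the bone that vibrates when the human vocal part produces sound. The bone conduction sensor 180M can also be placed in contact with the human pulse to receive the blood pressure pulse signal. In some embodiments, the bone conduction sensor 180M may also be provided in a headset to form a bone conduction headset. The audio module 170 can parse out a voice signal from the vibration signal of the vocal-part bone acquired by the bone conduction sensor 180M to implement the voice function. The application processor can parse heart rate information from the blood pressure pulse signal acquired by the bone conduction sensor 180M to implement the heart rate detection function.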
The keys 190 include a power key, volume keys, and the like. The keys 190 may be mechanical keys or touch keys. The electronic device 100 can receive key input and generate key signal input related to user settings and function control of the electronic device 100.
The motor 191 can generate vibration prompts. The motor 191 can be used for incoming call vibration prompts and for touch vibration feedback. For example, touch operations acting on different applications (such as photography and audio playback) can correspond to different vibration feedback effects. The motor 191 can also produce different vibration feedback effects for touch operations acting on different areas of the display screen 194. Different application scenarios (for example, time reminders, message reception, alarm clocks, and games) can also correspond to different vibration feedback effects. The touch vibration feedback effect can also be customized.
The indicator 192 may be an indicator light, which can be used to indicate the charging status and changes in battery level, and can also be used to indicate messages, missed calls, notifications, and the like.
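The SIM card interface 195 is used to connect a SIM card. A SIM card can be brought into contact with or separated from the electronic device 100 by inserting it into or removing it from the SIM card interface 195. The electronic device 100 may support 1 or N SIM card interfaces, where N is a positive integer greater than 1. The SIM card interface 195 can support Nano SIM cards, Micro SIM cards, SIM cards, and the like. Multiple cards can be inserted into the same SIM card interface 195 at the same time, and the cards may be of the same type or of different types. The SIM card interface 195 can also be compatible with different types of SIM cards and with external memory cards. The electronic device 100 interacts with the network through the SIM card to implement functions such as calls and data communication. In some embodiments, the electronic device 100 uses an eSIM, that is, an embedded SIM card. The eSIM card can be embedded in the electronic device 100 and cannot be separated from it.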
The software system of the electronic device 100 may adopt a layered architecture, an event-driven architecture, a microkernel architecture, a microservice architecture, or a cloud architecture. The embodiments of the present application take a system with a layered architecture as an example to illustrate the software structure of the electronic device 100.
FIG. 2 is a block diagram of the software structure of the electronic device 100 according to an embodiment of the present application. The layered architecture divides the software into several layers, each with a clear role and division of labor. The layers communicate with each other through software interfaces. In some embodiments, the system is divided into four layers, which from top to bottom are the application layer, the application framework layer, the Android runtime and system libraries, and the kernel layer. The application layer may include a series of application packages.
As shown in FIG. 2, the application packages may include applications such as camera, gallery, calendar, calls, maps, navigation, WLAN, Bluetooth, music, video, and short messages.
The application framework layer provides an application programming interface (API) and a programming framework for the applications in the application layer. The application framework layer includes some predefined functions.
如图2所示,应用程序框架层可以包括窗口管理器,内容提供器,视图系统,电话管理器,资源管理器,通知管理器等。As shown in Figure 2, the application framework layer may include window managers, content providers, view systems, telephony managers, resource managers, notification managers, and the like.
The window manager is used to manage window programs. The window manager can obtain the size of the display screen, determine whether the screen has a status bar, and take part in operations such as locking the screen and capturing screenshots.
The content provider is used to store and retrieve data and to make the data accessible to applications. The stored data may include video data, image data, and audio data, and may also include records of outgoing and incoming calls, the user's browsing history and bookmarks, and other data, which are not detailed here.
The view system includes visual controls, such as controls for displaying text and controls for displaying pictures. The view system can be used to build applications. A display interface may consist of one or more views. For example, a display interface that includes a short message notification icon may include a view for displaying text and a view for displaying pictures.
The phone manager is used to provide the communication functions of the electronic device 100, for example the management of call states (including connecting and hanging up).
The resource manager provides various resources for applications, such as localized strings, icons, pictures, layout files, and video files.
The notification manager enables applications to display notification information in the status bar of the screen. It can be used to convey messages to the user, and the notification can disappear automatically after a short stay in the status bar, without requiring interaction such as a close operation by the user. For example, the notification manager can inform the user that a download has completed. A notification may also appear in the status bar at the top of the system in the form of a chart or scroll-bar text, for example a notification from an application running in the background; or it may appear on the screen in the form of a dialog window, for example a text prompt in the status bar. Alternatively, the notification manager can control the electronic device to emit a prompt sound, vibrate, or flash its indicator light, which is not detailed here.
The Android runtime includes core libraries and a virtual machine. The runtime is responsible for the scheduling and management of the Android system.
The core libraries consist of two parts: the functions that the Java language needs to call, and the core libraries of Android.
The application layer and the application framework layer run in the virtual machine. The virtual machine executes the Java files of the application layer and the application framework layer as binary files. The virtual machine performs functions such as object life cycle management, stack management, thread management, security and exception management, and garbage collection.
The system libraries may include multiple functional modules, for example a surface manager, media libraries, a three-dimensional (3D) graphics processing library (for example, OpenGL ES), and a two-dimensional (2D) graphics engine.
The surface manager is used to manage the display subsystem of the electronic device and provides the fusion of 2D and 3D layers for multiple applications.
The media libraries support playback and recording of a variety of commonly used audio and video formats, as well as still image files. The media libraries can support a variety of audio and video encoding formats, such as MPEG4, H.264, MP3, AAC, AMR, JPG, and PNG.
The 3D graphics processing library is used to implement 3D graphics drawing, image rendering, compositing, and layer processing.
The 2D graphics engine is a drawing engine for 2D drawing.
The kernel layer is the layer between hardware and software. The kernel layer contains at least a display driver, a camera driver, an audio driver, and a sensor driver.
For ease of understanding, the following embodiments of the present application take an electronic device having the structures shown in FIG. 1 and FIG. 2 as an example and, with reference to the drawings and application scenarios, describe in detail the human-computer interaction method provided by the embodiments of the present application.
First, before the human-computer interaction method provided by the embodiments of the present application is introduced, several possible application scenarios are listed.
In one possible scenario, the human-computer interaction method provided by the embodiments of the present application may be applied to a scenario that includes a single electronic device. For example, the electronic device may be any of the electronic devices described above in conjunction with the structure shown in FIG. 1, such as a mobile phone, a tablet, or a smart screen, which is not limited in the embodiments of the present application. The following takes a mobile phone as an example to describe in detail the method for displaying a prompt of a human-computer interaction instruction provided by the present application.
FIG. 3 is a schematic diagram of a graphical user interface (GUI) in an example human-computer interaction process. Part (a) of FIG. 3 shows that, with the mobile phone unlocked, the screen display system of the mobile phone displays the currently output interface content 301, which is the home screen of the mobile phone. The interface content 301 shows multiple applications (Apps), such as Mail, Calculator, Settings, and Music. It should be understood that the interface content 301 may also include other applications, which is not limited in the present application.
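In a possible implementation, when using the voice assistant, the user can enable the smart voice function of the mobile phone through the Settings application. For example, as shown in part (a) of FIG. 3, the user can tap the icon of the Settings application; in response to the tap, the mobile phone displays the main interface 302 of the Settings application, as shown in part (b) of FIG. 3. The main interface 302 may include multiple menus, such as WLAN, Bluetooth, Home screen and wallpaper, Display and brightness, Sounds, and Smart assistant. The user can tap the Smart assistant menu on the interface 302; in response to the tap, the mobile phone displays the Smart assistant interface 303 shown in part (c) of FIG. 3. The Smart assistant interface 303 includes options such as Smart Voice, Smart Vision, Smart Screen Recognition, Situational Intelligence, and Smart Search. In addition, the Smart assistant interface 303 displays the wake-up word "Xiaoyi Xiaoyi", which the user can use to wake up the mobile phone so that it enters a state of listening for and collecting the user's voice commands.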
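As shown in part (c) of FIG. 3, the user taps the Smart Voice option on the Smart assistant interface 303; in response to the tap, the mobile phone displays the Smart Voice interface 304 shown in part (d) of FIG. 3. The Smart Voice interface 304 may include a voice wake-up switch, a power-button wake-up switch, an artificial intelligence (AI) subtitles switch, a driving scenario switch, and the like. In the embodiments of the present application, the user can tap the voice wake-up switch to enable the voice interaction function of the mobile phone. In other words, once the voice wake-up switch is turned on, the mobile phone can be woken up by the wake-up word "Xiaoyi Xiaoyi", start collecting the user's voice commands, and enter the speech recognition stage.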
After the user has enabled the voice interaction function of the mobile phone, if the user utters the wake-up word "Xiaoyi Xiaoyi", a floating window may be displayed on the screen of the mobile phone to indicate that the phone has started collecting the user's voice commands and that the voice interaction process has begun. For example, as shown in part (e) of FIG. 3, after the user utters the wake-up word "Xiaoyi Xiaoyi", the mobile phone is woken up by the wake-up word and may display a floating window 10 on the screen. The floating window 10 includes the content of the dialogue between the user and the mobile phone (for example: "Hi, I'm listening...") and a listening icon 10-1 shown while the mobile phone listens for the user's voice commands. The listening icon 10-1 may flash dynamically to indicate that the phone is currently listening for the user's voice commands, which is not limited in the embodiments of the present application.
As shown in part (e) of FIG. 3, the mobile phone hears the user's voice command: "Imitate the call of a cow, Xiaoyi Xiaoyi." The mobile phone can recognize the content of the voice command and display the recognized command in the floating window 10. In existing solutions, the mobile phone can respond according to the voice command, for example by imitating the call of a cow.
However, when the voice command itself contains the wake-up word "Xiaoyi Xiaoyi" again, the mobile phone may be interrupted by that wake-up word, which breaks off the current human-computer interaction process and restarts the collection and recognition of the user's voice commands. As shown in part (f) of FIG. 3, after the mobile phone recognizes that the user's voice command includes the wake-up word "Xiaoyi Xiaoyi", it responds to that command by entering the next human-computer interaction flow: the floating window 10 shows the response "Hi, I'm listening..." together with the dynamically flashing listening icon 10-1, indicating that the mobile phone has started listening for the user's voice commands again.
In the above scenario, if the voice command issued by the user includes the wake-up word "Xiaoyi Xiaoyi", the wake-up word can interrupt the current human-computer interaction process and start the next one. This may not be what the user expects: the wake-up word directly interrupts the task currently being executed, so the mobile phone has to start collecting the user's voice commands again. As a result, the human-computer dialogue becomes incoherent, the user's workflow is disrupted, and the interaction experience is degraded.
The embodiments of the present application provide a human-computer interaction method that can prevent the human-computer interaction flow from being interrupted by a wake-up word contained in a voice command, so as to give the user a better human-computer interaction experience.
FIG. 4 is a schematic flowchart of an example human-computer interaction method provided by an embodiment of the present application. It should be understood that the method 400 can be applied to electronic devices having the structures shown in FIG. 1 and FIG. 2, such as mobile phones, PCs, and in-vehicle devices. As shown in FIG. 4, the method 400 includes:
401. Acquire a first voice command of the user, and detect that the first voice command includes the wake-up word.
For example, with reference to the scenario shown in part (e) of FIG. 3, suppose the dialogue that the user currently expects to have with the mobile phone is as follows:
User: Xiaoyi Xiaoyi.
Mobile phone: Hi, I'm listening...
User: Imitate the call of a cow, Xiaoyi Xiaoyi.
Mobile phone: Moo, moo...
When the mobile phone detects that the user has uttered the wake-up word "Xiaoyi Xiaoyi" for the first time, the mobile phone is woken up and enters the state of listening for the user's voice commands.
402. Determine whether the ASR module is on.
It should be understood that the ASR module of the mobile phone is not always on and working. When the user issues a voice command, the mobile phone turns off the automatic speech recognition (ASR) function, that is, it turns off the ASR module; likewise, while the mobile phone is answering the user, the ASR module is off, so that the phone does not pick up its own voice and interfere with the collection and recognition of the user's voice commands. In step 402, the mobile phone first checks whether the ASR module is on; if the ASR module is dormant or otherwise off, it can be triggered to turn on.
Optionally, when the mobile phone acquires and recognizes the wake-up word "Xiaoyi Xiaoyi" for the first time, if it determines that the ASR module is currently on, it can ignore this wake-up and continue the current dialogue flow.
403. When the mobile phone determines that the ASR module is on, determine the position of the wake-up word in the first voice command.
In a possible implementation, the mobile phone monitors the user's first voice command after being woken up. When it detects that the first voice command again includes the wake-up word "Xiaoyi Xiaoyi", it can first determine the position of the wake-up word in the first voice command. The position is mainly one of three: the beginning, the middle, or the end of the first voice command. For example, a first voice command that includes the wake-up word may be "Imitate the call of a cow, Xiaoyi Xiaoyi" (the wake-up word is at the end of the first voice command), "Imitate animal sounds, Xiaoyi Xiaoyi, imitate the call of a cow" (the wake-up word is in the middle of the first voice command), or "Xiaoyi Xiaoyi, imitate the call of a cow" (the wake-up word is at the beginning of the first voice command).
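The position check in step 403 can be illustrated with a minimal sketch that works on the recognized text only. This is an assumption made for illustration: the application does not specify how the position is computed, and the function names `wake_word_position` and `_normalize` are hypothetical.

```python
import re
from typing import Optional

WAKE_WORD = "Xiaoyi Xiaoyi"  # wake-up word used in the examples above


def _normalize(text: str) -> str:
    """Lowercase the text and drop punctuation and whitespace."""
    return re.sub(r"\W+", "", text).lower()


def wake_word_position(command: str, wake_word: str = WAKE_WORD) -> Optional[str]:
    """Locate the first occurrence of the wake-up word in a recognized command.

    Returns 'start', 'middle', or 'end', or None if the wake-up word is absent.
    """
    spoken = _normalize(command)
    target = _normalize(wake_word)
    index = spoken.find(target)
    if index < 0:
        return None
    if index == 0:
        return "start"
    if index + len(target) == len(spoken):
        return "end"
    return "middle"


print(wake_word_position("Imitate the call of a cow, Xiaoyi Xiaoyi"))
```

A real device would more likely work on timestamps from the recognizer than on text alone, since the later steps depend on the time interval around the wake-up word.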
404-1. When the wake-up word is located at the end of the first voice command, perform step 405: determine whether the time interval between the wake-up word and the nearest voice command is less than a first preset value.
406. When the time interval between the wake-up word and the nearest voice command is less than the first preset value, record the time information corresponding to the wake-up word.
It should be understood that the first preset value can be used to determine whether the user currently wishes to interrupt the dialogue flow. For example, when the first voice command issued by the user is "Imitate the call of a cow, Xiaoyi Xiaoyi", the wake-up word is located at the end of the voice command. According to step 406, the voice command nearest to the wake-up word "Xiaoyi Xiaoyi" is "Imitate the call of a cow", and the mobile phone can infer the user's purpose in uttering the wake-up word "Xiaoyi Xiaoyi" from the time interval between "Imitate the call of a cow" and "Xiaoyi Xiaoyi". When the time interval between the last character of "Imitate the call of a cow" (声) and the first character of "Xiaoyi Xiaoyi" (小) is less than the first preset value, it can be determined that the user may merely be using the wake-up word "Xiaoyi Xiaoyi" as a verbal habit and wishes to continue the current dialogue flow rather than switch to a new one.
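The comparison against the first preset value can be sketched as follows. The function name `is_verbal_habit` and the use of seconds as the time unit are assumptions for illustration; the timestamps are taken to come from the recognizer.

```python
def is_verbal_habit(command_end_s: float, wake_start_s: float,
                    first_preset_s: float) -> bool:
    """Return True if the trailing wake-up word should not start a new dialogue.

    command_end_s:  time at which the nearest voice command ends
    wake_start_s:   time at which the trailing wake-up word begins
    first_preset_s: the first preset value
    """
    gap = wake_start_s - command_end_s
    # Strictly below the threshold: continue the current dialogue flow.
    # At or above the threshold: treat the wake-up word as a new wake-up.
    return gap < first_preset_s


# With a first preset value of 1 second, a 0.3 s pause keeps the dialogue going.
print(is_verbal_habit(2.0, 2.3, first_preset_s=1.0))
```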
Optionally, the mobile phone can record, based on the first voice command, the time information of the wake-up word "Xiaoyi Xiaoyi" within the first voice command. The embodiments of the present application do not limit the rules for recording and marking the time information. For example, if the moment at which the initial wake-up word woke up the mobile phone is taken as the starting time, the period in which the wake-up word appears again in the first voice command is t1-t2; if the moment at which the initial wake-up word woke up the mobile phone is taken as the starting time, the period in which the wake-up word appears again in the first voice command is T1-T2. The position of the wake-up word in the first voice command can be determined from this time information.
407. According to the time information corresponding to the wake-up word, ignore the wake-up word and recognize the first voice command.
408. Respond normally. Optionally, the normal response here may include feedback given by the mobile phone to the user's question and a continued dialogue with the user, or it may include spoken acknowledgements such as "Mm-hm" or "OK", which is not limited in the embodiments of the present application.
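Steps 406 and 407 can be pictured with a small sketch that removes the recorded period from a recognition result. It assumes, purely for illustration, that the recognizer returns word-level timestamps; the `TimedWord` type and the `drop_wake_word` function are hypothetical and are not defined by the application.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class TimedWord:
    text: str
    start: float  # seconds since the initial wake-up
    end: float


def drop_wake_word(words: List[TimedWord], t1: float, t2: float) -> List[TimedWord]:
    """Keep only the words that lie outside the recorded wake-up word period [t1, t2]."""
    return [w for w in words if not (w.start >= t1 and w.end <= t2)]


words = [
    TimedWord("imitate", 0.0, 0.4),
    TimedWord("a", 0.4, 0.5),
    TimedWord("cow", 0.5, 0.9),
    TimedWord("Xiaoyi", 1.1, 1.5),
    TimedWord("Xiaoyi", 1.5, 1.9),
]
remaining = drop_wake_word(words, t1=1.1, t2=1.9)
print(" ".join(w.text for w in remaining))
```

The remaining words are what would be passed on for recognition and a normal response.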
409. When the time interval between the wake-up word and the nearest voice command is greater than or equal to the first preset value, suspend the current dialog and start a new dialog.
410. Turn on ASR and recognize the user's second voice command in the new dialog, then respond normally according to the second voice command; or return to step 401, detect whether the second voice command includes the wake-up word, and repeat the above process. For brevity, details are not repeated here.
For example, when the first voice command issued by the user is "Imitate the call of a cow, Xiaoyi Xiaoyi", the wake-up word is located at the end of the first voice command. When the time interval between the last character of "Imitate the call of a cow" (声) and the first character of "Xiaoyi Xiaoyi" (小) is greater than or equal to the first preset value, it can be determined that the user may wish to interrupt the current dialogue flow and enter a new one. In other words, the mobile phone can treat the wake-up word "Xiaoyi Xiaoyi" contained in the first voice command as the wake-up word for the next dialogue flow: the mobile phone is woken up again, and the previous "Imitate the call of a cow" dialogue flow is interrupted. Optionally, the mobile phone may then reply "Hi, I'm listening...", which is not limited in the embodiments of the present application.
Optionally, the first preset value may be 1 second, 2 seconds, or the like, which is not limited in the embodiments of the present application.
411. When the mobile phone determines that the ASR module is not on, turn on the ASR module and start the listening function. After the ASR listening function has been enabled, continue with the process of acquiring the user's voice command in step 401, which is not repeated here.
For step 403, when it is determined that the wake-up word is located at the beginning of the first voice command (404-3) or in the middle of the first voice command (404-2), steps 406 to 408 are performed: the time information corresponding to the wake-up word is recorded, the wake-up word is ignored according to that time information, the first voice command is recognized, and a normal response is given. For brevity, details are not repeated here.
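The branches of method 400 described above can be summarized in one decision function. This is a sketch under stated assumptions rather than the claimed implementation: the `Action` enum, the function name `method_400_action`, and the string labels for the position are all hypothetical.

```python
from enum import Enum
from typing import Optional


class Action(Enum):
    ENABLE_ASR = "enable_asr"              # step 411
    IGNORE_WAKE_WORD = "ignore_wake_word"  # steps 406 to 408
    NEW_DIALOG = "new_dialog"              # steps 409 and 410


def method_400_action(asr_on: bool, position: str,
                      gap_s: Optional[float], first_preset_s: float) -> Action:
    """Choose the branch of method 400 for a command that contains the wake-up word.

    position: 'start', 'middle', or 'end' of the first voice command
    gap_s:    interval between the nearest command and the wake-up word,
              only needed when the wake-up word is at the end
    """
    if not asr_on:
        return Action.ENABLE_ASR
    if position in ("start", "middle"):  # 404-3 and 404-2
        return Action.IGNORE_WAKE_WORD
    if position == "end":                # 404-1, then step 405
        if gap_s is None:
            raise ValueError("gap_s is required when the wake-up word is at the end")
        if gap_s < first_preset_s:
            return Action.IGNORE_WAKE_WORD
        return Action.NEW_DIALOG
    raise ValueError(f"unknown position: {position!r}")


print(method_400_action(True, "end", gap_s=0.3, first_preset_s=1.0))
```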
In one possible scenario, if speech recognition is restarted within a very short time after it has just ended, the mobile phone can determine whether the user continues to speak. If the user does not continue to speak, the mobile phone can use the previous speech recognition result to continue the dialogue with the user.
With the above method, during voice interaction between the user and the electronic device, if a voice command issued after the user has woken up the device with the wake-up word contains the wake-up word again, the wake-up word in that command does not interrupt the current interaction flow. This avoids directly interrupting the task that the electronic device is executing and restarting the collection of the user's voice commands, which keeps the human-computer dialogue coherent and improves the user experience.
In addition, in another possible scenario, some electronic devices, such as robots, may have a sound source localization capability or a camera-based image acquisition function. After the robot is woken up by the wake-up word, it can determine the user's direction through sound source localization and rotate its image-capturing camera directly toward the direction or position of the user as determined by sound source localization. In this process, the estimated direction of the user may contain a large error, for example because the sound is reflected by walls; when such an error occurs, the device may end up not facing the person after turning.
It should be understood that the robot may have some or all of the structures shown in FIG. 1, or may have the software architecture shown in FIG. 2, which is not limited in the embodiments of the present application.
For example, FIG. 5 is a schematic diagram of an example human-computer interaction scenario provided by an embodiment of the present application. As shown in FIG. 5, assume that the robot has a sound source localization capability and an image acquisition function. The robot can determine the direction of the sound source from the user's voice command and can determine its own gaze estimation direction from the images captured by the camera. The included angle between the line-of-sight direction and the sound source direction is denoted θ.
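As a simple illustration of the angle θ, the sketch below assumes that both directions are expressed as horizontal azimuths in degrees, which the application does not specify. The function name `included_angle` is hypothetical.

```python
def included_angle(a_deg: float, b_deg: float) -> float:
    """Smallest angle, in degrees within [0, 180], between two azimuths."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


# Line of sight at 10 degrees, sound source at 350 degrees.
theta = included_angle(10.0, 350.0)
print(theta)
```

Taking the smaller of the two possible arcs matters near the 0/360 boundary, where a naive subtraction would report a large angle for two directions that are nearly aligned.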
Optionally, when determining its own gaze estimation direction from the images captured by the camera, the robot can establish a camera coordinate system and, based on the public parameters of the camera, transform the gaze target and the coordinates of the user's eye positions into camera coordinates using algorithms such as one based on six three-dimensional key points. For the specific calculation, reference may be made to the prior art, which is not repeated here.
For electronic devices such as robots that have a sound source localization capability, the embodiments of the present application further provide a human-computer interaction method that can prevent the human-computer interaction flow from being interrupted by a wake-up word contained in a voice command, so as to give the user a better human-computer interaction experience.
FIG. 6 is a schematic flowchart of an example human-computer interaction method provided by an embodiment of the present application. It should be understood that the method 600 can be applied to electronic devices that have a sound source localization capability, such as robots. As shown in FIG. 6, the method 600 includes:
601. The robot acquires a first voice command of the user.
602. The robot detects the sound source direction of the first voice command based on the first voice command.
603. The robot determines whether the included angle θ between the sound source direction of the first voice command and the robot's current line-of-sight direction is greater than or equal to a first preset angle.
604. When the included angle θ between the sound source direction of the first voice command and the line-of-sight direction is greater than or equal to the first preset angle, the robot determines whether the user's willingness to interact is less than a preset value.
It should be understood that when the included angle θ between the sound source direction of the first voice command and the line-of-sight direction is greater than or equal to the first preset angle, the user who issued the voice command and the robot can be considered not to be facing each other; in other words, the user who issued the voice command is not within the central area of the image captured by the robot. The embodiments of the present application do not limit the range corresponding to the central area.
Optionally, in step 604, the robot can capture an image through the camera and estimate the user's willingness to interact by detecting the direction in which the user's eyes are looking in the captured image. For example, Table 1 lists one possible set of ranges for the user's willingness to interact.
Table 1
As shown in Table 1, when the estimated willingness to interact, determined from the range of the angle between the user's gaze direction and the robot's line-of-sight direction, is 0.8-1.0, the robot can determine that the user's current willingness to interact is strong; when the estimated willingness is 0.5-0.8, the robot can determine that the user's current willingness to interact is average; when the estimated willingness is 0.1-0.5, the robot can determine that the user's current willingness to interact is low. This is not limited in the embodiments of the present application.
Optionally, the preset value may be set to 0.5. When the estimated current willingness of the user to interact is greater than or equal to the preset value, the following step 605 is performed.
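The ranges described for Table 1 and the preset value of 0.5 can be sketched as follows. The text does not state which range a boundary score such as 0.8 or 0.5 belongs to, nor how scores below 0.1 are treated; the sketch assumes that a boundary score falls into the higher range and that anything below 0.5 counts as low. Both function names are hypothetical.

```python
def willingness_level(score: float) -> str:
    """Map an estimated willingness score to the levels described for Table 1."""
    if score >= 0.8:   # 0.8-1.0
        return "strong"
    if score >= 0.5:   # 0.5-0.8
        return "average"
    return "low"       # 0.1-0.5 (assumption: lower scores are also treated as low)


def proceeds_to_step_605(score: float, preset: float = 0.5) -> bool:
    """Step 605 is performed when the willingness is at least the preset value."""
    return score >= preset


print(willingness_level(0.9), proceeds_to_step_605(0.9))
```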
605. The robot determines whether the included angle between the sound source direction of the first voice command and the sound source direction of the previous voice command is less than a second preset angle, and whether the time interval between the two voice commands is less than a second preset value.
606. When the included angle between the sound source direction of the first voice command and the sound source direction of the previous voice command is less than the second preset angle, and the time interval between the two voice commands is less than the second preset value, the robot responds normally.
It should be understood that the "previous voice command" here is the voice command immediately preceding the first voice command. Optionally, the "previous voice command" may be the user's wake-up word command, for example "Xiaoyi Xiaoyi", or it may be another voice command issued after the wake-up word, for example "Please imitate the call of a cow". This is not limited in the embodiments of the present application.
It should also be understood that a normal response here means that the robot recognizes the user's first voice command and gives corresponding feedback according to it, which is not repeated here.
607. When the included angle between the sound source direction of the first voice command and the sound source direction of the previous voice command is greater than or equal to the second preset angle, and the time interval between the two voice commands is greater than or equal to the second preset value, the robot calls a turning execution function to change its direction.
Optionally, the two conditions "the included angle between the sound source direction of the first voice command and the sound source direction of the previous voice command is greater than or equal to the second preset angle" and "the time interval between the two voice commands is greater than or equal to the second preset value" may be satisfied individually or together; in either case, the turning execution function is called to change the robot's direction. This is not limited in the embodiments of the present application.
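Steps 605 to 607 can be sketched as a single predicate. The sketch follows the optional variant in which either condition alone is enough to turn, so that a normal response requires both the angle and the interval to be below their thresholds. The function name `should_turn` and the example threshold values are hypothetical; the application does not give values for the second preset angle or the second preset value.

```python
def should_turn(angle_between_sources_deg: float, interval_s: float, *,
                second_preset_angle_deg: float, second_preset_s: float) -> bool:
    """Return True if the robot should call the turning execution function.

    A normal response (step 606) requires both the angle and the interval
    to be below their thresholds; otherwise the robot turns (step 607).
    """
    same_direction = angle_between_sources_deg < second_preset_angle_deg
    recent = interval_s < second_preset_s
    return not (same_direction and recent)


# Illustrative thresholds only: 30 degrees and 5 seconds.
print(should_turn(10.0, 2.0, second_preset_angle_deg=30.0, second_preset_s=5.0))
```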
608. In response to the steering execution function, the robot changes direction and then determines the user's willingness to interact in the new direction. Optionally, this willingness may be determined by capturing an image and judging the direction of the user's gaze in the image. For details, see the description of step 604 above; they are not repeated here.
In a possible implementation, in step 608, if the robot, after changing direction in response to the steering execution function, determines that the user's willingness to interact in the new direction is relatively low, the robot may turn back to its previous line-of-sight direction and, at the same time, perform step 606, that is, give corresponding feedback to the user's voice command as a normal response.
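The turn, check, and turn-back flow described for step 608 might look like the sketch below. It is a simplified simulation under stated assumptions: `SimulatedRobot`, `handle_after_turn`, the `estimate_intent` callback (standing in for the gaze-based estimate of step 604), and the 0.5 threshold are hypothetical. The sketch also assumes that the robot responds when the willingness is high, which this passage does not state explicitly.

```python
from typing import Callable, List


class SimulatedRobot:
    """Minimal stand-in for a robot that can turn and respond (hypothetical)."""

    def __init__(self, heading_deg: float = 0.0) -> None:
        self.heading_deg = heading_deg
        self.log: List[str] = []

    def turn_to(self, heading_deg: float) -> None:
        # Plays the role of the steering execution function.
        self.heading_deg = heading_deg % 360.0
        self.log.append(f"turn:{self.heading_deg:.0f}")

    def respond(self, command_text: str) -> None:
        # Plays the role of the normal response of step 606.
        self.log.append(f"respond:{command_text}")


def handle_after_turn(
    robot: SimulatedRobot,
    source_deg: float,
    command_text: str,
    estimate_intent: Callable[[float], float],
    intent_threshold: float = 0.5,  # assumed value, not from the source
) -> bool:
    """Turn toward the sound source, then turn back if willingness is low."""
    original_heading = robot.heading_deg
    robot.turn_to(source_deg)
    intent = estimate_intent(robot.heading_deg)
    engaged = intent >= intent_threshold
    if not engaged:
        # Low willingness: return to the previous line-of-sight direction.
        robot.turn_to(original_heading)
    # In both cases the command still receives a normal response (assumption
    # for the high-willingness case), so the dialogue is not interrupted.
    robot.respond(command_text)
    return engaged
```

For example, passing `estimate_intent=lambda heading: 0.1` simulates a user who is not looking at the robot, in which case the robot ends up facing its original heading again.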
In another possible scenario, if the user's first voice command may contain the wake-up word, the method shown in FIG. 4 may be combined with image capture, and the user's willingness to interact may be estimated from the direction of the user's gaze in the image. When the robot does not detect a person in the image, it may determine that the user's willingness to interact is very low, or it may interrupt the current human-computer interaction flow. In other words, this scenario can be regarded as the robot having been woken up by mistake.
In yet another possible scenario, if the first voice command is itself the wake-up word, and the angle θ between the sound source direction of the wake-up word and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle, the robot may decide whether to respond to this wake-up based on whether a user is present in the currently captured image and on an estimate of whether the user's willingness to interact is strong.
For example, if the angle θ between the sound source direction of the wake-up word and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle, and the current user's willingness to interact is strong, the robot may be configured so that only two consecutive wake-up words from the same sound source direction can wake it up. Only then does the robot respond to the user's wake-up word.
Alternatively, if the angle θ between the sound source direction of the wake-up word and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle, and the robot does not detect a user after turning to the sound source direction of the wake-up word, the robot may turn back to the angle it faced before the wake-up and continue the voice interaction with the person it was interacting with before the wake-up.
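The rule that an off-axis wake-up word must be heard twice in a row from the same direction can be sketched as a small stateful gate. Everything here is an assumption made for illustration: the class name `OffAxisWakeGate`, the 45-degree first preset angle, and the 15-degree tolerance used to decide that two wake-up words come from "the same" direction are not taken from this application.

```python
from typing import Optional


def angular_difference(a_deg: float, b_deg: float) -> float:
    """Return the smallest absolute angle between two bearings, in [0, 180]."""
    diff = abs(a_deg - b_deg) % 360.0
    return 360.0 - diff if diff > 180.0 else diff


class OffAxisWakeGate:
    """Require two consecutive same-direction wake-up words when off axis."""

    def __init__(
        self,
        first_preset_angle_deg: float = 45.0,  # assumed value
        same_direction_tol_deg: float = 15.0,  # assumed value
    ) -> None:
        self.first_preset_angle_deg = first_preset_angle_deg
        self.same_direction_tol_deg = same_direction_tol_deg
        self._pending_direction: Optional[float] = None

    def accept(
        self,
        wake_direction_deg: float,
        robot_heading_deg: float,
        current_user_engaged: bool,
    ) -> bool:
        """Return True if the robot should respond to this wake-up word."""
        theta = angular_difference(wake_direction_deg, robot_heading_deg)
        off_axis = theta >= self.first_preset_angle_deg
        if not (off_axis and current_user_engaged):
            # No stricter rule applies: respond to the wake-up word directly.
            self._pending_direction = None
            return True
        if (
            self._pending_direction is not None
            and angular_difference(wake_direction_deg, self._pending_direction)
            <= self.same_direction_tol_deg
        ):
            # Second consecutive wake-up word from the same direction.
            self._pending_direction = None
            return True
        # First off-axis wake-up word: remember it and wait for a repeat.
        self._pending_direction = wake_direction_deg
        return False
```

The gate keeps only the most recent pending direction, so a wake-up word from a different direction restarts the count rather than completing it.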
With the above method, the robot's wake-up process better matches what people expect. When the angle θ between the sound source direction of the user's voice command and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle and the user's willingness to interact is strong, the robot may decide to turn to the user automatically. When the angle θ is greater than or equal to the first preset angle and the user's willingness to interact is low, the robot may also turn back, and the interaction flow between the user and the robot is not interrupted in the process, which gives the user a better human-computer interaction experience.
In summary, during voice interaction between a user and an electronic device, after the user wakes up the electronic device with the wake-up word, if a voice command issued by the user or an answer given to the electronic device again includes the wake-up word, the method prevents the wake-up word in that voice command from interrupting the current interaction flow. This avoids directly interrupting the task that the electronic device is currently executing and restarting the collection of the user's voice command, which keeps the human-machine dialogue coherent and improves the user experience.
In addition, for electronic devices such as robots that have a sound source localization capability, the method provided in the embodiments of this application can determine, according to the sound source direction of a voice command, whether the device should turn, and can estimate the user's willingness to interact from captured images and the like, so that voice interaction with the user is more accurate. Specifically, when the angle θ between the sound source direction of the user's voice command and the line-of-sight direction the robot is currently facing is greater than or equal to the first preset angle and the user's willingness to interact is strong, the robot may decide to turn to the user automatically. When the angle θ is greater than or equal to the first preset angle and the user's willingness to interact is low, the robot may also turn back, and the interaction flow between the user and the robot is not interrupted in the process, which gives the user a better human-computer interaction experience.
It can be understood that, to implement the above functions, the electronic device includes corresponding hardware and/or software modules for performing each function. In combination with the algorithm steps of the examples described in the embodiments disclosed herein, this application can be implemented in hardware or in a combination of hardware and computer software. Whether a function is performed by hardware or by computer software driving hardware depends on the specific application and design constraints of the technical solution. A person skilled in the art may, in combination with the embodiments, use different methods to implement the described functions for each specific application, but such implementations should not be considered to go beyond the scope of this application.
In this embodiment, the electronic device may be divided into functional modules according to the above method examples. For example, each functional module may correspond to one function, or two or more functions may be integrated into one processing module. The integrated module may be implemented in the form of hardware. It should be noted that the division of modules in this embodiment is illustrative and is merely a division of logical functions; other divisions are possible in an actual implementation.
When each functional module corresponds to one function, the electronic devices involved in the above embodiments, such as robots and mobile phones, may include a collection unit, a detection unit, and a processing unit.
The collection unit, the detection unit, and the processing unit cooperate with each other and may be used to support electronic devices such as robots and mobile phones in performing the above steps, and/or in other processes of the technology described herein.
It should be noted that all relevant content of the steps involved in the above method embodiments can be cited in the functional descriptions of the corresponding functional modules, and is not repeated here.
The electronic device provided in this embodiment is used to perform the above method for human-computer interaction, and can therefore achieve the same effects as the above implementation method.
When an integrated unit is used, the electronic device may include a processing module, a storage module, and a communication module. The processing module may be used to control and manage the actions of the electronic device, for example, to support the electronic device in performing the steps performed by the collection unit, the detection unit, and the processing unit described above. The storage module may be used to support the electronic device in storing program code, data, and the like. The communication module may be used to support communication between the electronic device and other devices.
The processing module may be a processor or a controller, which may implement or execute the various illustrative logical blocks, modules, and circuits described in connection with the disclosure of this application. The processor may also be a combination that implements computing functions, for example, a combination of one or more microprocessors, or a combination of digital signal processing (DSP) and a microprocessor. The storage module may be a memory. The communication module may specifically be a device that interacts with other electronic devices, such as a radio frequency circuit, a Bluetooth chip, or a Wi-Fi chip.
In one embodiment, when the processing module is a processor and the storage module is a memory, the electronic device involved in this embodiment may be a device having the structure shown in FIG. 1.
This embodiment further provides a computer-readable storage medium that stores computer instructions. When the computer instructions are run on an electronic device, the electronic device performs the above related method steps to implement the method for human-computer interaction in the above embodiments.
This embodiment further provides a computer program product. When the computer program product is run on a computer, the computer performs the above related steps to implement the method for human-computer interaction in the above embodiments.
In addition, the embodiments of this application further provide an apparatus, which may specifically be a chip, a component, or a module. The apparatus may include a processor and a memory that are connected to each other. The memory is used to store computer-executable instructions, and when the apparatus runs, the processor may execute the computer-executable instructions stored in the memory, so that the chip performs the method for human-computer interaction in the above method embodiments.
The electronic device, computer-readable storage medium, computer program product, and chip provided in this embodiment are all used to perform the corresponding method provided above. Therefore, for the beneficial effects they can achieve, refer to the beneficial effects of the corresponding method provided above; details are not repeated here.
From the description of the above implementations, a person skilled in the art can understand that, for convenience and brevity of description, only the above division of functional modules is used as an example. In practical applications, the above functions may be assigned to different functional modules as required, that is, the internal structure of the apparatus may be divided into different functional modules to perform all or some of the functions described above.
In the several embodiments provided in this application, it should be understood that the disclosed apparatus and method may be implemented in other ways. For example, the apparatus embodiments described above are merely illustrative. The division into modules or units is merely a division of logical functions, and other divisions are possible in an actual implementation: multiple units or components may be combined or integrated into another apparatus, or some features may be ignored or not performed. In addition, the mutual couplings, direct couplings, or communication connections shown or discussed may be indirect couplings or communication connections through some interfaces, apparatuses, or units, and may be electrical, mechanical, or in other forms.
Units described as separate components may or may not be physically separate, and components shown as units may be one physical unit or multiple physical units, that is, they may be located in one place or distributed across multiple different places. Some or all of the units may be selected according to actual needs to achieve the purpose of the solution in this embodiment.
In addition, the functional units in the embodiments of this application may be integrated into one processing unit, each unit may exist physically on its own, or two or more units may be integrated into one unit. The integrated unit may be implemented in the form of hardware or in the form of a software functional unit.
If the integrated unit is implemented as a software functional unit and sold or used as an independent product, it may be stored in a readable storage medium. Based on this understanding, the technical solutions of the embodiments of this application, in essence, or the part that contributes to the prior art, or all or part of the technical solutions, may be embodied in the form of a software product. The software product is stored in a storage medium and includes several instructions that cause a device (which may be a single-chip microcomputer, a chip, or the like) or a processor to perform all or some of the steps of the methods in the embodiments of this application. The storage medium includes any medium that can store program code, such as a USB flash drive, a removable hard disk, a read-only memory (ROM), a random access memory (RAM), a magnetic disk, or an optical disc.
The above is merely a specific implementation of this application, but the scope of protection of this application is not limited to it. Any change or substitution that a person skilled in the art could readily conceive of within the technical scope disclosed in this application shall fall within the scope of protection of this application. Therefore, the scope of protection of this application shall be subject to the scope of protection of the claims.
Claims (10)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202110381295.1A CN115206308A (en) | 2021-04-08 | 2021-04-08 | Man-machine interaction method and electronic equipment |
Publications (1)
Publication Number | Publication Date |
---|---|
CN115206308A true CN115206308A (en) | 2022-10-18 |
Family
ID=83571346
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202110381295.1A Pending CN115206308A (en) | 2021-04-08 | 2021-04-08 | Man-machine interaction method and electronic equipment |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN115206308A (en) |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
PB01 | Publication | ||
SE01 | Entry into force of request for substantive examination | ||