CN114928763A

CN114928763A - Playing detection, starting up and echo processing method and device, electronic equipment and product

Info

Publication number: CN114928763A
Application number: CN202110150959.3A
Authority: CN
Inventors: 任万喜; 黄龙诚; 付强; 纳跃跃; 姜南; 田彪
Original assignee: Alibaba Group Holding Ltd
Current assignee: Alibaba Group Holding Ltd
Priority date: 2021-02-03
Filing date: 2021-02-03
Publication date: 2022-08-19
Anticipated expiration: 2041-02-03
Also published as: CN114928763B

Abstract

The present application relates to methods and apparatuses for playback detection, power-on handling, and echo processing, and to an electronic device and a product. The sound playback detection processing method includes: acquiring a first audio signal detected by a sound pickup module of an audio/video source output device and a second audio signal sent by the audio/video source output device to an audio/video playback device; detecting, according to the audio signal correlation and the phase difference between the first audio signal and the second audio signal, whether the audio/video playback device is playing the second audio signal; and, if the audio/video playback device is not playing the second audio signal, playing the second audio signal through a speaker of the audio/video source output device. By analyzing the correlation and phase difference between the first audio signal captured by the sound pickup module of the audio/video source output device and the second audio signal sent to the audio/video playback device, embodiments of the present invention intelligently switch the device that plays the audio, detect the power-on state of the audio/video playback device, and perform echo cancellation.

Description

Playback detection, startup, echo processing method, device, electronic device and product

Technical Field

The present application relates to a playback detection method, a power-on method, an echo processing method, corresponding apparatuses, an electronic device, and a product, and belongs to the field of computer technology.

Background

In the prior art, a TV box generally serves as the audio/video source output device and a TV device serves as the audio/video playback device. From the TV device's point of view, the TV box may be only one of several sound sources connected to it. When the TV device is playing a signal from another sound source, its audio channel is occupied, and the audio signal that the TV box outputs to the TV device may not be played. In addition, when the audio transmission path between the TV device and the TV box fails, the audio signal that the TV box outputs to the TV device may also not be played. Such situations seriously degrade video playback, and the user often has to resolve them by manually adjusting the setup or the devices, which is a considerable inconvenience.

Summary of the Invention

Embodiments of the present invention provide methods and apparatuses for playback detection, power-on handling, and echo processing, as well as an electronic device and a product, so that the device used to play a sound source signal is selected intelligently and situations in which the sound source signal cannot be played are reduced.

To achieve the above purpose, an embodiment of the present invention provides a sound playback detection processing method, including:

acquiring a first audio signal detected by a sound pickup module of an audio/video source output device and a second audio signal sent by the audio/video source output device to an audio/video playback device;

detecting, according to the audio signal correlation and the phase difference between the first audio signal and the second audio signal, whether the audio/video playback device is playing the second audio signal; and

if the audio/video playback device is not playing the second audio signal, playing the second audio signal through a speaker of the audio/video source output device.

An embodiment of the present invention also provides a method for detecting the power-on state of an audio/video playback device, including:

sending a detection audio signal to the audio/video playback device, and acquiring a first audio signal detected by a sound pickup module of an audio/video source output device;

detecting, according to the audio signal correlation and the phase difference between the first audio signal and the detection audio signal, whether the audio/video playback device is playing the detection audio signal; and

if the audio/video playback device is playing the detection audio signal, outputting an audio and/or video content signal to the audio/video playback device.

An embodiment of the present invention also provides an echo cancellation processing method, including:

comparing signal features of the first audio signal and the second audio signal to obtain the correlated audio signal portions whose audio signal correlation is greater than a preset correlation threshold;

determining the phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal; and

removing from the first audio signal the second audio signal after its phase difference has been eliminated.
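The three echo cancellation steps above can be illustrated with a minimal sketch. It is not the implementation of the embodiment: the function names, the use of normalized time-domain cross-correlation as the "signal feature comparison", the default threshold of 0.5, and the least-squares gain estimate are all assumptions made for illustration.

```python
import math

def best_lag(first, second, max_lag):
    """Return (lag, score): the delay in samples at which `second` best matches
    `first`, scored by normalized cross-correlation (assumed similarity measure)."""
    best = (0, 0.0)
    for lag in range(max_lag + 1):
        n = min(len(first) - lag, len(second))
        if n <= 0:
            break
        a = first[lag:lag + n]
        b = second[:n]
        dot = sum(x * y for x, y in zip(a, b))
        norm = math.sqrt(sum(x * x for x in a) * sum(y * y for y in b))
        score = dot / norm if norm > 0 else 0.0
        if score > best[1]:
            best = (lag, score)
    return best

def cancel_echo(first, second, max_lag, threshold=0.5):
    """Subtract the delay-aligned copy of `second` from `first`.
    `threshold` is a hypothetical correlation threshold."""
    lag, score = best_lag(first, second, max_lag)
    if score <= threshold:
        return list(first)  # no sufficiently correlated portion found
    n = min(len(first) - lag, len(second))
    ref = second[:n]
    seg = first[lag:lag + n]
    energy = sum(y * y for y in ref)
    # Least-squares gain of the echo path (an added assumption, not in the text).
    gain = sum(x * y for x, y in zip(seg, ref)) / energy if energy > 0 else 0.0
    out = list(first)
    for i in range(n):
        out[lag + i] -= gain * ref[i]
    return out
```

In this sketch the "phase difference" is represented as an integer sample lag; aligning the second audio signal by that lag before subtraction corresponds to eliminating the phase difference.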

An embodiment of the present invention also provides a sound playback detection processing apparatus, including:

an audio signal acquisition module, configured to acquire a first audio signal detected by a sound pickup module of an audio/video source output device and a second audio signal sent by the audio/video source output device to an audio/video playback device;

a playback detection module, configured to detect, according to the audio signal correlation and the phase difference between the first audio signal and the second audio signal, whether the audio/video playback device is playing the second audio signal; and

an audio playback switching module, configured to play the second audio signal through a speaker of the audio/video source output device when it is detected that the audio/video playback device is not playing the second audio signal.

An embodiment of the present invention also provides an apparatus for detecting the power-on state of an audio/video playback device, including:

a detection signal sending module, configured to send a detection audio signal to the audio/video playback device;

an audio signal acquisition module, configured to acquire a first audio signal detected by a sound pickup module of an audio/video source output device;

a playback detection module, configured to detect, according to the audio signal correlation and the phase difference between the first audio signal and the detection audio signal, whether the audio/video playback device is playing the detection audio signal; and

a playback processing module, configured to output an audio and/or video content signal to the audio/video playback device when it is detected that the audio/video playback device is playing the detection audio signal.

An embodiment of the present invention also provides an echo cancellation processing apparatus, including:

a correlation processing module, configured to compare signal features of the first audio signal and the second audio signal to obtain the correlated audio signal portions whose audio signal correlation is greater than a preset correlation threshold;

a phase difference determination module, configured to determine the phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal; and

an echo cancellation module, configured to remove from the first audio signal the second audio signal after its phase difference has been eliminated.

An embodiment of the present invention also provides a projection playback processing method, including:

outputting an audio/video signal to a projection device; and

detecting, through a sound pickup module, whether the projection device is playing the audio signal in the audio/video signal, and, if it is determined that the projection device is not playing the audio signal, playing the audio signal through a speaker of the audio/video source output device.

An embodiment of the present invention also provides a remote conference collaborative playback processing method, including:

acquiring a shared-screen image signal of a remote conference and an audio/video signal of the conference site;

comparing the video signal in the audio/video signal with the shared-screen image signal to determine the audio signal corresponding to the shared-screen image signal; and

playing the shared-screen image signal and the corresponding audio signal synchronously.

An embodiment of the present invention also provides an audio/video source output device, including:

an audio/video signal output module, configured to output an audio/video signal to an audio/video playback device;

an audio signal playing module, configured to play, on instruction from an audio playback detection module, the audio signal output to the audio/video playback device; and

the audio playback detection module, configured to detect whether the audio playback device is playing the audio signal and, if it is determined that the audio playback device is not playing the audio signal, to trigger the audio signal playing module to play the audio signal.

An embodiment of the present invention also provides an electronic device, including:

a memory, configured to store a program; and

a processor, configured to run the program stored in the memory so as to execute any one or more of the aforementioned sound playback detection processing method, method for detecting the power-on state of an audio/video playback device, echo cancellation processing method, projection playback processing method, and remote conference collaborative playback processing method.

An embodiment of the present invention also provides a computer program product including a computer program or instructions, characterized in that, when executed by a processor, the computer program or instructions cause the processor to implement any one or more of the aforementioned sound playback detection processing method, method for detecting the power-on state of an audio/video playback device, echo cancellation processing method, projection playback processing method, and remote conference collaborative playback processing method.

The playback detection, power-on, and echo processing methods and apparatuses, the electronic device, and the product of the embodiments of the present invention analyze the correlation and phase difference between the first audio signal captured by the sound pickup module of the audio/video source output device and the second audio signal sent by the audio/video source output device to the audio/video playback device. On that basis they intelligently switch the device that plays the audio, detect the power-on state of the audio/video playback device, and perform echo cancellation, thereby improving the user experience of audio/video playback.

The above is only an overview of the technical solutions of the present invention. So that the technical means of the present invention can be understood more clearly and implemented in accordance with the description, and so that the above and other objects, features, and advantages of the present invention are more readily apparent, specific embodiments of the present invention are set out below.

Brief Description of the Drawings

FIG. 1 is a schematic diagram of an application scenario of the sound playback detection processing method according to an embodiment of the present invention;

FIG. 2 is a schematic diagram of an application scenario of the TV power-on processing method according to an embodiment of the present invention;

FIG. 3 is a schematic diagram of an application scenario of the echo cancellation processing method according to an embodiment of the present invention;

FIG. 4 is a schematic flowchart of the sound playback detection processing method according to an embodiment of the present invention;

FIG. 5 is a schematic flowchart of the method for detecting the power-on state of an audio/video playback device according to an embodiment of the present invention;

FIG. 6 is a schematic flowchart of the echo cancellation processing method according to an embodiment of the present invention;

FIG. 7 is a schematic structural diagram of the sound playback detection processing apparatus according to an embodiment of the present invention;

FIG. 8 is a schematic structural diagram of the apparatus for detecting the power-on state of an audio/video playback device according to an embodiment of the present invention;

FIG. 9 is a schematic structural diagram of the echo cancellation processing apparatus according to an embodiment of the present invention;

FIG. 10 is a schematic structural diagram of the electronic device according to an embodiment of the present invention.

Detailed Description

Exemplary embodiments of the present disclosure are described in more detail below with reference to the accompanying drawings. Although the drawings show exemplary embodiments of the present disclosure, it should be understood that the present disclosure may be implemented in various forms and should not be limited by the embodiments set forth here. Rather, these embodiments are provided so that the present disclosure can be understood more thoroughly and its scope conveyed fully to those skilled in the art.

In some application scenarios, an audio/video source output device is connected to an audio/video playback device in a wired or wireless manner and supplies it with content, which is then played through the speaker of the audio/video playback device. However, if the audio channel of the audio/video playback device is occupied by another playback source, or the content transmission path fails, no sound is played. To address this, an embodiment of the present invention provides a sound playback detection processing method in which a sound pickup module (for example, a microphone) on the audio/video source output device captures and identifies the sound played by the audio/video playback device in order to determine whether the audio/video playback device is playing the audio signal output by the audio/video source output device; if it is not, the audio signal is played through the speaker of the audio/video source output device.

FIG. 1 is a schematic diagram of an application scenario of the sound playback detection processing method according to an embodiment of the present invention. In this scenario a TV device serves as an example of the audio/video playback device and a TV box as an example of the audio/video source output device. As shown in the figure, the TV box connects over the Internet to an audio/video content platform and obtains audio/video content data from it. Hereinafter, audio/video content may include audio content, video content, and combined audio and video content. The TV box contains a system for processing the audio/video source; after this processing system decodes and otherwise processes the content data obtained from the platform, it can output the result to the TV device through an HDMI (High Definition Multimedia Interface) signal output port. The TV device and the TV box may be connected in a wired or wireless manner; as an example, in FIG. 1 they are connected by an HDMI cable.

The embodiments of the present invention mainly concern the processing of audio signals. The audio signal here may be standalone audio content, such as music or radio, or audio content combined with video, such as films and TV series. For the TV device, the TV box is the audio/video source output device. Under normal conditions, the TV device receives the HDMI signal sent by the TV box through its HDMI signal input port and plays it through its audio/video playback processing system, with the audio signal played through speaker 1 of the TV device. The TV box can also send an audio-only signal to the TV device over the HDMI channel, for example when playing online music, in which case the TV device acts purely as an audio playback device. In some cases, however, the audio channel that feeds the speaker may be occupied by another audio source. The TV device may then show the video picture without playing the corresponding audio, or, when music is being played, the TV device may receive the audio signal from the TV box but be unable to play it, for example because the TV device is playing audio content from its internal memory or because the audio channel is occupied by another HDMI signal input port.

To address this, in an embodiment of the present invention the sound pickup module on the TV box can be used to detect the audio emitted by speaker 1 of the TV device. If the detected audio from speaker 1 is consistent with the audio signal that the TV box sent to the TV device, the TV device is playing the TV box's audio signal normally. Otherwise, the TV device is not playing the audio signal sent by the TV box, and speaker 2 on the TV box is turned on to play it.

Specifically, whether speaker 1 of the TV device is playing the audio signal sent by the TV box can be detected as follows. As shown in FIG. 1, this processing can be performed by the audio playback detection module in the figure, which can be implemented in hardware and/or software and can reside in the TV box as part of the system that processes the audio/video source. On the one hand, the audio playback detection module can detect audio in the environment through the microphone; for ease of distinction, this is called the first audio signal. Note that the first audio signal is generated from sound received from the environment, so it may contain audio emitted by speaker 1, audio emitted by other devices in the environment, and audio emitted by the TV box's own speaker 2. On the other hand, the audio playback detection module can obtain from the HDMI signal output port the audio signal output to the TV device, here called the second audio signal. The second audio signal is a comparatively well-defined signal and can be read from the hardware or software system of the TV box; however, for the sake of an accurate phase difference calculation, it is preferable to obtain it from the HDMI signal output port, so that the signal delay of the second audio signal inside the TV box can be ignored.

After acquiring the first audio signal and the second audio signal, the audio playback detection module analyzes the audio signal correlation and the phase difference between them. Based on the result, it determines whether the TV device is playing the second audio signal output by the TV box, and further determines which device's speaker should play the second audio signal. Specifically, the following cases may arise:

Case 1: If the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, and the phase difference between them is greater than a preset phase difference threshold, it is determined that the TV device is playing the second audio signal, and no action is needed. Here, the audio signal correlation indicates whether the first audio signal contains components of the second audio signal, and the phase difference indicates the lag between the correlated components. When the TV device and the TV box are connected by an HDMI cable, transmission from the TV box to the TV device and the TV device's handling of the received HDMI signal introduce some delay, and capturing the sound through the TV box's microphone and converting it back into an audio signal adds further processing delay. These delays are much larger than the delay incurred when the second audio signal is played directly through the TV box's speaker. In general, when the TV device is playing, the delay between the correlated components of the captured first audio signal and the second audio signal output by the TV box is on the order of several hundred milliseconds, whereas the delay when the TV box's speaker plays the second audio signal directly is on the order of tens of milliseconds. The delay can therefore be used to distinguish which speaker is playing the sound corresponding to the second audio signal.

Case 2: If the audio signal correlation between the first audio signal and the second audio signal is less than the preset correlation threshold, the TV device is not playing the second audio signal, and the second audio signal needs to be played through the TV box's speaker.

Case 3: If the audio signal correlation between the first audio signal and the second audio signal is greater than the preset correlation threshold, but the phase difference between them is less than the preset phase difference threshold, the TV box's speaker is playing the second audio signal, and no action is needed. Case 3 arises as a follow-on to Case 2: the TV device still cannot play the second audio signal normally, so the TV box's speaker continues to play it.

Case 4: If the audio signal correlation between the first audio signal and the second audio signal is greater than the preset correlation threshold, and the first audio signal includes two sub-audio signals with different phase differences, one whose phase difference from the second audio signal is greater than the preset phase difference threshold and another whose phase difference from the second audio signal is less than that threshold, it is determined that both the TV device's speaker and the TV box's speaker are playing the second audio signal. In this case, the speaker of the audio/video source output device can stop playing the second audio signal. Case 4 generally arises when the TV device was previously unable to play the second audio signal and playback was switched to the TV box's speaker (the follow-on state of Case 2), but after some time the TV device becomes able to play the second audio signal again, so that both devices' speakers are playing. In this situation the TV device's speaker is preferred, and the TV box's speaker therefore stops playing the second audio signal.
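The four cases above amount to a small decision table, which can be sketched as follows. This is only an illustration under stated assumptions: the names `Action` and `decide` are hypothetical, each detected copy of the second audio signal is assumed to be summarized as a (correlation, delay in milliseconds) pair, and a delay exactly equal to the threshold is treated as the TV box speaker, a boundary the text does not specify.

```python
from enum import Enum

class Action(Enum):
    NONE = "no action"
    START_BOX_SPEAKER = "play through the TV box speaker"
    STOP_BOX_SPEAKER = "stop the TV box speaker"

def decide(components, corr_threshold, delay_threshold_ms):
    """components: (correlation, delay_ms) pairs, one per copy of the second
    audio signal found in the first audio signal (assumed representation)."""
    delays = [d for c, d in components if c > corr_threshold]
    if not delays:
        # Case 2: nothing correlated, so the TV device is not playing.
        return Action.START_BOX_SPEAKER
    tv_playing = any(d > delay_threshold_ms for d in delays)
    box_playing = any(d <= delay_threshold_ms for d in delays)
    if tv_playing and box_playing:
        # Case 4: both speakers are playing; prefer the TV device.
        return Action.STOP_BOX_SPEAKER
    # Case 1 (TV device only) or Case 3 (TV box only): leave things as they are.
    return Action.NONE
```

For example, with a hypothetical correlation threshold of 0.6 and delay threshold of 100 ms, `decide([(0.9, 300.0), (0.8, 20.0)], 0.6, 100.0)` falls under Case 4 and returns `Action.STOP_BOX_SPEAKER`.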

The correlation between the audio signals can be determined by extracting frequency-domain features of the audio signals and then analyzing and comparing them in the frequency domain. Each time detection is required, the audio playback detection module can capture audio from the HDMI signal output port and from the microphone at the same time, forming multiple frames of the second audio signal and of the first audio signal respectively. Because the audio captured through the microphone is delayed, frames spanning a certain period can be compared. For example, with each 200 milliseconds forming one frame, multiple frames of the second audio signal and 5 frames of the first audio signal can be sampled consecutively. A given frame of the first audio signal can be compared with the frame of the second audio signal for the same time period and with the several preceding frames of the second audio signal. This determines whether the first audio signal contains an audio signal component whose correlation with the second audio signal exceeds the preset correlation threshold, and hence whether the audio captured by the microphone contains the content of the second audio signal.
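As a rough illustration of the frame-wise frequency-domain comparison described above, the following sketch compares one microphone frame against several reference frames. The specific choices are assumptions rather than part of the embodiment: the magnitude spectrum is used as the frequency-domain feature, cosine similarity as the correlation measure, and a naive DFT is used so that the sketch needs only the standard library (a practical implementation would use an FFT). All function names are hypothetical.

```python
import cmath
import math

def magnitude_spectrum(frame):
    """Magnitudes of DFT bins 0..n/2 of one frame (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                for t, x in enumerate(frame)))
        for k in range(n // 2 + 1)
    ]

def spectral_correlation(frame_a, frame_b):
    """Cosine similarity of two frames' magnitude spectra, in [0, 1]."""
    sa = magnitude_spectrum(frame_a)
    sb = magnitude_spectrum(frame_b)
    dot = sum(x * y for x, y in zip(sa, sb))
    norm = math.sqrt(sum(x * x for x in sa) * sum(y * y for y in sb))
    return dot / norm if norm > 0 else 0.0

def contains_reference(mic_frame, ref_frames, threshold):
    """True if the microphone frame correlates above `threshold` with the
    same-period reference frame or any of the preceding reference frames."""
    return any(spectral_correlation(mic_frame, r) > threshold
               for r in ref_frames)
```

Here `ref_frames` would hold the frame of the second audio signal for the same time period together with the preceding few frames, which allows for the capture delay of the first audio signal.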

The phase difference between the audio signals can be determined once it has been established that the first audio signal contains components of the second audio signal: feature points are located in the highly correlated portions of the first and second audio signals, and the phase difference is calculated from these feature points, for example as the phase difference between peak points of the audio signals. Determining the phase difference between the audio signals also determines the delay between them, which in turn makes it possible to judge whether the component of the second audio signal contained in the first audio signal was played by the TV device's speaker or by the TV box's speaker.
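The peak-based delay estimate mentioned above can be sketched as follows. The sketch assumes, beyond what the text states, that both buffers start at the same instant, share one sample rate, and contain the same single dominant peak; `peak_index` and `delay_ms` are hypothetical names.

```python
def peak_index(signal):
    """Index of the sample with the largest absolute amplitude
    (the first such index if there are ties)."""
    return max(range(len(signal)), key=lambda i: abs(signal[i]))

def delay_ms(first, second, sample_rate_hz):
    """Delay of `first` relative to `second`, in milliseconds, taken from
    the offset between their dominant peaks."""
    lag_samples = peak_index(first) - peak_index(second)
    return 1000.0 * lag_samples / sample_rate_hz
```

The resulting delay can then be compared with the phase difference threshold: a value of several hundred milliseconds suggests playback by the TV device's speaker, while a value of tens of milliseconds suggests playback by the TV box's speaker.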

The above describes the application of embodiments of the present invention to sound playback detection and switching between a TV device and a TV box. The following describes their application to the TV device power-on scenario.

FIG. 2 is a schematic diagram of an application scenario of the TV power-on processing method according to an embodiment of the present invention. In the figure, a TV box serves as the example of an audio and video source output device, and a TV device serves as the example of an audio and video playback device. In some cases, the TV box can detect a user's voice command through its microphone and trigger audio and video playback. For example, the user issues the voice command "play XX video". In this case, the TV box needs to confirm that the TV device is powered on before sending the audio and video signals. Otherwise, if the TV device is off or still booting, the content cannot be played for a period of time, which degrades the user experience. The TV device can be powered on independently of the TV box, for example by the user through a remote control or a voice command. In addition, because the TV box is generally connected directly to the TV device, the TV device can also be powered on through the TV box, that is, the TV box can power on the TV device by sending a control command. In this case, after receiving the user's voice command for audio and video playback, the TV box can check the power-on state of the TV device and send the audio and video content signal only after confirming that the TV device is on. If it detects that the TV device is not yet on, it can send a power-on control command to the TV device to trigger power-on. Alternatively, after receiving the user's voice command, the TV box can send the power-on control command to the TV device directly and then repeatedly check whether the TV device has finished powering on.

Powering on a TV device generally involves a waiting period during which the system of the TV device is starting up and cannot play the audio or video content sent by the TV box. In addition, some TV devices play a startup video or startup advertisement after the system starts, and during that time they also cannot play the content sent by the TV box. If the TV box starts sending audio and video content while the TV device is unable to play it, the user misses that portion of the content. To address this, the embodiment of the present invention sends a detection audio signal to the TV device and compares it with the sound picked up by the microphone of the TV box to determine whether the TV device has finished powering on.

Specifically, as shown in FIG. 2, the TV box sends a power-on detection audio signal to the TV device through the HDMI signal output port and captures audio through the microphone to generate the first audio signal. Then, based on the audio signal correlation and phase difference between the first audio signal and the power-on detection audio signal, it detects whether the TV device has played the power-on detection audio signal. The detection and decision principle is the same as in the sound playback detection processing method described earlier: if the correlation between the first audio signal and the detection audio signal is greater than the preset correlation threshold, and the phase difference between them is greater than the preset phase difference threshold, the TV device can be considered to have played the detection audio signal, and the video content signal to be played can then be transmitted to the TV device, as shown in FIG. 2(b). If the TV device has not played the power-on detection audio signal, the TV device may not have finished starting up. As shown in FIG. 2(a), the speaker of the TV device is either silent or playing sound other than the detection audio, such as a startup video or advertisement. In this case, the power-on detection audio signal is sent to the TV device again and a new round of detection is performed, with a certain time interval allowed between rounds. By repeating the detection in this way, video playback can start automatically once the TV device has finished powering on, and the user does not miss part of the audio and video content because of the power-on process.

The following describes another application scenario that uses the same technical principle of sound playback detection. FIG. 3 is a schematic diagram of an application scenario of the echo cancellation processing method according to an embodiment of the present invention. Again, a TV box serves as the example of an audio and video source output device, and a TV device as the example of an audio and video playback device. The echo cancellation processing method applies to recognizing the user's voice commands. As shown in the figure, the TV box transmits an audio and video content signal to the TV device through the HDMI signal output port. For ease of description, the audio part of that signal is called the second audio signal. The second audio signal is played through speaker 1 of the TV device. To distinguish it from the original second audio signal output by the TV box (from which it differs by a phase difference), the audio as played by speaker 1 is called the content audio. Meanwhile, the user may issue a voice command to the TV box. The voice command mixes in the environment with the content audio played by speaker 1, forming mixed audio. The microphone of the TV box captures this mixed audio as the first audio signal. In other words, when the user issues a voice command, the first audio signal contains both the delayed playback of the second audio signal and the user's voice command. The echo cancellation processing method of the embodiment of the present invention removes the second audio signal from the first audio signal, so that a relatively clean user voice command is obtained for precise control processing.

Echo cancellation likewise relies on the audio signal correlation and phase difference between the first audio signal and the second audio signal. Specifically, it can be implemented by the echo cancellation module shown in the figure, whose basic principle is similar to the case shown in FIG. 1. The echo cancellation module obtains the first audio signal from the microphone and obtains, from the HDMI signal output port, the second audio signal that the TV box sends to the TV device. It then compares the signal features of the first and second audio signals to obtain the correlated audio signal portions whose correlation is greater than the preset correlation threshold, and determines the phase difference between the correlated portion of the first audio signal and the correlated portion of the second audio signal. Based on the determined phase difference, it removes the phase-compensated second audio signal from the first audio signal. Here, compensating for the phase difference means aligning the signals on the time axis according to the determined phase difference. For example, if the determined phase difference is 100 ms, that is, the correlated component in the first audio signal lags the second audio signal by 100 ms, the current first audio signal can be aligned with the second audio signal from 100 ms earlier and the two subtracted, which eliminates the component corresponding to the second audio signal.
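The align-and-subtract step can be illustrated with a short sketch. It assumes the delay has already been converted to a whole number of samples and that the echo is simply a scaled, delayed copy of the second audio signal. The `gain` parameter and the function name are hypothetical. Practical echo cancellers usually model the room response with an adaptive filter, which this sketch does not attempt.

```python
def cancel_echo(mic, reference, delay_samples, gain=1.0):
    """Subtract the reference, delayed by `delay_samples`, from the microphone signal."""
    cleaned = []
    for t, sample in enumerate(mic):
        # The echo heard at time t was sent `delay_samples` earlier.
        ref_index = t - delay_samples
        if 0 <= ref_index < len(reference):
            echo = gain * reference[ref_index]
        else:
            echo = 0.0
        cleaned.append(sample - echo)
    return cleaned
```

For a 100 ms delay at a 16 kHz sample rate, `delay_samples` would be 1600.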

The phase difference does not need to be recalculated every time echo cancellation is performed. It can instead be calculated at a predetermined time interval: once calculated, the same phase difference is used for echo cancellation until the interval elapses. Because the transmission delay of the HDMI signal may fluctuate, a reasonable interval can be chosen in practice according to how much that delay fluctuates. In addition, in some situations the second audio signal may be played through speaker 2 of the TV box itself. The echo processing principle is the same as described above, except that the determined phase difference is relatively small.
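Reusing a phase difference for a fixed interval amounts to a small time-based cache. The sketch below shows one possible shape for it. The class name `DelayCache`, its parameters, and the injectable `clock` are all hypothetical and serve only as illustration.

```python
import time


class DelayCache:
    """Recompute the delay estimate only after `refresh_interval_s` has elapsed."""

    def __init__(self, estimator, refresh_interval_s, clock=time.monotonic):
        self._estimator = estimator          # callable(mic, reference) -> delay
        self._interval = refresh_interval_s
        self._clock = clock                  # injectable for testing
        self._value = None
        self._stamp = None

    def get(self, mic, reference):
        now = self._clock()
        if self._stamp is None or now - self._stamp >= self._interval:
            self._value = self._estimator(mic, reference)
            self._stamp = now
        return self._value
```

A shorter interval tracks fluctuations in HDMI transmission delay more closely at the cost of more frequent estimation.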

The technical solutions of the present invention are further described below through specific embodiments.

Embodiment 1

FIG. 4 is a schematic flowchart of the sound playback detection processing method according to an embodiment of the present invention. The method can be applied to an audio and video source output device such as a TV box. The audio and video source output device has its own sound pickup module (for example, a microphone) for capturing audio signals and its own speaker, and it is connected to an audio and video playback device such as a TV device in a wired or wireless manner in order to transmit audio and video signals. Specifically, the method includes:

S101: Acquire a first audio signal detected by the sound pickup module of the audio and video source output device and a second audio signal sent by the audio and video source output device to the audio and video playback device. The first audio signal is generated by the sound pickup module of the audio and video source output device capturing sound in the environment. It may contain audio emitted by the audio and video playback device, audio emitted by other devices in the environment, and audio emitted by the speaker of the audio and video source output device itself. The second audio signal is the audio content that the audio and video source output device sends to the audio and video playback device for playback. Unlike the first audio signal, the second audio signal is a known signal. When the audio and video source output device is a TV box, the second audio signal can be obtained from the HDMI output port of the audio and video source output device.

S102: Detect whether the audio and video playback device is playing the second audio signal according to the audio signal correlation and phase difference between the first audio signal and the second audio signal. Specifically, the decision may be made as follows: if the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, and the phase difference between the first audio signal and the second audio signal is greater than a preset phase difference threshold, it is determined that the audio and video playback device is playing the second audio signal. Otherwise, it is determined that the audio and video playback device is not playing the second audio signal.

Specifically, the correlation between the audio signals can be determined by extracting frequency-domain features of the audio signals and then analyzing and comparing them in the frequency domain. The phase difference between the audio signals can be obtained once it has been determined that the first audio signal contains a component of the second audio signal: feature points are located in the highly correlated portions of the first and second audio signals, and the phase difference is calculated from these feature points, for example as the phase difference between peak points of the audio signals.

Further, the following specific situations may arise in connection with the audio and video playback device not playing the second audio signal:

If the audio signal correlation between the first audio signal and the second audio signal is less than the preset correlation threshold, the first audio signal contains no component related to the second audio signal, which also means that the audio and video playback device is not playing the second audio signal;

If the audio signal correlation between the first audio signal and the second audio signal is greater than the preset correlation threshold, but the phase difference between the first audio signal and the second audio signal is less than the preset phase difference threshold, the first audio signal captured by the sound pickup module of the audio and video source output device does contain a component related to the second audio signal. That component, however, was emitted not by the audio and video playback device but by the speaker of the audio and video source output device itself, which is why the phase difference is small;

If the audio signal correlation between the first audio signal and the second audio signal is greater than the preset correlation threshold, and the first audio signal contains two sub-audio signals with different phase differences, where the phase difference between one sub-audio signal and the second audio signal is greater than the preset phase difference threshold and the phase difference between the other sub-audio signal and the second audio signal is less than the phase difference threshold, then both the audio and video playback device and the audio and video source output device are playing the second audio signal.
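The decision rule of S102 together with the situations listed above can be summarized in a small classifier. This is a sketch under stated assumptions: the correlation and the per-component delays (standing in for phase differences) are taken as already computed, the text does not say how exact equality with a threshold is handled, and the enum and function names are hypothetical.

```python
from enum import Enum


class PlaybackSource(Enum):
    NONE = "none"                        # second audio signal not heard at all
    PLAYBACK_DEVICE = "playback_device"  # heard from the TV device only
    SOURCE_DEVICE = "source_device"      # heard from the TV box speaker only
    BOTH = "both"                        # heard from both devices


def classify_playback(correlation, component_delays_ms,
                      corr_threshold, delay_threshold_ms):
    """Classify who is playing the second audio signal.

    `component_delays_ms` holds one delay per correlated component found in
    the microphone signal. Treating equality as "not greater" is an assumption.
    """
    if correlation <= corr_threshold:
        return PlaybackSource.NONE
    far = any(d > delay_threshold_ms for d in component_delays_ms)
    near = any(d <= delay_threshold_ms for d in component_delays_ms)
    if far and near:
        return PlaybackSource.BOTH
    if far:
        return PlaybackSource.PLAYBACK_DEVICE
    if near:
        return PlaybackSource.SOURCE_DEVICE
    return PlaybackSource.NONE
```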

S103: If the audio and video playback device is not playing the second audio signal, play the second audio signal through the speaker of the audio and video source output device. If the audio and video playback device is playing the second audio signal, the audio and video playback device is in a normal state and no switching is required.

In addition, if it is detected that the audio and video playback device is not playing the second audio signal while the speaker of the audio and video source output device is playing it, the audio and video playback device has not yet returned to normal, and the speaker of the audio and video source output device continues to play the second audio signal. No further processing is needed in this case either. If both the audio and video playback device and the speaker of the audio and video source output device are playing the second audio signal, the speaker of the audio and video source output device should stop playing the second audio signal.
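The switching rules of S103 and the two additional cases reduce to a four-row decision table, sketched below. The function name and the action strings `"start"`, `"stop"`, and `"keep"` are hypothetical labels for illustration.

```python
def speaker_action(playback_device_playing, source_speaker_playing):
    """Decide what the source device's own speaker should do.

    Returns "start", "stop", or "keep" (no change).
    """
    if playback_device_playing:
        # The playback device is working: silence the local speaker if it is on.
        return "stop" if source_speaker_playing else "keep"
    # The playback device is silent: make sure the local speaker is playing.
    return "keep" if source_speaker_playing else "start"
```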

The sound playback detection processing described above can be repeated over multiple rounds at a certain time interval, so that the device playing the second audio signal is continually adjusted according to the state of the audio and video playback device. This ensures that the user can always hear the audio content output by the audio and video source output device, regardless of the state of the audio and video playback device.

Embodiment 2

FIG. 5 is a schematic flowchart of the method for detecting the power-on state of an audio and video playback device according to an embodiment of the present invention. The method can be applied to an audio and video source output device such as a TV box. The audio and video source output device has its own sound pickup module for capturing audio signals, and it is connected to an audio and video playback device such as a TV device in a wired or wireless manner in order to transmit audio and video signals. In the application scenario of this embodiment, when the audio and video source output device needs to transmit audio and video content to the audio and video playback device, it first confirms that the audio and video playback device is powered on and only then sends the audio and video signals. This avoids the situation in which the playback device is off or still booting and cannot play the content transmitted by the audio and video source output device for a period of time, which would degrade the user experience. Specifically, the method includes:

S201: Send a detection audio signal to the audio and video playback device, and acquire a first audio signal detected by the sound pickup module of the audio and video source output device. The detection audio signal can be a short piece of audio content, for example "Exciting content is about to be played for you". If the audio and video playback device is powered on and operating normally, it should play this audio shortly, and the audio and video source output device can capture, through its sound pickup module, a first audio signal that contains it. In addition, the audio and video source output device can control the audio and video playback device through a control protocol, for example by using control commands to power the playback device on and off. The audio and video source output device may also be able to detect the user's voice commands through the sound pickup module and trigger audio and video playback. Therefore, in some application scenarios, the user triggers playback on the audio and video source output device with a voice command while the audio and video playback device is still powered off. For this situation, before the detection audio signal is sent to the audio and video playback device, the method may further include the audio and video source output device sending a power-on command to the audio and video playback device.

S202: Detect whether the audio and video playback device has played the detection audio signal according to the audio signal correlation and phase difference between the first audio signal and the detection audio signal. The detection and decision principle is the same as in the sound playback detection processing method described earlier: if the audio signal correlation between the first audio signal and the detection audio signal is greater than the preset correlation threshold, and the phase difference between the first audio signal and the detection audio signal is greater than the preset phase difference threshold, it is determined that the audio and video playback device has played the detection audio signal. Otherwise, it is determined that the audio and video playback device has not played the detection audio signal.

S203: If the audio and video playback device has played the detection audio signal, output the audio and/or video content signal to the audio and video playback device. If the audio and video playback device has not played the detection audio signal, send the detection audio signal to it again and perform a new round of detection, repeating until a predetermined number of rounds has been completed or the audio and video playback device is detected to have played the detection audio signal. In addition, the audio and video playback device may have failed to play the detection audio signal because it is not powered on. Therefore, when playback of the detection audio signal is not detected, a power-on command can be sent to the audio and video playback device to trigger it to power on.
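Steps S201 to S203 form a bounded polling loop, which might be sketched as follows. The callables `send_probe`, `probe_was_played`, and `send_power_on` are hypothetical stand-ins for the device-specific operations. Sending the power-on command only after the first failed round is one possible policy, not something the text prescribes.

```python
import time


def wait_until_playback_ready(send_probe, probe_was_played, max_rounds,
                              interval_s, send_power_on=None, sleep=time.sleep):
    """Repeat probe-and-listen rounds until the probe is heard or rounds run out.

    Returns True if the playback device played the probe, otherwise False.
    """
    for round_index in range(max_rounds):
        send_probe()                      # S201: send the detection audio
        if probe_was_played():            # S202: correlation and delay check
            return True                   # S203: safe to start sending content
        if send_power_on is not None and round_index == 0:
            send_power_on()               # device may simply be switched off
        if round_index < max_rounds - 1:
            sleep(interval_s)             # wait before the next round
    return False
```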

The method for detecting the power-on state of an audio and video playback device according to the embodiment of the present invention sends a detection audio signal to the audio and video playback device, captures sound with the sound pickup module, and uses the audio signal correlation and phase difference to determine whether the captured audio contains the detection audio signal, thereby detecting the power-on state of the audio and video playback device. In this way, the user does not miss a period of audio and video content because the playback device is off or still booting.

Embodiment 3

FIG. 6 is a schematic flowchart of the echo cancellation processing method according to an embodiment of the present invention. The method can be applied to an audio and video source output device such as a TV box. The audio and video source output device has its own sound pickup module for capturing audio signals and can receive the user's voice commands through it. The device is connected to an audio and video playback device such as a TV device in a wired or wireless manner in order to transmit audio and video signals. In some scenarios, the audio and video playback device may be playing audio and video content when the user issues a voice command to the audio and video source output device, and that sound interferes with recognition of the voice command. The echo cancellation processing method provided by the embodiment of the present invention is intended to eliminate this interference. Specifically, the method includes:

S301: Acquire a first audio signal detected by the sound pickup module of the audio and video source output device and a second audio signal sent by the audio and video source output device to the audio and video playback device. The second audio signal corresponds to the audio content that the audio and video source output device sends to the audio and video playback device, that is, the audio content to be played by the speaker of the playback device. The first audio signal is the audio captured by the sound pickup module of the audio and video source output device. It contains the second audio signal as played, after a delay, by the speaker of the audio and video playback device, and it may also contain the user's voice command when the user issues one. In some situations, the second audio signal may instead be played through the speaker of the audio and video source output device itself, so that the first audio signal contains the second audio signal as played, after a delay, by that speaker, and again may contain the user's voice command. The two situations differ in the playback delay of the second audio signal.

S302: Compare the signal features of the first audio signal and the second audio signal, and obtain the correlated audio signal portions whose audio signal correlation is greater than a preset correlation threshold. A correlation greater than the preset correlation threshold indicates that the first audio signal contains the content of the second audio signal. The audio signal correlation can be determined by extracting frequency-domain features of the audio signals and then analyzing and comparing them in the frequency domain.

S303: Determine the phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal. Once it has been determined that the first audio signal contains a component of the second audio signal, feature points are located in the highly correlated portions of the first and second audio signals, and the phase difference is calculated from these feature points, for example as the phase difference between peak points of the correlated audio signal portions.

S304: Remove the phase-compensated second audio signal from the first audio signal. Here, compensating for the phase difference means aligning the signals on the time axis according to the determined phase difference. For example, if the determined phase difference is 100 ms, that is, the correlated component in the first audio signal lags the second audio signal by 100 ms, the current first audio signal can be aligned with the second audio signal from 100 ms earlier and the two subtracted, which eliminates the component corresponding to the second audio signal.

The processing for determining the phase difference can be performed at a preset time interval, that is, steps S302 and S303 are performed at the preset time interval. Accordingly, within each interval, step S304 uses the most recently determined phase difference to remove the phase-compensated second audio signal from the first audio signal.

In addition, determining the phase difference between the audio signals also determines the delay between them, which makes it possible to further judge whether the component of the second audio signal contained in the first audio signal was played by the speaker of the audio and video playback device or by the speaker of the audio and video source output device. In the scenario of this embodiment, however, echo cancellation can proceed as soon as the phase difference is determined, without distinguishing whether the audio is currently being played by the audio and video playback device or by the audio and video source output device.

The echo cancellation processing method of this embodiment of the present invention calculates the phase difference from the first audio signal picked up by the sound pickup module of the audio and video source output device and the second audio signal sent by the audio and video source output device to the audio and video playback device, and then uses that phase difference for echo cancellation. This masks the influence of playback of the second audio signal output by the audio and video source output device, so that a relatively clean voice command can be extracted and accurate voice control processing can be performed.

Embodiment 4

FIG. 7 is a schematic structural diagram of a sound playback detection processing apparatus according to an embodiment of the present invention. The apparatus can be applied to an audio and video source output device such as a TV box. The audio and video source output device itself has a sound pickup module for collecting audio signals and a speaker, and it is connected to an audio and video playback device, such as a television device, in a wired or wireless manner so as to transmit audio and video signals. Specifically, the apparatus includes:

An audio signal acquisition module 11, configured to acquire the first audio signal detected by the sound pickup module of the audio and video source output device and the second audio signal sent by the audio and video source output device to the audio and video playback device. The first audio signal is generated by the sound pickup module of the audio and video source output device collecting sound in the environment. It may contain audio emitted by the audio and video playback device, audio emitted by other devices in the environment, and audio emitted by the speaker of the audio and video source output device itself. The second audio signal is the audio content that the audio and video source output device sends to the audio and video playback device for playback; unlike the first audio signal, the second audio signal is a known signal. When the audio and video source output device is a TV box, the second audio signal can be obtained from the HDMI output port of the audio and video source output device.

A playback detection module 12, configured to detect whether the audio and video playback device plays the second audio signal according to the audio signal correlation and the phase difference between the first audio signal and the second audio signal. Specifically, the determination may be made as follows: if the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, and the phase difference between the first audio signal and the second audio signal is greater than a preset phase difference threshold, it is determined that the audio and video playback device plays the second audio signal; otherwise, it is determined that the audio and video playback device does not play the second audio signal.

An audio playback switching processing module 13, configured to play the second audio signal through the speaker of the audio and video source output device when it is detected that the audio and video playback device does not play the second audio signal. If it is detected that the audio and video playback device has played the second audio signal, the audio and video playback device is in a normal state and no switching is needed.
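The decision made by playback detection module 12 and the switching performed by module 13 can be sketched as follows. The function names, the returned labels, and both threshold values are placeholders chosen for illustration; the embodiment only states that preset thresholds exist, and the phase difference is expressed here as a delay in milliseconds.

```python
def playback_device_is_playing(correlation, delay_ms,
                               correlation_threshold=0.6,
                               delay_threshold_ms=50.0):
    """Return True only if both values exceed their (assumed) preset thresholds."""
    return correlation > correlation_threshold and delay_ms > delay_threshold_ms


def select_audio_output(correlation, delay_ms):
    """Pick where the second audio signal should be played."""
    if playback_device_is_playing(correlation, delay_ms):
        # The playback device is working normally; no switching is needed.
        return "playback_device"
    # Fall back to the speaker of the audio and video source output device.
    return "source_device_speaker"
```

For example, `select_audio_output(0.9, 120.0)` keeps playback on the playback device, while a low correlation or a short delay selects the source device's own speaker.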

Embodiment 5

FIG. 8 is a schematic structural diagram of a power-on state detection processing apparatus for an audio and video playback device according to an embodiment of the present invention. The apparatus can be applied to an audio and video source output device such as a TV box. The audio and video source output device itself has a sound pickup module for collecting audio signals, and it is connected to an audio and video playback device, such as a television device, in a wired or wireless manner so as to transmit audio and video signals. In the application scenario of this embodiment, when the audio and video source output device needs to transmit audio and video content to the audio and video playback device, it is desirable to first confirm that the audio and video playback device is already powered on and only then send the audio and video signals. This avoids the situation where the television device is not powered on, or is still booting, and therefore cannot play the content transmitted by the audio and video source output device for a period of time, which would degrade the user experience. Specifically, the apparatus includes:

A detection signal sending module 21, configured to send a detection audio signal to the audio and video playback device. The detection audio signal may be a short piece of audio content, for example "Exciting content is about to be played for you". If the audio and video playback device is powered on and operating normally, it should play this audio almost immediately, and the audio and video source output device can then collect, through the sound pickup module, a first audio signal containing this audio. In addition, the audio and video source output device may control the audio and video playback device through a control protocol; for example, it may switch the audio and video playback device on or off by means of control instructions. The audio and video source output device may also be able to detect a user's voice command through the sound pickup module and trigger audio and video playback processing. In some application scenarios, therefore, the user may trigger the audio and video source output device by voice command to start playback while the audio and video playback device is still powered off. For this case, before sending the detection audio signal to the audio and video playback device, the detection signal sending module 21 may also cause the audio and video source output device to send a power-on instruction to the audio and video playback device.

An audio signal acquisition module 22, configured to acquire the first audio signal detected by the sound pickup module of the audio and video source output device. The first audio signal is generated by the sound pickup module of the audio and video source output device collecting sound in the environment. It may contain audio emitted by the audio and video playback device, audio emitted by other devices in the environment, and audio emitted by the speaker of the audio and video source output device itself.

A playback detection module 23, configured to detect whether the audio and video playback device plays the detection audio signal according to the audio signal correlation and the phase difference between the first audio signal and the detection audio signal. The detection and determination follow the same principle as the sound playback detection processing method and apparatus described above: if the audio signal correlation between the first audio signal and the detection audio signal is greater than a preset correlation threshold, and the phase difference between the first audio signal and the detection audio signal is greater than a preset phase difference threshold, it is determined that the audio and video playback device has played the detection audio signal; otherwise, it is determined that the audio and video playback device has not played the detection audio signal.

A playback processing module 24, configured to output audio and/or video content signals to the audio and video playback device when it is detected that the audio and video playback device plays the detection audio signal. If the audio and video playback device (for example, the television device) has not played the detection audio signal, the detection audio signal is sent to it again and the power-on state detection processing apparatus is triggered to perform a new round of detecting whether the detection audio signal is played, until a predetermined number of rounds has been performed or it is detected that the audio and video playback device has played the detection audio signal. Furthermore, the audio and video playback device may have failed to play the detection audio signal because it is not powered on. Therefore, when playback of the detection audio signal is not detected, a power-on instruction may be sent to the audio and video playback device to trigger it to power on.
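The retry behaviour described for modules 21 to 24 can be sketched as a bounded loop. The three callables and the `max_rounds` default are hypothetical stand-ins: `send_probe` sends the detection audio signal, `probe_was_heard` wraps the correlation and phase difference check, and `send_power_on` issues the power-on instruction. How each of these is actually realised is not specified here.

```python
def wait_for_power_on(send_probe, probe_was_heard, send_power_on, max_rounds=3):
    """Return True once the playback device is heard playing the probe.

    Gives up and returns False after `max_rounds` detection rounds.
    """
    for _ in range(max_rounds):
        send_probe()
        if probe_was_heard():
            return True  # safe to start sending the real audio/video content
        # Not heard: the device may be off, so ask it to power on and retry.
        send_power_on()
    return False
```

Content output would start only when the function returns `True`; a `False` result means the predetermined number of rounds was exhausted.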

The power-on state detection processing apparatus of this embodiment of the present invention sends a detection audio signal to the audio and video playback device, picks up sound with the sound pickup module, and judges from the audio signal correlation and the phase difference whether the picked-up audio contains the detection audio signal, thereby detecting the power-on state of the audio and video playback device. In this way, the user does not miss a period of audio and video content because the audio and video playback device is not powered on or is still booting.

Embodiment 6

FIG. 9 is a schematic structural diagram of an echo cancellation processing apparatus according to an embodiment of the present invention. The apparatus can be applied to an audio and video source output device such as a TV box. The audio and video source output device itself has a sound pickup module for collecting audio signals and can receive the user's voice commands through that module. It is connected to an audio and video playback device, such as a television device, in a wired or wireless manner so as to transmit audio and video signals. In some scenarios, the audio and video playback device may be playing audio and video content while the user issues a voice command to the audio and video source output device, and this sound interferes with recognition of the voice command. The echo cancellation processing apparatus provided by this embodiment is intended to eliminate that interference. Specifically, the apparatus includes:

An audio signal acquisition module 31, configured to acquire the first audio signal detected by the sound pickup module of the audio and video source output device and the second audio signal sent by the audio and video source output device to the audio and video playback device. The second audio signal corresponds to the audio content sent by the audio and video source output device to the audio and video playback device, that is, the audio content to be played by the speaker of the audio and video playback device. The first audio signal is the audio signal collected by the sound pickup module of the audio and video source output device. It contains the second audio signal as played, after a delay, by the speaker of the audio and video playback device, and it may also contain a voice command when the user issues one. In some cases, the second audio signal may instead be played through the speaker of the audio and video source output device itself, so that the first audio signal contains the second audio signal as played, after a delay, by that speaker, again possibly together with a voice command issued by the user. The two cases differ in the delay with which the second audio signal is played.

A correlation processing module 32, configured to compare signal features of the first audio signal and the second audio signal and obtain the correlated audio signal parts whose audio signal correlation is greater than a preset correlation threshold. A correlation greater than the preset threshold indicates that the first audio signal contains content of the second audio signal. The correlation between the audio signals can be determined by extracting frequency-domain features of the audio signals and then analyzing and comparing them in the frequency domain.
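One simple way to compare frequency-domain features, offered only as a sketch, is to take the magnitude spectrum of two equal-length frames and compute their cosine similarity. The choice of magnitude spectrum as the feature, the cosine measure, and the function names are assumptions; the embodiment does not specify which features or comparison are used. The naive DFT below is kept for self-containment and is only practical for short frames.

```python
import cmath
import math


def magnitude_spectrum(frame):
    """Magnitudes of the non-negative-frequency DFT bins of `frame` (naive O(n^2) DFT)."""
    n = len(frame)
    return [
        abs(sum(frame[t] * cmath.exp(-2j * cmath.pi * k * t / n) for t in range(n)))
        for k in range(n // 2 + 1)
    ]


def spectral_correlation(frame_a, frame_b):
    """Cosine similarity of the two magnitude spectra, in the range 0.0 to 1.0."""
    spec_a = magnitude_spectrum(frame_a)
    spec_b = magnitude_spectrum(frame_b)
    dot = sum(a * b for a, b in zip(spec_a, spec_b))
    norm = math.sqrt(sum(a * a for a in spec_a)) * math.sqrt(sum(b * b for b in spec_b))
    return dot / norm if norm > 0 else 0.0
```

Because the magnitude spectrum discards phase, a delayed copy of the same tone still scores close to 1, which suits a check that runs before the delay is known. The result would then be compared with the preset correlation threshold.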

A phase difference determination module 33, configured to determine the phase difference between the correlated audio signal part of the first audio signal and the correlated audio signal part of the second audio signal. Once it has been determined that the first audio signal contains a component of the second audio signal, the phase difference can be obtained by locating feature points in the highly correlated parts of the two signals and calculating the phase difference from those feature points, for example the phase difference between the peak points of the correlated audio signal parts.
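The peak-point example can be sketched as follows, assuming each correlated part is a list of samples on a common time axis with one dominant peak, and expressing the phase difference as a time lag. The function names are illustrative. Searching for the lag that maximises the cross-correlation is a common, more noise-tolerant alternative that is not shown here.

```python
def peak_index(signal):
    """Index of the sample with the largest absolute value (first one on ties)."""
    return max(range(len(signal)), key=lambda i: abs(signal[i]))


def estimate_delay_ms(first_part, second_part, sample_rate_hz):
    """Lag of the peak in `first_part` relative to the peak in `second_part`, in ms."""
    lag_samples = peak_index(first_part) - peak_index(second_part)
    return 1000.0 * lag_samples / sample_rate_hz
```

A positive result means the component in the first audio signal lags the second audio signal, which is the quantity the echo cancellation step uses for alignment.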

An echo cancellation module 34, configured to remove the phase-aligned second audio signal from the first audio signal. Eliminating the phase difference here means aligning the two signals on the time axis according to the determined phase difference. For example, if the determined phase difference is 100 ms, that is, the correlated component in the first audio signal lags the second audio signal by 100 ms, the current first audio signal can be aligned with the second audio signal from 100 ms earlier and the latter subtracted, which removes the component corresponding to the second audio signal.

The phase difference determination described above may be performed at a preset time interval, that is, the processing of the correlation processing module 32 and the phase difference determination module 33 is performed at a preset time interval. Accordingly, within each interval, the echo cancellation module 34 uses the most recently determined phase difference to remove the phase-aligned second audio signal from the first audio signal.
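Re-estimating at a preset interval while reusing the latest value in between can be sketched with a small helper. Measuring the interval in frames rather than wall-clock time, the class name `DelayTracker`, and the `estimate_delay` callable are all assumptions made for illustration.

```python
class DelayTracker:
    """Caches the most recent delay and refreshes it every `interval_frames` frames."""

    def __init__(self, estimate_delay, interval_frames):
        self._estimate_delay = estimate_delay  # callable(first_frame, second_frame) -> delay
        self._interval_frames = interval_frames
        self._frames_since_update = 0
        self.latest_delay = None

    def delay_for(self, first_frame, second_frame):
        """Return the delay to use for this frame, re-estimating when the interval elapses."""
        if self.latest_delay is None or self._frames_since_update >= self._interval_frames:
            self.latest_delay = self._estimate_delay(first_frame, second_frame)
            self._frames_since_update = 0
        self._frames_since_update += 1
        return self.latest_delay
```

Each incoming frame pair would be passed to `delay_for`, and the returned value used for the subtraction performed by the echo cancellation module.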

The echo cancellation processing apparatus of this embodiment of the present invention calculates the phase difference from the first audio signal picked up by the sound pickup module of the audio and video source output device and the second audio signal sent by the audio and video source output device to the audio and video playback device, and then uses that phase difference for echo cancellation. This masks the influence of playback of the second audio signal output by the audio and video source output device, so that a relatively clean voice command can be extracted and accurate voice control processing can be performed.

Embodiment 7

This embodiment concerns applying the technical solutions of the embodiments of the present invention in certain scenarios to achieve coordinated playback of audio and video. For example, in multi-person meetings or education and training sessions, video is often shown through a projection device. Some projection devices have a built-in audio playback function, while others can only display images. Moreover, a projection device that does have an audio playback function may still fail to play audio normally because of a signal transmission fault, device settings, or similar causes.

For this situation, this embodiment provides a projection playback processing method. The method can be applied to an audio and video source output device such as a TV box. The audio and video source output device itself has a sound pickup module (for example, a microphone) for collecting audio signals, and it is connected to the projection device in a wired or wireless manner so as to transmit audio and video signals. Specifically, the method includes: the audio and video source output device outputs an audio and video signal to the projection device; the sound pickup module is used to detect whether the projection device plays the audio signal contained in the audio and video signal; if it is determined that the projection device does not play the audio signal, the audio signal is played through the speaker of the audio and video source output device; if it is determined that the projection device plays the audio signal normally, the projection device is used preferentially to play the audio signal.

Whether the projection device plays the audio signal can be detected as described in the foregoing embodiments: the sound pickup module picks up the audio in the environment, and the correlation and phase difference between the signals are then analyzed to determine whether the projection device has played the audio output by the audio and video source output device.

In addition, in a remote conference scenario, a sharing mechanism based on a cloud platform makes it possible to display the same screen image remotely, and the audio and video signals from the conference site can also be transmitted over the network to the devices at the remote conference venue. The remote conference venue requires that the displayed picture be the same-screen image, that the audio be the audio signal transmitted from the conference site, and that the two be played in as coordinated a manner as possible.

To this end, an embodiment of the present invention provides a remote conference coordinated playback processing method. The method can be applied to an audio and video source output device such as a TV box. The audio and video source output device itself has a sound pickup module (for example, a microphone) for collecting audio signals, and it is connected to a projection device or large display device at the remote conference venue in a wired or wireless manner so as to transmit image signals and audio signals. The method includes:

S401: Acquire the same-screen image signal of the remote conference and the audio and video signals of the conference site. At the remote conference venue, an audio and video source output device such as a TV box can receive the same-screen image signal and the audio and video signals of the conference site over the network. The same-screen image signal may come from a cloud platform: the host at the conference site shares the screen through an office system based on the cloud platform, so that the same-screen image signal is generated on the cloud platform and transmitted to the audio and video source output device at the remote conference venue. Separately, audio and video recording equipment can be set up at the conference site to transmit audio and video signals synchronously to the audio and video source output device at the remote conference venue.

S402: Compare the video signal in the audio and video signals with the same-screen image signal, and determine the audio signal corresponding to the same-screen image signal. In the audio and video source output device at the remote conference venue, the video signal captured at the conference site can be compared with the same-screen image signal to determine how the two signals are synchronized. Because the audio signal and the video signal recorded at the conference site correspond to each other, determining the synchronization between the video signal and the same-screen image signal indirectly determines the synchronization of the audio signal. In this embodiment, therefore, the main purpose of the video signal captured on site is to establish the correspondence between the audio signal and the same-screen image signal.

S403: Play the same-screen image signal and the corresponding audio signal synchronously. Based on the correspondence between the same-screen image signal and the audio signal determined in the preceding steps, the audio signal and the same-screen image signal are played in coordination.

In addition, an embodiment of the present invention provides an audio and video source output device, which may be, for example, a TV box. Specifically, the device includes:

An audio and video signal output module, configured to output audio and video signals to an audio and video playback device. Specifically, the audio and video signal output module can transmit audio and video signals to audio and video playback devices such as smart TVs and projectors in a wired manner (for example, an HDMI cable) or a wireless manner (for example, a local area network).

An audio signal playback module, configured to play, according to an instruction from the audio playback detection module, the audio signal that is output to the audio and video playback device. The audio signal playback module may be the sound card and speaker module provided in the audio and video source output device itself.

An audio playback detection module, configured to detect whether the audio and video playback device plays the audio signal and, if it is determined that the audio and video playback device does not play the audio signal, to trigger the audio signal playback module to play the audio signal. The detection can be performed as described in the foregoing embodiments: the sound pickup module picks up the audio in the environment, and the correlation and phase difference between the signals are then analyzed to determine whether the audio and video playback device has played the audio output by the audio and video source output device.

In addition, when both the audio and video playback device and the audio and video source output device are able to play audio, the user can choose between them by voice command. Specifically, the device may further include a switching control module, configured to receive the user's voice command and, according to the voice command, switch audio signal playback between the audio and video playback device and the audio signal playback module.

Embodiment 8

The foregoing embodiments describe the process flows of the playback detection, power-on, echo cancellation, projection playback, and remote conference coordinated playback processing methods and the corresponding apparatus structures. The functions of these methods and apparatuses can be implemented by an electronic device. FIG. 10 is a schematic structural diagram of an electronic device according to an embodiment of the present invention, which specifically includes a memory 110 and a processor 120.

The memory 110 is configured to store a program.

In addition to the above program, the memory 110 may be configured to store various other data to support operations on the electronic device. Examples of such data include instructions for any application or method operating on the electronic device, contact data, phonebook data, messages, pictures, videos, and so on.

The memory 110 may be implemented by any type of volatile or non-volatile storage device or a combination thereof, such as static random access memory (SRAM), electrically erasable programmable read-only memory (EEPROM), erasable programmable read-only memory (EPROM), programmable read-only memory (PROM), read-only memory (ROM), magnetic memory, flash memory, a magnetic disk, or an optical disc.

The processor 120 is coupled to the memory 110 and configured to execute the program in the memory 110 so as to perform the operation steps of any one or more of the playback detection, power-on, echo cancellation, projection playback, and remote conference coordinated playback processing methods described in the foregoing embodiments.

In addition, the processor 120 may include the various modules described in the foregoing embodiments to perform the processing of any one or more of the playback detection, power-on, and echo cancellation processing methods, and the memory 110 may be used, for example, to store the data required by these modules and/or the data they output.

The processing procedures, technical principles, and technical effects mentioned above have been described in detail in the foregoing embodiments and are not repeated here.

Further, as shown in the figure, the electronic device may also include other components such as a communication component 130, a power supply component 140, an audio component 150, and a display 160. The figure shows only some components schematically, which does not mean that the electronic device includes only the components shown.

The communication component 130 is configured to facilitate wired or wireless communication between the electronic device and other devices. The electronic device can access a wireless network based on a communication standard, such as WiFi, a mobile communication network such as 2G, 3G, 4G/LTE, or 5G, or a combination thereof. In an exemplary embodiment, the communication component 130 receives broadcast signals or broadcast-related information from an external broadcast management system via a broadcast channel. In an exemplary embodiment, the communication component 130 also includes a near field communication (NFC) module to facilitate short-range communication. For example, the NFC module may be implemented based on radio frequency identification (RFID) technology, Infrared Data Association (IrDA) technology, ultra-wideband (UWB) technology, Bluetooth (BT) technology, and other technologies.

The power supply component 140 provides power to the various components of the electronic device. The power supply component 140 may include a power management system, one or more power supplies, and other components associated with generating, managing, and distributing power for the electronic device.

The audio component 150 is configured to output and/or input audio signals. For example, the audio component 150 includes a microphone (MIC) that is configured to receive external audio signals when the electronic device is in an operating mode, such as a call mode, a recording mode, or a voice recognition mode. The received audio signal may be further stored in the memory 110 or sent via the communication component 130. In some embodiments, the audio component 150 also includes a speaker for outputting audio signals.

The display 160 includes a screen, which may include a liquid crystal display (LCD) and a touch panel (TP). If the screen includes a touch panel, the screen may be implemented as a touch screen to receive input signals from a user. The touch panel includes one or more touch sensors to sense touches, swipes, and gestures on the touch panel. A touch sensor may sense not only the boundary of a touch or swipe action, but also the duration and pressure associated with the touch or swipe operation.

In addition, an embodiment of the present invention further provides a computer program product comprising a computer program or instructions, wherein, when the computer program or instructions are executed by a processor, the processor is caused to implement one or more of the aforementioned sound playback detection processing method, power-on state detection processing method for an audio and video playback device, echo cancellation processing method, projection playback processing method, and remote conference collaborative playback processing method.

Those of ordinary skill in the art will understand that all or some of the steps of the above method embodiments may be carried out by hardware under the control of program instructions. The program may be stored in a computer-readable storage medium. When executed, the program performs the steps of the above method embodiments; the storage medium includes various media that can store program code, such as ROM, RAM, magnetic disks, or optical discs.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in the foregoing embodiments may still be modified, or some or all of their technical features may be equivalently replaced, and that such modifications or replacements do not cause the essence of the corresponding technical solutions to depart from the scope of the technical solutions of the embodiments of the present invention.

Claims

1. A sound playback detection and processing method, comprising:

obtaining a first audio signal detected by a sound pickup module of an audio and video source output device, and a second audio signal sent by the audio and video source output device to an audio and video playback device;

detecting, according to the audio signal correlation and phase difference between the first audio signal and the second audio signal, whether the audio and video playback device plays the second audio signal; and

if the audio and video playback device does not play the second audio signal, playing the second audio signal through a speaker of the audio and video source output device.

2. The method according to claim 1, wherein detecting, according to the audio signal correlation and phase difference between the first audio signal and the second audio signal, whether the audio and video playback device plays the second audio signal comprises:

if the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, and the phase difference between the first audio signal and the second audio signal is greater than a preset phase difference threshold, determining that the audio and video playback device plays the second audio signal; otherwise, determining that the audio and video playback device does not play the second audio signal.
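By way of illustration only, the two-threshold test of claim 2 might be sketched as follows. The sketch assumes that the "phase difference" is measured as the delay, in samples, at which the second (reference) signal best matches the first (microphone) signal, and that "correlation" is a normalized cross-correlation. The function names, the threshold values, and the test data are hypothetical and are not taken from the application.

```python
import math
import random


def best_lag_and_correlation(mic, ref, max_lag):
    """Return (lag, corr): the delay of ref inside mic, searched over
    0..max_lag samples, that gives the highest normalized correlation."""
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        n = min(len(ref), len(mic) - lag)
        if n <= 0:
            break
        seg = mic[lag:lag + n]
        dot = sum(m * r for m, r in zip(seg, ref))
        norm = math.sqrt(sum(m * m for m in seg) * sum(r * r for r in ref[:n]))
        corr = dot / norm if norm > 0 else 0.0
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag, best_corr


def playback_device_is_playing(mic, ref, max_lag,
                               corr_threshold=0.6, lag_threshold=3):
    """Claim 2 style test: both thresholds must be exceeded.
    The threshold values are assumptions chosen for this example."""
    lag, corr = best_lag_and_correlation(mic, ref, max_lag)
    return corr > corr_threshold and lag > lag_threshold


# Illustrative data: the same reference heard with a long and a short delay.
rng = random.Random(0)
ref = [rng.uniform(-1.0, 1.0) for _ in range(200)]
mic_far = [0.0] * 10 + [0.5 * x for x in ref]   # delayed by 10 samples
mic_near = [0.5 * x for x in ref]               # no delay
```

In this sketch a long delay stands in for sound arriving from the separate playback device, while a near-zero delay stands in for sound from the source device's own speaker; a real device would have to calibrate both thresholds.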

3. The method according to claim 1, wherein detecting, according to the audio signal correlation and phase difference between the first audio signal and the second audio signal, whether the audio and video playback device plays the second audio signal comprises:

if the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, and the first audio signal includes sub-audio signals with different phase differences, wherein the phase difference between one sub-audio signal and the second audio signal is greater than a preset phase difference threshold and the phase difference between another sub-audio signal and the second audio signal is less than the phase difference threshold, determining that the audio and video playback device and the speaker of the audio and video source output device are both playing the second audio signal, and stopping the speaker of the audio and video source output device from playing the second audio signal.

4. The method according to claim 1, wherein detecting, according to the audio signal correlation and phase difference between the first audio signal and the second audio signal, whether the audio and video playback device plays the second audio signal comprises:

if the audio signal correlation between the first audio signal and the second audio signal is greater than a preset correlation threshold, but the phase difference between the first audio signal and the second audio signal is less than a preset phase difference threshold, determining that the audio and video playback device is not playing the second audio signal and that the speaker of the audio and video source output device is playing the second audio signal.
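The cases distinguished in claims 2 to 4 can be summarized in one decision function. The following is a minimal sketch, assuming the correlation and the per-sub-signal phase differences have already been measured; `PlaybackState`, `classify_playback`, and the default thresholds are hypothetical names and values introduced only for illustration.

```python
from enum import Enum


class PlaybackState(Enum):
    NOT_PLAYING = "no correlated playback detected"
    PLAYBACK_DEVICE = "playback device is playing"
    SOURCE_SPEAKER = "source device speaker is playing"
    BOTH = "both are playing"


def classify_playback(correlation, phase_diffs,
                      corr_threshold=0.6, phase_threshold=0.005):
    """Classify who is playing the second audio signal.

    phase_diffs holds one phase difference per correlated sub-audio
    signal found in the first audio signal (assumed here to be a delay
    in seconds). A value exactly equal to the threshold is treated as
    "not greater", which is an assumption the claims leave open.
    """
    if correlation <= corr_threshold or not phase_diffs:
        return PlaybackState.NOT_PLAYING
    far = any(d > phase_threshold for d in phase_diffs)
    near = any(d <= phase_threshold for d in phase_diffs)
    if far and near:
        # Claim 3: both play; the source speaker should then be stopped.
        return PlaybackState.BOTH
    if far:
        # Claim 2: only the playback device is heard.
        return PlaybackState.PLAYBACK_DEVICE
    # Claim 4: only the source device's own speaker is heard.
    return PlaybackState.SOURCE_SPEAKER
```

A caller would stop the source device's speaker on `BOTH` and start it on `NOT_PLAYING`, which mirrors the switching behaviour described in claims 1 and 3.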

5. The method according to claim 1, wherein the audio and video source output device is a TV box, the audio and video playback device is a TV device, and obtaining the second audio signal sent by the audio and video source output device to the audio and video playback device comprises: obtaining the second audio signal from the HDMI output port of the audio and video source output device.

6. A method for detecting and processing a power-on state of an audio and video playback device, comprising:

sending a detection audio signal to the audio and video playback device, and obtaining a first audio signal detected by a sound pickup module of an audio and video source output device;

detecting, according to the audio signal correlation and phase difference between the first audio signal and the detection audio signal, whether the audio and video playback device plays the detection audio signal; and

if the audio and video playback device plays the detection audio signal, outputting an audio and/or video content signal to the audio and video playback device.

7. The method according to claim 6, wherein, if the audio and video playback device does not play the detection audio signal, the detection audio signal is sent to the audio and video playback device again, and a new round of detecting whether the audio and video playback device plays the detection audio signal is performed.

8. The method according to claim 7, further comprising: if the audio and video playback device does not play the detection audio signal, sending a power-on instruction to the audio and video playback device.

9. The method according to claim 6, wherein detecting, according to the audio signal correlation and phase difference between the first audio signal and the detection audio signal, whether the audio and video playback device plays the detection audio signal comprises:

if the audio signal correlation between the first audio signal and the detection audio signal is greater than a preset correlation threshold, and the phase difference between the first audio signal and the detection audio signal is greater than a preset phase difference threshold, determining that the audio and video playback device plays the detection audio signal; otherwise, determining that the audio and video playback device does not play the detection audio signal.
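The probe-and-retry flow of claims 6 to 8 might be sketched as the loop below. This is an illustrative sketch only: the four callables are hypothetical stand-ins for device-specific operations, and the `max_rounds` bound is an assumption, since the claims do not limit the number of detection rounds.

```python
def wait_for_playback_device(send_probe, probe_is_heard,
                             send_power_on, start_content, max_rounds=5):
    """Repeat the detection round until the probe is heard.

    send_probe:      sends the detection audio signal to the playback device
    probe_is_heard:  returns True when the pickup module hears the probe
                     (for example, the two-threshold test of claim 9)
    send_power_on:   sends a power-on instruction (claim 8)
    start_content:   outputs the audio and/or video content signal
    """
    for _ in range(max_rounds):
        send_probe()
        if probe_is_heard():
            start_content()
            return True
        # Not heard: ask the playback device to power on, then retry.
        send_power_on()
    return False
```

Passing the operations in as callables keeps the loop independent of any particular transport, so the same control flow could be exercised with simple fakes.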

10. An echo cancellation processing method, comprising:

comparing the signal characteristics of a first audio signal and a second audio signal, and obtaining correlated audio signal portions whose audio signal correlation is greater than a preset correlation threshold;

determining a phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal; and

removing, from the first audio signal, the second audio signal after its phase difference has been eliminated.

11. The method according to claim 10, wherein the process of comparing the signal characteristics of the first audio signal and the second audio signal, obtaining the correlated audio signal portions whose audio signal correlation is greater than the preset correlation threshold, and determining the phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal is performed at a preset time interval; and

removing, from the first audio signal, the second audio signal after its phase difference has been eliminated comprises: within the preset time interval, removing from the first audio signal the second audio signal whose phase difference has been eliminated using the most recently determined phase difference.
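As a rough illustration of claims 10 and 11, the sketch below aligns the second (reference) signal with the first (microphone) signal and subtracts it. It assumes the phase difference is a whole-sample delay found by normalized cross-correlation, and it additionally estimates a single least-squares gain for the echo, which is an assumption of this example and not something the claims specify. All names and the test data are hypothetical.

```python
import math
import random


def estimate_lag(mic, ref, max_lag):
    """Delay (in samples, 0..max_lag) at which ref best matches mic."""
    best_lag, best_corr = 0, 0.0
    for lag in range(max_lag + 1):
        n = min(len(ref), len(mic) - lag)
        if n <= 0:
            break
        seg = mic[lag:lag + n]
        dot = sum(m * r for m, r in zip(seg, ref))
        norm = math.sqrt(sum(m * m for m in seg) * sum(r * r for r in ref[:n]))
        corr = dot / norm if norm > 0 else 0.0
        if corr > best_corr:
            best_lag, best_corr = lag, corr
    return best_lag


def cancel_echo(mic, ref, lag):
    """Subtract the delayed, gain-scaled reference from the microphone signal."""
    n = min(len(ref), len(mic) - lag)
    seg = mic[lag:lag + n]
    energy = sum(r * r for r in ref[:n])
    # Least-squares gain of the echo path (assumed to be a single scalar).
    gain = sum(m * r for m, r in zip(seg, ref)) / energy if energy > 0 else 0.0
    cleaned = list(mic)
    for i in range(n):
        cleaned[lag + i] -= gain * ref[i]
    return cleaned


# Illustrative data: quiet local sound plus an echo delayed by 7 samples.
rng = random.Random(1)
ref = [rng.uniform(-1.0, 1.0) for _ in range(400)]
speech = [0.1 * rng.uniform(-1.0, 1.0) for _ in range(407)]
echo = [0.0] * 7 + [0.5 * x for x in ref]
mic = [s + e for s, e in zip(speech, echo)]

lag = estimate_lag(mic, ref, max_lag=20)
cleaned = cancel_echo(mic, ref, lag)
```

To follow claim 11, a caller would rerun `estimate_lag` at a preset time interval and pass the most recent result to `cancel_echo` until the next estimate is available. Practical echo cancellers usually rely on adaptive filters rather than a single delay and gain, so this sketch only conveys the align-then-subtract idea.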

12. A sound playback detection and processing device, comprising:

an audio signal acquisition module for acquiring the first audio signal detected by the sound pickup module of the audio and video source output device and the second audio signal sent by the audio and video source output device to the audio and video playback device;

A playback detection module, configured to detect whether the audio and video playback device plays the second audio signal according to the audio signal correlation and phase difference of the first audio signal and the second audio signal;

An audio playback switching processing module, configured to play the second audio signal through the speaker of the audio and video source output device when it is detected that the audio and video playback device does not play the second audio signal.

13. The apparatus according to claim 12, wherein detecting, according to the audio signal correlation and phase difference between the first audio signal and the second audio signal, whether the audio and video playback device plays the second audio signal comprises:

14. A power-on state detection and processing device for an audio and video playback device, comprising:

a detection signal sending module, configured to send a detection audio signal to the audio and video playback device;

an audio signal acquisition module, configured to acquire a first audio signal detected by a sound pickup module of an audio and video source output device;

a playback detection module, configured to detect, according to the audio signal correlation and phase difference between the first audio signal and the detection audio signal, whether the audio and video playback device plays the detection audio signal; and

a playback processing module, configured to output an audio and/or video content signal to the audio and video playback device when it is detected that the audio and video playback device plays the detection audio signal.

15. An echo cancellation processing device, comprising:

a correlation processing module, configured to compare the signal characteristics of a first audio signal and a second audio signal, and obtain correlated audio signal portions whose audio signal correlation is greater than a preset correlation threshold;

a phase difference determination module, configured to determine a phase difference between the correlated audio signal portion of the first audio signal and the correlated audio signal portion of the second audio signal; and

an echo cancellation module, configured to remove, from the first audio signal, the second audio signal after its phase difference has been eliminated.

16. A projection playback processing method, comprising:

outputting audio and video signals to a projection device; and

detecting, through a sound pickup module, whether the projection device plays the audio signal in the audio and video signals, and if it is determined that the projection device does not play the audio signal, playing the audio signal through a speaker of the audio and video source output device.

17. A remote conference collaborative playback processing method, comprising:

obtaining a same-screen image signal of the remote conference and audio and video signals of the conference site;

comparing the video signal in the audio and video signals with the same-screen image signal, and determining the audio signal corresponding to the same-screen image signal; and

playing the same-screen image signal and the corresponding audio signal synchronously.

18. An audio and video source output device, comprising:

an audio and video signal output module, configured to output audio and video signals to an audio and video playback device;

an audio signal playing module, configured to play, according to an instruction of an audio playback detection module, the audio signal output to the audio and video playback device; and

the audio playback detection module, configured to detect whether the audio and video playback device plays the audio signal, and if it is determined that the audio and video playback device does not play the audio signal, trigger the audio signal playing module to play the audio signal.

19. The apparatus of claim 18, further comprising:

a switching control module, configured to receive a user's voice command and, according to the voice command, switch audio signal playback between the audio and video playback device and the audio signal playing module.

20. An electronic device comprising:

a memory, configured to store a program; and

a processor, configured to run the program stored in the memory to execute any one or more of: the sound playback detection processing method according to any one of claims 1 to 5, the power-on state detection processing method for an audio and video playback device according to any one of claims 6 to 9, the echo cancellation processing method according to claim 10 or 11, the projection playback processing method according to claim 16, and the remote conference collaborative playback processing method according to claim 17.

21. A computer program product, comprising a computer program or instructions, wherein, when the computer program or instructions are executed by a processor, the processor is caused to implement any one or more of: the sound playback detection processing method according to any one of claims 1 to 5, the power-on state detection processing method for an audio and video playback device according to any one of claims 6 to 9, the echo cancellation processing method according to claim 10 or 11, the projection playback processing method according to claim 16, and the remote conference collaborative playback processing method according to claim 17.