
CN115691516B - Low-delay audio matrix configuration method and server - Google Patents


Info

Publication number
CN115691516B
CN115691516B (application CN202211362509.1A)
Authority
CN
China
Prior art keywords
audio
audio data
delay
data
output
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202211362509.1A
Other languages
Chinese (zh)
Other versions
CN115691516A (en)
Inventor
黄志军
王子兵
林弟
张常华
朱正辉
赵定金
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangdong Baolun Electronics Co ltd
Original Assignee
Guangdong Baolun Electronics Co ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guangdong Baolun Electronics Co ltd
Priority claimed from CN202211362509.1A
Publication of CN115691516A
Application granted
Publication of CN115691516B
Legal status: Active


Classifications

    • Y - GENERAL TAGGING OF NEW TECHNOLOGICAL DEVELOPMENTS; GENERAL TAGGING OF CROSS-SECTIONAL TECHNOLOGIES SPANNING OVER SEVERAL SECTIONS OF THE IPC; TECHNICAL SUBJECTS COVERED BY FORMER USPC CROSS-REFERENCE ART COLLECTIONS [XRACs] AND DIGESTS
    • Y02 - TECHNOLOGIES OR APPLICATIONS FOR MITIGATION OR ADAPTATION AGAINST CLIMATE CHANGE
    • Y02D - CLIMATE CHANGE MITIGATION TECHNOLOGIES IN INFORMATION AND COMMUNICATION TECHNOLOGIES [ICT], I.E. INFORMATION AND COMMUNICATION TECHNOLOGIES AIMING AT THE REDUCTION OF THEIR OWN ENERGY USE
    • Y02D30/00 - Reducing energy consumption in communication networks
    • Y02D30/70 - Reducing energy consumption in wireless communication networks

Landscapes

  • Data Exchanges In Wide-Area Networks

Abstract

The invention discloses a low-delay audio matrix configuration method and a server. The method comprises the following steps. S1: driving an input device to establish a connection with an output device, the input device and the output device exchanging audio data. S2: obtaining the hardware operation rate, the current network delay and the audio packet processing capacity of each playback device, calculating an audio data interception amount, and driving the output device to intercept the corresponding audio data before sending it to the playback device. The invention fully considers the hardware operation rate, current network delay and audio packet processing capacity of each playback device and calculates the amount to intercept in real time, so that the pending data of devices under different network delay conditions is dynamically intercepted to different degrees. Multiple devices can thus play in synchrony, meeting the requirement of synchronized audio playback across multiple devices in professional scenes and eliminating the perceptible delay caused by the network environment.

Description

Low-latency audio matrix configuration method and server

Technical Field

The present invention relates to the technical field of audio processing, and in particular to a low-latency audio matrix configuration method and server.

Background

With the advance of informatization, ever higher requirements are placed on the audio output of audio matrices in professional scenes such as medium and large stage conferences, broadcasting rooms and studios, including but not limited to output-source accuracy, freedom from delay, high-quality mixing and handling of complex audio preemption logic.

Existing techniques merely split large audio data packets into multiple small packets, which reduces delay to a limited extent. In a real network or device environment, however, network equipment and other environmental factors introduce further delay. Moreover, the prior art takes the device with the highest delay as the reference and delays the data of all other devices to that highest value before sending synchronously; this does not solve the delay problem at its root, so in some professional scenes the played audio exhibits a perceptible lag.

Summary of the Invention

To overcome the deficiencies of the prior art, a first object of the present invention is to provide a low-latency audio matrix configuration method that solves the problem of perceptible delay in audio data transmission caused by the network environment in professional scenes.

A second object of the present invention is to provide a low-latency audio matrix configuration server that solves the same problem.

To achieve the first object, the present invention adopts the following technical solution:

A low-latency audio matrix configuration method comprises the following steps:

S1: driving an input device to establish a connection with an output device, the input device and the output device exchanging audio data;

S2: obtaining the hardware operation rate, current network delay and audio packet processing capacity of each playback device, calculating the audio data interception amount, and driving the output device to intercept the corresponding audio data before sending it to the playback device.

Preferably, S2 is implemented by the following steps:

S21: obtaining the current network delay of each playback device;

S22: judging whether the current network delay is greater than or equal to a preset delay value; if not, driving the output device to send the audio data to the playback device; if so, executing S23;

S23: obtaining the hardware operation rate and audio packet processing capacity of the playback device, and calculating the audio data interception amount;

S24: driving the output device to intercept the corresponding audio data and then send it to the playback device.

Preferably, S1 is implemented by the following steps:

sending audio configuration information to the input device and the output device, and driving them to establish a connection through the audio configuration information and exchange audio data.

Preferably, the following steps are further included between S1 and S2:

A1: driving the input device to remove echo from the audio frames, perform noise reduction, and identify and filter background noise;

A2: judging whether the input device is currently receiving audio data; if not, encoding a blank encoded frame and sending it to the output device; if so, executing A3;

A3: judging whether mixing is to be performed; if so, attenuating the voice by an attenuation factor; if not, parsing the audio data into a PCM audio stream and performing noise detection.

To achieve the second object, the present invention adopts the following technical solution:

A low-latency audio matrix configuration server comprises a memory and a processor;

the memory is used to store program instructions;

the processor is used to run the program instructions so as to execute the low-latency audio matrix configuration method described above.

Compared with the prior art, the present invention has the beneficial effect that the hardware operation rate, current network delay and audio packet processing capacity of each playback device are fully considered and the amount to intercept is calculated in real time, so that the pending data of devices under different network delay conditions is dynamically intercepted to different degrees. Multiple devices can thus stay synchronized, meeting the need for synchronized audio playback across multiple devices in professional scenes and eliminating the perceptible delay caused by the network environment.

Brief Description of the Drawings

Fig. 1 is a flowchart of the low-latency audio matrix configuration method of the present invention.

Detailed Description

Preferred embodiments of the present invention are described below with reference to the accompanying drawings. It should be understood that the preferred embodiments described here are intended only to illustrate and explain the present invention, not to limit it.

In the description of the present invention, it should be noted that orientation or positional terms such as "center", "upper", "lower", "left", "right", "vertical", "horizontal", "inner" and "outer" are based on the orientations or positional relationships shown in the drawings, are used only for convenience and simplicity of description, and do not indicate or imply that the referenced device or element must have a specific orientation or be constructed and operated in a specific orientation; they therefore cannot be understood as limiting the invention. In addition, the terms "first", "second" and "third" are used for descriptive purposes only and cannot be understood as indicating or implying relative importance.

It should also be noted that, unless otherwise expressly specified and limited, the terms "mounted", "connected" and "coupled" are to be understood broadly: a connection may be fixed, detachable or integral; mechanical or electrical; direct, or indirect through an intermediary; or an internal communication between two elements. Those of ordinary skill in the art can understand the specific meanings of these terms in the present invention according to the specific situation.

The present invention is further described below with reference to the drawings and specific embodiments:

In the present invention, a host (equivalent to the server), input devices and output devices form an audio matrix. The input device may be a TX-8770 encoder and the output device a TX-8704 decoder; the host, input devices and output devices are all connected to the same network.

Embodiment 1:

As shown in Fig. 1, a low-latency audio matrix configuration method comprises the following steps:

S1: driving an input device to establish a connection with an output device, the input device and the output device exchanging audio data;

Specifically, by driving the input device to establish a connection with the output device, the input device transmits audio data collected from the audio source to the output device, and the output device then transmits the audio data to the playback devices. In this embodiment, S1 is implemented by the following steps:

sending audio configuration information to the input device and the output device, and driving them to establish a connection through the audio configuration information and exchange audio data.

Preferably, after the host (server) service starts normally, it sends a UDP broadcast into the network. On receiving the broadcast, the input and output devices send an online request to the host. After accepting each device's online request, the host records the device configuration information and maintains heartbeat communication with each device. Device information, matrix configuration, preemption logic configuration and audio configuration are set on the host. Based on the addresses of the input and output devices, the host sends audio configuration information (equivalent to a configuration protocol signal) to each of them: the input device configures itself according to the received information and outputs audio data to the output device; the output device configures itself according to the received information, verifies and accepts the audio data delivered by the input device, and each of its channels then outputs the received audio data to the playback devices.
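The server-side bookkeeping described above (online requests followed by periodic heartbeats) can be sketched as follows; the class and field names are assumptions for illustration, not from the patent:

```python
import time

class DeviceRegistry:
    """Track devices that answered the UDP broadcast, via heartbeats."""

    def __init__(self, heartbeat_timeout=3.0):
        self.devices = {}
        self.timeout = heartbeat_timeout

    def on_online_request(self, addr, info):
        # Device answered the broadcast: record its configuration info.
        self.devices[addr] = {'info': info, 'last_seen': time.monotonic()}

    def on_heartbeat(self, addr, status):
        # Heartbeat carries network condition, temperature, processing rate...
        d = self.devices.get(addr)
        if d:
            d['last_seen'] = time.monotonic()
            d['status'] = status

    def alive(self, addr):
        d = self.devices.get(addr)
        return d is not None and time.monotonic() - d['last_seen'] < self.timeout

reg = DeviceRegistry()
reg.on_online_request(('192.168.1.10', 9000), {'type': 'TX-8770'})
reg.on_heartbeat(('192.168.1.10', 9000), {'rate_bytes_per_ms': 5000})
```

A real implementation would drive these callbacks from a UDP socket loop; the sketch only shows the registry that the heartbeat protocol keeps current.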

If the host goes offline, the input and output devices can still transmit audio data according to the audio configuration information. If an input or output device suffers a power failure, network outage or other abnormality, the host re-issues the audio configuration information after the device restarts, and the audio data exchange between the input and output devices returns to normal.

Further, the following steps are included between S1 and S2:

A1: driving the input device to remove echo from the audio frames, perform noise reduction, and identify and filter background noise;

Specifically, the input device performs conventional audio processing on the audio data, such as echo cancellation, noise suppression and silence detection. Preferably, based on the audio frame just played, cancellation-like operations are performed on the captured audio frame so that the echo is removed from it; during noise reduction, the portions belonging to background noise are identified and filtered out of the audio frame.

A2: judging whether the input device is currently receiving audio data; if not, encoding a blank encoded frame and sending it to the output device; if so, executing A3;

Specifically, the silence detection algorithm combined with the preceding noise suppression algorithm can recognize that no audio source is currently feeding the input device. In that case the input device's encoder outputs a special encoded frame, such as a blank encoded frame, to the output device, saving bandwidth and traffic.
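A minimal sketch of this blank-frame path, assuming a simple energy threshold for silence and a 1-byte marker frame (both are assumptions; the patent specifies neither the detector nor the frame format):

```python
import struct

SILENCE_THRESHOLD = 50   # mean absolute 16-bit amplitude, assumed
BLANK_FRAME = b'\x00'    # special "no audio" marker frame, assumed

def encode_frame(pcm: bytes) -> bytes:
    """Return a blank marker when the frame is silent, else the frame."""
    samples = struct.unpack('<%dh' % (len(pcm) // 2), pcm)
    energy = sum(abs(s) for s in samples) / max(1, len(samples))
    if energy < SILENCE_THRESHOLD:
        return BLANK_FRAME          # saves bandwidth and traffic
    return pcm                      # stand-in for the real encoder

quiet = encode_frame(bytes(3840))   # an all-zero 20 ms frame
```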

A3: judging whether mixing is to be performed; if so, attenuating the voice by an attenuation factor; if not, parsing the audio data into a PCM audio stream and performing noise detection.

Specifically, when mixing, a normalized mixing algorithm (adaptive weighted mixing) is used: each sample of audio data is represented with more bits (32 bits), and after mixing the amplitude is reduced so that it still falls within the range representable by 16 bits. To avoid overflow, the voice is attenuated by a variable attenuation factor. This factor also represents the weight of the voice and changes as the audio data changes, hence "adaptive weighted mixing". On overflow the factor is made smaller, so that the overflowing data falls back within the critical value after attenuation; when there is no overflow, the factor is slowly increased so that the data changes smoothly.
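The adaptive weighted mixing described above can be sketched as follows. The factor-recovery step size is an assumption, and a real implementation would operate on NumPy buffers rather than Python lists:

```python
INT16_MAX = 32767

def mix_adaptive(frames, f=1.0):
    """Mix equal-length lists of int16 samples with an adaptive factor.

    Samples are summed with wide (Python int) headroom; on overflow the
    attenuation factor f shrinks, otherwise it creeps back toward 1.0.
    Returns (mixed int16 samples, updated factor).
    """
    out = []
    for samples in zip(*frames):
        s = sum(samples)                 # 32-bit-style headroom
        if abs(s) * f > INT16_MAX:       # would clip: shrink the factor
            f = INT16_MAX / abs(s)
        v = int(s * f)
        out.append(max(-INT16_MAX - 1, min(INT16_MAX, v)))
        f = min(1.0, f + 0.001)          # recover slowly toward unity
    return out, f

mixed, factor = mix_adaptive([[30000, 30000], [30000, -30000]])
```

The first output sample (sum 60000) is pulled back inside the 16-bit range by the shrunken factor, while the second (sum 0) passes through unchanged, matching the behaviour the text describes.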

When not mixing, a deep-learning-based AEC algorithm is used, combining traditional signal processing with deep learning to perform further residual echo suppression after the adaptive filtering of echo cancellation. Inside the program, the audio is parsed into PCM audio stream data and passed through a series of noise detection steps, and the audio quality is adaptively gained into a stable frequency range, reducing transmission delay and improving audio quality.

S2: obtaining the hardware operation rate, current network delay and audio packet processing capacity of each playback device, calculating the audio data interception amount, and driving the output device to intercept the corresponding audio data before sending it to the playback device.

Specifically, before the audio data is sent, the audio data interception amount is calculated according to each playback device's network delay and data processing rate, and the data is then sent with real-time, synchronized low-delay processing. In this embodiment, S2 is implemented by the following steps:

S21: obtaining the current network delay of each playback device;

Specifically, the current network delay can be obtained by directly pinging each terminal device (playback device) recorded in the server and measuring its delay.

S22: judging whether the current network delay is greater than or equal to a preset delay value; if not, driving the output device to send the audio data to the playback device; if so, executing S23;

Specifically, whether the audio data needs processing is decided by judging whether the current network delay exceeds the preset delay value. Preferably, if the current network delay is below 1 ms it is considered normal and the per-second delay of each terminal is recorded; if it exceeds 1 ms, the audio data must be intercepted.

S23: obtaining the hardware operation rate and audio packet processing capacity of the playback device, and calculating the audio data interception amount;

Specifically, the operation rate of the current device's chip is calculated and obtained. Because a specific private protocol exists between the devices and the server, heartbeat data is maintained in real time, and the heartbeat packet stores the basic information of the current device, such as network condition, temperature and current processing rate. The rate is calculated as: data amount (bytes/second) = sampling rate (Hz) * sampling bit depth (bits) * number of chip sampling channels / 8, where 1 byte = 8 bits. For a 12-bit, 4-channel AD chip sampling at 200 kHz, the data amount per second is 200000 * 12 * 4 / 8 = 1200000 bytes = 1171.875 KB = 1.14441 MB, i.e. a transmission rate of 1200000 bytes/s, or 1200 bytes/ms. If other data is being received at the same time, the remaining amount of data the device can still process is calculated, read and stored in the server. The per-second audio packet figures are then calculated as follows: with audio sampling rate = 48000, sampling channels = 2, bit depth = 16 and sampling interval = 20 ms, capture runs 50 times per second (1 s = 1000 ms), so the total data amount is:

Total data per second = 48000 * 2 * 16 / 8 = 192000 (bytes)

Audio data per frame = 192000 / 50 = 3840 (bytes)

Samples per channel per frame = 3840 / 2 (channels) / 2 (bytes per sample) = 960

Data per millisecond = 192000 / 1000 = 192 (bytes)
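The arithmetic above can be checked with a short sketch; the helper name is illustrative, not from the patent:

```python
def bytes_per_second(sample_rate_hz, bit_depth, channels):
    """Raw PCM data rate: rate * bits * channels / 8."""
    return sample_rate_hz * bit_depth * channels // 8

total = bytes_per_second(48000, 16, 2)            # 192000 bytes/s
frames_per_second = 1000 // 20                    # 20 ms frames -> 50/s
frame_bytes = total // frames_per_second          # 3840 bytes per frame
per_ms = total // 1000                            # 192 bytes per ms
samples_per_channel = 48000 // frames_per_second  # 960 samples/channel/frame

# The 12-bit, 4-channel, 200 kHz AD chip example from the text:
ad_chip = bytes_per_second(200000, 12, 4)         # 1200000 bytes/s
```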

The hardware operation rate of each playback device and its current network delay are then taken into account. For example, if playback device A has a remaining processing capacity of 5000 bytes/ms and a current network delay of 1 ms, then since 5000 bytes/ms > 192 bytes/ms, data is transmitted normally.

If playback device B has a remaining processing capacity of 100 bytes/ms and a current network delay of 2 ms, i.e. 1 ms of extra delay, its processing rate of 100 bytes is below the 192 bytes required, so the total data to be handled is 192 + 192 - 100 = 284 bytes.

If playback device C has a remaining processing capacity of 1200 bytes/ms and a current network delay of 3 ms, i.e. 2 ms of extra delay, the backlogged audio data is 192 * 2 = 384 bytes; since the device's own processing rate of 1200 bytes/ms exceeds 192 bytes/ms, the total to be handled is 384 bytes.

In summary, for the data finally sent, the 192000 bytes/s stream is split into 512-byte packets, sent 375 times per second. In this embodiment, on an ordinary LAN a UDP payload of at most 1472 bytes is best (to avoid fragmentation and reassembly). In Internet programming, however, routers may be configured with smaller values; the standard MTU on the Internet is 576, so for UDP over the Internet the data length is best kept within 576 - 20 - 8 = 548 bytes. Moreover, since 512 is 2^9 and computers are binary, all multiplications and divisions by powers of two can be turned into shift operations, greatly reducing the computational burden.
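The per-device rule behind these examples is stated as a formula in claim 1, with q the per-unit-time audio rate, m the device's remaining processing capacity and n the measured network delay. A minimal sketch, assuming bytes-per-millisecond units throughout:

```python
def interception_amount(m, n, q):
    """Bytes to truncate for one playback device (claim 1 formula)."""
    if n <= 1:                # delay within the 1 ms threshold: send as-is
        return 0
    backlog = (n - 1) * q     # data accumulated during the extra delay
    if m > backlog:           # device can absorb the backlog itself
        return backlog
    return n * q - m          # device too slow: truncate the difference

# The three devices from the embodiment (q = 192 bytes/ms):
a = interception_amount(m=5000, n=1, q=192)   # device A: no truncation
b = interception_amount(m=100,  n=2, q=192)   # device B: 2*192 - 100
c = interception_amount(m=1200, n=3, q=192)   # device C: (3-1)*192
```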

S24: driving the output device to intercept the corresponding audio data and then send it to the playback device.

In this embodiment, playback device A sends its 192000 bytes normally, and its first packet is sent as-is. Playback device B sends 192000 - 284 = 191716 bytes in one second: the first 284 bytes are truncated, and the first packet is counted from byte 284 up to byte 192000. Playback device C sends 192000 - 384 = 191616 bytes in one second: the first 384 bytes are truncated, counting from byte 384 up to byte 192000, and its first packet carries 512 - 384 = 128 bytes. Once the audio data interception is complete, the timing error between playback devices A, B and C is resolved and their output data is synchronized; the error between sound input and sound playback is 1 ms, which does not affect normal use. Although a device thus loses a small amount of audio data, a fade-in effect is applied, rising from quiet to full volume, and the available time is fully sufficient for the audio on all devices to start in sync without affecting use. Subsequent audio packets are adjusted and recalculated in real time according to network fluctuations and the devices' own processing conditions; in a professional broadcasting-room scene, the delay from sound source in to sound out is kept within 1 ms.
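A sketch of how the truncation, fade-in and 512-byte packetization described above might fit together; the helper name, fade length and 16-bit little-endian sample format are assumptions:

```python
import struct

def truncate_and_packetize(stream, cut, packet_size=512, fade_samples=64):
    """Drop the first `cut` bytes, fade in, and chunk into packets."""
    data = bytearray(stream[cut:])
    n = min(fade_samples, len(data) // 2)
    for i in range(n):  # linear fade-in on the first 16-bit LE samples
        (v,) = struct.unpack_from('<h', data, 2 * i)
        struct.pack_into('<h', data, 2 * i, int(v * i / n))
    return [bytes(data[j:j + packet_size])
            for j in range(0, len(data), packet_size)]

one_second = bytes(192000)               # placeholder one-second PCM stream
pkts = truncate_and_packetize(one_second, cut=384)
```

With device C's interception of 384 bytes, 191616 bytes remain, i.e. 375 packets with one short 128-byte packet, matching the figures in the text (the short packet's position within the sequence is a detail this sketch does not model).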

Further, the audio data carries a unique identifier, and the data identifier is salted and encrypted, so that packets cannot be captured in transit and their data intercepted to interfere with the target device's continuous audio output. The preemption logic allows the client page to invoke the audio matrix system under different scenes. Scenes have priorities, and a higher-priority scene can preempt a lower-priority one. When audio preemption is configured, preemption follows the priority values configured on the input channels, the higher preempting the lower. When audio preemption is not configured, the audio channels are output mixed or unmixed.

In addition, considering that specific venues need to switch audio at any time and output different audio, the system also provides IO-triggered scene switching and supports offline operation. The required scene matrix points are configured on the host and mapped to the different IO trigger ports of the device encoder (input device), after which switching is convenient for users working offline. In a particular venue where scenes must be switched promptly, the host can be bypassed and the encoder's IO port triggered directly to apply the device scene configuration, switching the audio processing between different encoders and decoders, applying different audio source inputs and outputting different audio. The procedure is as follows:

Scene 1: with encoder 1, encoder 2, decoder 1 and decoder 2, configure the output and input channels corresponding to each matrix point and save; in encoder 1, link IO trigger channel 1 to scene 2 and save.

Scene 2: with encoder 1, encoder 3, decoder 1 and decoder 2, configure the output and input channels corresponding to each matrix point and save; in encoder 1, link IO trigger channel 2 to scene 1 and save.

At the project site, according to actual needs, the IO channel port of encoder 1 is short-circuited to trigger a switch to the previously saved scene, or the scene is switched on the computer side; different audio is applied to the scene, and the specified audio sources are input and output.
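The IO-triggered switching in the two scene examples above can be sketched as a lookup table; the data structure and names are assumptions for illustration:

```python
# Matrix points saved on the host for each scene (from the two examples).
SCENES = {
    'scene1': {'inputs': ['encoder1', 'encoder2'],
               'outputs': ['decoder1', 'decoder2']},
    'scene2': {'inputs': ['encoder1', 'encoder3'],
               'outputs': ['decoder1', 'decoder2']},
}

# Encoder-1 IO trigger channels linked to scenes, as configured above.
IO_TRIGGERS = {('encoder1', 1): 'scene2',
               ('encoder1', 2): 'scene1'}

def on_io_trigger(device, channel, current_scene):
    """Return the scene to activate when an IO port is shorted."""
    return IO_TRIGGERS.get((device, channel), current_scene)

active = on_io_trigger('encoder1', 1, 'scene1')   # channel 1 -> scene 2
```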

Embodiment 2:

A low-latency audio matrix configuration server comprises a memory and a processor;

the memory is used to store program instructions;

the processor is used to run the program instructions so as to execute the low-latency audio matrix configuration method described in Embodiment 1.

Those skilled in the art can make various other corresponding changes and variations based on the technical solutions and concepts described above, and all such changes and variations shall fall within the protection scope of the claims of the present invention.

Claims (4)

1. A method of low delay audio matrix configuration comprising the steps of:
s1: driving the input equipment to establish connection with the output equipment, and performing audio data interaction between the input equipment and the output equipment;
s2: obtaining the hardware operation rate, the current network delay condition and the audio data packet processing amount of each playing device, calculating the audio data interception amount, driving the output device to intercept corresponding audio data, then sending to the playing device,
the step S2 is specifically realized by the following steps:
s21: acquiring the current network delay condition of each playing device;
s22: judging whether the current network delay condition is greater than or equal to a preset delay value; if not, driving the output device to send the audio data to the playing device; if so, executing S23;
s23: obtaining the hardware operation rate q and the audio data packet processing amount of the playing device, and calculating the audio data interception amount,
wherein the audio data packet processing amount m represents the remaining processing capacity of the playing device at the current hardware operation rate q, i.e. the amount of data it can process per unit time; each playing device calculates its corresponding audio data interception amount based on its operation rate q, remaining processing capacity m and current network delay n,
if m > (n-1)q, the audio data interception amount is (n-1)q, where 1 denotes one unit of time,
if m < (n-1)q, the audio data interception amount is nq-m;
s24: driving the output device to intercept the corresponding audio data and then send it to the playing device.
2. The low-delay audio matrix configuration method according to claim 1, wherein S1 is specifically implemented by the steps of:
sending audio configuration information to the input device and the output device, and driving the input device and the output device to establish a connection through the audio configuration information so as to perform audio data interaction.
3. The low-delay audio matrix configuration method according to claim 1, wherein the steps between S1 and S2 further include:
a1: driving the input device to remove echo from the audio frames, perform noise reduction, and identify and filter background noise;
a2: judging whether the input device is currently acquiring audio data; if not, encoding a blank encoded frame and transmitting it to the output device; if so, executing A3;
a3: judging whether audio mixing is to be performed; if so, attenuating the voice by an attenuation factor; otherwise, parsing the audio data into PCM audio stream data and performing noise detection.
4. A low-latency audio matrix configuration server, characterized by: comprising a memory and a processor;
a memory for storing program instructions;
a processor for executing the program instructions to perform the low delay audio matrix configuration method of any of claims 1-3.
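The delay check of step S22 and the interception rule of step S23 in claim 1 can be expressed as a short sketch. The function name and the `delay_threshold` parameter (the "preset delay value") are assumptions; only the branching rule comes from the claim.

```python
def interception_amount(q, m, n, delay_threshold):
    """Amount of audio data the output device should intercept before sending.

    q: hardware operation rate of the playing device (data units per unit time)
    m: remaining processing capacity of the playing device (data units per unit time)
    n: current network delay, measured in unit times
    delay_threshold: preset delay value from step S22 (assumed parameter)
    """
    if n < delay_threshold:        # S22: delay acceptable, send the full stream
        return 0
    if m > (n - 1) * q:            # S23, first branch
        return (n - 1) * q
    return n * q - m               # S23, second branch
```

For example, with q=10, m=100 and n=5 the device still has spare capacity (100 > 40), so 40 units are intercepted; with m=20 the deficit branch applies and nq-m = 30 units are intercepted.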
CN202211362509.1A 2022-11-02 2022-11-02 Low-delay audio matrix configuration method and server Active CN115691516B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202211362509.1A CN115691516B (en) 2022-11-02 2022-11-02 Low-delay audio matrix configuration method and server


Publications (2)

Publication Number Publication Date
CN115691516A CN115691516A (en) 2023-02-03
CN115691516B true CN115691516B (en) 2023-09-05

Family

ID=85047457

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202211362509.1A Active CN115691516B (en) 2022-11-02 2022-11-02 Low-delay audio matrix configuration method and server

Country Status (1)

Country Link
CN (1) CN115691516B (en)

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105992040A (en) * 2015-02-15 2016-10-05 深圳市民展科技开发有限公司 Multichannel audio data transmitting method, audio data synchronization playing method and devices
CN112086095A (en) * 2020-09-10 2020-12-15 深圳前海微众银行股份有限公司 Data processing method, device, equipment and storage medium
WO2021119090A1 (en) * 2019-12-09 2021-06-17 Vowel, Inc. Leveraging a network of microphones for inferring room location and speaker identity for more accurate transcriptions and semantic context across meetings
CN113300797A (en) * 2021-03-29 2021-08-24 广州市保伦电子有限公司 Multi-device self-adaptive audio synchronous playing method and processing terminal
CN114245197A (en) * 2022-02-24 2022-03-25 广州市保伦电子有限公司 A low-latency transmission method for audio and video interaction and a background server

Family Cites Families (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US9307508B2 (en) * 2013-04-29 2016-04-05 Google Technology Holdings LLC Systems and methods for syncronizing multiple electronic devices
CN105448312B (en) * 2014-06-12 2019-02-19 华为技术有限公司 Audio synchronization playback method, device and system
US10925014B2 (en) * 2015-07-16 2021-02-16 Samsung Electronics Co., Ltd. Method and apparatus for synchronization in a network
US11190836B2 (en) * 2018-12-20 2021-11-30 Hisense Visual Technology Co., Ltd. Audio playing and transmitting methods and apparatuses
US10880594B2 (en) * 2019-02-06 2020-12-29 Bose Corporation Latency negotiation in a heterogeneous network of synchronized speakers


Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Using timestamp to realize audio-video synchronization in Real-Time streaming media transmission; Jingfeng Zhang et al.; 2008 International Conference on Audio, Language and Image Processing; pp. 1073-1075 *


Similar Documents

Publication Publication Date Title
EP3490199B1 (en) Calling method and terminal
US6278478B1 (en) End-to-end network encoding architecture
US9374263B2 (en) Latency differential mitigation for real time data streams
US9602571B2 (en) Codec selection and usage for improved VoIP call quality
CN106230611B (en) In-Service Quality of Service Monitoring System with Intelligent Retransmission and Interpolation
EP3992964A1 (en) Voice signal processing method and apparatus, and electronic device and storage medium
WO2008023303A2 (en) Jitter buffer adjustment
US9973402B2 (en) Transmission device, receiving device, and relay device
EP3669509B1 (en) Methods and apparatus for the encoding of audio and/or video data
KR20210048460A (en) Appratus and method for transmitting and receiving voice data in wireless communication system
CN109495660A (en) A kind of coding method of audio data, device, equipment and storage medium
CN108965777B (en) Echo cancellation method and device
CA2689230C (en) Method of transmitting data in a communication system
WO2005022786A1 (en) A method and apparatus for testing voice quality
US8438016B2 (en) Silence-based adaptive real-time voice and video transmission methods and system
CN115691516B (en) Low-delay audio matrix configuration method and server
CN108630215B (en) Echo suppression method and device based on video networking
CN109147812B (en) Echo cancellation method and device
US11856149B2 (en) Method for establishing call connection, first terminal, server, and storage medium
JP4218456B2 (en) Call device, call method, and call system
RU2805130C1 (en) Method for building a codec of speech signals adaptive to acoustic noise
WO2021255327A1 (en) Managing network jitter for multiple audio streams
JP4004431B2 (en) Packet sending apparatus, index value calculation method and program for priority used in the same
CN121334119A (en) Speech decoding method based on domestic operating system
CN114389912A (en) Multi-channel command control system and command control method

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
CB02 Change of applicant information

Address after: No. 56 Nanli East Road, Shiqi Town, Panyu District, Guangzhou City, Guangdong Province, 510000

Applicant after: Guangdong Baolun Electronics Co.,Ltd.

Address before: No.19 Chuangyuan Road, Zhongcun street, Panyu District, Guangzhou, Guangdong 510000

Applicant before: GUANGZHOU ITC ELECTRONIC TECHNOLOGY Co.,Ltd.

GR01 Patent grant