CN101155140A - Method, device and system for audio stream error concealment

Info

Publication number: CN101155140A
Application number: CN 200610159697
Authority: CN
Inventors: 万华林; 王喆; 张军
Original assignee: Huawei Technologies Co Ltd
Current assignee: Huawei Technologies Co Ltd
Priority date: 2006-10-01
Filing date: 2006-10-01
Publication date: 2008-04-02
Also published as: WO2008040250A1

Abstract

The invention discloses a method for audio stream error concealment. The method comprises: a. classifying each audio frame to be sent according to its content to obtain type information for the frame; b. packaging the type information together with the encoding result of the frame and sending the package; c. when frame loss occurs, determining, for the lost audio frame, the type information that was obtained when the frame was classified by content; d. reconstructing the audio frame with the error recovery strategy that corresponds to the type information of the lost frame. The error concealment method makes the reconstruction of lost frames more targeted and reconstructs audio frames adaptively, achieving better compensation. The invention also discloses sending and receiving methods for audio stream error concealment, as well as a transmitter, a receiver and a system for audio stream error concealment.

Description

Method, device and system for audio stream error concealment

Technical field

The invention relates to real-time audio communication technology, and in particular to a method, device and system for audio stream error concealment.

Background art

Audio classification has been studied for a long time, but the classification methods and results differ between application scenarios. For example: 1. In noise suppression for high-end audio equipment, frequency-modulation analysis or Bayesian classifiers are commonly used to classify audio signals. 2. To better index and retrieve audio resources on the Internet, content-based audio classification and retrieval has been studied. Representative work in this area analyzes the discriminative features of audio in detail, including loudness, pitch and harmonicity, and designs audio classifiers from them. 3. Another application is the voice activity detector (VAD) that serves audio coders, speech coders in particular. A VAD detects whether speech is present during voice communication so that speech and non-speech can be coded differently, which saves channel resources without degrading call quality.

In a real-time audio transmission system (such as VoIP), the main network-related causes of degraded sound quality are delay, static interference and packet loss, of which packet loss is the most significant. Real communication networks always suffer some interference, so there is always some probability of packet loss. The error-correcting codes in the lower protocol layers can only correct bit errors inside a packet; they cannot recover a lost packet. In addition, because real-time audio services have strict delay limits, a packet delayed beyond a certain limit is also treated as lost. To maintain acceptable communication quality at a given packet loss rate, many codec algorithms integrate error concealment techniques to counter the impact of packet loss.

Fig. 1 is a block diagram of audio stream error concealment. As shown in Fig. 1, after the compressed audio signal has been transmitted over an IP network or another unreliable network, the received audio packets are usually placed in a jitter buffer, which handles tasks such as reordering late and early packets. Lost and erroneous packets are then detected. If a packet is lost or erroneous, the system starts error concealment to compensate for the loss; otherwise, the correctly received audio packet is decoded and output.

Packet loss recovery techniques for real-time audio transmission fall into two broad categories according to the processing stage: sender-based repair and receiver-based repair.

● Sender-based error concealment

Sender-based packet loss recovery is initiated by the sender and requires cooperation between the sender and the receiver. Common methods include adding redundancy, forward error correction, priority setting and classification processing.

1. Adding redundancy: adding redundancy to the data improves the fault tolerance of the system, but it also increases bandwidth.

2. Forward error correction (FEC): this strategy also relies on adding information to the transport stream from which lost packets can be repaired. It uses block or algebraic codes to generate additional packets that assist error correction, so it likewise increases bandwidth.

3. Priority setting: this technique requires a network that supports prioritized packet transmission and cannot be implemented otherwise. It only reduces the probability of packet loss caused by network congestion.

4. Classification processing: in speech coding, to let the receiver apply waveform substitution more effectively, the sender can classify speech frames according to the characteristics of the speech signal. For example, 3GPP2 VMR-WB and ITU-T G.729.1 further describe speech frames as voiced, unvoiced, voiced transition, unvoiced transition, onset and so on. After receiving the stream, the decoder can infer the type of a lost frame from the types of the preceding and following frames, and with that type it can recover the information of the lost frame more accurately.

● Receiver-based error concealment

Receiver-side error concealment, which needs no participation from the sender, essentially estimates the lost data from the received data by a range of methods and optimizes the result according to human physiological characteristics. It is basically a passive repair, is usually fairly easy to implement, and does not increase bandwidth requirements. Receiver-based error concealment methods fall into three categories:

1. Insertion-based strategies: these techniques include splicing, silence substitution and noise substitution. Splicing disturbs the timing of the media stream and performs poorly. Silence substitution (filling the position of the lost frame with a silent frame) has a very limited range of application: it works reasonably well when packet loss is infrequent (below 2%) and the gap is shorter than 4 ms, but the result becomes unacceptable when the gap reaches 40 ms. Compared with silence substitution, noise substitution (filling the position of the lost frame with a noise frame) gives a better subjective listening experience and improves the intelligibility of the speech signal. When background noise is used instead of silence, the human brain subconsciously fills in the missing part of the speech signal with the correct sound. Insertion is independent of the speech coding and of the packet encoding; it only processes the missing speech after decoding.

2. Interpolation-based strategies: compared with insertion techniques, interpolation techniques produce sound that gives a relatively better subjective impression.

3. Regeneration-based strategies: the decoder state is derived from the information surrounding the lost packet and used to generate a replacement for the lost packet. This approach is more complex to implement but gives better results.
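The insertion-based strategies in item 1 can be sketched in a few lines. This is a minimal illustration rather than any codec's actual concealment routine: frames are plain lists of float samples, a lost frame is marked with `None`, and the function names and the `noise_amplitude` parameter are hypothetical.

```python
import random

def insert_for_lost_frame(frame_len, method, noise_amplitude=0.01, rng=None):
    """Return the samples inserted in place of one lost frame."""
    if method == "splice":
        # Splicing simply joins the neighbours, so nothing is inserted
        # and the stream becomes shorter (its timing is disturbed).
        return []
    if method == "silence":
        return [0.0] * frame_len
    if method == "noise":
        # Low-level uniform noise stands in for background noise here;
        # the amplitude is an illustrative assumption.
        rng = rng or random.Random(0)
        return [rng.uniform(-noise_amplitude, noise_amplitude)
                for _ in range(frame_len)]
    raise ValueError(f"unknown insertion method: {method}")

def conceal_stream(frames, frame_len, method):
    """Concatenate decoded frames, filling each lost frame (None)."""
    out = []
    for frame in frames:
        if frame is None:
            out.extend(insert_for_lost_frame(frame_len, method))
        else:
            out.extend(frame)
    return out
```

With `frames = [[0.5] * 4, None, [0.25] * 4]`, silence and noise substitution keep the stream at 12 samples, while splicing shortens it to 8, which is the timing disturbance the text refers to.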

Generally speaking, sender-based error concealment increases network bandwidth and computational complexity and performs better than receiver-based concealment. However, if sender-side concealment is independent of the receiver, that is, unrelated to the media content, it cannot choose a concealment strategy that matches the characteristics of the lost frame. (For example, a steady speech frame closely resembles the frame before it, so frame copying conceals the loss well, whereas for a transition frame the states of the preceding and following frames must be considered before a strategy can be chosen.) Receiver-side techniques are simple and achieve some concealment, but if the concealment strategy is unrelated to the audio coding, that is, if it does not analyze the content characteristics of the lost frame and the surrounding frames in order to apply a targeted strategy, the usable concealment strategies are very limited.

A growing number of audio coding standards consider error concealment at both the encoder and the receiver. There are two representative classes of method:

(1) The encoder analyzes the characteristics of each audio frame before encoding and uses different coding methods for frames with different characteristics. For example, AMR-WB+ codes signal frames with ACELP or TCX depending on frame content, which yields 26 superframe coding modes (every four frames form a superframe). The coding mode information is used for error concealment: when a frame is lost, the receiver infers or estimates the coding mode of the superframe from the coding types of the other three frames, which provides a degree of error concealment.

(2) Speech frames are classified as voiced, unvoiced, voiced transition, unvoiced transition, onset and so on according to characteristics such as pitch and spectrum. For example, in the latest ITU-T G.729.1 standard, the encoder classifies each speech frame as voiced, unvoiced, voiced transition, unvoiced transition or onset according to its content and characteristics (VMR-WB uses the same five classes) and marks the class with 2 bits in layer 2. To further improve concealment accuracy, G.729.1 also computes the phase and energy of the frame and transmits them in layers 3 and 4 of the next frame, respectively. The decoder tries to recover the class of a lost frame from the known class identifiers (including that of the previous frame) and then reconstructs the audio waveform from the class of the lost frame together with its phase and energy information.
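The figure of 26 superframe modes in (1) can be checked by enumeration. The sketch below assumes the usual AMR-WB+ layout, which the text does not spell out: TCX1024 covers all four frames, TCX512 covers an aligned pair of frames, and ACELP and TCX256 each cover a single frame. The function names are hypothetical.

```python
from itertools import product

def superframe_modes():
    """Enumerate per-frame mode labels for every valid 4-frame superframe."""
    # Each half (two frames) is either one TCX512 block or two frames
    # that are independently ACELP or TCX256: 1 + 2 * 2 = 5 options.
    half_options = [("TCX512", "TCX512")]
    half_options += list(product(("ACELP", "TCX256"), repeat=2))
    modes = [first + second
             for first, second in product(half_options, repeat=2)]
    # One extra mode: a single TCX1024 block spanning the superframe.
    modes.append(("TCX1024",) * 4)
    return modes

def candidate_modes(observed):
    """Superframe modes consistent with the received frames.

    `observed` is a 4-tuple of mode labels with None for a lost frame.
    """
    return [mode for mode in superframe_modes()
            if all(o is None or o == m for o, m in zip(observed, mode))]
```

Under this assumption the count is 5 × 5 + 1 = 26, matching the text. `candidate_modes((None, "TCX512", "ACELP", "ACELP"))` leaves a single candidate, because a TCX512 block must span both frames of its half, whereas `candidate_modes(("ACELP", None, "ACELP", "ACELP"))` leaves two, so the mode of the lost frame can only be estimated.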

Regarding the first class: AMR-WB+ encodes audio frames in one of four modes, ACELP, TCX256, TCX512 and TCX1024, chosen according to frame characteristics, and represents the coding mode information with 2, 2, 4 and 8 bits respectively. During error concealment it uses this information to infer or estimate the coding mode of the superframe (4 frames, 1024 samples in total), which provides a degree of error concealment. However, what is signaled is only the coding mode of the audio coder. A lost frame cannot be reconstructed with a strategy chosen according to the content of the audio frame, so efficient error concealment cannot be achieved.

Regarding the second class: current error concealment techniques of this kind are designed for speech frames and perform poorly on other types of audio frame. There is as yet no effective method for classifying and detecting music, natural sounds and the like, and in particular for reconstructing their lost information under packet loss so that audio communication can tolerate a higher packet loss rate.

In summary, current audio stream error concealment techniques cannot conceal errors in audio frames efficiently, so the packet loss rate that audio communication requires of the network cannot be relaxed.

Summary of the invention

In view of this, embodiments of the present invention provide a method for audio stream error concealment that achieves efficient error concealment for audio streams.

Embodiments of the present invention also provide a device and a system for audio stream error concealment; with them, efficient error concealment for audio streams can be achieved.

To achieve the above object, the following technical solutions are adopted:

A sending method for audio stream error concealment, the method comprising:

a. classifying an audio frame to be sent according to its content to obtain type information of the audio frame;

b. packaging the type information of the audio frame together with the encoding result of the audio frame and sending the package.

A receiving method for audio stream error concealment, the method comprising:

a. when frame loss occurs, determining, for the lost audio frame, the type information that was obtained when the audio frame was classified according to its content;

b. reconstructing the audio frame with the error recovery strategy that corresponds to the type information of the lost audio frame.

A method for audio stream error concealment, the method comprising:

a. classifying an audio frame to be sent according to its content to obtain type information of the audio frame;

b. packaging the type information of the audio frame together with the encoding result of the audio frame and sending the package to the receiving end;

c. when frame loss occurs, determining at the receiving end, for the lost audio frame, the type information that was obtained when the audio frame was classified according to its content;

d. reconstructing the audio frame with the error recovery strategy that corresponds to the type information of the lost audio frame.

A transmitter for audio stream error concealment, comprising an audio encoder module, a frame encapsulation module and an audio frame classifier module;

the audio frame classifier module is configured to classify an audio frame to be sent according to its content, obtain type information of the audio frame, and send the type information to the frame encapsulation module;

the frame encapsulation module is configured to receive the type information of the audio frame from the audio frame classifier module and the encoding result of the audio frame from the audio encoder module, package the type information together with the encoding result, and send the package.

A receiver for audio stream error concealment, comprising a frame type discrimination module and an error concealment module;

the frame type discrimination module is configured to determine the type information that was obtained when a lost audio frame was classified according to its content, and to send the type information to the error concealment module;

the error concealment module is configured to reconstruct the audio frame with the error recovery strategy that corresponds to the received type information of the lost audio frame.

An audio stream error concealment system, comprising a transmitter and a receiver;

the transmitter is configured to classify an audio frame to be sent according to its content, obtain type information of the audio frame, package the type information together with the encoding result of the audio frame, and send the package to the receiver;

the receiver is configured to determine, when frame loss occurs, the type information that was obtained when the lost audio frame was classified according to its content, and to reconstruct the audio frame with the error recovery strategy that corresponds to that type information.

As the above technical solutions show, at the sending end the invention classifies audio frames according to their content and sends the type information of each frame together with its encoding result. At the receiving end, when packet loss occurs, the audio signal is reconstructed with an error concealment strategy chosen according to the type that the lost frame was given when it was classified by content. This makes the reconstruction of lost frames more targeted and allows audio frames to be reconstructed adaptively, which achieves better compensation, gives the receiving user a better subjective listening experience, improves the intelligibility of the audio signal, and lets audio communication tolerate a higher packet loss rate.

Brief description of the drawings

Fig. 1 is a block diagram of audio stream error concealment.

Fig. 2 is an overall flowchart of the method for audio stream error concealment in the present invention.

Fig. 3 is an overall structural diagram of the system for audio stream error concealment in the present invention.

Fig. 4 is an overall structural diagram of the transmitter for audio stream error concealment in the present invention.

Fig. 5 is an overall structural diagram of the receiver for audio stream error concealment in the present invention.

Fig. 6 is a detailed flowchart of the sending method for audio stream error concealment in an embodiment of the present invention.

Fig. 7 is a schematic diagram of audio frame classification in an embodiment of the present invention.

Fig. 8 is a detailed flowchart of the receiving method for audio stream error concealment in an embodiment of the present invention.

Fig. 9 is a detailed structural diagram of the transmitter for audio stream error concealment in an embodiment of the present invention.

Fig. 10 is a detailed structural diagram of the receiver for audio stream error concealment in an embodiment of the present invention.

Detailed description of the embodiments

To make the purpose, technical means and advantages of the present invention clearer, specific embodiments of the invention are described below with reference to the accompanying drawings.

The basic idea of the present invention is to classify audio frames according to their content and, when packet loss occurs, to reconstruct the audio signal with an error concealment strategy chosen according to the type of the lost audio frame.

Fig. 2 is an overall flowchart of the method for audio stream error concealment in the present invention. As shown in Fig. 2, the method includes:

Step 201: classify the audio frame to be sent according to its content to obtain the type information of the audio frame.

Step 202: package the type information of the audio frame together with the encoding result of the audio frame and send the package.

Step 203: when frame loss occurs, determine, for the lost audio frame, the type information that was obtained when the frame was classified according to its content.

Step 204: reconstruct the audio frame with the error recovery strategy that corresponds to the type information of the lost audio frame.

Steps 201 and 202 make up the overall flow of the sending method for audio stream error concealment; steps 203 and 204 make up the overall flow of the receiving method.

Fig. 3 is an overall structural diagram of the system for audio stream error concealment in the present invention. As shown in Fig. 3, the system includes a transmitter 301 and a receiver 302. The transmitter 301 classifies each audio frame to be sent according to its content, obtains the type information of the frame, packages the type information together with the encoding result of the frame, and sends the package to the receiver 302. When frame loss occurs, the receiver 302 determines the type information that was obtained when the lost audio frame was classified according to its content and reconstructs the audio frame with the error recovery strategy that corresponds to that type information. The transmitter and receiver in this system can adopt the structures of the transmitter 400 and the receiver 500 shown in Fig. 4 and Fig. 5, respectively.

Fig. 4 is an overall structural diagram of the transmitter for audio stream error concealment in the present invention. As shown in Fig. 4, the transmitter 400 includes an audio encoder module 410, an audio frame classifier module 420 and a frame encapsulation module 430.

In the transmitter 400, the audio encoder module 410 encodes the audio frame to be sent and passes the encoding result to the frame encapsulation module 430. The audio frame classifier module 420 classifies the audio frame to be sent according to its content, obtains the type information of the frame, and passes the type information to the frame encapsulation module 430. The frame encapsulation module 430 receives the encoding result from the audio encoder module 410 and the type information from the audio frame classifier module 420, packages the two together, and sends the package. When encoding an audio frame, the audio encoder module 410 can either choose a coding method according to the type information supplied by the audio frame classifier module 420, or apply the same coding method to all frames.

Corresponding to the transmitter, Fig. 5 is an overall structural diagram of the receiver for audio stream error concealment in the present invention. The receiver 500 includes a frame type discrimination module 510 and an error concealment module 520.

In the receiver, the frame type discrimination module 510 determines the type information that was obtained when the lost audio frame was classified according to its content and passes it to the error concealment module 520. The error concealment module 520 reconstructs the audio frame with the error recovery strategy that corresponds to the received type information of the lost audio frame.

As described above, the sending end classifies audio frames by content and sends the resulting type information to the receiving end. When frame loss occurs, the receiving end reconstructs the audio frame with an error recovery strategy chosen according to the type information of the lost frame, so that errors are concealed efficiently.

The above is a general overview of the method, device and system of the present invention. The invention is described in further detail below through specific embodiments.

Fig. 6 is a detailed flowchart of the sending method for audio stream error concealment in an embodiment of the present invention. As shown in Fig. 6, the method includes:

Step 601: divide the audio signal into audio frames of equal length.

In this step, the frame length is determined by the coding protocol.

Step 602: analyze the content and characteristics of the audio frame to obtain its type information.

In this step, audio frames are divided into types such as speech frames, noise frames, silence frames and music frames. Each type can then be subdivided further. For example, speech frames can be further divided into voiced, unvoiced, voiced transition, unvoiced transition, onset and so on, and music frames can simply be divided, according to the stability of the signal, into steady music frames, transition music frames and so on.

Step 603: encode and compress the audio frame to be sent.

In this step, the same coding method can be used for the whole audio signal, or different coding methods can be used for different types of audio frame.

Step 604: package the type of the audio frame together with the encoded and compressed result, and send the package.

In this step, the type information of the audio frame can be marked in the frame header of the current frame or of the next frame.

This completes the sending flow for audio stream error concealment in this embodiment.
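Step 604 can be illustrated with a small packing routine. The header layout below is purely hypothetical (the text only says that the type information is marked in the header of the current frame or of the next frame): a 16-bit sequence number, one byte for the type of this frame, and one byte repeating the type of the previous frame. The `FrameType` values and function names are likewise assumptions.

```python
import struct
from enum import IntEnum

class FrameType(IntEnum):
    """Top-level frame types named in step 602 (numbering is arbitrary)."""
    SPEECH = 0
    NOISE = 1
    SILENCE = 2
    MUSIC = 3

# Hypothetical header: sequence number (uint16, network byte order),
# type of this frame (uint8), type of the previous frame (uint8).
HEADER = struct.Struct("!HBB")

def pack_frame(seq, frame_type, prev_type, payload):
    """Prepend the header to the encoded payload (bytes)."""
    return HEADER.pack(seq & 0xFFFF, frame_type, prev_type) + payload

def unpack_frame(packet):
    """Split a packet into (seq, frame_type, prev_type, payload)."""
    seq, frame_type, prev_type = HEADER.unpack_from(packet)
    return seq, FrameType(frame_type), FrameType(prev_type), packet[HEADER.size:]
```

Carrying the type of the previous frame in the next header means that a receiver that loses frame n can still read the type of frame n from frame n+1, at the cost of one extra byte per packet in this layout.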

In the above flow, the audio frame classification in step 602 can be performed with the method shown in Fig. 7. Referring to Fig. 7, a VAD is first used to detect whether the audio frame is a noise-type frame. If it is, spectral energy analysis is performed on the frame; if it is not, spectral stability analysis is performed on the frame.

According to the result of the spectral energy analysis, the audio frame is classified as a silence frame or a noise frame. The silence or noise frame can then be classified more finely to obtain the type information of the audio frame.

According to the result of the spectral stability analysis, the audio frame is classified as a speech frame or a music frame. The speech or music frame can then be classified more finely: a speech frame can be refined into voiced, unvoiced, voiced transition and so on, and a music frame into steady music frame, transition music frame and so on.
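The decision tree of Fig. 7 can be sketched as follows. The text does not give the features, thresholds or even the direction of the stability test, so everything numeric here is an assumption: the VAD decision, spectral energy and spectral stability are taken as precomputed inputs, low energy is taken to mean silence, and music is assumed to be spectrally more stable than speech.

```python
def classify_frame(vad_says_noise, spectral_energy, spectral_stability,
                   energy_threshold=1e-4, stability_threshold=0.8):
    """Top-level classification following the two branches of Fig. 7.

    vad_says_noise:     True if the VAD flags the frame as noise-type.
    spectral_energy:    non-negative energy measure of the frame.
    spectral_stability: value in [0, 1]; higher means more stable.
    Both thresholds are illustrative placeholders, not patent values.
    """
    if vad_says_noise:
        # Spectral energy analysis: silence versus noise.
        if spectral_energy < energy_threshold:
            return "silence"
        return "noise"
    # Spectral stability analysis: speech versus music
    # (assumption: music is the more stable of the two).
    if spectral_stability >= stability_threshold:
        return "music"
    return "speech"
```

A finer subdivision (voiced, unvoiced, steady music and so on) would be a second stage applied to the label returned here.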

与上述发送方法中对音频帧进行的分类，在接收端，本实施例中采用图8所示的方法进行接收。如图8所示，该接收方法包括：Similar to the classification of audio frames in the above sending method, at the receiving end, the method shown in FIG. 8 is used for receiving in this embodiment. As shown in Figure 8, the receiving method includes:

步骤801，对音频信号进行丢帧检测，若发生丢帧，则执行步骤804及其后续步骤，否则执行步骤802及其后续步骤。Step 801 , perform frame loss detection on the audio signal, if frame loss occurs, execute step 804 and its subsequent steps, otherwise execute step 802 and its subsequent steps.

本步骤中，根据音频帧中携带的帧序号判定是否出现音频帧的丢失。In this step, it is determined whether an audio frame is lost according to the frame number carried in the audio frame.
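
A sequence-number check of this kind might look like the sketch below. It assumes a 16-bit wrapping counter and in-order arrival, neither of which is specified in the patent; the function name is hypothetical.

```python
def lost_sequence_numbers(prev_seq, curr_seq, modulus=1 << 16):
    """Return the sequence numbers missing between two consecutively received frames.

    Assumes frames arrive in order and the counter wraps at `modulus`
    (16 bits here, an assumed width). A gap of 1 means nothing was lost.
    """
    gap = (curr_seq - prev_seq) % modulus
    return [(prev_seq + i) % modulus for i in range(1, gap)]
```

For example, receiving frame 7 directly after frame 4 reports frames 5 and 6 as lost. A receiver that tolerates reordering would need a jitter buffer in front of this check, since a late frame would otherwise look like a very large gap.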

步骤802，对音频帧的类型进行检测并记录。Step 802, detect and record the type of the audio frame.

当发生丢帧时，可以利用本步骤中记录的音频帧类型确定丢失的音频帧的类型信息。When a frame is lost, the type information of the lost audio frame can be determined by using the type of the audio frame recorded in this step.

步骤803，对音频帧进行解码，并输出解码结果，结束本流程。Step 803, decode the audio frame, and output the decoding result, and end this process.

本步骤中，根据发送端对音频帧的编码方法，采用相应的解码方法进行解码。In this step, according to the encoding method of the audio frame by the sending end, the corresponding decoding method is used for decoding.

步骤804，确定丢失的音频帧按照内容进行分类时得到的类型信息。Step 804, determine the type information obtained when the lost audio frames are classified according to content.

本步骤中，若音频帧的类型信息是携带在本音频帧中传送，则接收端提取历史数据，根据正确接收帧的类型信息推断当前丢失帧的类型；若音频帧的类型信息是携带在其他正确接收音频帧中传送的，则接收端直接在相应正确接收音频帧中提取当前丢失帧的类型信息即可。In this step, if the type information of an audio frame is carried in that frame itself, the receiving end retrieves historical data and infers the type of the currently lost frame from the type information of correctly received frames; if the type information is carried in another, correctly received audio frame, the receiving end simply extracts the type information of the lost frame directly from that frame.
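
The two cases in step 804 can be sketched as one lookup. The patent does not say how the type is inferred from history, so the rule used here (reuse the type of the most recent correctly received frame) is an assumption, as are the function and parameter names and the fallback to history when the following frame is also unavailable.

```python
def lost_frame_type(lost_seq, next_frame_types, history, type_in_next_frame):
    """Determine the content type of a lost frame.

    next_frame_types:   dict mapping the sequence number of a received frame to
                        the type code carried in its header.
    history:            types of correctly received frames, oldest first.
    type_in_next_frame: True if the sender put each frame's type in the header
                        of the following frame.
    """
    if type_in_next_frame and (lost_seq + 1) in next_frame_types:
        # Direct extraction: the following frame carries the lost frame's type.
        return next_frame_types[lost_seq + 1]
    # Inference (assumed rule): reuse the most recent known type.
    return history[-1] if history else None
```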

步骤805，根据丢失的音频帧的类型，自适应地采用相应的错误恢复策略重构音频帧，并输出重构的结果，结束本流程。Step 805, according to the type of the lost audio frame, adaptively adopt the corresponding error recovery strategy to reconstruct the audio frame, and output the reconstructed result, and end this process.

本步骤中，可以根据丢失的音频帧的类型，选用针对该类型最合适的错误恢复策略重构音频帧。如，稳定语音帧与其前一帧非常相似，用帧复制策略就能取得很好的隐藏效果，过渡帧则需要考虑前后帧的状态来确定隐藏策略等。In this step, the error recovery strategy best suited to the type of the lost audio frame can be selected to reconstruct the frame. For example, a stable speech frame is very similar to the frame before it, so a frame-copy strategy achieves good concealment, whereas for a transition frame the states of the preceding and following frames need to be considered when choosing the concealment strategy.
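
Type-dependent strategy selection might be sketched as below. Only the frame-copy rule for stable frames comes from the text; the string type labels, the zero-fill for silence, and the sample-wise averaging for transition frames are placeholders chosen to keep the example short, not recommended concealment techniques.

```python
def conceal(frame_type, prev_frame, next_frame=None):
    """Reconstruct a lost frame of PCM samples according to its type.

    frame_type: hypothetical string label for the lost frame's type.
    prev_frame: samples of the last correctly decoded frame.
    next_frame: samples of the following frame, if already available.
    """
    if frame_type == "silence":
        # Assumed strategy: a lost silent frame is replaced by zeros.
        return [0.0] * len(prev_frame)
    if frame_type == "transition" and next_frame is not None:
        # Placeholder for "consider the preceding and following frames":
        # a plain sample-wise average of the two neighbours.
        return [(a + b) / 2 for a, b in zip(prev_frame, next_frame)]
    # Stable frames (and the fallback case): copy the previous frame.
    return list(prev_frame)
```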

至此，音频流错误隐藏的接收方法结束。So far, the receiving method of audio stream error concealment ends.

在本实施例的发送方法中，对音频帧进行分类时采用的是图7所示的方式进行分类，当然也可以采用其他基于内容的音频帧分类方式，只要能够达到对音频帧根据内容分类的目的即可。In the sending method of this embodiment, audio frames are classified in the manner shown in FIG. 7. Other content-based audio frame classification methods can of course also be used, as long as they achieve the purpose of classifying audio frames according to their content.

由上述可见，发送端和接收端相配合，就能够利用对音频帧的分类，来高效实现错误隐藏，大大提高实时音频通信对于丢包率的容忍度。It can be seen from the above that the cooperation between the sending end and the receiving end can use the classification of audio frames to efficiently implement error concealment, and greatly improve the tolerance of real-time audio communication to packet loss rate.

上述为本实施例中提供的音频流错误隐藏的发送和接收方法的具体实施方式。由该两种实施方式相互配合，即可以构成本发明中音频流错误隐藏的方法具体实施方式。另外，本实施例还提供了相应的音频流错误隐藏的发射机和接收机的具体实施方式。The foregoing is a specific implementation manner of the audio stream error concealment sending and receiving method provided in this embodiment. The specific implementation of the method for audio stream error concealment in the present invention can be constituted by the mutual cooperation of the two implementations. In addition, this embodiment also provides specific implementation manners of a corresponding audio stream error concealment transmitter and receiver.

图9为本发明实施例中音频流错误隐藏的发射机具体结构图。如图9所示，该发射机900包括：音频编码器模块910、音频帧分类器模块920、帧封装模块930和音频帧划分模块940。FIG. 9 is a specific structural diagram of a transmitter for audio stream error concealment in an embodiment of the present invention. As shown in FIG. 9 , the transmitter 900 includes: an audio encoder module 910 , an audio frame classifier module 920 , a frame packing module 930 and an audio frame dividing module 940 .

在该发射机900中，音频帧划分模块940，用于根据不同的编码协议，将音频信号划分为等间隔的音频帧，并将音频帧发送给音频编码器模块910和音频帧分类器模块920。In the transmitter 900, the audio frame division module 940 is used to divide the audio signal into equally spaced audio frames according to different encoding protocols, and send the audio frames to the audio encoder module 910 and the audio frame classifier module 920 .

音频编码器模块910，用于对音频帧进行编码，并将编码结果发送给帧封装模块930。音频帧分类器模块920，用于对音频帧按照内容进行分类，其具体分类方式可以采用图7所示的方式，并将音频帧的类型信息发送给帧封装模块930。The audio encoder module 910 is configured to encode the audio frame and send the encoding result to the frame packing module 930 . The audio frame classifier module 920 is used to classify the audio frames according to the content, and the specific classification method can adopt the method shown in FIG. 7 , and send the type information of the audio frame to the frame encapsulation module 930 .

帧封装模块930，用于接收音频编码器模块910发送的音频帧编码结果和音频帧分类器模块920发送的音频帧类型信息，并将类型信息和音频帧的编码结果封装打包，并发送出去。在进行封装打包时，可以将音频帧的类型信息封装在本音频帧或下一音频帧中，具体可以位于帧头的部分。The frame encapsulation module 930 is configured to receive the audio frame encoding result sent by the audio encoder module 910 and the audio frame type information sent by the audio frame classifier module 920, encapsulate the type information and the audio frame encoding result, and send them out. When performing encapsulation and packaging, the type information of the audio frame can be encapsulated in the current audio frame or the next audio frame, specifically in the frame header.

在音频编码器910中对音频帧进行编码时，可以根据音频帧分类器发送的该音频帧的类型信息，对音频帧采用不同的编码方式，或者直接对所有的编码帧采用相同的编码方式。When encoding an audio frame in the audio encoder 910, different encoding methods may be used for the audio frame according to the type information of the audio frame sent by the audio frame classifier, or the same encoding method may be directly used for all encoded frames.
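
The data flow between the four transmitter modules can be sketched as a single loop. The classifier and encoder are passed in as callables because the patent leaves both open; the dictionary packet format and the choice to drop a trailing partial frame are assumptions made for brevity.

```python
def split_into_frames(samples, frame_len):
    """Audio frame division (module 940): equal-length frames.

    A trailing partial frame is dropped in this sketch.
    """
    return [samples[i:i + frame_len]
            for i in range(0, len(samples) - frame_len + 1, frame_len)]


def transmit(samples, frame_len, classify, encode):
    """Run each frame through classifier, encoder and packing.

    classify(frame) -> type label               (module 920)
    encode(frame, frame_type) -> bytes          (module 910; may ignore the type)
    The returned dicts stand in for packed frames (module 930).
    """
    packets = []
    for seq, frame in enumerate(split_into_frames(samples, frame_len)):
        frame_type = classify(frame)
        payload = encode(frame, frame_type)
        packets.append({"seq": seq, "type": frame_type, "payload": payload})
    return packets
```

Passing the type to `encode` covers both options in the text: an encoder that switches mode per type, or one that ignores the argument and encodes every frame the same way.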

图10为本发明实施例中音频流错误隐藏的接收机具体结构图。如图10所示，该接收机1000包括帧类型判别模块1010、错误隐藏模块1020、差错检测模块1030和音频解码器模块1040。其中，帧类型判别模块1010包括判别子模块1011和存储子模块1012；错误隐藏模块1020包括策略判决子模块1021和错误隐藏子模块1022。FIG. 10 is a specific structural diagram of a receiver for audio stream error concealment in an embodiment of the present invention. As shown in FIG. 10 , the receiver 1000 includes a frame type discrimination module 1010 , an error concealment module 1020 , an error detection module 1030 and an audio decoder module 1040 . Among them, the frame type judgment module 1010 includes a judgment sub-module 1011 and a storage sub-module 1012 ; the error concealment module 1020 includes a policy decision sub-module 1021 and an error concealment sub-module 1022 .

在该接收机1000中，差错检测模块1030，用于从信道上接收音频帧，将接收到的音频帧发送给帧类型判别模块1010中的判别子模块1011，并检测是否出现丢帧，若出现丢帧，则通知帧类型判别模块1010中的判别子模块1011。In the receiver 1000, the error detection module 1030 is used to receive audio frames from the channel, send the received audio frames to the discrimination sub-module 1011 in the frame type discrimination module 1010, and detect whether frame loss occurs; if frame loss occurs, it notifies the discrimination sub-module 1011 in the frame type discrimination module 1010.

在帧类型判别模块1010中，在确定音频帧按照内容进行分类得到的类型时，若音频帧的类型信息在正确接收到的音频帧中携带，则直接将该类型信息提取出来存储到存储子模块1012中；若音频帧的类型信息在丢失的音频帧中携带，则根据前后帧的类型推断该丢失的音频帧按照内容进行分类时得到的类型信息。In the frame type discrimination module 1010, when determining the type obtained by classifying an audio frame according to its content: if the type information of the audio frame is carried in a correctly received audio frame, the type information is extracted directly and stored in the storage sub-module 1012; if the type information is carried in the lost audio frame, the type information that the lost frame was given when classified by content is inferred from the types of the preceding and following frames.

在错误隐藏模块1020中，策略判决子模块1021，用于接收判别子模块1011发送的丢失帧的类型信息，并根据该类型信息，判定采用的错误恢复策略，并将结果发送给错误隐藏子模块1022。错误隐藏子模块1022，用于根据策略判决子模块1021发送的错误恢复策略判决结果，对丢失的音频帧进行重构。In the error concealment module 1020, the policy judgment submodule 1021 is used to receive the type information of the lost frame sent by the discrimination submodule 1011, and according to the type information, determine the error recovery strategy adopted, and send the result to the error concealment submodule 1022. The error concealment sub-module 1022 is configured to reconstruct the lost audio frame according to the error recovery policy decision result sent by the policy decision sub-module 1021 .

音频帧解码器模块1040，用于对接收到的音频帧进行解码，并输出解码结果。The audio frame decoder module 1040 is configured to decode the received audio frame and output the decoding result.

在本实施例的发射机900中，音频帧分类器模块920采用的是图7的方式对音频帧进行分类，当然也可以采用其他基于内容的音频帧分类方式，这里就不再赘述。在接收机1000中，将帧类型判决模块1010细化为判决子模块1011和存储子模块1012，分别进行帧类型的判决和存储；错误隐藏模块1020细化为策略判决子模块1021和错误隐藏子模块1022，分别进行策略判决和错误隐藏。其中，错误隐藏子模块1022还可以进一步划分为多种不同类型的错误隐藏单元，如噪音错误隐藏单元、话音错误隐藏单元等，用于处理不同类型的音频帧的错误隐藏。In the transmitter 900 of this embodiment, the audio frame classifier module 920 classifies audio frames in the manner shown in FIG. 7; other content-based audio frame classification methods can of course also be used and are not described again here. In the receiver 1000, the frame type discrimination module 1010 is subdivided into the discrimination sub-module 1011 and the storage sub-module 1012, which determine and store the frame type respectively; the error concealment module 1020 is subdivided into the policy decision sub-module 1021 and the error concealment sub-module 1022, which perform the policy decision and the error concealment respectively. The error concealment sub-module 1022 can be further divided into several different types of error concealment units, such as a noise error concealment unit and a speech error concealment unit, each handling error concealment for a different type of audio frame.

本发明音频流错误隐藏系统的实施方式可以为：利用上述图9和图10所示的发射机900和接收机1000作为音频流错误隐藏系统中发射机和接收机的具体实施方式，并且，将发射机900中帧封装模块930输出的音频帧发送给接收机1000中的差错检测模块1030。这样便可以构成本发明的音频流错误隐藏系统的一种实施方式。The audio stream error concealment system of the present invention can be implemented as follows: the transmitter 900 and the receiver 1000 shown in FIG. 9 and FIG. 10 serve as the transmitter and receiver of the system, and the audio frames output by the frame encapsulation module 930 in the transmitter 900 are sent to the error detection module 1030 in the receiver 1000. This constitutes one embodiment of the audio stream error concealment system of the present invention.
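
The receiver's module structure can be sketched as one small class, with per-type concealment units supplied as a dictionary of callables. The class and attribute names are hypothetical, sequence-number wraparound is ignored, the type is assumed to travel in the frame's own header, and the lost frame's type is inferred as the most recently stored one; none of these details are fixed by the patent.

```python
class Receiver:
    """Sketch of receiver 1000; comments map lines to the modules in FIG. 10."""

    def __init__(self, decode, strategies, default_strategy):
        self.decode = decode                      # audio decoder module 1040
        self.strategies = strategies              # per-type concealment units in 1022
        self.default_strategy = default_strategy  # used when no unit matches the type
        self.stored_types = []                    # storage sub-module 1012
        self.last_seq = None
        self.last_output = None

    def receive(self, seq, frame_type, payload):
        """Handle one received frame; return every frame output as a result."""
        outputs = []
        if self.last_seq is not None:
            # Error detection 1030: one iteration per missing sequence number.
            for _ in range(seq - self.last_seq - 1):
                lost_type = self.stored_types[-1]  # discrimination 1011 (assumed rule)
                unit = self.strategies.get(lost_type, self.default_strategy)  # 1021
                self.last_output = unit(self.last_output)                     # 1022
                outputs.append(self.last_output)
        self.stored_types.append(frame_type)
        self.last_output = self.decode(payload)
        self.last_seq = seq
        outputs.append(self.last_output)
        return outputs
```

A caller would construct it with, for instance, a frame-copy default and a zero-fill unit registered under the silence type, then feed frames in arrival order.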

由上述本发明方法、装置和系统的具体实施方式可见，采用本发明的技术方案，使得对丢失帧的重构具有更强的针对性，能够自适应地重构音频帧，以达到更好的补偿效果，为收端用户带来更好的主观听觉感受，同时能改善音频帧信号的可分辨性，使得音频通信能够容忍更高的包丢失率。It can be seen from the above specific implementation of the method, device and system of the present invention that the technical solution of the present invention makes the reconstruction of lost frames more pertinent, and can adaptively reconstruct audio frames to achieve better The compensation effect brings better subjective auditory experience to the receiving end user, and at the same time improves the distinguishability of the audio frame signal, making the audio communication tolerant to a higher packet loss rate.

以上仅为本发明的较佳实施例而已，并非用于限定本发明的保护范围。凡在本发明的精神和原则之内，所作的任何修改、等同替换、改进等，均应包含在本发明的保护范围之内。The above are only preferred embodiments of the present invention, and are not intended to limit the protection scope of the present invention. Any modifications, equivalent replacements, improvements, etc. made within the spirit and principles of the present invention shall be included within the protection scope of the present invention.

Claims

1. A transmission method for audio stream error concealment, characterized in that the method comprises:

a. Classify the sent audio frame according to the content to obtain the type information of the audio frame;

b. The type information of the audio frame and the encoding result of the audio frame are packaged and sent out.

2. The method according to claim 1, wherein classifying the audio frames sent according to the content in step a comprises:

a1. Use a voice activation detector (VAD) to detect whether the audio frame is a noise signal frame; if so, perform step a2, otherwise perform step a3;

a2. Perform spectrum energy analysis on the noise signal frame, and determine that the audio frame is a silent signal frame or a noise signal frame;

a3. Spectrum stability analysis is performed on the non-noise signal frame, and the audio frame is determined to be a speech signal frame or a musical tone signal frame.

3. The method according to claim 2, characterized in that the silent signal frame, the noise signal frame, the speech signal frame or the tone signal frame are further finely divided.

4. The method according to claim 1, wherein said encapsulating and packing the type information of the audio frame and the encoding result of the audio frame is: packing the type information of the audio frame into a frame header of the audio frame.

5. The method according to claim 4, wherein the frame header of the audio frame is: the frame header of the audio frame that the type information describes, or the frame header of the frame following the audio frame that the type information describes.

6. The method according to claim 1, wherein the encoding result of the audio frame is: an encoding result obtained by using the same encoding method for all audio frames, or an encoding result obtained by using different encoding methods according to the different types of audio frames.

7. A receiving method for audio stream error concealment, characterized in that the method comprises:

a. When frame loss occurs, for the lost audio frame, determine the type information of the audio frame obtained when it is classified according to the content;

b. According to the type information of the lost audio frame, a corresponding error recovery strategy is used to reconstruct the audio frame.

8. The receiving method according to claim 7, characterized in that determining, in step a, the type information of the audio frame obtained when it is classified according to content comprises:

When the type information of the lost audio frame is located in the correctly received audio frame, directly extracting the type information of the lost audio frame from the correctly received audio frame;

When the type information of the lost audio frame is located in the lost audio frame, the type information of the lost audio frame is deduced according to the type information of the correctly received frame.

9. The receiving method according to claim 7 or 8, further comprising: before step a, detecting whether there is frame loss in the received audio signal; if so, performing step a and its subsequent steps; otherwise, decoding the correctly received audio frame, and extracting and storing the type information of the audio frame carried in the frame.

10. A method for audio stream error concealment, characterized in that the method comprises:

a. Classify the sent audio frame according to the content to obtain the type information of the audio frame;

b. Encapsulate and package the type information of the audio frame with the encoding result of the audio frame and send it to the receiving end;

c. When a frame is lost, for the lost audio frame, the receiving end determines the type information of the audio frame obtained when it is classified according to the content;

d. According to the type information of the lost audio frame, a corresponding error recovery strategy is used to reconstruct the audio frame.

11. The method according to claim 10, wherein classifying the audio frames sent according to the content in step a comprises:

12. The method according to claim 10, characterized in that encapsulating and packing the type information of the audio frame and the encoding result of the audio frame in step b is: packing the type information of the audio frame into a frame header of the audio frame.

13. The method according to claim 12, wherein the frame header of the audio frame is: the frame header of the audio frame that the type information describes, or the frame header of the frame following the audio frame that the type information describes.

14. The method according to claim 10, wherein the type information of the audio frame obtained when determining that it is classified according to content as described in the step c is:

15. The method according to any one of claims 10 to 14, characterized in that, between steps b and c, the method further comprises: detecting whether frame loss occurs in the received audio signal; if so, performing step c and its subsequent steps; otherwise, decoding the correctly received audio frame, and extracting and storing the type information of the audio frame carried in the frame.

16. A transmitter for audio stream error concealment, comprising an audio encoder module and a frame packing module, characterized in that the transmitter also includes an audio frame classifier module;

The audio frame classifier module is used to classify the sent audio frame according to the content, obtain the type information of the audio frame, and send the type information to the frame encapsulation module;

The frame encapsulation module is configured to receive the type information of the audio frame sent by the audio frame classifier module and the encoding result of the audio frame sent by the audio encoder module, and to package the type information of the audio frame together with the encoding result of the audio frame and send them out.

17. A receiver for audio stream error concealment, characterized in that the receiver includes a frame type discrimination module and an error concealment module,

The frame type discrimination module is used to determine the type information of the audio frame obtained when the lost audio frame is classified according to the content, and send the type information to the error concealment module;

The error concealment module is configured to use a corresponding error recovery strategy to reconstruct the audio frame according to the type information of the received lost audio frame.

18. The receiver according to claim 17, further comprising an error detection module and an audio decoder module,

The error detection module is used to receive audio frames from the channel, send the received audio frames to the frame type discrimination module, and detect whether frame loss occurs, and if frame loss occurs, notify the frame type discrimination module ;

The frame type discrimination module is further used to forward the audio frame to the audio frame decoder module;

The audio frame decoder module is used to decode the audio frame.

19. The receiver according to claim 18, wherein the frame type discrimination module comprises a discrimination submodule and a storage submodule,

The discrimination submodule is used to determine the type information of the audio frame obtained when the lost audio frame is classified according to the content, and send the type information to the storage submodule; it is also used, after receiving the frame loss notification sent by the error detection module, to send the type information of the lost frame to the error concealment module, and to forward the received audio frame to the audio frame decoder module;

The storage sub-module is used for saving the type information of the audio frame.

20. The receiver according to claim 18 or 19, wherein the error concealment module comprises a strategy decision submodule and an error concealment submodule,

The strategy judgment submodule is used to receive the type information of the lost frame sent by the frame type judgment module, and judge the error recovery strategy adopted according to the type information, and send the result to the error concealment submodule;

The error concealment submodule is configured to reconstruct the lost audio frame according to the error recovery policy decision result sent by the policy decision submodule.

21. An audio stream error concealment system, characterized in that the system comprises: a transmitter and a receiver;

The transmitter is configured to classify the transmitted audio frame according to the content, obtain the type information of the audio frame, and package the type information of the audio frame and the encoding result of the audio frame to send to the receiver;

The receiver is configured to determine the type information obtained when the lost audio frame is classified according to content when a frame loss occurs, and use a corresponding error recovery strategy to reconstruct the audio frame according to the type information.

22. The system according to claim 21, wherein the transmitter includes an audio encoder module, a frame packing module and an audio frame classifier module, and the receiver includes a frame type discrimination module and an error concealment module;

The frame encapsulation module is configured to receive the type information of the audio frame sent by the audio frame classifier module and the encoding result of the audio frame sent by the audio encoder module, and to package the type information of the audio frame together with the encoding result of the audio frame and send them out;

23. The system of claim 22, wherein the receiver further comprises an error detection module and an audio decoder module,

The audio frame decoder module is used to decode the audio frame;

The frame encapsulation module is configured to encapsulate and package the type information of the audio frame and the encoding result of the audio frame and send them to the error detection module.

24. The receiver according to claim 22 or 23, wherein the error concealment module comprises a policy decision submodule and an error concealment submodule,