CN105071897A - Multipath redundant transmission method for network real-time audio conversation media data

Info

Publication number: CN105071897A
Application number: CN201510388523.2A
Authority: CN
Inventors: 雷为民; 李�浩; 张伟; 刘少伟; 关云冲
Original assignee: Northeastern University China
Current assignee: Northeastern University China
Priority date: 2015-07-03
Filing date: 2015-07-03
Publication date: 2015-11-18
Anticipated expiration: 2035-07-03
Also published as: CN105071897B

Abstract

The present invention provides a method for multi-path redundant transmission of network real-time audio session media data, comprising: the sending end of the audio media data packs the captured audio media data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session, and sends it redundantly over multiple incompletely intersecting transmission paths at the same time, thereby performing multi-path redundant transmission of real-time audio media data. The receiving end of the audio media data performs redundancy elimination and reassembly on the audio media data packets received from the different transmission paths, and restores the original audio data according to the negotiated audio codec format and network transmission protocol. The method can effectively reduce the overall impact that packet loss, delay and jitter caused by changing conditions on a single path have on the end-to-end transmission of media data, thereby improving the reliability of real-time audio media data transmission and the quality of experience (QoE) of network audio session services.

Description

A method for multi-path redundant transmission of network real-time audio session media data

Technical field:

The invention relates to the technical field of network communication, and in particular to a method for multi-path redundant transmission of network real-time audio session media data.

Background:

The Internet uses a "best effort" transmission mechanism and cannot provide quality of service (QoS) guarantees for end-to-end media transmission. Real-time audio sessions are a typical network communication service. Their media transmission usually occupies from a few kbps to a few hundred kbps of bandwidth, so they are a narrowband service; although the bandwidth is small, the real-time requirements are strict. At present, IP communication terminals usually transmit audio media under the control of the conventional RTP and UDP protocols, over the end-to-end default routing path (a single path) with no QoS guarantee. Packet loss and delay jitter caused by unpredictable congestion along the end-to-end path often make it impossible to reassemble and decode the audio media data, which seriously degrades the quality of real-time audio sessions. The Internet has become an important bearer network for network communication, so improving the way real-time audio session media is transmitted and raising the quality of service experience are important problems that urgently need to be solved.

Summary of the invention:

In view of the defects of the prior art, the present invention provides a method for multi-path redundant transmission of network real-time audio session media data. Exploiting the low transmission bandwidth of real-time audio media, the method constructs end-to-end incompletely intersecting transmission paths and adopts a multi-path redundant transmission control mechanism and protocol, so that real-time audio media data is transmitted redundantly over multiple paths. This multi-path redundant transmission method can effectively reduce the overall impact that packet loss, delay and jitter caused by changing conditions on a single path have on the end-to-end transmission of media data, thereby improving the reliability of real-time audio media data transmission and the user quality of experience of network audio session services.

The present invention provides a method for multi-path redundant transmission of network real-time audio session media data, comprising:

The sending end of the audio media data packs the captured audio media data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session, and sends it redundantly over multiple incompletely intersecting transmission paths at the same time, performing multi-path redundant transmission of real-time audio media data;

The receiving end of the audio media data performs redundancy elimination and reassembly operations on the audio media data packets received from the different transmission paths, and restores the original audio data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session.

Optionally, the multiple incompletely intersecting transmission paths include: a default path based on the default route and one or more incompletely intersecting redundant transmission paths.

Optionally, a redundancy coefficient is set according to the real-time audio session, and the number of incompletely intersecting transmission paths is greater than the redundancy coefficient set for the real-time audio session.
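The sending side described above can be illustrated with a minimal Python sketch. It assumes that the same packet is sent on every available path and that the only constraint is the one stated in the text (more paths than the redundancy coefficient). The `Path` class, its in-memory `sent` list, and the `send_redundant` function are hypothetical names introduced for illustration; a real sender would write to network sockets instead.

```python
from dataclasses import dataclass, field


@dataclass
class Path:
    """Hypothetical stand-in for one transmission path (e.g. a UDP socket)."""
    name: str
    sent: list = field(default_factory=list)

    def send(self, packet: bytes) -> None:
        # A real implementation would transmit over the network here.
        self.sent.append(packet)


def send_redundant(packet: bytes, paths: list, redundancy: int) -> None:
    """Send the same packet on every path.

    The number of paths must be greater than the redundancy coefficient,
    as stated in the text above.
    """
    if len(paths) <= redundancy:
        raise ValueError("need more paths than the redundancy coefficient")
    for path in paths:
        path.send(packet)


# One default path plus two redundant paths, redundancy coefficient 2.
paths = [Path("default"), Path("relay-1"), Path("relay-2")]
send_redundant(b"frame-0", paths, redundancy=2)
```

After the call, each of the three paths holds one copy of `b"frame-0"`, so the loss of the packet on any one or two paths can be compensated by the remaining copy.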

Optionally, the redundancy elimination and reassembly operations include:

Setting up and initializing the sub-stream buffers, the reassembly buffer and the jitter elimination window;

The receiving end of the audio media data performs the redundancy elimination operation on the received audio media data packets;

The receiving end of the audio media data performs the reassembly operation on the audio media data packets that remain after the redundancy elimination operation.

Optionally, setting up and initializing the sub-stream buffers, the reassembly buffer and the jitter elimination window includes:

Setting up as many sub-stream buffers as there are incompletely intersecting transmission paths, the sub-stream buffers being used to receive the audio media data packets of the different transmission paths, and initializing every storage position of each sub-stream buffer to empty;

Setting up the reassembly buffer, which stores, for each audio media data packet remaining after redundancy elimination, the sequence number of the packet, the index of the sub-stream buffer that holds the packet, and the storage position of the packet within that sub-stream buffer. The size of the reassembly buffer is initialized to N_j, all packet sequence numbers stored in the reassembly buffer are initialized to -1, and all sub-stream buffer indices and storage positions are initialized to empty;

Setting up the jitter elimination window, which is used to eliminate the jitter of the audio media data packets, and initializing the size of the jitter elimination window to W, W ∈ [W_min, W_max].
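The buffer layout described above can be sketched as follows in Python. This is only an illustration under stated assumptions: the names `ReassemblySlot` and `init_receiver` and the fixed `substream_size` parameter are hypothetical, and `None` is used to represent an "empty" entry.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class ReassemblySlot:
    fsn: int = -1                    # packet sequence number; -1 marks an unused slot
    substream: Optional[int] = None  # index of the sub-stream buffer holding the packet
    position: Optional[int] = None   # storage position inside that sub-stream buffer


def init_receiver(num_paths, substream_size, n_j, w, w_min, w_max):
    """Create one sub-stream buffer per path, a reassembly buffer of size N_j,
    and check that the initial window size W lies in [W_min, W_max]."""
    if not w_min <= w <= w_max:
        raise ValueError("W must lie in [W_min, W_max]")
    substreams = [[None] * substream_size for _ in range(num_paths)]
    reassembly = [ReassemblySlot() for _ in range(n_j)]
    return substreams, reassembly, w


substreams, reassembly, window = init_receiver(
    num_paths=3, substream_size=8, n_j=16, w=4, w_min=2, w_max=10
)
```

The reassembly buffer stores only references (sub-stream index and position), so the packet payload stays in the sub-stream buffer where it was received.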

Optionally, the redundancy elimination operation includes:

S1. Query each sub-stream buffer by polling to obtain the most recently received audio media data packet i, and extract the sequence number of packet i, denoted FSN_i;

S2. Take the sequence number FSN_i modulo the reassembly buffer size N_j to obtain the value m, that is, m = FSN_i mod N_j, and look up the packet sequence number stored at position m of the reassembly buffer, denoted J_FSN_m;

S3. If J_FSN_m = -1, assign the value of FSN_i to J_FSN_m, store the index of the sub-stream buffer that holds packet i and the packet's storage position at position m of the reassembly buffer, and go to step S1;

S4. If J_FSN_m ≠ -1 and FSN_i > J_FSN_m, assign the value of FSN_i to J_FSN_m, store the index of the sub-stream buffer that holds packet i and the packet's storage position at position m of the reassembly buffer, and go to step S1;

S5. If J_FSN_m ≠ -1 and FSN_i ≤ J_FSN_m, go to step S1.
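Steps S2 to S5 can be sketched in Python as shown below. The function and field names (`new_reassembly`, `eliminate`, the dictionary keys) are hypothetical, and the sketch compares sequence numbers directly as in the text, so it does not handle sequence-number wraparound.

```python
def new_reassembly(n_j):
    """Reassembly buffer of size N_j with every sequence number set to -1."""
    return [{"fsn": -1, "substream": None, "position": None} for _ in range(n_j)]


def eliminate(reassembly, fsn, substream, position):
    """Apply steps S2-S5 to one received packet.

    Returns True if the packet is recorded in the reassembly buffer,
    False if it is discarded as redundant (or older than the stored one).
    """
    m = fsn % len(reassembly)                    # S2: m = FSN_i mod N_j
    slot = reassembly[m]
    if slot["fsn"] == -1 or fsn > slot["fsn"]:   # S3 (empty slot) or S4 (newer packet)
        reassembly[m] = {"fsn": fsn, "substream": substream, "position": position}
        return True
    return False                                 # S5: duplicate or stale packet


buf = new_reassembly(4)
first = eliminate(buf, fsn=5, substream=0, position=1)   # first copy of packet 5
second = eliminate(buf, fsn=5, substream=1, position=1)  # same packet from another path
```

Here the first copy of packet 5 is recorded and the second copy, arriving on a different path, is dropped because its sequence number is not greater than the stored one.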

Optionally, querying each sub-stream buffer by polling includes:

The receiving end of the audio media data periodically collects statistics on the reception and redundancy elimination of the audio media data packets in each sub-stream buffer, and derives a polling priority sequence for the sub-stream buffers corresponding to the different transmission paths;

The receiving end of the audio media data queries each sub-stream buffer in the order given by the polling priority sequence and performs the redundancy elimination operation.
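The text does not state how the statistics are turned into a priority sequence. The Python sketch below shows one possible ranking, purely as an assumption: sub-stream buffers whose packets were most often kept (received minus eliminated as redundant) are polled first. The function name `polling_order` and the statistics format are hypothetical.

```python
def polling_order(stats):
    """Return sub-stream buffer indices in polling priority order.

    stats: one (received, eliminated) pair per sub-stream buffer, counted over
    the last statistics period. Assumed ranking: more kept packets
    (received - eliminated) means higher priority; ties keep index order.
    """
    def kept(index):
        received, eliminated = stats[index]
        return received - eliminated

    return sorted(range(len(stats)), key=kept, reverse=True)


# Buffer 0 kept 10 packets, buffer 1 kept 75, buffer 2 kept 38.
order = polling_order([(100, 90), (95, 20), (98, 60)])
```

With this ranking, a path that usually delivers packets first is polled first, so its copies are accepted and later copies from slower paths are eliminated quickly.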

Optionally, the reassembly operation includes:

Dynamically adjusting the size of the jitter elimination window;

Performing the callback operation on the audio media data packets according to the size of the jitter elimination window.

Optionally, dynamically adjusting the size of the jitter elimination window includes:

Recording the time R(i) at which audio media data packet i arrives at the reassembly buffer; obtaining the sending time S(i) of packet i from the timestamp in the packet, or from another field that records the sending time of the packet; and calculating the delay jitter of packet i as J(i) = R(i) - S(i);

From the audio media data packets that have already arrived at the reassembly buffer, obtaining the expected delay jitter $\overline{J(i)}$ = [formula missing] of the audio media data packet i newly arrived at the reassembly buffer, where N is a fixed value, i ≥ N, P(k) is a weighting coefficient, and $\sum_{k=i-N}^{i-1} P(k) = 1$;

Calculating the quasi standard error of the delay jitter of audio media data packet i: $J_{ARMSE}(i) = \sqrt{\frac{1}{N}\sum_{k=i-N+1}^{i}\left[J(k)-\overline{J(k)}\right]^{2}}$;

Setting thresholds g₁ and g₂, with g₁ < g₂, which determine by how much the jitter elimination window size is changed;

When [formula missing], the jitter elimination window is enlarged: if J_ARMSE(i) ∈ [g₁, g₂], the window is increased by [formula missing]; if J_ARMSE(i) > g₂, the window is increased by [formula missing], where k ∈ [0.5, 1];

When [formula missing], the jitter elimination window is reduced: if J_ARMSE(i) ∈ [g₁, g₂], the window is reduced by [formula missing]; if J_ARMSE(i) > g₂, the window is reduced by [formula missing], where k ∈ [0.5, 1];

If the adjusted jitter elimination window size W is smaller than the minimum W_min, the window size is set to W_min; if the adjusted jitter elimination window size W is larger than the maximum W_max, the window size is set to W_max;

When J_ARMSE(i) < g₁, the size of the jitter elimination window does not need to change;

If the jitter elimination window size has been modified, the expected delay jitter $\overline{J(i)}$ of audio media data packet i is changed to any integer in the interval [J(i)-4, J(i)+4]; otherwise it is left unchanged.
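The window adjustment above can be sketched in Python under several loudly stated assumptions, because the source text does not reproduce the formula for the expected jitter, the conditions that select enlargement or reduction, or the step sizes. In this sketch `expected_jitter` assumes a P(k)-weighted sum of the previous N jitter values (suggested by the normalization of P(k), but not confirmed by the text), the boolean `grow` stands in for the missing condition, and `small_step` and `large_step` are hypothetical placeholders for the missing amounts.

```python
import math


def expected_jitter(jitter, weights, i):
    """ASSUMPTION: expected jitter of packet i as the P(k)-weighted sum of
    J(k) for k = i-N .. i-1, where weights holds P(i-N) .. P(i-1)."""
    n = len(weights)
    return sum(p * jitter[k] for p, k in zip(weights, range(i - n, i)))


def armse(jitter, expected, i, n):
    """J_ARMSE(i): root of the mean squared difference between J(k) and its
    expected value over packets k = i-N+1 .. i."""
    total = sum((jitter[k] - expected[k]) ** 2 for k in range(i - n + 1, i + 1))
    return math.sqrt(total / n)


def adjust_window(w, j_armse, g1, g2, grow, small_step, large_step, w_min, w_max):
    """Change the window size according to the thresholds g1 < g2, then clamp
    the result to [W_min, W_max]. Step sizes and `grow` are hypothetical."""
    if j_armse < g1:
        return w                                  # below g1: no change
    step = small_step if j_armse <= g2 else large_step
    w = w + step if grow else w - step
    return max(w_min, min(w, w_max))              # clamp to [W_min, W_max]


jitter = [10, 12, 11, 13]
expected = [10, 10, 10, 10]
error = armse(jitter, expected, i=3, n=2)
new_w = adjust_window(5, error, g1=1, g2=3, grow=True,
                      small_step=1, large_step=2, w_min=2, w_max=10)
```

The clamping step is the only part taken directly from the text without interpretation; the rest should be read as one plausible arrangement of the described thresholds.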

Optionally, performing the callback operation on the audio media data packets according to the size of the jitter elimination window includes:

Finding the audio media data packet with sequence number FSN in the jitter elimination window, performing callback decoding on it according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session, and then searching the jitter elimination window for the audio media data packet with sequence number FSN+1;

If it is found, the packet with sequence number FSN+1 is passed to callback decoding and the search continues with the packet with sequence number FSN+2. Otherwise, it is determined whether any of the jitter elimination window remains: if none remains, the callback decoding operation is performed once more for the packet with sequence number FSN+1 and the search continues with the packet with sequence number FSN+2; otherwise, the search is resumed after waiting for a preset time t;

If the search succeeds three times in a row, the jitter elimination window is reduced by the size of one data frame, and it is checked whether the window is now smaller than the minimum W_min; if so, the window size is set to W_min. The search operation then continues;

If the search fails three times in a row, the jitter elimination window is increased by the size of one data frame, and it is checked whether the window is now larger than the maximum W_max; if so, the window size is set to W_max. The search operation then continues.
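The rule of three consecutive successes or failures can be sketched in Python as a small counter class. The class name `WindowTuner` is hypothetical, the window is measured in data frames, and resetting both counters after each adjustment is an assumption that the text does not spell out.

```python
class WindowTuner:
    """Shrink the jitter elimination window after three consecutive successful
    searches and grow it after three consecutive failed searches, keeping the
    size within [W_min, W_max]."""

    def __init__(self, w, w_min, w_max, frame=1):
        self.w, self.w_min, self.w_max, self.frame = w, w_min, w_max, frame
        self.hits = 0     # consecutive successful searches
        self.misses = 0   # consecutive failed searches

    def record(self, found):
        """Register one search result and return the current window size."""
        if found:
            self.hits, self.misses = self.hits + 1, 0
            if self.hits == 3:
                self.w = max(self.w_min, self.w - self.frame)
                self.hits = 0      # assumption: start counting again
        else:
            self.misses, self.hits = self.misses + 1, 0
            if self.misses == 3:
                self.w = min(self.w_max, self.w + self.frame)
                self.misses = 0    # assumption: start counting again
        return self.w


tuner = WindowTuner(w=4, w_min=2, w_max=6)
sizes = [tuner.record(True) for _ in range(3)]
```

A successful search resets the failure counter and vice versa, so only uninterrupted runs of three change the window size.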

As can be seen from the above technical solution, in the method of the present invention for multi-path redundant transmission of network real-time audio session media data, the sending end of the audio media data packs the captured audio media data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session and sends it redundantly over multiple incompletely intersecting transmission paths at the same time, performing multi-path redundant transmission of real-time audio media data; the receiving end of the audio media data performs redundancy elimination and reassembly operations on the audio media data packets received from the different transmission paths and restores the original audio data according to the negotiated audio codec format and network transmission protocol. Multi-path redundant transmission of real-time audio media data thus effectively raises the probability that a lost packet can be compensated, improves the packet loss, delay and jitter figures of the transmission, and in turn improves the reliability of data transmission and the service quality of experience (QoE).

Brief description of the drawings:

Fig. 1 is a schematic flowchart of the method for multi-path redundant transmission of network real-time audio session media data provided by the first embodiment of the present invention;

Fig. 2 is a structural diagram of the multi-path redundant transmission system of the multi-path relay transmission service system in which the SIP Proxy/IMS CSCF participates in session negotiation, provided by the second embodiment of the present invention;

Fig. 3 is a schematic flowchart of the method for multi-path redundant transmission of network real-time audio session media data provided by the second embodiment of the present invention;

Fig. 4 is a schematic diagram of the transmission of audio media data packets in the multi-path relay transmission service system in which the SIP Proxy/IMS CSCF participates in session negotiation, provided by the second embodiment of the present invention;

Fig. 5 is a message flow diagram of the multi-path redundant transmission process of the multi-path relay transmission service system in which the SIP Proxy/IMS CSCF participates in session negotiation, provided by the second embodiment of the present invention;

Fig. 6 is a block diagram of the buffer setup at the receiving end of the audio media data packets in the method provided by the second embodiment of the present invention;

Fig. 7 is an implementation design diagram of the buffers at the receiving end of the audio media data packets in the method provided by the second embodiment of the present invention;

Fig. 8 is a schematic diagram of the encapsulation format of the audio media data packets for multi-path redundant transmission in the multi-path relay transmission service system in which the SIP Proxy/IMS CSCF participates in session negotiation, provided by the second embodiment of the present invention;

Fig. 9 is a flowchart of the redundancy elimination of audio media data packets in the method provided by the second embodiment of the present invention.

Detailed description:

The specific embodiments of the present invention are described in further detail below with reference to the accompanying drawings and embodiments. The following embodiments illustrate the present invention but do not limit its scope.

Fig. 1 shows a schematic flowchart of the method for multi-path redundant transmission of network real-time audio session media data provided by the first embodiment of the present invention. As shown in Fig. 1, the method of this embodiment is as follows.

101. The sending end of the audio media data packs the captured audio media data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session, and sends it redundantly over multiple incompletely intersecting transmission paths at the same time, performing multi-path redundant transmission of real-time audio media data.

In this step, while the audio session is being established, the calling party and the called party of the audio session obtain multiple incompletely intersecting transmission paths. It should be noted that the multiple incompletely intersecting transmission paths include: a default path based on the default route and one or more incompletely intersecting redundant transmission paths.

Specifically, during the audio session, a redundancy coefficient is set according to the real-time audio session, and the number of incompletely intersecting transmission paths is greater than the redundancy coefficient set for the real-time audio session.

102. The receiving end of the audio media data performs redundancy elimination and reassembly operations on the audio media data packets received from the different transmission paths, and restores the original audio data according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session.

In this step, the redundancy elimination and reassembly operations include:

The receiving end of the audio media data performs redundancy elimination on the received audio media data packets;

Further, setting up and initializing the sub-stream buffers, the reassembly buffer and the jitter elimination window includes:

Setting up the reassembly buffer, which stores, for each audio media data packet remaining after redundancy elimination, the sequence number FSN of the packet, the index of the sub-stream buffer that holds the packet, and the storage position of the packet within that sub-stream buffer. The size of the reassembly buffer is initialized to N_j, all packet sequence numbers FSN stored in the reassembly buffer are initialized to -1, and all sub-stream buffer indices and storage positions are initialized to empty;

Setting up the jitter elimination window, which is used to eliminate the jitter of the audio media data packets, and initializing the size of the jitter elimination window to W, W ∈ [W_min, W_max].

Further, the redundancy elimination operation on the audio media data packets includes:

S2. Take the sequence number FSN_i of the audio media data packet modulo the reassembly buffer size N_j to obtain the value m, that is, m = FSN_i mod N_j, and look up the packet sequence number stored at position m of the reassembly buffer, denoted J_FSN_m;

It should be noted that querying each sub-stream buffer by polling includes:

The receiving end of the audio media data periodically collects statistics on the reception and elimination of the audio media data packets in each sub-stream buffer, and derives a polling priority sequence for the sub-stream buffers corresponding to the different transmission paths;

Further, the reassembly operation includes:

Specifically, during a multi-path real-time audio session, the receiving end of the audio media data packets receives the packets transmitted over the multiple incompletely intersecting transmission paths and performs redundancy elimination on them using a marker that is common to the packets and can distinguish them, such as a sequence number or timestamp. It then searches the dynamic buffer area or dynamic buffer window for the audio media data packet to be decoded, performs callback decoding on that packet according to the audio codec format and network transmission protocol negotiated by both parties of the real-time audio session, and restores the original audio data.

Fig. 2 shows a structural diagram of the multi-path redundant transmission system of the multi-path relay transmission service system in which the SIP Proxy/IMS CSCF participates in session negotiation, provided by the second embodiment of the present invention. As shown in Fig. 2, the calling party 210 and the called party 220 are located in the user-side networks at the two ends, and the SIP Proxy/IMS CSCF 230, the media relay controller 240 and the media relay servers 250 are all deployed in the network.

The SIP Proxy/IMS CSCF 230 is a SIP Proxy/IMS CSCF capable of multi-path transmission session negotiation. Signaling negotiation for multi-path session transmission is implemented by extending SIP and SDP messages. Adding a module for handling multi-path session requests to the SIP Proxy/IMS CSCF 230 enables it to process multi-path session requests from the user-side networks at both ends.

The media relay controller 240 manages the network topology and behavior of the media relay servers 250 and is responsible for allocating relay transmission paths. The media relay servers 250 take part in constructing the relay transmission paths and are responsible for receiving and forwarding data packets. The media relay controller 240 and the media relay servers 250 together form the multi-path relay transmission service system, which provides relay transmission services for multi-path sessions. For example, this embodiment contains two media relay servers, namely the media relay server 250-1 and the media relay server 250-2 shown in Fig. 2.

Specifically, the SIP Proxy/IMS CSCF 230 capable of multi-path transmission session negotiation receives a multi-path session establishment request from the user-side networks at the two ends and asks the media relay controller 240 to allocate relay transmission paths. The media relay controller 240 negotiates with the media relay servers 250 to complete the allocation and establishment of the relay paths, and returns the allocation information of the relay transmission paths to the SIP Proxy/IMS CSCF 230.

Based on the multi-path relay transmission service system of Fig. 2 in which the SIP Proxy/IMS CSCF participates in session negotiation, the method of this embodiment is further explained with reference to Fig. 3, the schematic flowchart of the method for multi-path redundant transmission of network real-time audio session media data provided by the second embodiment of the present invention.

301、主叫方和被叫方之间建立实时音频会话，音频媒体数据的发送端将捕获的音频媒体数据按照实时音频会话双方协商的音频编解码格式和网络传输协议打包，采用冗余方式同时发送至三条不完全相交的传输路径上，进行实时音频媒体数据多径冗余传输。301. A real-time audio session is established between the calling party and the called party, and the sending end of the audio media data packages the captured audio media data according to the audio codec format and network transmission protocol negotiated by the two parties of the real-time audio session, and adopts a redundant method at the same time Send to three incompletely intersecting transmission paths for multi-path redundant transmission of real-time audio media data.

本步骤中，如图2所示，根据音频会话设置冗余系数，本实施例中的音频媒体数据传输的冗余系数为2，主叫方210和被叫方220之间获取了三条传输路径260，其中包括一条基于缺省路由的默认路径260-D、经由所述媒体中继服务器250-1的中继传输路径260-R1和经由所述媒体中继服务器250-2的中继传输路径260-R2。In this step, as shown in Figure 2, the redundancy coefficient is set according to the audio session, the redundancy coefficient of the audio media data transmission in this embodiment is 2, and three transmission paths are obtained between the calling party 210 and the called party 220 260, including a default path 260-D based on a default route, a relay transmission path 260-R1 via the media relay server 250-1, and a relay transmission path via the media relay server 250-2 260-R2.

图4示出了本发明第二实施例提供的SIPProxy/IMSCSCF参与会话协商的多径中继传输服务系统的音频媒体数据分组的传输示意图，如图4所示，每条传输路径传输相同的音频媒体数据分组。Fig. 4 shows the transmission schematic diagram of the audio media data packet of the multi-path relay transmission service system in which SIPProxy/IMSCSCF participates in the session negotiation provided by the second embodiment of the present invention. As shown in Fig. 4, each transmission path transmits the same audio Media data grouping.

FIG. 5 is a message flow chart of the multipath redundant transmission process of the multipath relay transmission service system, provided by the second embodiment of the present invention, in which the SIP Proxy/IMS CSCF participates in session negotiation. The signaling procedure for multipath redundant transmission is as follows.

501. The calling party 210 requests the SIP Proxy/IMS CSCF 230, which has the multipath transmission session negotiation capability, to establish a multipath real-time audio session between the calling party 210 and the called party 220, with a redundancy factor of 2;

502. The SIP Proxy/IMS CSCF 230 checks whether the calling party 210 and the called party 220 have the multipath session capability. If either party or both parties lack it, the SIP Proxy/IMS CSCF 230 rejects the multipath audio session request. If both the calling party 210 and the called party 220 have the multipath session capability, the SIP Proxy/IMS CSCF 230 requests the media relay controller 240 to allocate multiple relay transmission paths for the multipath real-time audio session between the calling party 210 and the called party 220;

503. Because the redundancy factor of the session is 2, the media relay controller 240 allocates two relay transmission paths for the multipath real-time audio session between the calling party 210 and the called party 220, one via the media relay server 250-1 and one via the media relay server 250-2. The media relay controller 240 sends relay path addition requests 503-1 and 503-2 to the media relay server 250-1 and the media relay server 250-2, respectively;

504. The media relay server 250-1 and the media relay server 250-2 complete the addition of the relay paths and return relay path addition success responses 504-1 and 504-2, respectively, to the media relay controller 240;

505. The media relay controller 240 returns a relay path allocation success response to the SIP Proxy/IMS CSCF 230; the response message contains the information of the two relay paths;

506. The SIP Proxy/IMS CSCF 230 notifies the called party 220 of the transmission path allocation;

507. The called party 220 allocates and initializes buffers for the multipath audio session and returns to the SIP Proxy/IMS CSCF 230 a success response to the notification of step 506;

508. The SIP Proxy/IMS CSCF 230 returns a multipath session establishment success response to the calling party 210 and notifies the calling party 210 of the allocation of the two relay transmission paths;

509. There are now three transmission paths between the calling party 210 and the called party 220: the default path 260-D based on the default route, the relay transmission path 260-R1 via the media relay server 250-1, and the relay transmission path 260-R2 via the media relay server 250-2. Redundant transmission is used, and the media data packets are transmitted as shown in FIG. 4.
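The decision logic of steps 502 and 503 can be sketched as follows. This is only an illustration under stated assumptions: the function `allocate_paths`, its parameters, and the path labels are hypothetical, and rejecting the request when fewer relay servers are available than the redundancy factor is an assumption not stated in the text.

```python
from typing import List, Optional, Sequence


def allocate_paths(caller_multipath: bool, callee_multipath: bool,
                   redundancy: int,
                   relay_servers: Sequence[str]) -> Optional[List[str]]:
    """Return the path labels for a session, or None if it is rejected."""
    # Step 502: both endpoints must support multipath sessions.
    if not (caller_multipath and callee_multipath):
        return None
    # Assumption: too few relay servers also leads to rejection.
    if len(relay_servers) < redundancy:
        return None
    # Step 503: one relay path per unit of redundancy, plus the default path.
    relays = list(relay_servers[:redundancy])
    return ["default"] + ["relay via " + name for name in relays]
```

With a redundancy factor of 2 and two relay servers, the sketch yields three paths, matching the default path and the two relay paths of step 509.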

302. The receiving end of the audio media data sets and initializes the sub-stream buffers, the reassembly buffer and the jitter elimination window, and receives the audio media data packets transmitted over the three incompletely intersecting transmission paths.

In this step, setting and initializing the sub-stream buffers, the reassembly buffer and the jitter elimination window includes:

Setting the jitter elimination window, which is used to eliminate the jitter of the audio media data packets, and initializing its size to W = 100 ms, where W ∈ [W_min, W_max], W_min = 40 ms and W_max = 160 ms.

Specifically, FIG. 6 is a block diagram of the buffer arrangement at the receiving end of the audio media data packets in the method provided by the second embodiment of the present invention. FIG. 6 comprises a sub-stream buffer 610, a reassembly buffer 620 and a jitter elimination window 630. FIG. 7 shows an implementation design of these buffers at the receiving end, in which the reassembly buffer is fixed. Audio media data packets arrive over the multiple incompletely intersecting paths and enter the sub-stream buffer 710 corresponding to each transmission path: sub-stream buffer 710-D corresponds to transmission path 260-D, sub-stream buffer 710-R1 to transmission path 260-R1, and sub-stream buffer 710-R2 to transmission path 260-R2. The packets then pass through the redundancy elimination operation into the reassembly buffer 720. Jitter elimination is achieved by dynamically adjusting the size of the jitter elimination window 730 on the reassembly buffer 720. Finally, the audio media data packets are callback-decoded according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session, restoring the original audio data.

FIG. 8 is a schematic diagram of the encapsulation format of the audio media data packets used for multipath redundant transmission in the multipath relay transmission service system provided by the second embodiment of the present invention. The packets are encapsulated using the Multipath Transport Protocol (MPTP). The fields in FIG. 8 have the following meanings:

801: version number, 2 bits, currently version 1;

802: type, 1 bit, indicates the type of the packet (media data packet or control data packet);

803: padding bit, 1 bit, indicates whether the packet contains padding data that is not part of the payload;

804: application-specific MPTP type, 4 bits, indicates the specific application of this kind of data packet;

805: service type, 4 bits, indicates the transmission requirements of different kinds of service;

806: reserved field, 4 bits, set to 0;

807: sub-stream sequence number, 16 bits, identifies the transmission order of the audio media data packet on its path; in redundant transmission the sub-stream sequence number is set equal to the sequence number of the audio media data packet;

808: path identifier, 32 bits, identifies a transmission path; the receiving end stores audio media data packets according to this identifier;

809: sequence number, 32 bits, uniquely identifies an audio media data packet within the real-time audio session; the receiving end uses this sequence number to reassemble the original audio data;

810: payload, i.e., the real-time audio media data to be transmitted.
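The field widths listed above add up to a 12-byte header (16 bits of flags, a 16-bit sub-stream sequence number, a 32-bit path identifier and a 32-bit sequence number). The following Python sketch packs and parses such a header. The bit order within the first 16 bits (fields 801 to 806 from most to least significant), network byte order, and truncating the sub-stream sequence number to its low 16 bits are assumptions; FIG. 8 is not reproduced in the text, so the actual layout may differ.

```python
import struct

# Flags word, sub-stream sequence number, path identifier, sequence number.
HEADER = struct.Struct("!HHII")


def pack_mptp(version: int, ptype: int, padding: int, app_type: int,
              service: int, path_id: int, seq: int, payload: bytes) -> bytes:
    """Build a packet; the reserved field 806 is left at 0."""
    flags = (((version & 0x3) << 14) | ((ptype & 0x1) << 13)
             | ((padding & 0x1) << 12) | ((app_type & 0xF) << 8)
             | ((service & 0xF) << 4))
    # Field 807 mirrors the packet sequence number (low 16 bits assumed).
    return HEADER.pack(flags, seq & 0xFFFF, path_id, seq) + payload


def unpack_mptp(packet: bytes) -> dict:
    """Split a packet into its header fields and payload."""
    flags, subflow_seq, path_id, seq = HEADER.unpack_from(packet)
    return {
        "version": flags >> 14,
        "type": (flags >> 13) & 0x1,
        "padding": (flags >> 12) & 0x1,
        "app_type": (flags >> 8) & 0xF,
        "service": (flags >> 4) & 0xF,
        "reserved": flags & 0xF,
        "subflow_seq": subflow_seq,
        "path_id": path_id,
        "seq": seq,
        "payload": packet[HEADER.size:],
    }
```

A receiver would use `path_id` to choose the sub-stream buffer and `seq` to detect duplicates across paths.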

In a specific application, the sub-stream buffer 610 receives the audio media data packets from the multiple incompletely intersecting paths. Redundancy elimination is performed on these packets, and the storage location of each retained packet is recorded in the reassembly buffer 620. At the same time, based on statistics of the packets received and of the redundant packets eliminated, the polling priorities of the sub-stream buffers corresponding to the different transmission paths are adjusted periodically. Dynamically adjusting the size of the jitter elimination window 630 eliminates the jitter of the audio media data packets, and the callback decoding process in turn adjusts the size of the jitter elimination window in real time according to the callback results.

303. The receiving end of the audio media data performs redundancy elimination on the received audio media data packets.

In this step, FIG. 9 is a flowchart of the redundancy elimination of audio media data packets in the method provided by the second embodiment of the present invention. The specific steps are:

901. Initialize the reassembly buffer with size N_j;

902. Obtain an audio media data packet i from the sub-stream buffers of the paths by polling;

903. Determine whether the payload of the audio media data packet i is empty. If so, execute step 911; otherwise, execute step 904;

904. Extract the sequence number of the audio media data packet i (the sequence number 809 shown in FIG. 8), denoted FSN_i;

905. Calculate m = FSN_i mod N_j;

906. Look up the sequence number (likewise the sequence number 809 shown in FIG. 8) of the audio media data packet stored at position m in the reassembly buffer, denoted J_FSN_m;

907. Determine whether J_FSN_m equals -1. If so, the audio media data packet is newly arrived at the reassembly buffer; execute step 909. Otherwise, execute step 908;

908. Determine whether J_FSN_m is less than FSN_i. If so, the audio media data packet i is a newly arrived audio media data packet; execute step 909. Otherwise, execute step 911;

909. Assign the value of FSN_i to J_FSN_m;

910. Store the storage location of the audio media data packet i in the sub-stream buffer at position m of the reassembly buffer;

911. Determine whether reception has ended. If so, stop receiving audio media data packets; otherwise, execute step 902.
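Steps 903 to 910 can be sketched in Python as follows. The class name `ReassemblyBuffer`, the method `offer`, and the representation of a storage location as a (path label, slot index) pair are illustrative assumptions; polling (step 902) and the end-of-reception check (step 911) are left to the caller.

```python
from typing import List, Optional, Tuple

EMPTY = -1  # initial value of every J_FSN_m


class ReassemblyBuffer:
    def __init__(self, size: int) -> None:
        self.size = size                      # N_j (step 901)
        self.fsn: List[int] = [EMPTY] * size  # J_FSN_m per position
        self.loc: List[Optional[Tuple[str, int]]] = [None] * size

    def offer(self, fsn: int, payload: bytes,
              path: str, slot: int) -> bool:
        """Apply steps 903-910 to one packet; True if it is kept."""
        if not payload:                        # step 903: empty payload
            return False
        m = fsn % self.size                    # step 905
        stored = self.fsn[m]                   # step 906
        if stored == EMPTY or stored < fsn:    # steps 907 and 908
            self.fsn[m] = fsn                  # step 909
            self.loc[m] = (path, slot)         # step 910
            return True
        return False                           # duplicate or stale packet
```

The first copy of a packet to arrive claims position m; later copies with the same sequence number fail the comparison in step 908 and are dropped.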

304. The receiving end of the audio media data performs a reassembly operation on the audio media data packets remaining after the redundancy elimination operation, and restores the original audio data according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session.

In this step, the reassembly operation includes:

Further, adjusting the size of the jitter elimination window includes:

According to the audio media data packets that have already arrived at the reassembly buffer, the expected delay jitter \overline{J(i)} of the audio media data packet i newly arriving at the reassembly buffer is obtained by weighting the delay jitter of the preceding packets with the coefficients P(k), where N is a fixed value, here N = 4, i ≥ N, and P(k) is a weighting coefficient, typically P(i-1) = 0.5, P(i-2) = 0.3, P(i-3) = 0.125 and P(i-4) = 0.075;
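The prediction step can be illustrated with a short Python sketch. The formula itself is not legible in the text, so the sketch assumes the expected jitter is the weighted sum of the N most recent jitter values, with weights P(i-1) to P(i-4) that sum to 1. The function name `expected_jitter` is hypothetical.

```python
from typing import Sequence

# P(i-1), P(i-2), P(i-3), P(i-4) as given in the embodiment.
WEIGHTS = (0.5, 0.3, 0.125, 0.075)


def expected_jitter(history: Sequence[float],
                    weights: Sequence[float] = WEIGHTS) -> float:
    """Predict the jitter of the next packet from past jitter values.

    `history` holds J(k) in arrival order, so its last element is J(i-1).
    Assumed form: sum over the last N values of P(k) * J(k).
    """
    n = len(weights)
    if len(history) < n:
        raise ValueError("need at least N past jitter samples")
    recent = list(history[-n:])[::-1]   # J(i-1), J(i-2), ...
    return sum(p * j for p, j in zip(weights, recent))
```

Because the most recent packet carries half the total weight, a sudden jitter increase shifts the prediction quickly while older samples smooth it.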

Set the thresholds g₁ = 5 ms and g₂ = 10 ms, which are used to determine the range by which the size of the jitter elimination window is changed;

When the jitter elimination window is to be enlarged: if J_ARMSE(i) ∈ [5, 10], the jitter elimination window is increased by 2 g_{1} + k |J(i) - \overline{J(i)}|; if J_ARMSE(i) > 10, the jitter elimination window is increased by 2 g_{2} + \frac{k}{N} \sum_{k=i-N+1}^{i} |J(k) - \overline{J(k)}|, where k = 1;

When the jitter elimination window is to be reduced: if J_ARMSE(i) ∈ [5, 10], the jitter elimination window is reduced by 2 g_{1} + k |J(i) - \overline{J(i)}|; if J_ARMSE(i) > 10, the jitter elimination window is reduced by 2 g_{2} + \frac{k}{N} \sum_{k=i-N+1}^{i} |J(k) - \overline{J(k)}|, where k = 0.6;

If the adjusted size W of the jitter elimination window is smaller than the minimum value W_min, the size of the jitter elimination window is set to W_min; if the adjusted size W is larger than the maximum value W_max, the size of the jitter elimination window is set to W_max;

When J_ARMSE(i) < 5, the size of the jitter elimination window does not need to be changed;

It should be noted that, to keep the current audio media data packet i from degrading the accuracy of the prediction for the next audio media data packet i+1, if the size of the jitter elimination window has been modified, the delay jitter J(i) of the audio media data packet i is changed to any integer in the interval [J(i)-4, J(i)+4]; otherwise, no change is made.
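The window adjustment can be sketched in Python as below, using the error measure and step sizes given in claim 9 together with the embodiment's values (g₁ = 5 ms, g₂ = 10 ms, window limits 40 ms and 160 ms). The condition that decides between enlarging and reducing is not legible in the text, so the sketch takes it as a caller-supplied flag `grow`; that flag and the function names are assumptions.

```python
import math
from typing import Sequence

W_MIN, W_MAX = 40.0, 160.0   # window limits in ms
G1, G2 = 5.0, 10.0           # thresholds g1 and g2 in ms


def armse(actual: Sequence[float], expected: Sequence[float]) -> float:
    """Root of the mean squared gap between actual and expected jitter."""
    n = len(actual)
    return math.sqrt(sum((a - e) ** 2 for a, e in zip(actual, expected)) / n)


def adjust_window(w: float, actual: Sequence[float],
                  expected: Sequence[float], grow: bool,
                  k_grow: float = 1.0, k_shrink: float = 0.6) -> float:
    """Return the new window size; inputs hold the last N packets."""
    err = armse(actual, expected)
    if err < G1:
        return w                              # small error: keep the size
    k = k_grow if grow else k_shrink
    if err <= G2:
        step = 2 * G1 + k * abs(actual[-1] - expected[-1])
    else:
        n = len(actual)
        gaps = sum(abs(a - e) for a, e in zip(actual, expected))
        step = 2 * G2 + (k / n) * gaps
    w = w + step if grow else w - step
    return min(W_MAX, max(W_MIN, w))          # clamp to [W_min, W_max]
```

The smaller coefficient used when reducing (0.6 against 1) makes the window shrink more cautiously than it grows.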

Further, performing the callback operation on the audio media data packets according to the size of the jitter elimination window includes:

Finding, in the jitter elimination window, the audio media data packet whose sequence number is FSN, callback-decoding it according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session, and then searching the jitter elimination window for the audio media data packet whose sequence number is FSN+1;

If it is found, the audio media data packet with sequence number FSN+1 is callback-decoded, and the search continues with the audio media data packet whose sequence number is FSN+2;

Otherwise, it is determined whether the jitter elimination window has any remainder. If there is no remainder, the callback decoding operation is performed once more for the audio media data packet with sequence number FSN+1, and the search continues with the audio media data packet whose sequence number is FSN+2; otherwise, the search resumes after waiting for a preset time t, where t may be 10 ms;
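The callback loop can be sketched as follows. This is an interpretation under stated assumptions: the "remainder" of the jitter elimination window is modeled as a time budget equal to the window size that is consumed in steps of t, a missing packet is passed to the decoder as `None`, and the names `playout`, `decode` and `wait` are hypothetical.

```python
from typing import Callable, Dict, Optional


def playout(packets: Dict[int, bytes], first_fsn: int, count: int,
            window_ms: float,
            decode: Callable[[int, Optional[bytes]], None],
            wait: Callable[[float], None],
            t_ms: float = 10.0) -> None:
    """Hand `count` consecutive packets to the decoder in order.

    `packets` maps sequence numbers to payloads and may be filled in
    by other code while `wait` is running.
    """
    fsn = first_fsn
    for _ in range(count):
        remaining = window_ms            # assumed meaning of "remainder"
        while fsn not in packets and remaining >= t_ms:
            wait(t_ms)                   # give a late packet time to arrive
            remaining -= t_ms
        # Decode the packet, or signal a gap if it never arrived.
        decode(fsn, packets.get(fsn))
        fsn += 1
```

Injecting `wait` keeps the sketch testable without real delays; a real receiver would sleep or block on packet arrival instead.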

In summary, by transmitting conversational audio data redundantly over multiple paths, the method of this embodiment raises the probability that a lost packet is compensated by a copy from another path and thus improves the reliability of data transmission. By performing redundancy elimination at the receiving end, followed by a reassembly operation that removes jitter with a dynamically sized buffer window, the method reduces the effect of delay jitter on the reception of audio media data packets and thereby improves the user's quality of experience.

Finally, it should be noted that the above embodiments are intended only to illustrate the technical solutions of the present invention, not to limit them. Although the present invention has been described in detail with reference to the foregoing embodiments, those of ordinary skill in the art should understand that the technical solutions described in those embodiments may still be modified, and that some or all of their technical features may be replaced by equivalents, without such modifications or replacements causing the essence of the corresponding technical solutions to depart from the scope defined by the claims of the present invention.

Claims

1. A method for multipath redundant transmission of network real-time audio session media data, characterized by comprising:

the sending end of the audio media data packing the captured audio media data according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session, and sending the packets redundantly over a plurality of incompletely intersecting transmission paths at the same time, thereby performing multipath redundant transmission of the real-time audio media data;

the receiving end of the audio media data performing redundancy elimination and reassembly operations on the audio media data packets received from the different transmission paths, and restoring the original audio data according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session.

2. The method for multipath redundant transmission of network real-time audio session media data according to claim 1, characterized in that the plurality of incompletely intersecting transmission paths comprises: a default path based on the default route and one or more incompletely intersecting redundant transmission paths.

3. The method for multipath redundant transmission of network real-time audio session media data according to claim 1, characterized in that a redundancy factor is set according to the real-time audio session, and the number of the plurality of incompletely intersecting transmission paths is greater than the redundancy factor set for the real-time audio session.

4. The method for multipath redundant transmission of network real-time audio session media data according to claim 1, characterized in that the redundancy elimination and reassembly operations comprise:

setting and initializing sub-stream buffers, a reassembly buffer and a jitter elimination window;

the receiving end of the audio media data performing a redundancy elimination operation on the received audio media data packets;

the receiving end of the audio media data performing a reassembly operation on the audio media data packets remaining after the redundancy elimination operation.

5. The method for multipath redundant transmission of network real-time audio session media data according to claim 4, characterized in that setting and initializing the sub-stream buffers, the reassembly buffer and the jitter elimination window comprises:

setting as many sub-stream buffers as there are incompletely intersecting transmission paths, the sub-stream buffers being used to receive the audio media data packets of the different transmission paths, and initializing every storage location of each sub-stream buffer to empty;

setting the reassembly buffer, which is used to store the sequence number of each audio media data packet remaining after redundancy elimination, the number of the sub-stream buffer that stores the packet, and the storage location of the packet within that sub-stream buffer; initializing the size of the reassembly buffer to N_j; initializing all audio media data packet sequence numbers stored in the reassembly buffer to -1; and initializing the sub-stream buffer numbers and the storage locations within the sub-stream buffers to empty;

setting the jitter elimination window, which is used to eliminate the jitter of the audio media data packets, and initializing the size of the jitter elimination window to W, where W ∈ [W_min, W_max].

6. The method for multipath redundant transmission of network real-time audio session media data according to claim 4, characterized in that the redundancy elimination operation comprises:

S1. querying each sub-stream buffer by polling, obtaining the most recently received audio media data packet i, extracting the sequence number of the audio media data packet i, and denoting it FSN_i;

S2. taking the sequence number FSN_i of the audio media data packet i modulo the size N_j of the reassembly buffer to obtain a value m, that is, m = FSN_i mod N_j, and looking up the sequence number of the audio media data packet stored at position m in the reassembly buffer, denoted J_FSN_m;

S3. if J_FSN_m = -1, assigning the value of FSN_i to J_FSN_m, storing at position m of the reassembly buffer the number of the sub-stream buffer holding the audio media data packet i and the storage location of the packet within it, and executing step S1;

S4. if J_FSN_m ≠ -1 and FSN_i > J_FSN_m, assigning the value of FSN_i to J_FSN_m, storing at position m of the reassembly buffer the number of the sub-stream buffer holding the audio media data packet i and the storage location of the packet within it, and executing step S1;

S5. if J_FSN_m ≠ -1 and FSN_i ≤ J_FSN_m, executing step S1.

7. The method for multipath redundant transmission of network real-time audio session media data according to claim 6, characterized in that querying each sub-stream buffer by polling comprises:

the receiving end of the audio media data periodically collecting statistics on the reception and redundancy elimination of the audio media data packets in each sub-stream buffer, and obtaining a polling priority order for the sub-stream buffers corresponding to the different transmission paths;

the receiving end of the audio media data querying each sub-stream buffer in the polling priority order and performing the redundancy elimination operation.

8. The method for multipath redundant transmission of network real-time audio session media data according to claim 4, characterized in that the reassembly operation comprises:

dynamically adjusting the size of the jitter elimination window;

performing the callback operation on the audio media data packets according to the size of the jitter elimination window.

9. The method for multipath redundant transmission of network real-time audio session media data according to claim 8, characterized in that dynamically adjusting the size of the jitter elimination window comprises:

recording the time R(i) at which the audio media data packet i arrives at the reassembly buffer, obtaining the sending time S(i) of the audio media data packet i from the timestamp in the packet or from another field used to record the sending time of the packet, and calculating the delay jitter of the audio media data packet i as J(i) = R(i) - S(i);

obtaining, according to the audio media data packets that have already arrived at the reassembly buffer, the expected delay jitter \overline{J(i)} of the audio media data packet i newly arriving at the reassembly buffer, where N is a fixed value, i ≥ N, P(k) is a weighting coefficient, and

\sum_{k=i-N}^{i-1} P(k) = 1;

calculating the quasi-standard error of the delay jitter of the audio media data packet i

J_{ARMSE}(i) = \sqrt{\frac{1}{N} \sum_{k=i-N+1}^{i} [J(k) - \overline{J(k)}]^{2}};

setting thresholds g₁ and g₂, with g₁ < g₂, which are used to determine the range by which the size of the jitter elimination window is changed;

when the jitter elimination window is to be enlarged: if J_ARMSE(i) ∈ [g₁, g₂], increasing the jitter elimination window by

2 g_{1} + k |J(i) - \overline{J(i)}|,

and if J_ARMSE(i) > g₂, increasing the jitter elimination window by

2 g_{2} + \frac{k}{N} \sum_{k=i-N+1}^{i} |J(k) - \overline{J(k)}|,

where k ∈ [0.5, 1];

when the jitter elimination window is to be reduced: if J_ARMSE(i) ∈ [g₁, g₂], reducing the jitter elimination window by

2 g_{1} + k |J(i) - \overline{J(i)}|,

and if J_ARMSE(i) > g₂, reducing the jitter elimination window by

2 g_{2} + \frac{k}{N} \sum_{k=i-N+1}^{i} |J(k) - \overline{J(k)}|,

where k ∈ [0.5, 1];

if the adjusted size W of the jitter elimination window is smaller than the minimum value W_min, setting the size of the jitter elimination window to W_min, and if the adjusted size W is larger than the maximum value W_max, setting the size of the jitter elimination window to W_max;

when J_ARMSE(i) < g₁, leaving the size of the jitter elimination window unchanged;

if the size of the jitter elimination window has been modified, changing the expected delay jitter \overline{J(i)} of the audio media data packet i to any integer in the interval [J(i)-4, J(i)+4]; otherwise, making no change.

10. The method for multipath redundant transmission of network real-time audio session media data according to claim 8, characterized in that performing the callback operation on the audio media data packets according to the size of the jitter elimination window comprises:

finding, in the jitter elimination window, the audio media data packet whose sequence number is FSN, callback-decoding it according to the audio codec format and network transmission protocol negotiated by the two parties to the real-time audio session, and searching the jitter elimination window for the audio media data packet whose sequence number is FSN+1;

if it is found, callback-decoding the audio media data packet with sequence number FSN+1 and continuing to search for the audio media data packet with sequence number FSN+2; otherwise, determining whether the jitter elimination window has any remainder, and, if there is no remainder, performing the callback decoding once more for the audio media data packet with sequence number FSN+1 and continuing to search for the audio media data packet with sequence number FSN+2, or, if there is a remainder, resuming the search after waiting for a preset time t;

if the search succeeds three times in a row, reducing the jitter elimination window by the size of one data frame and determining whether the jitter elimination window is now smaller than the minimum value W_min, and if so, setting the size of the jitter elimination window to W_min, and continuing the search operation;

if the search fails three times in a row, increasing the jitter elimination window by the size of one data frame and determining whether the jitter elimination window is now larger than the maximum value W_max, and if so, setting the size of the jitter elimination window to W_max, and continuing the search operation.