JP2009065607A

JP2009065607A - JITTER BUFFER CONTROL METHOD AND VoIP TERMINAL

Info

Publication number: JP2009065607A
Application number: JP2007233917A
Authority: JP
Inventors: Takahiro Shimizu; 崇弘清水
Original assignee: Oki Electric Industry Co Ltd
Current assignee: Oki Electric Industry Co Ltd
Priority date: 2007-09-10
Filing date: 2007-09-10
Publication date: 2009-03-26

Abstract

PROBLEM TO BE SOLVED: To provide a jitter buffer control method for reducing sound quality deterioration such as sound interruption, and a VoIP terminal which implements the method.

SOLUTION: A jitter buffer control method according to the present invention includes: dividing sound data into a plurality of divided units when storing the sound data in a jitter buffer; setting a thinning incomplete flag for each divided unit; monitoring the total amount of sound data in the jitter buffer; and, when the total amount of sound data exceeds a predetermined threshold, applying thinning processing to at least one divided unit for which the thinning incomplete flag is set and changing the thinning incomplete flag of that divided unit into a thinning complete flag.

COPYRIGHT: (C)2009, JPO&INPIT

Description

The present invention relates to a jitter buffer control method in a VoIP terminal that performs voice communication using VoIP technology, and to such a VoIP terminal.

In voice communication using VoIP technology, voice data is encoded in a predetermined coding format, packetized at intervals of typically 10 msec to 20 msec, and transmitted toward a VoIP network such as the Internet. However, depending on congestion in the VoIP network, packet delivery at 10 msec to 20 msec intervals is not guaranteed, and jitter and delay inevitably occur. A VoIP terminal is therefore generally provided with a jitter buffer that absorbs jitter at packet reception. However, the audio data buffered to absorb the jitter translates directly into delay, so packets are discarded when the number of packets in the jitter buffer exceeds a certain threshold.

For example, Cited Document 1 discloses a technique in which a packet deletion area and a packet addition area are provided inside the FIFO constituting the jitter buffer, and packets are deleted or added according to the amount of packets accumulated in the FIFO. It also discloses a technique for speeding up processing by changing the clock frequency according to the amount of packets in the jitter buffer.

Cited Document 2 discloses a technique that determines from the jitter value of received packets whether stretching or thinning is necessary, and performs stretching or thinning on the audio data in the buffer in units of frames.

JP 2004-274572 A
JP 2005-64873 A

With these conventional techniques, however, discarding in units of packets or frames drops several tens to several hundreds of consecutive audio data pieces at once, so the audio is interrupted.

An object of the present invention is to provide a jitter buffer control method that reduces voice quality degradation such as voice interruption, and a VoIP terminal that executes the method.

A jitter buffer control method according to the present invention is a jitter buffer control method in a VoIP terminal comprising a jitter buffer that accumulates audio data included in received packets and an audio decoding unit that extracts and decodes the audio data accumulated in the jitter buffer. The method comprises: a thinning incomplete flag setting step of dividing the audio data into a plurality of division units when accumulating the audio data in the jitter buffer, and setting a thinning incomplete flag for each division unit; and a thinning completion flag change setting step of monitoring the total amount of audio data in the jitter buffer and, when the total amount exceeds a predetermined threshold, applying thinning processing to at least one division unit for which the thinning incomplete flag is set and changing the thinning incomplete flag of that division unit to a thinning completion flag.

A VoIP terminal according to the present invention is a VoIP terminal comprising a jitter buffer that accumulates audio data included in received packets and an audio decoding unit that extracts and decodes the audio data accumulated in the jitter buffer. The terminal comprises: thinning incomplete flag setting means for dividing the audio data into a plurality of division units when the jitter buffer accumulates the audio data, and setting a thinning incomplete flag for each division unit; and thinning completion flag change setting means for monitoring the total amount of audio data in the jitter buffer and, when the total amount exceeds a predetermined threshold, applying thinning processing to at least one division unit for which the thinning incomplete flag is set and changing the thinning incomplete flag of that division unit to a thinning completion flag.

According to the jitter buffer control method and the VoIP terminal of the present invention, when jitter occurs, thinning processing that deletes audio data pieces is performed on each division unit of the audio data. This reduces voice quality degradation such as voice interruption.

An embodiment of the present invention will be described in detail with reference to the accompanying drawings.

FIG. 1 shows an embodiment of the present invention and illustrates the overall configuration including a VoIP terminal. The VoIP terminal 30 is connected to the VoIP network 10 and provides a voice communication function using VoIP technology. The VoIP terminal 30 includes a packet receiving unit 40, a jitter buffer 50, an audio decoding unit 60, and a jitter buffer control unit 70.

The packet receiving unit 40 receives packets arriving from the VoIP network 10 and, under the control of the jitter buffer control unit 70, accumulates the audio data contained in those packets in the jitter buffer 50. The jitter buffer 50 is a memory that operates as a FIFO, so that audio data that arrived and was accumulated first is taken in by the audio decoding unit 60 first. The audio decoding unit 60 takes in the audio data accumulated in the jitter buffer 50 and decodes it to restore the voice.

The jitter buffer control unit 70 controls the write and read operations of the jitter buffer 50. It includes a counter 71 that counts the audio data accumulated in the jitter buffer 50 in units of packets, the packet being the division unit; a threshold table 72 holding the packet-count threshold that triggers thinning processing; a thinning rate table 73 holding the thinning rate, which specifies the proportion of audio data pieces to thin out; and a thinning execution flag G indicating whether thinning processing is in progress. In this embodiment, a flag G value of 1 indicates that thinning processing has been requested or is in progress, and a value of 0 indicates that it has not been requested and is not in progress.

FIG. 2 shows a configuration example of the jitter buffer shown in FIG. 1. In this example, audio data of division units DD1 to DD10, each corresponding to one packet, is accumulated, for a total of 10 packets. One thinning completion flag F is attached to each division unit. A flag F value of 1 indicates that thinning processing has already been applied to the division unit; a value of 0 serves as the thinning incomplete flag and indicates that thinning processing has not yet been applied to it.

The division unit to which a thinning completion flag F is attached is not limited to the audio data of one packet. It may be the audio data of a plurality of packets, or the audio data of one packet may be further divided, with a thinning completion flag F attached to each part.

FIG. 3 shows one of the division units shown in FIG. 2. The example division unit DDn (n is a positive integer) consists of, for example, 100 audio data pieces. With thinning processing performed at a thinning rate of 0.1, one audio data piece (shown filled in the figure) is deleted out of every ten, as illustrated. One audio data piece may be the audio data of one sample in the audio coding, or audio data covering a predetermined number of samples.
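The thinning of a single division unit can be sketched as follows. This is a minimal illustration, not the patent's implementation: the function name `thin_unit` is hypothetical, the choice to delete the last piece of each group of ten is an assumption (the figure is not reproduced here), and the rate is assumed to be close to 1/k for an integer k.

```python
def thin_unit(pieces, rate):
    """Return a copy of `pieces` with about `rate` of them removed, evenly spread.

    Assumption: the rate is approximated as "one piece in every k",
    with k = round(1 / rate), and the k-th piece of each group is dropped.
    """
    if not 0 <= rate < 1:
        raise ValueError("rate must be in [0, 1)")
    if rate == 0:
        return list(pieces)
    step = round(1 / rate)  # rate 0.1 -> drop one piece in every 10
    return [p for i, p in enumerate(pieces) if (i + 1) % step != 0]


unit = list(range(1, 101))       # 100 audio data pieces, numbered 1..100
thinned = thin_unit(unit, 0.1)   # pieces 10, 20, ..., 100 are removed
print(len(thinned))              # 90
```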

Various forms are conceivable for how audio data pieces are discretely deleted from one division unit. For example, N consecutive audio data pieces (N is a positive integer) at an arbitrary position within one division unit may be deleted. The specific deletion form and the value of the thinning rate need to be determined from actual experimental results. For example, deleting 50 pieces, whether consecutive or scattered, from the 100 consecutive audio data pieces in a division unit DDn affects the quality of the reproduced voice to a degree that can no longer be ignored, so the thinning rate should preferably not exceed 50%. Even at a low thinning rate (for example, 10%), deleting consecutive audio data pieces (for example, the 91st to 100th pieces in FIG. 3) removes a contiguous block of audio and affects voice quality more than deleting scattered pieces, so the pieces to be deleted should preferably be distributed.
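To make the difference between contiguous and distributed deletion concrete, the sketch below measures the longest run of consecutively deleted pieces for the two 10% patterns mentioned in the text. The helper name `longest_deleted_run` is hypothetical, and using run length as a stand-in for audible impact is an assumption made only for illustration.

```python
def longest_deleted_run(deleted_positions):
    """Length of the longest run of consecutive positions in `deleted_positions`."""
    longest = run = 0
    prev = None
    for pos in sorted(deleted_positions):
        # Extend the current run if this position directly follows the previous one.
        run = run + 1 if prev is not None and pos == prev + 1 else 1
        longest = max(longest, run)
        prev = pos
    return longest


contiguous = range(91, 101)       # pieces 91..100 deleted as one block
distributed = range(10, 101, 10)  # pieces 10, 20, ..., 100 deleted

print(longest_deleted_run(contiguous))   # 10
print(longest_deleted_run(distributed))  # 1
```

Both patterns remove the same 10 pieces out of 100, but the distributed pattern never removes two adjacent pieces.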

FIG. 4 shows the procedure of the audio data accumulation process in the jitter buffer control method according to the present invention. When a voice packet arrives at the packet receiving unit of the VoIP terminal, the packet receiving unit accumulates only the audio data portion of the received packet in the jitter buffer.

The jitter buffer control unit 70 constantly monitors whether a new packet has arrived (step S1). When the packet receiving unit accumulates audio data in the jitter buffer, the jitter buffer control unit 70, as its control operation on the packet receiving unit, treats the audio data of that received packet as one division unit, sets its thinning completion flag F to 0, and increments the counter by 1 (step S2). Although not shown in the flowchart, the jitter buffer control unit 70 decrements the counter by 1 each time the audio decoding unit takes in one packet's worth of audio data. The counter value therefore indicates the total amount of audio data accumulated in the jitter buffer.

Next, the jitter buffer control unit 70 determines whether the counter value is equal to or greater than the threshold (step S3). The counter value fluctuates with jitter. For example, when several packets delayed by jitter arrive at the packet receiving unit within a short time, the audio decoding unit cannot keep up, audio data accumulates in the jitter buffer, and the counter value may reach or exceed the threshold, for example 10.

If the counter value is equal to or greater than the threshold in step S3, the thinning execution flag G is set to 1 (step S4) and the process returns to step S1. Setting flag G to 1 causes the thinning process described later to be executed. If the counter value is below the threshold, the jitter buffer control unit 70 returns to step S1 and waits until a new packet arrives at the packet receiving unit of the VoIP terminal.
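Steps S2 to S4 of the accumulation process, together with the counter decrement on the decoder side, can be sketched as follows. The class and method names (`DivisionUnit`, `JitterBufferControl`, `on_packet`, `on_decoder_fetch`) are hypothetical, and the event-driven structure stands in for the polling loop of step S1; this is an illustrative sketch under those assumptions.

```python
from collections import deque
from dataclasses import dataclass


@dataclass
class DivisionUnit:
    pieces: list
    thinned: bool = False  # flag F: False = 0 (not yet thinned), True = 1 (thinned)


class JitterBufferControl:
    def __init__(self, threshold=10):
        self.fifo = deque()              # jitter buffer, oldest unit on the left
        self.counter = 0                 # counter 71: buffered division units
        self.threshold = threshold       # value taken from threshold table 72
        self.thinning_requested = False  # flag G

    def on_packet(self, audio_pieces):
        """Called when a packet arrives (the event that step S1 waits for)."""
        # S2: store one packet's audio as one division unit with F = 0.
        self.fifo.append(DivisionUnit(list(audio_pieces)))
        self.counter += 1
        # S3/S4: request thinning when the counter reaches the threshold.
        if self.counter >= self.threshold:
            self.thinning_requested = True

    def on_decoder_fetch(self):
        """Called when the decoder takes one division unit out of the buffer."""
        unit = self.fifo.popleft()
        self.counter -= 1
        return unit


ctrl = JitterBufferControl(threshold=3)
for _ in range(3):
    ctrl.on_packet(range(100))
print(ctrl.counter, ctrl.thinning_requested)  # 3 True
```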

FIG. 5 shows the procedure of the thinning process in the jitter buffer control method according to the present invention. The thinning process is executed by the jitter buffer control unit 70 and can run in parallel with the audio data accumulation process shown in FIG. 4.

The jitter buffer control unit 70 constantly monitors whether the thinning execution flag G is 1 (step S31). While flag G is 0 it continues the monitoring of step S31; when flag G is 1 it executes the following thinning process.

First, the jitter buffer control unit 70 extracts from the jitter buffer a division unit (one packet's worth) whose thinning completion flag is 0 (step S32). FIG. 6 shows a state in which division units with the thinning completion flag still set to 0 remain. Next, it deletes some of the audio data pieces in that division unit according to the thinning rate (step S33). For example, at a thinning rate of 0.1, one audio data piece is deleted for every 10 (see FIG. 3). It then sets the thinning completion flag of that division unit to 1 (step S34). FIG. 7 shows the state in which the thinning completion flag F is set to 1 for all division units.

Next, the jitter buffer control unit 70 determines whether any other division unit has a thinning completion flag F of 0 (step S35). If so, it repeats steps S33 and S34 for that division unit. If no division unit with flag F equal to 0 remains, the jitter buffer control unit 70 sets the thinning execution flag G to 0 and ends the thinning process (step S36).
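Steps S31 to S36 can be sketched as a single pass over the buffered division units. The names `DivisionUnit` and `run_thinning` are hypothetical, the evenly spaced deletion pattern is an assumption, and the function returns the new value of flag G rather than mutating shared state; this is a sketch of the described flow, not the patent's implementation.

```python
from dataclasses import dataclass


@dataclass
class DivisionUnit:
    pieces: list
    thinned: bool = False  # flag F


def run_thinning(units, rate, thinning_requested):
    """Thin every unit whose flag F is 0 and return the new value of flag G."""
    if not thinning_requested:       # S31: nothing to do while G is 0
        return False
    if not 0 < rate < 1:
        raise ValueError("rate must be in (0, 1)")
    step = round(1 / rate)           # assumption: drop one piece in every `step`
    for unit in units:               # S32/S35: visit each unit with F = 0
        if unit.thinned:
            continue                 # already thinned, never thinned twice
        unit.pieces = [p for i, p in enumerate(unit.pieces)
                       if (i + 1) % step != 0]   # S33: delete pieces
        unit.thinned = True          # S34: F = 1
    return False                     # S36: G = 0


buffer = [DivisionUnit(list(range(100))) for _ in range(10)]
g = run_thinning(buffer, 0.1, thinning_requested=True)
print(sum(len(u.pieces) for u in buffer), g)  # 900 False
```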

As a result of the thinning process by the jitter buffer control unit 70, the audio decoding unit has less audio data to decode, so it fetches each division unit (one packet's worth of audio data) from the jitter buffer earlier than it otherwise would, and the delay is progressively recovered.
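As a rough worked example of this delay recovery, assume (these figures are illustrative, not stated for the embodiment) that each division unit carries T = 20 msec of audio, that N = 10 division units are buffered, that all of them are thinned at rate r = 0.1, and that every audio data piece represents an equal share of playback time. The buffered playback duration D then changes as:

```latex
\begin{align*}
D_{\text{before}} &= N\,T = 10 \times 20\ \text{msec} = 200\ \text{msec} \\
D_{\text{after}}  &= N\,(1 - r)\,T = 10 \times 0.9 \times 20\ \text{msec} = 180\ \text{msec} \\
D_{\text{before}} - D_{\text{after}} &= N\,r\,T = 20\ \text{msec}
\end{align*}
```

Under these assumptions one pass of thinning recovers about one packet interval of delay without discarding any whole packet.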

In the embodiment described above, instead of dropping several tens to several hundreds of consecutive audio data pieces at once, audio data pieces are deleted within each division unit of the audio data, so the missing pieces are distributed rather than concentrated in one place. In addition, because a thinning completion flag is set on each thinned division unit, no division unit is thinned twice. This speeds up audio decoding while suppressing voice quality degradation such as interruptions.

Embodiments of the present invention are not limited to performing thinning processing on every division unit whose thinning completion flag F is 0. They include forms in which thinning processing is applied to at least one division unit, and, for example, the number of division units to be thinned may be adjusted according to how far the total amount of audio data exceeds the threshold. Thinning processing may also be applied preferentially to newer division units or to older division units.
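One way these variations could be combined is sketched below. The function name `select_units_to_thin`, the rule "thin one unit per packet at or above the threshold", and the `oldest_first` switch are all hypothetical choices made for illustration; the text leaves the exact policy open.

```python
def select_units_to_thin(flags, counter, threshold, oldest_first=True):
    """Pick indices of division units to thin.

    flags: one boolean per buffered unit (flag F), index 0 is the oldest unit.
    Assumed policy: the number of units thinned grows with the excess
    over the threshold (excess + 1 units).
    """
    excess = counter - threshold
    if excess < 0:
        return []                      # below the threshold: no thinning
    budget = excess + 1
    candidates = [i for i, done in enumerate(flags) if not done]
    if not oldest_first:
        candidates.reverse()           # prefer the newest units instead
    return candidates[:budget]


flags = [True, False, False, False]    # the oldest unit is already thinned
print(select_units_to_thin(flags, counter=5, threshold=4))                      # [1, 2]
print(select_units_to_thin(flags, counter=5, threshold=4, oldest_first=False))  # [3, 2]
```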

FIG. 1 is a block diagram showing an embodiment of the present invention and the overall configuration including a VoIP terminal according to the present invention.
FIG. 2 is a schematic diagram showing a configuration example of the jitter buffer shown in FIG. 1.
FIG. 3 is a schematic diagram showing one of the division units shown in FIG. 2.
FIG. 4 is a flowchart showing the procedure of the audio data accumulation process in the jitter buffer control method according to the present invention.
FIG. 5 is a flowchart showing the procedure of the thinning process in the jitter buffer control method according to the present invention.
FIG. 6 is a diagram showing the state of the jitter buffer when some division units have not yet been thinned.
FIG. 7 is a diagram showing the state of the jitter buffer when thinning has been completed for all division units.

Explanation of symbols

10 VoIP network
30 VoIP terminal
40 Packet receiving unit
50 Jitter buffer
60 Audio decoding unit
70 Jitter buffer control unit
71 Counter
72 Threshold table
73 Thinning rate table
F Thinning completion flag
G Thinning execution flag

Claims

1. A jitter buffer control method in a VoIP terminal comprising a jitter buffer that accumulates audio data included in a received packet and an audio decoding unit that extracts and decodes the audio data accumulated in the jitter buffer, the method comprising:
a thinning incomplete flag setting step of dividing the audio data into a plurality of division units when accumulating the audio data in the jitter buffer, and setting a thinning incomplete flag for each division unit; and
a thinning completion flag change setting step of monitoring the total amount of audio data in the jitter buffer and, when the total amount of audio data exceeds a predetermined threshold, extracting at least one division unit for which the thinning incomplete flag is set, applying thinning processing to that division unit, and changing the thinning incomplete flag of that division unit to a thinning completion flag.

2. The jitter buffer control method according to claim 1, wherein the thinning completion flag change setting step performs the thinning processing according to a thinning rate set in advance.

3. The jitter buffer control method according to claim 1, wherein the thinning incomplete flag setting step treats the audio data of one received packet as one division unit.

4. A VoIP terminal comprising a jitter buffer that accumulates audio data included in a received packet and an audio decoding unit that extracts and decodes the audio data accumulated in the jitter buffer, the VoIP terminal further comprising:
thinning incomplete flag setting means for dividing the audio data into a plurality of division units when the jitter buffer accumulates the audio data, and setting a thinning incomplete flag for each division unit; and
thinning completion flag change setting means for monitoring the total amount of audio data in the jitter buffer and, when the total amount of audio data exceeds a predetermined threshold, applying thinning processing to at least one division unit for which the thinning incomplete flag is set and changing the thinning incomplete flag of that division unit to a thinning completion flag.

5. The VoIP terminal according to claim 4, wherein the thinning completion flag change setting unit performs the thinning process according to a thinning rate set in advance.

6. The VoIP terminal according to claim 4, wherein the thinning incomplete flag setting means treats the audio data of one received packet as one division unit.