JP2006074555A

JP2006074555A - Audio / video adjustment method for multimedia gateway

Info

Publication number: JP2006074555A
Application number: JP2004256789A
Authority: JP
Inventors: Yuichiro Chikamatsu; 裕一郎近松
Original assignee: Fujitsu Ltd
Current assignee: Fujitsu Ltd
Priority date: 2004-09-03
Filing date: 2004-09-03
Publication date: 2006-03-16

Abstract

[Problem] The present invention relates to an audio/video adjustment method in a multimedia gateway located at the connection point between an IP network and a multimedia mobile terminal network. Its object is to adjust audio and video to optimal rates when audio and video from the IP network are multiplexed and transmitted in fixed-period frames at 64 Kbps.
[Solution] A multiplexing processing unit includes a scheduler that controls the generation of multiplexed data and a multiplexing table that stores combinations of audio rate and video rate within each fixed-length frame. For transmission to a mobile terminal, the multiplexing table is set so that the audio rate is the lowest rate and the remaining capacity is used for video and overhead, and the scheduler generates the multiplexed data using the audio rate and video rate held in the multiplexing table.
[Selected drawing] Figure 1

Description

The present invention relates to an audio/video adjustment method in a multimedia gateway located at the connection point between an IP network and a multimedia mobile terminal network capable of sending and receiving audio and video over 64 Kbps-based 3G-324M (3GPP).

In recent years, advances in mobile phone technology have driven a shift from conventional, voice-centric second-generation mobile communication to third-generation communication that integrates voice and video. In particular, carriers that have adopted W-CDMA (Wideband Code Division Multiple Access), a standard produced by the third-generation (3G) mobile communication standardization project, conform to 3GPP (3rd Generation Partnership Project) and use 3G-324M, the audiovisual communication standard established by 3GPP. Under 3G-324M, video data is encoded with MPEG-4 (Moving Picture Experts Group; a coding scheme defined by the International Organization for Standardization (ISO) that covers coding for low-speed lines and advanced video formats) or H.263 (a video compression coding scheme recommended by ITU-T for low-bit-rate video conferencing and videophones), and audio data is encoded with AMR (Adaptive Multi-Rate).

Multimedia communication using 3GPP has attracted attention as a more efficient multimedia communication technology, because it enables communication by multiplexing audio and video within a limited bandwidth of 64 Kbps.

Furthermore, with the explosive spread of IP networks in recent years, multimedia communication over inexpensive IP networks has also become common, and multimedia services that connect these IP networks to mobile networks using 3G-324M are beginning to spread. The problem in connecting an IP network to a 3G-324M mobile network is how to multiplex audio converted to 64 Kbps G.711 on the IP side (an ITU-T speech coding recommendation that samples speech at 8 kHz by PCM, represents each sample with 8 bits, and thus yields a 64 Kbps bit rate) together with video encoded in MPEG4 or H.263 onto 64 Kbps 3G-324M while maintaining high audio and picture quality.
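The 64 Kbps figure for G.711 follows directly from the sampling parameters quoted above. A small Python check (the variable names are illustrative and do not come from the patent):

```python
# G.711 PCM parameters as stated in the text.
SAMPLE_RATE_HZ = 8_000   # 8 kHz sampling
BITS_PER_SAMPLE = 8      # 8 bits per sample
CHANNEL_BPS = 64_000     # 3G-324M bearer rate

g711_bps = SAMPLE_RATE_HZ * BITS_PER_SAMPLE
left_for_video_bps = CHANNEL_BPS - g711_bps

print(g711_bps)            # 64000
print(left_for_video_bps)  # 0
```

G.711 audio on its own would fill the whole bearer, which is why the gateway has to re-encode the audio at a lower rate before video can be multiplexed alongside it.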

On 3G-324M, it is common to use AMR, a compressed speech coding scheme, for audio and MPEG4 or H.263 for video. Because AMR is a variable-rate coding scheme with multiple rates, multiplexing audio and video within 64 Kbps, the maximum rate of 3G-324M, forces a trade-off: the video rate can be raised when audio uses its lowest rate and must be lowered when audio uses its highest rate, so audio quality and video quality are in conflict.
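The trade-off can be made concrete with a short sketch. The eight rates below are the standard AMR narrowband modes; the overhead value is only an assumed placeholder, since the text does not state how much of the 64 Kbps is consumed by multiplexing overhead.

```python
# The eight AMR narrowband codec modes, in bit/s.
AMR_MODES_BPS = [4750, 5150, 5900, 6700, 7400, 7950, 10200, 12200]

CHANNEL_BPS = 64_000
OVERHEAD_BPS = 9_250  # ASSUMPTION: placeholder for multiplexing overhead


def video_budget_bps(amr_bps, overhead_bps=OVERHEAD_BPS):
    """Bit rate left for video once audio and overhead are accounted for."""
    return CHANNEL_BPS - overhead_bps - amr_bps


for amr in AMR_MODES_BPS:
    print(f"audio {amr:>6} bit/s -> video {video_budget_bps(amr):>6} bit/s")
```

Under this assumed overhead, moving from the lowest to the highest AMR mode reduces the video budget from 50000 bit/s to 42550 bit/s.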

Thus, in the conventional method, the audio and video rates within the limited 64 Kbps were fixed in advance, so raising one necessarily lowered the other.

Figure 13 shows the configuration of a system that includes a conventional multimedia gateway. In the figure, 80 is a multimedia mobile terminal; 81 is a multimedia mobile terminal network that carries audio and video using a coding scheme based on 3G-324M or H.263; 82 is a multimedia gateway placed between the multimedia mobile terminal network 81 and the IP network, which relays multimedia information between terminals or servers on the two networks; 83 is a call agent (CA) that performs audio and video call control on the IP network by controlling a server that handles SIP (Session Initiation Protocol, a signaling protocol for starting, modifying, and terminating sessions on an IP network) and a device that handles MEGACO (Media Gateway Control, a protocol for controlling media gateways); 84 is an IP network that communicates using IP (Internet Protocol); and 85 is a multimedia distribution server that is connected to the IP network and distributes multimedia information such as audio and video.

As shown in Figure 13, the multimedia mobile terminal 80 can send and receive video as well as audio in both directions, and multimedia mobile terminals 80 accommodated in the multimedia mobile terminal network 81 communicate with each other by audio and video under the 3G-324M standard. A service is also provided in which the multimedia distribution server 85, connected to the IP network 84, distributes audio and video as multimedia information to the multimedia mobile terminal 80. In that case, the multimedia distribution server 85 sends the audio and video information to the IP network 84 in separate forms, as audio packets 85a and video packets 85b. An audio packet 85a conforms to the protocol stack and coding scheme IP/UDP (User Datagram Protocol, the layer above IP, which carries destination application information)/RTP (Realtime Transport Protocol)/G.711 (a coding scheme corresponding to a 64 Kbps signal obtained by sampling speech at 8 kHz and digitizing it). A video packet 85b uses the same IP/UDP/RTP protocols as audio (the header contains information that distinguishes audio from video) but uses MPEG4 or H.263 as the coding scheme. These audio and video packets travel alternately over the same route through the IP network 84 and enter the multimedia gateway 82.

Figure 14 shows the configuration of the multimedia gateway to which the present invention applies. It contains a mechanism that carries downstream signals, sent from the IP network to the multimedia terminal, and a mechanism that carries upstream signals, sent from the multimedia terminal to the IP network. In the downstream mechanism, 820a is an audio packet termination unit that terminates IP/UDP/RTP/G.711 audio packets from the IP network; 820b is a video packet termination unit that terminates video packets from the IP network; 821a and 821b are jitter buffers for audio data and video data, respectively; 822 is an audio CODEC processing unit; 823 is an audio buffer; and 824 is a 3G-324M processing unit that multiplexes the audio data in the audio buffer 823 with the video data in the jitter buffer 821b and converts them into frames of the 3G-324M standard (64 Kbps). In the upstream mechanism, 825 is a 3G-324M termination unit that terminates and demultiplexes frames containing audio and video from the multimedia terminal; 826a is an H.245 processing unit that handles call control information under H.245 (an end-to-end communication control protocol); 826b is an audio extraction unit that extracts audio data; 826c is a video extraction unit that extracts video data; 827 is a CPU that receives the call control information and performs call control by communicating with the CA (83 in Figure 13) using MEGACO; 828 is a CODEC that receives the AMR-encoded audio data from the 3G-324M side and converts it to a G.711-encoded signal; and 829 is an IP packetization unit that takes the audio data from the CODEC 828 and the video data from the video extraction unit 826c and sends them out as separate audio (RTP) packets conforming to IP/UDP/RTP/G.711 and video (RTP) packets conforming to IP/UDP/RTP/MPEG4 or H.263.

The 64 Kbps 3G-324M data arriving from the multimedia mobile terminal network (81 in Figure 13) is separated by the 3G-324M termination unit 825 into audio (AMR) 826b, video (MPEG4 or H.263) 826c, and data (H.245 call control data) 826a. The audio is converted by the codec 828 from AMR code to G.711 code. The data (H.245) is forwarded to the CPU 827 after NSRP (Numbered Simple Retransmission Protocol) and CCSRL (Control Channel Segmentation and Reassembly Layer) have been terminated, and the CPU carries out the H.245 negotiation in coordination with the control received via MEGACO. The audio converted to G.711 is RTP-packetized (IP/UDP/RTP) by the IP packetization unit 829 and forwarded onto the IP network. The video undergoes no codec processing; it is RTP-packetized (IP/UDP/RTP) as is by the IP packetization unit 829 and forwarded onto the IP network.

Meanwhile, the audio and video distributed from the multimedia distribution server 85 are encoded in G.711 and in MPEG4 or H.263, respectively, and carried as packets over the IP network 84. They are terminated at the IP/UDP/RTP termination units 820a and 820b of the multimedia gateway shown in Figure 14, where the IP/UDP/RTP protocol headers are removed, and the encoded audio and video data are stored in the jitter buffers 821a and 821b to absorb jitter. The MPEG4 (or H.263) encoded video data is fed to the 3G-324M processing unit 824. The G.711 encoded audio data is AMR-encoded by the CODEC processing unit 822 at a predetermined AMR rate and stored in the audio buffer 823, whose output is fed to the 3G-324M processing unit 824. The 3G-324M processing unit 824 combines (multiplexes) the AMR audio data from the audio buffer 823, the MPEG4 video data, and the data (call control data) from the CPU 827, generates multiplexed data (MUX-PDU: Multiplex Protocol Data Unit) in a frame structure whose bit count corresponds to the 64 Kbps rate, and sends it to the multimedia mobile terminal 80 via the multimedia mobile terminal network 81. Conventionally, the sending side transmits in advance to the receiving side a multiplexing table that specifies the rates (allocations) of the audio, video, and control information (H.245) data, so that the receiving side can use the table to demultiplex in step with the sender's multiplexing.

For downstream transmission, the multiplexing table is sent from the multimedia gateway (3G-324M processing unit) to the multimedia mobile terminal; for upstream transmission, it is sent from the multimedia mobile terminal to the multimedia gateway. In this table, the audio and video rates are set to predetermined values within the limited range of 64 Kbps.

As prior art, there is a multimedia information transmission apparatus that includes means for multiplexing video, still images, audio, and the like and transmitting them as an encoded bitstream over the Internet and a mobile phone network, and means for transmitting a scene description for composing and playing back the encoded data as a single scene, in which the stream and the scene description are sent over different transmission paths or at different times (see Patent Document 1). A mobile phone with a function for sending and receiving moving images under the MPEG4 standard and a function for receiving terrestrial digital broadcasts is also known (see Patent Document 2).
JP 2003-299044 A
JP 2003-153111 A

The technique of Patent Document 1 transmits a multiplexed encoded bitstream and a scene description, and that of Patent Document 2 is a mobile phone with functions for sending and receiving MPEG4 images and receiving terrestrial digital broadcasts; neither suggests efficiently controlling audio and video rates within a fixed bandwidth (64 Kbps).

In the conventional method described with Figures 13 and 14, audio and video must be multiplexed within 64 Kbps, and it is difficult to multiplex both at high quality.

In addition, the video and audio rates are fixed for every terminal connected to the gateway by a predetermined multiplexing table. Changing them according to conditions would require constantly repeating a sequence of steps (detecting the state of the audio and video, computing the ratio with an algorithm, and sending a notification to change the table), which imposes a heavy load. As a result, the gateway cannot adapt to the characteristics of each mobile terminal or to the terminal's environment, and connectivity problems arise with some terminals (some models support only certain rates and therefore cannot connect).

Moreover, the video rate output by the distribution server is only a maximum rate; if the average rate is well below it, line resources are not being used effectively. For example, if the video output rate is set to 50 Kbps, the possible audio output rate, allowing for 3G-324M overhead (management bits and the like), is 4.75 Kbps. The video output rate is a maximum, however, and the actual average rate can be expected to be about 40 Kbps. In that case, capacity equivalent to 10 Kbps is wasted, which is inefficient.
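The arithmetic behind this example is simple enough to check directly. The sketch below uses only the figures given in the text; the variable names are illustrative.

```python
CHANNEL_BPS = 64_000
VIDEO_RESERVED_BPS = 50_000  # maximum video rate configured in the example
VIDEO_AVERAGE_BPS = 40_000   # average rate the text expects in practice

unused_bps = VIDEO_RESERVED_BPS - VIDEO_AVERAGE_BPS
unused_fraction = unused_bps / CHANNEL_BPS

print(unused_bps)       # 10000
print(unused_fraction)  # 0.15625
```

Roughly one sixth of the bearer is reserved for video that is never sent, which is more than twice the 4.75 Kbps available to audio in the same example.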

Furthermore, because the distribution server simply streams out audio and video at fixed, predetermined rates, there is no means of recognizing the reception state at the mobile terminal and no means of feeding it back, so stable audio and picture quality cannot be ensured.

One way to address this is to provide a video translator (converter) for each channel on the multimedia distribution server or multimedia gateway and have it adjust the video rate variably, so that the audio rate can be chosen freely. However, implementing processing-heavy video translators lowers channel density, enlarges the equipment, and raises its cost.

The object of the present invention is to provide an audio/video adjustment method in a multimedia gateway that can adjust audio and video to appropriate rates within a limited amount of data, for example by maintaining video quality through allocating the maximum video rate in the 3G-324M audio/video multiplexed data (MUX-PDU) for the multimedia mobile terminal network, and by allocating any spare capacity in the video portion of the multiplexed data to audio quality.

Figure 1 shows the principle configuration of the present invention. It shows only the configuration of the multiplexing processing unit in the multimedia gateway, which receives audio packets and video packets as multimedia information from the IP network (84 in Figure 13), for example from the multimedia distribution server 85 in Figure 13, and generates downstream 3G-324M multiplexed data for transmission to a mobile terminal in the multimedia mobile terminal network; the configurations for generating upstream signals and for call control are omitted. In the figure, 1a is the multiplexing processing unit, provided in the multimedia gateway, that multiplexes downstream multimedia signals from the IP network to the multimedia mobile terminal network into 3G-324M; 10 is a multiplexed data storage unit; 11 is a scheduler; 11a is an initial setting means for the multiplexing table; 11b is a monitoring means that monitors the video rate; 11c is a multiplex processing means; 12 is a multiplexing table; 2a is a termination unit; 2b is a jitter buffer; 2c is a codec (CODEC) processing unit; 2d is an audio buffer; 3a is a termination unit; and 3b is a jitter buffer.

In the scheduler 11 of the multiplexing processing unit 1a, the initial setting means 11a sets, for the channel assigned to communication with the connected multimedia mobile terminal (80 in Figure 13), a multiplexing pattern in the multiplexing table 12 in which the audio rate is the lowest rate (for example, the number of bytes corresponding to 4.7 Kbps), the video rate is the maximum (for example, the number of bytes corresponding to 50 Kbps), and the remainder is allocated to overhead (for maintenance and management). An IP/UDP/RTP/G.711 audio packet arriving from the IP network (84 in Figure 13) is terminated at the termination unit 2a, the header of the RTP frame is removed, and the G.711 encoded audio data passes through the jitter buffer 2b to the codec processing unit 2c. Under the control of the scheduler 11, the codec processing unit 2c converts the G.711 code to AMR code (at the lowest rate) according to the multiplexing table 12 and stores it in the audio buffer 2d. A video packet from the IP network is terminated at the termination unit 3a, and the MPEG4 data with the header removed is stored in the jitter buffer 3b.
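The initial setting can be sketched as follows. This is a minimal illustration, not the patent's implementation: it assumes the 20 ms, 160-byte frame of Embodiment 1, uses 4750 bit/s (AMR's lowest mode, which the text rounds to 4.7 Kbps), rounds each allocation up to whole bytes, and uses hypothetical names (`MuxPattern`, `initial_pattern`, `mux_table`) and a hypothetical channel id.

```python
from dataclasses import dataclass

FRAME_BYTES = 160  # 64 Kbps x 20 ms, the frame size used in Embodiment 1
FRAME_MS = 20


@dataclass
class MuxPattern:
    audio_bytes: int
    video_bytes: int
    overhead_bytes: int


def bytes_per_frame(rate_bps, frame_ms=FRAME_MS):
    """Whole bytes needed per frame at rate_bps (rounded up; an assumption)."""
    return (rate_bps * frame_ms + 7999) // 8000


def initial_pattern(min_audio_bps=4750, max_video_bps=50_000):
    """Lowest audio rate, maximum video rate, remainder to overhead."""
    audio = bytes_per_frame(min_audio_bps)
    video = bytes_per_frame(max_video_bps)
    overhead = FRAME_BYTES - audio - video
    if overhead < 0:
        raise ValueError("audio + video exceed the frame")
    return MuxPattern(audio, video, overhead)


# Hypothetical per-channel table: channel id -> pattern in use.
mux_table = {7: initial_pattern()}
print(mux_table[7])
# MuxPattern(audio_bytes=12, video_bytes=125, overhead_bytes=23)
```

Expressing the pattern in bytes per frame rather than in bit rates matches the text's wording ("the number of bytes corresponding to ...") and makes the fixed 160-byte total easy to verify.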

The multiplex processing means 11c of the scheduler 11 looks up, in the multiplexing table 12, the audio and video rates set for the channel assigned to the mobile terminal's communication and identifies that the audio is at the lowest rate. The audio data, converted to lowest-rate AMR code by the codec processing unit 2c and held in the audio buffer 2d, is transferred by the scheduler to the multiplexed data storage unit 10. The video data (MPEG4) stored in the jitter buffer 3b is cut to the data length corresponding to the rate set in the multiplexing table 12 and stored in the multiplexed data storage unit 10. Overhead data of a preset fixed length is also placed in the multiplexed data storage unit 10. With a constant frame length, the multiplexed data is a fixed amount of data corresponding to 64 Kbps.
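The frame assembly described here can be sketched as below. The byte layout (overhead, then audio, then video) and the zero padding are simplifying assumptions made for illustration; they are not the real 3G-324M MUX-PDU format, and `build_frame` is a hypothetical name.

```python
FRAME_BYTES = 160


def build_frame(audio_buf, video_buf, audio_len, video_len):
    """Assemble one fixed-length frame from the audio and video buffers.

    audio_buf and video_buf are bytearrays; consumed bytes are removed.
    """
    overhead_len = FRAME_BYTES - audio_len - video_len
    if overhead_len < 0:
        raise ValueError("pattern does not fit in one frame")

    audio = bytes(audio_buf[:audio_len])
    del audio_buf[:audio_len]
    video = bytes(video_buf[:video_len])
    del video_buf[:video_len]

    # Zero-pad any slot the buffer could not fill so the frame stays 160 bytes.
    return (bytes(overhead_len)
            + audio.ljust(audio_len, b"\x00")
            + video.ljust(video_len, b"\x00"))


audio_buf = bytearray(b"A" * 30)
video_buf = bytearray(b"V" * 300)
frame = build_frame(audio_buf, video_buf, audio_len=12, video_len=125)
print(len(frame), len(audio_buf), len(video_buf))  # 160 18 175
```

Whatever the pattern, the frame length never changes; only the split between the audio and video slots does.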

Communicating at the initially set rates (with the audio code set to the lowest rate) keeps video quality as high as possible. During such operation, the monitoring means 11b of the scheduler 11 monitors the data length of the video data 10b in the multiplexed data storage unit 10. When free space appears (that is, when the amount of video data is smaller than the area corresponding to the maximum rate), it measures the free amount and notifies the multiplex processing means 11c. The multiplex processing means 11c raises the audio coding rate in line with the reported free amount and performs the conversion (patterns with higher audio coding rates are registered in the multiplexing table 12 in advance, so it changes the configured pattern number). Thereafter, when the monitoring means 11b reports that the amount of video data has increased, the multiplex processing means 11c lowers the audio coding rate again.
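One way to picture the monitoring and pattern switching is the sketch below. It assumes a 20 ms frame, a 125-byte video area (50 Kbps), and one pre-registered pattern per AMR mode, with the audio size taken as the AMR bits per frame rounded up to whole bytes. The selection rule (pick the highest pattern that fits) and the name `select_pattern` are assumptions; the text does not specify the rule.

```python
# Pre-registered patterns, indexed by pattern number: bytes of AMR audio
# per 20 ms frame for each AMR mode (bits per frame rounded up to bytes).
AUDIO_BYTES_BY_PATTERN = [12, 13, 15, 17, 19, 20, 26, 31]

VIDEO_SLOT_BYTES = 125  # video area at the maximum rate (50 Kbps x 20 ms)
BASE_AUDIO_BYTES = AUDIO_BYTES_BY_PATTERN[0]


def select_pattern(video_bytes_used):
    """Return the highest pattern number whose audio fits in the base audio
    slot plus whatever the video left unused in this frame."""
    free = max(0, VIDEO_SLOT_BYTES - video_bytes_used)
    budget = BASE_AUDIO_BYTES + free
    best = 0
    for number, audio_bytes in enumerate(AUDIO_BYTES_BY_PATTERN):
        if audio_bytes <= budget:
            best = number
    return best


print(select_pattern(125))  # 0 (video slot full, audio stays at lowest rate)
print(select_pattern(120))  # 3
print(select_pattern(100))  # 7
```

Because the same function is evaluated on every report from the monitor, the audio rate falls back automatically once the video data grows again.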

Although the description above sets the audio to the lowest rate, an optimal audio rate may instead be set in the multiplexing table 12 for each mobile terminal's channel.

Furthermore, by setting a threshold on the video buffer and, when it is exceeded, either selecting a multiplexing table with a higher multiplexing density or reducing header overhead (maintenance and management) to raise the video rate, multiplexed data can be output on the 3G-324M line while keeping both audio and video rates as high as possible.

In addition, when multiplexing audio and video, data for audio evaluation and for image evaluation may also be multiplexed, and the evaluation results fed back from the mobile terminal, to maintain high audio and picture quality.

With the principle of the present invention and Embodiment 1, the multimedia gateway sets audio to the lowest rate and video to the maximum rate, which keeps picture quality at or above a certain level.

With the configuration of Embodiment 2, described later, monitoring the video rate makes effective use of free capacity in the multiplexed data, and raising the AMR rate improves audio quality. With Embodiment 3, described later, holding a table that manages the audio rate sent on each channel makes it possible to send audio at a rate appropriate to each terminal.

With Embodiment 4, described later, a threshold is set on the video buffer; when it is exceeded, either a multiplexing table with a higher multiplexing rate is selected, or the options of the adaptation layer (AL) header of the multiplexed data (MUX-PDU) are dropped to reduce header overhead and raise multiplexing efficiency. This makes it possible to multiplex onto 64 Kbps 3G-324M while keeping both audio and video rates as high as possible.

Furthermore, with Embodiment 5, described later, audio evaluation data is multiplexed in and the evaluation result is fed back from the mobile terminal, so the multimedia gateway can set the audio rate to an appropriate value. With Embodiment 6, image evaluation data is also multiplexed in alongside the audio evaluation data and sent to the mobile terminal, and the evaluation result is fed back from the mobile terminal to the distribution server so that the server sends video data at an appropriate rate; this maintains high audio and picture quality.

Figure 2 shows the configuration of Embodiment 1. It shows only the part of the multimedia gateway that carries multimedia information (audio and video) from the IP network to a mobile terminal in the multimedia mobile terminal network (the downstream direction); the upstream direction is omitted because it is the same as in the conventional system. In Figure 2, 1 is the multimedia gateway, which contains the multiplexing processing unit 1a shown in Figure 1 (made up of the parts numbered 10 to 12). The parts numbered 10 to 12, 2a to 2d, 3a, and 3b in Figure 2 have the same names as the identically numbered parts of the principle configuration in Figure 1, so their description is omitted.

The multiplexed data storage unit 10 in FIG. 2 has the capacity to store one frame of multiplexed data (MUX-PDU: Multiplex Protocol Data Unit) for 3G-324M; FIG. 2 shows the MUX-PDUs of two frames to illustrate the passage of time. In each MUX-PDU, 10a is audio data compressed with AMR, and 10b is video data compressed with MPEG4 or H.263 (MPEG4 is assumed in the description below). In the first embodiment, one frame made up of audio data and video data is 20 ms long, and since the transmission rate is 64 Kbps, one frame consists of 160 bytes. A frame combines 3G-324M audio data (AMR compressed code), video data (MPEG4), and a fixed-length overhead (maintenance and management data). The scheduler 11 controls the codec processing unit 2c so that the G.711 compressed audio data is converted into AMR code at the rate set in the multiplexing table 12.

The multiplexing table 12 holds combinations of audio rate and video rate, each with different values and each associated with a number (the multiplexing table number). When the scheduler 11 transmits multimedia information, it sets the multiplexing table number (called MC) in the header (not shown) of the frame (MUX-PDU). The mobile terminal holds a table equivalent to the multiplexing table 12 of the scheduler 11. On receiving a frame, it extracts the multiplexing table number from the header, identifies the audio rate and video rate, separates the multiplexed data into audio data and video data, and decodes them.
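The role of the multiplexing table number can be illustrated with a minimal Python sketch. The table contents, the one-byte header that carries MC, and the names `MUX_TABLE`, `mux`, and `demux` are assumptions made for illustration only; the actual MUX-PDU header used on 3G-324M is laid out differently.

```python
# Hypothetical table: MC number -> (audio bytes, video bytes) per 20 ms frame.
MUX_TABLE = {
    0: (13, 129),  # audio at the lowest rate, the rest for video
    1: (21, 121),  # more audio, less video
}


def mux(mc: int, audio: bytes, video: bytes) -> bytes:
    """Build a toy frame: 1-byte MC header, audio slot, then video slot."""
    audio_len, video_len = MUX_TABLE[mc]
    if len(audio) != audio_len or len(video) > video_len:
        raise ValueError("payload does not match the table entry")
    # Pad the video slot with zeros so the frame length is fixed.
    return bytes([mc]) + audio + video.ljust(video_len, b"\x00")


def demux(frame: bytes) -> tuple[bytes, bytes]:
    """Receiver side: read MC, look up the same table, split the payload."""
    audio_len, video_len = MUX_TABLE[frame[0]]
    body = frame[1:]
    return body[:audio_len], body[audio_len:audio_len + video_len]
```

Because both ends hold the same table, only the small MC number has to travel with each frame for the receiver to find the audio/video boundary.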

In the first embodiment, the audio rate in the multiplexing table 12 is set in advance to the rate corresponding to the lowest speed, 4.75 Kbps (equivalent to 13 bytes per frame). Of the remaining 147 bytes of the 160-byte frame, the part left after subtracting the overhead of the MUX-PDU (multiplexed data) layer and the AL (adaptation layer) is the video rate that can be placed. The overhead size depends on the multiplexing pattern in use, but if the audio on adaptation layer 2 and the video on adaptation layer 3 are placed in separate MUX-PDUs within a 20 ms interval, it is 18 bytes. In that case the placeable video size is the remainder, 51.6 Kbps (about 129 bytes per frame). A multiplexing table number is assigned to this combination, and the table number, together with the audio and image rates it stands for, is notified to the destination mobile terminal so that the terminal sets it in its own multiplexing table. Selecting the lowest rate for the AMR-coded audio in this way keeps the picture quality of the video high.
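The arithmetic in this paragraph can be checked with a short Python sketch. The function names are hypothetical; the numeric values (64 Kbps, 20 ms, 13 bytes of audio, 18 bytes of overhead) are the ones stated in the text.

```python
FRAME_MS = 20    # frame period stated in the text
LINK_KBPS = 64   # 3G-324M bearer rate stated in the text


def frame_bytes(link_kbps: int = LINK_KBPS, frame_ms: int = FRAME_MS) -> int:
    """Bytes carried in one frame: kbit/s * ms gives bits, divided by 8."""
    return link_kbps * frame_ms // 8


def video_budget(audio_bytes: int = 13, overhead_bytes: int = 18):
    """Return (video bytes per frame, equivalent video rate in Kbps)."""
    video = frame_bytes() - audio_bytes - overhead_bytes
    return video, video * 8 / FRAME_MS


if __name__ == "__main__":
    print(frame_bytes())     # 160
    print(video_budget())    # 129 bytes per frame, about 51.6 Kbps
```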

FIG. 3 shows the processing flow of the first embodiment, which is executed by the scheduler 11. At each fixed 20 ms period the scheduler checks whether it is at the head of the frame, that is, the audio data position (S1 in FIG. 3). If so, it outputs AMR data from the audio buffer (2d in FIG. 2) and places it at the head of the multiplexed data (MUX-PDU) (S2 in FIG. 3). It then outputs video data from the jitter buffer (3b in FIG. 2), places it after the AMR data in the MUX-PDU (S3 in FIG. 3), and ends the process. The flow of FIG. 3 omits the placement of overhead, but overhead is of course placed in the MUX-PDU of every frame.

The first embodiment can multiplex audio and video at the given rates fairly simply, but it cannot change each rate freely to keep both sound quality and picture quality at the best level. The configurations of the second to seventh embodiments, described below, address this.

FIG. 4 shows the configuration of the second embodiment. In the figure, reference numerals 1, 10 to 12, 2a to 2d, 3a, and 3b denote the same parts as the identically numbered parts in the first embodiment shown in FIG. 2, and their description is omitted. Reference numeral 100 denotes the frame imaging process performed in the scheduler 11.

In the second embodiment, the scheduler checks the amount of video data. When it finds free space in the fixed allocation area, it controls the codec processing unit 2c (FIG. 4) so that the rate corresponding to that free space is used to raise the audio rate.

In FIG. 4, before actually mapping data onto the multiplexed data storage unit 10, the scheduler 11 performs the following imaging process. First, it maps audio at the lowest rate, 4.75 Kbps, onto the MUX-PDU at fixed 20 ms intervals. The space left after the overhead is used for video. At this point the scheduler checks the remaining size of the video buffer, holds back enough data that the remainder can be placed evenly over the next and later fixed intervals, and places the video within the current interval. If free space still remains in the fixed interval, the scheduler 11 controls the codec processing unit 2c to raise the audio rate by just enough to fill the free space. If there is no free space at all, the conversion rate of the codec processing unit 2c stays at 4.75 Kbps.

After the codec processing unit 2c has converted the audio to the rate that fills the free space, the audio is placed in the actual MUX-PDU of the multiplexed data storage unit 10 together with the video data taken from the video buffer. In the example of FIG. 4, the AMR conversion rate corresponding to the free space in frame 1 of the imaging process 100 is 7.95 Kbps, and the rate corresponding to the free space in frame 2 is 5.90 Kbps.

Video data is normally sent from the distribution server with the configured rate treated as a maximum (MAX) value, so on average it falls below that maximum. Taking burstiness into account as well, free space always arises when AMR is assumed to run at the lowest rate.

FIG. 5 shows the processing flow of the second embodiment. In the scheduling process, the scheduler first checks at each fixed 20 ms period whether it is at the head of the frame, that is, the audio data position (S1 in FIG. 5). If so, it outputs AMR data from the audio buffer and places it at the head of the multiplexed data (MUX-PDU) (S2), then outputs video data from the jitter buffer (3b in FIG. 4) and places it after the AMR data in the MUX-PDU (S3). Next it determines whether the MUX-PDU has free space (S4). If not, the process ends. If so, the AMR data is re-encoded at a higher rate to use the free space (S5), the AMR data is placed again at the head, and the video is placed immediately after it (S6). The re-encoding of AMR data at a higher rate is carried out by the codec processing unit 2c.
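Steps S4 and S5 amount to picking the highest AMR mode whose frame still fits beside the video. The Python sketch below shows one way to express that choice. The per-mode byte counts follow the common AMR storage framing (payload plus one header byte), which matches the 13 bytes the text gives for 4.75 Kbps, but whether the gateway counts bytes this way for the other modes is an assumption, as are the function name and the fixed 18-byte overhead.

```python
# AMR modes as (rate in Kbps, bytes per 20 ms frame including 1 header byte).
AMR_FRAME_BYTES = [
    (4.75, 13), (5.15, 14), (5.90, 16), (6.70, 18),
    (7.40, 20), (7.95, 21), (10.2, 27), (12.2, 32),
]


def choose_amr_mode(video_bytes: int, frame_bytes: int = 160,
                    overhead_bytes: int = 18):
    """Return the highest (rate, size) whose size fits in the space left
    after video and overhead; never go below the lowest mode."""
    room = frame_bytes - overhead_bytes - video_bytes
    best = AMR_FRAME_BYTES[0]
    for rate, size in AMR_FRAME_BYTES:
        if size <= room:
            best = (rate, size)
    return best
```

For example, with 129 bytes of video no space is left over and the mode stays at 4.75 Kbps, while with 121 bytes of video the 8 spare bytes allow 7.95 Kbps.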

As described above, the second embodiment makes effective use of free resources in the MUX-PDU to raise the audio rate. Audio quality can therefore be improved even though the overall multiplexing rate stays the same.

FIG. 6 shows the configuration of the third embodiment. In the figure, reference numerals 1, 11, 12, 2a to 2d, 3a, and 3b denote the same parts as the identically numbered parts in the first and second embodiments shown in FIGS. 2 and 4, and their description is omitted. Reference numeral 20, introduced in the third embodiment, is a per-channel management table used by the codec processing unit 2c; for each channel it manages the audio rate sent to the mobile terminal using that channel. Reference numerals 10-0, 10-1, and so on represent the state of the multiplexed data storage unit 10 (shown in FIGS. 2 and 4) as it changes for channel 0, channel 1, and so on. Data multiplexing is performed for channels 0, 1, 2, and so on by time division.

The per-channel management table 20 is maintained on the basis of line information set in advance for each channel and of the result of negotiation with the mobile terminal at initial setup.

In the third embodiment, audio packets terminated at the IP/UDP/RTP termination unit 2a pass through the jitter buffer 2b to the codec processing unit 2c, where they are converted to the rate set for the channel according to the contents of the per-channel management table 20, and are then stored in the audio buffer 2d. The scheduler then selects the optimal multiplexing table for each channel, and the data is mapped as the multiplexed data 10-0, 10-1, and so on of each channel.

FIG. 7 shows the processing flow of the third embodiment. At each fixed 20 ms period the scheduler checks whether it is at the head of the frame, that is, the audio data position (S1 in FIG. 7). If so, it has the codec processing unit refer to the per-channel management table and perform the codec conversion (S2), then outputs the resulting AMR data from the audio buffer and places it at the head of the multiplexed data (MUX-PDU) (S3). Next it outputs video from the jitter buffer (3b in FIG. 6), places it after the AMR data in the MUX-PDU (S4 in FIG. 7), and ends.
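A per-channel table lookup of this kind might be sketched in Python as follows. The table contents, the function name `plan_cycle`, and the reuse of the 18-byte overhead from the first embodiment are assumptions for illustration; the AMR frame sizes assume one header byte per 20 ms frame, consistent with the 13 bytes given for 4.75 Kbps.

```python
# Bytes per 20 ms AMR frame for a few modes (payload plus 1 header byte).
AMR_FRAME_BYTES = {4.75: 13, 5.90: 16, 7.95: 21, 12.2: 32}


def plan_cycle(channel_rates: dict, frame_bytes: int = 160,
               overhead_bytes: int = 18) -> dict:
    """For one time-division cycle, return per channel the pair
    (audio bytes, bytes left for video) using that channel's audio rate."""
    plan = {}
    for channel in sorted(channel_rates):
        audio = AMR_FRAME_BYTES[channel_rates[channel]]
        plan[channel] = (audio, frame_bytes - overhead_bytes - audio)
    return plan


# Hypothetical per-channel management table: channel -> audio rate in Kbps.
CHANNEL_TABLE = {0: 4.75, 1: 12.2}
```

With this table, `plan_cycle(CHANNEL_TABLE)` gives channel 0 a 13-byte audio slot and channel 1 a 32-byte audio slot, so each terminal receives the audio rate negotiated for its own channel.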

Previously, the same fixed audio rate was sent to every mobile terminal connected to the multimedia gateway. With the third embodiment, an appropriate audio rate can be sent to each mobile terminal individually, which resolves terminal-specific connectivity problems related to the audio rate.

FIG. 8 shows the configuration of the fourth embodiment. In the figure, reference numerals 1, 10 to 12, 2a to 2d, 3a, 3b, 10-0, 10-1, and 20 denote the same parts as the identically numbered parts in the third embodiment shown in FIG. 6, and their description is omitted. Reference numeral 30, introduced in the fourth embodiment, denotes the individual per-channel video buffers inside the video jitter buffer 3b. Reference numeral 31 is a multiplexing change threshold holding unit, which holds for each channel the threshold at which a rising video rate triggers a change of multiplexing table. Reference numeral 32 is a release threshold holding unit, which holds the threshold at which, after the multiplexing change threshold has been exceeded and the table changed, a falling video rate triggers a return to the original multiplexing table.

Like the third embodiment (FIG. 6), the fourth embodiment has the per-channel management table 20, in which a multiplexing table number corresponding to a combination of audio rate and video rate is set for each channel. For the individual video buffer 30 of each channel in the jitter buffer 3b, a multiplexing change threshold (a value indicating the buffer position that corresponds to a particular rate) is set in the multiplexing change threshold holding unit 31, and a release threshold (a value indicating the buffer position that corresponds to a rate lower than the change threshold) is set in the release threshold holding unit 32.

The scheduler 11 determines the audio rate for each channel according to the per-channel management table 20 and multiplexes accordingly. Video is multiplexed into the rate left over after audio, but the video rate (data volume) may grow and no longer fit. For example, if the audio rate is at its maximum of 12.2 Kbps and the video rate is also at its maximum of 50.4 Kbps, the residual video buffer (data that could not be output) grows and exceeds the threshold held for that channel in the multiplexing change threshold holding unit 31. The scheduler 11 then changes its multiplexing table selection to a setting with a higher multiplexing rate than usual, in an attempt to fit within 64 Kbps. If changing the multiplexing table alone still leaves the individual video buffer 30 over the threshold, as in the case of 12.2 Kbps audio and 50.4 Kbps video, the scheduler 11 takes a further step. When audio data and video data are multiplexed, adaptation layer headers, sequence numbers, error-check data, and the like are attached, and these items are selectable options. In the fourth embodiment, when the threshold in the multiplexing change threshold holding unit 31 is exceeded, this optional overhead is reduced as far as possible to increase the video rate. Concretely, for the audio and video rates above, if both audio and video are multiplexed within a 40 ms (320 byte) interval without the optional headers of adaptation layer 2 (AL2) and of the MUX-PDU, the header overhead is 3 bytes per 160 bytes, and the 50.4 Kbps video fits.
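The capacity argument can be checked with a few lines of Python. This sketch works in raw codec bit rates and converts the header overhead to Kbps; the function names are hypothetical, and using the 18-byte overhead of the first embodiment as the "normal" case here is an assumption.

```python
LINK_KBPS = 64.0
FRAME_MS = 20


def overhead_kbps(overhead_bytes_per_frame: int) -> float:
    """Convert header bytes per 20 ms frame into Kbps."""
    return overhead_bytes_per_frame * 8 / FRAME_MS


def fits(audio_kbps: float, video_kbps: float,
         overhead_bytes_per_frame: int) -> bool:
    """True if audio, video, and header overhead fit in the 64 Kbps link."""
    total = audio_kbps + video_kbps + overhead_kbps(overhead_bytes_per_frame)
    return total <= LINK_KBPS


if __name__ == "__main__":
    print(fits(12.2, 50.4, 18))  # False: 62.6 + 7.2 Kbps exceeds 64 Kbps
    print(fits(12.2, 50.4, 3))   # True: 62.6 + 1.2 Kbps stays under 64 Kbps
```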

The end of the over-threshold condition of an individual buffer is detected when the amount of data in the video buffer 30 falls below the threshold set in the release threshold holding unit 32. In this case the scheduler 11 returns the adaptation layer 2 (AL2) and MUX-PDU headers to normal and, where appropriate, also returns the multiplexing table setting to normal. Restoring the header options and lowering the multiplexing rate keeps the frequency of error occurrence and detection at its normal level.

FIG. 9 shows the processing flow of the fourth embodiment. At each fixed 20 ms period the scheduler checks whether it is at the head of the frame, that is, the audio data position (S1 in FIG. 9). If so, it determines whether the video buffer has exceeded the threshold held in the multiplexing change threshold holding unit 31 (S2). If the threshold is not exceeded, it determines whether the video buffer satisfies the release condition, that is, whether it is below the threshold in the release threshold holding unit 32 (S3). If the condition is met, the scheduler undoes the high-efficiency multiplexing state by restoring the multiplexing table and the option headers (S4). If step S2 finds that the video buffer has exceeded the threshold, the scheduler selects the optimal multiplexing table and removes the option headers to raise MUX-PDU multiplexing efficiency (S5 in FIG. 9), and generates the MUX-PDU (S6).
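The two thresholds form a hysteresis loop: the efficient mode switches on above the change threshold and switches off only once the buffer drops below the lower release threshold. A minimal Python sketch of steps S2 to S5 follows; the class name, field names, and byte-based thresholds are assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass
class VideoBufferControl:
    change_threshold: int    # value held in unit 31 (hypothetical unit: bytes)
    release_threshold: int   # value held in unit 32, lower than unit 31
    efficient: bool = False  # True while option headers are removed

    def __post_init__(self) -> None:
        if self.release_threshold >= self.change_threshold:
            raise ValueError("release threshold must be below change threshold")

    def update(self, buffered_bytes: int) -> bool:
        """Run once per 20 ms period; return whether efficient mode is on."""
        if buffered_bytes > self.change_threshold:        # S2 -> S5
            self.efficient = True
        elif self.efficient and buffered_bytes < self.release_threshold:
            self.efficient = False                        # S3 -> S4
        return self.efficient
```

Between the two thresholds the previous state is kept, which avoids toggling the header options on every frame when the buffer level hovers near a single limit.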

FIG. 10 shows the configuration of the fifth embodiment. In the figure, reference numerals 1, 10 to 12, 2a to 2d, 3a, and 3b denote the same parts as the identically numbered parts in the first to fourth embodiments of FIGS. 2, 4, 6, and 8, and their description is omitted. Reference numeral 13 is an audio evaluation data generation unit, which stores the audio evaluation data that is set in the multiplexed data storage unit 10 and sent to the destination mobile terminal. Reference numeral 14 is a 3G-324M termination unit, which terminates the (uplink) 3G-324M signal sent from the mobile terminal. Reference numeral 15 is a feedback result analysis unit, and 16 is a CPU that performs call processing by sending and receiving call control data via MEGACO.

In the fifth embodiment, the scheduler 11 adds predetermined data for audio quality evaluation to the real-time audio and video data being communicated and transmits it. It then receives the result of the evaluation performed at the mobile terminal, and the audio rate in the codec processing unit 2c is adjusted according to that feedback. The audio evaluation in the fifth embodiment uses the PSQM (Perceptual Speech Quality Measure) method recommended by ITU-T (International Telecommunication Union Telecommunication Standardization Sector). This method compares the audio sample used for the test with the audio that reaches the receiving side through the network, and expresses the degradation of the received audio relative to the sample as a number.

In FIG. 10, the scheduler 11 places audio at the lowest AMR rate, 4.75 Kbps, at fixed 20 ms intervals. In the remaining space (excluding the overhead), it places video data together with the predetermined audio evaluation data (the audio sample used for the test) supplied by the audio evaluation data generation unit 13. Because the audio evaluation data occupies space that would otherwise hold only video data, the video rate drops somewhat. However, the evaluation data does not need to be multiplexed in every fixed interval; it is multiplexed at a certain period t1 (t1 = 20 ms × n), so the effect on the video is very small. When the mobile terminal on the other side receives the audio evaluation data, it evaluates it with an audio evaluation algorithm that compares the received data against an audio sample stored in advance, and sends the evaluation result to the multimedia gateway in-channel, for example over adaptation layer 2 (AL2) of the uplink multimedia information.
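The claim that the effect on video is small follows from spreading the evaluation data over n frames. The Python sketch below makes that explicit; the function names and the example values (20 bytes of evaluation data, n = 50) are hypothetical and do not come from the source.

```python
FRAME_MS = 20


def carries_eval_data(frame_index: int, n: int) -> bool:
    """True for the one frame in every n that carries evaluation data,
    that is, once per period t1 = 20 ms * n."""
    return frame_index % n == 0


def average_video_loss_kbps(eval_bytes: int, n: int) -> float:
    """Average video rate given up to evaluation data, in Kbps."""
    return eval_bytes * 8 / (FRAME_MS * n)


if __name__ == "__main__":
    # 20 bytes once per second (n = 50) costs about 0.16 Kbps on average.
    print(average_video_loss_kbps(20, 50))
```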

The 3G-324M termination unit 14 of the multimedia gateway extracts the evaluation result, and the feedback result analysis unit 15 analyzes it. If the audio rate needs to be raised or lowered, the codec processing unit 2c is notified, and it raises or lowers the AMR audio rate according to the analysis result from the feedback result analysis unit 15. Raising the audio rate (increasing the amount of audio data) may affect the video data. This can be handled either by using the multiplexing table to reduce the amount of video data or by applying the threshold-based control of the video buffer described in the fourth embodiment (FIGS. 8 and 9).

FIG. 11 shows the processing flow of the fifth embodiment. The flow starts once feedback has been received from the mobile terminal. The scheduler determines whether the feedback result is worse than the previous audio evaluation value (S1 in FIG. 11). If it is not worse, the scheduler instructs the codec processing unit to raise the AMR compression rate (compress more) (S2). If it is worse, the scheduler instructs the codec processing unit to lower the AMR compression rate (compress less) (S3). If the lower compression causes the audio rate to squeeze the video rate (the area for video data shrinks), video buffer control as in the fourth embodiment (FIGS. 8 and 9) is used as well (S4 in FIG. 11). Alternatively, step S4 can be handled without the processing of the fourth embodiment, simply by changing the multiplexing table number to that of a combination with a lower video rate.
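Steps S1 to S3 can be sketched in Python as a move up or down a list of AMR modes. Moving exactly one mode per feedback report and the function name are assumptions; the text states only the direction of the change. Note that more compression means a lower bit rate.

```python
# The eight AMR narrowband modes in Kbps, lowest to highest.
AMR_MODES_KBPS = [4.75, 5.15, 5.90, 6.70, 7.40, 7.95, 10.2, 12.2]


def adjust_mode(mode_index: int, worse_than_previous: bool) -> int:
    """Return the new index into AMR_MODES_KBPS after one feedback report.

    Worse result     -> compress less (S3): one step up in bit rate.
    Not worse result -> compress more (S2): one step down in bit rate.
    The index is clamped to the valid range of modes.
    """
    step = 1 if worse_than_previous else -1
    return max(0, min(len(AMR_MODES_KBPS) - 1, mode_index + step))
```

A caller would keep the previous evaluation value, compare it with each new report to derive `worse_than_previous`, and pass the resulting mode to the codec.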

FIG. 12 shows the configuration of the sixth embodiment. In the figure, reference numerals 1, 10 to 16, 2a to 2d, 3a, and 3b denote the same parts as the identically numbered parts in the fifth embodiment shown in FIG. 10, and their description is omitted. Reference numeral 17 is an image evaluation data generation unit, 18 is an image feedback analysis unit, and 4 is a SIP (Session Initiation Protocol) server.

In the sixth embodiment, predetermined image evaluation data is multiplexed and sent to the mobile terminal in addition to the predetermined audio evaluation data used in the fifth embodiment (FIGS. 10 and 11). Image evaluation in the sixth embodiment uses a marker-embedding absolute evaluation method. In this method an invisible signal (marker) is embedded in the original image to be evaluated, and the receiving side estimates picture quality from the degradation of the marker signal. The receiving side can therefore perform an absolute evaluation that does not require a reference image for comparison.

The image evaluation data from the image evaluation data generation unit 17 is multiplexed at a certain period t2 (t2 = 20 ms × m). It can be sent independently of the audio evaluation data from the audio evaluation data generation unit 13. The mobile terminal on the other side performs audio evaluation using the audio evaluation data and, at the same time, image evaluation using the image evaluation data, and feeds the image result back to the multimedia gateway over adaptation layer 2 (AL2) or the like, separately from the audio evaluation result. In the multimedia gateway, the 3G-324M termination unit 14 separates the image evaluation feedback from the video data, the image feedback analysis unit 18 analyzes its content, and information on raising or lowering the image rate according to the feedback is supplied to the CPU 16. The CPU 16 transfers this information to the SIP server 4 over the MEGACO protocol, and the SIP server 4 feeds the evaluation information back to the multimedia distribution server (85 in FIG. 13). The distribution server then converts the video to an appropriate image rate and resumes transmission at that rate.

With the sixth embodiment, actual evaluation results from the mobile terminal are fed back for video as well as for audio data, so transmission can maintain a high level of sound and picture quality.

(Supplementary note 1) An audio/video adjustment method in a multimedia gateway that is provided at the connection point between a multimedia mobile terminal network, which accommodates mobile terminals capable of sending and receiving audio and video, and an IP network to which a multimedia distribution server is connected, the gateway having a multiplexing processing unit that terminates the IP audio packets and image packets from the distribution server, converts the audio signal into AMR (adaptive variable rate) audio data of the multimedia mobile terminal network, terminates the image packets and extracts the video data, assembles the audio data, the video data, and overhead into fixed-length multiplexed data at a rate of 64 Kbps, and outputs it to the multimedia mobile terminal network at a fixed period, characterized in that: the multiplexing processing unit comprises a scheduler that controls generation of the multiplexed data and a multiplexing table that stores combinations of audio rate and video rate within each fixed length; the multiplexing table is set, for transmission to each mobile terminal, so that the audio rate is the lowest rate and the remaining capacity is used for video and overhead; and the scheduler generates the multiplexed data using the audio rate data and the video rate of the multiplexing table.

(Supplementary note 2) The audio/video adjustment method in a multimedia gateway according to supplementary note 1, characterized in that: means is provided for detecting free space in a multiplexed data storage unit that stores the multiplexed data generated by the multiplexing processing unit; and when the means detects free space, the scheduler controls the means for converting to AMR audio data of the multimedia mobile terminal network so that the audio rate is increased from the lowest rate by the amount that fills the free space.

(Supplementary note 3) The audio/video adjustment method in a multimedia gateway according to supplementary note 1, characterized in that: for a mobile terminal that needs to keep a specific audio rate because of its audio reception characteristics or reception environment conditions, the specific audio rate is set in the multiplexing table for the channel of that mobile terminal; and when multiplexing video at each fixed interval, the scheduler adjusts according to the audio rate set for the relevant channel in the multiplexing table.

(Supplementary note 4) The audio/video adjustment method in a multimedia gateway according to supplementary note 1, further comprising a per-channel video buffer that holds video data obtained by terminating IP image packets from the distribution server, and a multiplexing change threshold holding unit in which a multiplexing change threshold is set for each channel, wherein, when the scheduler detects that the video rate of a channel has increased beyond the multiplexing change threshold set for that channel, the scheduler reduces the optional adaptation layer overhead and increases the video rate.

(Supplementary note 5) The audio/video adjustment method in a multimedia gateway according to supplementary note 4, further comprising a release threshold holding unit in which a release threshold is set for each per-channel video buffer that holds video data obtained by terminating IP image packets from the distribution server, wherein, after the image rate of a channel's video buffer has exceeded the multiplexing change threshold, the scheduler returns the adaptation layer overhead to normal when the image rate decreases and falls below the release threshold.
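
Supplementary notes 4 and 5 together describe a two-threshold (hysteresis) switch. A minimal sketch of that behavior follows; the class name `OverheadController`, the field names, and the numeric values in the usage note are hypothetical, and the sketch assumes the release threshold is not higher than the multiplexing change threshold.

```python
from dataclasses import dataclass


@dataclass
class OverheadController:
    """Per-channel switch between normal and reduced adaptation layer overhead."""
    change_threshold: int    # multiplexing change threshold for the channel
    release_threshold: int   # release threshold for the channel
    reduced: bool = False    # True while the optional overhead is reduced

    def __post_init__(self):
        # Assumption: release must not sit above change, or the switch would flap.
        if self.release_threshold > self.change_threshold:
            raise ValueError("release_threshold must not exceed change_threshold")

    def update(self, video_rate):
        """Feed the current video rate; return True while overhead is reduced."""
        if not self.reduced and video_rate > self.change_threshold:
            self.reduced = True      # note 4: cut optional overhead
        elif self.reduced and video_rate < self.release_threshold:
            self.reduced = False     # note 5: restore normal overhead
        return self.reduced
```

With thresholds of 800 and 500, a rate sequence of 400, 900, 700, 600, 450, 600 switches to reduced overhead at 900 and returns to normal only at 450, so rates between the two thresholds do not toggle the mode.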

(Supplementary note 6) The audio/video adjustment method in a multimedia gateway according to supplementary note 1 or supplementary note 4, wherein the multiplexing processing unit is provided with an audio evaluation data generating unit that generates data for audio quality evaluation; when assembling the audio data, video data, and overhead, the scheduler multiplexes and outputs the data from the audio evaluation data generating unit once in every several of the fixed periods at which the multiplexed data is output; and, on receiving from the mobile terminal side the result of an audio evaluation made using the audio evaluation data, the scheduler adjusts the audio rate of the multiplexed data sent to the mobile terminal according to that result.
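
The "once in every several periods" insertion in supplementary note 6 can be sketched as a simple periodic schedule. The function name `frame_kinds` and the labels `"media"` and `"eval"` are hypothetical, and the interval is a free parameter because the note does not state one. How the scheduler maps an evaluation result to a new audio rate is not specified here and is left out.

```python
def frame_kinds(n_frames, eval_interval):
    """Label each output period as carrying normal media or evaluation data.

    Every eval_interval-th period carries the evaluation data; all other
    periods carry the usual audio/video/overhead payload.
    """
    if eval_interval < 1:
        raise ValueError("eval_interval must be at least 1")
    return [
        "eval" if (i + 1) % eval_interval == 0 else "media"
        for i in range(n_frames)
    ]
```

For instance, `frame_kinds(6, 3)` marks the third and sixth periods for evaluation data and leaves the other four for normal media.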

(Supplementary note 7) The audio/video adjustment method in a multimedia gateway according to supplementary note 6, wherein the multiplexing processing unit is provided with an image evaluation data generating unit that generates image evaluation data; when assembling the video data, the scheduler multiplexes and outputs the data from the image evaluation data generating unit once in every several of the fixed periods at which the multiplexed data is output; on receiving from the mobile terminal side information on the result of an image evaluation made using the image evaluation data, the scheduler transmits the image evaluation result information to the multimedia distribution server connected to the IP network via the path that carries call control information between the IP network and the multimedia gateway; and the multimedia distribution server adjusts the image rate according to the image evaluation result and transmits to the IP network.

A diagram showing the principle configuration of the present invention.
A diagram showing the configuration of Example 1.
A diagram showing the processing flow of Example 1.
A diagram showing the configuration of Example 2.
A diagram showing the processing flow of Example 2.
A diagram showing the configuration of Example 3.
A diagram showing the processing flow of Example 3.
A diagram showing the configuration of Example 4.
A diagram showing the processing flow of Example 4.
A diagram showing the configuration of Example 5.
A diagram showing the processing flow of Example 5.
A diagram showing the configuration of Example 6.
A diagram showing the configuration of a system that includes a conventional multimedia gateway.
A diagram showing the configuration of the multimedia gateway to which the present invention applies.

Explanation of symbols

1a Multiplexing processing unit
10 Multiplexed data storage unit
11 Scheduler
11a Initial setting means
11b Monitoring means
11c Multiplexing means
12 Multiplexing table
2a Termination unit
2b Jitter buffer
2c Codec (CODEC) processing unit
2d Audio buffer
3a Termination unit
3b Jitter buffer

Claims

An audio/video adjustment method in a multimedia gateway that is provided at the connection point between a multimedia mobile terminal network accommodating mobile terminals capable of transmitting and receiving audio and video and an IP network to which a multimedia distribution server is connected, and that has a multiplexing processing unit which terminates IP audio packets and image packets from the distribution server, converts the audio signal into AMR (adaptive variable-rate) audio data for the multimedia mobile terminal network, terminates the image packets to extract video data, assembles the audio data, the video data, and overhead into fixed-length multiplexed data at 64 Kbps, and outputs the multiplexed data to the multimedia mobile terminal network at a fixed period, wherein:
the multiplexing processing unit comprises a scheduler that controls generation of the multiplexed data, and a multiplexing table that stores the combination of audio rate and video rate within each fixed length;
for transmission to each mobile terminal, the multiplexing table is set so that the audio rate is the lowest rate and the remaining capacity is allotted to video and overhead; and
the scheduler generates the multiplexed data using the audio rate data and the video rate in the multiplexing table.

The audio/video adjustment method in a multimedia gateway according to claim 1,
further comprising means for detecting free space in a multiplexed data storage unit that stores the multiplexed data generated by the multiplexing processing unit,
wherein, when the means detects free space, the scheduler controls the means for converting to AMR audio data of the multimedia mobile terminal network so that the audio rate is raised from the lowest rate by the amount that fills the free space.

The audio/video adjustment method in a multimedia gateway according to claim 1,
wherein, for a mobile terminal that must maintain a specific audio rate because of its audio reception characteristics or reception environment conditions, the specific audio rate is set in the multiplexing table for that terminal's channel, and
the scheduler, when multiplexing video at each fixed interval, adjusts according to the audio rate set for the corresponding channel in the multiplexing table.

The audio/video adjustment method in a multimedia gateway according to claim 1,
further comprising a per-channel video buffer that holds video data obtained by terminating IP image packets from the distribution server, and a multiplexing change threshold holding unit in which a multiplexing change threshold is set for each channel,
wherein, when the scheduler detects that the video rate of a channel has increased beyond the multiplexing change threshold set for that channel, the scheduler reduces the optional adaptation layer overhead and increases the video rate.

The audio/video adjustment method in a multimedia gateway according to claim 1,
wherein the multiplexing processing unit is provided with an audio evaluation data generating unit that generates data for audio quality evaluation,
when assembling the audio data, video data, and overhead, the scheduler multiplexes and outputs the data from the audio evaluation data generating unit once in every several of the fixed periods at which the multiplexed data is output, and
on receiving from the mobile terminal side the result of an audio evaluation made using the audio evaluation data, the scheduler adjusts the audio rate of the multiplexed data sent to the mobile terminal according to that result.