JP3651699B2 - Decoding device and encoding / decoding device

Info

Publication number: JP3651699B2
Application number: JP10908995A
Authority: JP
Inventors: 智河上
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 1995-04-09
Filing date: 1995-04-09
Publication date: 2005-05-25
Anticipated expiration: 2020-05-25
Also published as: JPH08289255A

Description

【０００１】
【目次】
以下の順序で本発明を説明する。
産業上の利用分野
従来の技術（図１７）
発明が解決しようとする課題（図１７及び図１８）
課題を解決するための手段（図１〜図１６）
作用（図１〜図１６）
実施例（図１〜図１６）
発明の効果
【０００２】
【産業上の利用分野】
本発明は復号化装置及び符号化復号化装置に関し、例えばＡＶサーバシステムに適用して好適なものである。
【０００３】
【従来の技術】
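Conventionally, there are various methods for digitally compressing video and audio signals; one of those currently attracting the most attention is the MPEG (Moving Picture Experts Group) standard.
When AV (Audio Visual) data compressed according to the MPEG standard is reproduced (decoded), the audio and video must be synchronized. The methods for doing so fall broadly into two types: the implicit synchronization method and the time stamp method.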
【０００４】
暗黙同期方式は、ほぼ同一の時刻、同一時間長に相当する各ＡＶデータをある単位で多重化しておき、１つのＡＶ素材（以下、これをクリツプと呼ぶ）の先頭においてＡＶの頭が合うようにそれぞれのデコード開始時刻を調節することによりＡＶ同期をとる方式である。
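The time stamp method, on the other hand, attaches to each unit of AV data, at multiplexing time, a time stamp indicating its playback time (DTS: Decoding Time Stamp, and PTS: Presentation Time Stamp), and additionally attaches at certain intervals a time stamp indicating the reference time for those time stamps (SCR: System Clock Reference, or PCR: Program Clock Reference). Each decoder achieves AV synchronization by performing reproduction according to these time stamps. The MPEG systems standard adopts this method.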
【０００５】
一般的にタイム・スタンプ方式は暗黙同期方式に比べて精度は高いがしくみが複雑になるため実現コストが高くなる特徴がある。
タイム・スタンプの使用方法についてもう少し追加説明する。図１７に示すように、デマルチプレクサ１は供給されるＡＶ多重化データＤ_Avから基準時刻情報PCR を抽出し、この値に同期するようにシステムクロツクＣＬ１を調整する。このシステムクロツクＣＬ１は、オーデイオデコーダ２及びビデオデコーダ３にそれぞれ与えられる。
【０００６】
またデマルチプレクサ１は、ＡＶ多重化データＤ_Avから抽出したオーデイオデータＤ_Adを再生時刻情報DTS と共にオーデイオデコーダ２に供給し、かつ抽出したビデオデータＤ_Viを再生時刻情報DTS 及びPTS と共にビデオデコーダ３に供給する。
これによりこれらオーデイオデコーダ２及びビデオデコーダ３は、それぞれシステムクロツクＣＬ１と、再生時刻情報DTS 及びPTS とに基づいてこれらを比較することによりいつ再生動作を開始すればよいのかを判断するようになされている。
【０００７】
【発明が解決しようとする課題】
ここで複数のクリツプを連続再生する場合を考える。
タイム・スタンプ方式の場合、図１８に示すように、クリツプのつなぎめ付近の時刻では、すでにデマルチプレクサ１には次のクリツプ（クリツプ２）のデータが入力され始めており、システムクロツクＣＬ１もクリツプ２の基準時刻情報PCR に同期しているが、オーデイオデコーダ２及びビデオデコーダ３ではまだ前のクリツプ（クリツプ１）の処理をしているという時間帯Ｔ₀が存在し得る。この場合各クリツプ１、２の基準時刻情報PCR はそれぞれ独立しているため、この時間帯Ｔ₀にオーデイオデコーダ２及びビデオデコーダ３がシステムクロツクＣＬ１に従つてデコードしようとすると正しく再生されない可能性がある。
【０００８】
このように従来のタイムスタンプ方式では、複数のクリツプを連続して再生し難い問題があつた。
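To solve this problem, a widely used conventional method for continuously reproducing a plurality of clips is to provide two sets of audio decoder 2 and video decoder 3, use the sets alternately, and switch the output of demultiplexer 1 between them. With this method, the time stamp method can also be adopted for AV synchronization.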
【０００９】
しかしながらこのような方法では、オーデイオデコーダ２及びビデオデコーダ３を２組用意する必要があり、またスイツチングを行う装置も必要となるなどシステム規模が大きくなる問題があった。
本発明は以上の点を考慮してなされたもので、簡易な構成で、各クリツプの再生を指定時刻に開始でき、かつ複数のクリツプを連続的に再生することのできる復号化装置及び符号化復号化装置を提案しようとするものである。
【００１０】
【課題を解決するための手段】
かかる課題を解決するため本発明においては、素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータを復号する復号化装置に、データ供給源から供給される、所定の圧縮ビツトレートで圧縮符号化されて多重化された圧縮ビデオデータと圧縮オーデイオデータとを分割して出力する分割手段と、分割手段から出力される圧縮ビデオデータを復号する第１の復号手段と、分割手段から出力される圧縮オーデイオデータを復号する第２の復号手段と、データ供給源から供給される、素材の時間長を示す時間情報と、予め記憶している、素材の時間長毎に発生する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を示す再生時間差データとに基づいて、第２の復号手段の圧縮オーデイオデータに対する復号動作を再生時間差分だけ停止させる制御手段とを設けるようにした。
【００１１】
また本発明においては、予め素材の時間長に対する最小単位を設定しておき、復号化装置に、当該最小単位以上の時間長毎に発生する再生時間差を示す再生時間差データを記憶させるようにした。
【００１２】
【作用】
このように、所定の圧縮ビツトレートで圧縮符号化された圧縮ビデオデータ及び圧縮オーデイオデータで形成される素材の時間長毎に発生する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を、復号化側に再生時間差データとして記憶しておくようにしたことにより、タイムスタンプの有無に係わらず、素材の時間長のみから、当該素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を特定することができ、さらに当該特定した再生時間差分だけ、圧縮オーデイオデータに対する復号動作を停止させることにより、複数の素材を連続再生する際に、各素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータの再生開始時刻を同期させることができる。
【００１３】
また、予め素材の時間長に対する最小単位を設定しておくことにより、素材の時間長と、当該素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差との組み合わせパターンを、最小単位が設定されていない場合と比べて格段と少なくすることができる。
【００１４】
【実施例】
以下図面について、本発明の一実施例を詳述する。
【００１５】
〔１〕第１実施例
（１）利用技術
（１−１）ＭＰＥＧ規格でのビデオ圧縮
（１−１−１）仮想バツフアについて
ＭＰＥＧ規格によるビデオデータの圧縮では、各フレーム（画像）によつて発生するビツト量が変化する。このビツト量が多すぎたり少なすぎたりすると、デコーダ側のバツフアがオーバーフローしたりアンダーフローしたりする可能性がある。
このため、一般的にはデコーダのバツフアをある量に想定し、このバツフアが破綻しないようにエンコーダを駆動制御しながら、供給されるビデオデータを符号化（エンコード）している。以下、このようなバツフアを仮想バツフアと呼ぶ。
【００１６】
（１−１−２）スタートアツプデイレイについて
デコード時において、デコーダバツフアにデータが入力されると直ちに画像が再生されるわけではなく、ある時間が経過したのち再生が開始される。これはデコーダバツフアが破綻するのを防ぐためである。以下、この時間のことをスタート・アツプ・デイレイと呼ぶものとする。
The start-up delay is not the same however the video material is compressed; in general it differs from one video material to another.
図１に仮想バツフア４とスタート・アツプ・デイレイについて示す。この図１においてＴ₁は１フレーム時間、Ｔ₂はスタート・アツプ・デイレイ、Ｖ_bは仮想バツフア４の容量をそれぞれ示している。またここでは、一定速度でビデオデコーダ５にデータを供給することを考えているため、入力データの累積量を示す線は直線となつている。
【００１７】
このような仮想バツフア４では、例えば図２（Ａ）に示すように、所定のスタート・アツプ・デイレイＴ₂よりもデコード開始が遅すぎると、オーバーフローしてしまう危険がある。
一方図２（Ｂ）に示すように、所定のスタート・アツプ・デイレイＴ₂よりもデコード開始が早すぎると、アンダーフローしてしまう危険がある。
【００１８】
（１−１−３）圧縮ビデオデータフアイルのデータ量について
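Hereinafter, the group of data obtained by encoding one video material is called a "compressed video data file". Normally, compressed video data files generated from different video materials of the same time length do not have the same data amount. As is clear from FIGS. 3(A) and 3(B), if V_v is the data amount of the compressed video data file, R_v the video compression bit rate, V_b the capacity of the virtual buffer, and T_4 the time length of the video material, the data amount varies within a range of ± the virtual buffer capacity V_b:
【数１】
$$R_v T_4 - V_b \le V_v \le R_v T_4 + V_b$$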
【００１９】
（１−２）圧縮ビデオデータフアイルの連続再生
１．ビデオデータの圧縮方式としてＭＰＥＧ方式を採用する。
２．ビデオデータの圧縮ビツトレートは一定とする。
３．各圧縮ビデオデータフアイルのスタート・アツプ・デイレイは等しいと限らない。
４．各圧縮ビデオデータフアイルのデータ量はその圧縮ビツトレートと再生時間（ビデオ素材の時間長）との積に等しいとは限らない。
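Under these conditions, the following method is effective for continuously reproducing (decoding) compressed video data files:
1. Provide a video decoder buffer with at least three times the capacity (or number) of the virtual buffer.
2. For the first file of a group of compressed video data files to be reproduced continuously, start decoding only after data equal to the virtual buffer capacity has accumulated in the video decoder buffer.
3. Transfer data to the decoder at a transfer rate that allows the maximum video data file size to be transferred within its playback time.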
【００２０】
以下、この方法に関して説明する。
２つの圧縮ビデオデータフアイルＦ₁、Ｆ₂をこの順に連続再生をする場合を考える。このとき圧縮ビツトレートをともにＲ_vとし、仮想バツフアのサイズをＶ_bとする。また再生時間をそれぞれＴ_F1及びＴ_F2とし、スタート・アツプ・デイレイをそれぞれＴｄ₁、Ｔｄ₂とする。
【００２１】
(1-2-1) Allowing for an indeterminate start-up delay
簡単のため、各圧縮ビデオデータフアイルのデータ量はその圧縮ビツトレートと再生時間との積に等しい（データ量が特定できる）ものと仮定する。
もし、Ｔｄ₁＝Ｔｄ₂であれば、この２つの圧縮ビデオデータフアイルＦ₁、Ｆ₂を連続的に圧縮ビツトレートＲ_vの転送速度でデコーダに供給してやれば、これら２つの圧縮ビデオデータフアイルＦ₁、Ｆ₂は連続的に再生される。
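However, when Td₁ < Td₂, decoding of compressed video data file F₂ starts earlier than Td₂, so the virtual buffer may underflow.
【００２２】
Conversely, when Td₁ > Td₂, decoding of compressed video data file F₂ starts later than Td₂, so the virtual buffer may overflow.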
この場合発生し得る最大のスタート・アツプ・デイレイは、「仮想バツフアの容量分のデータを圧縮ビツトレートと等しい転送速度で転送するのに要する時間（Ｔ_b＝Ｖ_b／Ｒ_v）」に等しい。
【００２３】
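Therefore, for the first file of a group of compressed video data files to be reproduced continuously, the buffer will not underflow if decoding is started after waiting for the time T_b, regardless of that file's start-up delay. In other words, decoding should start once V_b of data has accumulated in the video decoder buffer.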
また発生し得る最小のスタート・アツプ・デイレイは「０」であり、このようなフアイルをＴ_bだけデコードの開始を遅らせると、最悪Ｖ_b分だけバツフアがオーバーフローしてしまう。そこでビデオデコーダバツフアとして、仮想バツフアの２倍用意しておけばオーバーフローすることはない。
【００２４】
（１−２−２）データ量不足の考慮
簡単のため、各圧縮ビデオデータフアイルのスタート・アツプ・デイレイが等しい（スタート・アツプ・デイレイが特定できる）ものと仮定し、図４（Ａ）及び（Ｂ）を参照して説明する。
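In general, the size V_v of a compressed video data file does not equal the product of the compression bit rate R_v and its playback time T (V_v ≠ R_v·T). Therefore, when data is supplied to the decoder at a transfer rate equal to the compression bit rate, the time required for input (V_v / R_v) does not match the time T required for output (reproduction).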
【００２５】
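For compressed video data file F₁, when the time required for decoder input is longer than the time required for output (V_V1 / R_v > T_F1), decoding of compressed video data file F₂ starts later than its start-up delay Td₂, so the buffer may overflow.
Conversely, when the time required for input of compressed video data file F₁ is shorter than the time required for output (V_V1 / R_v < T_F1), decoding of compressed video data file F₂ starts earlier than its start-up delay Td₂, so the buffer may underflow.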
【００２６】
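The maximum compressed video data amount V_vmax that can occur is given by
【数２】
$$V_{vmax} = R_v T + V_b$$
Therefore, when the compressed video data amount actually obtained falls short of this value, "garbage" data D_d is appended to the end of the file so that the size V_v of any file becomes V_vmax. In the MPEG standard, "0" can be added as "garbage" data (hereinafter, the "garbage" data D_d added to adjust the file size in this way is called "pad").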
このようにサイズ調整したフアイルの全データを、時間Ｔで転送できるだけの転送速度（Ｒ_ｖ′＝Ｒ_ｖ＋Ｖ_ｂ／Ｔ）でデコーダに入力するようにすれば、図４（Ｂ）に示すように、入力所要時間と出力所要時間がともにＴとなるため、上述のようなバツフア破綻は発生しない。
【００２７】
しかしながらこの転送速度Ｒ_v′は本来の転送速度Ｒ_vよりも大きいため、連続再生するしないにかかわらずバツフアがオーバーフローしてしまう危険性がある。最悪Ｖ_b分だけバツフアがオーバーフローしてしまうので、このような事態を防ぐためには、ビデオデコードバツフアとして、仮想バツフアの２倍用意しておけば良い。
以上のことから、結局スタート・アツプ・デイレイが不定であり、データ量も不定であることを考慮すると、ビデオデコーダバツフアとして、仮想バツフアの（容量又は数の）３倍用意しておけばよいことがわかる。
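The padding and transfer-rate adjustment described in (1-2-2) can be checked with a short calculation. The following Python sketch is illustrative only: the function name `padded_transfer` is hypothetical, the sample values R_v = 4 Mbps, V_b = 4 Mbit and T = 6 s are borrowed from the format example later in the description, and the 25 Mbit actual file size is made up. It uses V_vmax = R_v·T + V_b and R_v′ = R_v + V_b/T as stated in the text.

```python
from fractions import Fraction

def padded_transfer(r_v_mbps, v_b_mbit, t_s, actual_mbit):
    """Return (pad size [Mbit], padded transfer rate [Mbps], decoder buffer [Mbit])."""
    v_vmax = r_v_mbps * t_s + v_b_mbit      # largest file the encoder can produce
    if actual_mbit > v_vmax:
        raise ValueError("file is larger than V_vmax")
    pad = v_vmax - actual_mbit              # "garbage" data appended to the file
    r_v_dash = Fraction(v_vmax, t_s)        # rate that moves V_vmax in time T
    buffer_size = 3 * v_b_mbit              # three times the virtual buffer, per the text
    return pad, r_v_dash, buffer_size

pad, rate, buf = padded_transfer(4, 4, 6, 25)
print(pad, float(rate), buf)
```

Exact fractions are used so that R_v′ = 14/3 Mbps is not rounded; a real implementation would more likely work in bytes.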
【００２８】
（１−３）ＭＰＥＧ規格でのオーデイオ圧縮
オーデイオデータの場合はビデオデータの場合と異なり、ＭＰＥＧ規格に基づいて圧縮した場合、ビデオデータのスタート・アツプ・デイレイに相当するものは存在しないし、またその圧縮データ量はその圧縮ビツトレートと素材の時間長との積に一致するので、特に工夫しなくても複数の圧縮オーデイオデータフアイルを連続的にデコーダに供給してやれば連続的に再生される。
【００２９】
（１−４）圧縮データの最小単位
ビデオ信号はフレームという単位で扱われ、これをＭＰＥＧ規格に基づいて圧縮した場合は、さらにＧＯＰ（Group Of Picture）という単位で扱われることになる。
例えばＮＴＳＣ方式の場合、そのフレームは29.97 〔Hz〕であり、通常１ＧＯＰは15フレームで構成される。またＰＡＬ方式の場合は、そのフレームは25〔Hz〕であり、12フレームからなるＧＯＰと13フレームからなるＧＯＰとを交互に繰り返す構造にする。
【００３０】
一方オーデイオをＭＰＥＧ規格に基づいて圧縮した場合、例えばサンプリング周波数を48〔ＫHz〕とし、かつサンプル数を1152とすると、その最小単位は24〔ms〕となる。
このように、圧縮したデータには扱うことのできる最小単位というものが存在するため、任意の時間長のクリツプを厳密に実現することはできない。
ここで簡単のためクリツプの最小時間長単位を１秒とする。このような制限を受けても実際の運用上問題になることはほとんどない。このときクリツプの時間長をＴとし、この時間以上で最も近い実現可能なビデオの時間をＴ_vとする。また時間Ｔ_v以下で最もこの時間に近く実現可能なオーデイオの時間をＴ_aとする。
【００３１】
ＰＡＬ方式の場合であれば、Ｔ_v＝Ｔとなるが、ＮＴＳＣ方式の場合はＴ_v≠Ｔとなつてしまう。
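A concrete numerical example is shown in FIG. 5. In this example the video is NTSC and the minimum audio unit is 24 ms. In this case the difference between video and audio, T_v − T_a, takes 24 possible values, from 0 to 23 ms.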
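The realizable video and audio times can be reproduced with integer arithmetic. The Python sketch below is an illustration under stated assumptions, not the contents of FIG. 5: it assumes NTSC frames of 1001/30 ms, 15-frame GOPs, clip lengths in whole seconds (short enough that 2T GOPs is the smallest count covering T seconds, which holds up to several hundred seconds), and a 24 ms audio unit. The function name is hypothetical.

```python
AUDIO_UNIT_MS = 24   # 1152 samples at 48 kHz

def realizable_times(clip_s: int):
    """Return (T_v, T_a, T_v - T_a) in milliseconds for a clip of clip_s seconds."""
    # 2 * clip_s GOPs of 15 NTSC frames last exactly 1001 * clip_s milliseconds.
    t_v_ms = 1001 * clip_s
    # Largest multiple of the audio unit that does not exceed T_v.
    t_a_ms = t_v_ms // AUDIO_UNIT_MS * AUDIO_UNIT_MS
    return t_v_ms, t_a_ms, t_v_ms - t_a_ms

print(realizable_times(6))   # (6006, 6000, 6)
print(realizable_times(7))   # (7007, 6984, 23)
```

Because 1001 mod 24 = 17 is coprime to 24, the difference runs through all 24 values 0 to 23 ms as the clip length varies, which matches the count given in the text.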
【００３２】
（２）実施例による符号化復号化装置の構成
図６において、１０は全体として符号化復号化装置を示し、エンコーダ部１１は供給される映像オーデイオ信号Ｓ１に基づく映像信号及びオーデイオ信号をそれぞれ符号化し、多重化した後、得られた圧縮ＡＶ多重化データＤ１をデータ供給部１２に供給する。
データ供給部１２は、供給される圧縮ＡＶ多重化データＤ１をコントローラ１３の制御に基づき記録メデイア１４に記録する。またこのデータ供給部１２では、記録メデイア１４に記録された圧縮ＡＶ多重化データＤ２をコントローラ１３の制御に基づいてフアイル単位で読み出し、これを一定速度でデコーダ１５のデマルチプレクサ１６に供給する。
デマルチプレクサ１６は、供給される圧縮ＡＶ多重化データＤ２を圧縮符号化されたビデオ信号でなる圧縮ビデオデータＤ３と、圧縮符号化されたオーデイオ信号でなる圧縮オーデイオデータＤ４とに分け、これらをそれぞれビデオデコーダ１７及びオーデイオデコーダ１８に送出する。
【００３３】
このときデータ供給部１２のコントローラ１３は共有メモリ１９を介してコントローラ２０に制御情報信号Ｓ３を送出すると共に、コントローラ２０は制御情報信号Ｓ３に基づく制御信号Ｓ４及びＳ５をそれぞれビデオデコーダ１７、オーデイオデコーダ１８に供給するようになされている。
かくしてビデオデコーダ１７は、制御信号Ｓ４に基づいて圧縮ビデオデータＤ３をバツフア２１に順次蓄え、読み出しながら、これを順次デコードして出力するようになされている。
同様にしてオーデイオデコーダ１８は、制御信号Ｓ５に基づいて圧縮オーデイオデータＤ４をバツフア２２に順次蓄え、読み出しながら、これを順次デコードして出力するようになされている。
【００３４】
この実施例の場合、ビデオデコーダ１７に対応するバツフア２１としては、仮想バツフアの３倍の容量を有するものが用いられている。これにより供給される圧縮ビデオデータＤ３のスタート・アツプ・デイレイの不定や、データ量も不定によつてバツフア２１が破綻（オーバーフロー、アンダーフロー）するのを防止し得るようになされている。
またこの実施例の場合、図７（Ｂ）に示すように、デマルチプレクサ１６は、分割した圧縮ビデオデータＤ３及び圧縮オーデイオデータＤ４を速度変換することなく、そのままバースト的にビデオデコーダ１７又はオーデイオデコーダ１８に送出するようになされている。
【００３５】
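In this case, data compressed at a given compression bit rate should properly be supplied to decoders 17 and 18 at a constant rate, as shown in FIG. 7(A), but doing so would require a rate conversion circuit in demultiplexer 16. For this reason, the encoding/decoding device 10 of this embodiment defines the format as described below so that reproduction is possible without buffers 21 and 22 failing even when data is input to video decoder 17 and audio decoder 18 in bursts.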
【００３６】
（３）連続再生及び定時再生の実現
この実施例の符号化復号化装置１０では、ＡＶの同期をとる方式としてタイム・スタンプを使用しない暗黙同期方式を採用している。
この場合このような方式でＡＶ同期をとりながらクリツプを連続再生するためには、
１．クリツプが連続しても各クリツプの頭でビデオデータとオーデイオデータの頭がそろうこと
２．クリツプが連続しても各クリツプの再生中にビデオデータ及びオーデイオデータの各デコーダバツフアが破綻しないこと
の２点を満たせば良い。
以下にこれら各条件をそれぞれ満たすための方法を説明する。
【００３７】
（３−１）定時再生及び連続再生実現のためのＡＶ頭合わせの方法
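Various types of decoder currently exist. In the encoding/decoding device 10 of this embodiment, video decoder 17 is one that "starts reproduction 1.5 frame times (0.05 s) after receiving a start command," and audio decoder 18 is one that "starts reproduction immediately after receiving a start command (the usual decoder specification)"; timed reproduction is realized by the following sequence.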
【００３８】
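1. T_S seconds before the time at which reproduction is to start (T_S being the time required to transfer compressed video data equal to the virtual buffer capacity to buffer 21 of video decoder 17), controller 13 of the data supply unit (FIG. 6) starts data transfer from recording medium 14. As described later, this time T_S is the same for any clip.
2. When demultiplexer 16 receives the first data of the first clip, it notifies controller 20 of decoder 15 that data transfer has started.
3. When (T_S − 0.05) seconds have elapsed, controller 20 issues a reproduction start command to video decoder 17.
4. When a further 0.05 seconds have elapsed, controller 20 issues a reproduction start command to audio decoder 18.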
【００３９】
これによりビデオデータ及びオーデイオデータについて同時に所定時刻に再生が開始されるが、このような方法を実現するためには経過時間を計測する必要がある。
しかるにこの実施例の符号化復号化装置１０では、圧縮ＡＶ多重化データＤ１が一定速度でデマルチプレクサ１６に転送されるので、このデマルチプレクサ１６を通過するデータの数をカウントしていれば、経過時間を計ることができる。
【００４０】
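For this reason, in this embodiment, instead of using a counter or the like, when encoder unit 11 encodes and multiplexes the video and audio signals, it inserts into the data file a special code (hereinafter called a mark) as a marker at "the position corresponding to the time at which demultiplexer 16 should issue an operation command to decoders 17 and 18."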
またデマルチプレクサ１６は、データフアイル内からこのマークを検出すると、これをマーク検出信号Ｓ１０（図６）としてコントローラ２０に伝える一方、コントローラ２０はこのマーク検出信号Ｓ１０に基づきその内容に応じてビデオデコーダ１７及びオーデイオデコーダ１８に対して制御信号Ｓ４及びＳ５を送出するようになされている。
これによりこの符号化復号化装置１０では、コントローラ２０が上述のようなタイミングでビデオデコーダ１７及びオーデイオデコーダ１８をそれぞれ駆動し得るようになされている。
【００４１】
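As these marks, 4-byte codes are used that the MPEG systems standard guarantees will never appear in the reproduced compressed multiplexed data D2, so that they can be identified within that data.
In this embodiment, the codes "000001BD" and "000001BF" (hexadecimal) are used, respectively, as the video decode start mark that makes video decoder 17 start decoding and the audio decode start mark that makes audio decoder 18 start decoding.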
【００４２】
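Assuming that the transfer rate of the compressed AV multiplexed data D2 is 5 Mbps and that 4 Mbit of the multiplexed data D2 must be supplied in order to transfer compressed video data D3 equal to the virtual buffer capacity to buffer 21 of video decoder 17, the video decode start mark M_1 "000001BD" is placed, as shown in FIG. 8, at the position from the head given by
【数３】

and the audio decode start mark M_2 "000001BF" is placed after it at the distance given by
【数４】
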
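Because the multiplexed data is transferred at a constant rate, a byte offset in the file is equivalent to an elapsed time, so the two mark positions follow from the timing sequence described earlier. The Python sketch below is a reconstruction under that reading, not the patent's own expressions, which are not reproduced in this text. The constant and function names are hypothetical; the values 5 Mbps, 4 Mbit and 0.05 s are the ones given in the description.

```python
RATE_BPS = 5_000_000        # transfer rate of multiplexed data D2 (5 Mbps)
PREFILL_BITS = 4_000_000    # multiplexed data needed to fill the virtual buffer
VIDEO_LATENCY_MS = 50       # video decoder starts 0.05 s after its start command

def mark_offsets():
    """Return (T_S [ms], video start mark offset [byte], gap to audio start mark [byte])."""
    bytes_per_ms = RATE_BPS // 8 // 1000
    t_s_ms = PREFILL_BITS * 1000 // RATE_BPS
    video_mark = (t_s_ms - VIDEO_LATENCY_MS) * bytes_per_ms   # at T_S - 0.05 s
    audio_gap = VIDEO_LATENCY_MS * bytes_per_ms               # a further 0.05 s
    return t_s_ms, video_mark, audio_gap

print(mark_offsets())   # (800, 468750, 31250)
```

Under these assumptions the video decode start mark sits 468,750 bytes from the head and the audio decode start mark 31,250 bytes after it.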
次に連続再生実現のためのＡＶ頭合わせの方法について説明する。
【００４３】
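As explained in "(1-4) Minimum unit of compressed data" above, the playback times of the video data and the audio data do not match. For example, in a 6-second clip the video plays 0.006 s longer. Therefore, when two 6-second clips are reproduced in succession, the audio of the second clip starts 0.006 s early even if AV synchronization of the first clip was perfect. A difference of this size is not noticeable in practice, but it accumulates as more clips are joined; with, for example, ten clips in succession, the difference between audio and video reaches 0.06 s and becomes noticeable.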
このためこの実施例の符号化復号化装置１０では、以下の方法により各クリツプの頭においてＡＶの頭が合うようにしている。
１．再生を開始してからオーデイオの再生時間Ｔ_cが経過したら、オーデイオデコーダ１８を停止させる。
２．さらに再生時間差（Ｔ_v−Ｔ_c）だけ時間が経過したら、再びオーデイオデコーダ１８を動作させる。
３．この操作を各クリツプに対して繰り返す。
この場合この方法を実現するためには、定時再生の場合と同様に経過時間を計測する必要がある。
【００４４】
このためこの実施例では、上述の場合と同様にして、ここでもエンコーダ部１１が各データフアイルの所定位置に所定のマークを付加するようになされている。
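For example, for a clip of T = 6 s, FIG. 5 gives a playback time difference of 0.006 s, so audio decode stop marks M_s1 to M_s23 (FIG. 8), which make audio decoder 18 stop decoding, are inserted at positions before the audio decode start mark M_2.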
この場合圧縮オーデイオデータの最小単位が２４〔ｍｓ〕である場合、再生時間差は０〜２３〔ｍｓ〕までの２４通り存在するが、差が０〔ｍｓ〕の場合は、オーデイオデコーダ１８を停止させる必要がないのでオーデイオデコード停止マークＭ_ｓ１〜Ｍ_ｓ２３としては２３個用意すれば良い。
【００４５】
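For this purpose, encoder unit 11 uses, for example, the 23 codes "000001C1" to "000001D7" as the audio decode stop marks M_s1 to M_s23, corresponding to playback time differences of 1 to 23 ms respectively. These codes therefore appear every 5 Mbps × 0.001 s = 625 bytes. Encoder unit 11 also appends, at the end of each clip, a mark indicating which of these 23 codes decoding should be stopped by (hereinafter called the stop mark designation mark M_3, for example "000001D8"), followed immediately by the audio decode stop mark M_4 that is to become effective.
For example, for a clip of T = 6 s the playback time difference is 0.006 s, so the audio decode stop mark M_s6 must be made effective, and the last 8 bytes of this clip are "000001D8000001C6".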
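The mapping between playback time difference and stop mark can be written out directly. This Python sketch is illustrative: the function names are hypothetical, and it assumes the 23 codes are consecutive, so that the mark for a difference of d ms is 0x000001C0 + d. That assumption is consistent with the stated range "000001C1" to "000001D7" and with the 6 ms example.

```python
STOP_MARK_BASE = 0x000001C0     # M_s1..M_s23 are BASE+1 .. BASE+23 (assumed consecutive)
STOP_DESIGNATION = 0x000001D8   # stop mark designation mark M_3

def stop_mark(diff_ms: int) -> bytes:
    """4-byte audio decode stop mark for a playback time difference of diff_ms."""
    if not 1 <= diff_ms <= 23:
        raise ValueError("a stop mark exists only for differences of 1 to 23 ms")
    return (STOP_MARK_BASE + diff_ms).to_bytes(4, "big")

def clip_tail(diff_ms: int) -> bytes:
    """Last 8 bytes of a clip: designation mark M_3 followed by the effective stop mark."""
    return STOP_DESIGNATION.to_bytes(4, "big") + stop_mark(diff_ms)

print(clip_tail(6).hex().upper())   # 000001D8000001C6
```

A difference of 0 ms needs no stop mark, so the sketch rejects it rather than inventing a code for it.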
【００４６】
クリツプ１（Ｔ＝６〔ｓ〕）とクリツプ２（Ｔ＝７〔ｓ〕）とを連続再生する場合、クリツプ１の停止マーク指定マークＭ_３からオーデイオデコード停止マークＭ_ｓ６が有効であることがわかつているので、クリツプ２のオーデイオデコード停止マークＭ_ｓ６がデマルチプレクサ１６において検出された時点で、オーデイオデコーダ１８に停止命令を出し、さらにクリツプ２のオーデイオデコード開始マークＭ_２が検出された時点で、オーデイオデコーダ１８に開始命令を出すようになされている。
なおクリツプの連続再生を開始する前に有効な各停止マークＭ_ｓ１〜Ｍ_ｓ２３をクリアするため、「００００００００」にしておくことにより先頭クリツプの停止マークＭ_ｓ１〜Ｍ_ｓ２３で停止命令が出されるのを防止し得るようになされている。
【００４７】
（３−２）連続再生実現のためのバツフア使用方法
（３−２−１）圧縮ＡＶ多重化フオーマツトの基本構造
Hereinafter, a block of data of a single type is called a "packet". Here, type means one of three: video, audio, or pad. A header indicating the type is attached to the head of each packet.
またいくつかのパケツトの集まりを「パツク」と呼ぶことにする。パツクはパケツトとは異なり、何らかの識別ヘツダがつくわけではない。
ここでこの実施例の符号化復号化装置１０では、以下に示す３種類のパツクを使用している。
１．標準パツク
……原則として、ビデオパケツト、オーデイオパケツト、パツドパケツト各１つずつから構成される。
２．先頭パツク
……仮想バツフア分だけビデオパケツトのデータ量が標準パツクよりも大きい。
３．残余パツク
……上述の２種類のパツクに含まれなかつた残りのデータから構成されている。
【００４８】
この実施例の場合、１つのクリツプをエンコードして得られる１つの圧縮ＡＶ多重化データフアイルは、図９（Ａ）に示すように、その先頭に１つの「先頭パツク」が存在し、その後複数個の「標準パツク」が連続し、最後に１つの「残余パツク」がくるように構成されている。
なおデータ制御用のマークは全て、先頭パツクのビデオパケツト内に位置することになるため、パケツト・ヘツダの内容にマークが挿入されて、パケツト・ヘツダが認識できなくなるようなことは発生しない。
【００４９】
（３−２−２）パケツト・ヘツダ
各パケツトの種類を識別するためのヘツダとしては、圧縮データの中には決して現れないパターンを採用する必要がある。このためこの符号化復号化装置１０では、MPEGシステム規格においてヘツダ用に確保されているパターンを採用している。実際上例えば、以下のようなパターンを用いている。なお、各ヘツダは16進数表示しており、そのサイズは４バイトである。
ビデオ・パケツト・ヘツダ ……000001E3
オーデイオ・パケツト・ヘツダ ……000001C0
パツド・パケツト・ヘツダ ……000001BE
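Demultiplexer 16 thus examines every run of 4 consecutive bytes in the incoming compressed AV multiplexed data D2: if it is a video packet header, the data that follows is routed to video decoder 17; if it is an audio packet header, the data that follows is routed to audio decoder 18; and if it is a pad packet header, the data that follows is discarded.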
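The header-driven routing can be sketched in a few lines of Python. This is a minimal illustration, not the device's implementation: the function name and the in-memory byte string are assumptions, the three header values are the ones listed above, and detection of decoder control marks inside video packets is deliberately left out.

```python
VIDEO_HDR = bytes.fromhex("000001E3")
AUDIO_HDR = bytes.fromhex("000001C0")
PAD_HDR = bytes.fromhex("000001BE")
HEADERS = {VIDEO_HDR: "video", AUDIO_HDR: "audio", PAD_HDR: "pad"}

def demultiplex(stream: bytes):
    """Split a multiplexed byte string into (video bytes, audio bytes); pad is dropped."""
    out = {"video": bytearray(), "audio": bytearray()}
    current = None                      # packet type currently being received
    i = 0
    while i < len(stream):
        kind = HEADERS.get(stream[i:i + 4])
        if kind is not None:            # a 4-byte header switches the destination
            current = kind
            i += 4
            continue
        if current in out:              # pad data and leading junk are discarded
            out[current].append(stream[i])
        i += 1
    return bytes(out["video"]), bytes(out["audio"])

sample = (VIDEO_HDR + b"\x11\x22" + AUDIO_HDR + b"\x33"
          + PAD_HDR + b"\x00\x00\x00" + VIDEO_HDR + b"\x44")
print(demultiplex(sample))
```

The sketch relies on the property stated in the text that header patterns never occur inside compressed data; without that guarantee, a byte-by-byte scan like this would misroute data.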
【００５０】
（３−２−３）圧縮ビツト・レート値等に関する制限
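When a clip of time length T is compressed with virtual buffer size V_b and video compression bit rate R_v, the maximum compressed video data amount that can occur is, as described above,
【数５】
$$V_{vmax} = R_v T + V_b$$
For this reason, in the encoding/decoding device 10 of this embodiment, when the compressed video data amount actually obtained falls short of this value, pad data is appended at the end so that the size of any file becomes V_vmax. When all the data after this size adjustment is transferred in time T, the video data transfer rate R_v′ is
【数６】
$$R_v' = \frac{V_{vmax}}{T} = R_v + \frac{V_b}{T}$$
R_v′ takes its maximum R_vmax′ when the clip time length is the minimum T_min:
【数７】
$$R_{vmax}' = R_v + \frac{V_b}{T_{min}}$$
The compressed AV multiplexed data transfer rate R_m must be at least as large as the sum of the compressed video data transfer rate and the compressed audio data transfer rate. For audio, a transfer rate equal to its compression bit rate suffices. These parameters must therefore satisfy the following expressions (strictly, headers and marks should also be taken into account, but they are omitted here for simplicity):
【数８】
$$R_m \ge R_{vmax}' + R_a$$
【数９】
$$R_v + \frac{V_b}{T_{min}} + R_a \le R_m$$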
【００５１】
実際上この符号化復号化装置１０では、Ｒ_v、Ｒ_m、Ｒ_a、Ｔ_min、Ｖ_b等のパラメータは任意の値が可能なわけではなく、その用途やハードウエアの都合等により制限が加わる。このためこの符号化復号化装置１０では、これらパラメータの中で制限の強いものから決めていき、最終的に（９）式を満たすように順次選定している。
【００５２】
（３−２−４）パケツト・サイズの算出
圧縮ビツト・レートの値等のパラメータ値が与えられた場合に、各パケツトのサイズをいくつにすればよいのかを算出するための式を以下に示す。
まず、以下の計算で使用する変数を列挙する。
ビデオ圧縮ビツト・レート ……Ｒ_v〔Mbps〕
ビデオ・データ転送速度 ……Ｒ_v′〔Mbps〕
オーデイオ圧縮ビツト・レート ……Ｒ_a〔kbps〕
圧縮ＡＶ多重化データ転送速度 ……Ｒ_m〔Mbps〕
クリツプの時間長（１秒単位） ……Ｔ〔ｓ〕
最小クリツプの時間長（１秒単位）……Ｔ_min〔ｓ〕
Ｔ以上でＴに最も近く現実可能なビデオの再生時間……Ｔ_v〔ｓ〕
Ｔ_v以下でＴに最も近く現実可能なオーデイオの再生時間……Ｔ_a〔ｓ〕
クリツプを転送するのに要する時間……Ｔ_t〔ｓ〕
仮想バツフア分の圧縮ビデオ・データが転送されるのに要する時間……Ｔ_S〔ｓ〕
仮想バツフア量 ……Ｖ_b〔Mbit〕
１クリツプの圧縮ビデオ・データ量……Ｖ_v〔byte〕
１クリツプの圧縮オーデイオ・データ量……Ｖ_a〔byte〕
１クリツプのパツド・データ量 ……Ｖ_p〔byte〕
１クリツプの総データ量 ……Ｖ_m〔byte〕
全マーク・データ量 ……Ｖ_k〔byte〕（＝４×27〔byte〕〕）
パケツト・ヘツダ・サイズ ……Ｌ_h〔byte〕（＝４〔byte〕）
標準ビデオ・パケツト・サイズ（ヘツダは含まない）……Ｌ_v〔byte〕
標準オーデイオ・パケツト・サイズ（ヘツダは含まない）……Ｌ_a〔byte〕
標準パツド・パケツト・サイズ（ヘツダは含まない）……Ｌ_p〔byte〕
標準パツク・サイズ（ヘツダを含む） ……Ｌ_m〔byte〕
パツク数（残余パツクを除く） ……Ｐ
＊標準パケツトとは標準パツク内のパケツトであることを表す。
残余ビデオ・パケツト・サイズ（ヘツダは含まない）……Ｌ_v′〔byte〕
残余オーデイオ・パケツト・サイズ（ヘツダは含まない）……Ｌ_a′〔byte〕
残余パツド・パケツト・サイズ（ヘツダは含まない）……Ｌ_p′〔byte〕
＊残余パケツトとは残余パツク内のパケツトであることを表す。
単位変換係数（Mbit→byte）……Ｃ_m＝125000＝1000×1000/8
単位変換係数（kbit→byte）……Ｃ_k＝125 ＝1000/8
【００５３】
Ｔを１秒単位とした場合、簡単のため、任意の時間長のクリツプにおいて標準パケツトのサイズＬ_ｖおよびＬ_ａがそれぞれ等しくなるようにするためには、次式を満たすようにＲ_ｖ、Ｒ_ａ、Ｌ_ｖ、Ｌ_ａを選択すればよい。
【数１０】

【数１１】

【数１２】

通常、Ｒ_ｖはせいぜい０．１〔Ｍｂｐｓ〕単位であり、Ｒ_ａは６４、１２８、１９２、２５６、３８４〔ｋｂｐｓ〕のいずれかである。従つて、上式を満たすような整数Ｌ_ｖ、及びＬ_ａは存在する。
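The fractional data arising from the minimum-unit restriction on compressed data, namely R_v × (T_v − T) for video and R_a × (T_a − T) for audio, is placed in the residual pack, and the remaining data, namely R_v·T + V_b for video and R_a × T for audio, is placed in the standard packs and the head pack.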
【００５４】
このようにするとＲ_ａ及びＲ_ｖに関して次式が成り立つ。
【数１３】

【数１４】

【数１５】

【数１６】

（１４）式及び（１６）式よりＲ_ｖ、Ｒ_ａ、Ｌ_ａを与えれば、Ｌ_ｖを算出できる。
オーデイオ・データ量Ｖ_ａは
【数１７】

ビデオ・データ量Ｖ_ｖは
【数１８】

多重化データ量Ｖ_ｍは
【数１９】

よつて、パツド・データ量Ｖ_ｐは、
【数２０】

よつて
【数２１】

【００５５】
（２１）式よりＬ_ｐ及びＬ_ｐ′が求められる。
Ｔ_ａ＜Ｔの場合、次式を満たす整数Ｋ及びＬ_ａ″が存在する。
【数２２】

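In this case, for the (K+1)-th standard pack from the end, the data size of the audio packet is set to L_a − L_a″, and a pad packet of size L_a″ − L_h is placed immediately after it. For the last K standard packs, a pad packet of the same size is placed instead of the audio packet.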
また、残余パツクに関して、次式が成り立つ。
【数２３】

【数２４】

【００５６】
（３−２−５）連続再生実現のための条件
上述の「（１−２）圧縮ビデオ・データ・フアイルの連続再生」で説明したように、圧縮ビデオデータだけであつても連続再生するためには、ビデオ・デコーダ・バツフアを仮想バツフアの３倍用意する必要がある。
以下に多重化した場合にも、ビデオ・デコーダ・バツフア及びオーデイオ・デコーダ・バツフアをそれぞれ仮想バツフアの３倍用意すれば連続再生できる条件を示す。
まず「（１−２−１）スタート・アツプ・デイレイ不定の考慮」で示したように、各クリツプのスタート・アツプ・デイレイが不定であることに対処するためには、仮想バツフアの２倍の容量が必要となる。図１１の「データ出力折れ線の推移可能領域」をカバーするのがこの２倍バツフアである。よつて、残りの１倍バツフア内でデータ入力折れ線を推移させるようにできれば、３倍バツフアで連続再生が可能となる。
【００５７】
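The video decoder buffer 21 is at some times holding data of only one clip and at other times holding data of two clips.
If T_t = T_v, the buffer holds data of two clips only during the first T_S of each clip. This is the time during which the head pack of a clip is being transferred, and because the head pack contains no audio packets or pad packets other than at its head, the input polyline always lies below line ①, as is clear from FIG. 10. In other words, the data input polyline never leaves the 1× buffer.
For the time during which the buffer holds data of only one clip (the time after T_S has elapsed), the input polyline should be kept between line ① and line ② in FIG. 10. To achieve this, the following conditions should be satisfied.
Condition 1: Line ① and line ③ must not intersect anywhere other than the origin.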
【００５８】
つまり任意のｔ＞０についてｖ_３（ｔ）＞ｖ_１（ｔ）が成り立つようにすればよい。
【数２５】

【数２６】

Condition 2: Line ② and line ④ must not intersect before time t_e, at which transfer of the video data finishes. That is, v_4(t_e) < v_2(t_e) must hold. Here t_e satisfies
【数２７】

and therefore
【数２８】

【数２９】

【００５９】
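The equations of lines ① to ④ are as follows. Strictly, headers and marks should also be taken into account, but they are ignored because they are very small compared with the amount of compressed data.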
【数３０】

【数３１】

【数３２】

【数３３】

従つて、仮想バツフアの３倍で連続再生を実現するためには、Ｔ_ｔ＝Ｔ_ｖとして、かつ（２６）式及び（２９）式を満たすように各パラメータを選択すれば良く、実際上この実施例の符号化復号化装置１０でもこのように各パラメータが選定されている。
【００６０】
（３−２−６）連続再生可能な圧縮ＡＶ多重化フオーマツトの例
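A format is shown that satisfies the requirements R_m = 5 Mbps, R_v = 4 Mbps, R_a = 256 kbps, T_min = 6 s, and V_b = 4 Mbit. The video is assumed to be NTSC.
In this case the right-hand side of expression (9) is 5 Mbps and the left-hand side is
【数３４】
$$4 + \frac{4}{6} + 0.256 \approx 4.92\ \text{[Mbps]}$$
so expression (9) is satisfied, and these values can be selected.
For example, for a clip time length of T = 6 s, FIG. 5 gives
T_t = T_v = 6.006 s, T_a = 6 s.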
適当にＬ_ａ＝１０００〔ｂｙｔｅ〕として（あまり大きくするとオーデイオ・バツフアが破綻する可能性がある。逆に小さくしすぎると標準パツク数が増えてヘツダによるオーバ・ヘツドが増えてしまう）、（１４）式より標準パツク数を求める。
【数３５】

（１６）式より標準ビデオ・パケツトのサイズを求める。
【数３６】

（２１）式よりパツド・パケツトのサイズを求める。
【数３７】

よつて、Ｌ_ｐ＝２９３〔ｂｙｔｅ〕、Ｌ_ｐ′＝７４〔ｂｙｔｅ〕となる。
【００６１】
（２３）式より残余ビデオ・パケツトのサイズを求める。
【数３８】

ここでＴ_ａ＝Ｔなので、残余オーデイオ・パケツトのサイズは０である。
以上の値より標準パックのサイズを求める。
【数３９】

このとき（２６）式は満たされていることが分かる。
０．９２＞０．８
このとき（２９）式も満たされていることが分かる。
２３．２＜２４
以上の結果、得られるフオーマツトを図１２に示す。同様にして、クリツプ時間長７秒の場合についても算出し、その結果を図１３に示す。
この場合は、Ｔ_ａ≠Ｔとなるため、最後の標準パツクは他の標準パツクとは異なりＴ_ａ−Ｔに相当する分のパツド・パケツトが存在する。
各パラメータの値が決まれば、クリツプの内容に無関係に任意のクリツプについてクリツプ時間長ごとに、図１２及び図１３のようなフオーマツト・テーブルを作成することができるので、実際に圧縮ＡＶデータを多重化する場合、各エンコーダで得られた圧縮データをこのテーブルに従つて多重化してやればよい。
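The worked example for T = 6 s can be retraced numerically. The Python sketch below is a reconstruction, not the patent's expressions (14), (16), (21) and (23), which are not reproduced in this text. It assumes that P counts the head pack together with the standard packs, that each of those packs carries three 4-byte headers, that the residual pack carries two (video and pad, since its audio packet is empty), and that the pad data is spread evenly with the remainder going to the residual pack. Under those assumptions it reproduces the stated L_p = 293 bytes, L_p′ = 74 bytes and the ratio 0.92 > 0.8; the other outputs are results of the sketch, not values confirmed by the text. All names are hypothetical.

```python
C_M = 125_000   # Mbit -> byte
C_K = 125       # kbit -> byte
L_H = 4         # packet header size [byte]
V_K = 4 * 27    # total mark data [byte]

def pack_layout(r_m, r_v, r_a, v_b, t, l_a):
    """r_m, r_v in Mbps; r_a in kbps; v_b in Mbit; t in s; l_a in bytes."""
    t_v_ms = 1001 * t                                   # NTSC video time (assumed)
    if t_v_ms // 24 * 24 != 1000 * t:
        raise ValueError("sketch covers only clips with T_a == T")
    v_a = r_a * C_K * t                                 # audio data [byte]
    packs, rest = divmod(v_a, l_a)                      # pack count excluding residual pack
    if rest:
        raise ValueError("L_a must divide the audio data amount")
    l_v = r_v * C_M * t // packs                        # standard video packet
    l_v_res = r_v * C_M * (t_v_ms - 1000 * t) // 1000   # residual video packet
    v_v = r_v * C_M * t_v_ms // 1000 + v_b * C_M        # video data incl. head-pack extra
    v_m = r_m * C_M * t_v_ms // 1000                    # total multiplexed data
    headers = 3 * L_H * packs + 2 * L_H                 # assumed header accounting
    l_p, l_p_res = divmod(v_m - v_v - v_a - V_K - headers, packs)
    l_m = l_v + l_a + l_p + 3 * L_H                     # standard pack incl. headers
    return {"packs": packs, "l_v": l_v, "l_p": l_p, "l_p_res": l_p_res,
            "l_v_res": l_v_res, "l_m": l_m}

layout = pack_layout(5, 4, 256, 4, 6, 1000)
print(layout)
print(round(layout["l_v"] / layout["l_m"], 2), ">", 4 / 5)
```

Clips such as T = 7 s, where T_a < T, need the special handling of the last standard packs described in the text and are intentionally rejected here.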
【００６２】
このためこの実施例の符号化復号化装置１０では、エンコーダ部１１にこのようなフオーマツト・テーブルが予め格納されており、エンコーダ部１１がこのフオーマツト・テーブルに基づいて符号化された映像信号と符号化されたオーデイオ信号とを多重化するようになされている。
【００６３】
（４）第１実施例の動作
以上の構成において、この符号化復号化装置１０では、エンコーダ部１１に供給される映像オーデイオ信号Ｓ１に基づく映像信号及びオーデイオ信号を、このエンコーダ部１１において符号化し、これらを上述の第１の多重化フオーマツトに従つて多重化することにより圧縮ＡＶ多重化データＤ１を形成すると共に、この際各クリツプ毎に、ビデオデコード開始マークＭ₁、オーデイオデコーダ開始マークＭ₂、オーデイオ停止マークＭ_S1〜Ｍ_S23、停止マーク指定マークＭ₃及びオーデイオデコード停止マークＭ₄（以下、これらをまとめてデコーダ制御マークＭ₁〜Ｍ₄、Ｍ_S1〜Ｍ_S23と呼ぶ）をそれぞれ所定位置に付加し、これをデータ供給部１２に供給して記録メデイア１４に記録させる。
この圧縮ＡＶ多重化データＤ１はコントローラ１３の制御に基づいてデータフアイル単位で読み出され、圧縮ＡＶ多重化データＤ２としてデマルチプレクサ１６に供給され、このデマルチプレクサ１６において圧縮ビデオデータＤ３と圧縮オーデイオデータＤ４とに分割されて、それぞれビデオデコーダ１７又はオーデイオデコーダ１８に送出される。
この際デマルチプレクサ１６は、このデータフアイルの中からデコーダ制御マークＭ₁〜Ｍ₄、Ｍ_S1〜Ｍ_S23を検出すると、検出結果に基づくマーク検出信号Ｓ１０をコントローラ２０に送出する。
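Based on this mark detection signal S10, controller 20 outputs control signals S4 and S5 to video decoder 17 and audio decoder 18 respectively, thereby starting or stopping video decoder 17 and audio decoder 18 as required. In doing so, controller 20 stops the reproduction operation of audio decoder 18 for a predetermined time that depends on the clip's time length.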
【００６４】
このようにこの符号化復号化装置１０では、エンコーダ側において、各クリツプ毎にクリツプの時間長に応じた制御用のデコーダ制御マークＭ₁〜Ｍ₄、Ｍ_S1〜Ｍ_S23を入れると共に、デコーダ側ではこれらデコーダ制御マークＭ₁〜Ｍ₄、Ｍ_S1〜Ｍ_S23に基づいてビデオデコーダ１７及びオーデイオデコーダ１８を駆動制御するため、各クリツプ毎に完結してＡＶの頭を合わせることができる。
またこの符号化復号化装置１０では、ビデオデコーダ１７のバツフア２１が仮想バツフアの（容量又は数の）３倍に設定されており、さらにこの３倍のバツフア２１を破綻させないように、クリツプの時間長のみに依存し、かつ先頭ビデオパケツトのサイズが他のビデオパケツトよりも仮想バツフアの容量分だけ大きい特徴を有する上述の第１の多重化フオーマツトに従つて、エンコーダ１１が符号化されたビデオ信号と、符号化されたオーデイオ信号とを多重化するため、ビデオデコーダ１７のバツフア２１が破綻することはない。
従つて「（３）連続再生の実現」において提示した２つの条件を満たすため、ＡＶ同期をとりながら連続再生することができる。
【００６５】
従つて例えばクリツプ素材としてテレビジヨン放送ＣＭを適用し、これをＭＰＥＧ圧縮して、上述の多重化フオーマツトに従つてＡＶ多重化したのち、ＨＤＤ等の大容量かつランダム・アクセス可能な記録メデイアに記録しておき、ＣＭ放送のタイム・テーブルに従つて指定のＣＭ圧縮データ群を指定時刻にデコーダに供給するようにすれば、指定時刻にかつ連続にＣＭを放送することができる。
【００６６】
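(5) Effects of the first embodiment
With the above configuration, the encoder side inserts, for each clip, the decoder control marks M₁ to M₄ and M_S1 to M_S23 according to the clip's time length, and the decoder side drives and controls video decoder 17 and audio decoder 18 based on these marks. In addition, encoder unit 11 multiplexes the encoded video signal and the encoded audio signal according to the first multiplexing format, which depends only on the clip's time length and in which the head video packet is larger than the other video packets by the capacity of the virtual buffer. As a result, the audio and video heads of each clip can be aligned and buffer 21 can be prevented from failing, so an encoding/decoding device that can continuously reproduce a plurality of clips at specified times can be realized.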
【００６７】
〔２〕第２実施例
図６との対応部分に同一符号を付して示す図１４は、第２実施例による符号化復号化装置４０を示し、デコーダ４１のコントローラ４２が、その内部メモリに予め記憶している各クリツプ長におけるビデオとオーデイオとの再生時間差データに基づいてビデオデコーダ１７及びオーデイオデコーダ１８の駆動を制御することを除いて第１実施例の符号化復号化装置１０（図６）とほぼ同様に構成されている。
すなわちこの符号化復号化装置４０の場合、エンコーダ４３は、供給される映像音声信号Ｓ１に基づく映像信号及びオーデイオ信号をそれぞれ符号化し、これらを上述の第１の多重化フオーマツトに従つて多重化し、圧縮ＡＶ多重化データＤ３０としてデータ供給部４４に送出する。
データ供給部４４は、供給される圧縮ＡＶ多重化データＤ３０をコントローラ４５の制御に基づいて記録メデイア１４に記録する。またデータ供給部４４は、コントローラ４５の制御に基づいて、記録メデイア１４に記録された圧縮ＡＶ多重化データＤ３１をクリツプ単位で再生し、これをデコーダ４１のデマルチプレクサ４６に送出する。
デマルチプレクサ４６は、供給される圧縮ＡＶ多重化データＤ３１を圧縮映像データＤ４及び圧縮オーデイオデータＤ５に順次分割し、これらをそれぞれビデオデコーダ１７及びオーデイオデコーダ１８に送出する。
この際コントローラ４５は、このクリツプの時間長を時間情報信号Ｓ３０として共有メモリ１９を介してデコーダ４１のコントローラ４２に供給する。
【００６８】
このコントローラ４２は、クリツプの各時間長におけるビデオとオーデイオとの間の再生時間差データを予め記憶しており、供給された時間情報信号Ｓ３０に基づくこのクリツプの時間長と、記憶している対応する再生時間差データとに基づく制御信号Ｓ４、Ｓ５を生成し、これらをそれぞれビデオデコーダ１７及びオーデイオデコーダ１８に送出する。
これによりこのコントローラ４２は、データ供給部４４のコントローラ４５が記録メデイア１４からこのクリツプを再生し始めてからオーデイオの再生時間Ｔc が経過した後オーデイオデコーダ１８を停止させ、この後再生時間差（Ｔｖ−Ｔc ）だけ時間が経過すると再びオーデイオデコーダ１８を動作させることを各クリツプに対して順次行うようにオーデイオデコーダ１８を駆動制御するようになされ、かくして各クリツプの頭においてビデオとオーデイオの頭を合わせさせ得るようになされている。
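The table-driven control of the second embodiment can be sketched as follows. This Python example is an illustration under assumptions: it reuses the NTSC and 24 ms figures of the first embodiment to fill the table, measures time in milliseconds from the start of the first clip, and uses hypothetical function names. The patent states only that controller 42 stores playback time difference data per clip length in advance, not how the table is built.

```python
def build_diff_table(min_len_s: int, max_len_s: int) -> dict:
    """Playback time difference (T_v - T_c) in ms for each clip length in seconds."""
    return {t: (1001 * t) % 24 for t in range(min_len_s, max_len_s + 1)}

def audio_pause_schedule(clip_lengths_s, table):
    """Return (stop_ms, restart_ms) pairs for the audio decoder over consecutive clips."""
    events = []
    clip_start_ms = 0
    for t in clip_lengths_s:
        diff = table[t]                 # looked up from the clip length alone
        t_v = 1001 * t                  # video playback time of this clip
        t_c = t_v - diff                # audio playback time of this clip
        if diff:                        # no pause is needed when the times match
            events.append((clip_start_ms + t_c, clip_start_ms + t_v))
        clip_start_ms += t_v            # the next clip starts when the video ends
    return events

table = build_diff_table(6, 10)
print(audio_pause_schedule([6, 7], table))   # [(6000, 6006), (12990, 13013)]
```

Setting a minimum clip length and a one-second granularity, as the description does, keeps this table small, because only one entry per permitted clip length is needed.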
以上の構成において、この符号化復号化装置４０では、データ供給部４４のコントローラ４５が記録メデイア１４に記録された圧縮ＡＶ多重化データＤ３１をクリツプ単位で再生する際、このクリツプの再生時間情報をデコーダ４１のコントローラ４２に供給する。
【００６９】
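Controller 42, for its part, drives and controls video decoder 17 and audio decoder 18 so that the video and audio heads align at the head of each clip, based on the clip's playback time information supplied from controller 45 and the playback time difference data that controller 42 itself holds.
In this encoding/decoding device 40, audio decoder 18 is thus controlled clip by clip, so the playback time difference between video and audio is corrected within each clip, and AV synchronization is achieved.
In addition, buffer 21 of video decoder 17 is chosen to have three times the capacity of the virtual buffer, and encoder 43 multiplexes the encoded video and audio signals according to the first multiplexing format, so the clips can be reproduced continuously with AV synchronization, as in the first embodiment.
【００７０】
With the above configuration, the capacity of buffer 21 of video decoder 17 is set to three times the virtual buffer, and encoder 43 multiplexes the encoded video and audio signals according to the first multiplexing format. When controller 45 of data supply unit 44 reproduces the compressed AV multiplexed data D31 recorded on recording medium 14 clip by clip, it supplies the clip's playback time information to controller 42 of decoder 41, and controller 42 drives and controls video decoder 17 and audio decoder 18, based on this playback time information and the playback time difference data it holds, so that the video and audio heads align at the head of each clip. As a result, an encoding/decoding device that can continuously reproduce a plurality of clips at specified times can be realized, as with the encoding/decoding device 10 of the first embodiment.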
【００７１】
【発明の効果】
上述のように本発明によれば、所定の圧縮ビツトレートで圧縮符号化された圧縮ビデオデータ及び圧縮オーデイオデータで形成される素材の時間長毎に発生する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を、復号化側に再生時間差データとして記憶しておくようにしたことにより、タイムスタンプの有無に係わらず、素材の時間長のみから、当該素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を特定することができ、さらに当該特定した再生時間差分だけ、圧縮オーデイオデータに対する復号動作を停止させることにより、複数の素材を連続再生する際に、各素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータの再生開始時刻を同期させることができるので、従来と比して一段と簡易な構成でありながら、各素材の再生を指定時刻に開始でき、かつ複数の素材を連続的に再生することのできる復号化装置及び符号化復号化装置を実現できる。
また本発明によれば、予め素材の時間長に対する最小単位を設定しておくことにより、素材の時間長と、当該素材を形成する圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差との組み合わせパターンを、最小単位が設定されていない場合と比べて格段と少なくすることができるので、より少ないデータ量の再生時間差データで圧縮ビデオデータ及び圧縮オーデイオデータ間の再生時間差を特定することができる。
【図面の簡単な説明】
【図１】仮想バツフアとスタート・アツプ・デイレイの説明に供するグラフ及びブロツク図である。
【図２】スタート・アツプ・デイレイとバツフアの破綻の説明に供するグラフである。
【図３】圧縮ビデオデータフアイルのデータ量の説明に供するグラフである。
【図４】データ量とデータ転送速度の調整の説明に供するグラフである。
【図５】ビデオ及びオーデイオの実現可能時間を示す図表である。
【図６】第１実施例の符号化復号化装置の構成を示すブロツク図である。
【図７】デマルチプレクス後のデータ転送形態を示すブロツク図である。
【図８】デコーダ制御マークの説明に供する略線図である。
【図９】圧縮ＡＶ多重化データフアイルのフオーマツトを示す略線図である。
【図１０】ビデオデコーダへの入力の推移を示すグラフである。
【図１１】連続再生時のビデオデコーダバツフアの使用状況の説明に供するグラフである。
【図１２】圧縮ＡＶ多重化フオーマツトの例を示す図表である。
【図１３】圧縮ＡＶ多重化フオーマツトの例を示す図表である。
【図１４】A block diagram showing the configuration of the encoding/decoding device of the second embodiment.
【図１５】圧縮ＡＶ多重化フオーマツトの例を示す図表である。
【図１６】圧縮ＡＶ多重化フオーマツトの例を示す図表である。
【図１７】タイム・スタンプ方式の説明に供するブロツク図である。
【図１８】タイム・スタンプ方式の場合の連続再生の説明に供する略線図である。
【符号の説明】
１０、４０……符号化復号化装置、１１、４３……エンコーダ、１２、４４……データ供給部、１３、２０、４２、４５……コントローラ、１４……記録メデイア、１６、４６……デマルチプレクサ、１７……ビデオデコーダ、１８……オーデイオデコーダ、２１、２２……バツフア、Ｓ４、Ｓ５……制御信号、Ｓ１０……マーク検出信号、Ｄ１、Ｄ２……圧縮ＡＶ多重化データ、Ｍ₁……ビデオデコード開始マーク、Ｍ₂……オーデイオデコード開始マーク、Ｍ₃……停止マーク指定マーク、Ｍ₄……オーデイオデコーダ停止マーク、Ｍ_S1〜Ｍ_S23……オーデイオデコード停止マーク。
[0001]
[Table of contents]
The present invention will be described in the following order.
Field of industrial application
Prior art (FIG. 17)
Problems to be solved by the invention (FIGS. 17 and 18)
Means for solving the problems (FIGS. 1 to 16)
Operation (FIGS. 1 to 16)
Embodiments (FIGS. 1 to 16)
Effects of the invention
[0002]
[Field of industrial application]
The present invention relates to a decoding device and an encoding/decoding device, and is suitable for application to, for example, an AV server system.
[0003]
[Prior art]
Conventionally, there are various methods for digitally compressing video and audio signals; one of those currently attracting the most attention is the MPEG (Moving Picture Experts Group) standard.
When AV (Audio Visual) data compressed according to the MPEG standard is reproduced (decoded), the audio and video must be synchronized. The methods for doing so fall broadly into two types: the implicit synchronization method and the time stamp method.
[0004]
In the implicit synchronization method, the AV data corresponding to approximately the same time and the same duration are multiplexed in certain units, and AV synchronization is achieved by adjusting each decode start time so that the audio and video heads align at the start of one AV material (hereinafter called a clip).
The time stamp method, on the other hand, attaches to each unit of AV data, at multiplexing time, a time stamp indicating its playback time (DTS: Decoding Time Stamp, and PTS: Presentation Time Stamp), and additionally attaches at certain intervals a time stamp indicating the reference time for those time stamps (SCR: System Clock Reference, or PCR: Program Clock Reference). Each decoder achieves AV synchronization by performing reproduction according to these time stamps. The MPEG systems standard adopts this method.
[0005]
In general, the time stamp method is more accurate than the implicit synchronization method, but its mechanism is more complex and it therefore costs more to implement.
The use of time stamps is explained a little further. As shown in FIG. 17, demultiplexer 1 extracts the reference time information PCR from the supplied AV multiplexed data D_Av and adjusts the system clock CL1 so that it is synchronized with this value. The system clock CL1 is supplied to audio decoder 2 and video decoder 3.
[0006]
Demultiplexer 1 also supplies the audio data D_Ad extracted from the AV multiplexed data D_Av to audio decoder 2 together with the playback time information DTS, and supplies the extracted video data D_Vi to video decoder 3 together with the playback time information DTS and PTS.
Audio decoder 2 and video decoder 3 each compare the system clock CL1 with the playback time information DTS and PTS to determine when to start the reproduction operation.
[0007]
[Problems to be solved by the invention]
Consider a case where a plurality of clips are continuously reproduced.
In the case of the time stamp method, as shown in FIG. 18, around the time of a join between clips there can be a period T₀ during which data of the next clip (clip 2) has already started to be input to demultiplexer 1 and the system clock CL1 is already synchronized with the reference time information PCR of clip 2, while audio decoder 2 and video decoder 3 are still processing the previous clip (clip 1). Because the reference time information PCR of clips 1 and 2 is independent, the data may not be reproduced correctly if audio decoder 2 and video decoder 3 try to decode according to the system clock CL1 during this period T₀.
[0008]
Thus, with the conventional time stamp method it is difficult to reproduce a plurality of clips continuously.
To solve this problem, a widely used conventional method for continuously reproducing a plurality of clips is to provide two sets of audio decoder 2 and video decoder 3, use the sets alternately, and switch the output of demultiplexer 1 between them. With this method, the time stamp method can also be adopted for AV synchronization.
[0009]
However, this method requires two sets of audio decoder 2 and video decoder 3, as well as a device for switching, so the system becomes large.
The present invention has been made in view of the above points, and proposes a decoding device and an encoding/decoding device that, with a simple configuration, can start reproduction of each clip at a specified time and can reproduce a plurality of clips continuously.
[0010]
[Means for Solving the Problems]
To solve this problem, in the present invention a decoding device that decodes the compressed video data and compressed audio data forming a material is provided with: dividing means for dividing and outputting compressed video data and compressed audio data that are supplied from a data supply source, having been compression-encoded at a predetermined compression bit rate and multiplexed; first decoding means for decoding the compressed video data output from the dividing means; second decoding means for decoding the compressed audio data output from the dividing means; and control means for stopping the decoding operation of the second decoding means on the compressed audio data for the duration of the playback time difference, based on time information indicating the time length of the material, supplied from the data supply source, and on playback time difference data stored in advance, indicating the playback time difference that arises between the compressed video data and the compressed audio data for each time length of the material.
[0011]
  In the present invention, the minimum unit for the time length of the material is set in advance, and the decoding apparatus stores the reproduction time difference data indicating the reproduction time difference generated for each time length equal to or greater than the minimum unit.
[0012]
[Action]
  In this way, the reproduction time difference between the compressed video data and the compressed audio data generated for each time length of the material formed by the compressed video data and the compressed audio data compressed and encoded at a predetermined compression bit rate is transmitted to the decoding side. By storing it as playback time difference data, it is possible to specify the playback time difference between the compressed video data and the compressed audio data forming the material from only the time length of the material regardless of the presence or absence of a time stamp. In addition, by stopping the decoding operation on the compressed audio data by the specified reproduction time difference, the reproduction start time of the compressed video data and the compressed audio data forming each material can be set when continuously reproducing a plurality of materials. Can be synchronized.
[0013]
In addition, by setting a minimum unit for the time length of the material in advance, the number of combinations of material time length and playback time difference between the compressed video data and compressed audio data forming the material can be made far smaller than when no minimum unit is set.
[0014]
[Embodiments]
Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.
[0015]
[1] First embodiment
(1) Technology used
(1-1) Video compression according to the MPEG standard
(1-1-1) Virtual buffer
In video data compression according to the MPEG standard, the amount of bits generated by each frame (image) changes. If the amount of bits is too large or too small, the buffer on the decoder side may overflow or underflow.
For this reason, the decoder buffer is generally assumed to have a certain size, and the supplied video data is encoded while the encoder is controlled so that this buffer does not fail. Hereinafter, such a buffer is called a virtual buffer.
[0016]
(1-1-2) Start-up delay
At the time of decoding, the image is not reproduced immediately when data is input to the decoder buffer; reproduction starts after a certain time has elapsed. This is to prevent the decoder buffer from failing. Hereinafter, this time is referred to as the start-up delay.
The start-up delay is not a fixed value independent of how the video material is compressed; it generally differs from one video material to another.
FIG. 1 shows the virtual buffer 4 and the start-up delay. In this figure, T₁ is one frame time, T₂ is the start-up delay, and V_b is the capacity of the virtual buffer 4. Since data is assumed to be supplied to the video decoder 5 at a constant speed, the line indicating the accumulated amount of input data is a straight line.
[0017]
In such a virtual buffer 4, if decoding starts later than the predetermined start-up delay T₂, as shown for example in FIG. 2 (A), there is a risk of overflow.
On the other hand, if decoding starts earlier than the predetermined start-up delay T₂, as shown in FIG. 2 (B), there is a risk of underflow.
[0018]
(1-1-3) Data amount of compressed video data file
Hereinafter, a data group obtained as a result of encoding one video material is referred to as a “compressed video data file”. Normally, the data amounts of compressed video data files generated from different video materials having the same time length are not the same. As is apparent from FIGS. 3A and 3B, letting V_v be the data amount of the compressed video data file, R_v the video compression bit rate, V_b the virtual buffer capacity, and T₄ the time length of the video material, V_v fluctuates within ± the virtual buffer capacity V_b around the value given by the following formula.
[Expression 1]
$R_v \times T_4$
[0019]
(1-2) Continuous playback of compressed video data file
Consider the following conditions:
1. The MPEG system is adopted as the video data compression system.
2. The compression bit rate of the video data is constant.
3. The start-up delays of the compressed video data files are not necessarily equal.
4. The data amount of each compressed video data file is not necessarily equal to the product of the compression bit rate and the playback time (the time length of the video material).
To continuously play back (decode) compressed video data files under these conditions, the following method is effective:
1. As the video decoder buffer, prepare at least three times the virtual buffer (in capacity or number).
2. For the first file of the group of compressed video data files to be reproduced continuously, start decoding after data equal to the capacity of the virtual buffer has accumulated in the video decoder buffer.
3. Transfer data to the decoder at a transfer rate at which the maximum possible video data file amount can be transferred within the playback time.
[0020]
Hereinafter, this method will be described.
Consider the case where two compressed video data files F₁ and F₂ are played back continuously in this order. Let the compression bit rate of both be R_v and the size of the virtual buffer be V_b. Let their playback times be T_F1 and T_F2, and their start-up delays be Td₁ and Td₂, respectively.
[0021]
(1-2-1) Consideration of indefinite start-up delay
For simplicity, assume here that the data amount of each compressed video data file equals the product of the compression bit rate and the reproduction time (that is, the data amount can be specified).
If Td₁ = Td₂, the two compressed video data files F₁ and F₂ are played back continuously simply by supplying them to the decoder in succession at a transfer rate equal to the compression bit rate R_v.
However, if Td₁ < Td₂, decoding of the compressed video data file F₂ starts earlier than Td₂, so there is a risk that the virtual buffer will underflow.
[0022]
Conversely, if Td₁ > Td₂, decoding of the compressed video data file F₂ starts later than Td₂, so there is a risk that the virtual buffer will overflow.
The maximum start-up delay that can occur in this case equals the time required to transfer data corresponding to the capacity of the virtual buffer at a transfer rate equal to the compression bit rate (T_b = V_b / R_v).
[0023]
Therefore, for the first file of the group of compressed video data files to be reproduced continuously, if decoding is started after waiting for this time T_b regardless of the file's start-up delay, the buffer will not underflow. In other words, the video decoder need only start decoding after V_b worth of data has accumulated.
The minimum start-up delay that can occur is 0. If the start of decoding is delayed by T_b in that case, the buffer will overflow by V_b in the worst case. Therefore, if the video decoder buffer is twice the size of the virtual buffer, it will not overflow.
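As a rough numerical illustration of the reasoning above, the following Python sketch computes the worst-case wait T_b = V_b / R_v and the buffer size needed to absorb an indefinite start-up delay. The values V_b = 4 Mbit and R_v = 4 Mbps are borrowed from the example format given later in this description; the function names are hypothetical and are not part of the embodiment.

```python
def max_startup_delay_s(v_b_mbit: float, r_v_mbps: float) -> float:
    """Time to fill the virtual buffer at the compression bit rate: T_b = V_b / R_v."""
    return v_b_mbit / r_v_mbps


def buffer_for_indefinite_delay_mbit(v_b_mbit: float) -> float:
    """Waiting T_b before decoding avoids underflow. If the true start-up delay
    is 0, up to V_b of extra data accumulates, so 2 x V_b is needed in total."""
    return 2 * v_b_mbit


t_b = max_startup_delay_s(4.0, 4.0)
print(f"T_b = {t_b} s, buffer = {buffer_for_indefinite_delay_mbit(4.0)} Mbit")
```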
[0024]
(1-2-2) Consideration of indefinite data amount
For simplicity, assume here that the start-up delays of the compressed video data files are equal (that is, the start-up delay can be specified); the description refers to FIGS. 4A and 4B.
In general, the size V_v of a compressed video data file does not equal the product of the compression bit rate R_v and its reproduction time T (V_v ≠ R_v T). Consequently, when data is supplied to the decoder at a transfer rate equal to the compression bit rate, the time required for input (V_v / R_v) does not match the time T required for output (reproduction).
[0025]
If, for the compressed video data file F₁, the time required for input to the decoder is longer than the time required for output (V_V1 / R_v > T_F1), the start of decoding of the compressed video data file F₂ deviates from the start-up delay Td₂, and there is a risk that the buffer will overflow.
Conversely, if the time required for input of the compressed video data file F₁ is shorter than the time required for output (V_V1 / R_v < T_F1), the start of decoding of the compressed video data file F₂ again deviates from the start-up delay Td₂, and there is a risk that the buffer will underflow.
[0026]
The maximum amount of compressed video data that can be generated, V_vmax, is
[Expression 2]
$V_{vmax} = R_v \times T + V_b$
Therefore, if the amount of compressed video data actually obtained is less than this value, “dust” data D_d is added to the end of the file so that every file has the size V_vmax. In the MPEG standard, “0” can be added as such “dust” data (hereinafter, the “dust” data D_d added for file size adjustment in this way is called “pad data”).
When the data is input to the decoder at a transfer rate at which all the data of a file size-adjusted in this way can be transferred in time T (R_v' = R_v + V_b / T), both the required input time and the required output time are T, as shown in FIG. 4B, so the buffer failure described above does not occur.
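The size adjustment described above can be sketched as follows. This is a minimal illustration, assuming that the maximum file size is V_vmax = R_v T + V_b (which follows from the stated rate R_v' = R_v + V_b / T); the function name and the sample file size of 25 Mbit are hypothetical.

```python
def pad_to_max(v_v_mbit: float, r_v_mbps: float, v_b_mbit: float, t_s: float):
    """Return (V_vmax, pad amount, R_v') for one compressed video file.

    Assumes V_vmax = R_v * T + V_b; all amounts are in Mbit, rates in Mbps.
    """
    v_vmax = r_v_mbps * t_s + v_b_mbit
    if v_v_mbit > v_vmax:
        raise ValueError("file is larger than the assumed maximum V_vmax")
    pad = v_vmax - v_v_mbit      # "0" pad data appended to the end of the file
    r_v_dash = v_vmax / t_s      # equals R_v + V_b / T
    return v_vmax, pad, r_v_dash


# Hypothetical 6-second clip that compressed to 25 Mbit at R_v = 4 Mbps, V_b = 4 Mbit.
v_vmax, pad, r_v_dash = pad_to_max(25.0, 4.0, 4.0, 6.0)
print(v_vmax, pad, round(r_v_dash, 3))
```

With these figures the padded file is transferred in exactly T = 6 s, so the required input time and the required output time coincide.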
[0027]
However, this transfer rate R_v' is higher than the original transfer rate R_v, so there is a risk that the buffer will overflow regardless of whether continuous playback is performed. Since the buffer overflows by V_b in the worst case, it suffices to prepare twice the virtual buffer as the video decoder buffer to prevent this.
From the above, given that both the start-up delay and the data amount are indefinite, it suffices to prepare three times the virtual buffer (in capacity or number) as the video decoder buffer.
[0028]
(1-3) Audio compression in the MPEG standard
Audio data, unlike video data, has no equivalent of the start-up delay when compressed based on the MPEG standard, and the amount of compressed data equals the product of the compression bit rate and the time length of the material. Therefore, a plurality of compressed audio data files can be reproduced continuously, without any special measures, simply by supplying them to the decoder in succession.
[0029]
(1-4) Minimum unit of compressed data
A video signal is handled in units of frames, and when it is compressed based on the MPEG standard, it is further handled in units of GOPs (Group of Pictures).
For example, in the case of the NTSC system, the frame rate is 29.97 [Hz], and normally 1 GOP is composed of 15 frames. In the case of the PAL system, the frame rate is 25 [Hz], and a GOP consisting of 12 frames and a GOP consisting of 13 frames are repeated alternately.
[0030]
On the other hand, when audio is compressed based on the MPEG standard, for example with a sampling frequency of 48 [kHz] and 1152 samples per unit, the minimum unit is 24 [ms].
As described above, since compressed data has a minimum unit that can be handled, a clip of an arbitrary time length cannot be realized exactly.
Here, for simplicity, the minimum unit of the clip time length is set to 1 second. Such a restriction causes almost no problem in actual operation. Let T be the clip time length, T_v the shortest realizable video time that is not less than T, and T_a the longest realizable audio time that is not more than T_v.
[0031]
In the case of the PAL system, T_v = T, but in the case of the NTSC system, T_v ≠ T.
Specific numerical examples are shown in FIG. 5. In this example, the video is in the NTSC format and the minimum unit of the audio is 24 [ms]. In this example, the error between video and audio, T_v - T_a, takes 24 values from 0 to 23 [ms].
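The relationship between T, T_v and T_a can be checked with the short Python sketch below. It assumes the exact NTSC frame rate 30000/1001 Hz (the text quotes the rounded value 29.97 Hz), 15-frame GOPs, and audio units of 1152 samples at 48 kHz; the function name is hypothetical.

```python
from fractions import Fraction
import math

NTSC_FRAME_RATE = Fraction(30000, 1001)  # assumed exact value of "29.97 Hz"
GOP_FRAMES = 15                          # frames per GOP for NTSC, per the text
AUDIO_UNIT = Fraction(1152, 48000)       # 24 ms audio minimum unit


def playback_times(t_clip_s: int):
    """Return (T_v, T_a) in seconds as exact fractions for a clip of T seconds."""
    gop = GOP_FRAMES / NTSC_FRAME_RATE             # duration of one GOP
    t_v = math.ceil(Fraction(t_clip_s) / gop) * gop   # shortest video time >= T
    t_a = math.floor(t_v / AUDIO_UNIT) * AUDIO_UNIT   # longest audio time <= T_v
    return t_v, t_a


for t in (6, 7):
    t_v, t_a = playback_times(t)
    print(t, float(t_v), float(t_a), int((t_v - t_a) * 1000), "ms")

# The differences for T = 1..24 s cover every value from 0 to 23 ms.
diffs_ms = {int((tv - ta) * 1000) for tv, ta in map(playback_times, range(1, 25))}
```

Under these assumptions a 6-second clip gives T_v = 6.006 s and T_a = 6 s (a 6 ms difference), and a 7-second clip gives a 23 ms difference, in line with the values quoted in this description.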
[0032]
(2) Configuration of encoding / decoding device according to embodiment
In FIG. 6, reference numeral 10 denotes the encoding/decoding device as a whole. An encoder unit 11 encodes the video signal and the audio signal carried by the supplied video/audio signal S1, multiplexes them, and supplies the resulting compressed AV multiplexed data D1 to the data supply unit 12.
The data supply unit 12 records the supplied compressed AV multiplexed data D1 on the recording medium 14 under the control of the controller 13. The data supply unit 12 also reads the compressed AV multiplexed data D2 recorded on the recording medium 14 in units of files under the control of the controller 13, and supplies the data to the demultiplexer 16 of the decoder 15 at a constant speed.
The demultiplexer 16 separates the supplied compressed AV multiplexed data D2 into compressed video data D3, consisting of the compression-encoded video signal, and compressed audio data D4, consisting of the compression-encoded audio signal, and sends them to the video decoder 17 and the audio decoder 18, respectively.
[0033]
At this time, the controller 13 of the data supply unit 12 sends a control information signal S3 to the controller 20 via the shared memory 19, and the controller 20 supplies control signals S4 and S5 based on the control information signal S3 to the video decoder 17 and the audio decoder 18, respectively.
Thus, the video decoder 17 sequentially stores the compressed video data D3 in the buffer 21 based on the control signal S4, and sequentially decodes and outputs this while reading it out.
Similarly, the audio decoder 18 sequentially stores the compressed audio data D4 in the buffer 22 based on the control signal S5, and sequentially decodes and outputs the data while reading it.
[0034]
In the case of this embodiment, the buffer 21 corresponding to the video decoder 17 has a capacity three times that of the virtual buffer. As a result, failure (overflow or underflow) of the buffer 21 caused by the indefinite start-up delay and the indefinite data amount of the supplied compressed video data D3 can be prevented.
In the case of this embodiment, as shown in FIG. 7B, the demultiplexer 16 does not perform speed conversion on the separated compressed video data D3 and compressed audio data D4, but sends them in bursts to the video decoder 17 or the audio decoder 18.
[0035]
Ideally, data compressed at a predetermined compression bit rate should be supplied to the decoders 17 and 18 at a constant speed, as shown in FIG. 7A, but this would require a circuit that performs speed conversion. Therefore, in the encoding/decoding device 10 of the embodiment, the multiplexing format is defined as described below so that reproduction proceeds without failure of the buffers 21 and 22 even when data is input in bursts to the video decoder 17 or the audio decoder 18.
[0036]
(3) Realization of continuous playback and scheduled playback
The encoding / decoding apparatus 10 of this embodiment employs an implicit synchronization method that does not use a time stamp as a method for synchronizing AV.
In this case, in order to play back clips continuously with AV synchronization in this way,
1. Even if clips continue, the heads of video data and audio data should be aligned at the head of each clip.
2. Even if clips continue, the video data and audio data decoder buffers do not fail during playback of each clip.
It is sufficient to satisfy these two points.
A method for satisfying each of these conditions will be described below.
[0037]
(3-1) AV head alignment method for realizing scheduled playback and continuous playback
At present, there are various types of decoders. In the encoding/decoding device 10 of the embodiment, the video decoder 17 is of a type that starts playback 1.5 frame times (0.05 seconds) after receiving a start command, and the audio decoder 18 is of a type that starts playback immediately after receiving a start command (normal decoder specifications). Scheduled playback is realized by the following sequence.
[0038]
1. T_S seconds before the time at which playback is to start (T_S being the time required to transfer the compressed video data for the virtual buffer to the buffer 21 of the video decoder 17), data transfer from the recording medium 14 is started by the controller 13 (FIG. 6) of the data supply unit 12. This time of T_S seconds is the same for any clip, as will be described later.
2. When the demultiplexer 16 receives the first data of the first clip, it notifies the controller 20 of the decoder 15 that the data transfer has started.
3. When (T_S - 0.05) seconds have elapsed, the controller 20 issues a playback start command to the video decoder 17.
4. When a further 0.05 seconds have elapsed, the controller 20 issues a playback start command to the audio decoder 18.
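The sequence above can be laid out on a time axis as in the following sketch. The value T_S = 0.8 s is an assumption taken from the later example (4 Mbit supplied at 5 Mbps), and the function and parameter names are hypothetical.

```python
def schedule(play_time_s: float, t_s_s: float, video_latency_s: float = 0.05):
    """Return (transfer start, video start command, audio start command) times.

    The video decoder is assumed to need video_latency_s (1.5 frame times)
    between its start command and the first displayed frame; the audio
    decoder is assumed to start immediately.
    """
    transfer_start = play_time_s - t_s_s
    video_cmd = transfer_start + (t_s_s - video_latency_s)
    audio_cmd = video_cmd + video_latency_s
    return transfer_start, video_cmd, audio_cmd


# Playback scheduled for t = 10 s with an assumed T_S of 0.8 s.
transfer_start, video_cmd, audio_cmd = schedule(10.0, 0.8)
print(transfer_start, video_cmd, audio_cmd)
```

Both decoders then begin output at the scheduled time, because the video command leads the audio command by exactly the video decoder's start-up latency.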
[0039]
As a result, playback of the video data and the audio data starts simultaneously at the predetermined time. Realizing this method requires measuring elapsed time.
However, in the encoding/decoding device 10 of this embodiment, the compressed AV multiplexed data D2 is transferred to the demultiplexer 16 at a constant speed, so the elapsed time can be determined by counting the amount of data that has passed through the demultiplexer 16.
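Because the transfer rate is constant, a byte count maps directly to elapsed time. The following sketch shows the conversion in both directions, assuming the 5 Mbps transfer rate used in the later example; the function names are hypothetical.

```python
def bytes_for_duration(rate_mbps: float, seconds: float) -> int:
    """Number of bytes that pass in `seconds` at a constant `rate_mbps`."""
    return round(rate_mbps * 1_000_000 * seconds / 8)


def elapsed_seconds(byte_count: int, rate_mbps: float) -> float:
    """Elapsed time implied by `byte_count` bytes at a constant `rate_mbps`."""
    return byte_count * 8 / (rate_mbps * 1_000_000)


print(bytes_for_duration(5, 0.001))   # bytes transferred in 1 ms at 5 Mbps
print(bytes_for_duration(5, 0.05))    # bytes transferred in 0.05 s at 5 Mbps
print(elapsed_seconds(625, 5))
```

A position in the data file can therefore stand in for a point in time, which is what allows timing to be expressed as a location in the stream rather than as a counter value.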
[0040]
Therefore, in this embodiment, instead of using a counter or the like, the encoder unit 11 inserts a special code (hereinafter referred to as a mark) into the data file, when it encodes and multiplexes the video signal and the audio signal, at the position corresponding to the time at which a command should be issued to the decoders 17 and 18.
When the demultiplexer 16 detects this mark in the data file, it notifies the controller 20 by means of a mark detection signal S10 (FIG. 6), and the controller 20 sends the control signals S4 and S5 to the video decoder 17 and the audio decoder 18 according to the content of the mark detection signal S10.
As a result, in the encoding/decoding device 10, the controller 20 can drive the video decoder 17 and the audio decoder 18 at the timing described above.
[0041]
For this mark, a 4-byte code that the MPEG system standard guarantees never to appear in the compressed data is used, so that the mark can be identified within the compressed AV multiplexed data D2.
In this embodiment, the codes “000001BD” and “000001BF” (in hexadecimal notation) are used as the video decode start mark, which causes the video decoder 17 to start decoding, and the audio decode start mark, which causes the audio decoder 18 to start decoding, respectively.
[0042]
The transfer rate of the compressed AV multiplexed data D2 is set to 5 [Mbps], and 4 [Mbit] of the compressed AV multiplexed data D2 must be supplied in order to transfer the compressed video data D3 corresponding to the capacity of the virtual buffer to the buffer 21 of the video decoder 17. In this case, as shown in FIG. 8, at the position given by
[Equation 3]

the video decode start mark M₁ “000001BD” is placed, and after the amount given by the following formula
[Expression 4]

the audio decode start mark M₂ “000001BF” is placed.
Next, an AV head alignment method for realizing continuous reproduction will be described.
[0043]
As described above in “(1-4) Minimum unit of compressed data”, the playback times of video data and audio data do not match. For example, in a clip with a time length of 6 seconds, the video playback time is longer by 0.006 seconds. Therefore, if two 6-second clips are played back continuously, the audio of the second clip starts 0.006 seconds early even if AV synchronization of the first clip is perfect. A difference of this size is not actually noticeable, but it accumulates as the number of consecutive clips increases, and once it grows to about 0.06 seconds it becomes noticeable.
For this reason, in the encoding/decoding device 10 of this embodiment, the heads of the video and audio are aligned at the head of each clip by the following method.
1. When the audio playback time T_a has elapsed since playback started, the audio decoder 18 is stopped.
2. When the playback time difference (T_v - T_a) has further elapsed, the audio decoder 18 is started again.
3. This operation is repeated for each clip.
Realizing this method requires measuring the elapsed time, as in the case of scheduled playback.
[0044]
For this reason, in this embodiment, as in the case described above, the encoder unit 11 also adds predetermined marks at predetermined positions of each data file.
For example, in the case of a clip of T = 6 [s] according to FIG. 5, the reproduction time difference is 0.006 seconds, so the audio decoding stop marks M_s1 to M_s23, which cause the audio decoder 18 to stop decoding, are inserted at positions before the audio decoding start mark M₂ (FIG. 8).
In this case, when the minimum unit of the compressed audio data is 24 [ms], there are 24 reproduction time differences from 0 to 23 [ms]; however, when the difference is 0 [ms] there is no need to stop the audio decoder 18, so it suffices to prepare the 23 audio decoding stop marks M_s1 to M_s23.
[0045]
For this reason, the encoder unit 11 assigns, as the audio decoding stop marks M_s1 to M_s23, for example the 23 codes from “000001C1” to “000001D7” to the reproduction time differences of 1 to 23 [ms], respectively. These codes therefore appear every 5 [Mbps] × 0.001 [s] = 625 [bytes]. The encoder unit 11 also adds to the end of each clip a mark indicating which of these 23 codes should be used to stop decoding (hereinafter referred to as the stop mark designation mark M₃; for example, “000001D8”), immediately followed by the audio decoding stop mark M₄ that is to become valid thereafter.
For example, in the case of a clip of T = 6 [s], the reproduction time difference is 0.006 [s], so the audio decoding stop mark M_s6 becomes valid, and the last 8 [bytes] of this clip are therefore “000001D8000001C6”.
[0046]
When clip 1 (T = 6 [s]) and clip 2 (T = 7 [s]) are reproduced continuously, the stop mark designation mark M₃ of clip 1 indicates that the audio decoding stop mark M_s6 is valid. Accordingly, when the demultiplexer 16 detects the audio decoding stop mark M_s6 in clip 2, a stop command is issued to the audio decoder 18, and when the audio decoding start mark M₂ of clip 2 is detected, a start command is issued to the audio decoder 18.
To clear any stop mark M_s1 to M_s23 that is still valid before continuous playback starts, “00000000” is set as the valid stop mark, which prevents a stop command from being issued by the stop marks M_s1 to M_s23 of the first clip.
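The code assignment described above can be sketched as follows. The sketch assumes only what the text states: stop marks “000001C1” to “000001D7” for differences of 1 to 23 ms, and a clip trailer consisting of the designation mark “000001D8” followed by the valid stop mark. The function names are hypothetical, and the trailer for a 0 ms difference is left out because the text does not specify it.

```python
STOP_MARK_BASE = 0x000001C0          # M_s(n) is encoded as base + n, n = 1..23
STOP_MARK_DESIGNATION = 0x000001D8   # stop mark designation mark M3


def stop_mark_code(diff_ms: int) -> bytes:
    """4-byte audio decoding stop mark for a playback time difference in ms."""
    if not 1 <= diff_ms <= 23:
        raise ValueError("stop marks are defined only for 1..23 ms")
    return (STOP_MARK_BASE + diff_ms).to_bytes(4, "big")


def clip_trailer(diff_ms: int) -> bytes:
    """Last 8 bytes of a clip: designation mark followed by the valid stop mark."""
    return STOP_MARK_DESIGNATION.to_bytes(4, "big") + stop_mark_code(diff_ms)


print(clip_trailer(6).hex().upper())
```

For the 6-second clip (6 ms difference) this reproduces the 8-byte trailer quoted in the text.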
[0047]
(3-2) Buffer usage method for continuous playback
(3-2-1) Basic structure of compressed AV multiplexing format
Hereinafter, a group of data of the same type is referred to as a “packet”. Here, the type means one of three types: video, audio, and pad. A header is attached to the head of each packet to indicate its type.
A group of several packets is called a “pack”. Unlike a packet, a pack does not have any identification header.
In the encoding/decoding device 10 of this embodiment, the following three types of packs are used.
1. Standard pack
...... In principle, it consists of one video packet, one audio packet, and one pad packet.
2. Leading pack
...... The amount of video packet data is larger than in the standard pack by the amount of the virtual buffer.
3. Residual pack
...... It consists of the remaining data not included in the above two types of packs.
[0048]
In this embodiment, as shown in FIG. 9A, one compressed AV multiplexed data file obtained by encoding one clip has one “leading pack” at the beginning, followed by a plurality of consecutive “standard packs”, and one “residual pack” at the end.
Since all the control marks are located in the video packet of the leading pack, there is no possibility that a mark is inserted into the contents of a packet header and makes the packet header unrecognizable.
[0049]
(3-2-2) Packet header
As the header for identifying the type of each packet, a pattern that never appears in the compressed data must be adopted. For this reason, the encoding/decoding device 10 employs patterns reserved for headers in the MPEG system standard. In practice, for example, the following patterns are used. Each header is shown in hexadecimal and its size is 4 bytes.
Video packet header ...... 000001E3
Audio packet header ...... 000001C0
Pad packet header ...... 000001BE
Thus, the demultiplexer 16 examines every run of 4 consecutive bytes of the input compressed AV multiplexed data D2. If it is a video packet header, the demultiplexer distributes the data that follows to the video decoder 17; if it is an audio packet header, it distributes the data that follows to the audio decoder 18; and if it is a pad packet header, it discards the data that follows.
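The header-driven routing described above can be illustrated with the following Python sketch. It is a simplified model, not the circuit of the embodiment: it handles only the three packet headers listed above, ignores the decoder control marks, and uses hypothetical names throughout.

```python
VIDEO_HEADER = bytes.fromhex("000001E3")
AUDIO_HEADER = bytes.fromhex("000001C0")
PAD_HEADER = bytes.fromhex("000001BE")
HEADERS = {VIDEO_HEADER: "video", AUDIO_HEADER: "audio", PAD_HEADER: "pad"}


def demultiplex(stream: bytes):
    """Split a multiplexed byte stream into (video bytes, audio bytes).

    Every 4-byte window is compared with the known headers; the payload that
    follows a header goes to the matching output, and pad payload is dropped.
    """
    video, audio = bytearray(), bytearray()
    current = None            # type of the packet currently being read
    i = 0
    while i < len(stream):
        kind = HEADERS.get(stream[i:i + 4])
        if kind is not None:  # a header switches the destination
            current = kind
            i += 4
            continue
        if current == "video":
            video.append(stream[i])
        elif current == "audio":
            audio.append(stream[i])
        # pad payload, and any data before the first header, is discarded
        i += 1
    return bytes(video), bytes(audio)


sample = (VIDEO_HEADER + b"VVVV" + AUDIO_HEADER + b"AA"
          + PAD_HEADER + b"\x00\x00\x00" + VIDEO_HEADER + b"vv")
video_out, audio_out = demultiplex(sample)
print(video_out, audio_out)
```

The scheme relies on the header patterns never occurring inside the compressed payload, which is why the text requires patterns reserved by the MPEG system standard.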
[0050]
(3-2-3) Restrictions on compression bit rate values, etc.
For a clip of time length T, with virtual buffer amount V_b and video compression bit rate R_v, the maximum amount of compressed video data that can be generated is given by
[Equation 5]

For this reason, in the encoding/decoding device 10 of this embodiment, when the amount of compressed video data actually obtained is less than this value, pad data is appended to the compressed video data so that every file has the size V_vmax. When all the size-adjusted data is transferred in time T, the video data transfer rate R_v′ is expressed by the following equation.
[Formula 6]

R_v′ takes its maximum value R_vmax′ when the clip time length is the minimum T_min.
[Expression 7]

The compressed AV multiplexed data transfer rate R_m must be at least as large as the sum of the compressed video data transfer rate and the compressed audio data transfer rate. For audio, a transfer rate equal to the compression bit rate suffices. Therefore, these parameters must satisfy the following equations. (Strictly, the headers and marks must also be considered, but they are omitted here for simplicity.)
[Equation 8]

[Equation 9]

[0051]
In practice, the parameters R_v, R_m, R_a, T_min, V_b and so on cannot take arbitrary values in the encoding/decoding device 10; they are restricted by the purpose of use and by the hardware. For this reason, the parameters are determined in order, starting from the most tightly restricted ones, and are selected so as to satisfy expression (9).
[0052]
(3-2-4) Calculation of packet size
An equation for calculating the size of each packet when a parameter value such as a compression bit rate value is given is shown below.
First, the variables used in the following calculation are listed.
Video compression bit rate ...... R_v [Mbps]
Video data transfer rate ...... R_v′ [Mbps]
Audio compression bit rate ...... R_a [kbps]
Compressed AV multiplexed data transfer rate ...... R_m [Mbps]
Clip time length (in units of 1 second) ...... T [s]
Minimum clip time length (in units of 1 second) ...... T_min [s]
Realizable video playback time closest to and not less than T ...... T_v [s]
Realizable audio playback time closest to and not more than T_v ...... T_a [s]
Time required to transfer a clip ...... T_t [s]
Time required to transfer the compressed video data for the virtual buffer ...... T_S [s]
Virtual buffer amount ...... V_b [Mbit]
Compressed video data amount of one clip ...... V_v [byte]
Compressed audio data amount of one clip ...... V_a [byte]
Pad data amount of one clip ...... V_p [byte]
Total data amount of one clip ...... V_m [byte]
Total mark data amount ...... V_k [byte] (= 4 × 27 [byte])
Packet header size ...... L_h [byte] (= 4 [byte])
Standard video packet size (excluding header) ...... L_v [byte]
Standard audio packet size (excluding header) ...... L_a [byte]
Standard pad packet size (excluding header) ...... L_p [byte]
Standard pack size (including headers) ...... L_m [byte]
Number of packs (excluding the residual pack) ...... P
* A “standard packet” means a packet in a standard pack.
Residual video packet size (excluding header) ...... L_v′ [byte]
Residual audio packet size (excluding header) ...... L_a′ [byte]
Residual pad packet size (excluding header) ...... L_p′ [byte]
* A “residual packet” means a packet in the residual pack.
Unit conversion coefficient (Mbit → byte) ...... C_m = 125000 = 1000 × 1000 / 8
Unit conversion coefficient (kbit → byte) ...... C_k = 125 = 1000 / 8
[0053]
When T is in units of 1 second, for simplicity R_v, R_a, L_v and L_a should be selected so that the standard packet sizes L_v and L_a are the same for clips of any time length.
[Expression 10]

[Expression 11]

[Expression 12]

Usually, R_v is specified in units of 0.1 [Mbps] at the finest, and R_a is one of 64, 128, 192, 256 and 384 [kbps]. Therefore, integers L_v and L_a that satisfy the above equations exist.
The residual pack is composed of the fractional data arising from the restriction on the minimum unit that compressed data can take, that is, R_v × (T_v - T) for video and R_a × (T_a - T) for audio. The leading pack and the standard packs are composed of the rest of the data, that is, R_v × T + V_b for video and R_a × T for audio.
[0054]
When R_a and R_v are selected in this way, the following equations hold.
[Formula 13]

[Expression 14]

[Expression 15]

[Expression 16]

From equations (14) and (16), L_v can be calculated from R_v, R_a and L_a.
The audio data amount V_a is
[Expression 17]

the video data amount V_v is
[Expression 18]

and the multiplexed data amount V_m is
[Equation 19]

Therefore, the pad data amount V_p is
[Expression 20]

and hence
[Expression 21]

[0055]
L_p and L_p′ are obtained from equation (21).
When T_a < T, there exist integers K and L_a″ that satisfy the following formula.
[Expression 22]

At this time, in the (K+1)-th pack from the end of the standard packs, the data size of the audio packet is set to L_a - L_a″, and a pad packet of size L_a″ - L_h is placed immediately after this packet. In the K packs from the end of the standard packs, a pad packet of the same size is placed instead of the audio packet.
Further, the following equations hold for the residual pack.
[Expression 23]

[Expression 24]

[0056]
(3-2-5) Conditions for realizing continuous playback
As described above in “(1-2) Continuous playback of compressed video data file”, in order to continuously play back compressed video data alone, the video decoder buffer must be three times the virtual buffer.
The following shows the conditions under which continuous reproduction remains possible with this video decoder buffer even when video and audio are multiplexed.
First, as shown in “(1-2-1) Consideration of indefinite start-up delay”, a capacity twice that of the virtual buffer is required to cope with the indefinite start-up delay of each clip. It is this 2× buffer that covers the “region over which the data output broken line can shift” shown in the figure. Therefore, if the data input broken line can be kept within the remaining 1× buffer, continuous playback can be performed with a 3× buffer.
[0057]
In the buffer 21 of the video decoder 17, there is a period during which data of only one clip is present and a period during which data of two clips is present.
If T_t = T_v, the period during which data of two clips is present is only the first T_S of each clip. For any clip, this is the time during which the leading pack is transferred; since no audio packets or pad packets are included other than at the beginning, the input broken line is a straight line and, as is apparent from the figure, always lies below the straight line (1). That is, there is no data input broken line in the 1× buffer.
For the period during which data of only one clip is present (the time after T_S has elapsed), the data input broken line need only fall between the straight line (1) and the straight line (2) in the figure. For that purpose, the following conditions should be satisfied.
Condition 1: The straight line (1) and the straight line (3) do not intersect except at the origin.
[0058]
In other words, v₃(t) > v₁(t) need only hold for any t > 0.
[Expression 25]

[Equation 26]

Condition 2: The straight line (2) and the straight line (4) do not cross until the time t_e at which the transfer of the video data finishes. That is, v₄(t_e) < v₂(t_e) should hold. In this case,
[Expression 27]

So,
[Expression 28]

[Expression 29]

[0059]
The equations for the straight lines (1) to (4) are as follows. Strictly speaking, it is necessary to consider the amount of headers and marks, but it is ignored because it is very small compared to the amount of compressed data.
[Expression 30]

[Expression 31]

[Expression 32]

[Expression 33]

Therefore, to realize continuous playback with three times the virtual buffer, it suffices to set T_t = T_v and select the parameters so as to satisfy expressions (26) and (29); the parameters are actually selected in this way in the encoding/decoding device 10 of this embodiment.
[0060]
(3-2-6) Example of compressed AV multiplexed format capable of continuous reproduction
A format that satisfies the requirements is shown below for R_m = 5 [Mbps], R_v = 4 [Mbps], R_a = 256 [kbps], T_min = 6 [s] and V_b = 4 [Mbit]. The video is assumed to be NTSC.
In this case, the right side of equation (9) is 5 [Mbps], and the left side is
[Expression 34]

Therefore, equation (9) is satisfied, and it can be seen that these values can be selected.
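The check can be reproduced numerically with the sketch below. It assumes that the left side of equation (9) is the sum described in (3-2-3), that is, the maximum video transfer rate R_v + V_b / T_min plus the audio rate R_a, with headers and marks ignored; the function name is hypothetical.

```python
def required_rate_mbps(r_v_mbps: float, v_b_mbit: float,
                       t_min_s: float, r_a_kbps: float) -> float:
    """Assumed left side of equation (9): R_v + V_b / T_min + R_a, in Mbps."""
    return r_v_mbps + v_b_mbit / t_min_s + r_a_kbps / 1000


R_M_MBPS = 5.0   # multiplexed data transfer rate R_m from the example
need = required_rate_mbps(4.0, 4.0, 6.0, 256.0)
print(round(need, 3), need <= R_M_MBPS)
```

Under this assumption the required rate is about 4.92 Mbps, which stays below the 5 Mbps multiplexed transfer rate.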
For example, in the case of the clip time length T = 6 [s], from FIG. 5,
T_t = T_v = 6.006 [s], T_a = 6 [s]
L_a is set appropriately to 1000 [bytes] (if the value is too large, the audio buffer may fail; conversely, if it is too small, the number of standard packs increases and the overhead due to headers increases), and the number of packs is obtained.
[Expression 35]

The size of the standard video packet is obtained from equation (16).
[Expression 36]

The size of the pad packet is obtained from equation (21).
[Expression 37]

Therefore, L_p = 293 [bytes] and L_p′ = 74 [bytes].
[0061]
The size of the residual video packet is obtained from equation (23).
[Formula 38]

Here, T_a = T, so the size of the residual audio packet is zero.
The size of the standard pack is obtained from the above values.
[Expression 39]

At this time, it can be seen that equation (26) is satisfied.
0.92> 0.8
At this time, it is understood that equation (29) is also satisfied.
23.2 <24
The resulting format is shown in FIG. 12. The case of a clip time length of 7 seconds is calculated in the same way, and the result is shown in FIG. 13.
In this case, since T_a ≠ T, the last standard pack differs from the other standard packs and contains pad packets equivalent to T_a - T.
Once the value of each parameter is determined, a format table such as those shown in FIGS. 12 and 13 can be created for each clip time length regardless of the contents of the clip. When compressed AV data is actually multiplexed, the compressed data obtained by each encoder need only be multiplexed according to this table.
[0062]
For this reason, in the encoding/decoding device 10 of this embodiment, such a format table is stored in advance in the encoder unit 11, and the video signal and the audio signal encoded by the encoder unit 11 are multiplexed on the basis of this format table.
[0063]
(4) Operation of the first embodiment
In the above configuration, the encoding/decoding device 10 encodes, in the encoder unit 11, the video signal and the audio signal based on the video/audio signal S1 supplied to the encoder unit 11, and multiplexes them in accordance with the first multiplexing format described above to form the compressed AV multiplexed data D1. At this time, for each clip, the video decoding start mark M₁, the audio decoding start mark M₂, the audio stop marks M_S1 to M_S23, the stop mark designation mark M₃, and the audio decoding stop mark M₄ (hereinafter collectively referred to as the decoder control marks M₁ to M₄ and M_S1 to M_S23) are added at predetermined positions, and the result is supplied to the data supply unit 12 and recorded on the recording media 14.
The compressed AV multiplexed data D1 is read in units of data files under the control of the controller 13 and is supplied to the demultiplexer 16 as compressed AV multiplexed data D2. The demultiplexer 16 divides it into compressed video data D3 and compressed audio data D4, which are sent to the video decoder 17 and the audio decoder 18, respectively.
At this time, when the demultiplexer 16 detects the decoder control marks M₁ to M₄ and M_S1 to M_S23 in the data file, it sends a mark detection signal S10 based on the detection result to the controller 20.
Based on the mark detection signal S10, the controller 20 outputs control signals S4 and S5 to the video decoder 17 and the audio decoder 18, respectively, and thereby drives and controls them so that the video decoder 17 and the audio decoder 18 are started or stopped as necessary. At this time, the controller 20 stops the reproduction operation of the audio decoder 18 for a predetermined time corresponding to the clip time length.
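The mark-driven control described above can be pictured as a small dispatch loop. The following is a minimal sketch under stated assumptions, not the implementation of the embodiment: the class and function names are hypothetical, the sequence of marks is made up, and only the marks M₁, M₂ and M₄ are handled. The stop mark designation mark M₃ and the audio stop marks M_S1 to M_S23 are left out because their handling is not spelled out in this passage.

```python
class Decoder:
    """Stand-in for the video decoder 17 or the audio decoder 18."""
    def __init__(self, name):
        self.name = name
        self.running = False

def handle_mark(mark, video, audio):
    """Hypothetical controller step: react to one detected mark."""
    if mark == "M1":      # video decoding start mark
        video.running = True
    elif mark == "M2":    # audio decoding start mark
        audio.running = True
    elif mark == "M4":    # audio decoding stop mark
        audio.running = False
    # Other marks (M3, MS1..MS23) are ignored in this sketch.

video = Decoder("video decoder 17")
audio = Decoder("audio decoder 18")
history = []
for mark in ["M1", "M2", "M4", "M2"]:   # marks in detection order (made up)
    handle_mark(mark, video, audio)
    history.append((mark, video.running, audio.running))
```

Keeping the decision logic in one function mirrors the division of roles in the text: the demultiplexer only reports which mark it saw, and the controller alone decides which decoder to start or stop.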
[0064]
As described above, in the encoding/decoding device 10, the encoder side adds to each clip the decoder control marks M₁ to M₄ and M_S1 to M_S23 for control corresponding to the clip time length, and the decoder side drives and controls the video decoder 17 and the audio decoder 18 based on these decoder control marks M₁ to M₄ and M_S1 to M_S23, so that the audio and video can be made complete within each clip.
Further, in this encoding/decoding device 10, the buffer 21 of the video decoder 17 is set to three times the capacity of the virtual buffer. In addition, so that the buffer 21 does not fail, the video signal and the audio signal encoded by the encoder unit 11 are multiplexed in accordance with the first multiplexing format described above, which depends only on the clip time length and in which the first video packet is larger than the other video packets by the capacity of the virtual buffer. The buffer 21 of the video decoder 17 therefore does not fail.
Accordingly, since the two conditions presented in “(3) Realization of continuous playback” are satisfied, it is possible to perform continuous playback while maintaining AV synchronization.
[0065]
Therefore, for example, television commercials (CMs) can be used as the clip material, MPEG-compressed and AV-multiplexed according to the multiplexing format described above, and recorded on a large-capacity, randomly accessible recording medium such as an HDD. If the designated group of compressed CM data is then supplied to the decoder at the designated time according to the CM broadcast timetable, the CMs can be broadcast continuously at the designated times.
[0066]
(5) Effects of the first embodiment
According to the above configuration, the encoder side adds to each clip the decoder control marks M₁ to M₄ and M_S1 to M_S23 for control corresponding to the clip time length, and the decoder side drives and controls the video decoder 17 and the audio decoder 18 based on these decoder control marks M₁ to M₄ and M_S1 to M_S23. In addition, the encoder unit 11 multiplexes the encoded video signal and the encoded audio signal according to the first multiplexing format, which depends only on the clip time length and in which the first video packet is larger than the other video packets by the capacity of the virtual buffer. As a result, the heads of the audio and video of each clip can be aligned without the buffer 21 failing, and an encoding/decoding device capable of continuously reproducing a plurality of clips at designated times can be realized.
[0067]
[2] Second embodiment
FIG. 14, in which parts corresponding to those in FIG. 6 are assigned the same reference numerals, shows the encoding/decoding device 40 according to the second embodiment. It is configured in substantially the same manner as the encoding/decoding device 10 (FIG. 6) of the first embodiment, except that the controller 42 of the decoder 41 controls the driving of the video decoder 17 and the audio decoder 18 based on reproduction time difference data between the video and the audio for each clip time length, which is stored in advance in its internal memory.
That is, in the encoding/decoding device 40, the encoder 43 encodes the video signal and the audio signal based on the supplied video/audio signal S1, multiplexes them according to the first multiplexing format described above, and sends the resulting compressed AV multiplexed data D30 to the data supply unit 44.
The data supply unit 44 records the supplied compressed AV multiplexed data D30 on the recording media 14 under the control of the controller 45. The data supply unit 44 reproduces the compressed AV multiplexed data D31 recorded on the recording medium 14 in units of clips based on the control of the controller 45, and sends this to the demultiplexer 46 of the decoder 41.
The demultiplexer 46 sequentially divides the supplied compressed AV multiplexed data D31 into compressed video data D4 and compressed audio data D5, and sends them to the video decoder 17 and the audio decoder 18, respectively.
At this time, the controller 45 supplies the clip time length as a time information signal S30 to the controller 42 of the decoder 41 via the shared memory 19.
[0068]
The controller 42 stores in advance the reproduction time difference data between the video and the audio for each clip time length. Based on the supplied time information signal S30, it generates control signals S4 and S5 from the reproduction time difference data for the corresponding clip time length and sends them to the video decoder 17 and the audio decoder 18, respectively.
As a result, the controller 42 stops the audio decoder 18 once the audio playback time Tc has elapsed since the controller 45 of the data supply unit 44 began to play back the clip from the recording media 14, and restarts the audio decoder 18 once the playback time difference (Tv − Tc) has subsequently elapsed. This drive control is repeated for each clip in turn, so that the heads of the video and the audio are aligned at the head of each clip.
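The timing just described can be illustrated with a short calculation. This is a sketch under stated assumptions rather than the controller's actual logic: the function name is hypothetical, times are handled as whole milliseconds, clips are assumed to be played back to back, and the audio of a clip is assumed never to be longer than its video. The figures used are those of the 6 s clip in section (3-2-6), that is, 6000 ms of audio and 6006 ms of video.

```python
def audio_schedule(clips_ms):
    """Return (stop, restart) times in ms for the audio decoder, per clip.

    clips_ms is a list of (audio_ms, video_ms) pairs, one per clip, where
    video_ms - audio_ms is the stored playback time difference.
    """
    events = []
    clip_start = 0
    for audio_ms, video_ms in clips_ms:
        stop = clip_start + audio_ms              # audio playback time elapsed
        restart = stop + (video_ms - audio_ms)    # wait out the time difference
        events.append((stop, restart))
        clip_start = restart                      # next clip starts here
    return events

# Two consecutive 6 s clips: audio 6000 ms, video 6006 ms each.
print(audio_schedule([(6000, 6006), (6000, 6006)]))
```

Because the restart time of one clip equals the start time of the next, the audio of every clip begins at the same instant as its video, which is the alignment the text aims for.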
In the above-described configuration, in the encoding/decoding device 40, when the controller 45 of the data supply unit 44 reproduces the compressed AV multiplexed data D31 recorded on the recording medium 14 in units of clips, it supplies the reproduction time information of the clip to the controller 42 of the decoder 41.
[0069]
The controller 42, for its part, drives and controls the video decoder 17 and the audio decoder 18 so that the heads of the video and the audio are aligned at the head of each clip, based on the clip reproduction time information supplied from the controller 45 and the reproduction time difference data held by the controller 42 itself.
As described above, in the encoding/decoding device 40, the audio decoder 18 is driven and controlled in units of clips, so that the reproduction time difference between the video and the audio can be corrected completely within each clip and AV synchronization can be achieved.
In addition, the buffer 21 of the video decoder 17 is selected to be three times the capacity of the virtual buffer, and the encoder 43 multiplexes the encoded video signal and audio signal according to the first multiplexing format. Therefore, each clip can be reproduced continuously while maintaining AV synchronization, as in the first embodiment.
[0070]
According to the above configuration, the capacity of the buffer 21 of the video decoder 17 is set to three times that of the virtual buffer, and the encoder 43 multiplexes the encoded video signal and audio signal in accordance with the first multiplexing format. When the controller 45 of the data supply unit 44 reproduces the compressed AV multiplexed data D31 recorded on the recording medium 14 in units of clips, it supplies the reproduction time information of the clip to the controller 42 of the decoder 41, and the controller 42 drives and controls the video decoder 17 and the audio decoder 18 on the basis of this reproduction time information and the reproduction time difference data that it holds. In this way, as with the encoding/decoding device 10 of the first embodiment, an encoding/decoding device capable of continuously reproducing a plurality of clips at designated times can be realized.
[0071]
[Effects of the invention]
As described above, according to the present invention, the reproduction time difference between compressed video data and compressed audio data, which arises for each time length of a material formed by compressed video data and compressed audio data compression-encoded at a predetermined compression bit rate, is stored on the decoding side as reproduction time difference data. The reproduction time difference between the compressed video data and the compressed audio data forming a material can therefore be specified from the time length of the material alone, regardless of the presence or absence of time stamps. By stopping the decoding operation for the compressed audio data for the specified reproduction time difference, the reproduction start times of the compressed video data and the compressed audio data forming each material can be synchronized when a plurality of materials are reproduced continuously. A decoding device and an encoding/decoding device can thus be realized that, with a much simpler configuration than before, start reproduction of each material at a designated time and reproduce a plurality of materials continuously.
Further, according to the present invention, by setting in advance a minimum unit for the time length of the material, the number of combination patterns of the time length of the material and the reproduction time difference between the compressed video data and the compressed audio data forming the material can be reduced significantly compared with the case where no minimum unit is set. The reproduction time difference between the compressed video data and the compressed audio data can therefore be specified with reproduction time difference data of a smaller data amount.
[Brief description of the drawings]
FIG. 1 is a graph and a block diagram for explaining a virtual buffer and a start-up delay.
FIG. 2 is a graph for explaining a start-up delay and buffer failure.
FIG. 3 is a graph for explaining the data amount of a compressed video data file.
FIG. 4 is a graph for explaining adjustment of the data amount and the data transfer rate.
FIG. 5 is a chart showing realizable times of video and audio.
FIG. 6 is a block diagram showing the configuration of the encoding/decoding device according to the first embodiment.
FIG. 7 is a block diagram showing a data transfer form after demultiplexing.
FIG. 8 is a schematic diagram for explaining a decoder control mark.
FIG. 9 is a schematic diagram showing a format of a compressed AV multiplexed data file.
FIG. 10 is a graph showing transition of input to a video decoder.
FIG. 11 is a graph for explaining a usage situation of a video decoder buffer during continuous reproduction.
FIG. 12 is a chart showing an example of a compressed AV multiplexing format.
FIG. 13 is a chart showing an example of a compressed AV multiplexing format.
FIG. 14 is a block diagram showing the configuration of the encoding/decoding device according to the second embodiment.
FIG. 15 is a chart showing an example of a compressed AV multiplexing format.
FIG. 16 is a chart showing an example of a compressed AV multiplexing format.
FIG. 17 is a block diagram for explaining a time stamp method.
FIG. 18 is a schematic diagram for explaining continuous reproduction in the case of a time stamp method.
[Explanation of symbols]
10, 40 ... Encoding/decoding device, 11, 43 ... Encoder, 12, 44 ... Data supply unit, 13, 20, 42, 45 ... Controller, 14 ... Recording media, 16, 46 ... Demultiplexer, 17 ... Video decoder, 18 ... Audio decoder, 21, 22 ... Buffer, S4, S5 ... Control signal, S10 ... Mark detection signal, D1, D2 ... Compressed AV multiplexed data, M₁ ... Video decoding start mark, M₂ ... Audio decoding start mark, M₃ ... Stop mark designation mark, M₄ ... Audio decoding stop mark, M_S1 to M_S23 ... Audio stop marks.

Claims

1. A decoding device for decoding compressed video data and compressed audio data forming a material, characterized by comprising:
dividing means for dividing and outputting the multiplexed compressed video data and compressed audio data, which are supplied from a data supply source and have been compression-encoded at a predetermined compression bit rate;
first decoding means for decoding the compressed video data output from the dividing means;
second decoding means for decoding the compressed audio data output from the dividing means; and
control means for stopping the decoding operation of the second decoding means on the compressed audio data for the reproduction time difference, based on time information indicating the time length of the material supplied from the data supply source and on reproduction time difference data, stored in advance, indicating the reproduction time difference between the compressed video data and the compressed audio data that arises for each time length of the material.

2. The decoding device according to claim 1, wherein
a minimum unit is set in advance for the time length of the material, and
the control means stores the reproduction time difference data indicating the reproduction time difference that arises for each time length equal to or greater than the minimum unit.

3. An encoding/decoding device comprising an encoding unit and a decoding unit, in which compressed video data and compressed audio data compression-encoded in the encoding unit are supplied to the decoding unit via a data supply unit and are decoded by the decoding unit, characterized in that
the encoding unit comprises:
encoding means for compression-encoding each of the video signal and the audio signal forming a material at a predetermined compression bit rate and outputting the compressed video data and the compressed audio data thus obtained; and
multiplexing means for multiplexing the compressed video data and the compressed audio data output from the encoding means and supplying the multiplexed data to the data supply unit,
and the decoding unit comprises:
dividing means for dividing and outputting the multiplexed compressed video data and compressed audio data supplied from the data supply unit;
first decoding means for decoding the compressed video data output from the dividing means;
second decoding means for decoding the compressed audio data output from the dividing means; and
control means for stopping the decoding operation of the second decoding means on the compressed audio data for the reproduction time difference, based on time information indicating the time length of the material supplied from the data supply unit and on reproduction time difference data, stored in advance, indicating the reproduction time difference between the compressed video data and the compressed audio data that arises for each time length of the material.

4. The encoding/decoding device according to claim 3, wherein
a minimum unit is set in advance for the time length of the material, and
the control means stores the reproduction time difference data indicating the reproduction time difference that arises for each time length equal to or greater than the minimum unit.

5. A decoding method for decoding compressed video data and compressed audio data forming a material, characterized by comprising:
a dividing step of dividing the multiplexed compressed video data and compressed audio data, which have been compression-encoded at a predetermined compression bit rate; and
a decoding step of decoding the divided compressed video data and compressed audio data,
wherein, in the decoding step, the decoding process for the compressed audio data is stopped for the reproduction time difference, based on time information indicating the time length of the material and on reproduction time difference data indicating the reproduction time difference between the compressed video data and the compressed audio data that arises for each time length of the material.

6. An encoding/decoding method for compression-encoding each of a video signal and an audio signal forming a material and decoding the compressed video data and the compressed audio data thus obtained, characterized by comprising:
an encoding step of compression-encoding each of the video signal and the audio signal at a predetermined compression bit rate;
a multiplexing step of multiplexing the compressed video data and the compressed audio data thus obtained;
a dividing step of dividing the multiplexed compressed video data and compressed audio data; and
a decoding step of decoding the divided compressed video data and compressed audio data,
wherein, in the decoding step, the decoding process for the compressed audio data is stopped for the reproduction time difference, based on time information indicating the time length of the material and on reproduction time difference data indicating the reproduction time difference between the compressed video data and the compressed audio data that arises for each time length of the material.