JP4649770B2 - Image data processing apparatus and method, recording medium, and program

Info

Publication number: JP4649770B2
Application number: JP2001139704A
Authority: JP
Inventors: 哲二郎近藤; 俊彦浜松; 丈晴西片; 真史内田; 威國弘
Original assignee: Sony Corp
Current assignee: Sony Corp
Priority date: 2001-05-10
Filing date: 2001-05-10
Publication date: 2011-03-16
Anticipated expiration: 2021-05-10
Also published as: JP2002335405A

Description

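[0001]
BACKGROUND OF THE INVENTION
The present invention relates to an image data processing apparatus and method, a recording medium, and a program, and in particular to an image data processing apparatus and method, a recording medium, and a program that allow compressed image data to be decoded with a simple configuration.
[0002]
[Prior art]
Digital image data, for example, is large in volume, so recording or transmitting it as-is requires a large-capacity recording or transmission medium. In general, therefore, image data is compression-encoded to reduce its volume before it is recorded or transmitted.
[0003]
Methods for compression-encoding images include, for example, the JPEG (Joint Photographic Experts Group) method for still images and the MPEG (Moving Picture Experts Group) method for moving images.
[0004]
Encoding of image data by the JPEG method, for example, is performed as shown in FIG. 1.
[0005]
The image data to be encoded is input to a blocking circuit 1, which divides it into blocks of 64 pixels (8 × 8 pixels). Each block obtained by the blocking circuit 1 is supplied to a DCT (Discrete Cosine Transform) circuit 2. The DCT circuit 2 applies DCT processing to each block from the blocking circuit 1 and transforms it into a total of 64 DCT coefficients: one DC (Direct Current) component and 63 frequency components (AC (Alternating Current) components) in the horizontal and vertical directions. The 64 DCT coefficients of each block are supplied from the DCT circuit 2 to a quantization circuit 3.
[0006]
The quantization circuit 3 quantizes the DCT coefficients from the DCT circuit 2 according to a predetermined quantization table and supplies the quantization result (hereinafter referred to as quantized DCT coefficients where appropriate), together with the quantization table used for the quantization, to an entropy encoding circuit 4.
[0007]
FIG. 2 shows an example of the quantization table used in the quantization circuit 3.
In general, a quantization table is given quantization steps that take human visual characteristics into account, so that the more important low-frequency DCT coefficients are quantized finely and the less important high-frequency DCT coefficients are quantized coarsely. This allows efficient compression while suppressing degradation of image quality.
[0008]
The entropy encoding circuit 4 applies entropy encoding such as Huffman coding to the quantized DCT coefficients from the quantization circuit 3, appends the quantization table from the quantization circuit 3, and outputs the resulting encoded data as the JPEG encoding result.
[0009]
FIG. 3 shows the configuration of an example of a conventional JPEG decoding apparatus that decodes the encoded data output by the JPEG encoding apparatus of FIG. 1.
[0010]
The encoded data is input to an entropy decoding circuit 11, which separates it into entropy-encoded quantized DCT coefficients and the quantization table. The entropy decoding circuit 11 then entropy-decodes the entropy-encoded quantized DCT coefficients and supplies the resulting quantized DCT coefficients, together with the quantization table, to an inverse quantization circuit 12. The inverse quantization circuit 12 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 11 according to the quantization table from the same circuit and supplies the resulting DCT coefficients to an inverse DCT circuit 13. The inverse DCT circuit 13 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 12 and supplies the resulting (decoded) block of 8 × 8 pixels to a block decomposition circuit 14. The block decomposition circuit 14 undoes the blocking of the blocks from the inverse DCT circuit 13 to obtain and output the decoded image.
[0011]
[Problems to be solved by the invention]
In compression encoding using an orthogonal transform such as the DCT described above, quantizing the coefficient data more coarsely to raise the compression ratio increases the error of the decoded image relative to the original image and degrades the decoded image. The degradation appears as image blur, block distortion, and mosquito noise around edges, and is a serious problem.
[0012]
The present invention has been made in view of this situation and makes it possible to obtain a better decoded image with reduced quantization error without increasing the amount of information.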
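[0013]
[Means for Solving the Problems]
The image data processing apparatus of the present invention comprises: holding means for holding table information for classifying a plurality of coefficient data; class code generating means for generating a class code based on the table information held in the holding means; prediction coefficient generating means for generating a prediction coefficient set based on the class code generated by the class code generating means; prediction tap generating means for generating prediction taps based on the class code generated by the class code generating means; and pixel data generating means for generating pixel data based on the prediction coefficient set generated by the prediction coefficient generating means and the prediction taps generated by the prediction tap generating means.
[0014]
The holding means can hold, as the table information, templates describing the relationship with a plurality of coefficient data or with feature amounts corresponding to the coefficient data.
[0015]
The holding means can hold, as the table information, a codebook storing vectors made up of a plurality of coefficient data or of feature amounts corresponding to the coefficient data, and the class code generating means can generate the class code by vector-quantizing the coefficient data using the codebook.
[0016]
The codebook can be generated based on the LBG algorithm.
[0017]
The class code generating means can perform threshold determination processing.
[0018]
The image data processing method of the present invention includes: a holding step of holding table information for classifying a plurality of coefficient data; a class code generating step of generating a class code based on the table information held by the processing of the holding step; a prediction coefficient generating step of generating a prediction coefficient set based on the class code generated by the processing of the class code generating step; a prediction tap generating step of generating prediction taps based on the class code generated by the processing of the class code generating step; and a pixel data generating step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generating step and the prediction taps generated by the processing of the prediction tap generating step.
[0019]
The holding step can hold, as the table information, templates describing the relationship with a plurality of coefficient data or with feature amounts corresponding to the coefficient data.
[0020]
The holding step can hold, as the table information, a codebook storing vectors made up of a plurality of coefficient data or of feature amounts corresponding to the coefficient data, and the class code generating step can generate the class code by vector-quantizing the coefficient data using the codebook.
[0021]
The codebook can be generated based on the LBG algorithm.
[0022]
The class code generating step can perform threshold determination processing.
[0023]
The program on the recording medium of the present invention is a program for an image data processing apparatus that decodes image data that has been transformed into coefficient data by orthogonal transform processing and then quantized. The program includes: a holding step of holding table information for classifying a plurality of coefficient data; a class code generating step of generating a class code based on the table information held by the processing of the holding step; a prediction coefficient generating step of generating a prediction coefficient set based on the class code generated by the processing of the class code generating step; a prediction tap generating step of generating prediction taps based on the class code generated by the processing of the class code generating step; and a pixel data generating step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generating step and the prediction taps generated by the processing of the prediction tap generating step.
[0024]
The program of the present invention causes a computer that controls an image data processing apparatus, which decodes image data that has been transformed into coefficient data by orthogonal transform processing and then quantized, to execute: a holding step of holding table information for classifying a plurality of coefficient data; a class code generating step of generating a class code based on the table information held by the processing of the holding step; a prediction coefficient generating step of generating a prediction coefficient set based on the class code generated by the processing of the class code generating step; a prediction tap generating step of generating prediction taps based on the class code generated by the processing of the class code generating step; and a pixel data generating step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generating step and the prediction taps generated by the processing of the prediction tap generating step.
[0029]
In the image data processing apparatus and method, recording medium, and program of the present invention, a class code is generated based on table information for classifying a plurality of coefficient data, and a prediction coefficient set and prediction taps are generated based on the generated class code. Pixel data is then generated based on the prediction coefficient set and the prediction taps.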
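[0031]
DETAILED DESCRIPTION OF THE INVENTION
FIG. 4 shows a configuration example of a decoding device 60 to which the present invention is applied.
[0032]
The encoded data is supplied to an entropy decoding circuit 61, which entropy-decodes it and supplies the resulting quantized DCT coefficients Q of each block to a coefficient data conversion circuit 62. As in the case described for the entropy decoding circuit 11 of FIG. 3, the encoded data contains the quantization table in addition to the entropy-encoded quantized DCT coefficients; as described later, the quantization table can be used for decoding the quantized DCT coefficients where necessary.
[0033]
The coefficient data conversion circuit 62 performs a predetermined prediction operation using the quantized DCT coefficients Q from the entropy decoding circuit 61 and prediction coefficients obtained by the learning described later, and thereby decodes the quantized DCT coefficients of each block into the original block of 8 × 8 pixel data.
[0034]
A block decomposition circuit 63 undoes the blocking of the decoded blocks obtained by the coefficient data conversion circuit 62 to obtain and output the decoded image.
[0035]
Next, the processing of the decoding device 60 of FIG. 4 is described with reference to the flowchart of FIG. 5.
[0036]
The encoded data is supplied sequentially to the entropy decoding circuit 61. In step S1, the entropy decoding circuit 61 entropy-decodes the encoded data and supplies the quantized DCT coefficients Q of each block to the coefficient data conversion circuit 62. In step S2, the coefficient data conversion circuit 62 decodes the quantized DCT coefficients Q of each block from the entropy decoding circuit 61 into the pixel values of each block by a prediction operation using the prediction coefficients, and supplies them to the block decomposition circuit 63. In step S3, the block decomposition circuit 63 performs block decomposition, which undoes the blocking of the blocks of pixel values (decoded blocks) from the coefficient data conversion circuit 62, outputs the resulting decoded image, and ends the processing.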
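[0037]
FIG. 6 shows a more detailed configuration example of the coefficient data conversion circuit 62 of FIG. 4, which decodes quantized DCT coefficients into pixel values.
[0038]
The quantized DCT coefficients of each block output by the entropy decoding circuit 61 (FIG. 4) are supplied to a prediction tap extraction circuit 81 and a class classification circuit 83.
[0039]
The prediction tap extraction circuit 81 takes, in turn, the block of pixel values (hereinafter referred to as a pixel block where appropriate) corresponding to each block of quantized DCT coefficients supplied to it (hereinafter referred to as a DCT block where appropriate) as the pixel block of interest. The pixel block does not exist at this stage but is assumed virtually. The circuit further takes each pixel of the pixel block of interest in turn, for example in so-called raster scan order, as the pixel of interest. The prediction tap extraction circuit 81 then extracts the quantized DCT coefficients used to predict the pixel value of the pixel of interest by referring to a prediction tap table 86, and uses them as prediction taps.
[0040]
The prediction tap table 86 is a pattern table in which pattern information is registered that represents the positions, relative to the pixel of interest, of the quantized DCT coefficients to be extracted as its prediction taps. The prediction tap extraction circuit 81 extracts quantized DCT coefficients based on this pattern information and forms the prediction taps for the pixel of interest.
[0041]
The prediction tap extraction circuit 81 forms prediction taps in this way for every pixel of the 64-pixel (8 × 8) pixel block, that is, 64 sets of prediction taps for the 64 pixels, and supplies them to a product-sum operation circuit 85.
[0042]
The class classification circuit 83 vector-quantizes the DCT coefficients (AC coefficients) of the DCT block of interest based on the representative vectors stored in a codebook 82, thereby classifying the DCT block of interest into one of several classes, and outputs the class code corresponding to the resulting class together with the pixel position mode (the details are described later with reference to FIGS. 9 and 10). The pixel position mode is the operation mode corresponding to the position, within the pixel block of interest, of the pixel that is currently the pixel of interest.
[0043]
The class code output by the class classification circuit 83 is given as an address to a prediction coefficient table 84 and to the prediction tap table 86.
[0044]
The prediction coefficient table 84 stores in advance the prediction coefficients used to predict pixel values, for each class and each pixel position mode. How the prediction coefficient table 84 is created is described later with reference to FIGS. 11 and 12. The prediction coefficient table 84 selects a prediction coefficient set according to the class code and pixel position mode supplied from the class classification circuit 83 and outputs it to the product-sum operation circuit 85.
[0045]
The product-sum operation circuit 85 computes the sum of products of the prediction tap set supplied from the prediction tap extraction circuit 81 and the prediction coefficient set obtained from the prediction coefficient table 84, and outputs the result as pixel data (PD).
[0046]
That is, when the prediction tap set is TD₁, TD₂, TD₃, TD₄, TD₅ and the prediction coefficient set is FC₁, FC₂, FC₃, FC₄, FC₅, the product-sum operation circuit 85 computes the pixel data PD as in the following equation.

$$\mathrm{PD} = \mathrm{TD}_1 \mathrm{FC}_1 + \mathrm{TD}_2 \mathrm{FC}_2 + \mathrm{TD}_3 \mathrm{FC}_3 + \mathrm{TD}_4 \mathrm{FC}_4 + \mathrm{TD}_5 \mathrm{FC}_5 \tag{1}$$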
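Equation (1) is an inner product of the tap values and the coefficients. A minimal Python sketch, assuming both sets arrive as equal-length sequences; the function name `predict_pixel` and the sample numbers are illustrative and do not come from the patent:

```python
def predict_pixel(taps, coeffs):
    """Product-sum of equation (1): PD = sum over i of TD_i * FC_i."""
    if len(taps) != len(coeffs):
        raise ValueError("prediction taps and coefficients must have equal length")
    return sum(td * fc for td, fc in zip(taps, coeffs))


# Illustrative values only: five quantized DCT coefficients and five weights.
pd = predict_pixel([12, -3, 0, 5, 1], [0.5, 0.25, 1.0, -0.5, 2.0])
```

The number of taps is not fixed by the operation itself; five is simply the count used in the example of the text.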
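[0047]
In this embodiment, pixel blocks are classified, so one class code is obtained for the pixel block of interest. On the other hand, a pixel block in this embodiment consists of 64 pixels (8 × 8), so 64 sets of prediction coefficients are needed to decode the 64 pixels of the pixel block of interest. The prediction coefficient table 84 therefore stores 64 sets of prediction coefficients at the address corresponding to each class code.
[0048]
The product-sum operation circuit 85 obtains the prediction taps output by the prediction tap extraction circuit 81 and the prediction coefficients output by the prediction coefficient table 84, performs the linear prediction operation (product-sum operation) of equation (1) with them, and outputs the resulting 8 × 8 pixel values of the pixel block of interest to the block decomposition circuit 63 (FIG. 4) as the decoding result of the corresponding DCT block.
[0049]
As described above, the prediction tap extraction circuit 81 takes each pixel of the pixel block of interest in turn as the pixel of interest, and the product-sum operation circuit 85 performs the processing of the operation mode (that is, the pixel position mode) corresponding to the position of that pixel within the pixel block of interest.
[0050]
For example, if the i-th pixel of the pixel block of interest in raster scan order is denoted p_i and pixel p_i is the pixel of interest, the product-sum operation circuit 85 performs the processing of pixel position mode #i.
[0051]
Specifically, as described above, the prediction coefficient table 84 stores 64 sets of prediction coefficients for decoding the 64 pixels of the pixel block of interest. If the set of prediction coefficients for decoding pixel p_i is denoted W_i, the set W_i is output to the product-sum operation circuit 85 when the operation mode is pixel position mode #i. At the same time, the prediction tap extraction circuit 81 outputs the prediction taps T_i of pixel position mode #i, and the product-sum operation circuit 85 performs the product-sum operation of equation (1) with the prediction taps T_i and the prediction coefficient set W_i and takes the result as the decoded value of pixel p_i.
[0052]
The prediction tap table 86 stores in advance the positions of the prediction taps used to predict pixel values, that is, information on which DCT coefficients are used for the prediction. The prediction tap table 86 outputs a prediction tap position set to the prediction tap extraction circuit 81 according to the class code and pixel position mode supplied from the class classification circuit 83.
[0053]
Based on the prediction tap position set supplied from the prediction tap table 86, the prediction tap extraction circuit 81 extracts the DCT coefficient data at each prediction tap position and outputs them to the product-sum operation circuit 85 as the prediction tap set.
[0054]
For the same reason as described for the prediction coefficient table 84, the prediction tap table 86 also stores 64 sets of pattern information (one per pixel position mode) at the address corresponding to each class code.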
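[0055]
Next, the processing of the coefficient data conversion circuit 62 of FIG. 6 is described with reference to the flowchart of FIG. 7.
[0056]
The quantized DCT coefficients of each block output by the entropy decoding circuit 61 are received sequentially by the prediction tap extraction circuit 81 and the class classification circuit 83, and the pixel block corresponding to each supplied block of quantized DCT coefficients (DCT block) becomes, in turn, the pixel block of interest. Among the pixels of the pixel block of interest, the next pixel in raster scan order that has not yet been the pixel of interest becomes the pixel of interest.
[0057]
In step S11, the class classification circuit 83 classifies the DCT block of interest by vector-quantizing it with the representative vectors stored in the codebook 82. That is, the class classification circuit 83 searches the codebook 82 for the representative vector closest to the input coefficient data and takes the class code of that representative vector as the class code of the coefficient data. The class classification circuit 83 outputs the class code and the pixel position mode (the position of the pixel within the pixel block) to the prediction coefficient table 84 and the prediction tap table 86. The details of this classification processing are described later with reference to FIGS. 9 and 10.
[0058]
On receiving the class code and pixel position mode as an address from the class classification circuit 83, the prediction tap table 86 reads the pattern information stored at that address in step S12 and outputs it to the prediction tap extraction circuit 81.
[0059]
Proceeding to step S13, the prediction tap extraction circuit 81 extracts the quantized DCT coefficients used to predict the pixel value of the pixel of interest according to the pattern information supplied from the prediction tap table 86, which corresponds to the class code and the pixel position mode of the pixel of interest, and forms them into prediction taps. The prediction taps are supplied from the prediction tap extraction circuit 81 to the product-sum operation circuit 85.
[0060]
On receiving the class code and pixel position mode as an address from the class classification circuit 83, the prediction coefficient table 84 reads the prediction coefficients stored at that address in step S14 and outputs them to the product-sum operation circuit 85.
[0061]
In step S15, the product-sum operation circuit 85 obtains the set of prediction coefficients corresponding to the class code and the pixel position mode of the pixel of interest, performs the product-sum operation of equation (1) with that set and the prediction taps supplied from the prediction tap extraction circuit 81 in step S13, and obtains the decoded pixel value of the pixel of interest.
[0062]
Proceeding to step S16, the class classification circuit 83 determines whether all pixels of the pixel block of interest have been processed as the pixel of interest. If it is determined in step S16 that not all pixels of the pixel block of interest have been processed yet, the processing returns to step S12, the class classification circuit 83 takes the next pixel of the pixel block of interest in raster scan order that has not yet been the pixel of interest as the new pixel of interest, and the same processing is repeated.
[0063]
If it is determined in step S16 that all pixels of the pixel block of interest have been processed as the pixel of interest, that is, when decoded values have been obtained for all pixels of the pixel block of interest, the product-sum operation circuit 85 outputs the pixel block made up of those decoded values (the decoded block) to the block decomposition circuit 63 (FIG. 4) and ends the processing.
[0064]
The processing according to the flowchart of FIG. 7 is repeated each time the coefficient data conversion circuit 62 sets a new pixel block of interest.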
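The per-pixel loop of steps S12 to S16 can be sketched in Python as follows. The table layouts are assumptions made for illustration: `tap_table[class_code][mode]` is taken to be a list of (row, column) positions inside the DCT block, and `coeff_table[class_code][mode]` a list of weights of the same length. The class code is assumed to have been obtained already in step S11.

```python
def decode_block(dct_block, class_code, tap_table, coeff_table):
    """Decode one square block of quantized DCT coefficients into pixel values."""
    size = len(dct_block)
    pixels = [[0.0] * size for _ in range(size)]
    for mode in range(size * size):                      # pixel position mode, raster order
        positions = tap_table[class_code][mode]          # step S12: read the tap pattern
        taps = [dct_block[r][c] for r, c in positions]   # step S13: extract prediction taps
        weights = coeff_table[class_code][mode]          # step S14: read prediction coefficients
        value = sum(t * w for t, w in zip(taps, weights))  # step S15: product-sum of equation (1)
        pixels[mode // size][mode % size] = value
    return pixels                                        # after step S16: the decoded block
```

For an 8 × 8 block the loop runs over 64 pixel position modes, which matches the 64 sets of coefficients and 64 sets of pattern information stored per class code.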
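[0065]
FIG. 8 shows a configuration example of a circuit that generates the codebook 82. In this example, a codebook generation circuit 91 learns from the input frequency-domain coefficient data using the LBG algorithm and generates a codebook consisting of a plurality of representative vectors.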
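As an illustration of LBG-style codebook training, here is a compact Python sketch. It makes several assumptions the patent does not state: squared Euclidean distortion, an additive split perturbation `eps`, a fixed number of refinement passes instead of a convergence test, and a codebook size that is a power of two. All function names are hypothetical.

```python
def squared_distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]


def lbg(training, size, eps=0.01, iterations=20):
    """Grow a codebook of `size` representative vectors by repeated splitting."""
    if size < 1 or size & (size - 1):
        raise ValueError("size must be a power of two in this sketch")
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every representative vector into a slightly perturbed pair.
        codebook = [[c + s * eps for c in vec] for vec in codebook for s in (1, -1)]
        for _ in range(iterations):
            cells = [[] for _ in codebook]
            for v in training:
                k = min(range(len(codebook)),
                        key=lambda i: squared_distance(v, codebook[i]))
                cells[k].append(v)
            # Move each vector to the centroid of its cell; keep it if the cell is empty.
            codebook = [centroid(cell) if cell else vec
                        for cell, vec in zip(cells, codebook)]
    return codebook
```

In the setting of the text, each training vector would hold the AC coefficients of one DCT block, and the index of each resulting vector would serve as a class code.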
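[0066]
FIG. 9 shows the configuration of the class classification circuit 83. As described above, in this embodiment the class code is determined by searching the codebook 82 for the representative vector closest to the input coefficient data. Distance can be defined in various ways here; in the example of FIG. 9, the angle between the representative vector and the input vector (the input DCT coefficient data) is used as the distance. With the representative vector denoted T_k and the input vector denoted S, the angle θ_k between them is given by the following equation.
[Equation 1]

$$\theta_k = \cos^{-1}\left(\frac{T_k \cdot S}{\|T_k\|\,\|S\|}\right)$$

Here T_k·S is the inner product of T_k and S, ||T_k|| is the norm of T_k, and ||S|| is the norm of S.
[0067]
An input vector extraction circuit 101 extracts the 63 AC coefficients from the 64 DCT coefficients of the block of interest and supplies them as the input vector to an inner product circuit 102 and a norm calculation circuit 105. The inner product circuit 102 calculates the inner product of a representative vector in the codebook 82 and the input vector from the input vector extraction circuit 101 and outputs it to a cos θ calculation circuit 104. A norm calculation circuit 103 calculates the norm of the representative vector from the codebook 82 and outputs it to the cos θ calculation circuit 104. The norm calculation circuit 105 calculates the norm of the input vector from the input vector extraction circuit 101 and outputs it to the cos θ calculation circuit 104.
[0068]
The cos θ calculation circuit 104 obtains cos θ_k (= (T_k·S) / (||T_k|| ||S||)) from the inner product of the representative vector and the input vector supplied by the inner product circuit 102, the norm of the representative vector supplied by the norm calculation circuit 103, and the norm of the input vector supplied by the norm calculation circuit 105, and outputs it to a cos⁻¹ circuit 106. The cos⁻¹ circuit 106 calculates the angle θ_k and outputs it to a minimum value selection circuit 107. The minimum value selection circuit 107 selects, from the angles between the input vector and all representative vectors, the representative vector with the smallest angle and outputs the number of that representative vector as the class code.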
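The angle-based search can be sketched in Python as below. Two details are assumptions of this sketch and not statements of the patent: the cosine is clamped to [-1, 1] to guard against rounding, and a zero-length vector is treated as orthogonal to everything. The function names are hypothetical.

```python
import math


def angle(t, s):
    """Angle between representative vector t and input vector s, in radians."""
    dot = sum(a * b for a, b in zip(t, s))
    norms = math.sqrt(sum(a * a for a in t)) * math.sqrt(sum(b * b for b in s))
    if norms == 0.0:
        return math.pi / 2  # assumption: a zero vector counts as orthogonal
    return math.acos(max(-1.0, min(1.0, dot / norms)))


def classify_by_angle(input_vector, codebook):
    """Return (class code, angle) of the representative vector with the smallest angle."""
    angles = [angle(t, input_vector) for t in codebook]
    k = min(range(len(codebook)), key=angles.__getitem__)
    return k, angles[k]
```

Because the angle ignores vector length, two blocks whose AC coefficients differ only by a scale factor fall into the same class under this measure.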
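[0069]
Next, the operation of the class classification circuit 83 of FIG. 9 is described with reference to the flowchart of FIG. 10. First, in step S31, the input vector extraction circuit 101 extracts the input vector from the input DCT coefficients. In step S32, the angle between a representative vector and the input vector is calculated.
[0070]
That is, as described above, the inner product circuit 102 computes the inner product of the input vector supplied from the input vector extraction circuit 101 and one representative vector supplied from the codebook 82, and supplies it to the cos θ calculation circuit 104. The norm calculation circuit 103 calculates the norm of the representative vector supplied from the codebook 82 and supplies it to the cos θ calculation circuit 104.
[0071]
The norm calculation circuit 105 calculates the norm of the input vector supplied from the input vector extraction circuit 101 and supplies it to the cos θ calculation circuit 104.
[0072]
The cos θ calculation circuit 104 calculates cos θ from the inner product supplied by the inner product circuit 102 and the norms supplied by the norm calculation circuits 103 and 105, and supplies it to the cos⁻¹ circuit 106. The cos⁻¹ circuit 106 calculates the angle θ from the input cos θ.
[0073]
In step S33, the minimum value selection circuit 107 determines whether the angles between the input vector and all representative vectors have been calculated. If a representative vector remains whose angle has not yet been calculated, the processing returns to step S31 and the subsequent processing is repeated.
[0074]
When it is determined that the angle θ between the input vector and every representative vector has been calculated, the processing proceeds to step S34, where the minimum value selection circuit 107 selects the representative vector with the smallest angle θ. In step S35, the minimum value selection circuit 107 outputs the number of the representative vector selected in step S34 as the class code.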
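[0075]
FIG. 11 shows an example of the configuration of a prediction coefficient table generation circuit that generates the prediction coefficient table 84 by learning. The input digital video signal is blocked by a blocking circuit 121 and supplied to a DCT circuit 122. The DCT circuit 122 performs the DCT on each block and outputs the DCT coefficients to a quantization circuit 123. The coefficient data supplied to the quantization circuit 123 are quantized there and supplied to a class classification circuit 124 and a prediction tap extraction circuit 129. As in the case described with reference to FIGS. 9 and 10, the class classification circuit 124 generates a class code from the block of interest using the codebook stored in a table 130 and supplies it to a normal equation addition circuit 125. The prediction tap extraction circuit 129 extracts prediction taps from the quantized DCT coefficient data output by the quantization circuit 123, based on prediction tap position information from a prediction tap table 126, and outputs them to the normal equation addition circuit 125.
[0076]
The normal equation used in the normal equation addition circuit 125 is now described. Equation (2) is a linear estimation formula that uses the pixel data PD₁ and the quantized DCT coefficient data QD₁ to QD_n used to correct it.

$$\mathrm{PD}_1 = w_1 \mathrm{QD}_1 + w_2 \mathrm{QD}_2 + \cdots + w_n \mathrm{QD}_n \tag{2}$$

[0077]
Before learning, the prediction coefficients w₁ to w_n are undetermined. Because the prediction coefficients must be prepared for each class and each pixel position mode, the equation actually has to be set up for each of them.
[0078]
Learning is performed on a plurality of signal data. When the number of data is m, equation (2) gives

$$\mathrm{PD}_{1j} = w_1 \mathrm{QD}_{1j} + w_2 \mathrm{QD}_{2j} + \cdots + w_n \mathrm{QD}_{nj}, \quad j = 1, 2, \ldots, m \tag{3}$$

When m > n, w₁ to w_n are not uniquely determined, so the elements of an error vector E are defined as

$$e_j = \mathrm{PD}_{1j} - (w_1 \mathrm{QD}_{1j} + w_2 \mathrm{QD}_{2j} + \cdots + w_n \mathrm{QD}_{nj}), \quad j = 1, 2, \ldots, m \tag{4}$$

and the coefficients that minimize the following expression are sought.
[Equation 2]

$$E^2 = \sum_{j=1}^{m} e_j^2 \tag{5}$$

[0079]
This is a solution by the method of least squares. The partial derivative of equation (5) with respect to w_i is obtained as follows.
[Equation 3]

$$\frac{\partial E^2}{\partial w_i} = \sum_{j=1}^{m} 2 \frac{\partial e_j}{\partial w_i} e_j = -2 \sum_{j=1}^{m} \mathrm{QD}_{ij}\, e_j \tag{6}$$

Each w_i should be chosen so that equation (6) becomes 0. Defining
[Equation 4]

$$X_{ik} = \sum_{j=1}^{m} \mathrm{QD}_{ij} \mathrm{QD}_{kj}, \qquad Y_i = \sum_{j=1}^{m} \mathrm{QD}_{ij} \mathrm{PD}_{1j}$$

and using matrices gives
[Equation 5]

$$\begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{nn} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}$$

This equation is generally called the normal equation. The normal equation addition circuit 125 performs the additions of this normal equation.
[0080]
After all the learning data have been input, the normal equation addition circuit 125 outputs the normal equation data to a prediction coefficient determination circuit 127. The prediction coefficient determination circuit 127 solves the normal equation for w_i using a general matrix solution method such as the sweep-out method (Gauss-Jordan elimination) and calculates the prediction coefficients. The prediction coefficient determination circuit 127 writes the calculated prediction coefficients into a prediction coefficient memory 128.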
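The learning procedure amounts to accumulating the sums X and Y over all training samples and then solving one linear system. The Python sketch below illustrates this for a single class and pixel position mode; in practice one such system would be kept per class and per pixel position mode. The function names, the pivot tolerance, and the toy data are assumptions of this sketch.

```python
def new_system(n):
    """Empty normal-equation sums: an n x n matrix X and a length-n vector Y."""
    return [[0.0] * n for _ in range(n)], [0.0] * n


def add_sample(x, y, taps, pixel):
    """Accumulate one (quantized DCT taps, original pixel value) pair into X and Y."""
    for i, ti in enumerate(taps):
        for k, tk in enumerate(taps):
            x[i][k] += ti * tk
        y[i] += ti * pixel


def solve(x, y):
    """Solve X w = Y by Gauss-Jordan elimination with partial pivoting."""
    n = len(y)
    m = [row[:] + [rhs] for row, rhs in zip(x, y)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("singular system: not enough independent training samples")
        m[col], m[pivot] = m[pivot], m[col]
        p = m[col][col]
        m[col] = [v / p for v in m[col]]
        for r in range(n):
            if r != col:
                f = m[r][col]
                m[r] = [a - f * b for a, b in zip(m[r], m[col])]
    return [row[n] for row in m]


# Toy training data generated from known weights w = (2, -1).
x, y = new_system(2)
for taps, pixel in [([1, 0], 2), ([0, 1], -1), ([1, 1], 1)]:
    add_sample(x, y, taps, pixel)
weights = solve(x, y)
```

Only X and Y have to be kept while the training data streams through, which is why the circuit in the text is described as an adder: the individual samples need not be stored.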
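[0081]
As a result of this learning, the prediction coefficient memory 128 stores the prediction coefficients for estimating the pixel data of interest PD₁ that are statistically closest to the true values. The stored prediction coefficients are used as filter coefficients at decoding time (the contents of the prediction coefficient memory 128 are used as the prediction coefficient table 84).
[0082]
Next, the processing of the prediction coefficient table generation circuit of FIG. 11 is described with reference to the flowchart of FIG. 12. In step S51, the blocking circuit 121 performs blocking. In step S52, the DCT circuit 122 applies DCT processing to the pixel data supplied from the blocking circuit 121 and outputs the result to the quantization circuit 123. In step S53, the quantization circuit 123 quantizes the DCT coefficients supplied from the DCT circuit 122 and outputs them to the class classification circuit 124 and the normal equation addition circuit 125.
[0083]
In step S54, the class classification circuit 124 performs class classification and outputs the resulting class code to the normal equation addition circuit 125.
[0084]
In step S55, the prediction tap table 126 extracts prediction taps and outputs them to the normal equation addition circuit 125. In step S56, the normal equation addition circuit 125 performs the normal equation addition processing.
[0085]
In step S57, the normal equation addition circuit 125 determines whether the processing of all blocks has finished. If it has not, the processing returns to step S52 and the subsequent processing is repeated.
[0086]
If it is determined in step S57 that the processing of all blocks has finished, the processing proceeds to step S58, where the prediction coefficient determination circuit 127 calculates the prediction coefficients and supplies them to the prediction coefficient memory 128 for storage.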
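[0087]
In the above, all 63 AC coefficients out of the 64 DCT coefficients of each block are used as the representative vectors of the codebook 82. However, as shown in FIG. 13 for example, it is also possible to use only the low-frequency AC coefficients shown in gray in the figure as the representative vectors. In this case, the input vector extraction circuit 101 extracts only the corresponding low-frequency AC coefficients.
[0088]
The class classification circuit 83 and its processing in this case are the same as shown in FIGS. 9 and 10.
[0089]
FIG. 14 shows yet another example of the representative vectors that make up the codebook. In this example, the codebook is created from vectors that include the DCT coefficients (low-frequency AC coefficients) not only of the block of interest but also of the neighboring blocks adjacent to it above, below, left, and right.
[0090]
FIG. 15 shows the configuration of the class classification circuit 83 in this embodiment. Here the sum of squared differences between the representative vector and the input vector is defined as the distance d and is obtained by the following equation, where S is the input vector and T_k is the representative vector.
[Equation 6]

$$d = \|S - T_k\|^2 = \sum_{i} (S_i - T_{k,i})^2$$

(S_i and T_{k,i} denote the i-th components of S and T_k.)
[0091]
The input vector extraction circuit 101 extracts the DCT coefficients that form the components of the vector and supplies them as the input vector to a vector difference circuit 141. The vector difference circuit 141 calculates the difference between a representative vector from the codebook 82 and the input vector from the input vector extraction circuit 101 and outputs it to a sum-of-squares circuit 142. The sum-of-squares circuit 142 obtains the sum of squares of the difference between the representative vector and the input vector and outputs it to the minimum value selection circuit 107. The minimum value selection circuit 107 selects the minimum among the distances between the input vector and all representative vectors and outputs the number of the corresponding representative vector as the class code.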
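The squared-difference search is an ordinary nearest-neighbor lookup. A minimal Python sketch, with a hypothetical function name and a codebook given as a list of equal-length vectors:

```python
def classify_by_distance(input_vector, codebook):
    """Return (class code, distance d) of the closest representative vector."""
    distances = [sum((s - t) ** 2 for s, t in zip(input_vector, vec))
                 for vec in codebook]
    k = min(range(len(codebook)), key=distances.__getitem__)
    return k, distances[k]
```

Unlike the angle measure of FIG. 9, this distance is sensitive to the magnitude of the coefficients as well as to their direction.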
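[0092]
Next, the processing of the class classification circuit 83 of FIG. 15 is described with reference to the flowchart of FIG. 16.
[0093]
First, in step S71, the input vector extraction circuit 101 extracts the input vector from the DCT coefficients. In step S72, the vector difference circuit 141 calculates the difference between the input vector supplied from the input vector extraction circuit 101 and a representative vector supplied from the codebook 82.
[0094]
The vector difference computed by the vector difference circuit 141 is supplied to the sum-of-squares circuit 142, which computes its sum of squares. The computed sum of squares is supplied to the minimum value selection circuit 107.
[0095]
In step S73, the minimum value selection circuit 107 determines whether the sum of squares (distance) has been computed for all representative vectors. If a representative vector remains that has not yet been processed, the processing returns to step S71 and the subsequent processing is repeated.
[0096]
If it is determined in step S73 that all representative vectors have been processed, the processing proceeds to step S74, where the minimum value selection circuit 107 selects the representative vector with the smallest distance.
[0097]
In step S75, the minimum value selection circuit 107 outputs the number of the representative vector with the smallest distance selected in step S74 as the class code.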
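[0098]
As the components of the vectors stored in the codebook 82, feature amounts derived from the DCT coefficients can be used instead of the DCT coefficients themselves.
[0099]
FIG. 17 shows the configuration of the class classification circuit 83 in this embodiment, and FIG. 18 shows its processing.
[0100]
In the class classification circuit 83 of FIG. 17, a feature amount extraction circuit 151 is provided in place of the input vector extraction circuit 101 of FIG. 15. Also, whereas the input vector extraction circuit 101 extracts the input vector from the DCT coefficients in step S71 of FIG. 16, in step S91 the feature amount extraction circuit 151 calculates feature amounts from the DCT coefficients and uses them as the input vector. Apart from these points, the configuration and processing are basically the same as shown in FIGS. 15 and 16.
[0101]
FIG. 19 shows an example of the feature amounts that are detected. In this example, the 8 × 8 coefficients are divided into four regions: a region that is low-frequency in both the horizontal and vertical directions, a region that is high-frequency horizontally and low-frequency vertically, a region that is high-frequency vertically and low-frequency horizontally, and a region that is high-frequency in both directions. The sums of squares P1 to P4 of the DCT coefficients in the respective regions are used as the feature amounts.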
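A short Python sketch of such a four-region feature follows. The exact region boundaries of FIG. 19 are not reproduced in the text, so this sketch assumes four equal quadrants split at the middle index, assumes that `dct_block[v][u]` holds the coefficient at vertical frequency v and horizontal frequency u, and includes the DC term in P1. All of these are assumptions, as is the function name.

```python
def band_powers(dct_block):
    """Return [P1, P2, P3, P4]: sums of squared DCT coefficients per quadrant.

    P1: low horizontal, low vertical    P2: high horizontal, low vertical
    P3: low horizontal, high vertical   P4: high horizontal, high vertical
    """
    half = len(dct_block) // 2
    powers = [0.0, 0.0, 0.0, 0.0]
    for v, row in enumerate(dct_block):
        for u, coeff in enumerate(row):
            region = (1 if u >= half else 0) + (2 if v >= half else 0)
            powers[region] += coeff * coeff
    return powers
```

Replacing the 63 AC coefficients with four numbers shortens every distance computation during classification, at the cost of a coarser description of the block.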
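[0102]
In the above, a codebook 82 created by learning such as the LBG algorithm is used, but a template table can be used in place of the codebook 82. This embodiment is described below.
[0103]
FIG. 20 shows a configuration example of the coefficient data conversion circuit 62 in this case.
This coefficient data conversion circuit 62 has the same configuration as in FIG. 6 except that the codebook 82 of the coefficient data conversion circuit 62 shown in FIG. 6 is replaced by a template table 161.
[0104]
The template table 161 stores in advance the templates used for class classification. In this embodiment, a template is a vector consisting of the 63 AC coefficients of a block. As shown in other embodiments described later, however, templates are not limited to this.
[0105]
FIG. 21 shows the configuration of a template table generation circuit that generates the template table 161. The pixel data of blocks that are heavily distorted when quantized, such as blocks containing strong edges, are input to this template table generation circuit. The input pixel data of one block are transformed into DCT coefficients by a DCT circuit 171, and an AC coefficient extraction circuit 172 extracts only the 63 AC coefficients from them and supplies them to a template memory 173 for storage.
[0106]
Next, the processing of the template table generation circuit of FIG. 21 is described with reference to the flowchart of FIG. 22.
[0107]
In step S111, the DCT circuit 171 applies DCT processing to the input block and outputs the result to the AC coefficient extraction circuit 172. In step S112, the AC coefficient extraction circuit 172 extracts only the 63 AC coefficients of each block and supplies them to the template memory 173 for storage.
[0108]
In step S113, it is determined whether the processing of all blocks has finished. If unprocessed blocks remain, the processing returns to step S111 and the subsequent processing is repeated. If it is determined in step S113 that the processing of all blocks has finished, the processing ends.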
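The template table is a list of AC-coefficient vectors, one per selected block. The Python sketch below assumes that the DCT of step S111 has already been applied, so its input is a list of 8 × 8 coefficient blocks, and that coefficients are read in raster order with the DC term at position [0][0]. The function names are hypothetical.

```python
def ac_coefficients(dct_block):
    """Flatten a DCT block in raster order and drop the DC term at [0][0]."""
    flat = [c for row in dct_block for c in row]
    return flat[1:]


def build_template_table(dct_blocks):
    """Steps S112 and S113: one template (63 AC coefficients) per input block."""
    return [ac_coefficients(block) for block in dct_blocks]
```

The index of each template in the returned list can then serve as its template number, which the classification stage outputs as the class code.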
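[0109]
FIG. 23 shows a configuration example of the class classification circuit 83 in FIG. 20. The template table 161 is provided in place of the codebook 82 of FIG. 9. In addition, the output of the minimum value selection circuit 107 is supplied to a threshold determination circuit 181, and the output of the threshold determination circuit 181 is output as the class code. The rest of the basic configuration is the same as that of the class classification circuit 83 in FIG. 9.
[0110]
If the minimum angle supplied from the minimum value selection circuit 107 is smaller than a preset threshold, the threshold determination circuit 181 outputs the number of that template as the class code. If the minimum is larger than the threshold, prediction coefficients learned without class classification (monoclass coefficients) are to be used, and the circuit outputs a monoclass code. The threshold is set, for example, to a value based on the range of the true coefficient value that is determined from the coefficient data and the quantization scale.
[0111]
Next, the processing of the class classification circuit 83 of FIG. 23 is described with reference to the flowchart of FIG. 24.
[0112]
In step S121, the input vector extraction circuit 101 extracts the input vector from the input DCT coefficients and outputs it to the inner product circuit 102 and the norm calculation circuit 105. In step S122, the angle between a template and the input vector is calculated.
[0113]
That is, the inner product circuit 102 computes the inner product of a template stored in the template table 161 and the input vector supplied from the input vector extraction circuit 101, and outputs it to the cos θ calculation circuit 104. The norm calculation circuit 103 calculates the norm of the template supplied from the template table 161 and supplies it to the cos θ calculation circuit 104. The norm calculation circuit 105 calculates the norm of the input vector supplied from the input vector extraction circuit 101 and outputs it to the cos θ calculation circuit 104.
[0114]
The cos θ calculation circuit 104 calculates cos θ from the outputs of the inner product circuit 102, the norm calculation circuit 103, and the norm calculation circuit 105, and outputs it to the cos⁻¹ circuit 106.
[0115]
The cos⁻¹ circuit 106 calculates the angle θ from the input cos θ and outputs it to the minimum value selection circuit 107.
[0116]
In step S123, the minimum value selection circuit 107 determines whether all templates have been processed. If unprocessed templates remain, the processing returns to step S121 and the subsequent processing is repeated.
[0117]
If it is determined in step S123 that the angles between all templates and the input vector have been calculated, the processing proceeds to step S124, where the minimum value selection circuit 107 selects the template with the smallest angle and outputs its template number and the minimum angle to the threshold determination circuit 181.
[0118]
In step S125, the threshold determination circuit 181 compares the minimum angle selected by the minimum value selection circuit 107 with a predetermined preset threshold and determines whether the minimum angle θ is smaller than the threshold.
[0119]
If the angle θ is smaller than the threshold, the processing proceeds to step S126, where the threshold determination circuit 181 outputs the template number of the minimum angle as the class code. If it is determined in step S125 that the minimum angle θ is not smaller than the threshold, the processing proceeds to step S127, where the threshold determination circuit 181 outputs the monoclass code.
That is, in this case a class code for classification into an average class is output.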
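The threshold decision of steps S124 to S127 can be sketched in Python as follows. The sentinel value `MONO_CLASS` is hypothetical; the patent only states that a monoclass code is output and does not say how it is encoded.

```python
MONO_CLASS = -1  # hypothetical encoding of the monoclass code


def select_class(distances, threshold):
    """Pick the closest template, or fall back to the monoclass code.

    `distances` holds one distance (for example an angle) per template.
    """
    k = min(range(len(distances)), key=distances.__getitem__)  # step S124
    if distances[k] < threshold:                               # step S125
        return k                                               # step S126
    return MONO_CLASS                                          # step S127
```

A block that resembles none of the stored templates is thus decoded with the average (monoclass) coefficients instead of being forced into a poorly matching class.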
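[0120]
When this template table is generated, the template table 161 can also be generated using only the low-frequency DCT coefficients, as shown in FIG. 13. The class classification circuit 83 and its processing in this case are the same as shown in FIGS. 23 and 24.
[0121]
Furthermore, the template table 161 can also be generated using the low-frequency DCT coefficients of the neighboring blocks adjacent to the block of interest above, below, left, and right, as shown in FIG. 14.
[0122]
The class classification circuit 83 and its processing in this case are shown in FIGS. 25 and 26.
[0123]
In this example, the sum of squared differences between a template of the template table 161 and the input vector is used as the distance d. Also in this example, the output of the minimum value selection circuit 107 is supplied to the threshold determination circuit 181, as in FIG. 23. The rest of the configuration is the same as in FIG. 15.
[0124]
The processing of the class classification circuit 83 of FIG. 25 consists of steps S141 to S147 of FIG. 26, and is basically the same as the processing of steps S121 to S127 of FIG. 24. However, the processing of calculating the angle between a template and the input vector in step S122 of FIG. 24 becomes, in step S142, processing of calculating the distance between a template and the input vector. Likewise, the processing of selecting the template with the smallest angle in step S124 becomes, in step S144, processing of selecting the template with the smallest distance.
[0125]
The rest of the processing is the same as in FIG. 24.
[0126]
When templates are used in place of the codebook, feature amounts can also be used as the templates in place of the DCT coefficients. In this case, the template table generation circuit shown in FIG. 21 is configured as shown in FIG. 27. The feature amounts in this case are also configured, for example, as shown in FIG. 19.
[0127]
That is, the AC coefficient extraction circuit 172 of the template table generation circuit of FIG. 21 is replaced by a feature amount extraction circuit 201 in the example of FIG. 27. The rest of the configuration is the same as in FIG. 21.
[0128]
FIG. 28 shows a processing example of the template table generation circuit of FIG. 27. In step S161, the DCT circuit 171 applies DCT processing to the input image and outputs the result to the feature amount extraction circuit 201. In step S162, the feature amount extraction circuit 201 extracts feature amounts from the DCT coefficients supplied from the DCT circuit 171 and supplies them to the template memory 173 for storage.
[0129]
The above processing is repeated until it is determined in step S163 that the processing of all blocks has finished.
[0130]
FIG. 29 shows a configuration example of the class classification circuit 83 when templates are formed from feature amounts in this way. In this configuration example, a feature amount extraction circuit 211 is provided in place of the input vector extraction circuit 101 of the class classification circuit 83 in FIG. 25. The rest of the configuration is the same as in FIG. 25.
[0131]
FIG. 30 shows a processing example of the class classification circuit 83 of FIG. 29. The processing of steps S181 to S187 is basically the same as the processing of steps S141 to S147 in FIG. 26.
[0132]
However, whereas the input vector is extracted from the DCT coefficients in step S141 of FIG. 26, in step S181 of FIG. 30 the feature amount extraction circuit 211 calculates feature amounts from the DCT coefficients and uses them as the input vector. The processing of the remaining steps S182 to S187 is the same as the processing of steps S142 to S147 in FIG. 26.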
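[0133]
The series of processes described above can be performed by hardware or by software. When the series of processes is performed by software, the program constituting the software is installed on a general-purpose computer or the like.
[0134]
FIG. 31 shows a configuration example of an embodiment of a computer on which the program that executes the series of processes described above is installed.
[0135]
The program can be recorded in advance on a hard disk 305 or in a ROM 303 serving as a recording medium built into the computer.
[0136]
Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 311 such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. Such a removable recording medium 311 can be provided as so-called packaged software.
[0137]
Besides being installed on the computer from the removable recording medium 311 described above, the program can be transferred to the computer wirelessly from a download site via an artificial satellite for digital satellite broadcasting, or transferred to the computer by wire via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this way with a communication unit 308 and install it on the built-in hard disk 305.
[0138]
The computer contains a CPU (Central Processing Unit) 302. An input/output interface 310 is connected to the CPU 302 via a bus 301. When a command is input via the input/output interface 310 by the user operating an input unit 307 consisting of a keyboard, mouse, microphone, and the like, the CPU 302 executes the program stored in the ROM (Read Only Memory) 303 accordingly. Alternatively, the CPU 302 loads into a RAM (Random Access Memory) 304 and executes a program stored on the hard disk 305, a program transferred from a satellite or network, received by the communication unit 308, and installed on the hard disk 305, or a program read from the removable recording medium 311 mounted in a drive 309 and installed on the hard disk 305. The CPU 302 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. The CPU 302 then, as necessary, outputs the processing result via the input/output interface 310 from an output unit 306 consisting of an LCD (Liquid Crystal Display), a speaker, and the like, transmits it from the communication unit 308, or records it on the hard disk 305.
[0139]
In this specification, the processing steps that describe the program for causing the computer to perform the various processes need not be processed in time series in the order given in the flowcharts, and also include processes executed in parallel or individually (for example, parallel processing or object-based processing).
[0140]
The program may be processed by a single computer or processed in a distributed manner by a plurality of computers. The program may also be transferred to a remote computer and executed there.
[0141]
Furthermore, although this embodiment deals with JPEG-encoded images, in which still images are compression-encoded, the present invention can also be applied to images in which moving images are compression-encoded, for example MPEG-encoded images.
[0142]
Also, although this embodiment decodes at least JPEG-encoded data, which involves DCT processing, the present invention is applicable to decoding or converting data that has been transformed block by block (in some predetermined unit) by another orthogonal transform or frequency transform. That is, the present invention is also applicable, for example, to decoding subband-encoded data, Fourier-transformed data, and the like, or to converting them into data with reduced quantization error and the like.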
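[0143]
[Effects of the invention]
According to the image data processing apparatus and method, recording medium, and program of the present invention, a class code is generated based on table information for classifying a plurality of coefficient data, and a prediction coefficient set and prediction taps are generated based on the generated class code. Pixel data is then generated based on the prediction coefficient set and the prediction taps. It is therefore possible to obtain a better decoded image with reduced quantization error without increasing the amount of information.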
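[Brief description of the drawings]
[FIG. 1] A block diagram showing the configuration of a conventional JPEG encoding apparatus.
[FIG. 2] A diagram showing an example of a quantization table.
[FIG. 3] A block diagram showing the configuration of a conventional JPEG decoding apparatus.
[FIG. 4] A block diagram showing a configuration example of a decoding device to which the present invention is applied.
[FIG. 5] A flowchart explaining the operation of the decoding device of FIG. 4.
[FIG. 6] A block diagram showing a configuration example of the coefficient data conversion circuit of FIG. 4.
[FIG. 7] A flowchart explaining the coefficient data conversion processing of the coefficient data conversion circuit of FIG. 6.
[FIG. 8] A block diagram showing the configuration of the generation circuit for the codebook of FIG. 6.
[FIG. 9] A block diagram showing a configuration example of the class classification circuit of FIG. 6.
[FIG. 10] A flowchart explaining the operation of the class classification circuit of FIG. 9.
[FIG. 11] A block diagram showing a configuration example of the prediction coefficient table generation circuit that generates the prediction coefficient table of FIG. 6.
[FIG. 12] A flowchart explaining the operation of the prediction coefficient table generation circuit of FIG. 11.
[FIG. 13] A diagram explaining the low-frequency AC coefficients.
[FIG. 14] A diagram explaining the DCT coefficients of neighboring blocks.
[FIG. 15] A block diagram showing another configuration example of the class classification circuit of FIG. 6.
[FIG. 16] A flowchart explaining the operation of the class classification circuit of FIG. 15.
[FIG. 17] A block diagram showing yet another configuration example of the class classification circuit of FIG. 6.
[FIG. 18] A flowchart explaining the operation of the class classification circuit of FIG. 17.
[FIG. 19] A diagram explaining an example of the feature amounts.
[FIG. 20] A block diagram showing another configuration example of the coefficient data conversion circuit of FIG. 4.
[FIG. 21] A block diagram showing a configuration example of the template table generation circuit that generates the template table of FIG. 20.
[FIG. 22] A flowchart explaining the operation of the template table generation circuit of FIG. 21.
[FIG. 23] A block diagram showing a configuration example of the class classification circuit of FIG. 20.
[FIG. 24] A flowchart explaining the operation of the class classification circuit of FIG. 23.
[FIG. 25] A block diagram showing another configuration example of the class classification circuit of FIG. 20.
[FIG. 26] A flowchart explaining the operation of the class classification circuit of FIG. 25.
[FIG. 27] A block diagram showing another configuration example of the template table generation circuit that generates the templates of FIG. 20.
[FIG. 28] A flowchart explaining the operation of the template table generation circuit of FIG. 27.
[FIG. 29] A block diagram showing yet another configuration example of the class classification circuit of FIG. 20.
[FIG. 30] A flowchart explaining the operation of the class classification circuit of FIG. 29.
[FIG. 31] A block diagram showing a configuration example of a computer to which the present invention is applied.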
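[Explanation of reference numerals]
60 decoding device, 61 entropy decoding circuit, 62 coefficient data conversion circuit, 63 block decomposition circuit, 81 prediction tap extraction circuit, 82 codebook, 83 class classification circuit, 84 prediction coefficient table, 85 product-sum operation circuit, 86 prediction tap table
[0001]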
BACKGROUND OF THE INVENTION
The present invention relates to an image data processing apparatus and method, a recording medium, and a program, and more particularly, to an image data processing apparatus and method, a recording medium, and a program that enable decoding of compressed image data with a simple configuration.
[0002]
[Prior art]
For example, since digital image data has a large amount of data, a large-capacity recording medium or transmission medium is required to perform recording or transmission as it is. In general, therefore, image data is compressed and encoded to reduce the amount of data before recording or transmission.
[0003]
Methods for compression-encoding images include, for example, the JPEG (Joint Photographic Experts Group) method for still images and the MPEG (Moving Picture Experts Group) method for moving images.
[0004]
For example, image data is encoded by the JPEG method as shown in FIG. 1.
[0005]
The image data to be encoded is input to the blocking circuit 1, which divides it into blocks of 64 pixels (8 × 8 pixels). Each block obtained by the blocking circuit 1 is supplied to a DCT (Discrete Cosine Transform) circuit 2. The DCT circuit 2 applies DCT processing to each block from the blocking circuit 1 and transforms it into a total of 64 DCT coefficients: one DC (Direct Current) component and 63 frequency components (AC (Alternating Current) components) in the horizontal and vertical directions. The 64 DCT coefficients of each block are supplied from the DCT circuit 2 to the quantization circuit 3.
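The blocking and transform steps can be illustrated with a short Python sketch. It assumes a grayscale image stored as a list of rows whose height and width are multiples of 8, uses the orthonormal 2-D DCT-II, and omits the level shift that a real JPEG encoder applies before the transform. The function names are illustrative.

```python
import math

N = 8  # block size used by JPEG


def split_into_blocks(image):
    """Split a 2-D list into N x N blocks, returned in raster order."""
    h, w = len(image), len(image[0])
    return [[row[x:x + N] for row in image[y:y + N]]
            for y in range(0, h, N) for x in range(0, w, N)]


def dct_2d(block):
    """Orthonormal 2-D DCT-II of one block; out[0][0] is the DC coefficient."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for v in range(N):          # vertical frequency
        for u in range(N):      # horizontal frequency
            s = 0.0
            for y in range(N):
                for x in range(N):
                    s += (block[y][x]
                          * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                          * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[v][u] = c(u) * c(v) * s
    return out
```

With this normalization a flat block of value 100 yields a DC coefficient of 800 and AC coefficients that are zero up to rounding, which shows how smooth image regions concentrate their energy in very few coefficients.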
[0006]
The quantization circuit 3 quantizes the DCT coefficients from the DCT circuit 2 according to a predetermined quantization table and supplies the quantization result (hereinafter referred to as "quantized DCT coefficients" where appropriate), together with the quantization table used for the quantization, to the entropy encoding circuit 4.
[0007]
FIG. 2 shows an example of a quantization table used in the quantization circuit 3.
In general, a quantization table is given quantization steps that take human visual characteristics into account, so that the more important low-frequency DCT coefficients are quantized finely and the less important high-frequency DCT coefficients are quantized coarsely. This allows efficient compression while suppressing degradation of image quality.
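Quantization divides each coefficient by its table entry and rounds, and the decoder can only recover a multiple of the step. The Python sketch below uses a small 2 × 2 table with made-up step sizes for brevity; a real JPEG table is 8 × 8. Python's `round` rounds exact halves to the nearest even integer, which may differ from a particular encoder's rounding rule.

```python
def quantize(coeffs, table):
    """Divide each DCT coefficient by its quantization step and round."""
    return [[round(c / q) for c, q in zip(crow, qrow)]
            for crow, qrow in zip(coeffs, table)]


def dequantize(levels, table):
    """Multiply each quantized level by its step (inverse quantization)."""
    return [[l * q for l, q in zip(lrow, qrow)]
            for lrow, qrow in zip(levels, table)]


# Made-up values: a fine step for the low-frequency corner, coarser steps elsewhere.
table = [[16, 11], [12, 12]]
coeffs = [[800.0, 37.0], [-20.0, 5.0]]
levels = quantize(coeffs, table)        # [[50, 3], [-2, 0]]
restored = dequantize(levels, table)    # [[800, 33], [-24, 0]]
```

The differences between `coeffs` and `restored` are the quantization errors; reducing their visible effect is what the decoding method described in this document aims at.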
[0008]
The entropy encoding circuit 4 performs entropy encoding processing such as Huffman encoding on the quantized DCT coefficient from the quantizing circuit 3 and adds the quantization table from the quantizing circuit 3. The encoded data obtained as a result is output as a JPEG encoding result.
[0009]
FIG. 3 shows the configuration of an example of a conventional JPEG decoding apparatus that decodes the encoded data output by the JPEG encoding apparatus of FIG. 1.
[0010]
The encoded data is input to the entropy decoding circuit 11, which separates it into entropy-encoded quantized DCT coefficients and the quantization table. The entropy decoding circuit 11 then entropy-decodes the entropy-encoded quantized DCT coefficients and supplies the resulting quantized DCT coefficients, together with the quantization table, to the inverse quantization circuit 12. The inverse quantization circuit 12 inversely quantizes the quantized DCT coefficients from the entropy decoding circuit 11 according to the quantization table from the same circuit and supplies the resulting DCT coefficients to the inverse DCT circuit 13. The inverse DCT circuit 13 applies inverse DCT processing to the DCT coefficients from the inverse quantization circuit 12 and supplies the resulting (decoded) block of 8 × 8 pixels to the block decomposition circuit 14. The block decomposition circuit 14 undoes the blocking of the blocks from the inverse DCT circuit 13 to obtain and output the decoded image.
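The inverse DCT stage of this conventional decoder can be sketched in Python as follows. The sketch assumes the orthonormal normalization and coefficients stored as `coeffs[v][u]` (vertical frequency v, horizontal frequency u); entropy decoding and inverse quantization are taken as already done. The function name is illustrative.

```python
import math

N = 8  # block size used by JPEG


def idct_2d(coeffs):
    """Orthonormal 2-D inverse DCT of one N x N block of DCT coefficients."""
    def c(k):
        return math.sqrt(1 / N) if k == 0 else math.sqrt(2 / N)

    out = [[0.0] * N for _ in range(N)]
    for y in range(N):
        for x in range(N):
            out[y][x] = sum(
                c(u) * c(v) * coeffs[v][u]
                * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                * math.cos((2 * y + 1) * v * math.pi / (2 * N))
                for v in range(N) for u in range(N))
    return out
```

Any error introduced by quantization passes straight through this fixed transform into the pixel values, which is the source of the blur, block distortion, and mosquito noise discussed next.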
[0011]
[Problems to be solved by the invention]
In compression encoding using an orthogonal transform such as the DCT described above, quantizing the coefficient data more coarsely to raise the compression ratio increases the error of the decoded image relative to the original image and degrades the decoded image. The degradation appears as image blur, block distortion, and mosquito noise around edges, and is a serious problem.
[0012]
The present invention has been made in view of such a situation, and makes it possible to obtain a good decoded image with a reduced quantization error without increasing the amount of information.
[0013]
[Means for Solving the Problems]
The image data processing apparatus of the present invention comprises: holding means for holding table information for classifying a plurality of coefficient data; class code generating means for generating a class code based on the table information held in the holding means; prediction coefficient generating means for generating a prediction coefficient set based on the class code generated by the class code generating means; prediction tap generating means for generating prediction taps based on the class code generated by the class code generating means; and pixel data generating means for generating pixel data based on the prediction coefficient set generated by the prediction coefficient generating means and the prediction taps generated by the prediction tap generating means.
[0014]
The holding means can hold, as the table information, templates describing the relationship with a plurality of coefficient data or with feature amounts corresponding to the coefficient data.
[0015]
The holding means can hold, as the table information, a codebook storing vectors made up of a plurality of coefficient data or of feature amounts corresponding to the coefficient data, and the class code generating means can generate the class code by vector-quantizing the coefficient data using the codebook.
[0016]
The code book may be generated based on an LBG algorithm.
[0017]
The class code generation means can perform a threshold determination process.
[0018]
The image data processing method of the present invention includes: a holding step of holding table information for classifying a plurality of coefficient data; a class code generating step of generating a class code based on the table information held by the processing of the holding step; a prediction coefficient generating step of generating a prediction coefficient set based on the class code generated by the processing of the class code generating step; a prediction tap generating step of generating prediction taps based on the class code generated by the processing of the class code generating step; and a pixel data generating step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generating step and the prediction taps generated by the processing of the prediction tap generating step.
[0019]
The holding step can hold, as the table information, templates describing the relationship with a plurality of coefficient data or with feature amounts corresponding to the coefficient data.
[0020]
The holding step can hold, as the table information, a codebook storing vectors made up of a plurality of coefficient data or of feature amounts corresponding to the coefficient data, and the class code generating step can generate the class code by vector-quantizing the coefficient data using the codebook.
[0021]
The code book may be generated based on an LBG algorithm.
[0022]
In the class code generation step, a threshold determination process can be performed.
[0023]
The present invention of The program of the recording medium is a program of an image data processing apparatus that decodes quantized image data after being converted into coefficient data by orthogonal transform processing, and includes table information for classifying a plurality of coefficient data. A holding step, a class code generation step for generating a class code based on the table information held by the processing of the holding step, and a prediction coefficient based on the class code generated by the processing of the class code generation step A prediction coefficient generation step for generating a set, a prediction tap generation step for generating a prediction tap based on the class code generated by the processing of the class code generation step, and a prediction coefficient set generated by the processing of the prediction coefficient generation step And generated by the process of the prediction tap generation step Based on the measurement tap, characterized in that it comprises a pixel data generation step of generating pixel data.
[0024]
The present invention of A program stores table information for classifying a plurality of coefficient data in a computer that controls an image data processing apparatus that decodes quantized image data after being converted into coefficient data by orthogonal transform processing. A class code generation step for generating a class code based on the step and the table information held by the processing of the holding step, and a prediction coefficient set is generated based on the class code generated by the processing of the class code generation step A prediction coefficient generation step, a prediction tap generation step for generating a prediction tap based on the class code generated by the processing of the class code generation step, a prediction coefficient set generated by the processing of the prediction coefficient generation step, and a prediction Prediction data generated by the tap generation step process Based on the flop, to execute a pixel data generation step of generating pixel data.
[0029]
In the image data processing apparatus and method, the recording medium, and the program of the present invention, a class code is generated based on table information for classifying a plurality of coefficient data, and a prediction coefficient set and a prediction tap are generated based on the generated class code. Pixel data is then generated based on the prediction coefficient set and the prediction tap.
[0031]
DETAILED DESCRIPTION OF THE INVENTION
Next, FIG. 4 shows a configuration example of a decoding device 60 to which the present invention is applied.
[0032]
The encoded data is supplied to the entropy decoding circuit 61. The entropy decoding circuit 61 entropy-decodes the encoded data and supplies the resulting quantized DCT coefficients Q for each block to the coefficient data conversion circuit 62. Note that, as in the case of the entropy decoding circuit 11 in FIG. 3, the encoded data includes a quantization table in addition to the entropy-encoded quantized DCT coefficients, and, as described later, the quantization table can be used for decoding the quantized DCT coefficients as necessary.
[0033]
The coefficient data conversion circuit 62 performs a predetermined prediction operation using the quantized DCT coefficients Q from the entropy decoding circuit 61 and prediction coefficients obtained by the learning described later, thereby decoding the quantized DCT coefficients of each block into the original 8 × 8 pixel block.
[0034]
The block decomposition circuit 63 obtains and outputs a decoded image by unblocking the decoded pixel blocks (decoded blocks) obtained in the coefficient data conversion circuit 62.
[0035]
Next, processing of the decoding device 60 in FIG. 4 will be described with reference to the flowchart in FIG. 5.
[0036]
The encoded data is sequentially supplied to the entropy decoding circuit 61. In step S1, the entropy decoding circuit 61 entropy-decodes the encoded data and supplies the quantized DCT coefficients Q for each block to the coefficient data conversion circuit 62. In step S2, the coefficient data conversion circuit 62 decodes the quantized DCT coefficients Q for each block from the entropy decoding circuit 61 into pixel values for each block by performing a prediction operation using the prediction coefficients, and supplies them to the block decomposition circuit 63. In step S3, the block decomposition circuit 63 performs block decomposition to unblock the pixel value blocks (decoded blocks) from the coefficient data conversion circuit 62, outputs the resulting decoded image, and ends the processing.
[0037]
FIG. 6 shows a more detailed configuration example of the coefficient data conversion circuit 62 of FIG. 4 that decodes quantized DCT coefficients into pixel values.
[0038]
The quantized DCT coefficients for each block output from the entropy decoding circuit 61 (FIG. 4) are supplied to the prediction tap extraction circuit 81 and the class classification circuit 83.
[0039]
The prediction tap extraction circuit 81 sequentially sets, as the target pixel block, the block of pixel values corresponding to the block of quantized DCT coefficients supplied to it (hereinafter referred to as a DCT block as appropriate). This block of pixel values does not yet exist at this stage; it is hereinafter referred to as a pixel block as appropriate. The circuit further sets each pixel constituting the target pixel block as the target pixel in turn, for example in so-called raster scan order. The prediction tap extraction circuit 81 then extracts the quantized DCT coefficients used for predicting the pixel value of the target pixel by referring to the prediction tap table 86, and sets them as a prediction tap.
[0040]
The prediction tap table 86 is a pattern table in which pattern information is registered that represents the positional relationship, with respect to the pixel of interest, of the quantized DCT coefficients extracted as the prediction tap for that pixel. The prediction tap extraction circuit 81 extracts quantized DCT coefficients based on this pattern information and configures the prediction tap for the pixel of interest.
[0041]
The prediction tap extraction circuit 81 configures, as described above, a prediction tap for each pixel constituting the 8 × 8 pixel block of 64 pixels, that is, 64 sets of prediction taps for the 64 pixels, and supplies them to the product-sum operation circuit 85.
[0042]
The class classification circuit 83 vector-quantizes the DCT coefficients (AC coefficients) of the DCT block of interest based on the representative vectors stored in the codebook 82, thereby classifying the DCT block of interest into one of several classes, and outputs the class code corresponding to the resulting class together with the pixel position mode (details of the processing will be described later with reference to FIGS. 9 and 10). The pixel position mode means an operation mode corresponding to the position of the pixel that is the target pixel in the target pixel block.
[0043]
The class code output from the class classification circuit 83 is given to the prediction coefficient table 84 and the prediction tap table 86 as an address.
[0044]
In the prediction coefficient table 84, prediction coefficients used for pixel value prediction are stored in advance for each class and each pixel position mode. A method for creating the prediction coefficient table 84 will be described later with reference to FIGS. 11 and 12. The prediction coefficient table 84 selects a prediction coefficient set according to the class code supplied from the class classification circuit 83 and the pixel position mode, and outputs the prediction coefficient set to the product-sum operation circuit 85.
[0045]
The product-sum operation circuit 85 performs a product-sum operation on the prediction tap set supplied from the prediction tap extraction circuit 81 and the prediction coefficient set obtained from the prediction coefficient table 84, and outputs the result as pixel data (PD).
[0046]
That is, when the prediction tap set is $TD_1, TD_2, TD_3, TD_4, TD_5$ and the prediction coefficient set is $FC_1, FC_2, FC_3, FC_4, FC_5$, the product-sum operation circuit 85 calculates the pixel data PD as represented by the following equation.

$$PD = TD_1 \cdot FC_1 + TD_2 \cdot FC_2 + TD_3 \cdot FC_3 + TD_4 \cdot FC_4 + TD_5 \cdot FC_5 \qquad (1)$$
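Equation (1) is a plain sum of products over the prediction tap set and the prediction coefficient set. The following is a minimal sketch of that operation; the function name `predict_pixel` and the numeric values are hypothetical and chosen only for illustration.

```python
from typing import Sequence


def predict_pixel(taps: Sequence[float], coeffs: Sequence[float]) -> float:
    """Product-sum of equation (1): PD = sum over i of TD_i * FC_i."""
    if len(taps) != len(coeffs):
        raise ValueError("tap set and coefficient set must have the same length")
    return sum(td * fc for td, fc in zip(taps, coeffs))


# Hypothetical tap and coefficient values (not taken from the patent).
pd = predict_pixel([10.0, -2.0, 4.0, 0.0, 1.0], [0.5, 0.25, 0.125, 1.0, 2.0])
print(pd)
```

The sketch accepts any tap count, so the same function would serve if a class used more or fewer than five taps.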
[0047]
In this embodiment, since classification is performed per pixel block, one class code is obtained for the target pixel block. On the other hand, since the pixel block in the present embodiment consists of 8 × 8 = 64 pixels, 64 sets of prediction coefficients are required to decode each of the 64 pixels constituting the pixel block of interest. Therefore, the prediction coefficient table 84 stores 64 sets of prediction coefficients at the address corresponding to each class code.
[0048]
The product-sum operation circuit 85 acquires the prediction tap output from the prediction tap extraction circuit 81 and the prediction coefficients output from the prediction coefficient table 84, performs the product-sum operation of equation (1) using them, and outputs the resulting 8 × 8 pixel values of the target pixel block to the block decomposition circuit 63 (FIG. 4) as the decoding result of the corresponding DCT block.
[0049]
In the prediction tap extraction circuit 81, as described above, each pixel of the target pixel block is sequentially set as the target pixel. The product-sum operation circuit 85 performs processing in the operation mode (that is, the pixel position mode) corresponding to the position of the current target pixel within the target pixel block.
[0050]
For example, if the i-th pixel in raster scan order among the pixels of the target pixel block is denoted $p_i$ and the pixel $p_i$ is the pixel of interest, the product-sum operation circuit 85 performs the processing of pixel position mode #i.
[0051]
Specifically, as described above, the prediction coefficient table 84 stores 64 sets of prediction coefficients for decoding each of the 64 pixels constituting the pixel block of interest. Let $W_i$ denote the set of prediction coefficients for decoding pixel $p_i$. When the operation mode is pixel position mode #i, the set $W_i$ is output to the product-sum operation circuit 85. At the same time, the prediction tap extraction circuit 81 outputs the prediction tap $T_i$ for pixel position mode #i. The product-sum operation circuit 85 performs the product-sum operation of equation (1) using the prediction tap $T_i$ and the prediction coefficient set $W_i$, and the result is taken as the decoded value of pixel $p_i$.
[0052]
The prediction tap table 86 stores in advance information on the position of a prediction tap used for prediction of pixel values, that is, which DCT coefficient is used for prediction. The prediction tap table 86 outputs a prediction tap position set to the prediction tap extraction circuit 81 according to the class code supplied from the class classification circuit 83 and the pixel position mode.
[0053]
The prediction tap extraction circuit 81 extracts the DCT coefficient data corresponding to each prediction tap position based on the prediction tap position set supplied from the prediction tap table 86, and outputs them to the product-sum operation circuit 85 as a prediction tap set.
[0054]
Here, for the same reason as described for the prediction coefficient table 84, the prediction tap table 86 also stores 64 sets of pattern information (one for each pixel position mode) at the address corresponding to one class code.
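Both tables are therefore addressed by the pair of class code and pixel position mode, with 64 entries per class. One possible flat layout is sketched below. The layout itself, the 0-based mode numbering, and the names `table_address` and `pixel_position_mode` are assumptions made for illustration and are not specified in this description.

```python
PIXELS_PER_BLOCK = 64  # 8 x 8 pixel block: one pixel position mode per pixel
BLOCK_WIDTH = 8


def pixel_position_mode(row: int, col: int) -> int:
    """Raster-scan index of a pixel inside the 8 x 8 block (0-based, assumed)."""
    if not (0 <= row < BLOCK_WIDTH and 0 <= col < BLOCK_WIDTH):
        raise ValueError("row and col must be in 0..7")
    return row * BLOCK_WIDTH + col


def table_address(class_code: int, mode: int) -> int:
    """Flat address of the table entry for (class code, pixel position mode)."""
    if class_code < 0:
        raise ValueError("class code must be non-negative")
    if not 0 <= mode < PIXELS_PER_BLOCK:
        raise ValueError("pixel position mode must be in 0..63")
    return class_code * PIXELS_PER_BLOCK + mode
```

With this layout, a table for N classes holds N × 64 entries, each entry being one prediction coefficient set (table 84) or one set of tap positions (table 86).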
[0055]
Next, processing of the coefficient data conversion circuit 62 of FIG. 6 will be described with reference to the flowchart of FIG. 7.
[0056]
The quantized DCT coefficients for each block output from the entropy decoding circuit 61 are sequentially received by the prediction tap extraction circuit 81 and the class classification circuit 83, and the pixel blocks corresponding to the blocks of quantized DCT coefficients (DCT blocks) supplied to them are sequentially set as the pixel block of interest. In addition, among the pixels of the pixel block of interest, a pixel that has not yet been set as the pixel of interest is set as the pixel of interest in raster scan order.
[0057]
In step S11, the class classification circuit 83 classifies the target DCT block by vector quantization using the representative vectors stored in the code book 82. That is, the class classification circuit 83 searches the code book 82 for the representative vector that minimizes the distance from the input coefficient data, and sets the class code corresponding to that representative vector as the class code of the coefficient data. The class classification circuit 83 outputs the class code and the pixel position mode (pixel position in the pixel block) to the prediction coefficient table 84 and the prediction tap table 86. Details of this class classification processing will be described later with reference to FIGS. 9 and 10.
[0058]
When the prediction tap table 86 receives the class code and the pixel position mode as an address from the class classification circuit 83, the prediction tap table 86 reads the pattern information stored in the address and outputs it to the prediction tap extraction circuit 81 in step S12.
[0059]
In step S13, the prediction tap extraction circuit 81 extracts the quantized DCT coefficients used to predict the pixel value of the target pixel according to the pattern information supplied from the prediction tap table 86, which corresponds to the class code and the pixel position mode of the target pixel, and configures them as a prediction tap. This prediction tap is supplied from the prediction tap extraction circuit 81 to the product-sum operation circuit 85.
[0060]
When the prediction coefficient table 84 receives the class code and the pixel position mode as an address from the class classification circuit 83, it reads out the prediction coefficients stored at that address and outputs them to the product-sum operation circuit 85 in step S14.
[0061]
In step S15, the product-sum operation circuit 85 acquires the set of prediction coefficients corresponding to the class code and the pixel position mode of the target pixel and, using this prediction coefficient set and the prediction tap supplied from the prediction tap extraction circuit 81 in step S13, performs the product-sum operation shown in equation (1) to obtain the decoded pixel value of the target pixel.
[0062]
In step S16, the class classification circuit 83 determines whether all the pixels in the target pixel block have been processed as the target pixel. If it is determined in step S16 that not all the pixels of the pixel block of interest have been processed yet, the process returns to step S12, the class classification circuit 83 newly sets as the target pixel the next pixel in raster scan order that has not yet been the target pixel, and the same processing is repeated.
[0063]
If it is determined in step S16 that all the pixels of the pixel block of interest have been processed as pixels of interest, that is, if the decoded values of all the pixels of the pixel block of interest have been obtained, the product-sum operation circuit 85 outputs the pixel block (decoded block) constituted by the decoded values to the block decomposition circuit 63 (FIG. 4), and the process is terminated.
[0064]
As described above, the processing according to the flowchart of FIG. 7 is repeatedly performed every time the coefficient data conversion circuit 62 sets a new target pixel block.
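The per-block loop of steps S12 to S16 can be sketched as follows, with the class code from step S11 passed in as an argument. This is only an illustration under stated assumptions: the tables are modeled as dictionaries keyed by (class code, pixel position mode), tap positions are modeled as indices into the 64 quantized DCT coefficients of the block, and the name `decode_block` is hypothetical.

```python
from typing import Dict, List, Sequence, Tuple

Key = Tuple[int, int]  # (class code, pixel position mode)


def decode_block(dct_block: Sequence[float],
                 class_code: int,
                 tap_table: Dict[Key, List[int]],
                 coeff_table: Dict[Key, List[float]]) -> List[float]:
    """Decode one DCT block (64 coefficients) into 64 pixel values."""
    pixels: List[float] = []
    for mode in range(64):                                  # raster scan order
        positions = tap_table[(class_code, mode)]           # step S12
        taps = [dct_block[p] for p in positions]            # step S13
        coeffs = coeff_table[(class_code, mode)]            # step S14
        pixels.append(sum(t * c for t, c in zip(taps, coeffs)))  # step S15
    return pixels                                           # step S16: all done


# Toy tables for a single class 0: each pixel uses one tap and the gain 2.0.
toy_taps = {(0, m): [m] for m in range(64)}
toy_coeffs = {(0, m): [2.0] for m in range(64)}
decoded = decode_block([float(m) for m in range(64)], 0, toy_taps, toy_coeffs)
```

The toy tables carry no learned content; real tables would come from the learning described later.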
[0065]
FIG. 8 shows a configuration example of a circuit that generates the code book 82. In this example, the code book generation circuit 91 performs learning with the LBG algorithm on the input frequency-domain coefficient data, and generates a code book composed of a plurality of representative vectors.
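As a rough illustration of how a code book can be learned, the sketch below implements a simplified LBG-style procedure: start from the centroid of the training vectors, split every representative vector into two perturbed copies, and refine them by nearest-neighbor reassignment. It is an assumption-laden sketch rather than the circuit's actual procedure: it uses a squared-distance criterion, a fixed iteration count instead of a distortion threshold, and a code book size that is a power of two. All names are hypothetical.

```python
from typing import List, Sequence

Vector = List[float]


def squared_distance(a: Sequence[float], b: Sequence[float]) -> float:
    return sum((x - y) ** 2 for x, y in zip(a, b))


def centroid(vectors: Sequence[Sequence[float]]) -> Vector:
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lbg_codebook(training: Sequence[Sequence[float]], size: int,
                 eps: float = 0.01, iterations: int = 20) -> List[Vector]:
    """Return `size` representative vectors (size assumed to be a power of two)."""
    codebook: List[Vector] = [centroid(training)]
    while len(codebook) < size:
        # Split every representative vector into two slightly shifted copies.
        codebook = [[x + sign * eps for x in c]
                    for c in codebook for sign in (1.0, -1.0)]
        for _ in range(iterations):
            cells: List[List[Sequence[float]]] = [[] for _ in codebook]
            for v in training:
                nearest = min(range(len(codebook)),
                              key=lambda k: squared_distance(v, codebook[k]))
                cells[nearest].append(v)
            # An empty cell keeps its previous representative vector.
            codebook = [centroid(cell) if cell else c
                        for cell, c in zip(cells, codebook)]
    return codebook


toy_training = [[0.0, 0.0], [0.0, 1.0], [10.0, 10.0], [10.0, 11.0]]
toy_codebook = lbg_codebook(toy_training, 2)
```

In the setting of this description, each training vector would hold the AC coefficients of one DCT block, and the index of each representative vector would serve as a class code.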
[0066]
FIG. 9 shows the configuration of the class classification circuit 83. As described above, in this embodiment, the class code is determined by searching the code book 82 for the representative vector that minimizes the distance from the input coefficient data. Various definitions of the distance are conceivable. In the example of FIG. 9, the angle formed by the representative vector and the input vector (input DCT coefficient data) is used as the distance. If the representative vector is $T_k$ and the input vector is $S$, the angle $\theta_k$ between the two can be obtained by the following equation.
[Expression 1]
$$\theta_k = \cos^{-1}\left(\frac{T_k \cdot S}{\|T_k\|\,\|S\|}\right)$$
Here, $T_k \cdot S$ is the inner product of $T_k$ and $S$, $\|T_k\|$ is the norm of $T_k$, and $\|S\|$ is the norm of $S$.
[0067]
The input vector extraction circuit 101 extracts 63 AC coefficients from the 64 DCT coefficients of the block of interest, and supplies them to the inner product circuit 102 and the norm calculation circuit 105 as input vectors. The inner product circuit 102 calculates the inner product of the representative vector in the code book 82 and the input vector from the input vector extraction circuit 101, and outputs it to the cos θ calculation circuit 104. The norm calculation circuit 103 calculates the norm of the representative vector from the code book 82 and outputs it to the cos θ calculation circuit 104. The norm calculation circuit 105 calculates the norm of the input vector from the input vector extraction circuit 101 and outputs it to the cos θ calculation circuit 104.
[0068]
The cos θ calculation circuit 104 calculates $\cos\theta_k$ $(= (T_k \cdot S) / (\|T_k\|\,\|S\|))$ from the inner product of the representative vector and the input vector from the inner product circuit 102, the norm of the representative vector from the norm calculation circuit 103, and the norm of the input vector from the norm calculation circuit 105, and outputs it to the $\cos^{-1}$ circuit 106. The $\cos^{-1}$ circuit 106 calculates the angle $\theta_k$ and outputs it to the minimum value selection circuit 107. The minimum value selection circuit 107 selects the representative vector having the smallest angle from among the angles formed by the input vector and all the representative vectors, and outputs the number of that representative vector as a class code.
[0069]
Next, the operation of the class classification circuit 83 in FIG. 9 will be described with reference to the flowchart in FIG. 10. First, in step S31, the input vector extraction circuit 101 extracts an input vector from the input DCT coefficients. In step S32, the angle formed by the representative vector and the input vector is calculated.
[0070]
That is, as described above, the inner product circuit 102 calculates the inner product of the input vector supplied from the input vector extraction circuit 101 and one representative vector supplied from the code book 82, and supplies the result to the cos θ calculation circuit 104. The norm calculation circuit 103 calculates the norm of the representative vector supplied from the code book 82 and supplies it to the cos θ calculation circuit 104.
[0071]
The norm calculation circuit 105 calculates the norm of the input vector supplied from the input vector extraction circuit 101 and supplies the calculated norm to the cos θ calculation circuit 104.
[0072]
The cos θ calculation circuit 104 calculates cos θ based on the inner product supplied from the inner product circuit 102 and the norms supplied from the norm calculation circuit 103 and the norm calculation circuit 105, and supplies it to the $\cos^{-1}$ circuit 106. The $\cos^{-1}$ circuit 106 calculates the angle θ from the input cos θ.
[0073]
In step S33, the minimum value selection circuit 107 determines whether or not the angles between all the representative vectors and the input vector have been calculated. If there is a representative vector whose angle has not yet been calculated, the process returns to step S31, and the subsequent processing is repeated.
[0074]
When it is determined in this way that the angles θ between the input vector and all the representative vectors have been calculated, the process proceeds to step S34, and the minimum value selection circuit 107 selects the representative vector having the minimum angle θ. In step S35, the minimum value selection circuit 107 outputs the number of the representative vector selected in step S34 as a class code.
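The angle-based search of FIGS. 9 and 10 can be sketched as below: compute the inner product and the two norms, derive cos θ, take the arc cosine, and select the representative vector with the smallest angle. The function names are hypothetical, and two details are assumptions of this sketch only: cos θ is clamped to [-1, 1] to absorb rounding error, and a zero-norm vector is treated as orthogonal to everything.

```python
import math
from typing import List, Sequence, Tuple


def angle_between(t: Sequence[float], s: Sequence[float]) -> float:
    """Angle (radians) between representative vector t and input vector s."""
    inner = sum(a * b for a, b in zip(t, s))               # inner product
    norms = math.sqrt(sum(a * a for a in t)) * math.sqrt(sum(b * b for b in s))
    if norms == 0.0:
        return math.pi / 2   # assumption: a zero vector is treated as orthogonal
    return math.acos(max(-1.0, min(1.0, inner / norms)))   # clamp, then arc cosine


def classify_by_angle(codebook: Sequence[Sequence[float]],
                      s: Sequence[float]) -> Tuple[int, float]:
    """Return (class code, minimum angle) over all representative vectors."""
    angles: List[float] = [angle_between(t, s) for t in codebook]
    k = min(range(len(angles)), key=angles.__getitem__)
    return k, angles[k]


toy_book = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
code_a, _ = classify_by_angle(toy_book, [2.0, 0.1])
code_b, angle_b = classify_by_angle(toy_book, [3.0, 3.0])
```

Because only the direction of the input vector matters for the angle, inputs that differ only in overall amplitude fall into the same class.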
[0075]
FIG. 11 shows an example of the configuration of a prediction coefficient table generation circuit that generates the prediction coefficient table 84 by learning. The input digital video signal is subjected to blocking processing in the blocking circuit 121 and supplied to the DCT circuit 122. The DCT circuit 122 performs the DCT for each block and outputs the DCT coefficients to the quantization circuit 123. The coefficient data supplied to the quantization circuit 123 is quantized there and supplied to the class classification circuit 124 and the prediction tap extraction circuit 129. As in the case described with reference to FIGS. 9 and 10, the class classification circuit 124 generates a class code from the block of interest using the code book stored in the table 130, and supplies it to the normal equation addition circuit 125. The prediction tap extraction circuit 129 extracts a prediction tap from the quantized DCT coefficient data output by the quantization circuit 123 based on the prediction tap position information from the prediction tap table 126, and outputs the prediction tap to the normal equation addition circuit 125.
[0076]
Here, the normal equation used in the normal equation adding circuit 125 will be described. Equation (2) shows a linear estimation equation using the pixel data $PD_1$ and the quantized DCT coefficient data $QD_1$ to $QD_n$ used for the estimation.
$$PD_1 = w_1 QD_1 + w_2 QD_2 + \dots + w_n QD_n \qquad (2)$$
[0077]
Before learning, the prediction coefficients $w_1$ to $w_n$ are undetermined. Since these prediction coefficients need to be prepared for each class and each pixel position mode, in practice an equation must be set up for each of them.
[0078]
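Learning is performed on a plurality of signal data. When the number of data is m, equation (2) gives
$$PD_{1j} = w_1 QD_{1j} + w_2 QD_{2j} + \dots + w_n QD_{nj}, \quad j = 1, 2, \dots, m \qquad (3)$$
If m > n, $w_1$ to $w_n$ are not uniquely determined, so the elements of the error vector E are defined as
$$e_j = PD_{1j} - (w_1 QD_{1j} + w_2 QD_{2j} + \dots + w_n QD_{nj}), \quad j = 1, 2, \dots, m \qquad (4)$$
and the coefficients that minimize the following expression are obtained.
[Expression 2]
$$E^2 = \sum_{j=1}^{m} e_j^2 \qquad (5)$$
[0079]
That is, the solution is based on the least squares method. Here, the partial derivative of equation (5) with respect to $w_i$ is obtained.
[Equation 3]
$$\frac{\partial E^2}{\partial w_i} = \sum_{j=1}^{m} 2 \frac{\partial e_j}{\partial w_i} e_j = -\sum_{j=1}^{m} 2\, QD_{ij}\, e_j \qquad (6)$$
Each $w_i$ is determined so that equation (6) becomes 0. Therefore, defining
[Expression 4]
$$X_{ik} = \sum_{j=1}^{m} QD_{ij}\, QD_{kj}, \qquad Y_i = \sum_{j=1}^{m} QD_{ij}\, PD_{1j}$$
and using matrices, the following is obtained.
[Equation 5]
$$\begin{pmatrix} X_{11} & X_{12} & \cdots & X_{1n} \\ X_{21} & X_{22} & \cdots & X_{2n} \\ \vdots & \vdots & \ddots & \vdots \\ X_{n1} & X_{n2} & \cdots & X_{nn} \end{pmatrix} \begin{pmatrix} w_1 \\ w_2 \\ \vdots \\ w_n \end{pmatrix} = \begin{pmatrix} Y_1 \\ Y_2 \\ \vdots \\ Y_n \end{pmatrix}$$
This equation is generally called a normal equation. The normal equation adding circuit 125 performs the addition (accumulation) of the terms of this normal equation.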
[0080]
After the input of all the learning data is completed, the normal equation adding circuit 125 outputs the normal equation data to the prediction coefficient determining circuit 127. The prediction coefficient determination circuit 127 solves the normal equation for $w_i$ using a general matrix solution method such as the sweep-out method (Gauss-Jordan elimination), thereby calculating the prediction coefficients. The prediction coefficient determination circuit 127 writes the calculated prediction coefficients into the prediction coefficient memory 128.
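The accumulation and solution of the normal equation can be sketched as follows for a single class and pixel position mode. For every learning sample, the products of tap values are added into a matrix and the products of tap values and the true pixel value are added into a vector; after all samples have been added, the system is solved by Gauss-Jordan elimination. The class name `NormalEquation`, the partial pivoting, and the singularity tolerance are assumptions of this sketch.

```python
from typing import List, Sequence


class NormalEquation:
    """Accumulates sum(QD_i * QD_k) and sum(QD_i * PD) for n prediction taps."""

    def __init__(self, n: int) -> None:
        self.a: List[List[float]] = [[0.0] * n for _ in range(n)]
        self.b: List[float] = [0.0] * n

    def add(self, qd: Sequence[float], pd: float) -> None:
        """Add one learning sample: tap values qd and true pixel value pd."""
        for i, qi in enumerate(qd):
            self.b[i] += qi * pd
            for k, qk in enumerate(qd):
                self.a[i][k] += qi * qk

    def solve(self) -> List[float]:
        """Solve for the prediction coefficients by Gauss-Jordan elimination."""
        n = len(self.b)
        m = [row[:] + [rhs] for row, rhs in zip(self.a, self.b)]
        for col in range(n):
            pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
            if abs(m[pivot][col]) < 1e-12:
                raise ValueError("normal equation is singular; add more samples")
            m[col], m[pivot] = m[pivot], m[col]
            p = m[col][col]
            m[col] = [x / p for x in m[col]]
            for r in range(n):
                if r != col:
                    f = m[r][col]
                    m[r] = [x - f * y for x, y in zip(m[r], m[col])]
        return [row[n] for row in m]


# Toy samples generated from PD = 2 * QD_1 - 1 * QD_2 (illustrative only).
eq = NormalEquation(2)
for qd, pd in [([1.0, 0.0], 2.0), ([0.0, 1.0], -1.0),
               ([1.0, 1.0], 1.0), ([2.0, 1.0], 3.0)]:
    eq.add(qd, pd)
w = eq.solve()
```

In a full learner, one such accumulator would be kept per class code and pixel position mode, for example in a dictionary keyed by that pair.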
[0081]
As a result of the learning described above, the prediction coefficient memory 128 stores prediction coefficients with which the target pixel data $PD_1$ can be estimated statistically closest to the true value. Each stored prediction coefficient is used as a filter coefficient at the time of decoding (the content stored in the prediction coefficient memory 128 is used as the prediction coefficient table 84).
[0082]
Next, processing of the prediction coefficient table generation circuit of FIG. 11 will be described with reference to the flowchart of FIG. 12. In step S51, the blocking circuit 121 performs a blocking process. In step S52, the DCT circuit 122 performs DCT processing on the pixel data supplied from the blocking circuit 121 and outputs the result to the quantization circuit 123. In step S53, the quantization circuit 123 quantizes the DCT coefficients supplied from the DCT circuit 122 and outputs the quantized coefficients to the class classification circuit 124 and the normal equation addition circuit 125.
[0083]
In step S54, the class classification circuit 124 performs class classification processing and outputs the obtained class code to the normal equation addition circuit 125.
[0084]
In step S55, the prediction tap table 126 extracts prediction taps and outputs them to the normal equation addition circuit 125. In step S56, the normal equation addition circuit 125 performs normal equation addition processing.
[0085]
In step S57, the normal equation adding circuit 125 determines whether or not the processing for all the blocks has been completed. If it has not been completed yet, the process returns to step S52 and the subsequent processing is repeated.
[0086]
If it is determined in step S57 that the processing for all the blocks has been completed, the process proceeds to step S58, where the prediction coefficient determination circuit 127 calculates a prediction coefficient, supplies it to the prediction coefficient memory 128, and stores it.
[0087]
In the above description, all 63 AC coefficients of the 64 DCT coefficients of each block are used as the representative vectors in the code book 82. However, as shown in FIG. 13, for example, it is also possible to use only low-frequency AC coefficients as a representative vector. In this case, the input vector extraction circuit 101 extracts only the corresponding low-frequency AC coefficients.
[0088]
The class classification circuit 83 and its processing in this case are the same as those shown in FIGS. 9 and 10.
[0089]
FIG. 14 shows still another example of the representative vectors constituting the code book. In this example, the code book is created from vectors that include not only the DCT coefficients of the block of interest but also DCT coefficients (low-frequency AC coefficients) of the neighboring blocks adjacent to the block of interest in the vertical and horizontal directions.
[0090]
The configuration of the class classification circuit 83 in this embodiment is shown in FIG. 15. In the present embodiment, the sum of squared differences between the representative vector and the input vector is defined as the distance d, and is obtained by the following equation, where the input vector is $S$ and the representative vector is $T_k$.
[Formula 6]
$$d = \|S - T_k\|^2 = \sum_{i} (S_i - T_{k,i})^2$$
Here, $S_i$ and $T_{k,i}$ denote the i-th components of $S$ and $T_k$, respectively.
[0091]
The input vector extraction circuit 101 extracts DCT coefficients that are vector components and supplies them to the vector difference circuit 141 as input vectors. The vector difference circuit 141 calculates a difference between the representative vector from the code book 82 and the input vector from the input vector extraction circuit 101 and outputs the difference to the square sum circuit 142. The sum of squares circuit 142 calculates the sum of squares of the difference between the representative vector and the input vector, and outputs it to the minimum value selection circuit 107. The minimum value selection circuit 107 selects the minimum value from the distances between the input vector and all the representative vectors, and outputs the representative vector number corresponding to the minimum value as a class code.
[0092]
Next, processing of the class classification circuit 83 in FIG. 15 will be described with reference to the flowchart in FIG. 16.
[0093]
First, in step S71, the input vector extraction circuit 101 extracts an input vector from the DCT coefficients. In step S72, the vector difference circuit 141 calculates the difference between the input vector supplied from the input vector extraction circuit 101 and the representative vector supplied from the code book 82.
[0094]
The value of the vector difference calculated by the vector difference circuit 141 is supplied to the square sum circuit 142, and the square sum thereof is calculated. The calculated sum of squares is supplied to the minimum value selection circuit 107.
[0095]
In step S73, the minimum value selection circuit 107 determines whether or not the sum of squares (distance) has been calculated for all the representative vectors. If there is a representative vector that has not yet been processed, the process returns to step S71, and the subsequent processing is repeated.
[0096]
If it is determined in step S73 that the processing has been performed for all the representative vectors, the process proceeds to step S74, and the minimum value selection circuit 107 selects the representative vector having the minimum distance.
[0097]
In step S75, the minimum value selection circuit 107 outputs the number of the representative vector having the minimum distance selected in step S74 as a class code.
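The squared-difference search of FIGS. 15 and 16 differs from the angle-based one only in the distance measure. A minimal sketch, with the hypothetical function name `classify_by_squared_distance` and toy vectors chosen for illustration:

```python
from typing import List, Sequence, Tuple


def classify_by_squared_distance(codebook: Sequence[Sequence[float]],
                                 s: Sequence[float]) -> Tuple[int, float]:
    """Return (class code, minimum distance d) for input vector s."""
    distances: List[float] = [
        sum((t_i - s_i) ** 2 for t_i, s_i in zip(t, s))  # vector difference, squared sum
        for t in codebook
    ]
    k = min(range(len(distances)), key=distances.__getitem__)  # minimum value selection
    return k, distances[k]


toy_vectors = [[0.0, 0.0], [4.0, 4.0], [10.0, 0.0]]
code, d = classify_by_squared_distance(toy_vectors, [3.0, 5.0])
```

Unlike the angle, this distance depends on the amplitude of the input vector, so blocks with the same pattern but different contrast can fall into different classes.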
[0098]
As the vector components stored in the codebook 82, feature amounts obtained from the DCT coefficients can also be used instead of the DCT coefficients themselves.
[0099]
FIG. 17 shows the configuration of the class classification circuit 83 in this embodiment, and FIG. 18 shows the processing.
[0100]
In the class classification circuit 83 of FIG. 17, a feature amount extraction circuit 151 is provided instead of the input vector extraction circuit 101 in FIG. 15. Further, instead of the input vector extraction circuit 101 extracting the input vector from the DCT coefficients in step S71 of FIG. 16, the feature amount extraction circuit 151 calculates the feature amounts from the DCT coefficients in step S91 and uses them as the input vector. Except for these points, the configuration and processing are basically the same as those shown in FIGS. 15 and 16.
[0101]
FIG. 19 shows an example of the extracted feature amounts. In this example, the 8 × 8 DCT coefficients are divided into four regions: a region that is low-frequency in both the horizontal and vertical directions, a region that is high-frequency in the horizontal direction and low-frequency in the vertical direction, a region that is high-frequency in the vertical direction and low-frequency in the horizontal direction, and a region that is high-frequency in both the horizontal and vertical directions. The sums of squares P1 to P4 of the DCT coefficients in the respective regions are used as the feature amounts.
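A sketch of such a four-region feature extraction is given below. The exact region boundaries of FIG. 19 are not recoverable from the text, so the equal 4 × 4 split, the inclusion of the DC coefficient in the first region, and the indexing `block[v][u]` (vertical frequency v, horizontal frequency u) are all assumptions of this example; the function name is hypothetical.

```python
from typing import List, Sequence


def region_energy_features(block: Sequence[Sequence[float]]) -> List[float]:
    """Return [P1, P2, P3, P4]: sums of squared coefficients per region.

    P1: low horizontal, low vertical     P2: high horizontal, low vertical
    P3: low horizontal, high vertical    P4: high horizontal, high vertical
    """
    features = [0.0, 0.0, 0.0, 0.0]
    for v, row in enumerate(block):
        for u, coeff in enumerate(row):
            region = (1 if u >= 4 else 0) + (2 if v >= 4 else 0)
            features[region] += coeff * coeff
    return features


toy_block = [[0.0] * 8 for _ in range(8)]
toy_block[0][1] = 3.0   # low / low region
toy_block[0][5] = 2.0   # high horizontal, low vertical
toy_block[6][0] = 1.0   # low horizontal, high vertical
toy_block[7][7] = 4.0   # high / high region
features = region_energy_features(toy_block)
```

The resulting four-component vector would take the place of the input vector in the squared-difference search, which reduces the vector dimension from 63 to 4.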
[0102]
In the above description, the code book 82 created by learning such as the LBG algorithm is used. However, a template table can be used instead of the code book 82. This embodiment will be described below.
[0103]
FIG. 20 shows a configuration example of the coefficient data conversion circuit 62 in this case.
The coefficient data conversion circuit 62 has the same configuration as that in FIG. 6 except that the code book 82 of the coefficient data conversion circuit 62 shown in FIG. 6 is replaced with a template table 161.
[0104]
In the template table 161, templates used for classification are stored in advance. In the present embodiment, the template is a vector composed of 63 AC coefficients in the block. However, the template is not limited to this as shown in another embodiment described later.
[0105]
FIG. 21 shows a configuration of a template table generation circuit that generates the template table 161. The template table generation circuit receives pixel data of blocks that are strongly distorted when quantized, for example blocks including a strong edge. The input pixel data of one block is converted into DCT coefficients by the DCT circuit 171, from which only the 63 AC coefficients are extracted by the AC coefficient extraction circuit 172, supplied to the template memory 173, and stored.
[0106]
Next, processing of the template table generation circuit of FIG. 21 will be described with reference to the flowchart of FIG. 22.
[0107]
In step S111, the DCT circuit 171 performs DCT processing on the input block and outputs the result to the AC coefficient extraction circuit 172. In step S112, the AC coefficient extraction circuit 172 extracts only the 63 AC coefficients of each block, supplies them to the template memory 173, and stores them.
[0108]
In step S113, it is determined whether or not the processing has been completed for all the blocks. If there is a block that has not yet been processed, the process returns to step S111, and the subsequent processing is repeatedly executed. If it is determined in step S113 that the processing for all the blocks has been completed, the processing is terminated.
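Steps S111 and S112 can be sketched as a two-dimensional DCT followed by removal of the DC coefficient. The sketch uses the orthonormal 8 × 8 DCT-II as defined for JPEG and stores the AC coefficients in raster order; the ordering and the function names are assumptions of this example, since the text does not specify them.

```python
import math
from typing import List, Sequence


def dct_8x8(block: Sequence[Sequence[float]]) -> List[List[float]]:
    """Two-dimensional DCT-II of an 8 x 8 pixel block (JPEG normalization)."""
    def c(k: int) -> float:
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * 8 for _ in range(8)]
    for v in range(8):
        for u in range(8):
            total = sum(block[y][x]
                        * math.cos((2 * x + 1) * u * math.pi / 16)
                        * math.cos((2 * y + 1) * v * math.pi / 16)
                        for y in range(8) for x in range(8))
            out[v][u] = 0.25 * c(u) * c(v) * total
    return out


def make_template(block: Sequence[Sequence[float]]) -> List[float]:
    """Return the 63 AC coefficients of the block (DC coefficient dropped)."""
    coeffs = dct_8x8(block)
    flat = [coeffs[v][u] for v in range(8) for u in range(8)]
    return flat[1:]   # flat[0] is the DC coefficient


# A vertical edge: left half dark, right half bright (illustrative input).
edge_block = [[0.0] * 4 + [255.0] * 4 for _ in range(8)]
edge_template = make_template(edge_block)
```

For the vertical-edge block, only coefficients with zero vertical frequency are non-zero, which is the kind of strongly structured pattern a template is meant to capture.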
[0109]
FIG. 23 shows a configuration example of the class classification circuit 83 in FIG. 20. A template table 161 is provided instead of the code book 82 in FIG. 9. Further, the output of the minimum value selection circuit 107 is supplied to the threshold determination circuit 181, and the output of the threshold determination circuit 181 is output as a class code. The rest of the basic configuration is the same as that of the class classification circuit 83 in FIG. 9.
[0110]
If the minimum value of the angle supplied from the minimum value selection circuit 107 is smaller than a preset threshold value, the threshold determination circuit 181 outputs the template number as a class code. If the minimum value is larger than the threshold value, the learned prediction coefficient (monoclass coefficient) is used without class classification, and a monoclass code is output. As the threshold value, for example, a value in the range of the true value of the coefficient obtained from the coefficient data and the quantization scale is set.
[0111]
Next, the processing of the class classification circuit 83 in FIG. 23 will be described with reference to the flowchart in FIG. 24.
[0112]
In step S121, the input vector extraction circuit 101 extracts an input vector from the input DCT coefficients and outputs it to the inner product circuit 102 and the norm calculation circuit 105. In step S122, the angle between the template and the input vector is calculated.
[0113]
That is, the inner product circuit 102 calculates the inner product of the template stored in the template table 161 and the input vector supplied from the input vector extraction circuit 101, and outputs it to the COSθ calculation circuit 104. The norm calculation circuit 103 calculates the norm of the template supplied from the template table 161 and supplies it to the COSθ calculation circuit 104. The norm calculation circuit 105 calculates the norm of the input vector supplied from the input vector extraction circuit 101 and outputs it to the COSθ calculation circuit 104.
[0114]
The COSθ calculation circuit 104 calculates COSθ from the output of the inner product circuit 102, the output of the norm calculation circuit 103, and the output of the norm calculation circuit 105, and outputs it to the COS^-1 circuit 106.
[0115]
The COS^-1 circuit 106 calculates the angle θ from the input COSθ and outputs it to the minimum value selection circuit 107.
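The angle calculation performed by circuits 102 to 106 amounts to COSθ = (t · x) / (|t| |x|) followed by an inverse cosine, where t is a template and x is the input vector. A minimal sketch follows; the function name `angle_between`, the clamping of COSθ against rounding error, and the handling of zero-length vectors are assumptions, not details taken from the specification.

```python
import math


def angle_between(template, input_vector):
    """Angle in radians between a template and an input vector."""
    # Inner product (role of the inner product circuit 102).
    inner = sum(t * x for t, x in zip(template, input_vector))
    # Norms (roles of the norm calculation circuits 103 and 105).
    norm_t = math.sqrt(sum(t * t for t in template))
    norm_x = math.sqrt(sum(x * x for x in input_vector))
    if norm_t == 0.0 or norm_x == 0.0:
        # The angle is undefined for a zero vector; returning pi
        # (the worst match) is an assumption made for this sketch.
        return math.pi
    # COS theta (role of the COS theta calculation circuit 104),
    # clamped so rounding error cannot push it outside [-1, 1].
    cos_theta = max(-1.0, min(1.0, inner / (norm_t * norm_x)))
    # Inverse cosine (role of the COS^-1 circuit 106).
    return math.acos(cos_theta)
```

Because the angle ignores vector length, two blocks with the same coefficient pattern but different contrast map to nearly the same angle.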
[0116]
In step S123, the minimum value selection circuit 107 determines whether or not processing has been performed for all templates. If there are templates that have not yet been processed, the process returns to step S121, and the subsequent processing is repeatedly executed.
[0117]
If it is determined in step S123 that the angles between all the templates and the input vector have been calculated, the process proceeds to step S124, where the minimum value selection circuit 107 selects the template having the smallest angle and outputs the template number and the minimum angle to the threshold determination circuit 181.
[0118]
In step S125, the threshold determination circuit 181 compares the minimum angle selected by the minimum value selection circuit 107 with a predetermined threshold set in advance, and determines whether or not the minimum angle θ is smaller than the threshold.
[0119]
If the angle θ is smaller than the threshold value, the process proceeds to step S126, and the threshold value determination circuit 181 outputs the template number of the minimum angle as the class code. If it is determined in step S125 that the minimum angle θ is not smaller than the threshold value, the process proceeds to step S127, and the threshold value determination circuit 181 outputs a monoclass code.
That is, at this time, a class code for classifying into an average class is output.
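Steps S121 to S127 can be summarized in a short sketch. The names `classify_by_angle` and `MONOCLASS_CODE`, the use of a list index as the template number, and the value chosen for the monoclass code are all hypothetical; the specification does not say how the monoclass code is represented.

```python
import math

MONOCLASS_CODE = -1  # hypothetical value standing in for the monoclass code


def _angle(a, b):
    """Angle between two vectors; zero vectors are treated as the worst match (assumption)."""
    inner = sum(p * q for p, q in zip(a, b))
    norm_a = math.sqrt(sum(p * p for p in a))
    norm_b = math.sqrt(sum(q * q for q in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return math.pi
    return math.acos(max(-1.0, min(1.0, inner / (norm_a * norm_b))))


def classify_by_angle(templates, input_vector, threshold):
    """Return the number of the closest template, or MONOCLASS_CODE if none is close enough."""
    best_number, best_angle = None, math.inf
    # Steps S121 to S123: compute the angle to every template.
    for number, template in enumerate(templates):
        theta = _angle(template, input_vector)
        # Step S124: keep the template with the smallest angle.
        if theta < best_angle:
            best_number, best_angle = number, theta
    # Step S125: threshold determination.
    if best_angle < threshold:
        return best_number      # step S126: template number as class code
    return MONOCLASS_CODE       # step S127: monoclass code
```

The fallback matters when the input resembles none of the templates: rather than forcing a poor match, the block is handled by the coefficients learned without class classification.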
[0120]
When the template table is generated, too, the template table 161 can be generated using only the low-frequency DCT coefficients, as shown in FIG. 13. The class classification circuit 83 and its processing in this case are the same as those shown in FIGS.
[0121]
Furthermore, as shown in FIG. 14, the template table 161 may be generated by using low-frequency DCT coefficients of neighboring blocks adjacent to the target block in the vertical and horizontal directions.
[0122]
The class classification circuit 83 and its processing in this case are shown in FIG. 25 and FIG. 26.
[0123]
In this example, the sum of squared differences between a template in the template table 161 and the input vector is used as the distance d. The output of the minimum value selection circuit 107 is supplied to the threshold determination circuit 181 as in the case of FIG. 23. Other configurations are the same as those in FIG.
[0124]
The processing of the class classification circuit 83 in FIG. 25 consists of steps S141 to S147 in FIG. 26, and is basically the same as the processing in steps S121 to S127 in FIG. 24. However, the process of calculating the angle between the template and the input vector in step S122 in FIG. 24 becomes the process of calculating the distance between the template and the input vector in step S142. Likewise, the process of selecting the template with the minimum angle in step S124 becomes the process of selecting the template with the minimum distance in step S144.
[0125]
Other processes are the same as those in FIG. 24.
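The distance-based variant of steps S141 to S147 differs from the angle-based one only in the measure being minimized. A minimal sketch, with the hypothetical names `squared_distance`, `classify_by_distance`, and `MONOCLASS_CODE` (the representation of the monoclass code is an assumption):

```python
import math

MONOCLASS_CODE = -1  # hypothetical value standing in for the monoclass code


def squared_distance(template, input_vector):
    """Distance d: sum of squared differences between template and input vector."""
    return sum((t - x) ** 2 for t, x in zip(template, input_vector))


def classify_by_distance(templates, input_vector, threshold):
    """Return the number of the nearest template, or MONOCLASS_CODE if it is too far away."""
    best_number, best_distance = None, math.inf
    # Step S142 for every template, then step S144: keep the minimum distance.
    for number, template in enumerate(templates):
        d = squared_distance(template, input_vector)
        if d < best_distance:
            best_number, best_distance = number, d
    # Threshold determination, as in the angle-based flow.
    if best_distance < threshold:
        return best_number
    return MONOCLASS_CODE
```

Unlike the angle, the squared distance is sensitive to the magnitude of the coefficients, so the threshold must be chosen on the scale of the coefficient values.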
[0126]
Even when a template is used instead of the code book, feature amounts can be used as the template instead of the DCT coefficients. In this case, the template table generation circuit shown in FIG. 21 is configured as shown in FIG. 27. Note that the feature amount in this case is also configured as shown in FIG. 19, for example.
[0127]
That is, the AC coefficient extraction circuit 172 in the template table generation circuit of FIG. 21 is replaced by the feature amount extraction circuit 201 in the example of FIG. 27. Other configurations are the same as those in FIG. 21.
[0128]
FIG. 28 shows a processing example of the template table generation circuit of FIG. 27. In step S161, the DCT circuit 171 performs DCT processing on the input image and outputs the result to the feature amount extraction circuit 201. In step S162, the feature amount extraction circuit 201 extracts a feature amount from the DCT coefficients supplied from the DCT circuit 171 and supplies the feature amount to the template memory 173 for storage.
[0129]
The above processing is repeatedly executed until it is determined in step S163 that the processing for all the blocks has been completed.
[0130]
FIG. 29 illustrates a configuration example of the class classification circuit 83 in the case where templates are configured using feature amounts in this way. In this configuration example, a feature amount extraction circuit 211 is provided instead of the input vector extraction circuit 101 of the class classification circuit 83 in FIG. 25. Other configurations are the same as those in FIG. 25.
[0131]
FIG. 30 shows a processing example of the class classification circuit 83 of FIG. 29. The processing from step S181 to step S187 is basically the same as the processing from step S141 to step S147 in FIG. 26.
[0132]
However, whereas the input vector is extracted from the DCT coefficients in step S141 in FIG. 26, in step S181 in FIG. 30 the feature amount extraction circuit 211 calculates feature amounts from the DCT coefficients and uses them as the input vector. The other processes in steps S182 to S187 are the same as the processes in steps S142 to S147 in FIG. 26.
[0133]
The series of processes described above can be performed by hardware or software. When a series of processing is performed by software, a program constituting the software is installed in a general-purpose computer or the like.
[0134]
FIG. 31 shows a configuration example of an embodiment of a computer in which a program for executing the series of processes described above is installed.
[0135]
The program can be recorded in advance on a hard disk 305 or a ROM 303 as a recording medium built in the computer.
[0136]
Alternatively, the program can be stored (recorded) temporarily or permanently on a removable recording medium 311 such as a flexible disk, a CD-ROM (Compact Disc Read Only Memory), an MO (Magneto Optical) disk, a DVD (Digital Versatile Disc), a magnetic disk, or a semiconductor memory. Such a removable recording medium 311 can be provided as so-called package software.
[0137]
The program can be installed in the computer from the removable recording medium 311 as described above. Alternatively, it can be transferred from a download site to the computer wirelessly via an artificial satellite for digital satellite broadcasting, or transferred to the computer via a network such as a LAN (Local Area Network) or the Internet. The computer can receive the program transferred in this way with the communication unit 308 and install it in the built-in hard disk 305.
[0138]
The computer includes a CPU (Central Processing Unit) 302. An input/output interface 310 is connected to the CPU 302 via the bus 301. When the user inputs a command by operating an input unit 307 including a keyboard, a mouse, a microphone, and the like, the CPU 302 receives the command via the input/output interface 310 and executes the program stored in a ROM (Read Only Memory) 303 accordingly. Alternatively, the CPU 302 loads into a RAM (Random Access Memory) 304 and executes a program stored in the hard disk 305, a program transferred from a satellite or a network, received by the communication unit 308, and installed in the hard disk 305, or a program read from the removable recording medium 311 attached to the drive 309 and installed in the hard disk 305. The CPU 302 thereby performs the processing according to the flowcharts described above or the processing performed by the configurations of the block diagrams described above. Then, as necessary, the CPU 302 outputs the processing result from the output unit 306, which includes an LCD (Liquid Crystal Display), a speaker, and the like, via the input/output interface 310, transmits it from the communication unit 308, or records it on the hard disk 305.
[0139]
In this specification, the processing steps describing a program that causes a computer to perform various types of processing do not necessarily have to be processed in time series in the order described in the flowcharts; they also include processing executed in parallel or individually (for example, parallel processing or object-based processing).
[0140]
Further, the program may be processed by one computer or may be distributedly processed by a plurality of computers. Furthermore, the program may be transferred to a remote computer and executed.
[0141]
Furthermore, this embodiment targets JPEG-encoded images, in which a still image is compression-encoded. However, the present invention can also target compression-encoded moving images, for example MPEG-encoded images.
[0142]
In the present embodiment, encoded data produced by JPEG encoding, which involves at least DCT processing, is decoded. However, the present invention can be applied to decoding and conversion of data converted in units of blocks (certain predetermined units) by other orthogonal transforms or frequency transforms. That is, the present invention can be applied to, for example, the case where sub-band encoded data, Fourier-transformed data, or the like is decoded or converted into data with reduced quantization error.
[0143]
[Effect of the invention]
According to the image data processing apparatus and method, the recording medium, and the program of the present invention, a class code is generated based on table information for classifying a plurality of coefficient data, and a prediction coefficient set and a prediction tap are generated based on the generated class code. Pixel data is then generated based on the prediction coefficient set and the prediction tap. Therefore, it is possible to obtain a better decoded image with a reduced quantization error without increasing the amount of information.
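The final step, generating pixel data from a prediction coefficient set and a prediction tap, can be illustrated as a product-sum operation. The sketch below assumes a linear first-order prediction and a table keyed by class code; the names `predict_pixel`, `decode_pixel`, and `coefficient_table` are hypothetical, and the prediction tap is taken as given rather than selected according to the class code.

```python
def predict_pixel(prediction_coefficients, prediction_tap):
    """Product-sum of a prediction coefficient set and a prediction tap (assumed linear form)."""
    if len(prediction_coefficients) != len(prediction_tap):
        raise ValueError("coefficient set and prediction tap must have the same length")
    return sum(w * t for w, t in zip(prediction_coefficients, prediction_tap))


def decode_pixel(class_code, coefficient_table, prediction_tap):
    """Look up the coefficient set for the class code, then predict one pixel value."""
    return predict_pixel(coefficient_table[class_code], prediction_tap)
```

For example, with `coefficient_table = {0: [0.5, 0.5]}` and a prediction tap of `[10.0, 20.0]`, class code 0 yields a predicted value of 15.0.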
[Brief description of the drawings]
FIG. 1 is a block diagram showing a configuration of a conventional JPEG encoding apparatus.
FIG. 2 is a diagram illustrating an example of a quantization table.
FIG. 3 is a block diagram showing a configuration of a conventional JPEG decoding apparatus.
FIG. 4 is a block diagram illustrating a configuration example of a decoding device to which the present invention has been applied.
FIG. 5 is a flowchart for explaining the operation of the decoding device of FIG. 4.
FIG. 6 is a block diagram illustrating a configuration example of a coefficient data conversion circuit in FIG. 4.
FIG. 7 is a flowchart for explaining coefficient data conversion processing of the coefficient data conversion circuit of FIG. 6.
FIG. 8 is a block diagram showing a configuration of a code book generation circuit of FIG. 6.
FIG. 9 is a block diagram illustrating a configuration example of a class classification circuit in FIG. 6.
FIG. 10 is a flowchart for explaining the operation of the class classification circuit of FIG. 9.
FIG. 11 is a block diagram illustrating a configuration example of a prediction coefficient table generation circuit that generates the prediction coefficient table of FIG. 6.
FIG. 12 is a flowchart for explaining the operation of the prediction coefficient table generation circuit of FIG. 11.
FIG. 13 is a diagram illustrating low-frequency AC coefficients.
FIG. 14 is a diagram illustrating DCT coefficients of peripheral blocks.
FIG. 15 is a block diagram illustrating another configuration example of the class classification circuit of FIG. 6.
FIG. 16 is a flowchart for explaining the operation of the class classification circuit of FIG. 15.
FIG. 17 is a block diagram illustrating still another configuration example of the class classification circuit in FIG. 6.
FIG. 18 is a flowchart for explaining the operation of the class classification circuit of FIG. 17.
FIG. 19 is a diagram illustrating an example of a feature amount.
FIG. 20 is a block diagram showing another configuration example of the coefficient data conversion circuit of FIG. 4.
FIG. 21 is a block diagram illustrating a configuration example of a template table generation circuit that generates the template table of FIG. 20.
FIG. 22 is a flowchart for explaining the operation of the template table generation circuit of FIG. 21.
FIG. 23 is a block diagram illustrating a configuration example of the class classification circuit of FIG. 20.
FIG. 24 is a flowchart for explaining the operation of the class classification circuit of FIG. 23.
FIG. 25 is a block diagram illustrating another configuration example of the class classification circuit in FIG. 20.
FIG. 26 is a flowchart for explaining the operation of the class classification circuit of FIG. 25.
FIG. 27 is a block diagram illustrating another configuration example of the template table generation circuit that generates the template of FIG. 20.
FIG. 28 is a flowchart for explaining the operation of the template table generation circuit of FIG. 27.
FIG. 29 is a block diagram showing still another configuration example of the class classification circuit of FIG. 20.
FIG. 30 is a flowchart for explaining the operation of the class classification circuit of FIG. 29.
FIG. 31 is a block diagram illustrating a configuration example of a computer to which the present invention has been applied.
[Explanation of symbols]
60 decoding device, 61 entropy decoding circuit, 62 coefficient data conversion circuit, 63 block decomposition circuit, 81 prediction tap extraction circuit, 82 codebook, 83 class classification circuit, 84 prediction coefficient table, 85 product-sum operation circuit, 86 prediction tap table

Claims

An image data processing apparatus that decodes image data that has been quantized after being converted into coefficient data by orthogonal transform processing, the apparatus comprising:
Holding means for holding table information for classifying a plurality of coefficient data;
Class code generation means for generating a class code based on the table information held in the holding means;
Prediction coefficient generation means for generating a prediction coefficient set based on the class code generated by the class code generation means;
Prediction tap generation means for generating a prediction tap based on the class code generated by the class code generation means; and
Pixel data generation means for generating pixel data based on the prediction coefficient set generated by the prediction coefficient generation means and the prediction tap generated by the prediction tap generation means.

The image data processing apparatus according to claim 1, wherein the holding means holds, as the table information, a plurality of coefficient data, or a template describing a relationship with a feature amount corresponding to the coefficient data.

The image data processing apparatus according to claim 1, wherein the holding means holds, as the table information, a plurality of the coefficient data, or a code book in which a vector composed of feature amounts corresponding to the coefficient data is stored, and
the class code generation means generates the class code by vector quantization of the coefficient data using the code book.

The image data processing apparatus according to claim 3, wherein the code book is generated based on an LBG algorithm.

The image data processing apparatus according to claim 1, wherein the class code generation unit performs a threshold determination process.

An image data processing method for an image data processing apparatus that decodes image data that has been quantized after being converted into coefficient data by orthogonal transform processing, the method comprising:
A holding step of holding table information for classifying a plurality of coefficient data;
A class code generation step of generating a class code based on the table information held by the processing of the holding step;
A prediction coefficient generation step of generating a prediction coefficient set based on the class code generated by the processing of the class code generation step;
A prediction tap generation step of generating a prediction tap based on the class code generated by the processing of the class code generation step; and
A pixel data generation step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generation step and the prediction tap generated by the processing of the prediction tap generation step.

The image data processing method according to claim 6, wherein the holding step holds, as the table information, a plurality of coefficient data or a template describing a relationship with a feature amount corresponding to the coefficient data.

The image data processing method according to claim 7, wherein the holding step holds, as the table information, a plurality of the coefficient data, or a code book in which a vector composed of feature amounts corresponding to the coefficient data is stored, and
the class code generation step generates the class code by vector quantization of the coefficient data using the code book.

The image data processing method according to claim 8, wherein the code book is generated based on an LBG algorithm.

The image data processing method according to claim 6, wherein the class code generation step performs a threshold determination process.

A recording medium on which a computer-readable program is recorded, the program being for an image data processing apparatus that decodes image data that has been quantized after being converted into coefficient data by orthogonal transform processing, the program comprising:
A holding step of holding table information for classifying a plurality of coefficient data;
A class code generation step of generating a class code based on the table information held by the processing of the holding step;
A prediction coefficient generation step of generating a prediction coefficient set based on the class code generated by the processing of the class code generation step;
A prediction tap generation step of generating a prediction tap based on the class code generated by the processing of the class code generation step; and
A pixel data generation step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generation step and the prediction tap generated by the processing of the prediction tap generation step.

A program that causes a computer controlling an image data processing apparatus, which decodes image data that has been quantized after being converted into coefficient data by orthogonal transform processing, to execute:
A holding step of holding table information for classifying a plurality of coefficient data;
A class code generation step of generating a class code based on the table information held by the processing of the holding step;
A prediction coefficient generation step of generating a prediction coefficient set based on the class code generated by the processing of the class code generation step;
A prediction tap generation step of generating a prediction tap based on the class code generated by the processing of the class code generation step; and
A pixel data generation step of generating pixel data based on the prediction coefficient set generated by the processing of the prediction coefficient generation step and the prediction tap generated by the processing of the prediction tap generation step.