JP4359085B2

JP4359085B2 - Content feature extraction device

Info

Publication number: JP4359085B2
Application number: JP2003186107A
Authority: JP
Inventors: 恵吾真島; 清一合志; 一人小川; 逸郎室田; 剛大竹; 誠一難波
Original assignee: Japan Broadcasting Corp
Current assignee: Japan Broadcasting Corp
Priority date: 2003-06-30
Filing date: 2003-06-30
Publication date: 2009-11-04
Anticipated expiration: 2023-06-30
Also published as: JP2005018674A

Description

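[0001]
[Technical field of the invention]
The present invention relates to a content feature amount extraction device that extracts a feature amount of content distributed via a network or a recording medium and uses the extracted feature amount for detecting illegal distribution and for similarity search.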
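[0002]
[Prior art]
With the recent increase in network speed and recording-medium capacity, an environment has been established in which anyone can easily deliver large-capacity digital content consisting of video data, audio data and the like (hereinafter, "content") over public communication lines that are high-speed networks (optical fiber lines, ADSL, etc.), or distribute it on optical discs (DVD, etc.) that are large-capacity recording media.
[0003]
Content delivered via a network is easy to store, and content recorded on a recording medium is easy to ship. It is therefore easy to commit illegal acts without the permission of the copyright holder of the content (hereinafter simply "copyright holder") or of the distributor that delivers the content (hereinafter "content provider"), such as copying stored content and redistributing it via a network, or tampering with the content. Such illegal acts are a major impediment to the distribution of content.
[0004]
In particular, illegally distributed content produced by copying and redistributing (retransmitting) content without the permission (license) of the copyright holder or content provider causes them great financial loss. Technical means are therefore being sought that can detect such illegally distributed content with high accuracy and in a short time, and thereby deter illegal acts.
[0005]
In addition, with the spread of the Internet and the higher performance of digital information devices (storage devices, etc.), the use of moving image content (large-capacity multimedia data) has become common. There is therefore a need for means by which a user can efficiently search for specific desired content among the vast number of contents held on the Internet or in storage devices.
[0006]
To detect illegally distributed content or search for specific content, feature amount extraction techniques have been proposed as an effective means. These extract feature amount data representing the features of the content from the luminance, color information and the like of the video data constituting the content, and use the extracted feature amount data to determine (test) the identity or similarity between contents.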
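[0007]
One conventional feature amount extraction technique, described for example in Non-Patent Document 1, uses the time trajectory of the luminance and color signals of a broadcast program as the feature amount and performs moving image search using this feature amount as a clue. In this method, a television broadcast program is first converted to MPEG-2, and the average color information of each frame is obtained from the DC components of each intra frame of the MPEG-2 stream. Next, the average color information of the frames is placed in a three-dimensional color space, and its trajectory is projected onto the time axis and converted into waveform information. Moving image search (similarity search for moving images) is then performed by scaling and comparing the waveforms specified by this waveform information.
[0008]
A conventional feature amount extraction method and comparison method for similarity search of moving images will now be described with reference to FIG. 11.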
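[0009]
FIG. 11 is a block diagram of a conventional content feature amount extraction device. As shown in FIG. 11, the content feature amount extraction device 101 includes a moving image data luminance/color difference data averaging unit 103, a reference content luminance/color difference data averaging unit 105, a dynamic range adjustment unit 107, a waveform comparison unit 109, a variance value calculation unit 111, and a threshold determination unit 113.
[0010]
The moving image data luminance/color difference data averaging unit 103 receives moving image data, averages the luminance signal (Y) and the color difference signals (Cb, Cr) of the moving image data frame by frame, and outputs feature amount data (a waveform).
[0011]
The reference content luminance/color difference data averaging unit 105 receives reference content, averages the luminance signal (Y) and the color difference signals (Cb, Cr) of the moving image data of the reference content frame by frame, and outputs comparison waveform data.
[0012]
The dynamic range adjustment unit 107 adjusts the maximum and minimum values of the feature amount data (the waveform to be adjusted) output from the averaging unit 103 so that they match the maximum and minimum values of the comparison waveform data output from the averaging unit 105.
[0013]
The waveform comparison unit 109 compares the waveforms of the feature amount data and the comparison waveform data after the dynamic range adjustment unit 107 has adjusted the maximum and minimum values. That is, it outputs the difference between the two waveforms at each time point as a difference data series.
[0014]
The variance value calculation unit 111 calculates a variance value from the difference data series output from the waveform comparison unit 109.
[0015]
The threshold determination unit 113 performs threshold determination based on the variance value calculated by the variance value calculation unit 111 and a preset threshold, judges whether the two waveforms (feature amount data and comparison waveform data) match, and thereby detects the similarity between the moving image data and the reference content.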
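The conventional comparison chain of units 107 to 113 can be illustrated with a minimal Python sketch. The function names, the linear rescaling, and the rule "variance below the threshold means a match" are assumptions made for illustration; the text only states that a variance and a preset threshold are used.

```python
from statistics import pvariance

def match_dynamic_range(target, reference):
    """Linearly rescale `target` so its min/max equal those of `reference` (assumed mapping)."""
    t_min, t_max = min(target), max(target)
    r_min, r_max = min(reference), max(reference)
    if t_max == t_min:
        # Flat waveform: nothing to stretch, map every sample to the reference minimum.
        return [r_min for _ in target]
    scale = (r_max - r_min) / (t_max - t_min)
    return [r_min + (v - t_min) * scale for v in target]

def waveforms_match(target, reference, threshold):
    """Variance of the per-sample difference series, compared with a preset threshold."""
    adjusted = match_dynamic_range(target, reference)
    diffs = [a - b for a, b in zip(adjusted, reference)]
    # Assumption: a small variance is taken to mean that the waveforms match.
    return pvariance(diffs) < threshold
```

Because the two waveforms are compared sample by sample on the time axis, removing frames from one of them shifts every later sample, which is the weakness the invention sets out to address.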
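[0016]
As another conventional technique for extracting content feature amounts, ISO/IEC 15938-3, "MPEG-7 Visual description", describes features of video data (video signals) and specifies feature amount extraction algorithms for extracting the described features. This visual description is assumed to be used mainly for similarity search and filtering based on video data. Within it, the "color layout description (ColorLayout)", which expresses the spatial arrangement of colors on the frequency axis, is defined as a concrete descriptor of low-level features such as color and shape in video data.
[0017]
This color layout description reflects human visual characteristics and enables high-accuracy search for each image frame constituting the content. That is, with the color layout description, unnecessary information can be removed on the frequency axis when the similarity between contents is tested. As a result, the amount of data describing the features of the content is reduced.
[0018]
[Non-Patent Document 1]
Takahashi, Tominaga, Sugiura, Yokoi, Terashima, "Highly efficient video retrieval using characteristic motion picture prints", Journal of the Institute of Image Electronics Engineers of Japan, Vol. 29, No. 6, pp. 818-825 (2000)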
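[0019]
[Problems to be solved by the invention]
However, the conventional content feature amount extraction device 101 shown in FIG. 11 uses feature amount data that is waveform data along the time axis as the feature amount of the moving image data. For content such as long broadcast programs, the amount of feature amount data therefore becomes enormous.
[0020]
Furthermore, because the content feature amount extraction device 101 compares the feature amount data and the comparison waveform data directly on the time axis, the accuracy of detecting the identity or similarity of content drops markedly when edited content is tested, for example when the beginning, middle, or end of a broadcast program has been deleted so that parts are missing compared with the original program.
[0021]
In addition, the conventional visual description extracts feature amounts from a single frame of image data in the video data. When it is applied to content that is moving image data, the amount of feature amount data becomes enormous. Moreover, robustness against edited content (maintaining detection accuracy) is not considered, so the accuracy of detecting the identity or similarity of content drops markedly.
[0022]
An object of the present invention is therefore to solve the problems of the conventional techniques described above and to provide a content feature amount extraction device, a content feature amount extraction program, and a content feature amount extraction method that can maintain the accuracy of detecting the identity or similarity of content without increasing the amount of feature amount data.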
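[0023]
[Means for solving the problems]
To achieve the above object, the present invention has the following configuration.
The content feature amount extraction device according to claim 1 extracts distribution content feature amount data, represented by a specific frequency pattern constituting distribution content that was provided by a content provider and is distributed via a network or a recording medium, and compares this distribution content feature amount data with reference content feature amount data represented by a specific frequency pattern constituting the content. The device comprises reference content feature amount data storage means, distribution content feature amount data extraction means, and feature amount data comparison means, and the distribution content feature amount data extraction means has pixel data averaging means, data rearrangement means, frequency conversion means, frequency data averaging means, and frequency data sum calculation means.
[0024]
With this configuration, the content feature amount extraction device stores the reference content feature amount data, which is the feature amount of the content, in the reference content feature amount data storage means in advance. The distribution content feature amount data extraction means first acquires distribution content that is distributed via a network or a recording medium (for example, an optical disc such as a DVD) through that network or recording medium, and extracts distribution content feature amount data from the acquired distribution content. The reference content feature amount data and the distribution content feature amount data indicate a specific frequency pattern (a unique waveform pattern) constituting the content or the distribution content and are, for example, uniquely determined for each content or distribution content based on the arrangement of the colors of its pixels. That is, the pixels of the content or distribution content form a single time series of that content, and the reference content feature amount data or distribution content feature amount data is generated from this single time series as a single piece of data. These feature amount data correspond to the "FingerPrint" (image print) proposed in MPEG-21.
[0025]
The feature amount data comparison means of the device then compares the reference content feature amount data stored in the reference content feature amount data storage means with the distribution content feature amount data extracted by the distribution content feature amount data extraction means. Based on the result of this comparison, when, for example, the distribution content has been illegally copied and retransmitted, it can be established that the distribution content is identical to the content provided by the content provider.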
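[0027]
In the distribution content feature amount data extraction means, the pixel data averaging means averages the pixel data of the pixels contained in the distribution content for each frame or field constituting the distribution content, and the data rearrangement means rearranges the predetermined-unit pixel data averaged by the pixel data averaging means into time-series data. The pixel data of the pixels contained in the distribution content may be luminance data, color difference data (Cb, Cr, etc.) of each pixel in successive frames, color signal data (RGB), or a combination of these.
[0028]
Next, the frequency conversion means converts the time-series data rearranged by the data rearrangement means into frequency data for each fixed length, and the frequency data averaging means averages this frequency data for each frequency. The frequency data sum calculation means then sums the averaged frequency data over the entire frequency range, and this sum is used as the feature amount data. In other words, by converting the time-series data into frequency data, the frequency conversion means generates frequency data, which is a single piece of data, from the luminance data, which is a single time series of the content. This frequency data is averaged, the sum over the entire frequency range is obtained, and this sum is used as the feature amount data (distribution content feature amount data).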
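[0029]
The content feature amount extraction device according to claim 2 is the device according to claim 1 in which the data rearrangement means selects and arranges the predetermined-unit pixel data at fixed intervals and rearranges the data by repeating this arrangement while shifting it by one frame or one field at a time.
[0030]
With this configuration, the data rearrangement means selects and arranges the predetermined-unit pixel data at fixed intervals and repeats this arrangement while shifting it by one frame or one field at a time, so that new time-series data (a data series) is obtained.
[0031]
The content feature amount extraction device according to claim 3 is the device according to claim 1 or 2 in which the content and the distribution content each consist of a plurality of scenes, the reference content feature amount data, which is the feature amount of the content, includes reference scene feature amount data corresponding to the scenes, and the distribution content feature amount data, which is the feature amount of the distribution content, includes distribution scene feature amount data corresponding to the scenes.
[0032]
With this configuration, the device can handle content and distribution content consisting of a plurality of scenes. Because the reference content feature amount data includes reference scene feature amount data for each scene and the distribution content feature amount data includes distribution scene feature amount data for each scene, a specific scene of desired distribution content can be searched for based on these scene feature amount data.
[0033]
The content feature amount extraction device according to claim 4 is the device according to any one of claims 1 to 3 in which the feature amount data comparison means has identity detection means that detects the identity of the distribution content and the content based on the absolute value of the difference between the distribution content feature amount data and the reference content feature amount data and on a preset feature amount data identity threshold.
[0034]
With this configuration, the identity detection means detects the identity of the distribution content and the content based on the feature amount data identity threshold. Illegally distributed content can thereby be detected.
[0035]
The content feature amount extraction device according to claim 5 is the device according to any one of claims 1 to 4 in which the feature amount data comparison means has similarity detection means that detects the similarity of the distribution content and the content based on the absolute value of the difference between the distribution content feature amount data and the reference content feature amount data and on a preset feature amount data similarity threshold.
[0036]
With this configuration, the similarity detection means detects the similarity of the distribution content and the content based on the feature amount data similarity threshold. Distribution content resembling the content can thereby be searched for.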
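[0041]
[Embodiments of the invention]
An embodiment of the present invention will now be described in detail with reference to the drawings.
(Configuration of the content feature amount extraction device)
FIG. 1 is a block diagram of the content feature amount extraction device. As shown in FIG. 1, the content feature amount extraction device 1 extracts and stores reference content feature amount data, which is the feature amount of reference content (a reference program), extracts distribution content feature amount data, which is the feature amount of distribution content, and compares these feature amount data in order to detect illegally distributed content or to search for a specific scene of a program (a kind of content). It includes a reference content feature amount data extraction/management unit 3 and a distribution content feature amount data extraction/comparison unit 5.
[0042]
The reference content feature amount data and the distribution content feature amount data indicate a specific frequency pattern (a unique waveform pattern) constituting the content or distribution content. They are feature amounts that identify the reference content or distribution content, in other words an "image print" analogous to the fingerprint that identifies a person. For example, they are uniquely determined for each content or distribution content based on the arrangement of the colors (pixel data) of its pixels (details are given later).
[0043]
The content feature amount extraction device 1 is built on a general server connected to a network. Each unit and each means is a functional designation of the result of cooperatively using the hardware resources of the server (CPU, memory, hard disk, etc.) through newly written software.
[0044]
In this embodiment the reference content feature amount data extraction/management unit 3 and the distribution content feature amount data extraction/comparison unit 5 are parts of the content feature amount extraction device 1, but they may instead be separate devices (servers) configured to exchange data and control signals.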
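[0045]
The reference content feature amount data extraction/management unit 3 extracts and stores the reference content feature amount data, which is the feature amount of the reference content (reference program). It includes reference content feature amount data extraction means 7, reference content feature amount data storage means 9, and feature amount data management means 11. The reference content (reference program) corresponds to the content recited in the claims and refers to one or more reference contents (programs) against which illegally distributed content is compared or searched.
[0046]
The reference content feature amount data extraction means 7 extracts the reference content feature amount data from the reference content and outputs it to the reference content feature amount data storage means 9. The extraction uses the same method as the distribution content feature amount data extraction means described later. The reference content feature amount data is obtained by averaging the luminance data in each frame of the reference content frame by frame, rearranging the averages into a time series, applying frequency conversion, and summing the resulting frequency data. In other words, the reference content feature amount data allows the single time series of the reference content (the luminance data of each frame) to be handled as a single value.
[0047]
In this embodiment the reference content feature amount data is obtained from the luminance data in each frame of the reference content, but it may also be obtained from, for example, color difference data (Cb, Cr, etc.) or color signal data (RGB). That is, the reference content feature amount data can be obtained from pixel data of the pixels contained in the reference content.
[0048]
When the reference content consists of a plurality of scenes, the reference content feature amount data is obtained for each scene. That is, based on the metadata assigned to each scene, reference scene feature amount data, which is the feature amount of each scene, is assigned.
[0049]
The reference content feature amount data storage means 9 stores the reference content feature amount data extracted by the reference content feature amount data extraction means 7. It is under the control of the feature amount data management means 11 and, based on a control signal (reference content feature amount data sequential output signal) output from the feature amount data management means 11, sequentially outputs the stored reference content feature amount data to the feature amount data comparison means 17 (described later) of the distribution content feature amount data extraction/comparison unit 5.
[0050]
The feature amount data management means 11 controls the reference content feature amount data extraction/management unit 3. Based on a control signal (reference content feature amount data output start signal) output from the distribution content feature amount data extraction/comparison unit 5, it causes the reference content feature amount data to be output from the reference content feature amount data extraction/management unit 3 (reference content feature amount data storage means 9).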
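[0051]
The distribution content feature amount data extraction/comparison unit 5 extracts distribution content feature amount data, which is the feature amount of distribution content (the program or scene to be searched) distributed via a network (the Internet, an intranet, etc.) or a recording medium (an optical disc such as a DVD, a VTR, etc.), and compares it with the reference content feature amount data. It includes distribution content feature amount data extraction means 13, control means 15, feature amount data comparison means 17, and result display means 19.
[0052]
The distribution content feature amount data extraction means 13 extracts the distribution content feature amount data of the input distribution content (the program or scene to be searched). It includes luminance data averaging means 13a, data rearrangement means 13b, frequency conversion means 13c, frequency data averaging means 13d, and frequency data sum calculation means 13e. It starts extracting the distribution content feature amount data from the distribution content based on a control signal (distribution content feature amount data output start signal) output from the control means 15.
[0053]
When the distribution content consists of a plurality of scenes, the distribution content feature amount data is obtained for each scene. That is, based on the metadata assigned to each scene, distribution scene feature amount data, which is the feature amount of each scene, is assigned.
[0054]
The luminance data averaging means 13a receives the moving image data of the distribution content as the luminance values (luminance data) of the pixels in successive frames (fields) and obtains the average luminance value (predetermined-unit luminance data) by averaging these luminance values. In other words, it performs the averaging of the luminance values.
[0055]
In this averaging, the luminance data averaging means 13a calculates, for each frame (field), the average of the luminance values of all pixels in that frame (field). Here the average is the sum of the luminance values of all pixels divided by the total number of pixels.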
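The per-frame averaging of means 13a can be sketched as follows. This is a minimal illustration that assumes each frame is given as a list of rows of luminance values; the function names are hypothetical.

```python
def mean_luminance(frame):
    """Sum of all pixel luminance values divided by the number of pixels in the frame."""
    total = sum(sum(row) for row in frame)
    count = sum(len(row) for row in frame)
    return total / count

def mean_luminance_series(frames):
    """One average luminance value per frame (or field), in playback order."""
    return [mean_luminance(frame) for frame in frames]
```

Each frame is thereby reduced to one number, so a clip of N frames yields a series of N average luminance values.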
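[0056]
Alternatively, the frame may be divided into small blocks of an arbitrary block size (for example, 8 horizontal pixels by 8 vertical lines), a DCT (Discrete Cosine Transform) may be applied to every block to obtain the DC coefficient of each small block, and the average of these DC coefficients may be used.
[0057]
As a further alternative, all pixels in each frame (field) may be divided into an arbitrary number of blocks, the luminance values of all pixels in each block may be averaged to create a reduced image, and a DCT may be applied to the reduced image to obtain its DC coefficient.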
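The reduced-image variant can be sketched in Python as below. Two points are assumptions, not statements from the text: the frame dimensions are taken to be exact multiples of the block counts, and the DCT is taken to be the orthonormal 2-D DCT-II, whose DC term equals the sum of all samples divided by the square root of the number of samples. Function names are hypothetical.

```python
import math

def reduce_by_block_average(frame, blocks_v=8, blocks_h=8):
    """Average the luminance inside each block to build a blocks_v x blocks_h reduced image."""
    height, width = len(frame), len(frame[0])
    bh, bw = height // blocks_v, width // blocks_h  # assumes exact divisibility
    reduced = []
    for by in range(blocks_v):
        row = []
        for bx in range(blocks_h):
            block = [frame[y][x]
                     for y in range(by * bh, (by + 1) * bh)
                     for x in range(bx * bw, (bx + 1) * bw)]
            row.append(sum(block) / len(block))
        reduced.append(row)
    return reduced

def dct_dc_coefficient(image):
    """DC term of the orthonormal 2-D DCT-II: sum of samples / sqrt(rows * cols)."""
    rows, cols = len(image), len(image[0])
    return sum(sum(r) for r in image) / math.sqrt(rows * cols)
```

Under this scaling the DC coefficient of an 8 x 8 reduced image is 8 times its mean, so it carries the same information as the plain frame average up to a constant factor.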
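[0058]
In this embodiment the input to the distribution content feature amount data extraction means 13 is the luminance value (luminance data) of each pixel in successive frames (fields) of the distribution content, but this is not a limitation. For example, the color difference data (Cb, Cr) or color signal data (RGB) of each pixel in successive frames may be used, as may any combination of luminance data, color difference data, and color signal data. The luminance value (luminance data) corresponds to the pixel data recited in the claims, and the average luminance value (predetermined-unit luminance data) corresponds to the predetermined-unit pixel data.
[0059]
The data rearrangement means 13b performs data rearrangement, in which the average luminance values (predetermined-unit luminance data) obtained by the luminance data averaging means 13a are rearranged into a new data series (time-series data). That is, it selects and arranges the average luminance values at fixed intervals and repeats this arrangement while shifting it by one frame or one field at a time. Hereinafter, one period of the new data series is called an average luminance value sample series. Details (the concept) of this rearrangement are described later with reference to FIG. 5.
[0060]
The frequency conversion means 13c applies a frequency conversion such as a DFT (Discrete Fourier Transform) or FFT (Fast Fourier Transform), for each fixed length, to the plurality of average luminance value sample series of the new data series rearranged by the data rearrangement means 13b. The new data series after frequency conversion is called frequency data.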
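A direct DFT of one average luminance value sample series can be sketched with the Python standard library as follows. The division by the series length is an assumed normalization (the text does not specify one), and a real implementation would normally use an FFT routine instead of this O(n^2) loop.

```python
import cmath

def dft_power(samples):
    """Power of each DFT bin of one sample series, |X[k]|^2 / n (assumed normalization)."""
    n = len(samples)
    power = []
    for k in range(n):
        acc = sum(x * cmath.exp(-2j * cmath.pi * k * t / n)
                  for t, x in enumerate(samples))
        power.append(abs(acc) ** 2 / n)
    return power
```

Applying `dft_power` to every sample series of the rearranged data yields the set of spectra that the next stage averages frequency by frequency.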
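[0061]
The frame (field) unit used in the averaging by the luminance data averaging means 13a, the selection interval of the average luminance values used in the rearrangement by the data rearrangement means 13b, and the average luminance value sample series used in the frequency conversion by the frequency conversion means 13c are determined in consideration of the accuracy of detecting the identity or similarity of content and the processing speed of the content feature amount extraction device 1.
[0062]
The frequency data averaging means 13d performs frequency data averaging, in which the plurality of frequency data produced by the frequency conversion means 13c are averaged for each frequency to obtain frequency characteristic data (averaged frequency data). It outputs the averaged frequency data to the frequency data sum calculation means 13e. The frequency characteristic data is expressed by the power at each frequency in logarithmic representation and by the frequency normalized by the frame frequency.
[0063]
The frequency data sum calculation means 13e sums the frequency characteristic data (averaged frequency data) obtained by the frequency data averaging means 13d over the entire frequency range and outputs the resulting summed averaged frequency data to the feature amount data comparison means 17 as the distribution content feature amount data. That is, it calculates the sum of the logarithmic power values over the entire frequency range and uses this value as the distribution content feature amount data.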
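The last two stages (13d and 13e) can be sketched as below. Several details are assumptions for illustration only: the spectra are averaged as linear power before the logarithm is taken, the logarithm is expressed in decibels (10 log10), and a small floor avoids the logarithm of zero. The function names are hypothetical.

```python
import math

def average_spectra(spectra):
    """Average several power spectra bin by bin (one spectrum per sample series)."""
    n = len(spectra)
    return [sum(column) / n for column in zip(*spectra)]

def feature_value(avg_power, floor=1e-12):
    """Sum of log power over all frequency bins; the result is a single scalar."""
    return sum(10.0 * math.log10(max(p, floor)) for p in avg_power)
```

Whatever the length of the content, the output is one number per signal component, which is why the amount of feature amount data does not grow with programme length.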
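[0064]
The control means 15 controls the content feature amount extraction device 1 as a whole. It outputs a control signal (reference content feature amount data output start signal) to the feature amount data management means 11 of the reference content feature amount data extraction/management unit 3, a control signal (distribution content feature amount data output start signal) to the distribution content feature amount data extraction means 13, and the thresholds to the feature amount data comparison means 17.
[0065]
The control means 15 records (holds) preset thresholds in recording means (not shown). These thresholds are a feature amount data identity threshold and a feature amount data similarity threshold.
[0066]
The feature amount data identity threshold is the criterion used by the feature amount data comparison means 17 to determine, from the reference content feature amount data and the distribution content feature amount data, whether the distribution content is identical to the reference content. When the distribution content is determined to be identical to the reference content, it can hardly be said to have been delivered and to be circulating via a legitimate route, and it is concluded to be illegally distributed content.
[0067]
The feature amount data similarity threshold is the criterion used by the feature amount data comparison means 17 to determine, from the reference content feature amount data and the distribution content feature amount data, whether the distribution content is similar to the reference content. That is, it is used together with the reference content feature amount data (reference scene feature amount data) of the reference content (reference program) when searching for desired distribution content (a program to be searched) or a scene that is part of distribution content.
[0068]
In this embodiment the thresholds are recorded in the recording means (not shown) of the control means 15, but they may instead be recorded in the feature amount data comparison means 17. In that case the control means 15 outputs to the feature amount data comparison means 17 a control signal (threshold use signal) that causes it to compare the reference content feature amount data and the distribution content feature amount data using the thresholds.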
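[0069]
The feature amount data comparison means 17 determines whether the reference content and the distribution content are identical, similar, or dissimilar, based on the reference content feature amount data output from the reference content feature amount data storage means 9, the distribution content feature amount data extracted by the distribution content feature amount data extraction means 13, and the thresholds output from the control means 15. It includes identity detection means 17a and similarity detection means 17b.
[0070]
The identity detection means 17a detects the identity of the reference content and the distribution content based on the absolute value of the difference between the reference content feature amount data and the distribution content feature amount data and on the feature amount data identity threshold.
[0071]
The similarity detection means 17b detects the similarity of the reference content and the distribution content based on the absolute value of the difference between the reference content feature amount data and the distribution content feature amount data and on the feature amount data similarity threshold.
[0072]
That is, when neither the identity detection means 17a nor the similarity detection means 17b detects identity or similarity, the feature amount data comparison means 17 treats the reference content and the distribution content as dissimilar.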
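The three-way decision of comparison means 17 can be sketched as follows. The sketch assumes scalar feature values and an identity threshold that is no larger than the similarity threshold; the text does not state the relation between the two thresholds, and the function name and return labels are hypothetical.

```python
def classify(g_distribution, g_reference, th_identity, th_similarity):
    """Return 'identical', 'similar' or 'dissimilar' from the absolute feature difference."""
    diff = abs(g_distribution - g_reference)
    if diff < th_identity:      # identity detection (17a)
        return "identical"
    if diff < th_similarity:    # similarity detection (17b)
        return "similar"
    return "dissimilar"         # neither detector fired
```

For example, with thresholds 1.0 and 5.0, feature values 100.2 and 100.0 would be classed as identical, while 103.0 and 100.0 would be classed as similar.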
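[0073]
The result display means 19 displays the result of the comparison by the feature amount data comparison means 17. It displays, for example, a notice that the reference content and the distribution content are identical, or the number and titles of the distribution contents found by searching with the reference scene feature amount data of each scene of the reference content.
[0074]
With the content feature amount extraction device 1, the reference content feature amount data is stored in the reference content feature amount data storage means 9, and the distribution content feature amount data extraction means 13 acquires distribution content distributed via a network or a recording medium through that network or recording medium and extracts the distribution content feature amount data from it. The feature amount data comparison means 17 compares the reference content feature amount data with the distribution content feature amount data. Based on this comparison, the identity or similarity of the reference content and the distribution content can be detected.
[0075]
Also, in the content feature amount extraction device 1, the luminance data averaging means 13a averages the luminance data of the distribution content for each frame or field, and the data rearrangement means 13b rearranges the averaged predetermined-unit luminance data into time-series data. The frequency conversion means 13c then converts the rearranged time-series data into frequency data for each fixed length, and the frequency data averaging means 13d averages this frequency data for each frequency. Finally, the frequency data sum calculation means 13e sums the averaged frequency data over the entire frequency range, and this sum is used as the feature amount data.
[0076]
In other words, the frequency conversion means 13c converts the time-series data into frequency data and thereby generates a plurality of frequency data from the luminance data, which is a single time series of the content. This frequency data is averaged, the sum over the entire frequency range is obtained, and this sum is used as the distribution content feature amount data. The accuracy of detecting the identity or similarity of the reference content and the distribution content can therefore be maintained without increasing the amount of data.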
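[0077]
That is, when a content provider that supplies content such as broadcast programs (reference content) uses the content feature amount extraction device 1, the feature amount of distribution content circulating via the external Internet, an internal intranet, or a recording medium such as an optical disc (for example, a DVD) can be compared with the feature amount of the content held by the content provider (reference content feature amount data). By detecting their identity, illegally distributed content can be detected.
[0078]
The content feature amount extraction device 1 can also compare the feature amount of the content to be searched (distribution content feature amount data) with the feature amount of the content to be referred to (reference content feature amount data) and test their similarity, and can thereby find a target program or scene in the group of programs (the set of reference contents) stored and managed by the content provider.
[0079]
Furthermore, in the content feature amount extraction device 1, the data rearrangement means 13b selects and arranges the predetermined-unit luminance data at fixed intervals and repeats this arrangement while shifting it by one frame or one field at a time, so that new time-series data (a data series) is obtained.
[0080]
The content feature amount extraction device 1 can also handle reference content and distribution content consisting of a plurality of scenes. Because the reference content feature amount data includes reference scene feature amount data for each scene and the distribution content feature amount data includes distribution scene feature amount data for each scene, a specific scene of desired distribution content can be searched for based on these scene feature amount data.
[0081]
In addition, the identity detection means 17a of the feature amount data comparison means 17 detects the identity of the reference content and the distribution content based on the feature amount data identity threshold. Illegally distributed content can thereby be detected.
[0082]
Likewise, the similarity detection means 17b of the feature amount data comparison means 17 detects the similarity of the distribution content and the reference content based on the feature amount data similarity threshold. Distribution content resembling the reference content (a desired program, etc.) can thereby be searched for.
[0083]
When the amplitude levels of the input moving image data series of the distribution content and the reference content differ so greatly that the comparison in the feature amount data comparison means 17 is affected, each moving image data series may be normalized by its own maximum value before the data rearrangement by the data rearrangement means 13b and the subsequent processing are performed.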
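The normalization mentioned here can be sketched in a few lines. The sketch assumes it is applied to the series of per-frame average values and divides by the largest absolute value; the function name is hypothetical.

```python
def normalize_by_peak(series):
    """Divide every sample by the largest absolute value so the peak becomes 1.0."""
    peak = max(abs(v) for v in series)
    if peak == 0:
        return list(series)  # an all-zero series is returned unchanged
    return [v / peak for v in series]
```

After this step a copy of the same programme with uniformly scaled luminance produces the same series as the original, so the later stages see comparable amplitudes.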
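[0084]
This embodiment has described the case in which the content feature amount extraction device 1 processes content consisting of moving image data, but it can also process music content consisting of audio data, and the feature amount data (reference content feature amount data, distribution content feature amount data) can also be detected from a combination of moving image data and audio data. When music content consisting of audio data is processed, the feature amount data is represented by the waveform pattern, frequency spectrum, and the like of the audio data.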
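[0085]
(Operation of the content feature amount extraction device)
Next, the operation of the content feature amount extraction device 1 will be described with reference to the flowchart in FIG. 2 (see also FIG. 1 as appropriate).
First, the content feature amount extraction device 1 extracts the reference content feature amount data with the reference content feature amount data extraction means 7 (S1). The extracted reference content feature amount data is stored in the reference content feature amount data storage means 9 (S2).
[0086]
The control means 15 then inputs the thresholds (feature amount data identity threshold Th1 and feature amount data similarity threshold Th2) to the feature amount data comparison means 17 (S3), and also inputs a control signal (distribution content feature amount data output start signal) to the distribution content feature amount data extraction means 13 and a control signal (reference content feature amount data output start signal) to the feature amount data management means 11.
[0087]
The distribution content feature amount data extraction means 13 then extracts the distribution content feature amount data g1 (S4) and outputs it to the feature amount data comparison means 17. Next, the feature amount data management means 11 outputs a control signal (reference content feature amount data sequential output signal) to the reference content feature amount data storage means 9, and in response the stored reference content feature amount data g2 is output from the reference content feature amount data storage means 9 to the feature amount data comparison means 17 (S5).
[0088]
The feature amount data comparison means 17 then determines whether the absolute value of the difference between the distribution content feature amount data g1 and the reference content feature amount data g2 is smaller than the feature amount data identity threshold Th1, or whether it is smaller than the feature amount data similarity threshold Th2 (S6).
[0089]
When the absolute value of the difference between g1 and g2 is determined to be smaller than the identity threshold Th1, or smaller than the similarity threshold Th2 (S6, Yes), the distribution content and the reference content are determined to be the same content or similar programs (similar content) (S7).
[0090]
When the absolute value of the difference between g1 and g2 is not determined to be smaller than the identity threshold Th1, or is not determined to be smaller than the similarity threshold Th2 (S6, No), the distribution content and the reference content are determined to be different content or dissimilar programs (content that is not similar) (S8).
[0091]
The processing from S5 to S8 is repeated until the number of reference content feature amount data items stored in the reference content feature amount data storage means 9 (the predetermined number of iterations) is reached (S9, No), and the operation ends when that number is reached (S9, Yes).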
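The loop over steps S5 to S9 can be sketched as follows. The dictionary of stored reference features keyed by title is a hypothetical stand-in for storage means 9, and the function name is illustrative.

```python
def scan_references(g1, references, th1, th2):
    """Compare one distribution feature g1 against every stored reference feature."""
    matches = []
    for title, g2 in references.items():   # S5: next stored reference feature
        diff = abs(g1 - g2)
        if diff < th1 or diff < th2:       # S6: identity or similarity test
            matches.append(title)          # S7: same or similar content
        # S8: otherwise different or dissimilar content, nothing recorded
    return matches                         # S9: all stored references visited
```

Since each comparison is a single subtraction, the cost of one scan grows only with the number of stored references and not with the length of the programmes.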
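[0092]
(Change of the average luminance value over time)
Next, the change of the average luminance value (predetermined-unit luminance data) over time will be described with reference to FIGS. 3 and 4 (see also FIG. 1 as appropriate).
[0093]
FIGS. 3 and 4 show the change over time of the average luminance value after the luminance values have been averaged by the luminance data averaging means 13a of the distribution content feature amount data extraction means 13.
[0094]
More specifically, FIG. 3 shows an example in which the average luminance values of 17920 consecutive frames of an approximately 10-minute broadcast program A were obtained and the first 3000 frames were plotted in time order. FIG. 4 is an enlargement of part of the waveform in FIG. 3. In FIGS. 3 and 4 the vertical axis is the average luminance value and the horizontal axis is the consecutive frame number.
[0095]
In this example, each frame was divided into 64 blocks (8 horizontal by 8 vertical), the luminance values of all pixels in each block were averaged to create a reduced image of 8 horizontal pixels by 8 vertical lines, a DCT was applied to the reduced image to obtain the DC coefficient, and this DC coefficient was used as the average luminance value (predetermined-unit luminance data). As FIG. 3 shows, arranging the resulting average luminance values in time order reveals how the average luminance value changes over time.
[0096]
FIG. 4 shows part of the graph in FIG. 3 (frame numbers 700 to 1400) enlarged along the time (frame) axis. As FIG. 4 shows, the difference in average luminance value between adjacent frames is small. That is, except where the picture changes greatly, as at a scene change, the average luminance value normally varies little within about 3 seconds (equivalent to 90 frames).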
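[0097]
(Concept of the data rearrangement)
Next, the concept of the data rearrangement by the data rearrangement means 13b will be described with reference to FIG. 5. FIG. 5 illustrates the concept schematically using symbols.
[0098]
As FIG. 5 shows, the data rearrangement means 13b selects and arranges the average luminance values (predetermined-unit luminance data) at fixed intervals and repeats this operation while shifting by one frame at a time, thereby obtaining a new data series (time-series data).
[0099]
That is, the symbols "black circle", "triangle", "square", "cross", and "rotated square" at times 1 to 20 in [Change of the average luminance value over time] are rearranged in [New data series] so that identical symbols are consecutive, and each group of identical symbols forms an average luminance value sample series. For example, the "black circle" sample series in the new data series is the set of average luminance values at times 1, 6, 11, and 16.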
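The rearrangement in this example can be sketched with list slicing. The interval of 5 is inferred from the example times 1, 6, 11, and 16; the function names are hypothetical.

```python
def rearrange(values, interval):
    """Group every `interval`-th value, shifting the start by one frame per group."""
    return [values[offset::interval] for offset in range(interval)]

def flatten(series):
    """Concatenate the sample series into the new data series."""
    return [v for s in series for v in s]

# Times 1..20 stand in for the average luminance values at those times.
times = list(range(1, 21))
sample_series = rearrange(times, 5)   # sample_series[0] is the "black circle" series
new_series = flatten(sample_series)
```

Each sample series spans the whole length of the content at a coarse step, which is why a cut at the beginning or end changes every series only slightly.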
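[0100]
Through this operation of the data rearrangement means 13b, the change over time of the average luminance value of the entire distribution content can be expressed with a small amount of data regardless of the data amount of the distribution content. When the distribution content is a data series of reasonable length, the processing after the data rearrangement means 13b, namely the frequency conversion, can be performed efficiently.
[0101]
(Superposition of average luminance value sample series)
Next, the superposition of average luminance value sample series will be described with reference to FIG. 6.
[0102]
FIG. 6 shows the result of applying the data rearrangement by the data rearrangement means 13b, described with FIG. 5, to the average luminance values (predetermined-unit luminance data) at an interval of 70 frames and superimposing seven waveforms of the resulting average luminance sample series taken every 10 periods. In FIG. 6 the vertical axis is the average luminance value and the horizontal axis is the position within one period of the average luminance sample series. Here div(A, B) means the quotient of A divided by B. As explained with FIG. 4, the difference in average luminance value between adjacent frames is small, so the waveforms almost coincide.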
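[0103]
(Frequency characteristics of program data)
Next, the frequency characteristics of program data will be described with reference to FIG. 7 (see also FIG. 1 as appropriate).
[0104]
FIG. 7 shows an example of the frequency characteristics obtained for four contents (moving image data): program A (hereinafter, content A), reduced content A1 created by removing the first 10% of content A, another program B (hereinafter, content B), and reduced content B1 created by removing the first 10% of content B. Each was processed by the averaging of the luminance data averaging means 13a, the rearrangement of the data rearrangement means 13b, the frequency conversion of the frequency conversion means 13c, and the frequency data averaging of the frequency data averaging means 13d. An FFT was used for the frequency conversion. In FIG. 7 the vertical axis is the power at each frequency in logarithmic representation and the horizontal axis is the frequency normalized by the frame frequency.
[0105]
As FIG. 7 shows, different programs, that is, content A and content B, can be distinguished by their frequency characteristics. It can also be seen that the frequency characteristics of content A and reduced content A1 (content A with its first 10% removed) are highly correlated, as are those of content B and reduced content B1 (content B with its first 10% removed).
[0106]
(Example of calculated feature amount data)
Next, an example of calculated feature amount data (reference content feature amount data, distribution content feature amount data) will be described with reference to FIG. 8 (see also FIG. 1 as appropriate).
[0107]
FIG. 8 shows the distribution content feature amount data obtained when the luminance data (luminance signal Y) and color difference data (color difference signals Cb, Cr) of content A, reduced content A1, content B, and reduced content B1, described with FIG. 7, were input to the distribution content feature amount data extraction means 13 of the content feature amount extraction device 1.
[0108]
As FIG. 8 shows, for both the luminance data (luminance signal Y) and the color difference data (color difference signals Cb, Cr), content A and reduced content A1 have distribution content feature amount data close to each other, and content B and reduced content B1 have distribution content feature amount data close to each other.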
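[0109]
(Example of inter-program distance evaluation using feature amount data)
Next, an example of evaluating the distance between programs (between contents) using the feature amount data (reference content feature amount data, distribution content feature amount data) will be described with reference to FIG. 9 (see also FIG. 1 as appropriate).
[0110]
FIG. 9 shows an example of the distance evaluation results for content A, reduced content A1, content B, and reduced content B1 when, in addition to the luminance values (luminance data), that is, the luminance signal Y, the color difference data, that is, the color difference signals (Cb, Cr), are input to the distribution content feature amount data extraction means 13.
[0111]
For this evaluation, let g1_Y be the distribution content feature amount data obtained from the luminance signal Y, g2_Y the reference content feature amount data obtained from the luminance signal Y, g1_Cb and g2_Cb the distribution and reference content feature amount data obtained from the color difference signal Cb, and g1_Cr and g2_Cr the distribution and reference content feature amount data obtained from the color difference signal Cr. The inter-program distance (inter-content distance) D between the distribution content and the reference content is then given by equation (1):
[0112]
[Equation 1]

[0113]
The table lists the inter-program distances (inter-content distances) D obtained with equation (1) for content A, reduced content A1, content B, and reduced content B1.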
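Equation (1) itself is not reproduced in this text, so the exact form of D cannot be confirmed here. Purely as an assumption for illustration, the sketch below combines the three per-component differences as a Euclidean distance; the function name and the dictionary layout are hypothetical.

```python
import math

def inter_content_distance(g1, g2):
    """ASSUMED form of D: Euclidean distance over the Y, Cb and Cr feature values."""
    return math.sqrt(sum((g1[c] - g2[c]) ** 2 for c in ("Y", "Cb", "Cr")))
```

Any distance built from the three differences g1_Y - g2_Y, g1_Cb - g2_Cb, and g1_Cr - g2_Cr would be used in the same way: a small D suggests the same programme, a large D a different one.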
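[0114]
As FIG. 9 shows, the distance between content A and reduced content A1 (first 10% removed), and the distance between content B and reduced content B1 (first 10% removed), are sufficiently smaller than the distance between different programs (contents), that is, between content A and content B. By setting an appropriate threshold (feature amount data identity threshold), the identity of programs (contents) can be detected with high accuracy.
[0115]
(Concept of extracting and managing reference content feature amount data)
Next, the concept of extracting and managing the reference content feature amount data in the reference content feature amount data extraction/management unit 3 will be described with reference to FIG. 10.
[0116]
As FIG. 10 shows, a program (reference content) consists of a plurality of scenes (scene 1, scene 2, ..., scene (n-1), scene n), and a feature amount is set for each scene (feature amount 1, feature amount 2, ..., feature amount (n-1), feature amount n; reference scene feature amount data).
[0117]
The feature amounts of the individual scenes (reference scene feature amount data) are gathered into the feature amount of the whole program (reference content feature amount data), stored in the reference content feature amount data storage means 9, and managed by the feature amount data management means 11. That is, according to the control signal output from the feature amount data management means 11, either the reference scene feature amount data or the reference content feature amount data is selected and output to the feature amount data comparison means 17.
[0118]
The present invention has been described above based on one embodiment, but it is not limited to this embodiment.
For example, the processing of each component of the content feature amount extraction device 1 may be regarded as a content feature amount extraction program written in a general-purpose computer language, or as a content feature amount extraction method in which the processing of each component is treated as one step. In these cases the same effects as those of the content feature amount extraction device 1 are obtained.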
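[0119]
[Effects of the invention]
According to the invention of claim 1, the identity or similarity of the content and the distribution content can be detected based on the comparison of these feature amount data.
[0120]
According to the invention of claim 1, frequency data, which is a single piece of data, is generated from the luminance data, which is a single time series of the content. This frequency data is averaged, the sum over the entire frequency range is obtained, and this sum is used as the distribution content feature amount data. The accuracy of detecting the identity or similarity of the content and the distribution content can therefore be maintained without increasing the amount of data.
[0121]
According to the invention of claim 2, the predetermined-unit pixel data is selected and arranged at fixed intervals and this arrangement is repeated while being shifted by one frame or one field at a time, so that new time-series data (a data series) is obtained and the subsequent processing, such as frequency conversion, can be performed easily.
[0122]
According to the invention of claim 3, the reference content feature amount data includes reference scene feature amount data for each scene and the distribution content feature amount data includes distribution scene feature amount data for each scene, so a specific scene of desired distribution content can be searched for based on these scene feature amount data.
[0123]
According to the invention of claim 4, the identity of the content and the distribution content is detected based on the feature amount data identity threshold. Illegally distributed content can thereby be detected.
[0124]
According to the invention of claim 5, the similarity of the distribution content and the content is detected based on the feature amount data similarity threshold. Distribution content resembling the content (a desired program, etc.) can thereby be searched for.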
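[Brief description of the drawings]
[FIG. 1] A block diagram of a content feature amount extraction device according to an embodiment of the present invention.
[FIG. 2] A flowchart explaining the operation of the content feature amount extraction device shown in FIG. 1.
[FIG. 3] A diagram showing the change over time of the average luminance value after the luminance values have been averaged.
[FIG. 4] A diagram showing part of the change over time of the average luminance value in FIG. 3 in enlarged form.
[FIG. 5] A diagram explaining the concept of the data rearrangement.
[FIG. 6] A diagram explaining the superposition of average luminance value sample series.
[FIG. 7] A diagram explaining the frequency characteristics of program data.
[FIG. 8] A diagram explaining an example of calculated feature amount data.
[FIG. 9] A diagram explaining an example of inter-program distance evaluation using feature amount data.
[FIG. 10] A diagram explaining the concept of extracting and managing reference content feature amount data in the reference content feature amount data extraction/management unit.
[FIG. 11] A block diagram of a conventional content feature amount extraction device.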
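[Explanation of reference signs]
1 Content feature amount extraction device
3 Reference content feature amount data extraction/management unit
5 Distribution content feature amount data extraction/comparison unit
7 Reference content feature amount data extraction means
9 Reference content feature amount data storage means
11 Feature amount data management means
13 Distribution content feature amount data extraction means
15 Control means
17 Feature amount data comparison means
17a Identity detection means
17b Similarity detection means
19 Result display means
[0001]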
BACKGROUND OF THE INVENTION
The present invention relates to a content feature amount extraction device that extracts a feature amount of content distributed via a network or a recording medium and uses the extracted feature amount for detecting illegal distribution and for similarity search.
[0002]
[Prior art]
With the recent increase in network speed and recording-medium capacity, an environment has been established in which anyone can easily deliver large-capacity digital content consisting of video data, audio data and the like (hereinafter, "content") over public communication lines that are high-speed networks (optical fiber lines, ADSL, etc.), or distribute it on optical discs (DVD, etc.) that are large-capacity recording media.
[0003]
Content delivered via a network is easy to store, and content recorded on a recording medium is easy to ship. It is therefore easy to commit illegal acts without the permission of the copyright holder of the content (hereinafter simply "copyright holder") or of the distributor that delivers the content (hereinafter "content provider"), such as copying stored content and redistributing it via a network, or tampering with the content. Such illegal acts are a major impediment to the distribution of content.
[0004]
In particular, illegally distributed content produced by copying and redistributing (retransmitting) content without the permission (license) of the copyright holder or content provider causes them great financial loss. Technical means are therefore being sought that can detect such illegally distributed content with high accuracy and in a short time, and thereby deter illegal acts.
[0005]
In addition, with the spread of the Internet and higher performance of digital information devices (storage devices, etc.), the use of moving image content (large-capacity multimedia data), which is a large-capacity content, has become common. For this reason, there is a need for means for efficiently searching for specific content desired by a user from a vast number of contents held on the Internet or in a storage device.
[0006]
In order to detect illegally distributed content or to search for specific content, feature amount data representing the feature of the content is extracted from the luminance and color information of the video data constituting the content, and the extracted feature amount data is A feature amount extraction technique used for determination (testing) of identity and similarity between contents has been proposed as an effective means.
[0007]
One conventional feature amount extraction technique, described for example in Non-Patent Document 1, uses the time trajectory of the luminance and color signals of a broadcast program as the feature amount and performs moving image search using this feature amount as a clue. In this method, a television broadcast program is first converted to MPEG-2, and the average color information of each frame is obtained from the DC components of each intra frame of the MPEG-2 stream. Next, the average color information of the frames is placed in a three-dimensional color space, and its trajectory is projected onto the time axis and converted into waveform information. Moving image search (similarity search for moving images) is then performed by scaling and comparing the waveforms specified by this waveform information.
[0008]
Here, with reference to FIG. 11, a feature amount extraction method and a comparison method when performing a similar search for moving images will be described.
[0009]
FIG. 11 is a block diagram of a conventional content feature amount extraction device. As shown in FIG. 11, the content feature amount extraction device 101 includes a moving image data luminance / color difference data averaging unit 103, and reference content. A luminance / color difference data averaging unit 105, a dynamic range adjustment unit 107, a waveform comparison unit 109, a variance value calculation unit 111, and a threshold determination unit 113 are provided.
[0010]
The moving image data luminance/color difference data averaging unit 103 receives moving image data, averages the luminance signal (Y) and the color difference signals (Cb, Cr) of the moving image data frame by frame, and outputs feature amount data (a waveform).
[0011]
The reference content luminance/color difference data averaging unit 105 receives reference content, averages the luminance signal (Y) and the color difference signals (Cb, Cr) of the moving image data of the reference content frame by frame, and outputs comparison waveform data.
[0012]
The dynamic range adjustment unit 107 adjusts the maximum and minimum values of the feature amount data (the waveform to be adjusted) output from the averaging unit 103 so that they match the maximum and minimum values of the comparison waveform data output from the averaging unit 105.
[0013]
The waveform comparison unit 109 compares the waveforms of the feature amount data and the comparison waveform data after the dynamic range adjustment unit 107 has adjusted the maximum and minimum values. That is, it outputs the difference between the two waveforms at each time point as a difference data series.
[0014]
The variance value calculation unit 111 calculates a variance value from the difference data series output from the waveform comparison unit 109.
[0015]
The threshold determination unit 113 performs threshold determination based on the variance value calculated by the variance value calculation unit 111 and a preset threshold, judges whether the two waveforms (feature amount data and comparison waveform data) match, and thereby detects the similarity between the moving image data and the reference content.
[0016]
As another conventional technique for extracting content feature amounts, ISO/IEC 15938-3, "MPEG-7 Visual description", describes features of video data (video signals) and specifies feature amount extraction algorithms for extracting the described features. This visual description is assumed to be used mainly for similarity search and filtering based on video data. Within it, the "color layout description (ColorLayout)", which expresses the spatial arrangement of colors on the frequency axis, is defined as a concrete descriptor of low-level features such as color and shape in video data.
[0017]
This color arrangement description reflects human visual characteristics, and enables high-precision search for each image frame constituting the content. That is, unnecessary information can be deleted on the frequency axis when the similarity between contents is tested by the color arrangement description. As a result, the amount of data describing the content features is reduced.
[0018]
[Non-Patent Document 1]
Takahashi, Tominaga, Sugiura, Yokoi, Terashima, "Highly efficient video retrieval using characteristic motion picture prints", Journal of the Institute of Image Electronics Engineers of Japan, Vol. 29, No. 6, pp. 818-825 (2000)
[0019]
[Problems to be solved by the invention]
However, the conventional content feature amount extraction device 101 shown in FIG. 11 uses feature amount data that is waveform data along the time axis as the feature amount of the moving image data. For content such as long broadcast programs, the amount of feature amount data therefore becomes enormous.
[0020]
Furthermore, because the content feature amount extraction device 101 compares the feature amount data and the comparison waveform data directly on the time axis, the accuracy of detecting the identity or similarity of content drops markedly when edited content is tested, for example when the beginning, middle, or end of a broadcast program has been deleted so that parts are missing compared with the original program.
[0021]
In addition, the conventional visual description extracts feature amounts from a single frame of image data in the video data. When it is applied to content that is moving image data, the amount of feature amount data becomes enormous. Moreover, robustness against edited content (maintaining detection accuracy) is not considered, so the accuracy of detecting the identity or similarity of content drops markedly.
[0022]
An object of the present invention is therefore to solve the problems of the conventional techniques described above and to provide a content feature amount extraction device, a content feature amount extraction program, and a content feature amount extraction method that can maintain the accuracy of detecting the identity or similarity of content without increasing the amount of feature amount data.
[0023]
[Means for Solving the Problems]
  In order to achieve the above-described object, the present invention has the following configuration.
  The content feature amount extraction device according to claim 1 extracts distribution content feature amount data, represented by a specific frequency pattern constituting distribution content that has been provided from a content provider providing content and distributed via a network or a recording medium, and compares the distribution content feature amount data with reference content feature amount data represented by a specific frequency pattern constituting the content. The device comprises reference content feature amount data storage means, distribution content feature amount data extraction means, and feature amount data comparison means, and the distribution content feature amount data extraction means includes pixel data averaging means, data rearranging means, frequency converting means, frequency data averaging means, and frequency data sum total calculating means.
[0024]
According to such a configuration, the content feature amount extraction device stores the reference content feature amount data, which is the feature amount of the content, in the reference content feature amount data storage means in advance. The device then acquires the distribution content distributed through the network or the recording medium (for example, an optical disc such as a DVD) via that network or recording medium, and extracts the distribution content feature amount data from the acquired distribution content. The reference content feature amount data and the distribution content feature amount data each indicate a specific frequency pattern (unique waveform pattern) constituting the content or the distribution content, and are uniquely determined for each content or distribution content by, for example, the arrangement of the color of each pixel. That is, the pixels of the content or the distribution content form time-series data of that content, and the reference content feature amount data or the distribution content feature amount data is generated as a single piece of data from this time-series data. Further, the reference content feature amount data or the distribution content feature amount data corresponds to the “Finger Print” (image print) proposed in MPEG-21.
[0025]
Then, the content feature amount extraction device uses the feature amount data comparison means to compare the reference content feature amount data stored in the reference content feature amount data storage means with the distribution content feature amount data extracted by the distribution content feature amount data extraction means. Based on the result of this comparison, it is possible to determine that the distribution content and the content provided by the content provider are the same, for example, when the distribution content has been illegally copied and retransmitted.
[0027]
  Also, the distribution content feature amount data extraction means of the content feature amount extraction device uses the pixel data averaging means to average pixel data related to each pixel included in the distribution content in units of the frames or fields constituting the distribution content. The predetermined unit pixel data averaged by the pixel data averaging means is rearranged by the data rearranging means to obtain time-series data. The pixel data related to each pixel included in the distribution content is luminance data, color difference data (Cb, Cr, etc.), color signal data (RGB), or the like of each pixel in successive frames, and may be a combination of these.
[0028]
Subsequently, the distribution content feature amount data extraction means uses the frequency conversion means to frequency-convert the time-series data rearranged by the data rearranging means for every predetermined length to obtain frequency data, and the frequency data is averaged for each frequency by the frequency data averaging means. Then, the frequency data sum calculation means sums the averaged frequency data over the entire frequency range, and this summed averaged frequency data is used as the feature amount data. That is, in this content feature amount extraction device, the frequency conversion means converts the time-series data into frequency data, so that frequency data is generated from the luminance data, which is time-series data of the content; this frequency data is averaged, its sum over the entire frequency range is obtained, and this sum, a single piece of data, is used as the feature amount data (distribution content feature amount data).
[0029]
  The content feature amount extraction device according to claim 2 is the content feature amount extraction device according to claim 1, wherein the data rearranging means selects and arranges the predetermined unit pixel data at regular intervals, and rearranges the data by repeating this arrangement while sequentially shifting it by one frame or one field.
[0030]
According to such a configuration, the data rearranging means of the distribution content feature amount data extraction means selects and arranges the predetermined unit pixel data at regular intervals, and repeats this arrangement while sequentially shifting it by one frame or one field, so that new time-series data (a data series) can be obtained.
[0031]
  The content feature amount extraction device according to claim 3 is the content feature amount extraction device according to claim 1 or claim 2, wherein the content and the distribution content are each composed of a plurality of scenes, the reference content feature amount data, which is the feature amount of the content, includes reference scene feature amount data corresponding to each scene, and the distribution content feature amount data, which is the feature amount of the distribution content, includes distribution scene feature amount data corresponding to each scene.
[0032]
According to such a configuration, the content feature amount extraction apparatus can handle content and distribution content composed of a plurality of scenes. Since the reference content feature amount data includes reference scene feature amount data corresponding to each scene and the distribution content feature amount data includes distribution scene feature amount data corresponding to each scene, a scene specifying the desired distribution content can be searched for based on the reference scene feature amount data and the distribution scene feature amount data.
[0033]
  The content feature amount extraction device according to claim 4 is the content feature amount extraction device according to any one of claims 1 to 3, wherein the feature amount data comparison means includes identity detection means for detecting the identity between the distribution content and the content based on the absolute value of the difference between the distribution content feature amount data of the distribution content and the reference content feature amount data of the content, and on a preset feature amount data identity threshold.
[0034]
According to this configuration, the content feature quantity extraction device detects the identity between the distributed content and the content based on the feature quantity data identity threshold by the identity detection means. As a result, unauthorized distribution content can be detected.
[0035]
  The content feature amount extraction device according to claim 5 is the content feature amount extraction device according to any one of claims 1 to 4, wherein the feature amount data comparison means includes similarity detection means for detecting the similarity between the distribution content and the content based on the absolute value of the difference between the distribution content feature amount data of the distribution content and the reference content feature amount data of the content, and on a preset feature amount data similarity threshold.
[0036]
According to this configuration, the content feature amount extraction device detects the similarity between the distributed content and the content based on the feature amount data similarity threshold by the similarity detection unit. As a result, it is possible to search for distribution content similar to the content.
[0041]
DETAILED DESCRIPTION OF THE INVENTION
Hereinafter, an embodiment of the present invention will be described in detail with reference to the drawings.
(Configuration of content feature extraction device)
FIG. 1 is a block diagram of a content feature amount extraction apparatus. As shown in FIG. 1, the content feature amount extraction apparatus 1 extracts and stores reference content feature amount data, which is a feature amount of reference content (a reference program), extracts distribution content feature amount data, which is a feature amount of distribution content, and compares these feature amount data in order to detect illegally distributed content and to search for a specific scene of a program (a type of content). It includes a reference content feature amount data extraction / management unit 3 and a distribution content feature amount data extraction / comparison unit 5.
[0042]
The reference content feature amount data and the distribution content feature amount data each indicate a specific frequency pattern (unique waveform pattern) constituting the content or distribution content, and are feature amounts that identify the reference content or distribution content; in other words, each can be regarded as an “image print” that corresponds to a fingerprint identifying a person. For example, the reference content feature amount data or the distribution content feature amount data is uniquely determined for each content or distribution content by the arrangement of the color (pixel data) of each pixel of the reference content or distribution content (details will be described later).
[0043]
The content feature quantity extraction device 1 is built on a general server connected to a network. Each unit and each means is functionally specified as the result of newly written software working in cooperation with the hardware resources (CPU, memory, hard disk, etc.) of the server.
[0044]
In this embodiment, the reference content feature quantity data extraction / management unit 3 and the distributed content feature quantity data extraction / comparison unit 5 are configured as units of the content feature quantity extraction device 1, but each may instead be configured as a standalone device (server), with data and control signals transmitted and received between them.
[0045]
The reference content feature value data extraction / management unit 3 extracts and accumulates reference content feature value data, which is a feature value of the reference content (reference program), and includes reference content feature amount data extraction means 7, reference content feature amount data storage means 9, and feature quantity data management means 11. The reference content (reference program) corresponds to the content described in the claims, and indicates one or more reference contents (programs) against which illegally distributed content is compared and searched.
[0046]
The reference content feature amount data extraction unit 7 extracts reference content feature amount data, which is a feature amount of the reference content, from the reference content and outputs it to the reference content feature amount data storage unit 9. The extraction of the reference content feature amount data in the reference content feature amount data extracting means 7 is performed by the same extraction method as in the distribution content feature amount data extraction means 13 described later. The reference content feature amount data is obtained by averaging the luminance data of the reference content for each frame, rearranging the averaged data in time series, performing frequency conversion, and summing the resulting frequency data. That is, the reference content feature amount data can be said to treat the time-series data (luminance data of each frame) of the reference content (reference program) as a single piece of data.
[0047]
In this embodiment, the reference content feature amount data is obtained from the luminance data in each frame of the reference content, but it may instead be obtained from, for example, color difference data (Cb, Cr, etc.) or color signal data (RGB). That is, the reference content feature amount data can be obtained based on the pixel data related to the pixels included in the reference content.
[0048]
In addition, when the reference content is composed of a plurality of scenes, the reference content feature amount data is obtained for each scene. That is, reference scene feature value data, which is a feature value for each scene, is assigned based on the metadata assigned to each scene.
[0049]
The reference content feature amount data storage unit 9 stores the reference content feature amount data extracted by the reference content feature amount data extraction unit 7. This reference content feature quantity data storage means 9 is under the control of the feature quantity data management means 11 and, based on the control signal (reference content feature quantity data sequential output signal) output from the feature quantity data management means 11, sequentially outputs the stored reference content feature amount data to the feature amount data comparison means 17 (described later) of the distribution content feature amount data extraction / comparison unit 5.
[0050]
The feature data management unit 11 controls the reference content feature data extraction / management unit 3. Based on the control signal (reference content feature data output start signal) output from the distribution content feature data extraction / comparison unit 5, it causes the reference content feature value data to be output from the reference content feature value data extraction / management unit 3 (reference content feature value data storage means 9).
[0051]
The distribution content feature data extraction / comparison unit 5 extracts distribution content feature amount data, which is a feature amount of distribution content (search target programs / scenes) distributed via a network (Internet, intranet, etc.) or a recording medium (optical disc (DVD, etc.), VTR, etc.), and compares this distribution content feature amount data with the reference content feature amount data. It includes distribution content feature amount data extraction means 13, control means 15, feature amount data comparing means 17, and result display means 19.
[0052]
The distribution content feature amount data extraction unit 13 extracts distribution content feature amount data, which is a feature amount of the input distribution content (search target program / scene), and includes luminance data averaging means 13a, data rearrangement means 13b, frequency conversion means 13c, frequency data averaging means 13d, and frequency data sum calculation means 13e. The distribution content feature amount data extraction unit 13 starts extracting distribution content feature amount data from the distribution content based on the control signal (distribution content feature amount data output start signal) output from the control unit 15.
[0053]
When the distributed content is composed of a plurality of scenes, the distributed content feature amount data is obtained for each scene. In other words, distribution scene feature value data, which is a feature value for each scene, is assigned based on the metadata assigned to each scene.
[0054]
The luminance data averaging means 13a receives the moving image data of the distributed content as the luminance value (luminance data) of each pixel in consecutive frames (fields) and averages these luminance values to obtain an average luminance value (predetermined unit luminance data); that is, it performs an averaging process on the luminance values (luminance data).
[0055]
In the averaging process of the luminance value (luminance data) by the luminance data averaging means 13a, the average value of the luminance values (luminance data) of all the pixels in the frame (field) is calculated for each frame (each field). Here, as the average value of the luminance values (luminance data), a value obtained by dividing the sum of the luminance values (luminance data) of all the pixels by the total number of pixels is used.
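The per-frame averaging described above can be sketched as follows. This is only a minimal illustration: the function names and the representation of a frame as a list of pixel rows are assumptions made for the example and do not appear in the specification.

```python
def frame_mean_luminance(frame):
    """Return the mean luminance of one frame given as rows of pixel values."""
    total = 0
    count = 0
    for row in frame:
        total += sum(row)   # sum of luminance values in this row
        count += len(row)   # number of pixels in this row
    if count == 0:
        raise ValueError("frame has no pixels")
    # Sum of all luminance values divided by the total number of pixels.
    return total / count


def mean_luminance_series(frames):
    """Average each frame (or field) independently, preserving temporal order."""
    return [frame_mean_luminance(frame) for frame in frames]


# Two tiny 2 x 2 "frames" with hypothetical luminance values.
frames = [
    [[16, 16], [16, 16]],
    [[16, 235], [235, 16]],
]
series = mean_luminance_series(frames)
```

Each frame contributes exactly one value, so a program of N frames yields an N-sample series that the later stages operate on.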
[0056]
Alternatively, each frame (field) may be divided into small blocks of an arbitrary block size (for example, horizontal 8 pixels × vertical 8 lines), DCT (Discrete Cosine Transform) calculation processing may be applied to all of the small blocks to obtain a DC (direct current) coefficient for each small block, and the average value of these DC coefficients may be used.
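As a rough illustration of this alternative, the sketch below averages the DC coefficients of all blocks in a frame. It assumes the orthonormal two-dimensional DCT-II, for which the DC term of an N × N block is the block sum divided by N (that is, N times the block mean); the specification does not state which DCT normalization is intended. The function names are hypothetical, and partial blocks at the frame edge are simply ignored.

```python
def dct_dc_coefficient(block):
    """DC term of the orthonormal 2-D DCT-II of a square N x N block.

    For the (0, 0) coefficient every cosine factor equals 1, so the
    result is just the block sum scaled by 1/N.
    """
    n = len(block)
    return sum(sum(row) for row in block) / n


def mean_dc(frame, n=8):
    """Average the DC coefficients over all complete n x n blocks of a frame."""
    height, width = len(frame), len(frame[0])
    dcs = []
    for top in range(0, height - n + 1, n):
        for left in range(0, width - n + 1, n):
            block = [row[left:left + n] for row in frame[top:top + n]]
            dcs.append(dct_dc_coefficient(block))
    if not dcs:
        raise ValueError("frame is smaller than one block")
    return sum(dcs) / len(dcs)


# An 8 x 16 frame of constant luminance 100 (two 8 x 8 blocks).
flat_frame = [[100] * 16 for _ in range(8)]
dc_average = mean_dc(flat_frame)
```

Under this normalization the averaged DC value differs from the plain frame mean only by the constant factor N, which is why it can serve as a substitute when the content is already available in a DCT-based coded form.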
[0057]
Alternatively, all the pixels in each frame (field) are divided into an arbitrary number of blocks, the luminance values of all the pixels in each block are averaged, a reduced image is created, and the reduced image is subjected to DCT calculation processing. A DC coefficient may be obtained.
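The reduced-image step in this variant can be sketched as below. The block dimensions, the function name, and the decision to drop incomplete blocks at the frame edges are assumptions for illustration only.

```python
def reduce_frame(frame, block_h, block_w):
    """Build a reduced image whose pixels are the means of block_h x block_w blocks."""
    height, width = len(frame), len(frame[0])
    reduced = []
    for top in range(0, height - block_h + 1, block_h):
        out_row = []
        for left in range(0, width - block_w + 1, block_w):
            total = sum(
                frame[y][x]
                for y in range(top, top + block_h)
                for x in range(left, left + block_w)
            )
            out_row.append(total / (block_h * block_w))
        reduced.append(out_row)
    return reduced


frame = [
    [0, 0, 10, 10],
    [0, 0, 10, 10],
    [20, 20, 30, 30],
    [20, 20, 30, 30],
]
small = reduce_frame(frame, 2, 2)
```

The DCT would then be applied to `small` rather than to the full-resolution frame, which reduces the amount of computation per frame.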
[0058]
In this embodiment, the luminance value (luminance data) of each pixel in consecutive frames (fields) of the distributed content input to the distributed content feature amount data extraction unit 13 is used, but the invention is not limited to this. The color difference data (Cb, Cr) or color signal data (RGB) of each pixel in successive frames may be used, or any combination of the luminance data, color difference data, and color signal data may be used. This luminance value (luminance data) corresponds to the pixel data recited in the claims, and the average luminance value (predetermined unit luminance data) corresponds to the predetermined unit pixel data.
[0059]
The data rearrangement unit 13b performs a data rearrangement process in which the average luminance values (predetermined unit luminance data) obtained by the luminance data averaging unit 13a are rearranged in time series to obtain a new data series (time series data). That is, the data rearranging means 13b selects and arranges average luminance values (predetermined unit luminance data) at regular intervals, and repeats this arrangement while sequentially shifting it by one frame or one field. Hereinafter, one cycle of the new data series (time series data) is referred to as an average luminance value sample series. The details (concept) of rearranging the average luminance values (predetermined unit luminance data) in the data rearranging means 13b will be described later (using FIG. 5).
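One possible reading of this rearrangement is sketched below: values are picked at a fixed interval starting from the first frame, then the starting point is shifted by one frame and the selection is repeated. The details are given with FIG. 5, which is not reproduced in this passage, so the interpretation, the function name, and the `interval` parameter are all assumptions.

```python
def rearrange(mean_values, interval):
    """Return one sample series per starting offset, each taken at a fixed interval.

    Series 0 holds frames 0, interval, 2*interval, ...; series 1 holds
    frames 1, 1 + interval, ...; and so on, shifting by one frame each time.
    """
    if interval < 1:
        raise ValueError("interval must be at least 1")
    return [mean_values[offset::interval] for offset in range(interval)]


# Ten hypothetical per-frame averages, selected every third frame.
sample_series = rearrange(list(range(10)), interval=3)

# Concatenating the sample series gives the new data series as a whole.
new_series = [value for series in sample_series for value in series]
```

Under this reading, every input value appears exactly once in the new data series, so the rearrangement changes the ordering but not the amount of data.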
[0060]
The frequency converting unit 13c performs frequency conversion processing such as DFT (Discrete Fourier Transform) or FFT (Fast Fourier Transform) for every fixed length on the plurality of average luminance value sample sequences of the new data series (time series data) rearranged by the data rearranging unit 13b. The new data series (time series data) frequency-converted by the frequency conversion means 13c is used as frequency data.
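The fixed-length frequency conversion can be illustrated with a naive DFT, as in the sketch below. The chunking policy (consecutive, non-overlapping chunks with any incomplete tail dropped) and the function names are assumptions; a practical implementation would normally use an FFT routine instead of this O(n²) loop.

```python
import cmath


def dft(samples):
    """Naive discrete Fourier transform of a real or complex sequence."""
    n = len(samples)
    return [
        sum(x * cmath.exp(-2j * cmath.pi * k * t / n) for t, x in enumerate(samples))
        for k in range(n)
    ]


def spectra_per_fixed_length(sample_series, length):
    """Cut each sample series into consecutive chunks of `length` and transform each.

    Trailing samples that do not fill a whole chunk are dropped (an assumption).
    """
    spectra = []
    for series in sample_series:
        for start in range(0, len(series) - length + 1, length):
            spectra.append(dft(series[start:start + length]))
    return spectra


# One hypothetical sample series of eight values, transformed four at a time.
spectra = spectra_per_fixed_length([[1, 0, -1, 0, 1, 1, 1, 1]], length=4)
```

Each chunk yields one spectrum of `length` complex bins, so the output is a collection of equally sized spectra that can be averaged bin by bin.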
[0061]
It should be noted that the frame (field) unit used in the averaging process of the luminance value (luminance data) by the luminance data averaging means 13a, the selection interval of the average luminance value (predetermined unit luminance data) in the data rearranging process by the data rearranging means 13b, and the average luminance value sample series used in the frequency conversion process by the frequency conversion means 13c are determined in consideration of the detection accuracy of content identity or similarity and the processing speed of the content feature amount extraction apparatus 1.
[0062]
The frequency data averaging means 13d performs frequency data averaging processing, in which the plurality of frequency data frequency-converted by the frequency converting means 13c are averaged for each frequency to obtain frequency characteristic data (averaged frequency data). The frequency data averaging means 13d outputs the obtained averaged frequency data to the frequency data sum calculating means 13e. Note that the frequency characteristic data (averaged frequency data) is expressed as power on a logarithmic scale against frequency normalized by the frame frequency.
[0063]
The frequency data summation calculation means 13e sums the frequency characteristic data (averaged frequency data) obtained by the frequency data averaging means 13d over the entire frequency range, and outputs the resulting summed averaged frequency data to the feature quantity data comparison means 17 as distribution content feature amount data. In other words, the frequency data sum calculating means 13e calculates the sum of the logarithmic values of all powers over the entire frequency range, and uses this value as distribution content feature amount data.
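The averaging and summation stages can be sketched as follows. Several details are assumptions not fixed by the text: power is averaged across spectra before the logarithm is taken, the logarithm is expressed in decibels (10·log10), and a small `eps` is added to avoid taking the logarithm of zero. All names are hypothetical.

```python
import math


def normalized_frequencies(n_bins):
    """Frequency of each bin as a fraction of the frame frequency (k / n_bins)."""
    return [k / n_bins for k in range(n_bins)]


def averaged_log_power(spectra, eps=1e-12):
    """Average the power of several spectra per frequency bin, then take the log."""
    n_bins = len(spectra[0])
    averaged = []
    for k in range(n_bins):
        mean_power = sum(abs(s[k]) ** 2 for s in spectra) / len(spectra)
        averaged.append(10.0 * math.log10(mean_power + eps))
    return averaged


def feature_value(spectra):
    """Sum the averaged log power over the entire frequency range."""
    return sum(averaged_log_power(spectra))


# Two identical two-bin spectra with power 1 and 100 in the respective bins.
spectra = [[1 + 0j, 10 + 0j], [1 + 0j, 10 + 0j]]
g = feature_value(spectra)
```

The result is a single number per content or scene, which is what keeps the amount of feature data small regardless of the length of the content.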
[0064]
The control means 15 is responsible for overall control of the content feature quantity extraction device 1. It outputs a control signal (reference content feature quantity data output start signal) to the feature quantity data management means 11 of the reference content feature quantity data extraction / management unit 3, outputs a control signal (distributed content feature value data output start signal) to the distributed content feature value data extracting means 13, and outputs a threshold value to the feature value data comparing means 17.
[0065]
The control unit 15 records (holds) preset threshold values in a recording unit (not shown). These threshold values are a feature value data identity threshold value and a feature value data similarity threshold value.
[0066]
The feature amount data identity threshold is the criterion used when the feature amount data comparison unit 17 determines, based on the reference content feature amount data and the distribution content feature amount data, whether or not the distribution content is the same as the reference content. If the distribution content is determined to be the same as the reference content, it is difficult to say that the distribution content has been distributed via a legitimate route.
[0067]
The feature amount data similarity threshold is the criterion used when the feature amount data comparison unit 17 determines, based on the reference content feature amount data and the distributed content feature amount data, whether the distributed content is similar to the reference content. In other words, this feature amount data similarity threshold is used when searching for desired distribution content (a search target program), or a scene that is part of distribution content, by means of the reference content feature amount data (reference scene feature amount data) of the reference content (reference program).
[0068]
In this embodiment, the threshold value is recorded in the recording unit (not shown) of the control unit 15, but it may instead be recorded in the feature amount data comparison unit 17. In this case, the control means 15 outputs to the feature data comparison unit 17 a control signal (threshold utilization signal) instructing it to compare the reference content feature data and the distributed content feature data using the threshold value.
[0069]
The feature amount data comparison unit 17 compares the reference content and the distributed content to determine whether they are the same, similar, or dissimilar, based on the reference content feature amount data output from the reference content feature amount data storage unit 9, the distribution content feature amount data extracted by the distribution content feature amount data extraction unit 13, and the threshold value output from the control unit 15. It includes an identity detection unit 17a and a similarity detection unit 17b.
[0070]
The identity detection means 17a detects the identity between the reference content and the distribution content based on the absolute value of the difference between the reference content feature quantity data and the distribution content feature quantity data and on the feature quantity data identity threshold.
[0071]
The similarity detection unit 17b detects the similarity between the reference content and the distribution content based on the absolute value of the difference between the reference content feature amount data and the distribution content feature amount data and on the feature amount data similarity threshold.
[0072]
That is, in the feature quantity data comparison unit 17, the reference content and the distribution content are regarded as dissimilar when neither identity nor similarity is detected by the identity detection unit 17a and the similarity detection unit 17b.
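The three-way decision made by the feature amount data comparison means can be sketched as below. The function name, the string labels, and the assumption that the identity threshold is no larger than the similarity threshold are illustrative; the strict "smaller than" comparison follows the wording used for step S6.

```python
def compare_features(g_distribution, g_reference, th_identity, th_similarity):
    """Classify a pair of scalar feature values as identical, similar, or dissimilar.

    Assumes th_identity <= th_similarity, so that identity is the stricter test.
    """
    diff = abs(g_distribution - g_reference)
    if diff < th_identity:
        return "identical"
    if diff < th_similarity:
        return "similar"
    return "dissimilar"


# Hypothetical feature values and thresholds.
result_same = compare_features(100.2, 100.0, th_identity=0.5, th_similarity=5.0)
result_near = compare_features(103.0, 100.0, th_identity=0.5, th_similarity=5.0)
result_far = compare_features(120.0, 100.0, th_identity=0.5, th_similarity=5.0)
```

Because each content or scene is represented by a single value, one comparison costs only a subtraction and two threshold tests, which suits sequential comparison against many stored reference values.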
[0073]
The result display means 19 displays the comparison result produced by the feature data comparison means 17. The result display means 19 displays, for example, the fact that the reference content and the distribution content are the same, the number of distribution contents found by searching based on the reference scene feature data of each scene of the reference content, and the title names of those distribution contents.
[0074]
According to the content feature quantity extraction device 1, reference content feature quantity data, which is a feature quantity of reference content, is accumulated in the reference content feature quantity data storage means 9, and the distribution content feature quantity data extraction means 13 acquires distribution content distributed via a network or a recording medium through that network or recording medium and extracts distribution content feature amount data from the acquired distribution content. The feature amount data comparison means 17 then compares the reference content feature data and the distributed content feature data. Based on this comparison of the feature amount data, the identity or similarity between the reference content and the distributed content can be detected.
[0075]
Further, according to the content feature amount extraction device 1, the luminance data averaging means 13a of the distribution content feature amount data extraction means 13 averages the luminance data of the distribution content in units of the frames or fields constituting the distribution content. Then, the data rearranging unit 13b rearranges the predetermined unit luminance data averaged by the luminance data averaging unit 13a to obtain time series data. Subsequently, the time-series data rearranged by the data rearranging means 13b is frequency-converted by the frequency converting means 13c for every fixed length to obtain frequency data, and the frequency data is averaged for each frequency by the frequency data averaging means 13d. Then, the averaged frequency data averaged by the frequency data averaging means 13d is summed over the entire frequency range by the frequency data sum calculating means 13e, and the summed averaged frequency data is used as feature amount data.
[0076]
That is, in the content feature amount extraction device 1, the frequency conversion unit 13c converts the time-series data into frequency data, thereby generating a plurality of frequency data from the luminance data, which is time-series data of the content. Since this frequency data is averaged, its sum over the entire frequency range is obtained, and this sum is used as the distribution content feature amount data, the accuracy of detecting the identity or similarity between the reference content and the distribution content can be maintained without increasing the amount of data.
[0077]
That is, when the content feature quantity extraction apparatus 1 is used by a content provider that provides content (reference content) such as broadcast programs, the feature amount (distributed content feature amount data) of distributed content obtained via the external Internet, an internal intranet, or a recording medium such as an optical disc (for example, a DVD) is compared with the feature amount (reference content feature amount data) of the content held by the content provider. By detecting the identity between them, illegally distributed content can be detected.
[0078]
Further, the content feature quantity extraction device 1 compares the feature quantity (distributed content feature quantity data) of the content to be searched with the feature quantity (reference content feature quantity data) of the content to be referenced (reference content). By examining their similarity, it is possible to search for a desired search target program or scene from a program group (a set of reference contents) stored and managed by the content provider.
[0079]
Further, according to the content feature amount extraction device 1, the data rearrangement unit 13b of the distribution content feature amount data extraction unit 13 selects and arranges the predetermined unit luminance data at regular intervals, and repeats this arrangement while sequentially shifting it by one frame or one field, so that new time series data (a data series) can be obtained.
[0080]
Furthermore, according to the content feature amount extraction apparatus 1, it is possible to handle reference content and distributed content composed of a plurality of scenes. Since the reference content feature amount data includes reference scene feature amount data corresponding to each scene and the distribution content feature amount data includes distribution scene feature amount data corresponding to each scene, a scene specifying the desired distribution content can be searched for based on the reference scene feature amount data and the distribution scene feature amount data.
[0081]
Further, according to the content feature quantity extraction device 1, the identity detection means 17a of the feature quantity data comparison means 17 detects the identity between the reference content and the distribution content based on the feature quantity data identity threshold. As a result, unauthorized distribution content can be detected.
[0082]
Alternatively, according to the content feature amount extraction apparatus 1, the similarity detection unit 17b of the feature amount data comparison unit 17 detects the similarity between the distribution content and the reference content based on the feature amount data similarity threshold. As a result, it is possible to search for distribution contents (a desired program or the like) similar to the reference contents.
[0083]
In addition, when the amplitude levels of the moving image data series input as the distribution content and the reference content differ significantly from each other and this affects the comparison determination in the feature data comparison unit 17, each moving image data series may first be normalized by its own maximum value, after which the data rearranging process by the data rearranging unit 13b and the subsequent processes are performed.
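A minimal sketch of such a normalization is shown below. It assumes the series is the list of per-frame values produced by the averaging stage and that an all-zero series is left as zeros; the function name is hypothetical.

```python
def normalize_by_max(series):
    """Scale a data series so that its largest absolute value becomes 1."""
    peak = max(abs(value) for value in series)
    if peak == 0:
        # Nothing to scale; avoid division by zero.
        return [0.0 for _ in series]
    return [value / peak for value in series]


# The same hypothetical material at two different amplitude levels.
bright = normalize_by_max([50, 100, 25])
dim = normalize_by_max([5, 10, 2.5])
```

After normalization the two series coincide, so a pure level difference between the distribution content and the reference content no longer influences the later frequency-domain comparison.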
[0084]
In this embodiment, the case where the content feature amount extraction apparatus 1 processes content composed of moving image data has been described, but it is also possible to process music content composed of audio data. It is also possible to detect feature amount data (reference content feature amount data, distribution content feature amount data) in combination with audio data. When processing music content composed of audio data, the feature data (reference content feature data, distribution content feature data) is represented by the waveform pattern, frequency spectrum, etc. of the audio data.
[0085]
(Operation of content feature extraction device)
Next, the operation of the content feature quantity extraction apparatus 1 will be described with reference to the flowchart shown in FIG. 2 (see FIG. 1 as appropriate).
First, the content feature quantity extraction device 1 extracts reference content feature quantity data by the reference content feature quantity data extraction means 7 (S1). The extracted reference content feature amount data is stored in the reference content feature amount data storage means 9 (S2).
[0086]
Then, the control means 15 inputs the threshold values (feature quantity data identity threshold Th1, feature quantity data similarity threshold Th2) to the feature quantity data comparison means 17 (S3), inputs a control signal (distributed content feature quantity data output start signal) to the distribution content feature quantity data extraction means 13, and inputs a control signal (reference content feature data output start signal) to the feature data management unit 11.
[0087]
Then, the distribution content feature quantity data extraction unit 13 extracts the distribution content feature quantity data g1 (S4) and outputs it to the feature quantity data comparison unit 17. Subsequently, the feature data management unit 11 outputs a control signal (reference content feature value data sequential output signal) to the reference content feature value data storage unit 9, and in response to this control signal the reference content feature quantity data g2 stored in the reference content feature quantity data storage means 9 is output to the feature quantity data comparison means 17 (S5).
[0088]
Then, the feature value data comparison means 17 determines whether the absolute value of the difference between the distributed content feature value data g1 and the reference content feature value data g2 is smaller than the feature value data identity threshold Th1, or whether it is smaller than the feature value data similarity threshold Th2 (S6).
[0089]
When the absolute value of the difference between the distributed content feature value data g1 and the reference content feature value data g2 is determined to be smaller than the feature value data identity threshold Th1, or smaller than the feature value data similarity threshold Th2 (S6, Yes), the distributed content and the reference content are determined to be the same content or to be similar programs (similar content) (S7).
[0090]
When the absolute value of the difference between the distributed content feature data g1 and the reference content feature data g2 is not determined to be smaller than the feature data identity threshold Th1, or is not determined to be smaller than the feature data similarity threshold Th2 (S6, No), the distributed content and the reference content are determined to be different content or dissimilar programs (content that is not similar) (S8).
[0091]
The processes from S5 to S8 are repeated (S9, No) until the predetermined number of reference content feature quantity data items stored in the reference content feature quantity data storage means 9 have been processed, at which point (S9, Yes) the operation ends.
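The comparison loop of steps S5 to S9 can be sketched as follows. This is only an illustration under stated assumptions: the feature amount data g1 and g2 are treated as single numbers, the identity and similarity checks (described in the text as alternatives) are combined in one pass with Th1 not larger than Th2, and the function name `compare_features` and the content names are hypothetical.

```python
def compare_features(g1, reference_features, th1, th2):
    """Classify each stored reference feature against one distributed feature.

    g1: distributed content feature value (a single number).
    reference_features: mapping of a hypothetical content name to its g2 value.
    th1, th2: identity and similarity thresholds (th1 <= th2 is assumed).
    """
    results = {}
    for name, g2 in reference_features.items():  # S5: next reference feature
        diff = abs(g1 - g2)                      # S6: absolute difference
        if diff < th1:
            results[name] = "identical"          # S7: same content
        elif diff < th2:
            results[name] = "similar"            # S7: similar content
        else:
            results[name] = "different"          # S8: different content
    return results                               # S9: all references processed


stored = {"A": 10.25, "B": 12.0, "C": 20.0}
print(compare_features(10.0, stored, th1=0.5, th2=3.0))
# {'A': 'identical', 'B': 'similar', 'C': 'different'}
```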
[0092]
(About time change of average luminance value)
Next, with reference to FIG. 3 and FIG. 4, the time change of the average luminance value (predetermined unit luminance data) will be described (see FIG. 1 as appropriate).
[0093]
FIGS. 3 and 4 show the change over time of the average luminance value (predetermined unit luminance data) obtained after the luminance values (luminance data) are averaged by the luminance data averaging means 13a of the distribution content feature quantity data extraction means 13 of the content feature quantity extraction device 1.
[0094]
More specifically, FIG. 3 shows an example in which the average luminance value is obtained for 17920 consecutive frames of the broadcast program A (about 10 minutes), and the first 3000 frames are plotted as a graph in time series. FIG. 4 is an enlarged view of a part of the waveform of FIG. 3. In FIGS. 3 and 4, the vertical axis indicates the average luminance value, and the horizontal axis indicates the consecutive frame number.
[0095]
In this example, each frame is divided into 64 blocks of horizontal 8 blocks x vertical 8 blocks, and the luminance values (luminance value data) of all pixels in each block are averaged to create a reduced image of horizontal 8 pixels x vertical 8 lines. Then, the reduced image was subjected to DCT calculation processing to obtain a DC coefficient, and this DC coefficient was used as an average luminance value (predetermined unit luminance data). As shown in FIG. 3, the average luminance value obtained is rearranged in time series, whereby the temporal change in the average luminance value can be grasped.
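The block averaging and DC coefficient described in this example can be sketched as follows. This is a minimal sketch, assuming a frame is a list of rows of luminance values whose height and width are multiples of 8, and assuming the orthonormal DCT-II scaling used in JPEG and MPEG, under which the DC coefficient of an 8 x 8 image equals the sum of its 64 values divided by 8. The specification does not state the scaling, and the function names are hypothetical.

```python
def reduce_frame(frame, blocks=8):
    """Average a frame (list of rows of luminance values) into blocks x blocks means.

    Assumes the frame height and width are multiples of `blocks`.
    """
    height, width = len(frame), len(frame[0])
    bh, bw = height // blocks, width // blocks
    reduced = []
    for by in range(blocks):
        row = []
        for bx in range(blocks):
            # Sum every pixel inside block (by, bx), then divide by its area.
            total = sum(
                frame[y][x]
                for y in range(by * bh, (by + 1) * bh)
                for x in range(bx * bw, (bx + 1) * bw)
            )
            row.append(total / (bh * bw))
        reduced.append(row)
    return reduced


def dc_coefficient(reduced):
    """DC term of an orthonormal 2-D DCT-II of an N x N image: sum / N (assumed scaling)."""
    n = len(reduced)
    return sum(sum(row) for row in reduced) / n


frame = [[100] * 16 for _ in range(16)]   # uniform 16 x 16 test frame
reduced = reduce_frame(frame)             # 8 x 8 reduced image of block means
print(dc_coefficient(reduced))            # 800.0
```

Because the DC term is proportional to the mean of the reduced image, it serves directly as the per-frame average luminance value.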
[0096]
FIG. 4 is an enlarged view of a part of the graph of FIG. 3 (from frame number 700 to frame number 1400) in the time axis (frame) direction. As shown in FIG. 4, the difference in average luminance value between adjacent frames is small. That is, it can be seen that the average luminance value hardly fluctuates within a period of about 3 seconds (corresponding to 90 frames) except when a large screen change such as a scene change occurs.
[0097]
(About the concept of data rearrangement)
Next, the concept of the data rearrangement process performed by the data rearrangement unit 13b will be described with reference to FIG. FIG. 5 schematically illustrates the concept of data rearrangement using symbols.
[0098]
As shown in FIG. 5, the data rearranging means 13b of the content feature quantity extraction apparatus 1 selects average luminance values (predetermined unit luminance data) at regular intervals and arranges them, and repeats this operation while shifting the starting position by one frame at a time, thereby obtaining a new data series (time series data).
[0099]
That is, the symbols “black circle”, “triangle”, “square”, “cross”, and “rotated square” assigned to times 1 to 20 in [time change of average luminance value] are rearranged in [new data series] so that identical symbols are contiguous, and each set of identical symbols constitutes an average luminance value sample series. For example, the “black circle” average luminance value sample series in the new data series is the set of average luminance values at time 1, time 6, time 11, and time 16.
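The rearrangement of FIG. 5 can be sketched with list slicing. This is a minimal sketch, assuming 20 time points and a selection interval of 5 as in the example above; the function name `rearrange` is hypothetical.

```python
def rearrange(values, interval):
    """Pick every `interval`-th value, then repeat with the start shifted by one."""
    return [values[offset::interval] for offset in range(interval)]


times = list(range(1, 21))          # times 1 to 20
series = rearrange(times, 5)        # one sample series per symbol
print(series[0])                    # [1, 6, 11, 16]  ("black circle" series)

# Concatenating the sample series gives the new data series.
new_series = [value for sample in series for value in sample]
print(new_series[:8])               # [1, 6, 11, 16, 2, 7, 12, 17]
```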
[0100]
Through this operation of the data rearranging means 13b, the time change of the average luminance value of the entire distribution content can be expressed with a small data amount regardless of the data amount of the distribution content. When the distribution content has a data series of reasonable length, the processing that follows the data rearranging means 13b, that is, the frequency conversion processing, can be performed efficiently.
[0101]
(About superposition of average luminance value sample series)
Next, superposition of the average luminance value sample series will be described with reference to FIG.
[0102]
FIG. 6 shows the result of superposing seven waveforms, taken every other period, of the average luminance sample series obtained by applying the data rearrangement processing of the data rearrangement unit 13b described with reference to FIG. 5 to the average luminance values (predetermined unit luminance data) at intervals of 70 frames. In FIG. 6, the vertical axis represents the average luminance value, and the horizontal axis represents the position within one period of the average luminance sample series. Here, div (A, B) means the quotient of A ÷ B. As described with reference to FIG. 4, since the difference in the average luminance value between adjacent frames is small, the waveforms almost coincide with each other.
[0103]
(About frequency characteristics of program data)
Next, the frequency characteristics of program data will be described with reference to FIG. 7 (see FIG. 1 as appropriate).
[0104]
FIG. 7 shows an example of the frequency characteristics obtained for four contents (moving image data): a program A (hereinafter referred to as content A), a reduced content A1 created by reducing the first 10% of the content A, another program B (hereinafter referred to as content B), and a reduced content B1 created by reducing the first 10% of the content B. For each of the contents A, A1, B, and B1, the luminance values (luminance data) were averaged by the luminance data averaging means 13a, rearranged by the data rearrangement means 13b, frequency-converted by the frequency conversion means 13c, and averaged by the frequency data averaging means 13d. Note that an FFT is used for the frequency conversion processing by the frequency conversion means 13c. In FIG. 7, the vertical axis represents the power at each frequency on a logarithmic scale, and the horizontal axis represents the frequency normalized by the frame frequency.
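The frequency conversion and frequency data averaging steps can be sketched as follows. This is a minimal sketch under several assumptions: the rearranged data series is a plain list of numbers, the frequency data is taken to be the power |X[k]|^2 of each bin, and a naive DFT stands in for the FFT (it produces the same values, only more slowly). The function names are hypothetical.

```python
import cmath


def power_spectrum(segment):
    """Power |X[k]|^2 of a naive DFT (O(n^2); an FFT gives the same values faster)."""
    n = len(segment)
    spectrum = []
    for k in range(n):
        coeff = sum(
            x * cmath.exp(-2j * cmath.pi * k * t / n)
            for t, x in enumerate(segment)
        )
        spectrum.append(abs(coeff) ** 2)
    return spectrum


def averaged_spectrum(series, length):
    """Split the series into fixed-length segments and average the power per frequency bin."""
    segments = [
        series[i:i + length]
        for i in range(0, len(series) - length + 1, length)
    ]
    spectra = [power_spectrum(seg) for seg in segments]
    return [sum(bin_values) / len(spectra) for bin_values in zip(*spectra)]


# Two constant segments: DC powers 16 and 64 average to about 40; other bins are about 0.
series = [1.0, 1.0, 1.0, 1.0, 2.0, 2.0, 2.0, 2.0]
avg = averaged_spectrum(series, 4)
```

Summing `avg` over all bins would give a single scalar, which is how the claims describe forming the distribution content feature amount data.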
[0105]
As shown in FIG. 7, the difference between different programs, that is, the difference between the contents A and B, can be identified from the frequency characteristics. In addition, it can be seen that there is a high correlation between the frequency characteristics of the content A and the reduced content A1 in which the beginning of the content A is reduced by 10%, and between those of the content B and the reduced content B1 in which the beginning of the content B is reduced by 10%.
[0106]
(About examples of calculation results of feature data)
Next, with reference to FIG. 8, an example of a calculation result of feature amount data (reference content feature amount data, distribution content feature amount data) will be described (see FIG. 1 as appropriate).
[0107]
FIG. 8 shows the distribution content feature amount data obtained when the luminance data (luminance signal Y) and the color difference data (color difference signals Cb, Cr) of the content A, the reduced content A1, the content B, and the reduced content B1 described using FIG. 7 are input to the distribution content feature amount data extraction means 13 of the content feature amount extraction device 1.
[0108]
As shown in FIG. 8, for the luminance data (luminance signal Y) and the color difference data (color difference signals Cb, Cr), the content A and the reduced content A1 have distribution content feature amount data close to each other, and likewise the content B and the reduced content B1 have distribution content feature amount data close to each other.
[0109]
(Examples of distance evaluation results between programs using feature data)
Next, an example of a distance evaluation result between programs (between contents) of feature amount data (reference content feature amount data, distribution content feature amount data) will be described with reference to FIG. 9 (see FIG. 1 as appropriate).
[0110]
FIG. 9 shows an example of the distance evaluation results for the content A, the reduced content A1, the content B, and the reduced content B1 when, in addition to the luminance values (luminance data), that is, the luminance signal Y, the color difference data, that is, the color difference signals (Cb, Cr), are input to the distribution content feature value data extraction means 13.
[0111]
This distance evaluation result is obtained as follows. Let g1_Y be the distribution content feature amount data obtained using the luminance signal Y, g2_Y the reference content feature amount data obtained using the luminance signal Y, g1_Cb the distribution content feature amount data obtained using the color difference signal Cb, g2_Cb the reference content feature amount data obtained using the color difference signal Cb, g1_Cr the distribution content feature amount data obtained using the color difference signal Cr, and g2_Cr the reference content feature amount data obtained using the color difference signal Cr. The distance between programs (distance between contents) between the distributed content and the reference content is denoted by D.
[0112]
[Expression 1]

[0113]
The inter-program distances (inter-content distances) D of the content A, the reduced content A1, the content B, and the reduced content B1 are obtained by this equation (1) and are listed.
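Expression (1) itself is not reproduced in this text. Purely for illustration, the following sketch assumes that D is a Euclidean distance over the three feature values obtained from Y, Cb, and Cr; the actual form of Expression (1) may differ, and the function name `inter_program_distance` is hypothetical.

```python
import math


def inter_program_distance(g1, g2):
    """Distance between two (Y, Cb, Cr) feature triples.

    ASSUMPTION: Euclidean distance; the source text does not show Expression (1).
    """
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(g1, g2)))


g1 = (3.0, 0.0, 0.0)   # distributed content features (g1_Y, g1_Cb, g1_Cr), made-up values
g2 = (0.0, 4.0, 0.0)   # reference content features (g2_Y, g2_Cb, g2_Cr), made-up values
print(inter_program_distance(g1, g2))  # 5.0
```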
[0114]
As shown in FIG. 9, the distance between the content A and the reduced content A1 whose beginning is reduced by 10%, and the distance between the content B and the reduced content B1 whose beginning is reduced by 10%, are sufficiently smaller than the distance between different programs (between contents), that is, between the content A and the content B. By providing an appropriate threshold (feature data identity threshold), the identity between programs (between contents) can therefore be detected with high accuracy.
[0115]
(About the concept of reference content feature data extraction and management)
Next, the concept of extraction and management of reference content feature value data in the reference content feature value data extraction / management unit 3 will be described with reference to FIG.
[0116]
As shown in FIG. 10, a program (reference content) is composed of a plurality of scenes (scene 1, scene 2, ... scene (n-1), scene n), and a feature amount (feature amount 1, feature amount 2, ... feature amount (n-1), feature amount n; reference scene feature amount data) is set for each scene.
[0117]
The feature quantities (reference scene feature quantity data) for each of these scenes are collected and accumulated in the reference content feature quantity data storage means 9 as the feature quantities (reference content feature quantity data) of the entire program (reference content), and are managed by the feature quantity data management means 11. That is, either the reference scene feature value data or the reference content feature value data is selected by the control signal output from the feature value data management unit 11 and is output to the feature value data comparison unit 17.
[0118]
Although the present invention has been described above based on one embodiment, the present invention is not limited to this embodiment.
For example, the processing of each component of the content feature extraction device 1 can be regarded as a content feature extraction program written in a general-purpose computer language, or as a content feature extraction method in which the processing of each component is treated as one step. In these cases, the same effect as that of the content feature amount extraction apparatus 1 can be obtained.
[0119]
[Effects of the Invention]
According to the invention described in claim 1, it is possible to detect the identity or similarity between the content and the distributed content based on the comparison of the feature amount data.
[0120]
According to the invention described in claim 1, frequency data that is single data is generated from luminance data that is temporary data of content, the frequency data is averaged, and a sum total over the entire frequency range is obtained and used as the distribution content feature amount data. It is therefore possible to maintain the accuracy of detecting the identity or similarity between the content and the distribution content without increasing the data amount.
[0121]
According to the invention described in claim 2, predetermined unit pixel data is selected and arranged at regular intervals, and this arrangement is repeated while sequentially shifting by one frame unit or one field unit, so that new time series data (a data series) is obtained and subsequent processing, for example, frequency conversion processing, can be easily performed.
[0122]
According to the invention described in claim 3, the reference content feature quantity data includes the reference scene feature quantity data corresponding to each scene, and the distribution content feature quantity data includes the distribution scene feature quantity data corresponding to each scene. Based on the reference scene feature amount data and the distribution scene feature amount data, a scene that identifies the desired distribution content can be searched for.
[0123]
According to the invention described in claim 4, the identity between the content and the distributed content is detected based on the feature amount data identity threshold. As a result, unauthorized distribution content can be detected.
[0124]
According to the invention described in claim 5, the similarity between the distributed content and the content is detected based on the feature amount data similarity threshold. As a result, it is possible to search for distribution content (such as a desired program) similar to the content.
[Brief description of the drawings]
FIG. 1 is a block diagram of a content feature amount extraction apparatus according to an embodiment of the present invention.
FIG. 2 is a flowchart for explaining the operation of the content feature amount extraction apparatus shown in FIG. 1.
FIG. 3 is a diagram showing a change over time in the average luminance value after the luminance value is averaged.
FIG. 4 is an enlarged view showing a part of the temporal change of the average luminance value shown in FIG. 3.
FIG. 5 is a diagram illustrating the concept of data rearrangement processing.
FIG. 6 is a diagram illustrating superposition of average luminance value sample series.
FIG. 7 is a diagram illustrating frequency characteristics of program data.
FIG. 8 is a diagram illustrating an example of calculation results of feature amount data.
FIG. 9 is a diagram illustrating an example of a distance evaluation result between programs of feature amount data.
FIG. 10 is a diagram for explaining the concept of reference content feature data extraction and management in a reference content feature data extraction / management unit.
FIG. 11 is a block diagram of a conventional content feature amount extraction apparatus.
[Explanation of symbols]
1 Content feature extraction device
3 Reference content feature data extraction / management section
5 Distribution content feature data extraction / comparison section
7 Reference content feature data extraction means
9 Reference content feature data storage means
11 Feature data management means
13 Distribution content feature data extraction means
15 Control means
17 Feature data comparison means
17a Identity detection means
17b Similarity detection means
19 Result display means

Claims

A content feature amount extraction device that extracts distribution content feature amount data, represented by a specific frequency pattern, from distribution content that is provided from a content provider providing content and is distributed via a network or a recording medium, and that compares this distribution content feature amount data with reference content feature amount data represented by a specific frequency pattern constituting the content, the device comprising:
reference content feature amount data storage means for storing the reference content feature amount data;
distribution content feature amount data extracting means for acquiring the distribution content via the network or the recording medium and extracting distribution content feature amount data from the acquired distribution content; and
feature quantity data comparison means for comparing the reference content feature quantity data stored in the reference content feature quantity data storage means with the distribution content feature quantity data extracted by the distribution content feature quantity data extraction means,
wherein the distribution content feature amount data extraction means includes:
pixel data averaging means for averaging pixel data relating to each pixel included in the distribution content in units of frames or fields constituting the distribution content;
data rearranging means for rearranging the predetermined unit pixel data averaged by the pixel data averaging means into time-series data;
frequency conversion means for frequency-converting, for every fixed length, the time-series data rearranged by the data rearranging means to obtain frequency data;
frequency data averaging means for averaging, for each frequency, the frequency data frequency-converted by the frequency conversion means; and
frequency data summation calculating means for summing the averaged frequency data averaged by the frequency data averaging means over the entire frequency range, and using the summed averaged frequency data as the distribution content feature amount data.

The content feature amount extraction device according to claim 1, wherein the data rearranging means selects and arranges the predetermined unit pixel data at regular intervals, and repeats this arrangement while sequentially shifting by one frame unit or one field unit at a time.

The content feature amount extraction apparatus according to claim 1 or 2, wherein the content and the distribution content are each composed of a plurality of scenes, the reference content feature value data that is the feature value of the content includes reference scene feature value data corresponding to each scene, and the distribution content feature value data that is the feature value of the distribution content includes distribution scene feature value data corresponding to each scene.

The content feature amount extraction apparatus according to any one of claims 1 to 3, wherein the feature data comparison means includes identity detection means for detecting the identity of the distributed content and the content based on the absolute value of the difference between the distributed content feature amount data of the distributed content and the reference content feature amount data of the content, and on a preset feature amount data identity threshold.

The content feature amount extraction apparatus according to any one of claims 1 to 4, wherein the feature data comparison means includes similarity detection means for detecting the similarity between the distribution content and the content based on the absolute value of the difference between the distribution content feature amount data of the distribution content and the reference content feature amount data of the content, and on a preset feature amount data similarity threshold.