JP5334328B2

JP5334328B2 - Moving object detection device, moving object detection method, and program

Info

Publication number: JP5334328B2
Application number: JP2010184345A
Authority: JP
Inventors: 啓義森田; 素文翁; ロヨラルイス
Original assignee: The University of Electro-Communications
Current assignee: The University of Electro-Communications
Priority date: 2010-08-19
Filing date: 2010-08-19
Publication date: 2013-11-06
Anticipated expiration: 2030-08-19
Also published as: JP2012043222A

Abstract

PROBLEM TO BE SOLVED: To provide a moving object detection device, a moving object detection method, and a program suitable for real-time processing.
SOLUTION: A moving object detection device comprises: a detection part 11 for detecting a block size used in variable block size motion compensation prediction; a calculation part 12 for calculating, when the detection part 11 detects a block of a first prescribed size, a first motion vector for a first macroblock including that block; and a generation part 13 for setting a moving object area based on the first motion vector and also adding to it a second macroblock that is adjacent to the moving object area and includes a block of a second prescribed size. The calculation part 12 calculates a second motion vector for a third macroblock that is adjacent to the moving object area and includes no block of the first or second prescribed size. The generation part 13 adds the third macroblock to the moving object area based on the second motion vector.
COPYRIGHT: (C)2012, JPO&INPIT

Description

The present invention relates to a moving object detection device, a moving object detection method, and a program, and more particularly to a device, method, and program for detecting moving objects in moving image data.

In recent years, terminals capable of capturing and/or displaying moving images have become widespread, and capturing and/or viewing moving images has become routine for their users. Accordingly, considerable development investment has gone into extracting information from moving images and into their commercial use. Specifically, to extract objects that move within a moving image (moving objects), methods have been developed that use, for example, background difference, optical flow, particle filters, motion vectors, DC (Discrete Cosine) images, skip macroblocks (Macro Block: MB), and motion compensation macroblock sizes. Each method is described below.

The moving object detection method using background difference separates and extracts only the moving objects by taking the difference between a background image showing only the background and a frame image containing the moving objects. It is particularly effective for moving images in which the background hardly moves and only the moving objects move (for example, Patent Document 1).

The moving object detection method using optical flow associates the same target between two video frames captured at different times and uses the resulting displacement expressed as vector data (that is, the optical flow). Typically, obtaining the optical flow involves calculating partial derivatives of the luminance values and color information at each pixel position (for example, Non-Patent Document 1).

The moving object detection method using a particle filter estimates moving objects from the optical flow at points scattered randomly over the screen (for example, Non-Patent Document 2).

The moving object detection method using motion vectors regards the locations where motion vectors used in the predictive coding of compressed moving images arise as moving regions. Motion vectors arise not only from the movement of objects but also from causes such as the flicker of fluorescent lamps and the swaying of tree leaves outdoors, so this noise must be distinguished from actual moving objects. Methods that use the magnitude and inner product of motion vectors to remove the noise have therefore been proposed (Patent Document 2, Non-Patent Documents 3 and 4).

The moving object detection method using DC images exploits the fact that the DC component of the DCT coefficients in an I (Intra-coded) frame of MPEG (Moving Picture Experts Group) 2 corresponds to the average color of the block. It computes DC images on that basis and identifies moving objects by applying an ordinary template matching method to them (Non-Patent Documents 5 and 6).

As a moving object detection method using skip macroblocks, a method is known that uses the type assigned to macroblocks in which no motion vector was generated to easily remove motionless background regions (for example, Non-Patent Document 7).

The moving object detection method using the motion compensation macroblock size presupposes MPEG2 as the encoding scheme. It detects moving object regions by exploiting the strong tendency for a macroblock size of 16 × 8, rather than the usual 16 × 16, to be used in the boundary region between a moving object and the background (for example, Non-Patent Document 8).

Patent Document 1: Japanese Patent Laid-Open No. 08-221577
Patent Document 2: JP 2001-250118 A

Non-Patent Document 1: Yamada, Ito, Ueda, "Study of wave false detection suppression method in background subtraction method," IEICE Technical Report, PRMU98-109, 1998.
Non-Patent Document 2: Ya-Dong Wang, Jian-Kang Wu, Ashraf A. Kassim, "Particle Filter for Visual Tracking Using Multiple Cameras," Proc. MVA2005, Tsukuba, May, pp.16-18, 2005.
Non-Patent Document 3: H. Zen, T. Hasegawa, and S. Ozawa, "Moving object detection from MPEG coded picture," Proc. of 1999 International Conference on Image Processing, vol.4, pp.25-29, 1999.
Non-Patent Document 4: Iwasaki, Yokoyama, Watanabe, Koga, "Detection and tracking of moving objects in the compression domain using motion vectors of MPEG video data," IEICE Transactions D, vol.J91-D, no.6, pp.1592-1603, 2008.
Non-Patent Document 5: Dan Schonfeld and Dan Lelescu, "VORTEX: Video retrieval and tracking from compressed multimedia databases-multiple object tracking from MPEG-2 bit stream," Journal of Visual Communication and Image Representation, vol.11, no.2, pp.154-182, 2000.
Non-Patent Document 6: Francesca Manerba, Jenny Benois-Pineau, Riccardo Leonardi, and Boris Mansencal, "Multiple moving object detection for fast video content description in compressed domain," EURASIP J. Adv. Signal Process., vol.2008, no.1, pp.1-13, 2008.
Non-Patent Document 7: Wonsang You, M. S. Houari Sabirin, Munchurl Kim, "Real-time detection and tracking of multiple objects with partial decoding in H.264/AVC bitstream domain," Proceedings of SPIE, vol.7244, 2009.
Non-Patent Document 8: 単鴻, "Multiple Moving Object Detection and Tracking System Using MPEG2 Motion Vectors," Master's thesis, The University of Electro-Communications, January 2008.

However, the moving object detection methods according to Patent Document 1 and Non-Patent Document 1 require pixel-by-pixel computation, so the total amount of computation needed for moving object detection is enormous and the methods are unsuitable for real-time processing. The moving object detection method using a particle filter according to Non-Patent Document 2 may at first glance appear to require less computation than optical flow, but raising the estimation accuracy requires repeating the optical flow calculation together with the likelihood calculation, so this method is also unsuitable for real-time processing. Furthermore, the moving object detection methods using motion vectors according to Patent Document 2 and Non-Patent Documents 3 and 4 are likewise unsuitable for real-time processing.

Furthermore, if the moving object detection method using DC images according to Non-Patent Documents 5 and 6 is applied to H.264 format moving image data, in which the interval between I frames is non-uniform and considerably longer than in MPEG2, the reliability of the detection accuracy may deteriorate. Moreover, in the Main profile, pixel-level intra-frame predictive coding is used even in I frames, so extracting color information from the DCT coefficients requires at least full reconstruction of the I frames, and there is concern that the speed of processing is lost.

Furthermore, in the object detection method using skip macroblocks according to Non-Patent Document 7, the frequency with which skip macroblocks appear depends on the resolution and the frame type, so parameters must be adjusted for every profile defined in H.264, which makes the method unsuitable for real-time processing. In fact, the number of skip MBs per frame tends to be larger at higher resolutions, and larger in B frames than in P frames.

Furthermore, the object detection method using the motion compensation macroblock size according to Non-Patent Document 8 can detect only a small part of the moving region and, in addition, may not be applicable to H.264 format moving image data. The method according to Non-Patent Document 8 performs detection by presuming that locations where 16 × 8 motion compensation blocks are generated are boundary regions, whereas in H.264 format moving images, 16 × 8 motion compensation blocks are used frequently over the entire screen, not only in moving object regions.

An object of the present invention, made in view of these points, is to provide a moving object detection device, a moving object detection method, and a program that are suitable for real-time processing.

A moving object detection device according to a first aspect that achieves the above object comprises:
a variable block size detection unit that detects a block size used for variable block size motion compensation prediction on image data;
a motion vector calculation unit that, when the variable block size detection unit detects a block of a first predetermined size, calculates a first motion vector for a first macroblock including that block; and
a moving object region generation unit that, based on the first motion vector, sets the first macroblock as a first moving object region, and adds to the first moving object region a second macroblock that is adjacent to the set first moving object region and includes a block of a second predetermined size,
wherein the motion vector calculation unit calculates a second motion vector for each third macroblock, a third macroblock being a macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size, and
the moving object region generation unit adds the third macroblock to the first moving object region based on the second motion vector.

The invention according to a second aspect is the moving object detection device according to the first aspect, further comprising:
a prediction unit that predicts, for the moving object region generated by the moving object region generation unit, the position of that moving object region after a predetermined time; and
a moving object region correspondence determination unit that determines the correspondence between the first moving object region and a second moving object region, based on the position of the moving object region predicted by the prediction unit and on the second moving object region generated by the moving object region generation unit for the image data after the predetermined time.

The invention according to a third aspect is the moving object detection device according to the second aspect, wherein
the prediction unit is a macroblock prediction unit that predicts, based on the motion vector of each first macroblock included in the first moving object region, the position of each first macroblock at the time of a second frame that follows a first frame, and
the moving object region correspondence determination unit performs a count on the positions of the first macroblocks predicted by the macroblock prediction unit and the second moving object region generated for the second frame by the moving object region generation unit, and determines the correspondence between the first moving object region and the second moving object region based on the result.

The invention according to a fourth aspect is the moving object detection device according to the third aspect, wherein
the prediction unit further uses a Kalman filter to predict the position of the first moving object region at the times of a plurality of third frames spanning a predetermined time after the first frame, and
the moving object region correspondence determination unit further determines the correspondence between the first moving object region and a third moving object region, based on the position of the first moving object region predicted using the Kalman filter and on the positions of one or more third moving object regions generated by the moving object region generation unit during the predetermined time.

The invention according to a fifth aspect that achieves the above object is a moving object detection program for causing a computer to execute:
a variable block size detection step of detecting a block size used for variable block size motion compensation prediction on image data;
a motion vector calculation step of calculating, when the variable block size detection unit detects a block of a first predetermined size, a first motion vector for a first macroblock including that block; and
a moving object region generation step of setting, based on the first motion vector, the first macroblock as a first moving object region, and adding to the first moving object region a second macroblock that is adjacent to the set first moving object region and includes a block of a second predetermined size,
wherein the motion vector calculation step calculates a second motion vector for each third macroblock, a third macroblock being a macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size, and
the moving object region generation step adds the third macroblock to the first moving object region based on the second motion vector.

The invention according to a sixth aspect is the moving object detection program according to the fifth aspect, further causing the computer to execute:
a prediction step of predicting, for the moving object region generated in the moving object region generation step, the position of that moving object region after a predetermined time; and
a moving object region correspondence determination step of determining the correspondence between the first moving object region and a second moving object region, based on the position of the moving object region predicted in the prediction step and on the second moving object region generated in the moving object region generation step for the image data after the predetermined time.

The invention according to a seventh aspect that achieves the above object is a moving object detection method including:
a variable block size detection step of detecting a block size used for variable block size motion compensation prediction on image data;
a motion vector calculation step of calculating, when the variable block size detection unit detects a block of a first predetermined size, a first motion vector for a first macroblock including that block; and
a moving object region generation step of generating, based on the first motion vector, the first macroblock as a first moving object region, and adding to the first moving object region a second macroblock that is adjacent to the generated first moving object region and includes a block of a second predetermined size,
wherein the motion vector calculation step calculates a second motion vector for each third macroblock, a third macroblock being a macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size, and
the moving object region generation step adds the third macroblock to the first moving object region based on the second motion vector.

The invention according to an eighth aspect is the moving object detection method according to the seventh aspect, further including:
a prediction step of predicting, for the moving object region generated in the moving object region generation step, the position of that moving object region after a predetermined time; and
a moving object region correspondence determination step of determining the correspondence between the first moving object region and a second moving object region, based on the position of the moving object region predicted in the prediction step and on the second moving object region generated in the moving object region generation step for the image data after the predetermined time.

According to the present invention, a moving object detection device, method, and program suitable for real-time processing can be provided.

FIG. 1: A functional block diagram schematically showing the main configuration of a moving image processing device including a moving object detection device according to an embodiment of the present invention.
FIG. 2: A flowchart showing the operation of the moving object detection device shown in FIG. 1.
FIG. 3: A diagram for explaining the operation of the moving object detection device shown in FIG. 1.
FIG. 4: A flowchart showing the operation of the moving object detection device shown in FIG. 1.
FIG. 5: An example of a system including the moving object detection device shown in FIG. 1.
FIG. 6: A diagram for explaining the operation of the system shown in FIG. 5.

An embodiment of the present invention is described below with reference to the drawings. FIG. 1 is a functional block diagram schematically showing the main configuration of a moving image processing device including a moving object detection device according to an embodiment of the present invention. As shown in FIG. 1, the moving object region specifying unit 10 includes a variable block size detection unit 11, a macroblock motion vector calculation unit 12, and a moving object region generation unit 13. The moving object tracking unit 20 includes a macroblock prediction unit 21, a macroblock count unit 22, and a moving object region correspondence determination unit 23.

The encoding unit 30 is a device of general configuration for encoding a video signal (image data). The motion compensation prediction unit 31 performs variable block size motion compensation as implemented in, for example, H.264. The motion compensation prediction unit 31 also includes a motion vector acquisition unit (not shown), which acquires, for example, the motion vector in frame (t+1) based on frame (t), a past frame (that is, a forward motion vector). In this case, to optimize the amount of coded data and the motion compensation accuracy, the motion compensation prediction unit 31 performs motion compensation prediction using seven block sizes (partition sizes): 16 × 16 (vertical × horizontal pixels), 16 × 8, 8 × 16, 8 × 8, 8 × 4, 4 × 8, and 4 × 4.

The variable block size detection unit 11 monitors the block sizes used for variable block size motion compensation by the motion compensation prediction unit 31 and detects the block size used in its motion compensation prediction. The inventors found that partitions with a block size of 8 × 8 pixels (hereinafter, 8 × 8 partitions) are detected in large numbers at the boundaries of moving objects in H.264 format image data. Based on this finding, in the present embodiment the variable block size detection unit 11 is configured so that, in particular, when it detects an 8 × 8 partition, it notifies the macroblock motion vector calculation unit 12 of information on that 8 × 8 partition.

The macroblock motion vector calculation unit 12 acquires, from the motion vector detection unit (not shown) of the motion compensation prediction unit 31, the motion vector of each partition included in a macroblock of 16 × 16 pixels, and calculates the motion vector of the macroblock as a whole. The macroblock motion vector calculation unit 12 also classifies the direction of the motion vector into one of eight direction regions numbered 0 to 7, as illustrated in FIG. 3.
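The calculation performed by the macroblock motion vector calculation unit 12 might be sketched as follows. This is a minimal illustration, not the embodiment's implementation: the function names are hypothetical, the averaging follows step S02 described later, and the sector layout (region 0 centered on the +x axis, indices increasing in 45-degree steps) is an assumption, since the layout of FIG. 3 is not reproduced here.

```python
import math

def macroblock_motion_vector(partition_mvs):
    """Average the (dx, dy) motion vectors of all partitions in one macroblock."""
    n = len(partition_mvs)
    if n == 0:
        return (0.0, 0.0)
    sum_x = sum(dx for dx, _ in partition_mvs)
    sum_y = sum(dy for _, dy in partition_mvs)
    return (sum_x / n, sum_y / n)

def direction_region(mv, regions=8):
    """Classify a motion vector into one of `regions` direction regions (0..regions-1).

    Assumed layout (FIG. 3 may differ): region 0 is centered on the +x axis
    and indices increase with the angle returned by atan2, 45 degrees each.
    Returns None for a zero vector, which has no direction.
    """
    dx, dy = mv
    if dx == 0 and dy == 0:
        return None
    angle = math.atan2(dy, dx) % (2 * math.pi)   # angle in [0, 2*pi)
    width = 2 * math.pi / regions                # angular width of one region
    # Shift by half a region so that region 0 straddles the +x axis.
    return int(((angle + width / 2) % (2 * math.pi)) // width)
```

For example, a macroblock whose two partitions have vectors (2, 0) and (4, 2) gets the average vector (3.0, 1.0), which falls in region 0 under the assumed layout.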

The moving object region generation unit 13 generates moving object regions based on the motion vector of each macroblock and on the partition sizes. The moving object region generation unit 13 also outputs the generated moving object region information. The operation of the moving object region specifying unit 10 is described in detail with reference to FIG. 2. The variable block size detection unit 11, the macroblock motion vector calculation unit 12, and the moving object region generation unit 13 described above are implemented by, for example, a CPU (Central Processing Unit).

For each frame included in the acquired image data, the macroblock prediction unit 21 predicts, for each macroblock in a moving object region generated by the moving object region generation unit 13 described above, its position at the time of the following frame, using the forward motion vector.
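One way to picture this prediction is the sketch below. It rests on assumptions the description does not state: macroblocks are identified by (column, row) grid indices, motion vectors are given in whole pixels, and a displaced macroblock is snapped to the grid cell containing its center. All names are hypothetical.

```python
MB_SIZE = 16  # a macroblock is 16 x 16 pixels

def predict_macroblock_positions(region, motion_vectors):
    """Predict where each macroblock of a region lies in the following frame.

    region:         iterable of (col, row) macroblock indices
    motion_vectors: dict mapping (col, row) -> (dx, dy) forward vector in pixels
    Returns a list with one predicted (col, row) index per input macroblock.
    """
    predicted = []
    for col, row in region:
        dx, dy = motion_vectors.get((col, row), (0.0, 0.0))
        # Displace the macroblock center by its forward motion vector.
        center_x = col * MB_SIZE + MB_SIZE / 2 + dx
        center_y = row * MB_SIZE + MB_SIZE / 2 + dy
        # Assumption: snap to the grid cell that contains the moved center.
        predicted.append((int(center_x // MB_SIZE), int(center_y // MB_SIZE)))
    return predicted
```

The result is kept as a list rather than a set so that two macroblocks predicted to land on the same cell are both retained for the counting step that follows.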

The macroblock count unit 22 includes a counter (not shown) and performs a count based on the macroblock positions predicted by the macroblock prediction unit 21 and on the moving object regions generated by the moving object region generation unit 13 for the immediately following frame. As a specific counting method, consider the case where a moving object region made up of the macroblock positions predicted by the macroblock prediction unit 21 is taken as a target region r, and a moving object region generated by the moving object region generation unit 13 for the immediately following frame is taken as a candidate region c. The macroblock count unit 22 compares the target region r with the candidate region c and counts the number of macroblocks of the target region r that fall within the candidate region c. It then obtains a reference rate for each target region r. The reference rate (P) is calculated by the following formula.

P = (count for target region r) / (total number of macroblocks in candidate region c)

When the moving object region generation unit 13 has generated a plurality of moving object regions for each frame under consideration, there are a plurality of target regions r and a plurality of candidate regions c, and the macroblock count unit 22 calculates, for each candidate region, the reference rate (P) of each of the plurality of target regions r.
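The reference rate can be illustrated with a short sketch. The data representation is an assumption: the target region r is a list of predicted (column, row) macroblock positions and the candidate region c is a set of such positions. The function name is hypothetical.

```python
def reference_rate(target_positions, candidate_region):
    """Reference rate P of a target region r with respect to a candidate region c.

    target_positions: list of predicted (col, row) positions of r's macroblocks
    candidate_region: set of (col, row) positions of c's macroblocks
    P = (number of r's macroblocks falling inside c) / (number of macroblocks in c)
    """
    if not candidate_region:
        return 0.0  # assumption: an empty candidate region yields a rate of 0
    count = sum(1 for pos in target_positions if pos in candidate_region)
    return count / len(candidate_region)
```

With a target region predicted at [(1, 0), (1, 0), (2, 0), (5, 5)] and a candidate region {(1, 0), (2, 0), (3, 0), (4, 0)}, three predicted macroblocks fall inside the candidate region of four macroblocks, giving P = 0.75.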

For a given candidate region c, the moving object region correspondence determination unit 23 determines that the target region r with the largest reference rate (P) calculated by the macroblock count unit 22 is the one target region r corresponding to that candidate region c. The macroblock prediction unit 21, the macroblock count unit 22, and the moving object region correspondence determination unit 23 included in the moving object tracking unit 20 described above are implemented by, for example, a CPU.
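The correspondence determination amounts to picking, for each candidate region, the target region with the largest reference rate. The sketch below shows one possible form under assumed data structures (dictionaries keyed by hypothetical region identifiers). Mapping a candidate with no overlapping target to None is an assumption; the description does not say how that case is handled.

```python
def match_regions(targets, candidates):
    """Assign to each candidate region the target region with the largest P.

    targets:    dict target_id -> list of predicted (col, row) positions
    candidates: dict candidate_id -> set of (col, row) positions
    Returns dict candidate_id -> best target_id (None if no target overlaps).
    """
    matches = {}
    for cand_id, cand in candidates.items():
        best_id, best_rate = None, 0.0
        for target_id, positions in targets.items():
            if not cand:
                continue  # an empty candidate cannot be matched
            # Reference rate P: r's macroblocks inside c / macroblocks in c.
            rate = sum(1 for pos in positions if pos in cand) / len(cand)
            if rate > best_rate:
                best_id, best_rate = target_id, rate
        matches[cand_id] = best_id
    return matches
```

Ties are resolved here in favor of the target examined first, which is also an arbitrary choice made for the sketch.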

FIG. 2 is a flowchart showing the operation of the moving object region specifying unit shown in FIG. 1. The variable block size detection unit 11 monitors the motion compensation prediction unit 31. The variable block size detection unit 11 detects all 8×8 motion compensation blocks (8×8 partitions) contained in the frame (S01). With this, the moving object region specifying unit 10 starts processing.

Next, for each macroblock containing an 8×8 partition detected by the variable block size detection unit 11, the macroblock motion vector calculation unit 12 calculates the average of the motion vectors of all partitions (S02). To do so, the macroblock motion vector calculation unit 12 acquires the motion vectors of all partitions in the identified macroblock from the motion compensation prediction unit 31 and calculates their average.
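Step S02 can be sketched as a simple per-macroblock average. The representation of a motion vector as a (dx, dy) tuple and the function name `mean_motion_vector` are assumptions made for illustration only.

```python
def mean_motion_vector(partition_vectors):
    """Average the (dx, dy) motion vectors of all partitions in one macroblock."""
    n = len(partition_vectors)
    if n == 0:
        raise ValueError("macroblock has no partitions")
    sum_x = sum(v[0] for v in partition_vectors)
    sum_y = sum(v[1] for v in partition_vectors)
    return (sum_x / n, sum_y / n)


# A macroblock split into four 8x8 partitions, one vector per partition.
print(mean_motion_vector([(2, 0), (4, 2), (0, 0), (2, 2)]))
```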

The moving object region generation unit 13 determines whether the average value of the motion vectors is positive (S03). If it determines that the average value is positive, it connects the corresponding macroblocks and sets each group of connected macroblocks as one moving object region (Yes in S03, S04). If it determines that the average value is not positive, it ends the moving object region detection operation (No in S03, S10).
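Steps S03 and S04 can be sketched as a connected-component grouping. Two points are assumptions of this sketch rather than statements of the patent: "positive" is read as a non-zero vector magnitude, and macroblocks count as connected when they share an edge (4-neighbour adjacency). The name `seed_regions` is also hypothetical.

```python
from collections import deque


def seed_regions(mean_vectors):
    """Group macroblocks with a non-zero mean motion vector into regions.

    mean_vectors: dict mapping (row, col) -> (dx, dy) for the macroblocks
                  that contain 8x8 partitions.
    Returns a list of sets of (row, col) positions, one set per region.
    """
    # S03: keep only macroblocks whose mean vector has non-zero magnitude.
    moving = {pos for pos, (dx, dy) in mean_vectors.items() if dx * dx + dy * dy > 0}
    regions = []
    # S04: breadth-first search merges edge-adjacent macroblocks into one region.
    while moving:
        start = moving.pop()
        region, queue = {start}, deque([start])
        while queue:
            r, c = queue.popleft()
            for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
                if nb in moving:
                    moving.remove(nb)
                    region.add(nb)
                    queue.append(nb)
        regions.append(region)
    return regions
```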

Then, the moving object region generation unit 13 determines whether a macroblock adjacent to a macroblock included in the moving object region of step S04 has a 16×8 or 8×16 partition (S05). If so, it adds the adjacent macroblock satisfying the condition of step S05 to the moving object region (Yes in S05, S06). The moving object region generation unit 13 then performs steps S05 and S06 on the macroblocks further adjacent to the moving object region obtained in step S06. In this way, it repeats steps S05 and S06 until it determines that no macroblock adjacent to the moving object region as of step S06 contains a 16×8 or 8×16 partition.
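The repetition of steps S05 and S06 amounts to region growing. The sketch below assumes edge adjacency and a dictionary of partition labels per macroblock; the labels, the data layout, and the name `grow_region` are illustrative assumptions.

```python
def grow_region(region, partition_types):
    """Repeatedly absorb adjacent macroblocks that have 16x8 or 8x16 partitions.

    region:          set of (row, col) macroblock positions (result of step S04).
    partition_types: dict mapping (row, col) -> partition label such as
                     "16x16", "16x8", "8x16" or "8x8".
    Returns the grown region as a new set.
    """
    grown = set(region)
    frontier = list(region)
    while frontier:
        r, c = frontier.pop()
        for nb in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1)):
            # S05: test the neighbour's partition type; S06: add it on success.
            if nb not in grown and partition_types.get(nb) in ("16x8", "8x16"):
                grown.add(nb)
                frontier.append(nb)  # its own neighbours are examined later
    return grown
```

Newly added macroblocks are pushed back onto the frontier, so macroblocks that are only indirectly adjacent to the original region are reached as well.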

Then, the moving object region generation unit 13 determines whether a macroblock adjacent to the moving object region as of step S06 has a 16×16 partition (S07). If it determines that the adjacent macroblock does not have a 16×16 partition, it ends the moving object region detection operation (No in S07, S10). If, on the other hand, it determined in step S05 that no adjacent macroblock has a 16×8 or 8×16 partition, it determines in step S07 whether a macroblock adjacent to the moving object region as of step S04 has a 16×16 partition.

For each macroblock determined in step S07 to have a 16×16 partition, the moving object region generation unit 13 then judges the similarity between its motion vector and the motion vector of each macroblock included in the moving object region of step S06 (S08). The similarity is judged on the basis of the magnitude and direction of the motion vectors. For the direction, for example, the moving object region generation unit 13 classifies the direction of the average motion vector into one of the eight direction sectors numbered 0 to 7 shown in FIG. 3 and treats motion vectors belonging to the same or adjacent sectors as similar in direction.
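The direction test of step S08 can be sketched as follows. FIG. 3 is not reproduced here, so the sector layout is an assumption: sector k is taken to cover angles from 45k up to (but excluding) 45(k+1) degrees, measured from the positive x axis. The function names are hypothetical, and the magnitude comparison is omitted.

```python
import math


def direction_sector(dx, dy):
    """Map a motion vector to one of eight 45-degree sectors, numbered 0 to 7."""
    angle = math.degrees(math.atan2(dy, dx)) % 360.0
    return int(angle // 45.0) % 8


def similar_direction(v1, v2):
    """True when the two vectors fall in the same or neighbouring sectors."""
    diff = abs(direction_sector(*v1) - direction_sector(*v2))
    # Sectors 0 and 7 are neighbours, so measure the distance around the circle.
    return min(diff, 8 - diff) <= 1
```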

If the moving object region generation unit 13 determines in step S08 that the motion vectors are similar, it adds the adjacent macroblock satisfying the condition of step S08 to the moving object region (Yes in S08, S09). If it determines in step S08 that they are not similar, it ends the moving object region detection operation (No in S08, S10).

FIG. 4 is a flowchart showing the operation of the moving object tracking unit shown in FIG. 1. Here, the moving object tracking process is performed between the frame at time t (hereinafter, frame (t)) and the frame at time t+1 (hereinafter, frame (t+1)). The macroblock prediction unit 21 acquires the moving object region information of frame (t) from the moving object region generation unit 13 and notifies the macroblock count unit 22 accordingly (S11). The macroblock count unit 22 prepares a counter for each moving object region included in frame (t) and resets each counter value to zero (S12).

For each macroblock (hereinafter, MBx) in each moving object region of frame (t) acquired in step S11, the macroblock prediction unit 21 acquires a motion vector from the motion compensation prediction unit 31 (S13). Using the motion vector acquired in step S13, the macroblock prediction unit 21 then identifies, within frame (t), the macroblock (hereinafter, MBy) corresponding to the position of MBx at time (t+1) as the predicted macroblock (S14).
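One way to picture the identification of the predicted macroblock MBy is to shift the centre of MBx by its motion vector and take the grid cell that contains the shifted centre. This is only a sketch under assumptions: the motion vector is taken to be a pixel displacement pointing forward in time, macroblocks are 16×16 pixels, and the name `predict_macroblock` is hypothetical. The patent text does not specify these details.

```python
MB_SIZE = 16  # macroblock edge length in pixels


def predict_macroblock(row, col, motion_vector):
    """Return the grid position (row, col) of the predicted macroblock MBy."""
    dx, dy = motion_vector  # assumed: displacement in pixels from t to t+1
    # Shift the centre of MBx and find the grid cell containing it.
    centre_x = col * MB_SIZE + MB_SIZE / 2 + dx
    centre_y = row * MB_SIZE + MB_SIZE / 2 + dy
    return (int(centre_y // MB_SIZE), int(centre_x // MB_SIZE))


print(predict_macroblock(2, 3, (20, -5)))
```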

The macroblock prediction unit 21 then acquires the moving object region information of frame (t+1) from the moving object region generation unit 13 and provides it to the macroblock count unit 22 together with the information on the predicted macroblocks identified in step S14 (S15). The macroblock count unit 22 performs counting based on the information received from the macroblock prediction unit 21 (S16). It then calculates, for each moving object region of frame (t+1), the reference rate (P) described above with reference to FIG. 1 (S17).

For one moving object region of frame (t+1) (candidate region c), the moving object region correspondence determination unit 23 then determines the moving object region of frame (t) (target region r) with the maximum reference rate (P) calculated in step S17 (S18). It determines that the target region r with the maximum reference rate (P) for that candidate region c in step S18 is the same moving object as the candidate region c (Yes in S18, S19). For every target region r other than the one judged to have the maximum reference rate (P) in step S18, the moving object tracking process ends (No in S18, S20).
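Steps S16 to S19 can be sketched as picking, for one candidate region, the target region with the largest reference rate. Regions are assumed to be sets of (row, column) macroblock positions, and returning `None` when no target overlaps the candidate is a choice made for this sketch; neither the data layout nor the name `match_candidate` comes from the patent.

```python
def match_candidate(candidate, predicted_targets):
    """Return the ID of the target region with the largest reference rate P.

    candidate:         set of (row, col) positions of candidate region c
                       in frame (t+1).
    predicted_targets: dict mapping target ID -> set of predicted (row, col)
                       positions for that target region r of frame (t).
    Returns None when no target region overlaps the candidate at all.
    """
    best_id, best_p = None, 0.0
    for target_id, predicted in predicted_targets.items():
        # S16/S17: count overlapping macroblocks and form the reference rate.
        p = len(predicted & candidate) / len(candidate) if candidate else 0.0
        # S18: keep the target with the maximum reference rate.
        if p > best_p:
            best_id, best_p = target_id, p
    return best_id  # S19: this target is treated as the same moving object
```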

As described above, in the moving object detection device according to the present embodiment, when the variable block size detection unit 11 detects an 8×8 partition (hereinafter, a block of the first predetermined size) among the block sizes used for variable block size motion compensation prediction of image data, the macroblock motion vector calculation unit 12 calculates a first motion vector for the first macroblock containing that block. The moving object region generation unit 13 sets the first macroblock as a first moving object region based on the first motion vector. It further adds to the first moving object region a second macroblock that is directly or indirectly adjacent to the first moving object region and contains a 16×8 or 8×16 partition (hereinafter, a block of the second predetermined size). The macroblock motion vector calculation unit 12 also calculates a second motion vector for each third macroblock, that is, each macroblock adjacent to the generated first moving object region that contains no block of the first or second predetermined size. The moving object region generation unit 13 then adds the third macroblock to the first moving object region based on the second motion vector. Because the moving object detection process operates on macroblocks, which are sets of pixels, a moving object detection device suitable for real-time processing can be provided.

Furthermore, because each device according to the present embodiment detects and tracks moving objects using the detection of a partition of a predetermined size as the trigger, it can be applied to various kinds of moving image data such as H.264. In addition, by using the moving object region generated by the moving object region generation unit 13 according to the present embodiment, background composition can be performed more easily than with the commonly used chroma key composition.

Preferably, the macroblock prediction unit 21 can set up a Kalman filter for each of the one or more moving object regions generated by the moving object region generation unit 13 and use these filters for prediction. More preferably, the input to the Kalman filter can include not only the centroid position of the moving object region but also the average motion vector, which corresponds to the velocity vector of the moving object region. The centroid position, the average motion vector, and the velocity information are calculated from, for example, the averages of the position vectors and motion vectors of the macroblocks included in the moving object region generated by the moving object region generation unit 13. Adding the average motion vector to the Kalman filter input, in addition to the centroid position, improves the prediction accuracy of the Kalman filter and yields more accurate position and velocity vectors of the object at the next time, so tracking performance is higher than without it. By contrast, when only the centroid position of the moving object region is used as the Kalman filter input, as in the conventional approach, the centroid position and velocity vector at the next time are predicted from the centroid position at the current time alone; if the moving object suddenly changes direction or speed, the prediction is poor and the object is difficult to track.
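A Kalman filter that observes both position and velocity can be sketched as below. The patent does not give the filter model, so everything here is an assumption for illustration: a constant-velocity model, one independent filter per image axis, an identity observation matrix, and the noise values `q` and `r`. The class name `AxisKalman` is hypothetical.

```python
def mat_mul(a, b):
    """Multiply two 2x2 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mat_add(a, b):
    """Add two 2x2 matrices."""
    return [[a[i][j] + b[i][j] for j in range(2)] for i in range(2)]


def mat_inv(m):
    """Invert a 2x2 matrix."""
    det = m[0][0] * m[1][1] - m[0][1] * m[1][0]
    return [[m[1][1] / det, -m[0][1] / det], [-m[1][0] / det, m[0][0] / det]]


class AxisKalman:
    """Constant-velocity Kalman filter for one image axis.

    The state is [position, velocity]. Both are observed directly (centroid
    coordinate and average motion vector component), so H is the identity.
    """

    def __init__(self, pos, vel, q=1e-2, r=1.0):
        self.x = [pos, vel]
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.Q = [[q, 0.0], [0.0, q]]      # process noise (assumed value)
        self.R = [[r, 0.0], [0.0, r]]      # observation noise (assumed value)

    def predict(self, dt=1.0):
        """Advance the state by dt frames and return the predicted position."""
        F = [[1.0, dt], [0.0, 1.0]]
        Ft = [[1.0, 0.0], [dt, 1.0]]
        self.x = [self.x[0] + dt * self.x[1], self.x[1]]
        self.P = mat_add(mat_mul(mat_mul(F, self.P), Ft), self.Q)
        return self.x[0]

    def update(self, pos_obs, vel_obs):
        """Correct the state with an observed position and velocity."""
        K = mat_mul(self.P, mat_inv(mat_add(self.P, self.R)))  # Kalman gain
        innov = [pos_obs - self.x[0], vel_obs - self.x[1]]
        self.x = [self.x[0] + K[0][0] * innov[0] + K[0][1] * innov[1],
                  self.x[1] + K[1][0] * innov[0] + K[1][1] * innov[1]]
        i_minus_k = [[1.0 - K[0][0], -K[0][1]], [-K[1][0], 1.0 - K[1][1]]]
        self.P = mat_mul(i_minus_k, self.P)
```

A tracker would hold two such filters per moving object region, one for x and one for y, and call `predict()` once per frame followed by `update()` whenever the region is observed.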

More preferably, the macroblock prediction unit 21 can be configured to reduce the loss of detection accuracy caused by so-called occlusion, a phenomenon in which a moving object apparently disappears from the frame for a short time because it is hidden by an obstacle or crosses another moving object. In this case, for example, the macroblock prediction unit 21 continues prediction with the Kalman filter for a predetermined period set in advance, which makes it possible to track a moving object that has disappeared from the screen due to occlusion.

During this time, prediction with the Kalman filter continues with the observation data for the moving object that has disappeared from the screen due to occlusion set to 0 for both the position vector and the motion vector corresponding to the velocity information. Throughout the predetermined period, regardless of whether occlusion is present, the moving object region correspondence determination unit 23 keeps judging whether each moving object region generated by the moving object region generation unit 13 is similar to the moving object regions it holds. When the occlusion ends and the moving object that had disappeared appears on the screen again, the moving object region correspondence determination unit 23 treats it as a new object and compares the distance between its position and the position predicted by the Kalman filter from the prediction results during the predetermined period. If that distance is smaller than a predetermined threshold, the moving object region correspondence determination unit 23 determines that the object region that disappeared due to occlusion has reappeared.
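The reappearance test can be sketched as a distance check against the predicted positions of occluded objects. The Euclidean distance, the choice of the nearest track when several fall within the threshold, and the name `reappeared_track` are assumptions of this sketch; the patent only states that the distance is compared with a predetermined threshold.

```python
import math


def reappeared_track(new_position, predicted_positions, threshold):
    """Return the ID of the occluded track that the new object continues.

    new_position:        (x, y) of the newly detected moving object region.
    predicted_positions: dict mapping track ID -> (x, y) predicted by the
                         Kalman filter for that occluded track.
    threshold:           maximum distance (same units) for a match.
    Returns None when no predicted position lies within the threshold.
    """
    best_id, best_dist = None, threshold
    for track_id, (px, py) in predicted_positions.items():
        dist = math.hypot(new_position[0] - px, new_position[1] - py)
        if dist < best_dist:  # strictly smaller than the threshold
            best_id, best_dist = track_id, dist
    return best_id
```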

In this way, the moving object tracking unit 20 equipped with the macroblock prediction unit 21 described above can track, with high accuracy, even object regions in which occlusion has occurred. Such a moving object tracking unit 20 is particularly suitable for implementation in surveillance camera systems and the like, because it can keep tracking a suspicious person loitering in a parking lot or on a street corner without losing sight of that person, even when obstacles are present.

The moving object detection device 2 according to the above embodiment can also be implemented as, for example, the following system. The system according to the first example is a video distribution system with video advertisements. In addition to the moving object detection device 2, the system includes, for example, cameras 40a and 40b, a video server 41, a speech recognition system 42, a face image recognition system 43, a UTC (Coordinated Universal Time) server 44, a metadata generation device 45, and a metadata provider 46. The face image recognition system 43 holds, for example, a database that associates face image data of celebrities with metadata such as their names and careers. The metadata generation device 45 is connected to, for example, the Internet (not shown) and can acquire various data via the Internet. A user 49 can access the system from a terminal 47 such as a smartphone or PC (personal computer) and view video. The terminal 47 includes a metadata acquisition unit 48 that acquires metadata from the metadata provider 46. The metadata acquisition unit 48 can be implemented by, for example, a CPU.

The system is configured so that the moving object detection device 2 detects and tracks the many persons appearing in the live video (real-time video) captured by the cameras 40a and 40b. The moving object region specifying unit 10 of the moving object detection device 2 generates regions containing each person and so on by the method described in detail with reference to FIGS. 1 to 3, and provides moving object region information for each frame of the video data. The moving object tracking unit 20 tracks each moving object by the method described in detail with reference to FIGS. 1 and 4, assigns an identification number (ID) to each detected moving object, and provides moving object tracking information. The moving object tracking information is, for example, the diagonal coordinate values of an approximate rectangle containing the tracked moving object region. The live video captured by the cameras 40a and 40b is also stored in the video server 41 and provided to the user 49.
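The diagonal coordinates of the approximate rectangle can be derived from a macroblock region as sketched below. The region is assumed to be a non-empty set of (row, column) grid positions with 16×16-pixel macroblocks, and the rectangle is reported as its top-left and bottom-right corners in pixels; these conventions and the name `bounding_rectangle` are assumptions.

```python
def bounding_rectangle(region, mb_size=16):
    """Return ((x_min, y_min), (x_max, y_max)) in pixels for a macroblock region."""
    rows = [r for r, _ in region]
    cols = [c for _, c in region]
    top_left = (min(cols) * mb_size, min(rows) * mb_size)
    # The bottom-right corner is the far edge of the last macroblock.
    bottom_right = ((max(cols) + 1) * mb_size, (max(rows) + 1) * mb_size)
    return top_left, bottom_right


print(bounding_rectangle({(1, 2), (2, 2), (2, 3)}))
```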

The speech recognition system 42 converts the audio data contained in the video data into text and provides it to the moving object tracking unit 20 as speech text data. Working in cooperation, the face image recognition system 43 adds information such as the time, the position information of each moving object, and the text data to each moving object as metadata. The face image recognition system 43 also checks image files containing a moving object (person) against the database it holds in advance and associates the ID of each moving object (person) with metadata such as the person's name.

The metadata generation device 45 acquires near-real-time text data via the Internet from, for example, Twitter or a real-time search engine. The metadata generation device 45 also acquires from the UTC server 44 the video acquisition time (current time) based on UTC and adds it to the metadata. The time information may additionally include the video time, which indicates the relative time within the video.

Furthermore, the metadata generation device 45 can include in the above metadata the IDs of one or more other objects that act actively or passively on an object. Here, an "object that acts actively on an object" is, for example, another person who talks to the person that is the object (call this person A), or another person who plays tennis with person A. An "object that acts passively on an object" is, for example, a wristwatch worn by person A, a sofa on which person A is sitting, or a wine bottle that person A is holding in one hand.

FIG. 6 shows an example of the metadata held by the metadata provider 46. For each object ID, the UTC time, video time, position information, speech text data, text data acquired from Twitter and the like, the IDs of related objects, and information such as their actions are held as metadata. The "position information" is the diagonal coordinate values of the approximate rectangle containing each object region for each of cameras 1 to N. If, at UTC time 16:37:08, object 1 (ID: 12897654) says "Hello!" to object 2 (ID: 18999012), the text "Hello!" is held as the speech text data. The ID of object 2 is held as the ID of the "related object", and the information "listens" is held as its "action". In the same way, various information about each object at each point in time is held as metadata.
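The metadata fields listed for FIG. 6 could be held in a record such as the one sketched below. The class name, field names, and types are hypothetical; only the object IDs, the UTC time, the spoken text, and the action are taken from the example in the text, and fields whose values the text does not give are left empty.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional, Tuple

# Two diagonal corners of the approximate rectangle, in pixels.
Rect = Tuple[Tuple[int, int], Tuple[int, int]]


@dataclass
class ObjectMetadata:
    object_id: int
    utc_time: str
    video_time: Optional[str] = None                           # relative time within the video
    positions: Dict[int, Rect] = field(default_factory=dict)   # camera number -> rectangle
    speech_text: Optional[str] = None
    external_text: List[str] = field(default_factory=list)     # text gathered online
    related_object_id: Optional[int] = None
    related_action: Optional[str] = None


# Object 1 says "Hello!" to object 2, which listens.
record = ObjectMetadata(
    object_id=12897654,
    utc_time="16:37:08",
    speech_text="Hello!",
    related_object_id=18999012,
    related_action="listens",
)
```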

The system can also be configured so that the user can interactively specify an object of interest through the screen on which the live video is displayed. The user defines a rectangular region containing the object of interest by, for example, drawing a diagonal line on the touch panel screen with a finger or a stylus. If the object in the rectangular region moves slowly and changes little across several consecutive video frames, the system can also be configured to acquire the position information by capturing the object's motion over a range larger than the specified boundary.

The metadata acquisition unit 48 can also generate a filter for acquiring (fetching) metadata whose content matches the user's preferences and requests (that is, metadata specific to the user) and acquire metadata from the metadata provider 46 through that filter. Metadata acquired by such filter-based information filtering is likely to be attractive and meaningful to the user. Parameters that the metadata acquisition unit 48 uses when generating the filter include the location, the user's gender and age, the type of application the user is running, the field of the content included in the metadata, and the user's surroundings (for example, a concert, party, school, or workplace). The user's mood and level of expectation, which cannot be measured objectively, could also be used as parameters.

With this system, video advertisements matched to the user's preferences, metadata information, and the like can be distributed together with the video. When multiple cameras are used, metadata such as the position information of the tracked moving object captured by the cameras from different angles allows the cameras to work together to track the same target.

Furthermore, as one aspect of the present invention, the moving object detection device 2 can be configured as a computer. A program for causing the computer to function as this device is stored in a storage unit of the computer. Such a storage unit can be realized by an external storage device such as an external hard disk, or by an internal storage device such as ROM or RAM. The computer functioning as the above device can be realized under the control of a CPU or the like. That is, the CPU reads from the storage unit, as appropriate, a program describing the processing for realizing the function of each component, and realizes those functions on the computer. The function of each component may also be realized partly in hardware.

The program describing this processing can be distributed by, for example, selling, transferring, or lending a portable recording medium such as a DVD or CD-ROM. It can also be distributed by storing it in the storage unit of a server on a network and transferring it from the server to other computers via the network.

A computer that executes such a program can, for example, first store in its own storage unit the program recorded on the portable recording medium or transferred from the server. In other embodiments, the computer may read the program directly from the portable recording medium and execute processing according to it, or may execute processing according to the received program each time the program is transferred from the server.

Description of Symbols

1 Moving image processing apparatus
2 Moving object detection device
10 Moving object region specifying unit
11 Variable block size detection unit
12 Macroblock motion vector calculation unit
13 Moving object region generation unit
20 Moving object tracking unit
21 Macroblock prediction unit
22 Macroblock count unit
23 Moving object region correspondence determination unit
30 Encoding unit
31 Motion compensation prediction unit
40 Camera
41 Video server
42 Speech recognition system
43 Face image recognition system
44 UTC server
45 Metadata generation device
46 Metadata provider
47 Terminal

Claims

A variable block size detection unit for detecting a block size used for variable block size motion compensation prediction for image data;
A motion vector calculation unit that calculates a first motion vector for a first macroblock including the block when the variable block size detection unit detects a block of a first predetermined size;
A moving object region generation unit that, based on the first motion vector, sets the first macroblock as a first moving object region and adds to the first moving object region a second macroblock that is adjacent to the set first moving object region and includes a block of a second predetermined size,
The motion vector calculation unit calculates a second motion vector for each third macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size,
The moving object region generation unit adds the third macroblock to the first moving object region based on the second motion vector.
A moving object detection device characterized by the above.

The moving object detection device according to claim 1,
A predicting unit configured to predict a position of the moving object region after a predetermined time for the moving object region generated by the moving object region generating unit;
A moving object region correspondence determination unit that determines the correspondence between the first moving object region and a second moving object region, based on the position of the moving object region predicted by the prediction unit and the second moving object region generated for the image data after the predetermined time by the moving object region generation unit,
A moving object detection device comprising:

The moving object detection device according to claim 2, wherein
the prediction unit includes a macroblock prediction unit that predicts, based on the motion vector of each first macroblock included in the first moving object region, the position of each first macroblock at the time of a second frame subsequent to a first frame, and
the moving object region correspondence determination unit performs a count based on the position of each first macroblock predicted by the macroblock prediction unit and on the second moving object region generated for the second frame by the moving object region generation unit, and determines the correspondence between the first moving object region and the second moving object region based on the count result.
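
One possible reading of this count-based determination is sketched below in Python. The claim text does not say exactly what is counted, so the sketch assumes the count is the number of predicted macroblock positions that fall inside a candidate region. The 16-pixel macroblock size, the forward-pointing motion vectors in pixels per frame, the `min_ratio` threshold, and all function names are likewise hypothetical.

```python
MB_SIZE = 16  # assumed macroblock edge length in pixels

def predict_positions(region_mvs):
    """Shift each macroblock by its motion vector and snap it to the macroblock grid.

    region_mvs maps (row, col) -> (dx, dy), assumed to be the displacement
    in pixels from the first frame to the second frame.
    """
    predicted = set()
    for (row, col), (dx, dy) in region_mvs.items():
        x = col * MB_SIZE + dx
        y = row * MB_SIZE + dy
        predicted.add((round(y / MB_SIZE), round(x / MB_SIZE)))
    return predicted

def best_match(region_mvs, candidates, min_ratio=0.5):
    """Return the id of the candidate region hit by the most predicted macroblocks.

    candidates maps a region id -> set of (row, col) positions in the second frame.
    Returns None when no candidate reaches min_ratio of the predicted positions.
    """
    predicted = predict_positions(region_mvs)
    best_id, best_count = None, 0
    for region_id, positions in candidates.items():
        count = len(predicted & positions)   # assumed meaning of the claimed count
        if count > best_count:
            best_id, best_count = region_id, count
    if best_count < min_ratio * len(predicted):
        return None
    return best_id
```

Because the prediction reuses motion vectors that the encoder has already computed, this kind of matching needs only set operations on macroblock coordinates.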

The moving object detection device according to claim 3, wherein
the prediction unit further predicts, using a Kalman filter, the position of the first moving object region at the times of a plurality of third frames within a predetermined time after the first frame, and
the moving object region correspondence determination unit further determines a correspondence between the first moving object region and a third moving object region, based on the position of the first moving object region predicted using the Kalman filter and on the positions of one or more third moving object regions generated by the moving object region generation unit during the predetermined time.
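
Multi-frame prediction with a Kalman filter can be sketched as follows, assuming NumPy is available. The claim names only "a Kalman filter"; the constant-velocity state model, tracking the region by its centroid, the noise parameters `q` and `r`, the gating distance `gate`, and every class and function name here are assumptions chosen for illustration.

```python
import math
import numpy as np

class ConstantVelocityKalman:
    """Kalman filter over the state (x, y, vx, vy) of a region centroid."""

    def __init__(self, x, y, q=1e-2, r=1.0):
        self.s = np.array([x, y, 0.0, 0.0])            # state estimate
        self.P = np.eye(4) * 10.0                      # state covariance
        self.F = np.array([[1, 0, 1, 0],               # one-frame transition
                           [0, 1, 0, 1],
                           [0, 0, 1, 0],
                           [0, 0, 0, 1]], dtype=float)
        self.H = np.array([[1, 0, 0, 0],               # only position is observed
                           [0, 1, 0, 0]], dtype=float)
        self.Q = np.eye(4) * q                         # process noise (assumed)
        self.R = np.eye(2) * r                         # measurement noise (assumed)

    def predict(self):
        self.s = self.F @ self.s
        self.P = self.F @ self.P @ self.F.T + self.Q
        return self.s[:2]

    def update(self, zx, zy):
        z = np.array([zx, zy], dtype=float)
        innovation = z - self.H @ self.s
        S = self.H @ self.P @ self.H.T + self.R
        K = self.P @ self.H.T @ np.linalg.inv(S)       # Kalman gain
        self.s = self.s + K @ innovation
        self.P = (np.eye(4) - K @ self.H) @ self.P

def predict_ahead(kf, n_frames):
    """Extrapolate the centroid n_frames ahead without modifying the filter."""
    s = kf.s.copy()
    out = []
    for _ in range(n_frames):
        s = kf.F @ s
        out.append((float(s[0]), float(s[1])))
    return out

def nearest_region(predicted_xy, centroids, gate=16.0):
    """Return the id of the centroid closest to predicted_xy within gate, else None."""
    best_id, best_d = None, gate
    for region_id, (cx, cy) in centroids.items():
        d = math.hypot(cx - predicted_xy[0], cy - predicted_xy[1])
        if d <= best_d:
            best_id, best_d = region_id, d
    return best_id
```

Extrapolating over several frames in this way lets a region be matched again after frames in which no region was generated for it, which a single-frame macroblock prediction cannot do.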

A moving object detection program that causes a computer to execute:
a variable block size detection step of detecting a block size used for variable block size motion compensation prediction on image data;
a motion vector calculation step of calculating, when a block of a first predetermined size is detected in the variable block size detection step, a first motion vector for a first macroblock including that block; and
a moving object region generation step of setting, based on the first motion vector, the first macroblock as a first moving object region, and adding to the first moving object region a second macroblock that is adjacent to the set first moving object region and includes a block of a second predetermined size,
wherein the motion vector calculation step calculates a second motion vector for each third macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size, and
the moving object region generation step adds the third macroblock to the first moving object region based on the second motion vector.

The moving object detection program according to claim 5, further causing the computer to execute:
a prediction step of predicting a position, after a predetermined time, of the first moving object region generated in the moving object region generation step; and
a moving object region correspondence determination step of determining a correspondence between the first moving object region and a second moving object region, based on the position of the first moving object region predicted in the prediction step and on the second moving object region generated in the moving object region generation step for the image data after the predetermined time.

A moving object detection method comprising:
a variable block size detection step of detecting a block size used for variable block size motion compensation prediction on image data;
a motion vector calculation step of calculating, when a block of a first predetermined size is detected in the variable block size detection step, a first motion vector for a first macroblock including that block; and
a moving object region generation step of setting, based on the first motion vector, the first macroblock as a first moving object region, and adding to the first moving object region a second macroblock that is adjacent to the set first moving object region and includes a block of a second predetermined size,
wherein the motion vector calculation step calculates a second motion vector for each third macroblock that is adjacent to the first moving object region and includes no block of the first or second predetermined size, and
the moving object region generation step adds the third macroblock to the first moving object region based on the second motion vector.

The moving object detection method according to claim 7, further comprising:
a prediction step of predicting a position, after a predetermined time, of the first moving object region generated in the moving object region generation step; and
a moving object region correspondence determination step of determining a correspondence between the first moving object region and a second moving object region, based on the position of the first moving object region predicted in the prediction step and on the second moving object region generated in the moving object region generation step for the image data after the predetermined time.