JP2007004732A - Image generation device and method

Info

Publication number: JP2007004732A
Application number: JP2005187390A
Authority: JP
Inventors: Masahiro Iwasaki; 正宏岩崎; Takeo Azuma; 健夫吾妻; Kenji Kondo; 堅司近藤
Original assignee: Matsushita Electric Industrial Co Ltd
Current assignee: Panasonic Holdings Corp
Priority date: 2005-06-27
Filing date: 2005-06-27
Publication date: 2007-01-11

Abstract

PROBLEM TO BE SOLVED: To provide an image generation device for generating a new image on which such a parameter as the shape, dress and motion of a joint object existing in an image is reflected. SOLUTION: This image generation device for generating a new image on which the characteristics of a joint object are reflected from an image obtained by picking up the image of the joint object is provided with an image input part 101 for acquiring an image by picking up the image of the joint object, a parameter calculating part 102 for calculating a first parameter relating to the position of the joint or inter-joint site of the joint object by applying a model having a preliminarily held joint to the joint object in the acquired image, a model conversion part 103 for executing model conversion to estimate a second parameter relating to the shape information of the inter-joint site of the joint object and the position and motion of the joint which is not included in the first parameter by using the first parameter and an image generation part 104 for generating a new image on which the characteristics of the joint object are reflected by using the second parameter. COPYRIGHT: (C)2007,JPO&INPIT

Description

The present invention relates to an image generation apparatus and method for generating, by image processing, an image of a joint object such as a person or an animal.

Many conventional techniques for generating images of joint objects have been realized in the field of computer graphics. Most of them use a motion capture system to acquire the shape and motion data of an actual person and generate animation from that motion data. This makes it possible to output various postures and movements of a joint object as images. There are also methods that can generate animation without a motion capture system. For example, Patent Document 1 discloses a method of generating an animation of human body motion by using dynamics to estimate the motion that minimizes a person's muscle activity.

For purposes such as motion analysis, image analysis methods that fit a joint model to a joint object in an image have also been disclosed. For example, in Patent Document 2, a model constructed for each part of an object is fitted to the object in the image, and the fitting result is held as model parameters representing the object. Non-Patent Document 1 proposes a technique that fits a person model composed of 10 body parts, built as a combination of cylinders, to a person in the image, and that can track the body parts even when the person moves.

Patent Document 1: Japanese Patent No. 3355113
Patent Document 2: JP-A-8-214289
Non-Patent Document 1: Leonid Sigal, Sidharth Bhatia, Stefan Roth, Michael J. Black, Michael Isard, "Tracking Loose-Limbed People", 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 421-428, 2004

However, even with the image generation and image analysis methods typified by the above prior art, there is a problem that an accurate image reflecting parameters such as the body part shapes and clothing of the joint object in the image cannot be generated.

In the image generation method typified by Patent Document 1, parameters relating to the motion of the joint object are calculated in advance, so animation can be generated from those motion parameters. However, parameters relating to clothing and shape are not obtained from the joint object in the image. For this reason, it is not possible to generate an animation reflecting the shape, color, and movement of a joint object that appears in an actual image.

In the image analysis methods typified by Patent Document 2 and Non-Patent Document 1, only a person model simplified into cylinders, straight lines, or the like is fitted. Rough parameters such as the joint positions of the joint object in the image can therefore be acquired, but information on more detailed movement, clothing, and fine shape is difficult to acquire. For example, Non-Patent Document 1 proposes a method of identifying the positions of 10 parts, but image generation such as animation generation requires more detailed information on the movement, clothing, and fine shape of the joint object in the image. At the current state of the art, however, the more parts a model has, the harder it is to fit, and it is difficult to accurately fit to an image a model detailed enough for image generation. Moreover, the more parts a model has, the more parameters must be calculated, so the amount of computation also grows. There is the further problem that, when joints or parts are hidden by occlusion or the like, it is difficult to obtain position and motion information for the hidden joints or parts from the image. Occlusion here means that part of a moving object is hidden behind something, so that the number of pixels that can be captured changes.

The present invention solves these problems, and its object is to provide an image generation apparatus and the like that generate a new image reflecting the information (characteristics) of a joint object in an image.

That is, an object of the present invention is to extract, from an image of a joint object, information on the part shapes, clothing, movement, and the like of that joint object, and to use the extracted information to generate a new image reflecting the characteristics of the joint object in the image.

Specifically, the present invention provides a technique for extracting, from an image of a joint object, information on the part shapes, clothing, movement, and the like of that joint object, and using the extracted information to generate temporally and spatially interpolated and extrapolated images. It also provides a technique for extracting information on the body part shapes and movement of the joint object from the image and using the extracted information to generate an image in which the parts constituting the joint object can be clearly seen. It further provides a technique for extracting information on the part shapes, clothing, and movement of the joint object from the image and using the extracted information to generate an image of the joint object in a posture or movement different from the posture or movement of the joint object in the acquired image.

To achieve the above object, an image generation apparatus according to the present invention generates, from an image showing a joint object, a new image reflecting the characteristics of the joint object, and comprises: image input means for acquiring an image obtained by imaging the joint object; parameter calculation means for calculating a first parameter relating to the positions of the joints or inter-joint parts of the joint object by fitting a model having joints, held in advance, to the joint object in the acquired image; model conversion means for performing model conversion that uses the first parameter to estimate a second parameter relating to the shape information of the inter-joint parts of the joint object and to the positions and movements of joints not included in the first parameter; and image generation means for generating, using the second parameter, a new image reflecting the characteristics of the joint object.

Here, when the image input means acquires temporally continuous images, the parameter calculation means may use those images to calculate a first parameter relating to the positions and movements of the joints or inter-joint parts of the joint object.

The present invention can be realized not only as such an image generation apparatus but also as an image generation method, a program including the steps of that method, a computer-readable recording medium on which the program is recorded, and the like.

With the above method, more detailed position and motion information for the joints and parts is estimated from the rough position and motion information obtained by fitting a joint model to the joint object in the image. This makes it possible to generate a new image reflecting information such as the part shapes and movement of the joint object in the image.

Specifically, temporally and spatially interpolated and extrapolated images can be generated from an image of the joint object. An image in which the parts constituting the joint object can be clearly seen can also be generated from an image of the joint object. Furthermore, an image of the joint object in a posture or movement different from the one in the image can be generated while reflecting the part shapes, clothing, and the like of that joint object. In addition, even when the position and motion information of some joints and parts cannot be obtained from the image because of occlusion or the like, estimating that missing information makes it possible to generate a new image reflecting information such as the part shapes and movement of the joint object in the image.

Thus, according to the present invention, a high-accuracy image is generated from captured real images. Its practical value is high, particularly as a video complementing apparatus or the like that improves accuracy by complementing video captured with a digital camera, a camera-equipped mobile phone, a video device, or the like.

One embodiment of the present invention is an image generation apparatus that generates, from an image showing a joint object, a new image reflecting the characteristics of the joint object, and comprises: image input means for acquiring an image obtained by imaging the joint object; parameter calculation means for calculating a first parameter relating to the positions of the joints or inter-joint parts of the joint object by fitting a model having joints, held in advance, to the joint object in the acquired image; model conversion means for performing model conversion that uses the first parameter to estimate a second parameter relating to the shape information of the inter-joint parts of the joint object and to the positions and movements of joints not included in the first parameter; and image generation means for generating, using the second parameter, a new image reflecting the characteristics of the joint object. Here, as the new image reflecting the characteristics of the joint object, the image generation means generates, for example, an image reflecting the positions of the joints or inter-joint parts of the joint object and the color and texture information of its inter-joint parts. In this way, either a second parameter carrying more information than the first parameter is estimated, or parameters for information that cannot be obtained by fitting the joint model to the image alone, because of occlusion or the like, are estimated as the second parameter. This makes it possible to generate a new image reflecting the information (characteristics) of the joint object in the image.

In a more preferred embodiment of the present invention, the image input means acquires temporally continuous images, and the parameter calculation means uses those images to calculate a first parameter relating to the positions and movements of the joints or inter-joint parts of the joint object. The image generation means can then generate, as the new images, images that are temporally interpolated and extrapolated with respect to the temporally continuous images, based on the motion information included in the first parameter or the second parameter.

In a more preferred embodiment of the present invention, the image generation apparatus further comprises image evaluation means for evaluating the image generated by the image generation means by calculating the error between that image and a target image, and parameter changing means for changing the first parameter based on the evaluation result of the image evaluation means, and the model conversion means performs the model conversion based on the first parameter changed by the parameter changing means. Because the parameter is changed so that the generated image approaches the target image, the parameter can be obtained with higher accuracy, and a new image that reflects the characteristics of the joint object in the image more faithfully can be generated. It also becomes possible to generate a moving image with a higher frame rate from a moving image with a lower frame rate.

The image generation means may generate, as the new image, an image in which a different color or texture is pasted onto each part constituting the joint object. This makes it possible to grasp the state and movement of each part of the joint object in an image that reflects the information (characteristics) of the joint object in the original image.

The image generation means may also generate, as the new image, an image in which the texture of the inter-joint parts of the joint object is pasted onto an image of the joint object in a posture or movement different from its posture or movement in the input. This makes it possible to generate an image processed into another posture, movement, or shape while reflecting the information (characteristics) of the joint object in the image.

In a more preferred embodiment of the present invention, when part of the first parameter cannot be extracted by the parameter calculation means, the model conversion means performs model conversion that estimates the second parameter by estimating the parameter values that could not be extracted. Parameters for information that cannot be obtained by fitting the joint model to the image alone, because of occlusion or the like, are thereby estimated as the second parameter, so a new image reflecting the information (characteristics) of the joint object in the image can be generated.

Here, the motion is preferably represented by any of a motion vector, an acceleration vector, affine parameters, and approximate curve parameters. Obtaining parameters on the motion of an object having joints in this way makes it possible to use the motion information to generate new images that are temporally interpolated and extrapolated.

The model conversion means preferably holds correlation information between the first parameter and the second parameter, obtained in advance, and performs the model conversion that estimates the second parameter by referring to that correlation information. Learning information on the positions and movements of the joints of the joint object in advance in this way makes the second parameter easier to estimate.

As an example, the correlation information may include the autocorrelation of the first parameter and the cross-correlation between the first parameter and the second parameter. Alternatively, the correlation information may be calculated as a weighted linear sum of a plurality of sets of cross-correlation information. Furthermore, the correlation information may be obtained in advance for each type of joint object or for each individual joint object. This makes the estimation of the second parameter more accurate.

Hereinafter, embodiments of the present invention will be described with reference to the drawings.

(Embodiment 1)
First, Embodiment 1 of the present invention will be described. FIG. 1 is a diagram showing an outline of the processing procedure according to Embodiment 1. This image generation apparatus enables generation of a new image reflecting information (characteristics) on the position, shape, and movement of the joints of a joint object in an image, and comprises an image input unit 101, a parameter calculation unit 102, a model conversion unit 103, and an image generation unit 104.

The image input unit 101 is an input interface or the like that acquires an image obtained by imaging a joint object with a digital camera, a video device, or the like (that is, a real image rather than computer graphics or the like). The input may also be a time series of images.

The parameter calculation unit 102 is a processing unit that detects the positions of the joints of a joint object in the image by fitting a model having joints, prepared in advance, to the image. When the input is a time series of images and the joint object in the images is moving, the movement of the joints and inter-joint parts of the joint object can also be detected. A configuration example of the parameter calculation unit 102 will be described with reference to FIG. 2. The parameter calculation unit 102 comprises a joint object region extraction unit 1021 that extracts the joint object region from the input image, a model fitting unit 1022 that fits a joint model prepared in advance to the extracted joint object region, and an inter-joint part position calculation unit 1023 that calculates the positions of the inter-joint parts from the joint positions obtained by fitting the joint model. When the joint model is three-dimensional, the three-dimensional information is projected into the two-dimensional image space.
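The projection of a three-dimensional joint model into the two-dimensional image space can be illustrated with a minimal sketch. The text does not specify a camera model, so the pinhole projection used here, as well as the names `project_points`, `focal_length`, and `principal_point`, are assumptions made for illustration only.

```python
import numpy as np

def project_points(points_3d, focal_length, principal_point):
    """Project 3D points (N, 3) in camera coordinates onto the image plane.

    Assumes an ideal pinhole camera (an assumption, not stated in the text);
    every point must have a positive depth Z.
    """
    pts = np.asarray(points_3d, dtype=float)
    # Perspective division by depth, then shift to the principal point.
    u = focal_length * pts[:, 0] / pts[:, 2] + principal_point[0]
    v = focal_length * pts[:, 1] / pts[:, 2] + principal_point[1]
    return np.stack([u, v], axis=1)
```

For example, `project_points([[1.0, 2.0, 2.0]], 100.0, (50.0, 50.0))` maps the single joint to image coordinates (100, 150).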

The model conversion unit 103 is a processing unit that performs model conversion to estimate, from the parameters obtained by the parameter calculation unit 102, parameters relating to the shape information of the inter-joint parts of the joint object and to the positions and movements of joints not included in those parameters. Taking as input the joint positions and movements of the joint object extracted by the parameter calculation unit 102, it outputs more precise joint position and motion information including shape information, and outputs information on the positions and movements of joints that could not be obtained from the image because of occlusion or the like.

The image generation unit 104 is a processing unit that generates a new image using the joint positions estimated by the model conversion unit 103 and the information relating to the movement and shape of the inter-joint parts. That is, as the new image reflecting the characteristics of the joint object, the image generation unit 104 generates, for example, an image reflecting the positions of the joints or inter-joint parts of the joint object and the color and texture information of its inter-joint parts.

A configuration example of the image generation unit 104 for generating new images that are temporally interpolated and extrapolated will be described with reference to FIG. 3. The image generation unit 104 comprises a pixel movement position calculation unit 1041 that determines, from the motion information, the pixel positions corresponding to the frame (time) to be generated, an interpolation processing unit 1042 that interpolates the missing pixels and the like that result from moving pixels, and a pixel value determination unit 1043 that determines the color information of the moved pixels.

When temporally interpolated or extrapolated images are not generated, motion information is unnecessary, so the pixel movement position calculation unit 1041 and the interpolation processing unit 1042 may be omitted.
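As a rough illustration of what the pixel movement position calculation unit 1041 does, the sketch below moves each pixel along a fraction of its motion vector and records which destination pixels remain empty, which is the kind of gap the interpolation processing unit 1042 would then fill. The dense per-pixel 2D motion field, the rounding to the nearest pixel, and the names `warp_forward`, `motion`, and `alpha` are assumptions for illustration, not the patent's own procedure.

```python
import numpy as np

def warp_forward(image, motion, alpha):
    """Move pixels by alpha * motion to synthesise an intermediate frame.

    image:  (H, W) or (H, W, C) array.
    motion: (H, W, 2) array of per-pixel (dx, dy) motion to the next frame.
    alpha:  time fraction (0..1 interpolates, >1 extrapolates).
    Returns the warped image and a boolean mask of pixels that received a value.
    """
    h, w = image.shape[:2]
    out = np.zeros_like(image)
    filled = np.zeros((h, w), dtype=bool)
    ys, xs = np.mgrid[0:h, 0:w]
    # Destination of every source pixel at the requested time.
    new_x = np.rint(xs + alpha * motion[..., 0]).astype(int)
    new_y = np.rint(ys + alpha * motion[..., 1]).astype(int)
    # Discard pixels that leave the frame.
    valid = (new_x >= 0) & (new_x < w) & (new_y >= 0) & (new_y < h)
    out[new_y[valid], new_x[valid]] = image[ys[valid], xs[valid]]
    filled[new_y[valid], new_x[valid]] = True
    return out, filled
```

Pixels where `filled` is `False` are the holes left by the movement; a simple neighbourhood average is one way they could be filled.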

Next, the method by which the image generation apparatus of this embodiment, configured as described above, generates an image of a joint object will be described in detail with reference to the flowchart of FIG. 4.

First, in S2001, the image input unit 101 receives a captured image as input.

Next, in S2002, the joint object region extraction unit 1021 of the parameter calculation unit 102 performs background difference processing on the input image and cuts out the joint object region. Inter-frame difference processing may be performed instead of background difference processing. When the target joint object is a person, the person region may also be cut out using a method such as M. Oren, C. Papageorgiou, P. Sinha, E. Osuna and T. Poggio, "Pedestrian Detection using wavelet templates", Proc. of CVPR97, pp. 193-199, 1997. Edge detection processing may be used in combination. When background difference processing is performed, a background image containing no person is prepared in advance. When a moving image is input, a background sprite can be generated and the generated background sprite image used instead.
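The background difference step in S2002 can be sketched as a per-pixel comparison against the prepared background image. The fixed threshold and the names `extract_foreground_mask` and `threshold` are assumptions for illustration; the text does not state how the difference is thresholded.

```python
import numpy as np

def extract_foreground_mask(frame, background, threshold=30):
    """Return a boolean mask of pixels that differ from the background image.

    frame, background: arrays of identical shape, (H, W) or (H, W, C).
    threshold: hypothetical absolute-difference threshold in intensity levels.
    """
    # Widen the type first so that uint8 subtraction cannot wrap around.
    diff = np.abs(frame.astype(np.int32) - background.astype(np.int32))
    if diff.ndim == 3:
        # For colour images, use the largest difference over the channels.
        diff = diff.max(axis=2)
    return diff > threshold
```

The resulting mask marks the candidate joint object region; in practice it would usually be cleaned up (for example with the edge detection mentioned above) before model fitting.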

Next, in S2003, the model fitting unit 1022 of the parameter calculation unit 102 uses a joint model 1001 such as that shown in FIG. 5(a) and fits the model having joints, prepared in advance, to the joint object region, as shown by the model fitting result 1002 in FIG. 5(b). A model fitting method such as Leonid Sigal, Sidharth Bhatia, Stefan Roth, Michael J. Black, Michael Isard, "Tracking Loose-Limbed People", 2004 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, Vol. 1, pp. 421-428, 2004 can be used here. The parts indicated by black circles in the model fitting result 1002 are the joint positions detected by the model fitting. That is, the three-dimensional joint positions {Xwi(t), Ywi(t), Zwi(t)} and the joint angle information can be obtained. The connection angles of the cylinders shown in the joint model 1001 can be used as the joint angle information. When a time series of images is input, model fitting is additionally performed for each input image, and the three-dimensional positions of the black circles in the model fitting result 1002 are obtained as a time series, which yields the motion information {ΔXwi(t), ΔYwi(t), ΔZwi(t)}.
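Given joint positions fitted in each frame, the motion information {ΔXwi(t), ΔYwi(t), ΔZwi(t)} can be read as the frame-to-frame displacement of each joint. The sketch below assumes exactly that simple difference and a (frames, joints, 3) array layout; the function name `joint_motion` is hypothetical.

```python
import numpy as np

def joint_motion(positions):
    """Frame-to-frame displacement of each joint.

    positions: array of shape (T, J, 3) holding (Xw, Yw, Zw) for J joints
               over T frames.
    Returns an array of shape (T - 1, J, 3) with (dXw, dYw, dZw).
    """
    positions = np.asarray(positions, dtype=float)
    return positions[1:] - positions[:-1]
```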

Furthermore, the inter-joint part position calculation unit 1023 of the parameter calculation unit 102 uses the detected three-dimensional joint positions 1002 to calculate the midpoint between adjacent joints as the representative position 1003 of the inter-joint part, as shown in FIG. 5(c). The representative position 1003 need not be the center position; an inter-joint part position may be calculated separately for each inter-joint part.

FIG. 5 does not limit the number of joints or the number of inter-joint parts. For parts located beyond the last joint, such as the head, hands, and feet, the center position is calculated from the joint position of the neck, wrist, or ankle and its angle information, using a prescribed value. Here, the center of the head was placed 15 cm from the neck position, the center of the hand 5 cm from the wrist position, and the center of the foot 9 cm from the ankle position. These values may of course be prepared as a database for each body type and sex, as in the example shown in FIG. 6.
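The two position calculations described above can be sketched as follows. The offsets are the values quoted in the text; the assumptions are that coordinates are expressed in centimetres, that the joint angle information has already been converted into a direction vector, and the names `inter_joint_midpoint`, `end_part_center`, and `END_PART_OFFSET_CM`.

```python
import numpy as np

# Prescribed distances from the terminal joint to the part centre (from the text).
END_PART_OFFSET_CM = {"head": 15.0, "hand": 5.0, "foot": 9.0}

def inter_joint_midpoint(joint_a, joint_b):
    """Representative position of an inter-joint part: midpoint of two joints."""
    return (np.asarray(joint_a, dtype=float) + np.asarray(joint_b, dtype=float)) / 2.0

def end_part_center(joint_position, direction, part):
    """Centre of a part beyond the last joint (head, hand, or foot).

    direction: vector pointing from the joint toward the part, assumed to be
               derived from the joint angle information; it is normalised here.
    """
    d = np.asarray(direction, dtype=float)
    d = d / np.linalg.norm(d)
    return np.asarray(joint_position, dtype=float) + END_PART_OFFSET_CM[part] * d
```

A per-body-type database such as the one in FIG. 6 would simply replace the single dictionary with one dictionary per body type or sex.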

In S2004, the model conversion unit 103 estimates output information from input information by using model conversion data that describes the relationship between input and output. Specifically, taking as input the joint positions and movements of the joint object extracted by the parameter calculation unit 102, the model conversion unit 103 estimates and outputs information on the positions and movements of joints that could not be obtained because of occlusion or the like. From the same input, it also outputs more precise information, such as joint positions of the joint object, that includes shape information. An example in which the model conversion data is generated from correlation information is described here.

まず、入力ベクトルをｘとする。ｘは、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きに相当する。 First, let x be an input vector. x corresponds to the joint positions and motion of the joint object extracted by the parameter calculation unit 102.

Ｘは、入力ベクトルの集合である。Ｎは、データセットの数である。 X is the set of input vectors. N is the number of data sets.

また、 Also,

であり、ｍは関節位置や関節間の部位に相当する。 where m corresponds to the joint positions and the inter-joint sites.

次に、出力ベクトルをｙとする。ｙは、推定したいパラメータであり、オクルージョン等で得られなかった関節の位置やその動きに関する情報でも良いし、形状情報を含むさらに情報量の多い関節物体の関節位置等の情報でも良い。 Next, let y be an output vector. y is the parameter to be estimated; it may be information on the positions and motion of joints that could not be obtained because of occlusion or the like, or information such as the joint positions of the joint object that carries more information, including shape information.

Ｙは、出力ベクトルの集合である。 Y is the set of output vectors.

であり、ｌは、関節位置、関節間部位に加えて、形状を表現するためのマーカ位置等の情報を含む。 where l includes, in addition to the joint positions and inter-joint sites, information such as marker positions for expressing the shape.

次に、Ｘの自己相関行列を次のように決定する。

Next, the autocorrelation matrix of X is determined as follows.

また、ＸとＹの相互相関行列を次のように決定する。

Further, the cross-correlation matrix between X and Y is determined as follows.

ここで、モデル変換行列をＣとすると、 Here, letting C be the model conversion matrix,

で表すことができる。ここで、Ｃ_x ^*はＣ_xの逆行列、または疑似逆行列である。 it can be expressed as above. Here, C_x^* is the inverse matrix or pseudo-inverse matrix of C_x.

そして、推定したいｙは、モデル変換行列を用いて次の式で表すことができる。

Then, y to be estimated can be expressed by the following equation using a model conversion matrix.

ここで、ｍ_xおよびｍ_yは、ｘおよびｙの平均ベクトルである。 Here, m_x and m_y are the mean vectors of x and y, respectively.

（数８）より、ｘの平均ベクトルｍ_xおよびｙの平均ベクトルｍ_y、モデル変換行列Ｃを保持しておけば、モデル変換部１０３は、新たに与えられた入力ベクトルｘから、出力ベクトルｙを推定することが可能である。 From (Equation 8), if the mean vector m_x of x, the mean vector m_y of y, and the model conversion matrix C are held, the model conversion unit 103 can estimate the output vector y from a newly given input vector x.
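
For reference, one form of (Equation 5) to (Equation 8) that is consistent with the definitions above is the standard linear least-squares estimator shown below. This is a hedged reconstruction and not the original equations: the mean-centred definitions of C_x and C_{yx}, the 1/N normalisation, and the subscript order of C_{yx} are assumptions. Only the roles of C_x^*, m_x, m_y, and C are taken from the text.

```latex
\begin{align}
C_x    &= \frac{1}{N}\sum_{n=1}^{N} (x_n - m_x)(x_n - m_x)^{\mathsf T} \\
C_{yx} &= \frac{1}{N}\sum_{n=1}^{N} (y_n - m_y)(x_n - m_x)^{\mathsf T} \\
C      &= C_{yx}\, C_x^{*} \\
y      &= m_y + C\,(x - m_x)
\end{align}
```

Under these assumptions, an exactly linear relationship between x and y is recovered without error, which is why holding only m_x, m_y, and C is sufficient at estimation time.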

ここで、具体的なｍ_x、ｍ_y、Ｃの決定方法について説明する。 Here, a concrete method for determining m_x, m_y, and C is described.
ＸおよびＹとしては、モーションキャプチャデータを使うことができる。 Motion capture data can be used as X and Y.

モーションキャプチャは、実際の関節物体の関節位置などにマーカを取り付けて、そのマーカの３次元位置を時系列で得ることができるものである。 In motion capture, a marker is attached to the joint position of an actual joint object, and the three-dimensional position of the marker can be obtained in time series.

時刻ｔにおける入力ベクトルｘをマーカｉについて次のように記述すると、 When the input vector x at time t is described for marker i as follows,

として、位置情報と動き情報を表現できる。 both position information and motion information can be expressed.

動き情報については、動きベクトルの他に、動きベクトルを関数で近似しても良いし、アフィンパラメータでも良いし、加速度を用いても良い。 As for the motion information, in addition to the motion vector, the motion vector may be approximated by a function, an affine parameter, or acceleration may be used.

また、ｙも同様にマーカｋについて記述すると、 Similarly, when y is described for marker k,

のように、位置情報と動き情報を表現できる。 position information and motion information can be expressed in the same way.

ここでも、動き情報については、動きベクトルの他に、動きベクトルを関数で近似しても良いし、アフィンパラメータでも良いし、加速度を用いても良い。 Here, as for the motion information, in addition to the motion vector, the motion vector may be approximated by a function, an affine parameter, or acceleration may be used.

ｘおよびｙは、これらのベクトルをマーカ順に並べたもので表現できる。 x and y can be expressed by arranging these vectors in marker order.
さらに、ｍ_x、ｍ_yは、マーカ順に並べたそれぞれのベクトルの平均である。 Further, m_x and m_y are the means of the respective vectors arranged in marker order.

そして、モデル変換行列Ｃは、同時刻におけるｘとｙを組として、（数５）、（数６）、（数７）によって、計算することが可能である。なお、ｍ_x、ｍ_y、Ｃについては、あらかじめ計算しておいても良いし、データセットからその都度計算しても良い。 The model conversion matrix C can then be calculated by (Equation 5), (Equation 6), and (Equation 7), treating x and y at the same time as a pair. Note that m_x, m_y, and C may be calculated in advance or calculated from the data set each time.
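
The following is a minimal sketch of how m_x, m_y, and C might be computed from paired samples and then used for estimation. It assumes mean-centred correlation matrices and an invertible C_x; the text also allows a pseudo-inverse, which this sketch does not implement. All function names are hypothetical, and the toy data stands in for motion capture data.

```python
def mean_vector(vectors):
    n = len(vectors)
    return [sum(v[i] for v in vectors) / n for i in range(len(vectors[0]))]

def correlation(A, m_a, B, m_b):
    """Return (1/N) * sum_n (a_n - m_a)(b_n - m_b)^T as a list of rows."""
    n = len(A)
    return [[sum((a[i] - m_a[i]) * (b[j] - m_b[j]) for a, b in zip(A, B)) / n
             for j in range(len(m_b))] for i in range(len(m_a))]

def solve(M, B):
    """Solve M Z = B by Gauss-Jordan elimination with partial pivoting."""
    n = len(M)
    aug = [list(M[i]) + list(B[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("C_x is singular; a pseudo-inverse would be needed")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        p = aug[col][col]
        aug[col] = [v / p for v in aug[col]]
        for r in range(n):
            if r != col:
                f = aug[r][col]
                aug[r] = [v - f * w for v, w in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]

def fit(X, Y):
    """Return (m_x, m_y, C) from paired samples X[n], Y[n]."""
    m_x, m_y = mean_vector(X), mean_vector(Y)
    C_x = correlation(X, m_x, X, m_x)
    C_xy = correlation(X, m_x, Y, m_y)   # transpose of C_yx
    Z = solve(C_x, C_xy)                 # Z = C_x^{-1} C_xy = C^T (C_x is symmetric)
    C = [list(col) for col in zip(*Z)]
    return m_x, m_y, C

def estimate(x, m_x, m_y, C):
    """Estimate y = m_y + C (x - m_x)."""
    dx = [a - b for a, b in zip(x, m_x)]
    return [my + sum(c * d for c, d in zip(row, dx)) for my, row in zip(m_y, C)]

# Toy data: y is an exactly linear function of x, so it is recovered exactly.
X = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 3.0]]
Y = [[2 * a + 1, a - b, 3 * b] for a, b in X]
m_x, m_y, C = fit(X, Y)
y_hat = estimate([4.0, 5.0], m_x, m_y, C)
```

Calling `fit` once and storing its result corresponds to calculating the values in advance; calling it on each new data set corresponds to calculating them each time.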

ここで、具体的なｘとｙの例について図７を用いて説明する。ｘについては、画像から比較的検出しやすいことが重要であるため、図７（ａ）のマーカ位置の例１１０１に示されるように、大まかな関節位置に取り付けたマーカデータを用いることが望ましい。そして、ｙについては、図７（ｂ）のマーカ位置の例１１０２に示されるように、黒丸で示したｘの関節位置に加えて、白丸で示した関節間部位等に取り付けたマーカデータも含むことができる。ここで、黒丸で示した関節位置を示すマーカと白丸で示した関節間部位に取り付けたマーカとの位置関係を用いれば、おおまかな形状を得ることができる。また、図７（ｃ）に示されるように、ｙについては、ＣＧ等で作成した形状データ１１０３に示されるように、ポリゴンデータを用いることもできる。これによって、モデル変換部１０３は、画像から検出しやすい関節位置のデータから、形状に関わる情報も含めた詳細なデータを推定することが可能である。 Here, specific examples of x and y are described with reference to FIG. 7. Because it is important that x be relatively easy to detect from the image, it is desirable to use marker data attached at rough joint positions, as shown in the marker position example 1101 in FIG. 7A. For y, as shown in the marker position example 1102 in FIG. 7B, marker data attached to inter-joint sites and the like, indicated by white circles, can be included in addition to the joint positions of x, indicated by black circles. Using the positional relationship between the markers at the joint positions (black circles) and the markers attached to the inter-joint sites (white circles), a rough shape can be obtained. Further, as shown in FIG. 7C, polygon data such as the shape data 1103 created by CG or the like can also be used for y. This allows the model conversion unit 103 to estimate detailed data, including shape-related information, from joint position data that is easy to detect from the image.

また、オクルージョンが生じやすい例として、人物や動物等が画像上を横向きに移動している場合がある。このような例では、左半身、もしくは右半身のどちらか一方の情報が得られない事がある。他にも、他の物体によって対象とする関節物体の一部が隠されたり、関節物体の情報の一部が得られない事もある。 In addition, as an example in which occlusion is likely to occur, a person, an animal, or the like may move horizontally on an image. In such an example, information on either the left half or the right half may not be obtained. In addition, some of the target joint objects may be hidden by other objects, or some of the information on the joint objects may not be obtained.

このような状況に備えるためには、図８（ａ）の例に示されるように、ｘについては、マーカ位置の例３２０１の片半身に関するマーカデータを用いて生成し、図８（ｂ）の例に示されるように、ｙについては、マーカ位置の例３２０２の白丸で示すマーカを含む全身に関するマーカデータを生成することによって、片半身の情報から全身の情報を推定することが可能である。 In order to prepare for such a situation, as shown in the example of FIG. 8A, x is generated using marker data relating to one half of the marker position example 3201, and FIG. As shown in the example, for y, it is possible to estimate whole body information from half-body information by generating marker data relating to the whole body including markers indicated by white circles in the marker position example 3202.

なお、ｘ、ｍ_x、Ｃについては、あらかじめテスト画像等で、画像にモデルを当てはめた結果を用いて生成しても良い。さらに、ＣＧの関節位置データを用いても良い。また、ｙについては、ＣＧからポリゴンなどの形状データを含めたデータを用いても良い。さらに、モデル変換の例として、関節物体の体型や動作ごとに、上記ｍ_x、ｍ_y、Ｃを複数組用意しても良いし、オクルージョンが生じやすい例ごとに上記ｍ_x、ｍ_y、Ｃを複数組用意しても良い。 Note that x, m_x, and C may be generated in advance from the results of fitting the model to test images or the like. CG joint position data may also be used. For y, data that includes shape data such as polygons from CG may be used. Further, as examples of model conversion, multiple sets of m_x, m_y, and C may be prepared for each body type or action of the joint object, or for each case in which occlusion is likely to occur.

また、図９（ａ）及び（ｂ）に示されるマーカ位置の例３３０１及び３３０２のように、異なる姿勢間の相関情報を求めておくことによって、モデル変換部１０３は、入力された姿勢情報とは異なる姿勢情報を出力することも可能である。これにより、画像中に存在する関節物体の情報（特性）を反映した上で、他の姿勢に加工した画像の生成を可能とする。さらに、動き情報を含めて相関情報を求めておけば、モデル変換部１０３は、入力された動作とは異なる動作を出力することも可能である。これにより、画像中に存在する関節物体の情報（特性）を反映した上で、他の動きに加工した画像の生成を可能とする。 In addition, by obtaining correlation information between different postures in advance, as in the marker position examples 3301 and 3302 shown in FIGS. 9A and 9B, the model conversion unit 103 can also output posture information different from the input posture information. This makes it possible to generate an image processed into another posture while reflecting the information (characteristics) of the joint object present in the image. Furthermore, if correlation information that includes motion information is obtained, the model conversion unit 103 can also output a motion different from the input motion. This makes it possible to generate an image processed into another motion while reflecting the information (characteristics) of the joint object present in the image.

さらに、図９（ａ）及び（ｂ）に示されるマーカ位置の例３３０１及び３３０２のように、異なる姿勢間の相関情報において、ｘ、ｙそれぞれのベクトルの要素として位置情報に加えて、（数９）、（数１０）のように動き情報を含め、動き情報を含めたｘ、ｙから相関情報を求めておけば、モデル変換部１０３は、入力された動作とは異なる動作を出力することも可能である。これにより、画像中に存在する関節物体の情報（特性）を反映した上で、他の動きに加工した画像の生成を可能とする。さらに、上記相関情報のみならず、後述するように、モデル変換部１０３は、ニューラルネットワークを用いて、ＸとＹの関係を学習しても良い。 Further, as in the marker position examples 3301 and 3302 shown in FIGS. 9A and 9B, if the correlation information between different postures is obtained from x and y whose vector elements include motion information in addition to position information, as in (Equation 9) and (Equation 10), the model conversion unit 103 can also output a motion different from the input motion. This makes it possible to generate an image processed into another motion while reflecting the information (characteristics) of the joint object present in the image. Furthermore, in addition to using the correlation information, the model conversion unit 103 may learn the relationship between X and Y using a neural network, as described later.

ここで、Ｓ２００４のモデル変換によって推定するベクトルｙの一部を、図７（ｂ）のマーカ位置１１０２とした場合は、マーカ位置を基準点としてベジェ曲面等を用いた３次元形状推定を行う。もちろん、他のパラメトリックな曲面を用いて近似しても構わないし、図７（ｃ）の１１０３のようにポリゴンを用いたデータで形状を表現しても構わない。なお、基準点を用いた曲線近似の方法は、栗原恒弥、安生健一「３ＤＣＧアニメーション」、技術評論社、Ｐ３６、２００３等に詳しく記載されている。 Here, when part of the vector y estimated by the model conversion in S2004 is the marker positions 1102 in FIG. 7B, three-dimensional shape estimation is performed using a Bezier surface or the like with the marker positions as reference points. Of course, the shape may be approximated with other parametric surfaces, or expressed by polygon data as indicated by 1103 in FIG. 7C. Curve approximation using reference points is described in detail in, for example, Tsuneya Kurihara and Ken-ichi Anjyo, "3DCG Animation", Gijutsu-Hyohron, p. 36, 2003.

次に、Ｓ２００５にて、モデル変換部１０３は、後処理として、Ｓ２００４で推定した３次元形状を画像に投影する。これによって、関節物体の画像がメッシュ状に区切られた状態となる。ここでは、カメラパラメータが既知である場合の例について述べるが、３次元の実世界座標値を画像に投影できるものであれば良く、画像からカメラパラメータを推定する手法を用いても良い。３次元の実世界座標値を画像に投影する手法としては、徐、辻著、「３次元ビジョン」、９ページ、共立出版、１９９８年発行に詳細が記述されている。カメラパラメータを規定できれば、図１０に示されるように、Ｓ２００４で推定した３次元形状を画像上に投影することができる。なお、このステップＳ２００５は、画像生成部１０４が画像生成の前処理として行ってもよい。 Next, in S2005, the model conversion unit 103 projects the three-dimensional shape estimated in S2004 onto the image as post-processing. As a result, the image of the joint object is divided into a mesh. An example in which the camera parameters are known is described here, but any method that can project three-dimensional real-world coordinates onto the image may be used, including methods that estimate the camera parameters from the image. Techniques for projecting three-dimensional real-world coordinates onto an image are described in detail in Xu and Tsuji, "Three-Dimensional Vision", p. 9, Kyoritsu Shuppan, 1998. If the camera parameters can be specified, the three-dimensional shape estimated in S2004 can be projected onto the image as shown in FIG. 10. Note that step S2005 may instead be performed by the image generation unit 104 as preprocessing for image generation.
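
As an illustration of projecting an estimated three-dimensional point onto the image when the camera parameters are known, the sketch below assumes an ideal pinhole camera with no lens distortion and a point already expressed in camera coordinates. The text does not specify the camera model, so the function `project_point` and the parameters `focal_px`, `cx`, and `cy` are assumptions made for this example.

```python
def project_point(point, focal_px, cx, cy):
    """Project a camera-frame 3-D point (X, Y, Z) to pixel coordinates (u, v)."""
    X, Y, Z = point
    if Z <= 0:
        raise ValueError("point must lie in front of the camera (Z > 0)")
    # Perspective division followed by a shift to the principal point.
    return (focal_px * X / Z + cx, focal_px * Y / Z + cy)

# A mesh vertex 2 m in front of a camera with an 800-pixel focal length.
uv = project_point((0.5, -0.25, 2.0), focal_px=800.0, cx=320.0, cy=240.0)
```

Applying this to every vertex of the estimated shape would yield the mesh on the image; points given in world coordinates would first need a rotation and translation into the camera frame.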

ここで、３次元形状におけるある一点を画像上に投影した位置を（ｘｊ，ｙｊ）とする。そして、各ｊの位置（ｘｊ，ｙｊ）の画素における色情報（Ｒｊ，Ｇｊ，Ｂｊ）とする。ｊは画素である。これによって、推定した３次元位置に対応する画像上での位置や色情報を得ることができる。これによって、パラメータ算出Ｓ２００２で検出した関節間部位をもとに、それぞれのメッシュ領域がどの関節間部位に属するかを得ることができる。この効果として、図１１に示されるように、領域ごとに色分けを行ったり、テクスチャを変えて、関節物体の構成する各部位を明確に目視できる画像を生成することが可能である。 Here, a position where a certain point in the three-dimensional shape is projected on the image is defined as (xj, yj). And it is set as the color information (Rj, Gj, Bj) in the pixel of each j position (xj, yj). j is a pixel. Thereby, position and color information on the image corresponding to the estimated three-dimensional position can be obtained. As a result, it is possible to obtain which inter-joint site each mesh region belongs to based on the inter-joint site detected in the parameter calculation S2002. As an effect of this, as shown in FIG. 11, it is possible to generate an image in which each part of the joint object can be clearly seen by color-coding for each region or changing the texture.

次に、Ｓ２００６では、画像生成部１０４は、Ｓ２００５で推定した関節物体の関節位置、動き、形状に関するパラメータを用いて画像生成を行う。ここでは、画像生成部１０４の画素移動位置計算部１０４１は、推定したベクトルｙの動き情報を用いて、メッシュに区切られた領域ごとに動きを計算する。動きは、アフィン動きを用いても良いし、領域ごとの平均動きベクトルでもよいし、動きを関数近似しても良い。そして、メッシュ状に区切られた領域ごとに得た動き情報を用いて画素を移動させて、新たな画像を生成する。 Next, in S2006, the image generation unit 104 generates an image using the parameters relating to the joint positions, motion, and shape of the joint object estimated in S2005. Here, the pixel movement position calculation unit 1041 of the image generation unit 104 uses the motion information of the estimated vector y to calculate the motion of each mesh-divided region. The motion may be an affine motion, an average motion vector for each region, or a function approximation of the motion. A new image is then generated by moving pixels using the motion information obtained for each mesh-divided region.

ここでは、図１２のように、２枚の時系列画像Ｉ（ｔ）１２０１とＩ（ｔ＋ｎ）１２０２を入力として、Ｉ（ｔ）１２０１とＩ（ｔ＋ｎ）１２０２との間に時間的に内挿する画像１２０３をＮ枚生成する場合について説明するが、入力の枚数を規定するものでは無い。 Here, as shown in FIG. 12, the case is described in which two time-series images I(t) 1201 and I(t+n) 1202 are input and N images 1203 are generated to temporally interpolate between I(t) 1201 and I(t+n) 1202; this does not limit the number of input images.

メッシュで区切られた領域について、領域Ａの動きパラメータをＭａ、領域Ｂの動きパラメータをＭｂとすると、内挿する画像１３０３は、Ｉ（ｔ）１３０１の画像をもとに、補間処理部１０４２及び画素値決定部１０４３が、領域Ａ、Ｂに属するそれぞれの画素を、Ｍａ／（Ｎ＋１），Ｍｂ／（Ｎ＋１）ずつ移動させることによって生成できる。 For the mesh-divided regions, letting Ma be the motion parameter of region A and Mb be the motion parameter of region B, the interpolated image 1303 can be generated from the image I(t) 1301 by having the interpolation processing unit 1042 and the pixel value determination unit 1043 move the pixels belonging to regions A and B in steps of Ma/(N+1) and Mb/(N+1), respectively.
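
The stepwise movement described above can be sketched as follows, assuming the motion parameter of a region is a simple two-dimensional translation such as an average motion vector. Under that assumption, the k-th of the N in-between frames displaces the region's pixels by k times M/(N+1). The function name `interpolated_offsets` is hypothetical.

```python
def interpolated_offsets(motion, n_frames):
    """Displacement of a region's pixels in each of the N in-between frames.

    motion: (dx, dy) of the region between I(t) and I(t+n).
    Returns [k * motion / (N + 1) for k = 1..N].
    """
    step = (motion[0] / (n_frames + 1), motion[1] / (n_frames + 1))
    return [(k * step[0], k * step[1]) for k in range(1, n_frames + 1)]

# Region A moves by (9, -3) pixels between the inputs; two frames are inserted.
offsets_a = interpolated_offsets((9.0, -3.0), n_frames=2)
```

Dividing by N+1 rather than N places the inserted frames evenly between the two inputs, so that one further step would land on I(t+n).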

外挿する場合については、図１３（ａ）に示されるＩ（ｔ）１３０１と図１３（ｃ）に示されるＩ（ｔ＋ｎ）１３０２の間で生成した動き情報をもとに、補間処理部１０４２及び画素値決定部１０４３が、Ｉ（ｔ＋ｎ）１３０２の画像から、領域Ａ、Ｂに属するそれぞれの画素を、Ｍａ／（Ｎ＋１），Ｍｂ／（Ｎ＋１）移動させることによって生成できる。 For extrapolation, the interpolation processing unit 1042 is based on the motion information generated between I (t) 1301 shown in FIG. 13A and I (t + n) 1302 shown in FIG. The pixel value determination unit 1043 can generate the pixels belonging to the areas A and B by moving Ma / (N + 1) and Mb / (N + 1) from the image of I (t + n) 1302.

なお、動きパラメータＭａとしては、平均動きベクトルｕ_A(ave)を、領域Ｂの動きパラメータＭｂとしては、平均動きベクトルをｕ_B(ave)を用いることができる。ただし、動きパラメータは、動きベクトルのみならず、アフィンパラメータ、加速度ベクトル、近似曲線パラメータでも構わない。 Note that the average motion vector u _A (ave) can be used as the motion parameter Ma, and the average motion vector u _B (ave) can be used as the motion parameter Mb of the region B. However, the motion parameter may be not only a motion vector but also an affine parameter, an acceleration vector, or an approximate curve parameter.

この時、領域Ａと領域Ｂが異なる方向に移動するために、図１３(ｄ)及び（ｅ）の画像例１３０４および１３０５のように関節位置および領域境界を中心に領域Ａと領域Ｂが分離したり、重なったりする危険性がある。これについては、以下のように処理を行うと効果的である。 At this time, because region A and region B move in different directions, there is a risk that regions A and B will separate or overlap around the joint position and the region boundary, as in the image examples 1304 and 1305 in FIGS. 13D and 13E. The following processing is effective against this.

まず、画素移動位置計算部１０４１は、隣接する領域ＡおよびＢにおいて、領域Ａの動きパラメータをＭａ領域Ｂの動きパラメータをＭｂとする。次に、各領域に属する画素ごとに、領域境界の画素までの最短距離ｄｉｓｔ_{j_min}を計算する。ここでｊは、画素である。なお、ｄｉｓｔ_{j_min}は、領域境界の重心までの距離でも構わない。 First, for adjacent regions A and B, the pixel movement position calculation unit 1041 takes Ma as the motion parameter of region A and Mb as the motion parameter of region B. Next, for each pixel belonging to each region, it calculates the shortest distance dist_{j_min} to a pixel on the region boundary. Here, j denotes a pixel. Note that dist_{j_min} may instead be the distance to the centroid of the region boundary.

領域Ａに属する画素を例として説明する。 A pixel belonging to region A is described as an example.
次のように、画素移動位置計算部１０４１は、各画素の動きパラメータＭａ＿ｊを決定する。ここでｊは画素である。 The pixel movement position calculation unit 1041 determines the motion parameter Ma_j of each pixel as follows. Here, j denotes a pixel.

同様に、領域Ｂに属する画素については、 Similarly, for pixels belonging to region B,

で表せる。 it can be expressed as above.

なお、非線形関数を利用しても良く、 Note that a nonlinear function may also be used,

のようにすることも可能である。 so that the above form is also possible.

以上のような手法を用いることで、画像生成部１０４は、図１３（ｂ）の内挿画像１３０３のように、関節で接続された部位が分離せず、かつおよび領域どうしが重ならない条件で新たな画像を生成することが可能である。画像生成部１０４は、関節で接続された部位が分離しないように、関節位置を基準とした画素移動を行うことにより、新たな画像を生成することができる。 By using the above technique, the image generation unit 104 can generate a new image under the condition that parts connected by a joint do not separate and regions do not overlap each other, as in the interpolated image 1303 of FIG. 13B. By moving pixels with reference to the joint position so that parts connected by a joint do not separate, the image generation unit 104 can generate a new image.

ここでは、領域境界付近の画素の動き情報がなだらかに変化するような条件であれば良い。 In this case, it is sufficient if the condition is such that the motion information of the pixels near the region boundary changes gently.
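
One way to satisfy the condition that motion changes gently near the region boundary is a distance-weighted blend. The sketch below is a hypothetical choice and not the formula of (Equation 11) to (Equation 14), which is not reproduced here: at the boundary both regions share the mean of Ma and Mb, and beyond an assumed `radius` in pixels a pixel keeps its own region's motion. The function name and the linear weighting are assumptions.

```python
def blended_motion(own, neighbour, dist_to_boundary, radius):
    """Per-pixel motion that varies smoothly across a region boundary.

    own, neighbour: (dx, dy) motion of the pixel's region and the adjacent one.
    dist_to_boundary: shortest distance from the pixel to the region boundary.
    radius: distance beyond which the pixel keeps its own region's motion.
    """
    w = min(max(dist_to_boundary / radius, 0.0), 1.0)
    mean = tuple((a + b) / 2.0 for a, b in zip(own, neighbour))
    return tuple(w * a + (1.0 - w) * m for a, m in zip(own, mean))

Ma, Mb = (4.0, 0.0), (0.0, 2.0)
at_boundary = blended_motion(Ma, Mb, 0.0, radius=8.0)   # shared by both regions
far_inside = blended_motion(Ma, Mb, 20.0, radius=8.0)   # unchanged Ma
```

Because pixels on either side of the boundary receive the same motion there, the two regions neither separate nor overlap at the joint under this weighting.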

もちろん、外挿画像１２０４についても同様である。 Of course, the same applies to the extrapolated image 1204.
また、さらに、画素を移動させることによって、新たに生成した画像の画素が一部欠ける場合があるが、この場合は、近傍画素から補間するか、もしくは、時刻ｔ＋ｎの画像から、時間的に逆向きの時刻ｔの画像を生成し、順方向から生成した画像と逆方向から生成した画像とを用いて画像を生成することも有効である。なお、補間方法としては、バイリニアやバイキュービック法、モルフォロジー処理等を用いることができる。 Further, moving pixels may leave some pixels of the newly generated image missing. In this case, it is effective either to interpolate from neighboring pixels, or to generate an image in the temporally reverse direction, toward time t, from the image at time t+n and then generate the image using both the forward-generated image and the backward-generated image. As the interpolation method, bilinear or bicubic interpolation, morphological processing, or the like can be used.

以上の処理により、本実施の形態における画像生成装置によって、関節を有する関節モデルの当てはめによって得た動きや関節の位置等に関するパラメータと画像から得た形状や服装等に関するパラメータとを用いて、画像中に存在する関節物体の情報（特性）を反映した、新たな画像の生成が可能である。 Through the above processing, the image generation apparatus according to the present embodiment can generate a new image that reflects the information (characteristics) of the joint object present in the image, using the parameters relating to motion, joint positions, and the like obtained by fitting a joint model having joints, together with the parameters relating to shape, clothing, and the like obtained from the image.

また、内挿画像１２０３、外挿画像１２０４と入力画像とを時間順に並べて再生することによって、フレームレートの低い動画像から、よりフレームレートの高い動画像の生成が可能である。 Further, by reproducing the interpolated image 1203, the extrapolated image 1204, and the input image in time order, a moving image with a higher frame rate can be generated from a moving image with a lower frame rate.

（実施の形態２） (Embodiment 2)
次に、本発明の実施の形態２について説明する。図１４は、実施の形態２による処理手順の概略を示す図である。この画像生成装置は、実施の形態１に加えて、生成した画像を評価しながら関節物体の関節位置、形状、服装、動きに関するパラメータを変更することによって、画像中に存在する関節物体の関節の位置や形状、服装、動きに関する情報を反映した、より精度の高い新たな画像の生成を可能とする装置であり、画像入力部１０１、パラメータ算出部１０２、モデル変換部１０３、画像生成部１０４、画像評価部２０１及びパラメータ変更部２０２から構成される。なお、画像入力部１０１、パラメータ算出部１０２、モデル変換部１０３については、実施の形態１と同様であるので、説明は省略する。 Next, a second embodiment of the present invention is described. FIG. 14 is a diagram showing an outline of the processing procedure according to the second embodiment. In addition to what the first embodiment provides, this image generation apparatus changes the parameters relating to the joint positions, shape, clothing, and motion of the joint object while evaluating the generated image, and thereby makes it possible to generate a more accurate new image that reflects information on the joint positions, shape, clothing, and motion of the joint object present in the image. It comprises an image input unit 101, a parameter calculation unit 102, a model conversion unit 103, an image generation unit 104, an image evaluation unit 201, and a parameter change unit 202. The image input unit 101, the parameter calculation unit 102, and the model conversion unit 103 are the same as in the first embodiment, so their description is omitted.

画像生成部１０４では、入力画像Ｉ（ｔ）１２０１をもとに、入力画像Ｉ（ｔ）１２０１とＩ（ｔ＋ｎ）１２０２のから検出したパラメータを用いて、Ｉ（ｔ＋ｎ）に相当する時刻の画像を生成する。これを生成画像Ｉ'（ｔ＋ｎ）とする。 Based on the input image I(t) 1201, the image generation unit 104 generates an image at the time corresponding to I(t+n), using the parameters detected from the input images I(t) 1201 and I(t+n) 1202. This is referred to as the generated image I'(t+n).

画像評価部２０１では、図１５（ａ）〜（ｃ）に示されるように前記入力画像Ｉ（ｔ＋ｎ）３５０１と生成画像Ｉ'（ｔ＋ｎ）３５０２との誤差を計算する。なお、生成画像Ｉ'（ｔ＋ｎ）３５０２は、ポリゴン表示された面ごとにハッチングによって表示したが、実際には、Ｓ２００５で投影した画像上での画素値を用いても良い。 The image evaluation unit 201 calculates the error between the input image I(t+n) 3501 and the generated image I'(t+n) 3502, as shown in FIGS. 15A to 15C. Note that although the generated image I'(t+n) 3502 is drawn with hatching for each polygon face, in practice the pixel values on the image projected in S2005 may be used.

次に、パラメータ変更部２０２では、パラメータ算出部１０２で検出したパラメータである関節位置や動きパラメータを変更する。そして、変更したパラメータに従って、再度、モデル変換部１０３、画像生成部１０４、画像評価部２０１が各処理を行う。ここで、パラメータ変更部２０２では、モデル変換部１０３で推定したパラメータを変更しても良い。 Next, the parameter changing unit 202 changes the joint position and the motion parameter, which are parameters detected by the parameter calculating unit 102. Then, the model conversion unit 103, the image generation unit 104, and the image evaluation unit 201 perform each process again according to the changed parameters. Here, the parameter change unit 202 may change the parameter estimated by the model conversion unit 103.

これら処理を繰り返しながら、誤差が小さくなるパラメータを決定し、この時のパラメータを用いて、画像生成部１０４は、新たな画像を生成する。ここで、処理の繰返しは、必ずしもすべてのパラメータを網羅的に変更する必要は無く、誤差が閾値以下になるまで行ったり、規定回数繰り返したり、処理時間によって決定することができる。 By repeating these processes, a parameter that reduces the error is determined, and the image generation unit 104 generates a new image using the parameter at this time. Here, it is not always necessary to comprehensively change all the parameters, and the processing can be repeated until the error becomes equal to or less than a threshold value, can be repeated a specified number of times, or can be determined according to the processing time.

処理時間によって決定する場合は、フレームレートと新たに生成する画像の枚数を考慮する必要がある。例えば、１０フレーム／秒で入力される画像に対し、フレーム間に２枚の画像を新たに生成することで、３０フレーム／秒の画像列をリアルタイムに生成することを考えた場合、少なくとも０．１秒の間に２枚の画像を生成する必要がある。この場合、一枚あたり０．０５秒で生成する必要があり、このような情報を処理時間の閾値として用いることが可能である。 When the decision is based on processing time, the frame rate and the number of newly generated images must be taken into account. For example, to generate a 30 frames/second image sequence in real time from images input at 10 frames/second by newly generating two images between frames, two images must be generated within 0.1 second. In this case, each image must be generated in 0.05 second, and such a value can be used as the processing time threshold.
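
The arithmetic in this example can be written out as a small helper. The function names `per_image_budget` and `output_fps` are hypothetical, and the sketch assumes the whole input frame interval is available for generating the new images.

```python
def per_image_budget(input_fps, new_images_per_interval):
    """Seconds available to generate each new image for real-time output."""
    frame_interval = 1.0 / input_fps
    return frame_interval / new_images_per_interval

def output_fps(input_fps, new_images_per_interval):
    """Frame rate after inserting the new images between input frames."""
    return input_fps * (new_images_per_interval + 1)

budget = per_image_budget(10, 2)   # 0.1 s interval shared by two images
rate = output_fps(10, 2)
```

The resulting budget could serve as the processing time threshold that ends the repetition for each generated image.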

次に、以上のように構成された本実施の形態の画像生成装置による関節物体の画像生成方法について、図１６のフローチャートを用いて詳細に説明する Next, the method by which the image generation apparatus of the present embodiment configured as above generates an image of a joint object is described in detail with reference to the flowchart of FIG. 16.
Ｓ２１０１からＳ２１０５までは、実施の形態１と同様であるため、説明を省略する。 S2101 to S2105 are the same as in the first embodiment, so their description is omitted.

Ｓ２１０６では、画像生成部１０４は、入力画像Ｉ（ｔ）１４０１をもとに、入力画像Ｉ（ｔ）１４０１とＩ（ｔ＋ｎ）１４０２から検出したパラメータを用いて、Ｉ（ｔ＋ｎ）に相当する時刻の画像を生成する。これをＩ'（ｔ＋ｎ）とする。ここで、Ｉ（ｔ＋ｎ）を目標画像と呼ぶ。 In S2106, based on the input image I(t) 1401, the image generation unit 104 generates an image at the time corresponding to I(t+n), using the parameters detected from the input images I(t) 1401 and I(t+n) 1402. This is denoted I'(t+n). Here, I(t+n) is called the target image.

Ｉ'（ｔ＋ｎ）の生成方法について述べる。ここでは、実施の形態１と同様の手法により、入力画像Ｉ（ｔ）をもとに、Ｓ２１０３とＳ２１０４にて検出したパラメータを用いて、時刻ｔ＋ｎの予測画像Ｉ'（ｔ＋ｎ）を生成する。 A method for generating I ′ (t + n) will be described. Here, a predicted image I ′ (t + n) at time t + n is generated using the parameters detected in S2103 and S2104 based on the input image I (t) by the same method as in the first embodiment.

Ｓ２１０６では、画像評価部２０１は、評価値として、目標画像Ｉ（ｔ＋ｎ）３４０１と予測画像Ｉ'（ｔ＋ｎ）３４０２との誤差を計算する。評価値の計算方法としては、目標画像Ｉ（ｔ＋ｎ）３５０１の画素値と、生成画像Ｉ'（ｔ＋ｎ）３５０２の画素値との差を計算する。目標画像Ｉ（ｔ＋ｎ）と生成画像Ｉ'（ｔ＋ｎ）３５０２とのオーバーラップが多く、かつ画素値が近ければ、目標画像に近いと判断することが望ましい。そこで、次のような評価値を用いることができる。 In S2106, the image evaluation unit 201 calculates, as an evaluation value, the error between the target image I(t+n) 3401 and the predicted image I'(t+n) 3402. The evaluation value is calculated from the difference between the pixel values of the target image I(t+n) 3501 and those of the generated image I'(t+n) 3502. If the target image I(t+n) and the generated image I'(t+n) 3502 overlap substantially and their pixel values are close, the generated image should be judged to be close to the target image. The following evaluation value can therefore be used.

評価値の計算方法としては、 As a method of calculating the evaluation value,

等を用いることができる。もちろん、上式に限らず、目標画像と予測画像との誤差を評価する計算方法であれば良い。 the above or the like can be used. Of course, the calculation is not limited to the above formula; any calculation method that evaluates the error between the target image and the predicted image may be used.
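
Since any calculation that evaluates the error between the target image and the predicted image is acceptable, one simple possibility is the mean absolute difference of pixel values. The sketch below is an assumed example of such a measure and not the evaluation formula of the original. The function name `image_error` and the image representation (lists of rows of RGB tuples) are hypothetical.

```python
def image_error(target, generated):
    """Mean absolute RGB difference between two equally sized images.

    Images are lists of rows, each row a list of (R, G, B) tuples.
    """
    total, count = 0, 0
    for row_t, row_g in zip(target, generated):
        for px_t, px_g in zip(row_t, row_g):
            total += sum(abs(a - b) for a, b in zip(px_t, px_g))
            count += 1
    return total / count

target = [[(10, 10, 10), (200, 0, 0)]]
generated = [[(10, 12, 10), (190, 0, 0)]]
err = image_error(target, generated)
```

A smaller value indicates that the generated image overlaps the target more closely and has more similar pixel values.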

次に、Ｓ２１０７では、画像評価部２０１は、Ｅｒｒ値があらかじめ設定した評価値を満たしているか否かを計算する。ここで、Ｅｒｒ値があらかじめ設定した評価値を満たしていない場合は、現時点において、最も評価値に近い値とその時のＳ２１０４及びＳ２１０５で検出したパラメータとを組として保持する。反対に、Ｅｒｒ値があらかじめ設定した評価値を満たしている場合は、Ｓ２１０９の処理を行う。もちろん、すべてのパラメータを網羅的に処理し、評価値が最も良いパラメータを決定してからＳ２１０９の処理を行っても良い。ただし、リアルタイム処理等、処理時間を考慮する場合は、規定回数繰返したり、規定した処理時間に達するまでとしても良い。この場合は、繰返し処理の中で最も評価値に近いパラメータを選択する。 In step S 2107, the image evaluation unit 201 calculates whether the Err value satisfies a preset evaluation value. If the Err value does not satisfy the preset evaluation value, the value closest to the evaluation value and the parameters detected in S2104 and S2105 at that time are stored as a set. On the other hand, if the Err value satisfies the preset evaluation value, the process of S2109 is performed. Of course, all the parameters may be processed comprehensively, and the process of S2109 may be performed after determining the parameter having the best evaluation value. However, when processing time is considered, such as real-time processing, it may be repeated a specified number of times or until a specified processing time is reached. In this case, the parameter closest to the evaluation value is selected in the repeated processing.

次に、Ｓ２１０８では、パラメータ変更部２０２は、Ｓ２１０３で検出したモデル当てはめの結果を変更する。そして変更されたパラメータを入力として、再度Ｓ２１０４以降を繰り返す。 In step S2108, the parameter changing unit 202 changes the result of model fitting detected in step S2103. Then, the process after S2104 is repeated again with the changed parameter as an input.
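
The repetition of S2104 to S2108 with the three stopping criteria (error threshold, specified number of repetitions, processing time) can be sketched as below. The callables `propose` and `evaluate` are hypothetical placeholders for parameter changing and for image generation plus evaluation; the toy example at the end only demonstrates the control flow.

```python
import time

def refine(initial, propose, evaluate, err_threshold, max_iters, time_budget_s):
    """Search for parameters with low error, stopping on any of three criteria.

    propose(params, i) -> candidate parameters for iteration i (hypothetical).
    evaluate(params)   -> error of the image generated from params.
    """
    best, best_err = initial, evaluate(initial)
    start = time.monotonic()
    for i in range(max_iters):
        if best_err <= err_threshold:
            break                      # error is small enough
        if time.monotonic() - start > time_budget_s:
            break                      # processing time is used up
        cand = propose(best, i)
        cand_err = evaluate(cand)
        if cand_err < best_err:        # keep the best parameters seen so far
            best, best_err = cand, cand_err
    return best, best_err

# Toy example: the "parameter" is one joint coordinate whose true value is 3.0.
best, err = refine(
    0.0,
    propose=lambda p, i: p + 0.5,
    evaluate=lambda p: abs(p - 3.0),
    err_threshold=0.1, max_iters=100, time_budget_s=1.0,
)
```

Keeping the best pair of error and parameters at each step means that a usable result is available whichever criterion ends the loop.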

この時、図１７のような階層表現（関節間部位の接続関係）を用いることも効果的である。これによって、モデル当てはめの結果に対して階層的な接続関係を得ることができる。この効果としては、例えば、胴体のパラメータを先に決定し、次に、左右上腕、左右大腿、頭のように、胴体と接続されている関係を用いて、上位の階層に属する関節位置から順にパラメータを変更することで、効率的にパラメータを変更、決定することができることにある。すなわち、効率的に誤差が小さくなるパラメータを決定することができる。 At this time, it is also effective to use a hierarchical representation (the connection relationships between inter-joint sites) such as that in FIG. 17. This gives a hierarchical connection relationship for the model fitting results. The benefit is that parameters can be changed and determined efficiently: for example, the torso parameters are determined first, and then, using the relationships of the parts connected to the torso, such as the left and right upper arms, the left and right thighs, and the head, the parameters are changed in order starting from the joint positions belonging to the upper levels of the hierarchy. In other words, parameters that reduce the error can be determined efficiently.
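
The top-down order of parameter changes can be sketched as a breadth-first traversal of the part hierarchy. The hierarchy below is a hypothetical example in the spirit of FIG. 17; the part names and the dictionary `CHILDREN` are illustrative and not taken from the figure.

```python
from collections import deque

# Hypothetical part hierarchy (names are illustrative only).
CHILDREN = {
    "torso": ["head", "l_upper_arm", "r_upper_arm", "l_thigh", "r_thigh"],
    "l_upper_arm": ["l_forearm"], "r_upper_arm": ["r_forearm"],
    "l_thigh": ["l_shin"], "r_thigh": ["r_shin"],
}

def update_order(root, children):
    """Breadth-first order: parts nearer the root are adjusted first."""
    order, queue = [], deque([root])
    while queue:
        part = queue.popleft()
        order.append(part)
        queue.extend(children.get(part, []))
    return order

order = update_order("torso", CHILDREN)
```

Fixing the torso first narrows the plausible positions of the parts attached to it, which is one reason such an ordering can reduce the number of parameter combinations that need to be tried.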

そして、Ｓ２１０９では、画像生成部１０４は、上記階層表現（関節間部位の接続関係）を利用した画像生成を行うことができる。上位の階層にある関節間部位の動きパラメータをＭｂ、上位の階層と関節によって接続された下位階層にある関節間部位の動きパラメータをＭａとすると、（数１１）から（数１４）で示した、各画素の動きパラメータは、次式のように書き換えられる。 Then, in S2109, the image generation unit 104 can generate an image using the above hierarchical representation (the connection relationships between inter-joint sites). Letting Mb be the motion parameter of the inter-joint site in the upper level and Ma be the motion parameter of the inter-joint site in the lower level connected to it by a joint, the per-pixel motion parameters given in (Equation 11) to (Equation 14) are rewritten as follows.

これによって、上位階層の関節間部位の動きが支配的になるため、より関節物体の構造を反映した画像を生成することができる。 As a result, the movement of the inter-joint site in the upper hierarchy becomes dominant, and thus an image reflecting the structure of the joint object can be generated.

また、さらに、あらかじめ用意した関節モデルの関節間距離を変更するように、パラメータを変更することも可能である。 Furthermore, it is also possible to change the parameters so as to change the inter-joint distance of the joint model prepared in advance.

その場合には、Ｓ２１０９では、画像生成部１０４は、Ｓ２１０７で最終的に決定したパラメータを用いて画像を生成する。この時の画像生成方法は、実施の形態１におけるＳ２００５と同様であり、時間的に内挿、外挿した画像を生成することもできるし、関節物体の構成する各部位を明確に目視できるように、各部位に異なる色やテクスチャ等を貼り付けた画像を生成することもできる。図１５（ｂ）の画像３５０２に示されるように、画像評価前の画像生成時には、実際の目標画像と一部ずれがあったとしても、（数２２）のような評価に基づいてパラメータを変更することによって、パラメータ変更後の画像３５０３に示されるように、目標画像とのずれを最小限に押さえた画像を生成することができる。これにより、より精度の高い新たな画像を生成することができる。 In that case, in S2109, the image generation unit 104 generates an image using the parameters finally determined in S2107. The image generation method at this time is the same as that in S2005 in the first embodiment, and it is possible to generate temporally interpolated and extrapolated images, and to clearly see each part constituting the joint object. In addition, it is possible to generate an image in which a different color, texture, or the like is pasted on each part. As shown in the image 3502 in FIG. 15B, the parameters are changed based on the evaluation as shown in (Equation 22) even when there is a partial deviation from the actual target image when generating the image before the image evaluation. By doing so, as shown in the image 3503 after the parameter change, it is possible to generate an image in which the deviation from the target image is minimized. Thereby, a new image with higher accuracy can be generated.

以上の処理により、実施の形態１の効果に加えて、生成した画像を評価しながら関節物体の関節位置、形状、服装、動きに関するパラメータを変更することによって、画像中に存在する関節物体の関節の位置や形状、服装、動きに関する情報を反映した、より精度の高い新たな画像の生成が可能となる。 Through the above processing, in addition to the effects of the first embodiment, changing the parameters relating to the joint positions, shape, clothing, and motion of the joint object while evaluating the generated image makes it possible to generate a more accurate new image that reflects information on the joint positions, shape, clothing, and motion of the joint object present in the image.

（実施の形態３） (Embodiment 3)
次に、本発明の実施の形態３について説明する。実施の形態３は、実施の形態１及び２における別の動作例である。つまり、実施の形態３は、実施の形態１および２において、画像から関節モデルのパラメータの一部が抽出不能な場合に、モデル変換を行うことによって、検出不能なパラメータを推定し、推定したパラメータを用いて画像生成を行う動作例に相当する。ここでは、図１に沿って説明する。もちろん、図１４と同様に、生成した画像を目標画像との誤差により評価する画像評価ステップと、その評価結果に基づいて、パラメータの値を変更するパラメータ変更ステップとを追加しても構わない。 Next, a third embodiment of the present invention is described. The third embodiment is another operation example of the first and second embodiments. Specifically, it corresponds to an operation example in which, when some of the parameters of the joint model cannot be extracted from the image in the first or second embodiment, the undetectable parameters are estimated by model conversion and image generation is performed using the estimated parameters. The description here follows FIG. 1. Of course, as in FIG. 14, an image evaluation step that evaluates the generated image by its error from the target image and a parameter change step that changes the parameter values based on the evaluation result may be added.

画像入力部１０１、パラメータ算出部１０２、画像生成部１０４については、実施の形態１と同様であるので、説明は省略する。 Since the image input unit 101, the parameter calculation unit 102, and the image generation unit 104 are the same as those in the first embodiment, description thereof will be omitted.

パラメータ算出部１０２では、あらかじめ用意した関節を有する関節モデル１００１を図５（ｂ）のモデル当てはめ結果１００２のように画像に当てはめることによって、画像中に存在する関節物体の関節の位置を検出する。しかしながら、画像中を左右に横切る場合や、遮蔽物によって一時的に関節物体の一部が隠されてしまう場合においては、関節物体の関節位置やその動きに関する情報の一部を得られない事がある。 The parameter calculation unit 102 detects the joint positions of a joint object existing in the image by fitting a joint model 1001, which has joints prepared in advance, to the image, as in the model fitting result 1002 in FIG. 5B. However, when the object crosses the image from side to side, or when part of the joint object is temporarily hidden by an occluding object, some of the information on the joint positions of the joint object and their movement may not be obtained.

図８を用いて、画像中を人物が歩きながら左右に横切る場合について説明する。この場合、図８（ａ）に示されるマーカ位置の例３２０１の黒丸で示す部位は、画像から検出することが可能であるが、一方、図８（ｂ）に示されるマーカ位置の例３２０２の白丸で示す部位は、動作の途中で右半身の一部が左半身の一部を遮蔽するために、画像から検出不能な場合がある。この場合、パラメータ算出部１０２では、関節モデルが有する関節位置のパラメータをすべて正確に検出することが難しい、また、遮蔽された関節位置は、検出したとしても信頼性が低い。 The case where a person crosses right and left while walking in the image will be described with reference to FIG. In this case, the part indicated by a black circle in the marker position example 3201 shown in FIG. 8A can be detected from the image, while the marker position example 3202 shown in FIG. A part indicated by a white circle may not be detected from the image because a part of the right half shields a part of the left half during the operation. In this case, it is difficult for the parameter calculation unit 102 to accurately detect all the joint position parameters of the joint model, and even if the shielded joint position is detected, the reliability is low.

ここで、（数２）を用いて説明すると、遮蔽によって一部の関節位置が得られない場合は、ベクトルの要素の一部が情報として得られないことになる。この場合は、画像から検出できなかった関節位置を表現するＸ_inputの要素に次のように０を入れることで、モデル変換部１０３は、実施の形態１と同様の計算を行う。

Here, to explain using (Equation 2): when some joint positions cannot be obtained because of occlusion, some elements of the vector are not available as information. In this case, the model conversion unit 103 performs the same calculation as in the first embodiment after putting 0, as follows, into the elements of X_input that represent the joint positions that could not be detected from the image.
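The zero-filling step described above can be sketched as follows. This is a minimal illustration rather than the patent's implementation: the function name `zero_fill` and the use of `None` to mark undetected coordinates are assumptions made for the example.

```python
from typing import List, Optional, Sequence


def zero_fill(x_input: Sequence[Optional[float]]) -> List[float]:
    """Return a copy of x_input with undetected elements (None) set to 0.0."""
    return [0.0 if value is None else float(value) for value in x_input]


# Hypothetical input: (x, y, z) for two joints; the second joint is occluded.
x_input = zero_fill([0.31, 1.42, 0.05, None, None, None])
print(x_input)  # [0.31, 1.42, 0.05, 0.0, 0.0, 0.0]
```

The filled vector can then be passed to the same model conversion as a fully observed vector.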

また、以下の方法を用いることで、検出不能なパラメータを推定することができる。 Moreover, the parameter which cannot be detected can be estimated by using the following method.

人物や動物を例として挙げると、画像中に存在する関節物体の片半身のみの情報しか得ることが出来ない場合がある。 Taking a person or animal as an example, it may be possible to obtain only information about one half of the joint object existing in the image.

このような場合は、モデル変換部１０３は、（数１）の入力ベクトルの集合を、図８（ａ）のマーカ位置の例３２０１の黒丸で示すデータのように、片半身のみのデータを用いて生成し、（数３）の出力ベクトルの集合を、図８（ｂ）のマーカ位置の例３２０２の黒丸と白丸のように全身データとする、もしくは、図８（ｂ）のマーカ位置の例３２０２の白丸のように入力ベクトルとは反対の片半身データを用いて、（数５）〜（数８）のモデル変換行列を生成する。これによって、片半身の入力データから全身の関節位置、もしくはその動きに関する情報を得ることができる。推定誤差を図１８に示す。図１８は、さまざまな動作を時系列で行った場合に、右半身の関節位置の入力情報から、入力とは反対側の左半身における関節位置である、左肩、左肘、左手首、左膝、左足首の位置を推定した場合の誤差平均を示したものである。画像からは得られない情報を５センチ程度の誤差で得られるため、画像生成を目的とした場合は十分な精度である。さらに、出力ベクトルｙに形状情報を含めて推定することも可能である。 In such a case, the model conversion unit 103 generates the set of input vectors of (Equation 1) from data of only one half of the body, like the data indicated by the black circles in the marker position example 3201 in FIG. 8A. For the set of output vectors of (Equation 3), it uses either whole-body data, like the black and white circles in the marker position example 3202 in FIG. 8B, or data for the half of the body opposite to the input vector, like the white circles in the marker position example 3202 in FIG. 8B. It then generates the model conversion matrices of (Equation 5) to (Equation 8). As a result, information on the joint positions of the whole body, or on their movement, can be obtained from input data for one half of the body. The estimation error is shown in FIG. 18. FIG. 18 shows the average error when the positions of the left shoulder, left elbow, left wrist, left knee, and left ankle, which are joint positions on the left half of the body opposite to the input, are estimated from the joint positions of the right half of the body while various motions are performed in time series. Since information that cannot be obtained from the image is obtained with an error of about 5 cm, the accuracy is sufficient for the purpose of image generation. Furthermore, it is also possible to include shape information in the output vector y and estimate it.
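A half-body to whole-body conversion of this kind can be sketched as a linear estimator. The equations (Equation 5) to (Equation 8) are not reproduced in this text, so the sketch assumes the common form y_expected = m_y + C (x - m_x) with C = C_yx C_x^*, where C_x^* is a pseudo-inverse. The function names and the synthetic data are hypothetical stand-ins for motion-capture data.

```python
import numpy as np


def fit_model_conversion(X: np.ndarray, Y: np.ndarray):
    """Fit y ~ m_y + C (x - m_x) from N paired samples.

    X: (N, dx) input vectors (e.g. right-half joint positions).
    Y: (N, dy) output vectors (e.g. left-half or whole-body positions).
    """
    m_x, m_y = X.mean(axis=0), Y.mean(axis=0)
    Xc, Yc = X - m_x, Y - m_y
    C_x = Xc.T @ Xc / len(X)          # input covariance, shape (dx, dx)
    C_yx = Yc.T @ Xc / len(X)         # cross-covariance, shape (dy, dx)
    C = C_yx @ np.linalg.pinv(C_x)    # pseudo-inverse tolerates a singular C_x
    return m_x, m_y, C


def convert(x, m_x, m_y, C):
    """Estimate the output vector for one input vector."""
    return m_y + C @ (x - m_x)


# Synthetic stand-in for training data: Y is an exact affine function of X.
rng = np.random.default_rng(0)
X = rng.normal(size=(200, 6))
A = rng.normal(size=(9, 6))
Y = X @ A.T + 1.5
m_x, m_y, C = fit_model_conversion(X, Y)
y_est = convert(X[0], m_x, m_y, C)
```

With real data the relation is not exactly linear, so the estimate carries an error such as the one reported in FIG. 18.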

このように、片半身のみのデータから、全身のデータを推定するモデル変換は、例えば、関節物体の移動方向が画像上で水平方向に動いている場合に特に有効である。そこで、モデル変換部１０３は、画像入力部１０１でオプティカルフロー等を用いて、物体の移動方向算出を行い、その移動方向が水平方向であれば、このモデル変換手法を選択するようにしても良い。 As described above, model conversion that estimates whole-body data from data of only one half of the body is particularly effective when, for example, the joint object is moving horizontally in the image. Therefore, the moving direction of the object may be calculated in the image input unit 101 using optical flow or the like, and the model conversion unit 103 may select this model conversion method if the moving direction is horizontal.
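The selection rule can be sketched as a check on the mean optical-flow vector. The text does not say how "horizontal" is decided, so the angle threshold, the function name, and the sample flow vectors below are assumptions for illustration only.

```python
import math
from typing import Iterable, Tuple


def is_horizontal_motion(flow: Iterable[Tuple[float, float]],
                         max_angle_deg: float = 20.0) -> bool:
    """Return True if the mean flow vector lies within max_angle_deg of the image x-axis."""
    vectors = list(flow)
    if not vectors:
        return False
    mean_dx = sum(dx for dx, _ in vectors) / len(vectors)
    mean_dy = sum(dy for _, dy in vectors) / len(vectors)
    if mean_dx == 0.0 and mean_dy == 0.0:
        return False  # no net motion, so there is no direction to classify
    angle = math.degrees(math.atan2(abs(mean_dy), abs(mean_dx)))
    return angle <= max_angle_deg


# Hypothetical flow vectors (dx, dy) in pixels per frame, sampled on the object.
walking_right = [(3.0, 0.2), (2.8, -0.1), (3.1, 0.0)]
use_half_body_conversion = is_horizontal_motion(walking_right)
```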

また、肘や膝などの関節位置が抽出できない場合や、図１９に示されるように他の遮蔽物３６０２によって、３６０１のように足首位置等が検出できない場合等、さまざまな状況を想定して、状況に応じたモデル変換行列を用意することで、モデル変換部１０３は、関節位置およびその動きや形状の高精度な推定が可能となる。状況をＳ個想定した場合は、次式のようにモデル変換行列をＳ個用意することになる。

In addition, by assuming various situations, such as the case where joint positions such as elbows and knees cannot be extracted, or the case where an ankle position or the like cannot be detected, as in 3601, because of another occluding object 3602 as shown in FIG. 19, and by preparing a model conversion matrix for each situation, the model conversion unit 103 can estimate the joint positions and their movement and shape with high accuracy. When S situations are assumed, S model conversion matrices are prepared as in the following equation.

以上の処理により、実施の形態１および２の効果に加えて、モデル変換部１０３によって、オクルージョン等で画像から検出困難なパラメータをモデル変換によって推定することで、画像のみからでは抽出不能なパラメータを用いた新たな画像の生成を可能とするものである。 Through the above processing, in addition to the effects of the first and second embodiments, the model conversion unit 103 estimates, by model conversion, parameters that are difficult to detect from the image because of occlusion or the like, which makes it possible to generate a new image using parameters that cannot be extracted from the image alone.

さらに、（数９）（数１０）のように動き情報を用いて、（数２１）で動き情報を含めた相関情報を求めておけば、モデル変換部１０３は、入力された動作とは異なる動作を出力することも可能である。この場合、入力ベクトルｘと出力ベクトルｙをそれぞれ、異なる動作や姿勢に対応するように相関情報を求めておけばよい。これにより、画像中に存在する関節物体の情報（特性）を反映した上で、他の姿勢や動作に加工した画像の生成を可能とする。 Furthermore, if motion information is used as in (Equation 9) and (Equation 10) and correlation information that includes the motion information is obtained in advance with (Equation 21), the model conversion unit 103 can also output a motion different from the input motion. In this case, the correlation information should be obtained so that the input vector x and the output vector y correspond to different motions or postures. This makes it possible to generate an image processed into another posture or motion while reflecting the information (characteristics) of the joint object existing in the image.

これにより、本実施の形態における画像生成装置によって、画像中に存在する関節物体の情報（特性）を反映した上で、他の動きに加工した画像の生成が可能となる。 As a result, the image generating apparatus according to the present embodiment can generate an image processed into another motion while reflecting information (characteristics) of the joint object existing in the image.

（実施の形態４） (Embodiment 4)
次に、本発明の実施の形態４について説明する。図２０は、実施の形態４による処理手順の概略を示す図である。この画像生成装置は、実施の形態１〜３に加えて、画像中に存在する関節物体の情報（特性）を反映した上で、他の形状に加工した画像の生成を可能とする装置であり、画像入力部１０１、パラメータ算出部１０２、モデル変換部１０３、画像生成部１０４及びユーザ設定部３０１から構成される。ここでは、実施の形態１に沿って説明するが、すべての実施の形態で利用可能である。なお、画像入力部１０１、パラメータ算出部１０２、モデル変換部１０３については、実施の形態１と同様であるため、説明は省略する。 Next, a fourth embodiment of the present invention will be described. FIG. 20 is a diagram showing an outline of a processing procedure according to the fourth embodiment. In addition to the functions of the first to third embodiments, this image generation apparatus makes it possible to generate an image processed into another shape while reflecting the information (characteristics) of the joint object existing in the image. It includes an image input unit 101, a parameter calculation unit 102, a model conversion unit 103, an image generation unit 104, and a user setting unit 301. The description here follows the first embodiment, but the approach can be used in all the embodiments. Note that the image input unit 101, the parameter calculation unit 102, and the model conversion unit 103 are the same as those in the first embodiment, and thus description thereof is omitted.

ここでは、モデル変換部１０３にて、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きを入力として、形状情報を含むさらに高精度な関節物体の関節位置等の情報を出力する場合について説明する。 Here, the case is described in which the model conversion unit 103 takes as input the joint positions of the joint object and their movement extracted by the parameter calculation unit 102, and outputs more accurate information, such as the joint positions of the joint object, that includes shape information.

ユーザ設定部３０１は、モデル変換部１０３で得られた形状情報を含むパラメータの一部をユーザが変更する。ここでは、太らせる、痩せさせる、といったパラメータを変更するように画面表示することも可能である。 In the user setting unit 301, the user changes some of the parameters including the shape information obtained by the model conversion unit 103. Here, it is also possible to display the screen so as to change parameters such as fattening or thinning.

画像生成部１０４は、ユーザ設定部３０１で設定した形状情報を含むパラメータとモデル変換部１０３で得たパラメータとを用いた画像生成を行うことで、入力された関節物体の形状、動き、色などに関する情報を反映した上で、形状を変更した画像を生成する。 The image generation unit 104 generates an image using the parameters including the shape information set by the user setting unit 301 and the parameters obtained by the model conversion unit 103, thereby generating an image whose shape has been changed while reflecting information on the shape, movement, color, and the like of the input joint object.

次に、以上のように構成された本実施の形態の画像生成装置による関節物体の画像生成方法について、図２１のフローチャートを用いて詳細に説明する。 Next, the image generation method for a joint object performed by the image generation apparatus of the present embodiment configured as described above will be described in detail with reference to the flowchart of FIG. 21.

Ｓ２２０１からＳ２２０５までは、実施の形態１と同様であるため、説明を省略する。 Since S2201 to S2205 are the same as those in the first embodiment, the description thereof is omitted.
Ｓ２２０４でモデル変換部１０３が推定したパラメータｙは、関節位置やその動きに関する情報に加えて、関節間部位に取り付けたマーカの位置情報も得ることができる。これはすなわち、関節間部位の形状を表現していることになる。 The parameter y estimated by the model conversion unit 103 in S2204 provides, in addition to information on the joint positions and their movement, the position information of the markers attached to the inter-joint sites. In other words, it represents the shape of the inter-joint sites.

Ｓ２２０７では、ユーザ設定部３０１は、特に関節間部位に取り付けられたマーカ位置情報等、形状に関するパラメータを、ユーザからの指示に従って、変更する。人物や動物の場合を例に挙げると、腹の上に取り付けたマーカの位置情報を変更することによって、腹の大きさを変更することができる。 In step S 2207, the user setting unit 301 changes parameters related to the shape, such as marker position information attached to the interarticular site, in accordance with an instruction from the user. Taking the case of a person or animal as an example, the size of the belly can be changed by changing the position information of the marker attached on the belly.

人物や動物の場合を例に挙げると腹の上に取り付けたマーカの位置情報を変更することによって、図２２に示されるように、腹の大きさを変更することができる。図２３にユーザが変更するパラメータの設定画面の例を示す。ユーザが変更可能なパラメータは、例えば、図２３のＡ〜Ｅのように、あらかじめ制御可能な関節間部位位置を決定しておくことが望ましい。具体的には図７（ｂ）のｙのマーカ位置の例１１０２に示されるように、白丸で示した形状に関連するマーカ位置を制御可能な関節間部位位置とすることで、ユーザ設定部３０１は、入力画像の形状を関節間部位ごとに変化させることが可能である。そして、ユーザは、形状パラメータ制御バー３７０２を操作することによって、関節間部位ごとに形状を変化させる。このように、表示装置で表示するようにすることで、ユーザが簡単に形状を変更することができる。 Taking the case of a person or animal as an example, the size of the belly can be changed, as shown in FIG. 22, by changing the position information of a marker attached on the belly. FIG. 23 shows an example of a setting screen for the parameters that the user changes. For the parameters that the user can change, it is desirable to determine the controllable inter-joint site positions in advance, for example as in A to E of FIG. 23. Specifically, as shown in the marker position example 1102 for y in FIG. 7B, by treating the shape-related marker positions indicated by white circles as the controllable inter-joint site positions, the user setting unit 301 can change the shape of the input image for each inter-joint site. The user then changes the shape of each inter-joint site by operating the shape parameter control bar 3702. Displaying the parameters on a display device in this way lets the user change the shape easily.

次に、Ｓ２２０６では、画像生成部１０４は、Ｓ２２０４で推定したパラメータｙと、Ｓ２２０７で変更した形状に関するパラメータとを用いて画像を生成する。図２２（ａ）に示されるように、入力画像１７０１を入力として、Ｓ２２０７で形状に関するパラメータを変更することによって、入力画像の情報を反映した上で、図２２（ｂ）に示されるように、一部の形状を加工もしくは変更した出力画像１７０２を得ることができる。 Next, in S2206, the image generation unit 104 generates an image using the parameter y estimated in S2204 and the shape-related parameters changed in S2207. With the input image 1701 shown in FIG. 22A as input, changing the shape-related parameters in S2207 yields the output image 1702 shown in FIG. 22B, in which part of the shape has been processed or changed while the information of the input image is reflected.

ここで、Ｓ２２０７について図２４を用いて詳しく説明する。Ｓ２２０４にて、図２４（ａ）における形状パラメータ１８０５が得られたとする。この時、画像生成部１０４は、両端の関節位置を端点として、形状パラメータ１８０５の点を制御点としたベジェ曲線を生成する。そして、Ｓ２２０７にて、ユーザ設定部３０１が取得したユーザからの指示に従って、画像生成部１０４は、形状パラメータ１８０５を図２４（ａ）の矢印のように変更する、すなわち、図２３の形状パラメータ制御バー３７０２を操作すると、両端の関節位置と変更された形状パラメータ１８０５の点とを通るベジェ曲線１８０６を生成して、新たな画像を生成する。この時、色情報やテクスチャ情報は、Ｓ２２０５で得た色、テクスチャ情報を用いる。 Here, S2207 will be described in detail with reference to FIG. Assume that the shape parameter 1805 in FIG. 24A is obtained in S2204. At this time, the image generation unit 104 generates a Bezier curve having joint positions at both ends as end points and shape parameter 1805 points as control points. In step S2207, the image generation unit 104 changes the shape parameter 1805 as indicated by the arrow in FIG. 24A in accordance with the instruction from the user acquired by the user setting unit 301, that is, the shape parameter control in FIG. When the bar 3702 is operated, a Bezier curve 1806 passing through the joint positions at both ends and the point of the changed shape parameter 1805 is generated, and a new image is generated. At this time, the color and texture information obtained in step S2205 is used as the color information and texture information.
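The contour generation described here can be sketched with a quadratic Bezier curve whose end points are the two joint positions and whose single control point is the shape parameter. This is an assumption-laden illustration: the text does not state the degree of the curve, and the coordinates below are hypothetical. Note that a quadratic Bezier curve passes through its end points but in general only bends toward its control point, so moving the control point outward bulges the contour.

```python
from typing import List, Tuple

Point = Tuple[float, float]


def quadratic_bezier(p0: Point, control: Point, p2: Point, samples: int = 11) -> List[Point]:
    """Sample B(t) = (1-t)^2 p0 + 2(1-t)t control + t^2 p2 for t in [0, 1]."""
    if samples < 2:
        raise ValueError("samples must be at least 2")
    points = []
    for k in range(samples):
        t = k / (samples - 1)
        a, b, c = (1 - t) ** 2, 2 * (1 - t) * t, t ** 2
        points.append((a * p0[0] + b * control[0] + c * p2[0],
                       a * p0[1] + b * control[1] + c * p2[1]))
    return points


# Hypothetical image coordinates: two joints and one shape marker between them.
hip, knee = (100.0, 200.0), (100.0, 300.0)
original = quadratic_bezier(hip, (120.0, 250.0), knee)
fattened = quadratic_bezier(hip, (140.0, 250.0), knee)  # marker moved outward
```

The region between the sampled contour and the limb axis would then be filled with the color and texture obtained earlier.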

なお、領域の輪郭点すべてをベジェ曲線の制御点とすることも可能であるし、ベジェ曲線の代わりとして、スプライン補間などのパラメトリックに曲線を処理する手法を使うことも可能である。また、図２３の形状パラメータ制御バー３７０２は、図７（ｂ）のマーカ位置の例１１０２に示した白丸の点の位置を変更させることになり、これは、図２４における形状パラメータ１８０５に相当する。 It should be noted that all the contour points of the region can be used as control points of the Bezier curve, and a parametric method such as spline interpolation can be used instead of the Bezier curve. Also, the shape parameter control bar 3702 in FIG. 23 changes the position of the white dot shown in the marker position example 1102 in FIG. 7B, which corresponds to the shape parameter 1805 in FIG. .

以上の処理により、本実施の形態における画像生成装置により、画像中に存在する関節物体の情報（特性）を反映した上で、他の形状に加工した画像の生成が可能となる。 With the above processing, the image generation apparatus according to the present embodiment can generate an image processed into another shape while reflecting information (characteristics) of a joint object existing in the image.

（実施の形態５） (Embodiment 5)
次に、本発明の実施の形態５について説明する。実施の形態５は、実施の形態１〜４における別の動作例である。つまり、実施の形態５では、実施の形態１〜４で説明した処理に加えて、パラメータ算出ステップ、モデル変換ステップで動き情報を用いる動作例に相当する。ここでは、実施の形態１を例として説明するが、すべての実施の形態において適用可能である。また、入力画像は、時系列に並んだ複数枚の画像であることが望ましい。なお、時系列に並んだ画像を３次元に並べた時空間画像としても良い。 Next, a fifth embodiment of the present invention will be described. The fifth embodiment is another operation example of the first to fourth embodiments. That is, it corresponds to an operation example in which, in addition to the processing described in the first to fourth embodiments, motion information is used in the parameter calculation step and the model conversion step. The first embodiment is used as an example here, but the approach is applicable to all the embodiments. The input is preferably a plurality of images arranged in time series. The time-series images may also be treated as a spatio-temporal image in which they are stacked in three dimensions.

図１のパラメータ算出部１０２およびモデル変換部１０３で用いることができる動き情報について説明する。パラメータ算出部１０２では、モデル当てはめ手法を用いて、３次元の関節位置および、角度情報を得ることができる。ここで、時系列画像を入力した場合は、上記に加えて各時刻における動き情報を得ることができる。さらに、モデル変換部１０３においても、関節位置の動き情報に加え、関節間部位の動き情報を得ることができる。以下に、上記動きベクトルに加えて、加速度ベクトルを用いた例を説明する。 The motion information that can be used by the parameter calculation unit 102 and the model conversion unit 103 in FIG. 1 will be described. The parameter calculation unit 102 can obtain three-dimensional joint position and angle information using a model fitting method. Here, when a time-series image is input, motion information at each time can be obtained in addition to the above. Furthermore, in the model conversion unit 103, in addition to the motion information of the joint position, the motion information of the inter-joint site can be obtained. An example using an acceleration vector in addition to the motion vector will be described below.

例えば、Ｔ枚の時系列に並んだ画像を入力とした場合、パラメータ算出部１０２は、各関節位置、もしくは関節間部位位置ｉの動きベクトル｛ΔＸｗｉ（ｔ），ΔＹｗｉ（ｔ），ΔＺｗｉ（ｔ）｝を、Ｔ−１個得ることができる。この時、３枚以上の時系列画像が入力された場合は、次式のように加速度ベクトルｓを得ることができる。なお、ｖは動きベクトルである。

For example, when T images arranged in time series are input, the parameter calculation unit 102 can obtain T-1 motion vectors {ΔXwi(t), ΔYwi(t), ΔZwi(t)} for each joint position or inter-joint site position i. When three or more time-series images are input, the acceleration vector s can be obtained as in the following equation. Note that v is a motion vector.
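The equation for the acceleration vector is not reproduced in this text. A minimal sketch, assuming that the motion vector v is the frame-to-frame difference of positions and that the acceleration vector s is the frame-to-frame difference of motion vectors, is shown below. The helper name and the sample coordinates are hypothetical.

```python
from typing import List, Sequence, Tuple

Vec3 = Tuple[float, float, float]


def differences(seq: Sequence[Vec3]) -> List[Vec3]:
    """Return element-wise differences between consecutive 3D vectors."""
    return [tuple(b[i] - a[i] for i in range(3)) for a, b in zip(seq, seq[1:])]


# Hypothetical world coordinates of one joint in T = 4 consecutive frames.
positions = [(0.0, 1.0, 0.0), (1.0, 1.0, 0.0), (3.0, 1.0, 0.0), (6.0, 1.0, 0.0)]
motion = differences(positions)        # T - 1 motion vectors v
acceleration = differences(motion)     # T - 2 acceleration vectors s
```

This is why at least three frames are needed: two frames give one motion vector, and two motion vectors give one acceleration vector.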

計算した加速度ベクトルを用いた場合の画像生成方法を説明する。実施の形態１におけるＳ２００６の画像生成では、画像中の関節物体が一定速度で動いている場合における、時間的に内挿、外挿する画像の生成例ついて説明した。これに加えて、加速度ベクトルを用いて、実施の形態１における動きパラメータを（ｓ／（Ｎ＋１）＋ｕ）／（Ｎ＋１）とすることで、加速度を加味した時間的に内挿、外挿した画像を生成することが可能である。具体的には、関節物体の動きが急激に早くなったり、急激に止まったりといった場合に、その加速度を反映して、内挿、外挿した画像を生成することが可能となる。 An image generation method using the calculated acceleration vector will be described. In the image generation in S2006 in the first embodiment, an example of generating an image that is temporally interpolated and extrapolated when the joint object in the image is moving at a constant speed has been described. In addition to this, by using the acceleration vector and setting the motion parameter in the first embodiment to (s / (N + 1) + u) / (N + 1), an image that is temporally interpolated and extrapolated taking acceleration into account. Can be generated. Specifically, when the motion of the joint object suddenly increases or stops suddenly, it is possible to generate an interpolated or extrapolated image reflecting the acceleration.

次に、動きベクトルの代わりに、Ｎ次関数をフィッティングした場合について述べる。Ｔ枚の時系列に並んだ画像を入力とした場合、パラメータ算出部１０２は、Ｔ個の関節位置情報や画像上での位置情報に対してＮ次の関数でフィティングすることができる。これにより、フィッティングした関数の値に沿うように、時間的に内挿、外挿した画像を生成することが可能である。具体的には、関数でフィッティングすることによって、より滑らかな動きを表現することが可能となるため、内挿、外挿した画像を用いてより滑らかな動画を生成することが可能となる。 Next, a case where an Nth order function is fitted instead of a motion vector will be described. When T images arranged in time series are input, the parameter calculation unit 102 can perform fitting with respect to T pieces of joint position information and position information on the image using an Nth order function. As a result, it is possible to generate temporally interpolated and extrapolated images so as to follow the value of the fitted function. Specifically, by fitting with a function, it is possible to express a smoother motion, and thus it is possible to generate a smoother moving image using the interpolated and extrapolated images.
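Fitting an N-th order function to a joint trajectory can be sketched with a least-squares polynomial fit. The data values, the choice of degree 2, and the variable names are assumptions for the example; the text does not specify the fitting procedure.

```python
import numpy as np

# Hypothetical x-coordinates of one joint in T = 6 frames (frame index = time).
t = np.arange(6, dtype=float)
x = 0.5 * t**2 + 2.0 * t + 1.0           # stand-in for measured positions

degree = 2                                # the "N" of the N-th order function
coeffs = np.polyfit(t, x, degree)         # least-squares fit, highest power first
trajectory = np.poly1d(coeffs)

x_interp = trajectory(2.5)                # interpolated position between frames 2 and 3
x_extrap = trajectory(6.0)                # extrapolated position one frame ahead
```

Evaluating the fitted function at non-integer times gives the positions for the interpolated frames, and evaluating it beyond the last frame gives the extrapolated ones.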

次に、動きベクトルの代わりに、アフィンパラメータを用いる場合について述べる。Ｔ枚の時系列に並んだ画像を入力とした場合、パラメータ算出部１０２及びモデル変換部１０３は、Ｔ個の関節位置情報や画像上での位置情報を用いて、アフィンパラメータを推定することが可能である。 Next, a case where affine parameters are used instead of motion vectors will be described. When T images arranged in time series are input, the parameter calculation unit 102 and the model conversion unit 103 can estimate affine parameters using the T pieces of joint position information or position information on the image.

ここでは、画像上での位置情報を用いて、パラメータ算出部１０２がアフィンパラメータを推定する例について説明する。時刻ｔにおける位置を（ｘ、ｙ）、時刻ｔ＋１でその画素が移動した先を（ｘ'、ｙ'）とすると、アフィン変換は、次のように表すことができる。

Here, an example will be described in which the parameter calculation unit 102 estimates affine parameters using position information on an image. If the position at time t is (x, y) and the position to which that pixel has moved at time t + 1 is (x', y'), the affine transformation can be expressed as follows.
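The affine equation itself is not reproduced in this text. A common form, assumed here, is x' = a x + b y + c and y' = d x + e y + f; the assignment of the letters a to f to particular terms is an assumption. Under that assumption, the six parameters can be estimated by least squares from three or more point correspondences, as in this sketch with hypothetical points:

```python
import numpy as np


def estimate_affine(src: np.ndarray, dst: np.ndarray) -> np.ndarray:
    """Least-squares estimate of (a, b, c, d, e, f) from K >= 3 point pairs.

    src, dst: (K, 2) arrays of positions at time t and time t + 1.
    """
    ones = np.ones((len(src), 1))
    design = np.hstack([src, ones])                          # rows: [x, y, 1]
    params, *_ = np.linalg.lstsq(design, dst, rcond=None)    # shape (3, 2)
    return params.T.reshape(-1)                              # [a, b, c, d, e, f]


# Hypothetical points on a forearm rotating 10 degrees, plus a small translation.
theta = np.deg2rad(10.0)
R = np.array([[np.cos(theta), -np.sin(theta)],
              [np.sin(theta), np.cos(theta)]])
src = np.array([[1.0, 0.0], [2.0, 0.5], [3.0, -0.5], [4.0, 1.0]])
dst = src @ R.T + np.array([0.5, -0.2])
a, b, c, d, e, f = estimate_affine(src, dst)
```

Because the 2x2 part (a, b, d, e) can encode a rotation, a single set of parameters describes a rotating limb that a single translation vector cannot.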

ここで、アフィンパラメータａ〜ｆを（数９）（数１０）におけるΔｘ、Δｙ、Δｚの代わりとして用いれば、動きベクトルの代わりにアフィンパラメータを用いたモデル変換を行うことができる。これによって、動きパラメータとして、動きベクトルの代わりにアフィンパラメータを用いた時間的に内挿、外挿した画像を生成することができる。アフィンパラメータは、回転運動を含む動きの表現が可能であり、腕や足の回旋運動の表現に適している。 Here, if affine parameters a to f are used in place of Δx, Δy, and Δz in (Equation 9) and (Equation 10), model conversion using affine parameters instead of motion vectors can be performed. As a result, temporally interpolated and extrapolated images using affine parameters instead of motion vectors can be generated as motion parameters. An affine parameter can express a motion including a rotational motion, and is suitable for a rotational motion of an arm or a leg.

特に、回転運動を含む動きに有効である。さらに、実施の形態２の図１７で説明したように、関節モデルを階層的に表現し、それに対しアフィンパラメータを組み合わせることも有効である。階層的に表現すれば、関節間部位の階層的な接続関係を得ることができる。この効果としては、例えば、胴体のパラメータを先に決定し、次に、左右上腕、左右大腿、頭のように、胴体と接続されている関係を用いて、上位の階層に属する関節位置から順にパラメータを変更することで、効率的にパラメータを変更、決定することができることにある。さらに、上位階層の関節間部位の動きが支配的になるため、より関節物体の構造を反映した画像を生成することができる。 This is particularly effective for motion that includes rotation. Furthermore, as described with reference to FIG. 17 in the second embodiment, it is also effective to represent the joint model hierarchically and combine affine parameters with that representation. A hierarchical representation gives the hierarchical connection relationships among the inter-joint sites. One benefit is that the parameters can be changed and determined efficiently: for example, the parameters of the torso are determined first, and then, using the connections to the torso (left and right upper arms, left and right thighs, head), the parameters are changed in order starting from the joint positions that belong to the upper levels of the hierarchy. Moreover, since the movement of the inter-joint sites in the upper levels becomes dominant, an image that better reflects the structure of the joint object can be generated.
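The top-down update order can be sketched as a breadth-first traversal of the part hierarchy. The part names and the tree below are hypothetical, chosen to mirror the torso, upper arm, thigh, and head example in the text.

```python
from collections import deque
from typing import Dict, List

# Hypothetical hierarchy of inter-joint sites: parent -> children.
HIERARCHY: Dict[str, List[str]] = {
    "torso": ["head", "left_upper_arm", "right_upper_arm", "left_thigh", "right_thigh"],
    "left_upper_arm": ["left_forearm"],
    "right_upper_arm": ["right_forearm"],
    "left_thigh": ["left_shin"],
    "right_thigh": ["right_shin"],
}


def update_order(root: str = "torso") -> List[str]:
    """Breadth-first order: every part appears after the part it is attached to."""
    order, queue = [], deque([root])
    while queue:
        part = queue.popleft()
        order.append(part)
        queue.extend(HIERARCHY.get(part, []))
    return order


order = update_order()
```

Determining parameters in this order means each part is adjusted only after the part it hangs from has been fixed.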

以上で説明した動きパラメータの決定方法により、実施の形態１から４で説明した効果に加えて、効率的に高精度な画像を生成可能である。 By the motion parameter determination method described above, in addition to the effects described in the first to fourth embodiments, it is possible to efficiently generate a highly accurate image.

（実施の形態６） (Embodiment 6)
次に、本発明の実施の形態６について説明する。実施の形態６では、実施の形態１〜５におけるモデル変換部１０３に置き換わる別のモデル変換部（あるいは、モデル変換部１０３をより具体化したもの）について詳しく説明する。ここでは、実施の形態１に沿って説明するが、すべての実施の形態において適用可能である。 Next, a sixth embodiment of the present invention will be described. The sixth embodiment describes in detail another model conversion unit that replaces the model conversion unit 103 in the first to fifth embodiments (or a more specific form of the model conversion unit 103). The description here follows the first embodiment, but the approach is applicable to all the embodiments.

実施の形態１では、モデル変換部１０３は、入力と出力の関係を記述したモデル変換データを使用することによって、入力情報から出力情報を推定する。具体的には、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きを入力として、オクルージョン等で得られなかった関節の位置やその動きに関する情報を推定して出力する。さらに、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きを入力として、形状情報を含むさらに高精度な関節物体の関節位置等の情報を出力する。 In Embodiment 1, the model conversion unit 103 estimates output information from input information by using model conversion data describing the relationship between input and output. Specifically, using the joint position and motion of the joint object extracted by the parameter calculation unit 102 as input, information on the joint position and motion that could not be obtained by occlusion or the like is estimated and output. Furthermore, using the joint position and movement of the joint object extracted by the parameter calculation unit 102 as input, information such as the joint position of the joint object with higher accuracy including shape information is output.

このようなモデル変換部１０３の具体例として、まず、図２６に示すニューラルネットワークを用いたモデル変換部１０３ａについて説明する。このモデル変換部１０３ａは、パラメータ算出部１０２から出力されるパラメータを取得する入力部１０３１と、入力部１０３１によって取得されたパラメータを入力ベクトルとして出力ベクトルに変換するニューラルネットワーク１０３２と、ニューラルネットワーク１０３２からの出力ベクトルを画像生成部１０４に出力する出力部１０３３とから構成される。 As a specific example of such a model conversion unit 103, a model conversion unit 103a that uses the neural network shown in FIG. 26 will be described first. The model conversion unit 103a includes an input unit 1031 that acquires the parameters output from the parameter calculation unit 102, a neural network 1032 that takes the parameters acquired by the input unit 1031 as an input vector and converts them into an output vector, and an output unit 1033 that outputs the output vector from the neural network 1032 to the image generation unit 104.

ニューラルネットワーク１０３２については、（数１）〜（数４）に示した入力ベクトルｘと出力ベクトルｙを用いて、誤差逆伝播法によってニューラルネットワークの学習を行っておく。学習が終了すれば、モデル変換部１０３ａは、ｘの入力に対して推定したｙを出力することができる。なお、図２６のニューラルネットワークは３層としたが、階層数を限定するものではない。また、ニューラルネットワーク１０３２について、予め学習させておくのではなく、学習させながら使用してもよい。 For the neural network 1032, the neural network is learned by the error back propagation method using the input vector x and the output vector y shown in (Equation 1) to (Equation 4). When the learning is completed, the model conversion unit 103a can output y estimated with respect to the input of x. Although the neural network in FIG. 26 has three layers, the number of layers is not limited. Further, the neural network 1032 may be used while learning, instead of learning in advance.

学習過程においては、具体的には、入力ベクトルｘと出力ベクトルｙの組を用いて、図２６に示されるように階層間の結合荷重ｗを変更する。

In the learning process, specifically, the connection weights w between layers are updated, as shown in FIG. 26, using pairs of the input vector x and the output vector y.

ここで、ｚは各階層における出力値、ｉはニューラルネットワークの素子番号、ｌは、階層番号を示す。なお、

ここで、ｆ（ｘ）はシグモイド関数、ｎ^lは、階層ｌの素子数である。 Here, z is an output value in each layer, i is an element number of the neural network, and l is a layer number. In addition,

Here, f(x) is a sigmoid function, and n^l is the number of elements in layer l.

上記、（数２４）〜（数２６）を用いて繰返し学習を行うことにとって、結合荷重ｗを得ることができる。結合荷重ｗを得ることができれば、モデル変換部１０３ａにより、入力ベクトルｘに対して、出力ベクトルｙを得ることができる。 By performing iterative learning using (Equation 24) to (Equation 26) above, the connection weights w can be obtained. Once the connection weights w are obtained, the model conversion unit 103a can produce the output vector y for an input vector x.

これにより、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きを入力ベクトルｘとして、モデル変換部１０３ａは、オクルージョン等で得られなかった関節の位置やその動きに関する出力ベクトルｙを推定することができる。さらに、パラメータ算出部１０２で抽出された関節物体の関節位置やその動きを入力ベクトルｘとして、形状情報を含むさらに高精度な関節物体の関節位置等に関する出力ベクトルｙを推定することが可能となる。なお、誤差逆伝播法については、アービブ「ニューラルネットと脳理論」、ｐ４３７、サイエンス社、１９９２年に詳しく説明されている。 As a result, taking the joint positions of the joint object and their movement extracted by the parameter calculation unit 102 as the input vector x, the model conversion unit 103a can estimate an output vector y for the joint positions and movement that could not be obtained because of occlusion or the like. Furthermore, with the same input vector x, it can estimate a more accurate output vector y, including shape information, for the joint positions of the joint object and the like. The error back-propagation method is described in detail in Arbib, “Neural Nets and Brain Theory”, p. 437, Saiensu-sha, 1992.
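The learning procedure can be sketched as a three-layer sigmoid network trained by error back-propagation. The update equations (Equation 24) to (Equation 26) are not reproduced in this text, so the sketch uses the standard gradient-descent form on a squared error. The layer sizes, learning rate, and synthetic data are assumptions, and the targets are assumed to be scaled into (0, 1) because the output layer is also a sigmoid.

```python
import numpy as np


def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))


def train(X, Y, hidden=8, epochs=1000, lr=0.5, seed=0):
    """Train an input-hidden-output sigmoid network by error back-propagation.

    X: (N, dx) inputs, Y: (N, dy) targets scaled into (0, 1).
    Returns the weights and the mean squared error at the first and last epoch.
    """
    rng = np.random.default_rng(seed)
    W1 = rng.normal(scale=0.5, size=(X.shape[1], hidden))
    b1 = np.zeros(hidden)
    W2 = rng.normal(scale=0.5, size=(hidden, Y.shape[1]))
    b2 = np.zeros(Y.shape[1])
    errors = []
    for _ in range(epochs):
        z1 = sigmoid(X @ W1 + b1)              # hidden-layer outputs
        z2 = sigmoid(z1 @ W2 + b2)             # output-layer outputs
        errors.append(float(np.mean((z2 - Y) ** 2)))
        d2 = (z2 - Y) * z2 * (1 - z2)          # output delta (sigmoid derivative)
        d1 = (d2 @ W2.T) * z1 * (1 - z1)       # delta propagated back to hidden layer
        W2 -= lr * z1.T @ d2 / len(X)
        b2 -= lr * d2.mean(axis=0)
        W1 -= lr * X.T @ d1 / len(X)
        b1 -= lr * d1.mean(axis=0)
    return (W1, b1, W2, b2), errors[0], errors[-1]


# Synthetic stand-in for (x, y) training pairs.
rng = np.random.default_rng(1)
X = rng.uniform(-1, 1, size=(100, 4))
Y = sigmoid(X @ rng.normal(size=(4, 2)))       # targets already inside (0, 1)
weights, err_first, err_last = train(X, Y)
```

For real joint coordinates, the outputs would need to be rescaled to and from the (0, 1) range, or the output layer replaced by a linear one.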

次に、実施の形態１で説明した相関情報を複数の相関情報の線形和で表現する方法について説明する。図２７は、複数の相関情報の線形和を相関情報とするモデル変換部１０３ｂの構成を示す機能ブロック図である。このモデル変換部１０３ｂは、複数の相関情報を保持する相関情報記憶部１０３５と、パラメータ算出部１０２から出力されたパラメータを取得する入力部１０３１と、入力部１０３１で取得されたパラメータに対して、相関情報記憶部１０３５に保持された複数の相関情報の線形和を用いて変換する変換部１０３４と、変換部１０３４で得られたパラメータを画像生成部１０４に出力する出力部１０３３とから構成される。なお、相関情報記憶部１０３５は、関節物体の体型や動作ごとに、（数１）〜（数１０）で説明したｍ_x、ｍ_y、Ｃを複数組保持しても良いし、オクルージョンが生じやすい例ごとに、ｍ_x、ｍ_y、Ｃを複数組保持しても良い。 Next, a method of expressing the correlation information described in the first embodiment as a linear sum of a plurality of pieces of correlation information will be described. FIG. 27 is a functional block diagram showing the configuration of a model conversion unit 103b that uses a linear sum of a plurality of pieces of correlation information as its correlation information. The model conversion unit 103b includes a correlation information storage unit 1035 that holds a plurality of pieces of correlation information, an input unit 1031 that acquires the parameters output from the parameter calculation unit 102, a conversion unit 1034 that converts the parameters acquired by the input unit 1031 using a linear sum of the pieces of correlation information held in the correlation information storage unit 1035, and an output unit 1033 that outputs the parameters obtained by the conversion unit 1034 to the image generation unit 104. The correlation information storage unit 1035 may hold multiple sets of m_x, m_y, and C, described in (Equation 1) to (Equation 10), one set for each body type or motion of the joint object, or one set for each case in which occlusion is likely to occur.

ここでは、図２５を用いて関節物体の体型ごとにＰ個の相関情報を用いた例について説明する。図２５の相関情報Ｃ¹〜Ｃ^pは、それぞれ体型別相関行列を生成する例である。それぞれの体型を持った関節物体ごとに、モーションキャプチャデータで得た３次元位置情報や動き情報を用いて入力ベクトルｘと出力ベクトルｙとを得る。そして、それぞれのｘとｙとを用いて相関行列Ｃを生成する。 Here, an example that uses P pieces of correlation information, one for each body type of the joint object, will be described with reference to FIG. 25. The correlation information C^1 to C^p in FIG. 25 is an example in which a correlation matrix is generated for each body type. For each joint object having a given body type, an input vector x and an output vector y are obtained using the three-dimensional position information and motion information obtained from motion capture data. A correlation matrix C is then generated from each x and y.

そして、（数５）は体型ごとに用意したＮ^p組の入力ベクトルｘを用いて、以下のように書き換えることができる。

(Equation 5) can be rewritten as follows using N^p sets of input vectors x prepared for each body type.

また、（数６）は、体型ごとに用意したＮ^p組の入力ベクトルｘと出力ベクトルｙを用いて、以下のように書き換えることができる。

(Equation 6) can be rewritten as follows using N^p sets of input vectors x and output vectors y prepared for each body type.

そして、（数７）は、（数２７）と（数２８）より、 Then, from (Equation 27) and (Equation 28), (Equation 7) can be expressed as follows.

で表すことができる。ここで、Ｃ_x ^p*はＣ_x ^pの逆行列、または疑似逆行列である。 Here, C_x^{p*} is the inverse matrix or the pseudo-inverse matrix of C_x^p.

そして、関節物体の体型ごとに推定するｙ^p _expectedは、モデル変換行列を用いて次の式で表すことができる。

The estimate y^p_expected for each body type of the joint object can then be expressed by the following equation using the model conversion matrix.

そして、最終的に推定したい出力ベクトルｙは、Ｐ個の推定結果の線形和によって表現することができる。

The output vector y to be finally estimated can be expressed by a linear sum of P estimation results.

ここで、 Here,

である。α_k，ｍ_x ^k，Ｃ_x ^kは、ＥＭアルゴリズムで推定することができる。 α_k, m_x^k, and C_x^k can be estimated with the EM algorithm.

ＥＭアルゴリズムについては、上田修功「ベイズ学習Ｉ−統計的学習の基礎―」、電子情報通信学会誌、Ｖｏｌ．８５Ｎｏ．４ｐｐ．２６５−２７１，２００２に詳しく記載されている。 The EM algorithm is described in detail in Naonori Ueda, “Bayesian Learning I: Basics of Statistical Learning”, Journal of the IEICE, Vol. 85, No. 4, pp. 265-271, 2002.
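The linear sum of the P per-body-type estimates can be sketched as follows. The weighting equation is not reproduced in this text. The sketch assumes that each weight is the normalized Gaussian-mixture responsibility α_k N(x; m_x^k, C_x^k), which is consistent with α_k, m_x^k, and C_x^k being estimated by EM but is not confirmed by the source. The dictionary keys, function names, and the two toy models are hypothetical.

```python
import numpy as np


def gaussian_density(x, mean, cov):
    """Multivariate normal density N(x; mean, cov)."""
    d = x - mean
    norm = np.sqrt((2 * np.pi) ** len(x) * np.linalg.det(cov))
    return float(np.exp(-0.5 * d @ np.linalg.solve(cov, d)) / norm)


def blend_estimates(x, models):
    """Combine per-body-type linear estimates with normalized mixture weights."""
    weights = np.array([m["alpha"] * gaussian_density(x, m["m_x"], m["C_x"])
                        for m in models])
    weights = weights / weights.sum()
    estimates = [m["m_y"] + m["C"] @ (x - m["m_x"]) for m in models]
    return sum(w * y for w, y in zip(weights, estimates)), weights


# Two toy body-type models with 2D inputs and 1D outputs.
slim = dict(alpha=0.5, m_x=np.array([0.0, 0.0]), C_x=np.eye(2),
            m_y=np.array([1.0]), C=np.array([[1.0, 0.0]]))
heavy = dict(alpha=0.5, m_x=np.array([4.0, 0.0]), C_x=np.eye(2),
             m_y=np.array([3.0]), C=np.array([[1.0, 0.0]]))
y, w = blend_estimates(np.array([2.0, 0.0]), [slim, heavy])
```

An input that lies halfway between the two body types receives equal weights, so the output is the average of the two per-type estimates.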

このように、複数の体型ごとに相関行列を求めておき、それを相関情報記憶部１０３５が保持し、変換部１０３４がその線形和を用いてモデル変換を行うことで、モデル変換部１０３ｂは、さまざまな体型に対しても、複数の体型の線形和でモデル変換を行える効果を奏する。 In this way, a correlation matrix is obtained for each of a plurality of body types and held in the correlation information storage unit 1035, and the conversion unit 1034 performs model conversion using their linear sum. The model conversion unit 103b can therefore perform model conversion for a variety of body types as a linear sum over a plurality of body types.

以上により、実施の形態１〜５の効果に加えて、対象とする関節物体の体型のバリエーションに頑健となる。 As described above, in addition to the effects of the first to fifth embodiments, the apparatus becomes robust to variations in the body type of the target joint object.

以上、本発明に係る画像生成装置について、実施の形態及び変形例に基づいて説明したが、本発明は、これらの形態や例に限定されるものではない。各実施の形態や変形例における構成要素を適宜組み合わせて実現される別の形態や、各実施の形態に対して当業者が思いつく変形を施して得られる形態も本発明に含まれる。 As described above, the image generation apparatus according to the present invention has been described based on the embodiments and the modifications. However, the present invention is not limited to these forms and examples. Other forms realized by appropriately combining the constituent elements in each embodiment and modification, and forms obtained by subjecting each embodiment to modifications conceived by those skilled in the art are also included in the present invention.

The correspondence between the constituent elements in the claims and those in the embodiments is as follows. Examples of the “image input means”, “parameter calculation means”, “model conversion means”, “image generation means”, “image evaluation means”, and “parameter changing means” in the claims are, respectively, the image input unit 101, the parameter calculation unit 102, the model conversion unit 103, the image generation unit 104, the image evaluation unit 201, and the parameter changing unit 202 in the embodiments. However, the constituent elements in the claims are not limited to the corresponding constituent elements in these embodiments and also include their equivalents.

The present invention is useful as an image generation device, in particular as a device that generates an image of a joint object such as a person or an animal by image processing. Examples include: a device that generates a new image reflecting the characteristics of a joint object present in an image by using parameters relating to the motion of the joint object and the positions of its joints together with parameters relating to shape, clothing, and the like; an animation generation device; a video complementing device that improves accuracy by complementing video shot with a digital camera, a camera-equipped mobile phone, a video device, or the like; and a device that generates still images and moving images for games, movies, and computer graphics.

Diagram showing the configuration of the image generation device in Embodiment 1 of the present invention
Diagram showing the detailed configuration of the parameter calculation unit
Diagram showing the detailed configuration of the image generation unit
Flowchart showing the operation of the image generation device
Diagram showing a joint model
Diagram showing an example of a database indicating the positions of the head, hands, and feet
Diagram showing input data and output data
Diagram showing examples of input and output vectors
Diagram showing examples of input and output vectors
Diagram showing an example of projecting three-dimensional information onto an image
Diagram showing an example of image generation
Diagram showing examples of interpolated and extrapolated images
Diagram showing an image generation method
Diagram showing the configuration of the image generation device in Embodiment 2 of the present invention
Diagram explaining the operation of the image evaluation unit
Flowchart showing the operation of the image generation device
Diagram showing an example of a hierarchical representation of a joint model
Diagram showing joint position estimation results according to Embodiment 3 of the present invention
Diagram showing an example in which part of an image is occluded
Diagram showing the configuration of the image generation device in Embodiment 4 of the present invention
Flowchart showing the operation of the image generation device
Diagram showing an example of image generation
Diagram showing an example of a parameter setting screen
Diagram showing an image generation method
Diagram showing an example of a correlation matrix according to Embodiment 6 of the present invention
Diagram showing another configuration example of the model conversion unit
Diagram showing another configuration example of the model conversion unit

Explanation of symbols

101 Image input unit
102 Parameter calculation unit
103, 103a, 103b Model conversion unit
104 Image generation unit
201 Image evaluation unit
202 Parameter changing unit
301 User setting unit
1021 Joint object region extraction unit
1022 Model fitting unit
1023 Inter-joint part position calculation unit
1031 Input unit
1032 Neural network
1033 Output unit
1034 Conversion unit
1035 Correlation information storage unit
1041 Pixel movement position calculation unit
1042 Interpolation processing unit
1043 Pixel value determination unit

Claims

An image generation device that generates, from an image obtained by imaging a joint object, a new image reflecting the characteristics of the joint object, the image generation device comprising:
image input means for acquiring an image obtained by imaging the joint object;
parameter calculation means for calculating a first parameter relating to the position of a joint or an inter-joint part of the joint object by fitting a model having joints, held in advance, to the joint object in the acquired image;
region dividing means for extracting a second parameter relating to at least one of color and texture information of the inter-joint part of the joint object by dividing the image of the joint object into regions based on the first parameter; and
image generation means for generating a new image reflecting the characteristics of the joint object by using the first parameter calculated by the parameter calculation means and the second parameter extracted by the region dividing means.

The image generation device according to claim 1, wherein the image generation means generates, as the new image reflecting the characteristics of the joint object, an image reflecting the position of the joint or the inter-joint part of the joint object and the color and texture information of the inter-joint part of the joint object.

The image generation device according to claim 1, wherein the image input means acquires temporally continuous images, and
the parameter calculation means calculates, using the images, a first parameter relating to the position and motion of a joint or an inter-joint part of the joint object.

The image generation device according to claim 1, further comprising:
image evaluation means for evaluating the image generated by the image generation means by calculating an error between that image and a target image; and
parameter changing means for changing the first parameter based on the evaluation result of the image evaluation means,
wherein the region dividing means performs the region division based on the first parameter changed by the parameter changing means, and
the image generation means generates the image using the first parameter changed by the parameter changing means and the second parameter extracted by the region dividing means.

The image generation device according to claim 1, further comprising model conversion means for estimating, using the first parameter, a third parameter relating to shape information of the inter-joint part of the joint object and to the position and motion of a joint not included in the first parameter,
wherein the image generation means generates the new image reflecting the characteristics of the joint object by using the first parameter, the second parameter, and the third parameter.

The image generation device according to claim 3, wherein the image generation means generates, as the new image, an image temporally interpolated and extrapolated with respect to the temporally continuous images based on motion information included in the first parameter.

The image generation device according to claim 1, wherein the image generation means generates, as the new image, an image in which a different color or texture is pasted on each part constituting the joint object.

The image generation device according to claim 1, wherein the image generation means generates, as the new image, an image in which the texture of the inter-joint part of the joint object is pasted onto an image of a joint object in a posture or motion different from the posture or motion of the joint object.

The image generation device according to claim 1, wherein the parameter calculation means includes joint object region extraction means for extracting a region of the joint object from the image, and performs the fitting on the extracted region.

The image generation device according to claim 9, wherein the joint object region extraction means extracts the region by performing edge extraction on the image.

The image generation device according to claim 3, further comprising periodicity detecting means for detecting the period of the motion of the joint object present in the images acquired by the image input means,
wherein the parameter calculation means, the region dividing means, and the image generation means respectively calculate the first parameter, extract the second parameter, and generate the image for the time-series images of one period detected by the periodicity detecting means.

The image generation apparatus according to claim 3, wherein the motion is represented by any one of a motion vector, an acceleration vector, an affine parameter, and an approximate curve parameter.

The image generation device according to claim 1, wherein the region dividing means calculates the second parameter using, as an initial value, the position information of the joint or the inter-joint part included in the first parameter.

The image generation device according to claim 1, wherein the image generation means generates the new image by moving pixels based on the joint positions so that parts connected by a joint are not separated.

The image generation device according to claim 5, wherein the model conversion means obtains correlation information between the first parameter and the third parameter in advance and estimates the third parameter based on the correlation information.

The image generation device according to claim 5, wherein, when part of the first parameter cannot be extracted by the parameter calculation means, the model conversion means estimates the third parameter by estimating the value of the parameter that could not be extracted.

The image generation device according to claim 5, wherein the parameter calculation means calculates the first parameter expressed hierarchically based on the connection relationships between the joint parts of the joint object, and
the model conversion means estimates the third parameter expressed hierarchically based on the connection relationships between the joint parts of the joint object.

An image generation method for generating, from an image obtained by imaging a joint object, a new image reflecting the characteristics of the joint object, the image generation method comprising:
an image input step of acquiring an image obtained by imaging the joint object;
a parameter calculation step of calculating a first parameter relating to the position or motion of a joint or an inter-joint part of the joint object by fitting a model having joints, held in advance, to the joint object in the acquired image;
a region dividing step of extracting a second parameter relating to at least one of color and texture information of the inter-joint part of the joint object by dividing the image of the joint object into regions based on the first parameter; and
an image generation step of generating a new image reflecting the characteristics of the joint object by using the first parameter calculated in the parameter calculation step and the second parameter extracted in the region dividing step.

A program for generating, from an image obtained by imaging a joint object, a new image reflecting the characteristics of the joint object,
the program causing a computer to execute the steps included in the image generation method according to claim 18.