JP6207210B2 - Information processing apparatus and method

Info

Publication number: JP6207210B2
Application number: JP2013086915A
Authority: JP
Inventors: 要冨手; 康生片野; 優和真継
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2013-04-17
Filing date: 2013-04-17
Publication date: 2017-10-04
Anticipated expiration: 2033-04-17
Also published as: JP2014211719A

Description

The present invention relates to information processing for generating a face model with an arbitrary facial expression.

In film production and entertainment, a technique called performance capture, which records an actor's facial expressions and makes a virtual character reproduce them, is being actively researched.

In recent years, higher-resolution image sensors combined with faster computers have led to a proposed method that captures even the fine surface relief of a specific person, such as pores and wrinkles, and later changes that person's facial expression arbitrarily (see Patent Document 1).

The facial expression generation method disclosed in Patent Document 1 can faithfully reproduce the fine deformation of pores, wrinkles, and the like that accompanies the coarse deformation of the face. The technique of Patent Document 1 first acquires the coarse deformation and the fine deformation using different data acquisition methods.

The coarse deformation data is acquired by attaching markers to the subject's face and applying a general motion capture technique. The coarse deformation of the face is represented by a sparse three-dimensional point cloud measured at the marker points and a polygon mesh connecting those points.

The fine deformation, on the other hand, is acquired using photometric stereo and dense optical flow. The acquired fine deformation is represented as a displacement map.

A displacement map is a technique that uses a grayscale or RGB image to define heights relative to the original shape, so that fine relief can be expressed with a small number of polygons. A displacement map expresses the relief by moving the vertices of the 3D model up and down relative to the actual surface.
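The idea of a displacement map can be sketched in a few lines. This is a minimal illustration, not the method of Patent Document 1: it assumes that each vertex already has a unit normal and a height value sampled from the map, and that the vertex is moved along its normal. The function name `displace` and the `scale` parameter are hypothetical.

```python
def displace(vertices, normals, heights, scale=1.0):
    """Move each vertex along its unit normal by scale * height.

    Assumption: one height sample per vertex, already looked up in the
    grayscale map; positive values raise the surface, negative values lower it.
    """
    return [
        tuple(v[i] + scale * h * n[i] for i in range(3))
        for v, n, h in zip(vertices, normals, heights)
    ]


vertices = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
heights = [0.5, -0.25]  # illustrative samples: a bump and a small groove
displaced = displace(vertices, normals, heights, scale=0.1)
```

Because only the height image changes between expressions, the same low-polygon mesh can carry different fine relief.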

In addition, the maximum intensity of the expressions that the subject can produce must be registered in the system. For this reason, during capture the subject is recorded expressing each of the seven basic emotions (joy, sadness, anger, resignation, surprise, disgust, fear) at maximum intensity.

In the facial expression generation apparatus disclosed in Patent Document 1, after the data has been acquired, the system user applies an arbitrary deformation to the CG object of the subject. In response to the applied deformation, the apparatus estimates the change in the fine surface relief from a plurality of displacement maps and produces an arbitrary expression.

The expression generation method disclosed in Patent Document 1 can generate arbitrary expressions, including changes in fine surface relief, but only for the captured subject. It cannot take the captured information on the coarse and fine deformation of the subject's face (hereinafter, expression variation), apply it to another person's CG object, and produce an arbitrary expression on that other person's face.

Patent Document 1: US Patent Application Publication No. 2009/0195545
Patent Document 2: US Patent No. 7,548,272
Patent Document 3: Japanese Patent Laid-Open No. 2001-076177

Non-Patent Document 1: Yang Chen, Gerard Medioni, "Object modelling by registration of multiple range images", Image and Vision Computing, Vol. 10, No. 3, pp. 145-155, April 1992
Non-Patent Document 2: Thibaut Weise, Sofien Bouaziz, Hao Li, Mark Pauly, "Realtime Performance-Based Facial Animation", ACM Transactions on Graphics, Proceedings of the 38th ACM SIGGRAPH 2011, August 2011
Non-Patent Document 3: J. Ahlberg, "CANDIDE-3 -- an updated parameterized face", Report No. LiTH-ISY-R-2326, Dept. of Electrical Engineering, Linkoping University, Sweden, 2001
Non-Patent Document 4: Yeongho Seol et al., "Spacetime Expression Cloning for Blendshapes", ACM Transaction On Graphics, 31(2), pp. 14:1-14:12, 2012
Non-Patent Document 5: Z. Deng, P. Y. Chiang, P. Fox, U. Neumann, "Animating Blendshape Faces by Cross-Mapping Motion Capture Data", ACM SIGGRAPH Symposium on Interactive 3D Graphics and Games 2006 (I3DG)
Non-Patent Document 6: Richard S. Sutton, Andrew G. Barto, "Reinforcement Learning: An Introduction", The MIT Press, 1998
Non-Patent Document 7: Masatoshi Okutomi, Takeo Kanade, "A Multiple-Baseline Stereo", IEEE Transactions on Pattern Analysis and Machine Intelligence, Vol. 15, No. 4, April 1993
Non-Patent Document 8: Iain Matthews, Simon Baker, "Active Appearance Models Revisited", International Journal of Computer Vision, Vol. 60, No. 2, pp. 135-164, November 2004
Non-Patent Document 9: Beier T., Neely S., "Feature-based image metamorphosis", Computer Graphics 1992, Vol. 26(2), pp. 35-42
Non-Patent Document 10: J. Daugman, "Entropy reduction and decorrelation in visual coding by oriented neural receptive fields", Trans. on Biomedical Engineering, Vol. 36, No. 1, pp. 107-114, 1989
Non-Patent Document 11: T. Ojala, M. Pietikainen, D. Harwood, "Performance evaluation of texture measures with classification based on Kullback discrimination of distributions", Proceedings of the 12th IAPR International Conference on Pattern Recognition (ICPR 1994), Vol. 1, pp. 582-585
Non-Patent Document 12: P. Viola, M. Jones, "Rapid Object Detection using a Boosted Cascade of Simple Features", IEEE Conference on Computer Vision and Pattern Recognition, 2001
Non-Patent Document 13: "Computer Graphics", CG-ARTS Association, pp. 146-149, 2004
Non-Patent Document 14: M. O'Rourke, "Principles of Three-Dimensional Computer Animation" (Japanese translation by Kenkichi Fukuroya and Atsushi Okubo), Kindai Kagaku-sha, pp. 94-98, 1997
Non-Patent Document 15: David J. Fleet, Yair Weiss, "Optical Flow Estimation", in Paragios et al., Handbook of Mathematical Models in Computer Vision, Springer, 2006

An object of the present invention is to generate, on the basis of an individual's face skeleton model, a variation model for generating a face model that gives another person's face an arbitrary expression.

As one means of achieving the above object, the present invention has the following configuration.

The information processing according to the present invention generates a skeleton model for converting a general-purpose face skeleton model into an individual's face skeleton model; calculates, using the individual's face skeleton model and the skeleton model, a variation model in which individual differences in facial expression have been absorbed; classifies the variation model by predetermined expression category; registers the classified variation model in a database; calculates the variation model as a weighted linear sum of a plurality of target shapes prepared in advance; and calculates the weights of the target shapes using a pseudo-inverse matrix.

According to the present invention, a variation model for generating a face model that gives another person's face an arbitrary expression can be generated on the basis of an individual's face skeleton model.

FIG. 1: Block diagram showing a configuration example of a facial expression generation apparatus that executes the pre-stage processing.
FIG. 2: Diagram showing a drawing example of the general-purpose face model.
FIG. 3: Block diagram showing a configuration example of a facial expression generation apparatus that executes the post-stage processing.
FIG. 4: Block diagram explaining a processing configuration example of the embodiment.
FIG. 5: Block diagram showing a configuration example of the variation model database generation unit.
FIG. 6: Flowchart explaining the processing of the variation model database generation unit.
FIG. 7: Flowchart explaining the processing of the arbitrary expression generation unit.
FIG. 8: Block diagram showing a configuration example of the facial expression generation apparatus of Embodiment 2.
FIG. 9: Block diagram showing a configuration example of the variation model database generation unit and the arbitrary expression generation unit of Embodiment 2.
FIG. 10: Flowchart explaining the processing of the variation model database generation unit.
FIG. 11: Flowchart explaining the processing of the arbitrary expression generation unit of Embodiment 2.
FIG. 12: Block diagram showing a configuration example of the variation model database generation unit and the arbitrary expression generation unit of Embodiment 3.

The information processing for facial expression generation according to embodiments of the present invention is described in detail below with reference to the drawings.

Embodiment 1 describes an example in which a skeleton model is defined for each individual, an expression variation model that does not depend on the individual is created, and a desired expression is added to an arbitrary face.

The processing of Embodiment 1 is divided into two parts, a pre-stage and a post-stage. The pre-stage removes the individual differences in expression from the face of a specific individual, calculates a general-purpose variation model of the expression that is applicable to anyone, and stores the general-purpose variation model in a database (DB). The post-stage applies the general-purpose variation model to the face of a specific person to produce an arbitrary expression on an arbitrary face.

[Calculation of the general-purpose variation model]
The following describes the pre-stage processing, that is, the method of calculating the general-purpose variation model of an expression and registering it in the DB.

The block diagram of FIG. 1 shows a configuration example of a facial expression generation apparatus that executes the pre-stage processing.

The facial expression generation apparatus first prepares a general-purpose face model 100, which is a general-purpose face skeleton model. The general-purpose face model 100 is drawn by general computer graphics (CG) software. FIG. 2 shows a drawing example of the general-purpose face model 100. The drawing example shown in FIG. 2 consists of three-dimensional vertices (XYZ coordinates) and a polygon mesh obtained by connecting adjacent vertices.

In general, there is no fixed, de facto standard shape for the general-purpose face model 100. The system user (user) may therefore create the general-purpose face model 100 by hand, or a general-purpose face model 100 representing an average face shape may be derived from the face shapes of multiple people registered in a DB.

However, an average face shape derived from a DB is naturally influenced by the personal attributes (race, sex, age, and so on) of the people registered in it, so the general-purpose face model 100 should be created to suit the characteristics of the system being built. For example, if the system is to be used to add expressions to Western faces, the general-purpose face model 100 should be a face model that includes many features of Western faces.

Next, the skeleton model 101, which is the face skeleton model of person A, is applied to the general-purpose face model 100 to transform it into the personal face model 102 of person A. Specifically, each vertex of the general-purpose face model 100 is moved to the coordinates specified by the skeleton model 101. The skeleton model 101 is therefore represented as a matrix, or as a set of vectors that move the vertices of the general-purpose face model 100.

Equation (1) shows how the skeleton model 101 is applied to the general-purpose face model 100 to transform it into the personal face model 102.
Characterized GM = Mc · GM …(1)

Characterized GM on the left side of Equation (1) (GM stands for Generic Model) is the personal face model of a face with no expression (hereinafter, the neutral face; the absence of expression is referred to as the reference expression), obtained as input. It is produced by reconstructing the vertices of a depth image obtained as input data into a polygon mesh or the like. The position of each vertex of the neutral face may be determined using the mean or median over a plurality of consecutive frames in which the expression does not change. Alternatively, 3D data acquired with a range scanner or the like from the face of a relaxed subject may be used.
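The per-vertex mean or median over neutral frames mentioned above can be sketched as follows. This is only an illustration under stated assumptions: every frame lists the same vertices in the same order, and the function name `neutral_vertices` and the sample data are hypothetical.

```python
from statistics import median


def neutral_vertices(frames, use_median=True):
    """Reduce several neutral-expression frames to one vertex list.

    frames: list of frames, each a list of (x, y, z) tuples in the same
    vertex order (assumption). The median is more robust to a single
    noisy depth sample than the mean.
    """
    reduce_fn = median if use_median else (lambda xs: sum(xs) / len(xs))
    n_vertices = len(frames[0])
    return [
        tuple(reduce_fn([frame[v][axis] for frame in frames]) for axis in range(3))
        for v in range(n_vertices)
    ]


frames = [
    [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 3.0)],
    [(0.0, 0.0, 0.0), (1.0, 2.0, 9.0)],  # one frame with a depth outlier
]
neutral = neutral_vertices(frames)
```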

GM on the right side is a matrix that stores the coordinates of all vertices in three-dimensional space that make up the general-purpose face model 100. Mc is the skeleton model 101, a matrix that moves all vertices of GM to transform the model into an individual's face (that of person A in this case). The matrix Mc defined as the skeleton model 101 contains information indicating the position and size of each facial part, such as the eyes, mouth, and nose, that does not depend on expression variation.

● Calculation of the skeleton model and generation of the personal face model
There are several methods for calculating the skeleton model 101.

For example, the technique of Patent Document 1 acquires the three-dimensional shape of a face using an apparatus that can acquire an RGB image and a depth image simultaneously. A general-purpose face model is then fitted to the acquired three-dimensional face shape using the ICP (Iterative Closest Point) algorithm, and the position and tilt of the general-purpose face model are adjusted. See Non-Patent Document 3 for the general-purpose face model, Non-Patent Document 1 for the ICP algorithm, and Non-Patent Document 2 for the adjustment of the position and tilt of the general-purpose face model. The general-purpose face model consists, for example, of a group of vertices indicating the shape of a face, as described in Non-Patent Document 3, and topology information formed by those vertices.

The position and tilt of the subject's face are adjusted using the general-purpose face model because capture is assumed to take place without fixing the position and tilt of the subject's face. In other words, if the position and tilt of the subject's face can be fixed well enough for the general-purpose face model to be fitted, the adjustment of position and tilt is unnecessary.

In general, shape fitting with the ICP algorithm targets rigid objects and aligns the shape as a whole. A face, however, deforms as the expression changes, so there are regions to which the ICP algorithm cannot be applied. For example, when the jaw moves, the shape of the face changes greatly, which makes alignment by the ICP algorithm difficult.

In this embodiment, therefore, alignment is not performed over the whole face. Instead, the upper half of the face (the area around the eyes and the forehead) is regarded as rigid, and the shape is fitted using point-to-plane correspondences. This is because, owing to the structure of the facial skeleton, the area around the eyes and the forehead generally do not move as much as the jaw.
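The point-to-plane criterion restricted to the rigid upper face can be illustrated with the error term alone. This is a sketch under assumptions, not a full ICP implementation: correspondences and unit normals are taken as given, and the name `point_to_plane_error` and the boolean `rigid_mask` (marking upper-face points) are hypothetical.

```python
def point_to_plane_error(src_points, dst_points, dst_normals, rigid_mask):
    """Sum of squared point-to-plane distances over rigid points only.

    For each matched pair, the distance is the offset (p - q) projected onto
    the unit normal n at the destination point q. Points whose mask entry is
    False (for example the jaw) do not contribute.
    """
    total = 0.0
    for p, q, n, keep in zip(src_points, dst_points, dst_normals, rigid_mask):
        if not keep:
            continue
        d = sum((p[i] - q[i]) * n[i] for i in range(3))
        total += d * d
    return total


src = [(0.0, 1.0, 0.1), (0.0, -1.0, 0.5)]      # forehead point, jaw point
dst = [(0.0, 1.0, 0.0), (0.0, -1.0, 0.0)]
normals = [(0.0, 0.0, 1.0), (0.0, 0.0, 1.0)]
upper_face_only = [True, False]
error = point_to_plane_error(src, dst, normals, upper_face_only)
```

An ICP loop would repeatedly re-estimate correspondences and the rigid transform that minimises this error; with the mask, a large jaw offset does not disturb the alignment.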

Each vertex of the general-purpose face model is then moved so that it matches the actually acquired face shape. Naturally, each vertex of the general-purpose face model corresponds to a vertex of the general-purpose face model after it has been deformed to fit the face shape acquired from the input signal. The displacement can therefore be calculated for each vertex, and the matrix Mc representing the skeleton model 101 can be obtained.
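Because the vertices before and after fitting correspond one to one, the per-vertex displacement can be computed directly. The sketch below assumes the skeleton model is stored as a list of displacement vectors, which is one of the representations the text allows; the function names and sample coordinates are hypothetical.

```python
def displacement_field(generic, fitted):
    """Per-vertex displacement from the generic model to the fitted model."""
    return [tuple(f[i] - g[i] for i in range(3)) for g, f in zip(generic, fitted)]


def apply_displacement(generic, field):
    """Move every generic vertex by its displacement vector."""
    return [tuple(g[i] + d[i] for i in range(3)) for g, d in zip(generic, field)]


generic = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
fitted = [(0.0, 0.5, 0.0), (1.5, 0.0, 0.25)]   # generic model after fitting to person A
skeleton = displacement_field(generic, fitted)
personal = apply_displacement(generic, skeleton)
```

Applying the stored displacements to the generic model reproduces the fitted personal model, which mirrors the role of Mc in Equation (1).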

The general-purpose face model 100 is deformed by the skeleton model 101 obtained in this way to generate the personal face model 102 of person A. The texture to be applied to the personal face model 102 is assumed to be acquired at the same time as the skeleton model 101 is calculated.

Because the personal face model 102 is merely the general-purpose face model 100 deformed by the skeleton model 101, it contains no expression information. If necessary, however, a texture representing person A's skin color can be applied to the personal face model 102.

The technique of Patent Document 1 uses a sensor that acquires an RGB image and a depth image in real time, but such a special sensor is not strictly necessary. Using two or more ordinary cameras capable of acquiring RGB signals, it is possible to acquire and reproduce the fine surface relief of a person's face, such as wrinkles and pores. A commercially available three-dimensional range scanner or the like may also be used to acquire a high-definition face shape of the individual.

A specific way to obtain the matrix Mc of a particular individual's skeleton model 101 is as follows. First, depth data and texture data are acquired with a range finder, a commercially available camera, or the like. Marker points are placed at the same positions (landmark points such as the outer corners of the eyes and the corners of the mouth) on the subject's face and on the general-purpose face model 100, and the matrix Mc of the skeleton model 101 can be described from the movement of the marker points.

Alternatively, the general-purpose face model 100 is deformed so that its matrix GM exactly matches the depth data acquired with a range finder or the like. The amount of deformation applied to the general-purpose face model 100 (the displacement vector of each vertex) may then be used as the matrix Mc of the skeleton model 101.

● Generation of the general-purpose variation model
Next, to generate the general-purpose variation model 104, a personal face model 103 with an expression is input to the system.

The shape of person A's face is acquired as the personal face model 103 with an expression, using the same capture means and method as for the shape of the personal face model 102 of person A. Naturally, in the personal face model 103 with an expression, the displacement of each vertex relative to the personal face model 102 has been added as the expression.

The general-purpose variation model 104 is obtained from the personal face model 103 with an expression by absorbing the individual differences and calculating only the variation due to the expression. The method of calculating the expression variation is described below.

First, the personal face model 103 with an expression, denoted Characterized Expression GM, is expressed as shown in Equation (2).
Characterized Expression GM = Mc · Me · GM …(2)
where Mc is the matrix representing the skeleton model 101 of a specific individual (person A in this example),
Me is the matrix representing the general-purpose variation model 104, which describes the expression variation, and
GM is the matrix representing the shape of the general-purpose face model 100.

In Equation (2), as described above, the matrix Mc of person A's skeleton model 101 is a matrix that expresses, as vectors, the variation components obtained from the difference between the three-dimensional shape data of person A's face obtained by three-dimensional measurement and the three-dimensional shape of the general-purpose face model 100.

If the matrix Mc of the skeleton model 101 is expressed as the group of displacement vectors for the vertices of the matrix GM of the general-purpose face model 100, Mc is generally a regular (invertible) matrix, so its inverse can be obtained. Both sides of Equation (2) are therefore multiplied from the left by the inverse matrix Mc^-1 to calculate Me · GM, which represents the general-purpose face model with an expression. In other words, individual differences are absorbed by multiplying the measured values by the inverse Mc^-1 of the matrix Mc representing the skeleton model 101.
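Written out, the step from Equation (2) to the expression-only model is a single left multiplication. The notation below simply restates the symbols of Equation (2) in LaTeX and assumes, as the text does, that Mc is invertible.

```latex
\begin{aligned}
\text{Characterized Expression GM} &= M_c \, M_e \, GM \\
M_c^{-1} \, (\text{Characterized Expression GM}) &= M_c^{-1} M_c \, M_e \, GM \\
&= M_e \, GM
\end{aligned}
```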

What is wanted here is the matrix Me representing the general-purpose variation model 104. However, obtaining the matrix Me directly is mathematically difficult.

Instead of calculating the matrix Me representing the general-purpose variation model 104 directly, Non-Patent Document 4 prepares a plurality of face shapes called "target shapes" and reproduces a given intermediate expression as a weighted linear sum in which the target shapes are added at a given mixing ratio.

A target shape defines the shape (group of vertices) of a facial part (left eye, right eye, left eyebrow, right eyebrow, and so on) in its most deformed state. For example, if the shape of the face (group of vertices) with the eyebrows raised as far as possible is defined as a target shape, and the vertices of the general-purpose face model 100 are moved toward that target shape, an expression with raised eyebrows can be simulated.

A plurality of target shapes can also be mixed to produce an intermediate expression. For example, target shape 1 with the left and right corners of the mouth raised and target shape 2 with the outer ends of the eyebrows lowered are prepared, and a "smile" can be produced by moving the vertices of the general-purpose face model 100 according to the weighted linear sum of target shapes 1 and 2. By adjusting the weights used to mix the target shapes, the influence of each target shape can be controlled and the intensity of the expression can be changed.

This process of mixing a plurality of target shapes by a weighted linear sum is generally called "blend shape" (see Non-Patent Document 5).

However, there are limits to what blend shapes can express. They cannot represent a deformation that exceeds the maximally deformed shape defined by a target shape, or a deformation that is not contained in the target shapes prepared in advance.

This embodiment uses this technique: about 30 to 40 target shapes are prepared in advance for each facial part, and the mixing ratio of those shapes is obtained by matrix calculation. Me · GM, which represents the general-purpose face model with an expression, is thereby obtained indirectly, and the weights w obtained are registered in the variation model database (DB) 105 as the general-purpose variation model 104.

Me · GM, which represents the general-purpose face model with an expression, can be expressed as a weighted linear sum of n target shapes T_k, each multiplied by a weight coefficient w_k. Each target shape T_k stores the coordinate values of a predefined group of vertices.
Me · GM = Σ_{k=1}^{n} w_k T_k …(3)
where n is the number of target shapes.
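Equation (3) can be sketched directly as code. The example below assumes each target shape is stored as a flat list of vertex coordinates of equal length; the function name `blend` and the sample shapes and weights are hypothetical.

```python
def blend(target_shapes, weights):
    """Weighted linear sum of target shapes: sum over k of w_k * T_k.

    Each target shape is a flat list [x0, y0, z0, x1, y1, z1, ...]
    (assumption); all shapes must have the same length.
    """
    n_coords = len(target_shapes[0])
    return [
        sum(w * t[c] for w, t in zip(weights, target_shapes))
        for c in range(n_coords)
    ]


mouth_corners_up = [0.0, 1.0, 0.0, 2.0, 0.0, 0.0]    # illustrative target shape 1
outer_brows_down = [0.0, 0.0, 4.0, 0.0, 2.0, 0.0]    # illustrative target shape 2
expression = blend([mouth_corners_up, outer_brows_down], [0.5, 0.25])
```

Changing the weights changes how strongly each target shape contributes, which is how the intensity of the resulting expression is controlled.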

The number n of target shapes need only be large enough to express the variation produced by the facial muscles. Although n was described above as 30 to 40, more than 40 target shapes may be prepared. However, because the deformation of the face by target shapes is calculated as a weighted linear sum of the target shapes or the like, the deformations of the individual target shapes should preferably be independent of one another.

また、被撮影者の表情サンプルから、ターゲットシェイプの混合比を、主成分分析(Principal Component Analysis)などにより基底分解して、基本的な七つの感情に起因する表情のパラメータを算出する。そして、当該パラメータからターゲットシェイプの混合比を自動的に計算することもできる。 In addition, the mixing ratios of the target shapes obtained from expression samples of the subject may be decomposed into a basis by principal component analysis (PCA) or the like, and expression parameters attributable to the seven basic emotions may be calculated. The mixing ratio of the target shapes can then be calculated automatically from those parameters.

なお、ターゲットシェイプの混合比（重み）は、擬似逆行列を求めることで算出してもよいし、ラグランジュ未定乗数法によって解析的に算出してもよい。 The mixing ratio (weight) of the target shape may be calculated by obtaining a pseudo inverse matrix or analytically calculated by the Lagrange undetermined multiplier method.
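One way to picture the pseudo-inverse route is a least-squares fit through the normal equations, which coincides with the pseudo-inverse solution when the target shapes are linearly independent. The sketch below is an assumption-laden illustration: `fit_weights` and `solve_linear` are hypothetical helper names, each shape is flattened into a single list of coordinates, and the Lagrange multiplier alternative is not shown.

```python
def solve_linear(a, b):
    """Solve the square system a x = b by Gauss-Jordan elimination
    with partial pivoting."""
    n = len(b)
    aug = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("target shapes are not linearly independent")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(n):
            if r != col:
                factor = aug[r][col] / aug[col][col]
                for c in range(col, n + 1):
                    aug[r][c] -= factor * aug[col][c]
    return [aug[i][n] / aug[i][i] for i in range(n)]


def fit_weights(targets, observed):
    """Least-squares weights w minimising |sum_k w_k T_k - observed|^2.

    targets:  list of n flattened target shapes (lists of floats).
    observed: flattened observed shape (list of floats).
    """
    n = len(targets)
    # Normal equations: (T^T T) w = T^T m
    gram = [[sum(p * q for p, q in zip(targets[i], targets[j]))
             for j in range(n)] for i in range(n)]
    rhs = [sum(p * q for p, q in zip(targets[i], observed)) for i in range(n)]
    return solve_linear(gram, rhs)


# Hypothetical flattened shapes; observed = 0.5 * t1 + 0.25 * t2.
t1 = [1.0, 0.0, 0.0, 0.0]
t2 = [0.0, 1.0, 0.0, 1.0]
print(fit_weights([t1, t2], [0.5, 0.25, 0.0, 0.25]))
```

The singular-matrix check is where the preference for mutually independent target shapes shows up in practice: dependent shapes make the weights non-unique.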

ここでは、説明を簡単にするため、同一の方法を用いて顔の形状を取得する例を説明したが、構築するシステムによっては、別の手法を用いて顔の形状を取得してもよい。何故なら、個人顔モデル102の形状を高精度に取得しようとすると、その精度に応じて形状取得装置が肥大化してしまう上、取得時の光源環境にも制約が生じることが多い。 Here, in order to simplify the description, an example in which the face shape is acquired using the same method has been described. However, depending on the system to be constructed, the face shape may be acquired using another method. This is because if the shape of the personal face model 102 is to be acquired with high accuracy, the shape acquisition device will be enlarged according to the accuracy, and the light source environment at the time of acquisition will often be restricted.

表情付き個人顔モデル103として、普段の何気ない自然な表情を撮影しようとすると、高精度に形状を取得する装置が必ずしも利用できないことは容易に想像される。そのため、個人顔モデル102に利用する形状取得方法と、表情付き個人顔モデル103の形状取得方法が異なる場合も想定される。その場合は、別途、表情付き個人顔モデル103の実測値と個人顔モデル102の対応点を逐次求めることで、同一の方法を用いて顔の形状を取得する方法において説明した条件と等しくすることができる。 When the personal face model 103 with an expression is meant to capture ordinary, casual, natural expressions, it is easy to imagine that a device that acquires shape with high accuracy will not always be available. The shape acquisition method used for the personal face model 102 may therefore differ from the one used for the personal face model 103 with an expression. In that case, the corresponding points between the measured values of the personal face model 103 with an expression and the personal face model 102 are obtained separately, one after another, so that the conditions become equal to those described for the case where the face shapes are acquired by the same method.

形状取得方法が同一でも異なる場合でも、汎用変動モデルの精度は骨格モデル101の精度に依存するため、骨格モデル101は高精度に取得する必要がある。 Even if the shape acquisition methods are the same or different, the accuracy of the general-purpose variation model depends on the accuracy of the skeleton model 101, and thus the skeleton model 101 needs to be acquired with high accuracy.

重みwは、それぞれ後段で生成したい表情ごとにカテゴリ分けして変動モデルDB105に登録する。分類に使用するカテゴリ（以下、表情カテゴリ）は、基本的な七つの感情（驚き、喜び、怒り、恐怖、悲しみ、嫌悪、恐怖）である。さらに、感情が混ざり合った表情、作り笑いのように目の周辺領域に表れる意味と口周辺に表れる意味が異なる表情なども表情カテゴリとして用意することができる。 The weights w are divided into categories, one for each expression to be generated in the subsequent stage, and registered in the variation model DB 105. The categories used for classification (hereinafter referred to as facial expression categories) are the seven basic emotions (surprise, joy, anger, fear, sadness, disgust, fear). In addition, expressions in which emotions are mixed, and expressions such as a forced smile in which the meaning conveyed by the region around the eyes differs from the meaning conveyed around the mouth, can also be prepared as facial expression categories.

ここで、汎用変動モデル104として算出した重みwをパラメータとして、表情カテゴリごとに分類して変動モデルDB105に登録する方法を説明する。なお、上記では重みwを汎用変動モデル104のパラメータとして登録する例を説明したが、登録するパラメータは、その限りではない。 Here, a method of classifying each facial expression category using the weight w calculated as the general-purpose variation model 104 as a parameter and registering it in the variation model DB 105 will be described. Although an example in which the weight w is registered as a parameter of the general-purpose variation model 104 has been described above, the parameter to be registered is not limited thereto.

汎用変動モデル104として算出したパラメータは、パーセプトロン、ニューラルネットワーク、サポートベクタマシンと言った一般的な機械学習の枠組みを利用することで表情のカテゴリに分類することができる。 Parameters calculated as the general-purpose variation model 104 can be classified into facial expression categories using a general machine learning framework such as a perceptron, a neural network, and a support vector machine.

分類方法の一例は、汎用顔モデル100に対して、システムの利用者が設定した表情でターゲットシェイプの重みwを算出し、その重みwを表情カテゴリを代表するパラメータとして識別に用いる方法である。 An example of the classification method is a method of calculating a target shape weight w for the general-purpose face model 100 with a facial expression set by a system user and using the weight w as a parameter representing the facial expression category for identification.

そして、様々な表情が入力サンプルとして入力された場合、k-meansクラスタリングや、マルチクラスSVMなどを使って表情カテゴリのクラスタを更新する。クラスタの更新方法に関しては、種々の方法が知られているので、構築するシステムによって好適な更新方法を選べばよい。 When various facial expressions are input as input samples, the facial expression category cluster is updated using k-means clustering, multi-class SVM, or the like. Since various methods are known for updating the cluster, a suitable update method may be selected depending on the system to be constructed.

また、喜怒哀楽がはっきりした表情は、表情の変動が大きいため、表情の変動が小さい場合に比べてクラスタリングは容易である。一方で、微細な表情は、表情の変動が小さいため、無表情カテゴリのクラスタに分類するか、それ以外の表情カテゴリとして分類するかの判断が難しくなる。その場合、汎用顔モデル100に汎用変動モデル104のパラメータを適用して生成した表情付き汎用顔モデルを数人の被験者に観察させ、主観評価による分類結果を基に、強化学習の枠組みを使って表情カテゴリを更新する（非特許文献6参照）。 Expressions with clear emotions involve large variations and are therefore easier to cluster than expressions with small variations. Subtle expressions, on the other hand, vary little, so it is difficult to decide whether to classify them into the cluster of the expressionless category or into some other facial expression category. In that case, several test subjects observe the general-purpose face model with an expression generated by applying the parameters of the general-purpose variation model 104 to the general-purpose face model 100, and the facial expression categories are updated within a reinforcement learning framework based on the classification results of their subjective evaluation (see Non-Patent Document 6).

以上のようにして、個人差を吸収した後、表情の汎用変動モデル104をパラメータとして算出し、予め決めた表情カテゴリごとにパラメータを変動モデルDB105に登録・蓄積する。 After absorbing individual differences as described above, the general-purpose variation model 104 for facial expressions is calculated as a parameter, and parameters are registered and accumulated in the variation model DB 105 for each predetermined facial expression category.

なお、変動モデルDB105に登録する表情カテゴリは、年齢、性別、人種、国籍などの属性情報を用いた階層的な構造を有してもよい。 The facial expression categories registered in the variation model DB 105 may have a hierarchical structure using attribute information such as age, gender, race, nationality, and the like.

［表情の付加］ [Adding a facial expression]
次に、後段処理に相当する、任意の顔に任意の表情を付加する方法を説明する。 Next, a method for adding an arbitrary expression to an arbitrary face, which corresponds to the subsequent process, will be described.

図3のブロック図により後段処理を実行する表情生成装置の構成例を示す。なお、図3は、図1に示す前段処理の構成に、後段処理の構成を加えた構成を示す。従って、前段処理の構成の説明は省略する。 The block diagram of FIG. 3 shows an example of the configuration of a facial expression generation apparatus that executes subsequent processing. FIG. 3 shows a configuration obtained by adding the configuration of the post-stage process to the configuration of the pre-stage process shown in FIG. Therefore, the description of the configuration of the pre-processing is omitted.

人物Bは、人物Aと異なる人物であり、汎用変動モデル104の算出に無関係の人物である。人物Bの骨格モデル301は、人物Aの骨格モデル101を取得した方法と同じ方法で取得することができる。また、人物Bの個人顔モデル302は、人物Bの骨格モデル301によって汎用顔モデル100を変形した顔モデルであり、その形状は、理論的に表情が全くない状態（基準表情）の入力信号から構成される三次元形状と等価である。 Person B is a different person from person A and has no connection with the calculation of the general-purpose variation model 104. The skeleton model 301 of person B can be acquired by the same method as the skeleton model 101 of person A. The personal face model 302 of person B is a face model obtained by deforming the general-purpose face model 100 with the skeleton model 301 of person B, and its shape is theoretically equivalent to the three-dimensional shape constructed from an input signal captured with no expression at all (the reference expression).

ここでは、人物Aの表情付き個人顔モデル103から算出し、変動モデルDB105に蓄積した汎用変動モデル104（パラメータ）を人物Bの顔画像に適用して、人物Bの顔画像を任意の表情に変形する。 Here, the general-purpose variation model 104 (parameters), calculated from the personal face model 103 with an expression of person A and accumulated in the variation model DB 105, is applied to the face image of person B to deform it into an arbitrary expression.

このとき、システムの利用者には、画面を介して、人物Bの基準表情（無表情）の顔画像が提示され、同時に、変動モデルDB105に登録された表情カテゴリの代表パラメータを汎用顔モデル100に適用した表情を有する複数の顔画像が提示される。以下、表情を有する顔を「有表情顔」と呼ぶ。 At this time, the face image of person B with the reference expression (no expression) is presented to the system user on a screen. At the same time, a plurality of face images are presented, each showing the expression obtained by applying the representative parameter of a facial expression category registered in the variation model DB 105 to the general-purpose face model 100. Hereinafter, a face having an expression is referred to as an "expressive face".

利用者は、複数の有表情顔の画像から所望する表情の顔画像を選び、人物Bの顔画像への適用を指示する。もしくは、利用者に表情カテゴリのリストを提示し、利用者がリストから選択した表情カテゴリに基づき人物Bの顔画像に表情を加えてもよい。表情カテゴリが選択された場合、変動モデルDB105に登録されている代表パラメータ（クラスタの重心に最も近いパラメータ）から順に人物Bの顔画像に適用する。適用の結果、利用者が望む場合は、同一の表情カテゴリの他のパラメータを人物Bの顔画像に適用する。 The user selects the face image with the desired expression from the plurality of expressive-face images and instructs that it be applied to the face image of person B. Alternatively, a list of facial expression categories may be presented to the user, and an expression may be added to the face image of person B based on the category the user selects from the list. When a facial expression category is selected, the parameters registered in the variation model DB 105 are applied to the face image of person B in order, starting with the representative parameter (the parameter closest to the cluster centroid). If the user so wishes after seeing the result, other parameters of the same facial expression category are applied to the face image of person B.

汎用変動モデル104の算出に人物Bが関与し、汎用変動モデル104を、人物Bの骨格モデル301、人物Bの個人顔モデル302、人物Bの表情付き個人顔モデル303から生成することもできる。この場合、変動モデルDB105から適切なパラメータを選択すれば、汎用顔モデル100、人物Bの骨格モデル301、汎用変動モデル104から生成される表情と人物Bの実際の表情は一致する。 Person B may also take part in the calculation of the general-purpose variation model 104, in which case the general-purpose variation model 104 can be generated from the skeleton model 301, the personal face model 302, and the personal face model 303 with an expression of person B. In this case, if appropriate parameters are selected from the variation model DB 105, the expression generated from the general-purpose face model 100, the skeleton model 301 of person B, and the general-purpose variation model 104 matches the actual expression of person B.

図4のブロック図により実施例の処理構成例を説明する。なお、表情生成装置は、ネットワークを介して、または、各種記録媒体から取得したソフトウェア（プログラム）をコンピュータ機器において実行することで実現される。コンピュータ機器は、CPU、メモリ、ストレージデバイス、入出力装置、システムバス、表示装置などにより構成されるパーソナルコンピュータなどの汎用機器でよい。あるいは、表情生成用のソフトウェアの実行に最適化されたハードウェアを有する専用機器を利用してもよい。 A processing configuration example of the embodiment will be described with reference to the block diagram of FIG. 4. The facial expression generation apparatus is realized by executing, on a computer device, software (a program) acquired via a network or from various recording media. The computer device may be a general-purpose device such as a personal computer composed of a CPU, memory, a storage device, input/output devices, a system bus, a display device, and the like. Alternatively, a dedicated device having hardware optimized for executing the facial expression generation software may be used.

図4において、データベース生成部400は上述した前段処理を実行する部分に相当し、任意表情生成部401は後段処理を実行する部分に相当する。データベース生成部400は、入力データ取得部402、変動モデルデータベース生成部403、変動モデルDB105を有する。 In FIG. 4, a database generation unit 400 corresponds to a part that executes the above-described pre-stage processing, and an arbitrary facial expression generation unit 401 corresponds to a part that executes the post-stage processing. The database generation unit 400 includes an input data acquisition unit 402, a variation model database generation unit 403, and a variation model DB 105.

入力データ取得部402は、例えば、市販の複数のRGBカメラからRGB画像を取得し、マルチベースラインステレオ法などを用いてRGB画像から奥行き情報を取得する（例えば、非特許文献7参照）。そして、取得したRGB画像と奥行き情報を変動モデルデータベース生成部403に入力する。奥行き情報の取得は、種々の方法が提案されていて、上記ステレオ法による取得方法のほかに、デプスカメラやレンジセンサを用いて奥行き情報を取得する方法がある。奥行き情報の取得方法は、構築するシステムごとに好適な手段を選択すればよい。 For example, the input data acquisition unit 402 acquires RGB images from a plurality of commercially available RGB cameras, and acquires depth information from the RGB images using a multi-baseline stereo method or the like (see, for example, Non-Patent Document 7). Then, the acquired RGB image and depth information are input to the variation model database generation unit 403. Various methods of acquiring depth information have been proposed. In addition to the acquisition method using the stereo method, there is a method of acquiring depth information using a depth camera or a range sensor. As a method for acquiring depth information, a suitable means may be selected for each system to be constructed.

●変動モデルデータベース生成部 ● Variation model database generation unit
図5のブロック図により変動モデルデータベース生成部403の構成例を示す。 The block diagram of FIG. 5 shows a configuration example of the variation model database generation unit 403.

骨格モデル生成部500は、入力データ取得部402から入力された被撮影者（例えば人物A）の情報に基づき、被撮影者の骨格モデルが骨格モデルデータベース(DB)501に登録されているか否かを検索する。検索の結果、該当する骨格モデルが見付かれば、当該骨格モデルを汎用顔モデルに適用するための設定を行う。また、該当する骨格モデルが見付からなかった場合は、入力データ取得部402から入力された奥行き情報を用いて、上述した方法によって被撮影者の骨格モデルを生成し、生成した骨格モデルを骨格モデルDB501に登録する。 The skeleton model generation unit 500 searches, based on the information about the subject (for example, person A) input from the input data acquisition unit 402, for whether the skeleton model of the subject is registered in the skeleton model database (DB) 501. If the search finds the corresponding skeleton model, settings are made for applying that skeleton model to the general-purpose face model. If no corresponding skeleton model is found, the skeleton model of the subject is generated by the method described above using the depth information input from the input data acquisition unit 402, and the generated skeleton model is registered in the skeleton model DB 501.

骨格モデル適用部502は、骨格モデルDB501に登録された骨格モデルを利用して、入力データ取得部402から入力された奥行き情報に含まれる個人差を吸収する。つまり、式(2)に示すように、表情付き個人顔モデル(Characterized Expression GM)に、特定個人の骨格モデルMcの逆行列Mc^-1を掛けることで個人差を打ち消した表情付き汎用顔モデルMe・GMを生成する。 The skeleton model application unit 502 uses the skeleton model registered in the skeleton model DB 501 to absorb the individual differences contained in the depth information input from the input data acquisition unit 402. That is, as shown in equation (2), the personal face model with an expression (Characterized Expression GM) is multiplied by the inverse matrix Mc^-1 of the skeleton model Mc of the specific individual, which cancels the individual differences and yields the general-purpose face model with an expression Me・GM.
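The cancellation of individual differences by Mc^-1 can be illustrated with a deliberately simplified skeleton model. In this sketch the skeleton model is assumed to be a per-axis scale (a diagonal matrix), so that its inverse is trivial; the function names and numbers are hypothetical, and a real skeleton model would be a more general per-vertex transformation.

```python
def apply_skeleton(scale, vertices):
    """Apply a diagonal skeleton matrix Mc = diag(scale) to every vertex."""
    return [tuple(s * c for s, c in zip(scale, v)) for v in vertices]


def cancel_skeleton(scale, vertices):
    """Multiply by Mc^-1 = diag(1/scale) to remove the individual's
    contribution and recover the general-purpose face with an expression."""
    inverse = tuple(1.0 / s for s in scale)
    return apply_skeleton(inverse, vertices)


# Hypothetical data: Me・GM (general-purpose face with an expression)
# and a skeleton model for one individual.
generic_expression = [(1.0, 2.0, 0.5), (0.0, 4.0, 1.0)]
mc = (2.0, 0.5, 4.0)

personal_expression = apply_skeleton(mc, generic_expression)  # Mc・Me・GM
recovered = cancel_skeleton(mc, personal_expression)          # Mc^-1・Mc・Me・GM
print(recovered == generic_expression)  # True
```

The point of the sketch is only that the measured shape and the skeleton model together determine the person-independent shape, which is what gets passed on to the weight calculation.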

変動モデル算出部503は、骨格モデル適用部502が生成した表情付き汎用顔モデルMe・GMについて、ターゲットシェイプの重みwをパラメータとする変動モデルを算出する。表情カテゴリ分類部504は、変動モデルDB105の表情カテゴリに合わせて、変動モデル算出部503が算出した変動モデルを分類する。 The variation model calculation unit 503 calculates, for the general-purpose face model with an expression Me・GM generated by the skeleton model application unit 502, a variation model whose parameters are the target shape weights w. The facial expression category classification unit 504 classifies the variation model calculated by the variation model calculation unit 503 according to the facial expression categories of the variation model DB 105.

変動モデルをターゲットシェイプの重みwをパラメータとして表現する場合、そのパラメータでクラスタリングを行い、クラスタに表情のラベルを付加する。例えば、笑顔の表情は、下瞼を上方に動かすターゲットシェイプT_A、目尻を下げるターゲットシェイプT_B、口角を上方に動かすターゲットシェイプT_C、および、重みwによって式(4)のように表される。なお、ターゲットシェイプT_A、T_B、T_Cには予め定義された頂点群の座標値が格納されている。 When the variation model is expressed with the target shape weights w as parameters, clustering is performed on those parameters and an expression label is attached to each cluster. For example, a smiling expression is expressed as in equation (4) by a target shape T_A that moves the lower eyelid upward, a target shape T_B that lowers the outer corner of the eye, a target shape T_C that moves the corner of the mouth upward, and the weights w. The target shapes T_A, T_B, and T_C store the coordinate values of predefined vertex groups.
$$\mathrm{Smile} = w_A T_A + w_B T_B + w_C T_C \tag{4}$$
例えば、(w_A, w_B, w_C) = (1.0, 0.5, 0.7) For example, (w_A, w_B, w_C) = (1.0, 0.5, 0.7).

このように、表情カテゴリごとに重みwを算出してパラメータのクラスタリングを行う。クラスタリングには、例えばk-meansのような一般的なクラスタリング手法を用いればよい。そして、変動モデル算出部503が算出したパラメータ（変動モデル）がどのクラスタ（表情カテゴリ）に属すかを計算し、計算結果に従い、変動モデルを変動モデルDB105に登録する。 In this way, the weighting w is calculated for each facial expression category to perform parameter clustering. For clustering, a general clustering method such as k-means may be used. Then, the cluster (expression category) to which the parameter (variation model) calculated by the variation model calculation unit 503 belongs is calculated, and the variation model is registered in the variation model DB 105 according to the calculation result.
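The step of deciding which cluster a newly calculated parameter belongs to, and registering it accordingly, might look like the nearest-centroid sketch below. It shows only the assignment step of a k-means-style scheme, not the centroid update. The dictionary-based database, the function names, and the "neutral" centroid are assumptions for illustration; the "smile" centroid reuses the example weights (1.0, 0.5, 0.7).

```python
import math


def nearest_category(weights, centroids):
    """Return the label of the cluster centroid closest to `weights`
    (Euclidean distance)."""
    return min(centroids, key=lambda label: math.dist(weights, centroids[label]))


def register(variation_db, weights, centroids):
    """Classify a weight vector and store it under its expression category."""
    label = nearest_category(weights, centroids)
    variation_db.setdefault(label, []).append(tuple(weights))
    return label


# Hypothetical cluster centroids in (w_A, w_B, w_C) space.
centroids = {
    "smile": (1.0, 0.5, 0.7),
    "neutral": (0.0, 0.0, 0.0),
}
variation_db = {}
print(register(variation_db, (0.9, 0.4, 0.6), centroids))  # smile
```

A multi-class SVM or another classifier could replace `nearest_category` without changing the registration step.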

図6のフローチャートにより変動モデルデータベース生成部403の処理を説明する。 The process of the fluctuation model database generation unit 403 will be described with reference to the flowchart of FIG.

骨格モデル生成部500は、入力データ取得部402から入力データ（被撮影者の顔を撮影して得られた画像と形状情報（奥行き情報）および被撮影者の情報）を取得する(S601)。そして、被撮影者に該当する骨格モデルが骨格モデルDB501に登録されているか否かを検索する(S602)。被撮影者の骨格モデルが骨格モデルDB501に登録されている場合、処理はステップS605に進む。また、被撮影者の骨格モデルが未登録の場合、処理はステップS603に進む。 The skeleton model generation unit 500 acquires input data (an image and shape information (depth information) obtained by photographing the subject's face, and information about the subject) from the input data acquisition unit 402 (S601). It then searches for whether a skeleton model corresponding to the subject is registered in the skeleton model DB 501 (S602). If the skeleton model of the subject is registered in the skeleton model DB 501, the process proceeds to step S605. If the skeleton model of the subject is not registered, the process proceeds to step S603.

最も簡単な検索方法は、骨格モデルの登録時に被撮影者の氏名など被撮影者に固有の情報を骨格モデルのラベルとして登録し、検索時に被撮影者の氏名と骨格モデルのラベルを照合するなどが考えられる。また、被撮影者の無表情の顔画像が入力される場合は、入力された画像が示す顔の形状と、骨格モデルに基づき汎用顔モデル100を変形した形状のマッチングから、被撮影者の特定または被撮影者の骨格モデルの有無を判定することもできる。 The simplest search method would be, for example, to register information unique to the subject, such as the subject's name, as the label of the skeleton model when the skeleton model is registered, and to compare the subject's name with the skeleton model labels at search time. When an expressionless face image of the subject is input, the subject can also be identified, or the presence or absence of the subject's skeleton model determined, by matching the face shape shown in the input image against the shapes obtained by deforming the general-purpose face model 100 based on the skeleton models.

被撮影者の骨格モデルが未登録の場合、骨格モデル生成部500は、被撮影者の骨格モデルを生成し(S603)、被撮影者に固有の情報をラベルとして付加した骨格モデルを骨格モデルDB501に登録する(S604)。 When the skeleton model of the subject is not registered, the skeleton model generation unit 500 generates the skeleton model of the subject (S603), and the skeleton model added with information specific to the subject as a label is the skeleton model DB501. (S604).
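The branch from S602 to S605 amounts to a lookup-or-create pattern keyed on the subject's label. The following sketch assumes a plain dictionary as the skeleton model DB and a caller-supplied `build` function standing in for the skeleton model generation of S603; both are hypothetical stand-ins rather than the patent's data structures.

```python
def get_or_create_skeleton(skeleton_db, label, depth_info, build):
    """Return the skeleton model registered under `label`, generating and
    registering it first when it is missing.

    skeleton_db: dict mapping a subject label (e.g. a name) to a skeleton model.
    build:       function that generates a skeleton model from depth information.
    """
    if label not in skeleton_db:        # S602: is the model registered?
        model = build(depth_info)       # S603: generate the skeleton model
        skeleton_db[label] = model      # S604: register it under the label
    return skeleton_db[label]           # S605: read the skeleton model


def toy_build(depth_info):
    # Placeholder: a real system would fit the general-purpose face model
    # to the depth information here.
    return {"vertex_offsets": list(depth_info)}


db = {}
first = get_or_create_skeleton(db, "person_A", [0.1, 0.2], toy_build)
second = get_or_create_skeleton(db, "person_A", [9.9, 9.9], toy_build)
print(first is second)  # True: the second call reuses the registered model
```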

骨格モデルは、入力データに含まれる奥行き情報の位置と、汎用顔モデル100の位置を合わせた後、奥行き情報に基づき汎用顔モデル100の各頂点を移動することで生成される。入力データとして、毛穴や皺と言った微細な凹凸の奥行き情報が取得できる場合がある。その場合、汎用顔モデル100の各頂点の移動後、汎用顔モデル100の各メッシュ（平面）の法線方向の高低差を示すディスプレイスメントマップとして、奥行き情報の微細な凹凸を保存することが可能である。もしくは、各メッシュからZ軸（奥行き）方向に高低差を取り、ディスプレイスメントマップを生成してもよい。生成したディスプレイスメントマップは、骨格モデルと一緒に骨格モデルDB501に格納される。 The skeleton model is generated by aligning the position of the depth information included in the input data with the position of the general-purpose face model 100 and then moving each vertex of the general-purpose face model 100 based on the depth information. In some cases, depth information on fine irregularities such as pores and wrinkles can be acquired as input data. In that case, after each vertex of the general-purpose face model 100 has been moved, the fine irregularities in the depth information can be saved as a displacement map indicating the height difference along the normal direction of each mesh (plane) of the general-purpose face model 100. Alternatively, the displacement map may be generated by taking the height difference from each mesh in the Z-axis (depth) direction. The generated displacement map is stored in the skeleton model DB 501 together with the skeleton model.

次に、骨格モデル適用部502は、骨格モデルDB501から被撮影者の骨格モデルを読み込む(S605)。骨格モデルは、汎用顔モデル100の各頂点を移動するためのベクトルまたは行列である。そして、読み込んだ骨格モデルに基づき、汎用顔モデル100の全頂点を移動して被撮影者の個人顔モデルを生成する(S606)。個人顔モデルは、骨格モデルに基づき汎用顔モデル100を変形した無表情の顔である。 Next, the skeleton model application unit 502 reads the skeleton model of the subject from the skeleton model DB 501 (S605). The skeleton model is a vector or matrix for moving each vertex of the general-purpose face model 100. Then, based on the read skeleton model, all the vertices of the general-purpose face model 100 are moved to generate a personal face model of the subject (S606). The personal face model is an expressionless face obtained by deforming the general-purpose face model 100 based on the skeleton model.

次に、変動モデル算出部503は、入力データに含まれる奥行き情報に基づき、ステップS606で生成された個人顔モデルの各頂点をさらに移動し、各頂点の移動量を表情変動による変動モデルとして記録する(S607)。また、この時の表情に合わせて、ディスプレイスメントマップも再度取得し直す。 Next, the variation model calculation unit 503 further moves each vertex of the personal face model generated in step S606 based on the depth information included in the input data, and records the movement amount of each vertex as a variation model due to facial expression variation. (S607). Also, the displacement map is acquired again according to the expression at this time.

次に、表情カテゴリ分類部504は、変動モデルの登録先の表情カテゴリを識別する(S608)。つまり、ステップS607で生成された変動モデルとディスプレイスメントマップに、該当する表情カテゴリのラベルを付与する。そして、変動モデルとディスプレイスメントマップを変動モデルデータベース105に登録する(S609)。 Next, the facial expression category classification unit 504 identifies the facial expression category to which the variation model is registered (S608). That is, the label of the corresponding facial expression category is given to the variation model and the displacement map generated in step S607. Then, the variation model and the displacement map are registered in the variation model database 105 (S609).

以上の処理により、骨格モデルDB501と変動モデルDB105が構築される。つまり、骨格モデルを生成し、骨格モデルを適用した汎用顔モデルから、表情による変動を変動モデルとして算出することで、個人差を吸収した汎用変動モデルを変動モデルDB105に登録することができる。 Through the above processing, the skeleton model DB 501 and the variation model DB 105 are constructed. That is, by generating a skeleton model and calculating a variation due to facial expression as a variation model from a general-purpose face model to which the skeleton model is applied, a general-purpose variation model that absorbs individual differences can be registered in the variation model DB 105.

●任意表情生成部 ● Arbitrary facial expression generation unit
任意表情生成部401は、変動モデルDB105を利用して、表情を付加したい人物の個人顔モデルに仮想的に表情を付加する。 The arbitrary facial expression generation unit 401 uses the variation model DB 105 to virtually add an expression to the personal face model of the person to whom an expression is to be added.

任意表情生成部401は、骨格モデルDB501と変動モデルDB105に蓄積されたデータを利用し、任意の人物の顔形状に適切な表情変形を施す。 The arbitrary facial expression generation unit 401 uses the data stored in the skeleton model DB 501 and the variation model DB 105 to perform appropriate facial expression deformation on the face shape of an arbitrary person.

個人顔モデル設定部405は、骨格モデルDB501から、システムの利用者が所望する人物の顔の骨格モデルを取得する。 The personal face model setting unit 405 acquires from the skeleton model DB 501 a skeleton model of the face of the person desired by the system user.

表情付き顔モデル生成部406は、取得された骨格モデルに基づき、汎用顔モデル100を変形して個人顔モデルを生成する。ここで得られる顔モデルは常に無表情である。さらに、生成した個人顔モデルに、汎用変動モデルに基づく表情を付加する。 The expression-equipped face model generation unit 406 generates a personal face model by deforming the general-purpose face model 100 based on the acquired skeleton model. The face model obtained here is always expressionless. Furthermore, an expression based on the general-purpose variation model is added to the generated personal face model.

図7のフローチャートにより任意表情生成部401の処理を説明する。 The process of the arbitrary facial expression generation unit 401 will be described with reference to the flowchart of FIG.

個人顔モデル設定部405は、表情を付加する人物の個人顔モデルを設定するために、人物プロファイルを取得する(S701)。人物プロファイルは、例えば、氏名、顔写真など当該人物に固有の情報を含み、骨格モデルDB501から骨格モデルを読み込むために利用される。 The personal face model setting unit 405 acquires a person profile in order to set a personal face model of a person to whom a facial expression is added (S701). The person profile includes information unique to the person, such as a name and a face photograph, and is used to read a skeleton model from the skeleton model DB 501.

次に、個人顔モデル設定部405は、骨格モデルの有無の判定(S702)、骨格モデルの生成(S703)、骨格モデルの登録(S704)、骨格モデルの読み込み(S705)を行う。これら処理は、変動モデルデータベース生成部403におけるステップS602からS605の処理と同様であり、詳細説明を省略する。 Next, the personal face model setting unit 405 determines whether or not there is a skeleton model (S702), generates a skeleton model (S703), registers a skeleton model (S704), and reads a skeleton model (S705). These processes are the same as the processes of steps S602 to S605 in the variation model database generation unit 403, and detailed description thereof is omitted.

次に、個人顔モデル設定部405は、変動モデルDB105に設定されている表情カテゴリの中から、システムの利用者に所望する表情カテゴリを選択させ、その選択情報を取得する(S706)。 Next, the personal face model setting unit 405 causes the system user to select a desired facial expression category from the facial expression categories set in the variation model DB 105, and acquires the selection information (S706).

各表情カテゴリについて複数の表情サンプルが登録されている場合、それら表情サンプルからランダムにパラメータを取り出し、汎用顔モデル100に表情を付加する。つまり、表情カテゴリごとに一つ以上の表情付き汎用顔モデルを表示して、システムの利用者に所望する表情付き汎用顔モデルを選択させる。もし、システムの利用者が所望する表情がない場合は、再度、表情サンプルからパラメータを取り出して、システムの利用者に提示する表情付き汎用顔モデルを変更する。 When a plurality of facial expression samples are registered for each facial expression category, parameters are randomly extracted from the facial expression samples, and facial expressions are added to the general-purpose face model 100. That is, one or more general-purpose face models with facial expressions are displayed for each facial expression category, and the system user is allowed to select a desired general-purpose face model with facial expressions. If there is no facial expression desired by the system user, parameters are extracted from the facial expression sample again to change the general-purpose face model with facial expression to be presented to the system user.

次に、表情付き顔モデル生成部406は、選択情報が示す表情サンプルのパラメータを汎用顔モデル100に適用して、表情付き汎用顔モデルを生成する(S707)。そして、骨格モデルを表情付き汎用顔モデルに適用して、任意の顔に任意の表情を付加する(S708)。 Next, the expression-equipped face model generation unit 406 applies an expression sample parameter indicated by the selection information to the general-purpose face model 100 to generate a general-purpose face model with an expression (S707). Then, the skeleton model is applied to the general-purpose face model with an expression to add an arbitrary expression to an arbitrary face (S708).

なお、ステップS707とS708は、この順に実行する必要がある。実行順を変更すると、所望の変形が行えなくなる。また、骨格モデルが読み込まれている状態であれば、表情を付加する処理は二回の行列演算で実行することができる。そのため、ビデオカメラで取得した人物の顔に対して、リアルタイムに表情を加えた仮想的な表情を作り出すことも可能である。 Steps S707 and S708 need to be executed in this order. If the execution order is changed, the desired deformation cannot be performed. If the skeleton model is being read, the expression adding process can be executed by two matrix operations. Therefore, it is also possible to create a virtual facial expression by adding a facial expression in real time to the face of a person acquired with a video camera.
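Why the order of S707 and S708 matters can be seen from the fact that matrix products do not commute in general. The sketch below uses two hypothetical 2x2 matrices acting on a single 2D point, with a shear standing in for the expression and an anisotropic scale standing in for the skeleton model. These are illustrative choices only; the actual models act on all vertices of the face mesh.

```python
def mat_vec(matrix, vector):
    """Multiply a matrix (tuple of rows) by a vector."""
    return tuple(sum(a * b for a, b in zip(row, vector)) for row in matrix)


M_E = ((1.0, 0.5), (0.0, 1.0))  # hypothetical expression matrix (shear)
M_C = ((2.0, 0.0), (0.0, 1.0))  # hypothetical skeleton matrix (scale)


def expression_then_skeleton(vertex):
    """S707 followed by S708: Mc・(Me・v)."""
    return mat_vec(M_C, mat_vec(M_E, vertex))


def skeleton_then_expression(vertex):
    """Reversed order: Me・(Mc・v)."""
    return mat_vec(M_E, mat_vec(M_C, vertex))


v = (1.0, 1.0)
print(expression_then_skeleton(v))  # (3.0, 1.0)
print(skeleton_then_expression(v))  # (2.5, 1.0)
```

In the intended order the expression is applied in the person-independent space of the general-purpose face model, where the stored parameters were defined, and only then carried over to the individual.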

以上のように、入力されたデータ（特定個人の画像と奥行き情報）から個人差を吸収した表情の汎用変動モデルを生成し、所望する顔の汎用変動モデルを利用して、所望する顔の任意の表情を作り出すことができる。 As described above, a general-purpose variation model of a facial expression that absorbs individual differences is generated from the input data (a specific individual's image and depth information), and the desired face can be arbitrarily selected using the general-purpose variation model of the desired face. Can create a facial expression.

このように、個人ごとの骨格モデル表現により個人差を吸収し、表情による変動を汎用変動モデルとして算出して、万人に適用可能な表情の汎用変動モデルを登録した変動モデルデータベースを構築することができる。また、人物が変更されても、骨格モデルを再定義することで、汎用変動モデルを利用して、特定人物の個人顔モデルに所望の表情変化を付加することができる。 In this way, individual differences are absorbed by representing each individual with a skeleton model, and the variation due to expression is calculated as a general-purpose variation model, so that a variation model database in which general-purpose variation models of expressions applicable to anyone are registered can be constructed. Even when the person changes, redefining the skeleton model makes it possible to add a desired expression change to the personal face model of the specific person using the general-purpose variation model.

以下、本発明にかかる実施例2の情報処理を説明する。なお、実施例2において、実施例1と略同様の構成については、同一符号を付して、その詳細説明を省略する。 The information processing according to the second embodiment of the present invention will be described below. Note that the same reference numerals in the second embodiment denote the same parts as in the first embodiment, and a detailed description thereof will be omitted.

実施例1では、人物Aの骨格モデル101に基づき汎用顔モデル100を変形した個人顔モデル102を作成し、人物Aの表情付き個人顔モデル103（表情による三次元形状の変動モデル）を取得する。そして、骨格モデル101を利用して表情付き個人顔モデル103から万人に適用可能な汎用変動モデル104を生成して、汎用変動モデル104を利用して任意の表情の付加を行う方法を説明した。 In the first embodiment, the personal face model 102 is created by deforming the general-purpose face model 100 based on the skeleton model 101 of person A, and the personal face model 103 with an expression of person A (a variation model of the three-dimensional shape due to expression) is acquired. A method was then described in which the skeleton model 101 is used to generate, from the personal face model 103 with an expression, the general-purpose variation model 104 applicable to anyone, and the general-purpose variation model 104 is used to add an arbitrary expression.

実施例2では、実施例1と同様に入力データに含まれる奥行き情報から、表情による三次元形状の変動モデルを抽出し、人物Aの表情変動を人物Bの個人顔モデルに適用する方法を説明する。ただし、実施例2においては、骨格モデルを利用した個人差の吸収は行わず、人物Aの表情変動を直接人物Bの個人顔モデルに適用する。勿論、実施例2においても、実施例1と同様、人物Aの表情から表情による変動モデルデータベース105を構築し、人物Bの個人顔モデルに当て嵌めて所望の表情を作り出すことを目的とする。 In the second embodiment, a method is described in which, as in the first embodiment, a variation model of the three-dimensional shape due to expression is extracted from the depth information included in the input data, and the expression variation of person A is applied to the personal face model of person B. In the second embodiment, however, individual differences are not absorbed using a skeleton model; the expression variation of person A is applied directly to the personal face model of person B. As in the first embodiment, the aim of the second embodiment is of course to construct the variation model database 105 of expressions from the expressions of person A and to fit it to the personal face model of person B to create a desired expression.

図8のブロック図により実施例2の表情生成装置の構成例を示す。 A block diagram of FIG. 8 shows a configuration example of the facial expression generation apparatus according to the second embodiment.

実施例1では、人物Aと人物Bの骨格モデルを精度よく算出することで個人差を吸収し、汎用変動モデル104を作成する例を説明した。実施例2は、同一人物の無表情顔と有表情顔の差分から変動モデルを生成する。 In the first embodiment, an example has been described in which the general variation model 104 is created by absorbing individual differences by calculating the skeleton models of the person A and the person B with high accuracy. In the second embodiment, a variation model is generated from the difference between the expressionless face and the expressional face of the same person.

The general-purpose variation model 104 of the first embodiment models expression variation relative to the general-purpose face model 100. The variation model 802 of the second embodiment, by contrast, describes in a relative coordinate system the movement of each vertex as the shape changes from the expressionless face model 800 of person A to the face model with expression 801. Here, the relative coordinate system is, for example, a coordinate system in which markers are attached to the face, the point determined by each marker is taken as a reference point (origin), and the marker position after the shape change is expressed in two-dimensional or three-dimensional coordinates. Each marker is assigned an ID corresponding to its position on the face. In other words, the variation model in the relative coordinate system (hereinafter, the relative coordinate variation model) records, for each marker ID, the coordinates of the marker position in the face model with expression 801, with the marker position in the expressionless face model 800 as the origin.

Although an example in which markers set the origins of the relative coordinate system has been described, facial landmarks (locations whose position can be uniquely identified, such as the outer corners of the eyes and the corners of the mouth) may be used as the reference points (origins) instead of markers.

Then, an expressionless face model 804 of person B, corresponding to the reference expression, is prepared, and the relative coordinate variation model 802 is applied to each vertex of the expressionless face model 804 to generate a face model with expression 805 of person B having an arbitrary expression.

Specifically, the three-dimensional shape of the face of person A in a relaxed state (a state with as little change in expression as possible) is acquired over a plurality of frames. Taking into account the variation of the vertices across the acquired frames, average vertices are calculated, and the face shape formed by these vertices is used as the expressionless face model 800. For this, it is desirable that all vertices constituting the three-dimensional shape of the face correspond across the frames.

Patent Document 2 discloses an acquisition method that establishes dense correspondence between the vertices of a face across a plurality of frames. In the method of Patent Document 2, fluorescent paint is applied to the subject's face instead of markers, a strobe is fired under timing control, and the pattern of dense and sparse areas produced by the uneven application of the paint is used to acquire a high-precision depth image. The advantage of this technique is that the fine pattern produced by the uneven paint allows the dense shape of the face to be reproduced by pattern matching. Furthermore, corresponding points between frames can be searched accurately until the fluorescent paint is removed from the face.

In the second embodiment, the expressionless face model 800 of person A is generated using, for example, the technique disclosed in Patent Document 2. Then, on the premise that all vertices correspond across the frames, person A is asked to make several expressions, and expression samples are acquired.

Thus, the face model with expression 801 of person A is a face model with expression acquired in the same state as, or in the same shooting environment as, that in which the expressionless face model 800 of person A was generated. Note that with the technique disclosed in Patent Document 2, the search for corresponding points between frames becomes difficult once the fluorescent paint has been removed from the face.

The relative coordinate variation model 802 records the variation (expressed as a movement amount or a vector in 3D space) of all corresponding points between the expressionless face model 800 and the face model with expression 801 of person A obtained in this way.
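
As a minimal sketch of such a record, assuming both face models are given as dictionaries that map a vertex (or marker) ID to an (x, y, z) tuple, the relative coordinate variation model can be held as one displacement vector per ID. The function name, variable names, and sample coordinates are illustrative assumptions and do not appear in the specification.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def relative_variation_model(neutral: Dict[int, Vec3],
                             expressive: Dict[int, Vec3]) -> Dict[int, Vec3]:
    """Per-vertex displacement from the neutral face to the expressive face.

    Each neutral vertex acts as the origin of its own relative coordinate
    system, so the stored value is simply expressive - neutral.
    """
    if neutral.keys() != expressive.keys():
        raise ValueError("vertex IDs must correspond one-to-one")
    return {
        vid: tuple(e - n for n, e in zip(neutral[vid], expressive[vid]))
        for vid in neutral
    }


# Hypothetical data: two corresponding vertices of the same person.
neutral_800 = {0: (10.0, 20.0, 5.0), 1: (30.0, 40.0, 6.0)}
expressive_801 = {0: (11.0, 22.0, 5.5), 1: (30.0, 39.0, 6.0)}
model_802 = relative_variation_model(neutral_800, expressive_801)
```

Keying the record by ID mirrors the per-marker description above; an array indexed by vertex number would serve equally well when the vertex order is fixed.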

As in the first embodiment, expression categories are set in advance in the variation model database 105. When the relative coordinate variation model 802 is registered, the user of the system checks the face model with expression 801 of person A and attaches an expression label indicating the appropriate expression category.

The expressionless face model 804 of person B is generated in the same way as the expressionless face model 800 of person A. To apply the relative coordinate variation model 802 generated from the expressions of person A to the expressionless face model 804 of person B, the vertices of the face models of person A and person B must correspond. However, person A and person B differ greatly in facial skeleton and surface area, and the fluorescent paint is applied differently (the pattern of unevenness differs), so it is difficult to search for corresponding points between all vertices of person A's face model and all vertices of person B's face model.

Patent Document 3 discloses processing for obtaining correspondence between frames of detailed polygon models.

The method disclosed in Patent Document 3, which associates CG objects of different shapes, applies polygon reduction processing to the dense shapes (high-level polygon models) of objects A and B to reduce the number of polygons and generate sparse shapes (low-level polygon models). The polygon reduction process is saved as metadata. After the vertices of the low-level polygon models are associated with one another, each polygon model is restored to a high-level polygon model using that metadata. This processing makes it possible to associate the high-definition polygon objects A and B.

However, if polygon reduction is applied uniformly to the polygons constituting the face shape by the method of Patent Document 3, information is lost for parts such as the eyes, eyebrows, and mouth, which carry more information than other parts. Therefore, instead of applying polygon reduction uniformly, it is necessary, as in Non-Patent Document 3, to weight the vertices of the expressionless face model 804 of person B that correspond to the vertices of the expressionless face model 800 of person A so that their removal is suppressed.

The vertices of the low-level polygon models may be associated, for example, by having the user of the system manually specify the vertices around the eyes, mouth, and nose, which are the important parts of the face. Alternatively, a method such as the Active Appearance Model (see Non-Patent Document 8) may be used to obtain the correspondence between vertices of the facial feature parts.

By associating the vertices in this way, correspondence is established between the expressionless face model 800 of person A and the expressionless face model 804 of person B. Then, by applying the relative coordinate variation model 802 calculated from the expressions of person A to each vertex constituting the expressionless face model 804 of person B, the expression variation of person A can be applied to the face model of person B.
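
The transfer step can be sketched as follows, assuming the vertex correspondence has already been found and is given as a mapping from vertex IDs of person B to vertex IDs of person A. The data layout, the function name, and the choice to leave unmatched vertices in place are assumptions made for illustration only.

```python
from typing import Dict, Tuple

Vec3 = Tuple[float, float, float]


def apply_variation(neutral_b: Dict[int, Vec3],
                    variation_a: Dict[int, Vec3],
                    b_to_a: Dict[int, int]) -> Dict[int, Vec3]:
    """Move each vertex of person B by the displacement of its matched vertex of A.

    Vertices of B with no counterpart in A are left where they are.
    """
    result = {}
    for vid_b, pos in neutral_b.items():
        vid_a = b_to_a.get(vid_b)
        delta = variation_a.get(vid_a, (0.0, 0.0, 0.0))
        result[vid_b] = tuple(p + d for p, d in zip(pos, delta))
    return result


# Hypothetical data: B's vertex 100 corresponds to A's vertex 0.
neutral_804 = {100: (12.0, 18.0, 4.0), 101: (0.0, 0.0, 0.0)}
variation_802 = {0: (1.0, 2.0, 0.5)}
face_805 = apply_variation(neutral_804, variation_802, {100: 0})
```

The sketch copies displacements unchanged, in keeping with the second embodiment's direct application of person A's variation; it makes no attempt to compensate for differences in face size between the two people.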

In the second embodiment, instead of using a skeleton model, the expressionless face model 800 of an individual is used to generate the relative coordinate variation model 802, which models the shape variation caused by expression. Taking the expression "smile" as an example, a smile shows the same facial shape variation regardless of race or gender: the outer corners of the eyes lower and the corners of the mouth rise. Shape variation common to all people can be described adequately by a variation model referenced to the expressionless face model 800, without using a skeleton model.

● Variation model database generation unit
The block diagram of FIG. 9 shows a configuration example of the variation model database generation unit 403 and the arbitrary expression generation unit 401 of the second embodiment.

As in the first embodiment, the input data acquisition unit 402 inputs an RGB image of the face of the subject (for example, person A) and high-definition depth information.

The expression face model generation unit 900 takes the mean or median over a plurality of frames to remove minute fluctuations of each vertex of the high-definition polygon model input from the input data acquisition unit 402, and generates the expressionless face model 800 and the face model with expression 801.
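
A minimal sketch of this per-vertex stabilization is given below. It assumes every frame lists the same vertices in the same order, that is, that correspondence across frames is already established; the function name and sample values are hypothetical.

```python
from statistics import mean, median
from typing import Callable, List, Sequence, Tuple

Vec3 = Tuple[float, float, float]
Frame = List[Vec3]  # one entry per vertex, same vertex order in every frame


def stabilize_vertices(frames: Sequence[Frame],
                       reducer: Callable[[List[float]], float] = mean) -> Frame:
    """Collapse several frames into one model by reducing each coordinate."""
    if not frames:
        raise ValueError("at least one frame is required")
    n_vertices = len(frames[0])
    if any(len(f) != n_vertices for f in frames):
        raise ValueError("all frames must contain the same vertices")
    model = []
    for v in range(n_vertices):
        model.append(tuple(
            reducer([frame[v][axis] for frame in frames]) for axis in range(3)
        ))
    return model


# Hypothetical capture: one vertex over three frames, the last one an outlier.
frames = [[(1.0, 2.0, 3.0)], [(1.0, 2.0, 3.0)], [(4.0, 2.0, 3.0)]]
mean_model = stabilize_vertices(frames, mean)
median_model = stabilize_vertices(frames, median)
```

In this toy capture the median ignores the outlying frame while the mean is pulled toward it, which is one reason a system might prefer the median when occasional tracking errors are expected.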

The relative coordinate variation model calculation unit 901 calculates, as the relative coordinate variation model 802, the movement amount of each vertex between the expressionless face model 800 and the face model with expression 801, both of which are high-definition polygon models.

The expression category classification unit 504 performs the same processing as in the first embodiment: it attaches an expression category label to the relative coordinate variation model 802 calculated by the relative coordinate variation model calculation unit 901 and registers the model in the variation model DB 105. If necessary, attribute information such as the person's name is registered in the variation model DB 105 together with the expression category label.

The corresponding point search unit 902 of the arbitrary expression generation unit 401 performs the association processing described above and associates all vertices of the expressionless face model 800 of person A with those of the expressionless face model 804 of person B.

The face model with expression generation unit 406 adds expression variation to the expressionless face model 804 of person B, for which the association processing has been completed, by applying the relative coordinate variation model 802 that is stored in the variation model database 105 and corresponds to the expression category selected by the user of the system.

The processing of the variation model database generation unit 403 is described with reference to the flowchart of FIG. 10.

The expression face model generation unit 900 acquires input data (an image obtained by photographing the subject's face, shape information (depth information), and information on the subject) from the input data acquisition unit 402 (S1001). It then checks whether the expressionless face model 800 of the subject exists (whether it is stored in the memory of the system) (S1002). If the expressionless face model 800 of the subject exists, the processing proceeds to step S1004.

If the expressionless face model 800 of the subject does not exist, the expression face model generation unit 900 generates the expressionless face model 800 (S1003). Generating the expressionless face model 800 requires input data spanning a plurality of temporally consecutive frames. The expression face model generation unit 900 therefore returns the processing to step S1001 and acquires input data until sufficient input data (samples) have been collected.

Next, the expression face model generation unit 900 acquires input data (an image obtained by photographing the subject's face, shape information (depth information), and information on the subject) from the input data acquisition unit 402 (S1004). It then checks whether a face model with expression of the subject exists (whether it is stored in the memory of the system) (S1005). If the face model with expression of the subject exists, the processing proceeds to step S1007.

If the face model with expression of the subject does not exist, the expression face model generation unit 900 generates the face model with expression (S1006). Generating the face model with expression requires input data spanning a plurality of temporally consecutive frames. The expression face model generation unit 900 therefore returns the processing to step S1004 and acquires input data until sufficient input data (samples) have been collected.

The expressionless face model 800 and the face model with expression 801 of the subject are expected to change subtly owing to various factors such as the condition of the skin at the time, the mood of the day, and aging. In this embodiment, therefore, the expressionless face model 800 and the face model with expression 801 are assumed to be generated as one-off data and are not saved to a storage device or the like. Depending on the system being built, however, once generated, they may be saved to a storage device of the system together with the attribute information of the subject.

Next, the relative coordinate variation model calculation unit 901 calculates the movement amount of each vertex between the expressionless face model 800 and the face model with expression 801 as the relative coordinate variation model 802 (S1007).

Next, the expression category classification unit 504 attaches an expression category label to the calculated relative coordinate variation model 802 and registers the relative coordinate variation model 802 in the variation model DB 105 (S1008).
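
The registration in step S1008 can be pictured with a small in-memory stand-in for the variation model DB 105. This is a sketch under stated assumptions: the class name, its methods, and the storage layout are hypothetical, and the specification does not prescribe how the database is implemented.

```python
from typing import Dict, List, Tuple

Vec3 = Tuple[float, float, float]
VariationModel = Dict[int, Vec3]


class VariationModelDB:
    """Toy in-memory stand-in for the variation model DB 105."""

    def __init__(self, categories: List[str]) -> None:
        # Expression categories are fixed in advance, as described in the text.
        self._models: Dict[str, List[dict]] = {c: [] for c in categories}

    def register(self, label: str, model: VariationModel, **attributes) -> None:
        """Store a model under its expression label, with optional attributes."""
        if label not in self._models:
            raise KeyError(f"unknown expression category: {label}")
        self._models[label].append({"model": model, "attributes": attributes})

    def lookup(self, label: str) -> List[VariationModel]:
        """Return every model registered under the given expression label."""
        return [entry["model"] for entry in self._models[label]]


db = VariationModelDB(["smile", "anger"])
db.register("smile", {0: (1.0, 2.0, 0.5)}, person="A")
smiles = db.lookup("smile")
```

Rejecting labels outside the preset categories reflects the statement that the categories are set in advance; the keyword attributes correspond to optional information such as the person's name.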

The processing of the arbitrary expression generation unit 401 of the second embodiment is described with reference to the flowchart of FIG. 11.

The expression face model generation unit 900 acquires input data (an image obtained by photographing the subject's face, shape information (depth information), and information on the subject) from the input data acquisition unit 402 (S1101). It then checks whether the expressionless face model 804 of the subject exists (whether it is stored in the memory of the system) (S1102). If the expressionless face model 804 of the subject exists, the processing proceeds to step S1104.

If the expressionless face model 804 of the subject does not exist, the expression face model generation unit 900 generates the expressionless face model 804 (S1103). Generating the expressionless face model 804 requires input data spanning a plurality of temporally consecutive frames. The expression face model generation unit 900 therefore returns the processing to step S1101 and acquires input data until sufficient input data (samples) have been collected.

Next, the corresponding point search unit 902 determines whether correspondence has been established between the vertices of the expressionless face model 800 and those of the expressionless face model 804 (S1104), and if not, performs the corresponding point search processing described above (S1105).

If the vertices correspond, or when the corresponding point search has finished, the face model with expression generation unit 406 has the user of the system select a desired expression category from the expression categories set in the variation model DB 105, as in step S706, and acquires the selection information (S1106). It then applies the relative coordinate variation model 802 corresponding to the selection information to each vertex of the expressionless face model 804 of person B, for which vertex correspondence has been established, and generates the face model with expression 805 of person B in which each vertex has been moved (S1107).

As described above, the facial expression generation apparatus of the second embodiment uses depth data (depth information) to generate an expressionless face model of each individual, and stores the variation from the expressionless face model caused by a change in expression in the variation model DB 105 as the relative coordinate variation model 802. This makes it possible to add an expression to an arbitrary face model.

Note that the facial expression generation apparatus of the second embodiment can also describe an expression variation model by two-dimensional image processing, without using depth data. For example, the coordinate values of a group of facial feature points can be calculated using the Active Appearance Model (AAM) described in Non-Patent Document 8. An expressionless face model and a face model with expression can be generated from these coordinate values, and with the method described in the second embodiment an expression variation model can be generated even from two-dimensional image features. An expression is then added to an arbitrary face model using a two-dimensional morphing technique (see Non-Patent Document 9).

Information processing according to a third embodiment of the present invention is described below. In the third embodiment, components substantially the same as those of the first and second embodiments are given the same reference numerals, and their detailed description is omitted.

The third embodiment describes a method that uses a texture variation model in addition to a two-dimensional or three-dimensional variation model. The following describes, as the third embodiment, an example in which a texture variation model is added to the configuration of the first embodiment. An embodiment in which a texture variation model is added to the configuration of the second embodiment is of course also possible.

The block diagram of FIG. 12 shows a configuration example of the variation model database generation unit 403 and the arbitrary expression generation unit 401 of the third embodiment.

The texture information calculation unit 1200 performs image processing on the RGB image input from the input data acquisition unit 402 and calculates texture information of the face. The texture information includes, for example, information such as wrinkles, pores, pupil color, lip color, eyebrow shape, and skin color under a uniform lighting environment, as well as region pattern information (textons) having image features in the vicinity of facial landmarks (the outer corners of the eyes, the corners of the mouth, the ends of the eyebrows, and so on). A texton is obtained by clustering an image according to some rule and makes the salient features of the image easier to grasp. Textons are described using various features such as Gabor features (see Non-Patent Document 10) and LBP features (see Non-Patent Document 11).

In general, texture information can be calculated by the following two methods.

The first method identifies the face region in a two-dimensional input image by, for example, the Active Appearance Model (AAM) mentioned above or face detection using Haar-like features (see Non-Patent Document 12). The position of the face is then registered between frames, and the texture variation of the face is acquired. The texture variation model obtained by this method describes, as a time-series signal, the change in luminance or color at each coordinate of the two-dimensional image between frames.
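
A minimal sketch of such a time-series description follows. It assumes the frames are already registered (aligned), grayscale, and equal in size, and it represents the variation as the per-pixel luminance difference between consecutive frames; the function name and sample values are hypothetical.

```python
from typing import List

Image = List[List[float]]  # grayscale image as rows of luminance values


def luminance_variation(frames: List[Image]) -> List[Image]:
    """Per-pixel luminance change between consecutive, already aligned frames."""
    if len(frames) < 2:
        raise ValueError("at least two frames are required")
    height, width = len(frames[0]), len(frames[0][0])
    for f in frames:
        if len(f) != height or any(len(row) != width for row in f):
            raise ValueError("frames must share one size")
    return [
        [[cur[y][x] - prev[y][x] for x in range(width)] for y in range(height)]
        for prev, cur in zip(frames, frames[1:])
    ]


# Hypothetical 1x2 face crops over three frames.
frames = [[[10.0, 10.0]], [[12.0, 10.0]], [[15.0, 9.0]]]
deltas = luminance_variation(frames)
```

For color images the same difference could be taken per channel; face detection and registration, which the text assigns to AAM or Haar-like detection, are outside the scope of this sketch.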

The second method acquires the color of each vertex of the three-dimensionally reconstructed face shape from the input image, unwraps the color of each vertex of the three-dimensional face model into texture coordinates, and calculates the texture information (see Non-Patent Document 13).

In the standard CG technique described in Non-Patent Document 13, when a texture is mapped onto a parametric surface such as the surface shape of a face, a point (x, y, z) on the surface in three-dimensional space is generally expressed by two parameters (u, v). By varying these parameters over the range 0 ≤ u ≤ 1, 0 ≤ v ≤ 1, the three-dimensional coordinates of points on the surface and the pixel values of the texture can be obtained. With this texture mapping technique, the parameters (u, v) can be defined as texture coordinates.

Then, by obtaining in advance a transformation matrix between the vertex coordinates (x, y, z) of the three-dimensional object and the texture coordinates (u, v), the color of each vertex of the three-dimensionally reconstructed face shape can be projected onto the texture coordinates (u, v). Any technique suited to the system being built may be used to acquire the texture information.
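
The projection of vertex colors into texture coordinates can be sketched as below. The sketch assumes the (u, v) coordinate of every vertex is already known, for example from the precomputed transformation, and simply writes each color to the nearest texel; the function name, the texture size, and the nearest-texel rule are illustrative assumptions.

```python
from typing import List, Sequence, Tuple

Color = Tuple[int, int, int]
UV = Tuple[float, float]


def splat_vertex_colors(uvs: Sequence[UV], colors: Sequence[Color],
                        width: int, height: int,
                        background: Color = (0, 0, 0)) -> List[List[Color]]:
    """Write each vertex color into the texel addressed by its (u, v)."""
    if len(uvs) != len(colors):
        raise ValueError("one color per vertex is required")
    texture = [[background] * width for _ in range(height)]
    for (u, v), color in zip(uvs, colors):
        if not (0.0 <= u <= 1.0 and 0.0 <= v <= 1.0):
            raise ValueError("texture coordinates must lie in [0, 1]")
        # Clamp so that u == 1.0 or v == 1.0 lands on the last texel.
        col = min(int(u * width), width - 1)
        row = min(int(v * height), height - 1)
        texture[row][col] = color
    return texture


uvs = [(0.0, 0.0), (1.0, 1.0), (0.5, 0.0)]
colors = [(255, 0, 0), (0, 255, 0), (0, 0, 255)]
texture = splat_vertex_colors(uvs, colors, width=4, height=2)
```

A practical system would also interpolate colors across each polygon rather than fill isolated texels; that step is omitted here for brevity.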

The texture variation model generation unit 1201 uses the skeleton model to absorb individual differences from the texture information calculated by the texture information calculation unit 1200, and models the texture variation component caused by expression variation.

This processing is performed by extracting texture information (features such as textons) from the texture image unwrapped into texture coordinates, either from the entire image or for each local region. Assuming that texture information is taken from the reconstructed three-dimensional shape, the regions around the eyes, nose, and mouth of the unwrapped texture are mapped to polygon coordinates that are uniquely determined in texture coordinates. This mapping method is generally called parameterized texture mapping or UV mapping (see Non-Patent Document 14). It presupposes, however, that all vertices correspond between the different three-dimensional face shapes.

The three-dimensional face shape of person A obtained from the input data is first deformed into the general-purpose face model 100 using the skeleton model 101 of person A. Converting to the general-purpose face model 100 makes it possible to associate the three-dimensional face shapes of the individual expressions with one another. The texture information of each associated vertex is then unwrapped into texture coordinates to generate a texture image. In the third embodiment, the generated texture image is registered in the variation model DB 105 as a texture variation model.

The texture variation model may consist of a single texture image, or the variation model may be constructed by calculating the time-series variation of the obtained texture images using optical flow estimation (see Non-Patent Document 15).

When the texture variation model is registered in the variation model DB 105, the variation model of the texture component on the face (texture variation model) is registered as a pair with the variation model of the three-dimensional shape component (3D shape variation model) to which an expression category label has been attached. The 3D shape variation model is expressed as the displacement vector of each corresponding vertex, or as the displacement vector of the normal within each corresponding polygon. The 3D shape variation model and the texture variation model may be calculated and registered for each local region of the face.

The face model with expression generation unit 505 randomly selects a 3D shape variation model and a texture variation model registered in the expression category selected by the user of the system, and applies them to the general-purpose face model 100. The general-purpose face model to which the variation models have been applied is then deformed into the face of person A or person B registered in the skeleton model DB 501, so that an arbitrary expression can be added to an arbitrary face model.

When the 3D shape variation model is defined by target shape weights w, the textures may likewise be blended using the same weights w as the 3D shape variation model and then applied to the general-purpose face model 100. Alternatively, the texture may be switched for each local region (part) of the face.
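
As a minimal sketch of this blending, assume each target shape has an associated grayscale texture of the same size and that the weights w have already been determined for the shape. The function name and sample values are hypothetical; the specification does not fix the blending formula beyond the reuse of w.

```python
from typing import List, Sequence

Image = List[List[float]]  # grayscale texture as rows of pixel values


def blend_textures(textures: Sequence[Image], weights: Sequence[float]) -> Image:
    """Weighted per-pixel sum of same-sized textures."""
    if len(textures) != len(weights) or not textures:
        raise ValueError("one weight per texture is required")
    height, width = len(textures[0]), len(textures[0][0])
    return [
        [sum(w * t[y][x] for t, w in zip(textures, weights)) for x in range(width)]
        for y in range(height)
    ]


# Hypothetical textures tied to two target shapes, reusing the shape weights.
target_textures = [[[0.0, 100.0]], [[200.0, 100.0]]]
w = [0.25, 0.75]
blended = blend_textures(target_textures, w)
```

Reusing one weight vector for both shape and texture keeps the two in step: a face that is three quarters of the way toward a target shape also shows three quarters of that target's texture.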

The displacement of the 3D shape at each vertex may also be obtained as a weighted linear sum of variation models belonging to a plurality of expression categories, in other words, of a plurality of variation models with different expression categories, and this displacement may be fitted to the reference expression to generate a face model with an arbitrary expression.
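
The weighted linear sum can be sketched as follows, assuming each variation model stores one displacement vector per vertex ID and all models share the same IDs. The function names, the category names, and the weights are illustrative assumptions.

```python
from typing import Dict, Sequence, Tuple

Vec3 = Tuple[float, float, float]
VariationModel = Dict[int, Vec3]


def blend_variation_models(models: Sequence[VariationModel],
                           weights: Sequence[float]) -> VariationModel:
    """Weighted linear sum of per-vertex displacements."""
    if len(models) != len(weights) or not models:
        raise ValueError("one weight per model is required")
    blended = {}
    for vid in models[0]:
        blended[vid] = tuple(
            sum(w * m[vid][axis] for m, w in zip(models, weights))
            for axis in range(3)
        )
    return blended


def fit_to_reference(reference: Dict[int, Vec3],
                     displacement: VariationModel) -> Dict[int, Vec3]:
    """Add the blended displacement to the reference (expressionless) face."""
    return {
        vid: tuple(p + d for p, d in zip(pos, displacement[vid]))
        for vid, pos in reference.items()
    }


# Hypothetical models from two expression categories.
smile = {0: (2.0, 0.0, 0.0)}
surprise = {0: (0.0, 4.0, 0.0)}
mixed = blend_variation_models([smile, surprise], [0.5, 0.25])
face = fit_to_reference({0: (10.0, 10.0, 10.0)}, mixed)
```

Because the displacements are combined before being added to the reference face, intermediate expressions such as "half smile, slightly surprised" can be produced without registering them as separate categories.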

The processing of the third embodiment is substantially the same as that of the first embodiment, and a detailed description is omitted.

[Other Embodiments]
The present invention can also be realized by the following processing: software (a program) that implements the functions of the embodiments described above is supplied to a system or apparatus via a network or various storage media, and a computer (or CPU, MPU, or the like) of that system or apparatus reads out and executes the program.

Claims

An information processing apparatus comprising:
skeleton model generation means for generating a skeleton model for converting a general-purpose facial skeleton model into an individual's facial skeleton model;
calculation means for calculating, using the individual's facial skeleton model and the skeleton model, a variation model in which individual differences in facial expression are absorbed; and
registration means for classifying the variation model by predetermined expression category and registering the classified variation model in a database,
wherein the calculation means calculates the variation model as a weighted linear sum of a plurality of target shapes prepared in advance, and calculates the weights of the target shapes using a pseudo-inverse matrix.

The information processing apparatus according to claim 1, wherein the skeleton model generation means generates a skeleton model from the facial skeleton model of the individual's reference expression, and generates, from the individual's facial skeleton model with an expression, a skeleton model with an expression that represents the variation of each vertex of the skeleton model.

The information processing apparatus according to claim 2, wherein the calculation means calculates, from the skeleton model with an expression and using the facial skeleton model of the individual's reference expression, the variation model in which individual differences are absorbed.

The information processing apparatus according to any one of claims 1 to 3, further comprising face model generation means for generating a face model of the individual with an expression by applying the variation model corresponding to the expression category selected by a user to the skeleton model that the skeleton model generation means generated from the facial skeleton model of the individual's reference expression.

An information processing method for an information processing apparatus having generation means, calculation means, and registration means, wherein:
the generation means generates a skeleton model for converting a general-purpose facial skeleton model into an individual's facial skeleton model;
the calculation means calculates, using the individual's facial skeleton model and the skeleton model, a variation model in which individual differences in facial expression are absorbed;
the registration means classifies the variation model by predetermined expression category and registers the classified variation model in a database; and
the calculation means calculates the variation model as a weighted linear sum of a plurality of target shapes prepared in advance, and calculates the weights of the target shapes using a pseudo-inverse matrix.

A program for causing a computer to function as each means of the information processing apparatus according to any one of claims 1 to 5.