JP2016099759A

JP2016099759A - Face detection method, face detection device, and face detection program

Info

Publication number: JP2016099759A
Application number: JP2014235232A
Authority: JP
Inventors: 嘉伸海老澤; Yoshinobu Ebisawa
Original assignee: Shizuoka University NUC
Current assignee: Shizuoka University NUC
Priority date: 2014-11-20
Filing date: 2014-11-20
Publication date: 2016-05-30
Anticipated expiration: 2034-11-20
Also published as: JP6452235B2

Abstract

PROBLEM TO BE SOLVED: To provide a method, device, and program for obtaining an accurate result of face detection. SOLUTION: A face detection method includes a function derivation step S3 of deriving a function for correcting a face direction; a face attitude derivation step S6 of deriving a face attitude; and a face attitude correction step S7 of correcting the face attitude. The face direction is derived by using the distances between parts in a reference part group, which is a combination of three parts (the left pupil, the right pupil, and one of the right and left nostrils of a subject), together with the two-dimensional position of the reference part group detected in a face image of the subject, and calculating the direction of the normal of the plane containing the reference part group. The line of sight and a reference face direction are derived by performing stereo matching of two face images of the subject. SELECTED DRAWING: Figure 7

Description

The present invention relates to a face detection method, a face detection device, and a face detection program.

In fields such as automotive technology, techniques are being studied that measure a driver's line of sight and head posture and use these data to assist driving. Patent Documents 1 to 3 disclose gaze detection devices that detect a subject's line of sight with high accuracy using the pupil and the corneal reflection image. Patent Documents 4 to 6 disclose devices that detect a subject's head posture. Here, head posture means the position and direction of the skull, independent of the rotation of the subject's eyeballs. The head posture detection devices of Patent Documents 4 and 5 use the so-called stereo matching method. Specifically, head image data of the subject is acquired with a stereo-calibrated optical system. Two head images are then stereo-matched to obtain the three-dimensional coordinates of the subject's pupils and nostrils. From these coordinates, the centroid of the two pupils and the midpoint between the nostrils is calculated and taken as the head position. The plane passing through the pupils and the inter-nostril midpoint is also calculated, and its normal is taken as the head direction.

The head posture detection method of Patent Document 6, in contrast, estimates the three-dimensional positions of feature points such as the pupils and the inter-nostril midpoint using a single optical system. That is, it does not use the stereo matching method. It estimates the three-dimensional positions of the pupils and the inter-nostril midpoint from the image data obtained by the single optical system, using the mutual distances between the two pupils and each nostril as constraint conditions.

Patent Document 1: JP 2005-230049 A
Patent Document 2: JP 2005-198743 A
Patent Document 3: JP 2005-185431 A
Patent Document 4: JP 2005-266868 A
Patent Document 5: JP 2007-26073 A
Patent Document 6: JP 2007-271554 A

Because the head posture detection devices of Patent Documents 4 and 5 use the stereo matching method, random noise tends to be large. Because the head posture detection method of Patent Document 6 imposes the mutual distances between the two pupils and each nostril as constraint conditions, bias error tends to be large depending on the relative posture of the head and the eyeballs.

The present invention therefore provides a face detection method, a face detection device, and a face detection program that yield accurate face detection results.

A face detection method according to one aspect of the present invention includes: a function derivation step of deriving a function for correcting a face posture, the function including a coefficient that defines the relationship between a first angle, between the subject's face posture and line of sight, and a second angle, between a reference face posture and the face posture; a face posture derivation step of deriving the face posture; and a face posture correction step of correcting the face posture using the function derived in the function derivation step. The face posture is derived by calculating the normal of the plane containing a reference part group, which is a combination of three parts (the subject's left pupil, right pupil, and either the left or the right nostril), using the distances between parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject. The line of sight and the reference face posture are derived by stereo matching two face images of the subject.

This face detection method derives the face posture by a method that uses the distances between parts in the reference part group as constraint conditions, so it yields stable results with little random noise. The method then corrects the face posture using the function. Because the function includes a coefficient that defines the relationship between the first angle, between the subject's line of sight and the face posture, and the second angle, between the face posture and the reference face posture, it converts the derived face posture into a result equivalent to the reference face posture. The reference face posture is derived by stereo matching, so its bias from the true face posture is small. A stable face posture with little random noise is thus corrected to one with a small bias from the true face posture, and an accurate face detection result is obtained.

The function derivation step may include a step of deriving the face posture, a step of deriving the line of sight and the reference face posture, and a step of deriving the coefficient based on the first angle and the second angle. This function derivation step makes it possible to derive a function that includes a coefficient defining the relationship between the first angle and the second angle.

After the function derivation step has been executed once, the face posture derivation step and the face posture correction step may be executed repeatedly. In this case, the correction function is already available when the face posture derivation step and the face posture correction step are executed, so accurate face detection results are obtained immediately after these steps begin.

In the face detection method, the function derivation step, the face posture derivation step, and the face posture correction step may instead be executed repeatedly in this order. The function derivation step is then executed every time the face posture derivation step and the face posture correction step are executed. As the function derivation step is repeated, the coefficient is updated and converges to a certain value. Face posture detection can therefore begin without preparing the function in advance.
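The repeated coefficient update described above could be sketched in Python as follows. This is only an illustration under assumptions: the text does not state the update rule, so ordinary least squares over running sums is assumed, the relation between the two angles is assumed to be linear (second angle = k1 * first angle + k2), and the class name and the sample values are hypothetical.

```python
class RunningLinearFit:
    """Accumulates (first_angle, second_angle) samples and refits
    second_angle = k1 * first_angle + k2 on demand (assumed linear model)."""

    def __init__(self):
        self.n = 0
        self.sx = self.sy = self.sxx = self.sxy = 0.0

    def update(self, x, y):
        # One call per cycle of the function derivation step.
        self.n += 1
        self.sx += x
        self.sy += y
        self.sxx += x * x
        self.sxy += x * y

    def coefficients(self):
        # Returns None until the samples are sufficient for a fit.
        denom = self.n * self.sxx - self.sx ** 2
        if self.n < 2 or abs(denom) < 1e-12:
            return None
        k1 = (self.n * self.sxy - self.sx * self.sy) / denom
        k2 = (self.sy - k1 * self.sx) / self.n
        return k1, k2


fit = RunningLinearFit()
# Synthetic samples generated from made-up values k1 = 0.4, k2 = -0.1.
for first_angle in (-8.0, -3.0, 1.0, 6.0, 12.0):
    second_angle = 0.4 * first_angle - 0.1
    fit.update(first_angle, second_angle)
k1, k2 = fit.coefficients()
```

With noisy real measurements the estimate would settle gradually as samples accumulate; until `coefficients()` returns a value, the uncorrected face posture would have to be used.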

The face posture correction step may include: a step of transforming the line of sight and the face posture, both expressed in a first coordinate system, into a second coordinate system different from the first coordinate system; a step of obtaining the first angle in the second coordinate system from the transformed line of sight and face posture, and obtaining the second angle in the second coordinate system from that first angle and the function; a step of correcting the face posture in the second coordinate system using the second angle; and a step of transforming the corrected face posture back into the first coordinate system. This improves the detection accuracy of the face posture.

A face posture detection device according to another aspect of the present invention includes at least two imaging means that capture images of a subject's face, and processing means that derives the subject's face posture from the face images captured by the imaging means. The processing means has: a function derivation unit that derives a function for correcting the face posture, the function including a coefficient that defines the relationship between a first angle, between the subject's face posture and line of sight, and a second angle, between a reference face posture and the face posture; a face posture derivation unit that derives the face posture; and a face posture correction unit that corrects the face posture using the function derived by the function derivation unit. The face posture is derived by calculating the normal of the plane containing a reference part group, which is a combination of three parts (the subject's left pupil, right pupil, and either the left or the right nostril), using the distances between parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject. The line of sight and the reference face posture are derived by stereo matching two face images of the subject.

This face detection device provides the same effects as the face detection method described above. That is, the device derives the face posture by using the distances between parts in the reference part group as constraint conditions, so it yields stable results with little random noise. The device then corrects the face posture using the function. Because the function includes a coefficient that defines the relationship between the first angle, between the subject's line of sight and the face posture, and the second angle, between the face posture and the reference face posture, it converts the derived face posture into a result equivalent to the reference face posture. The reference face posture is derived by stereo matching, so its bias from the true face posture is small. A stable face posture with little random noise is thus corrected to one with a small bias from the true face posture, and an accurate face detection result is obtained.

The face posture correction unit may include: a first coordinate transformation unit that transforms the line of sight and the face posture, both expressed in a first coordinate system, into a second coordinate system different from the first coordinate system; an angle acquisition unit that obtains the first angle in the second coordinate system from the transformed line of sight and face posture, and obtains the second angle in the second coordinate system from that first angle and the function; a direction correction unit that corrects the face posture in the second coordinate system using the second angle; and a second coordinate transformation unit that transforms the corrected face posture back into the first coordinate system. This configuration improves the detection accuracy of the face posture.

A face detection program according to yet another aspect of the present invention causes a computer to function as: a function derivation unit that derives a function for correcting a face posture, the function including a coefficient that defines the relationship between a first angle, between the subject's face posture and line of sight, and a second angle, between a reference face posture and the face posture; a face posture derivation unit that derives the face posture; and a face posture correction unit that corrects the face posture using the function derived by the function derivation unit. The face posture is derived by calculating the normal of the plane containing a reference part group, which is a combination of three parts (the subject's left pupil, right pupil, and either the left or the right nostril), using the distances between parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject. The line of sight and the reference face posture are derived by stereo matching two face images of the subject.

This face detection program provides the same effects as the face detection method and face detection device described above. That is, the program derives the face posture by using the distances between parts in the reference part group as constraint conditions, so it yields stable results with little random noise. The face posture is then corrected using the function. Because the function includes a coefficient that defines the relationship between the first angle, between the subject's line of sight and the face posture, and the second angle, between the face posture and the reference face posture, it converts the derived face posture into a result equivalent to the reference face posture. The reference face posture is derived by stereo matching, so its bias from the true face posture is small. A stable face posture with little random noise is thus corrected to one with a small bias from the true face posture, and an accurate face detection result is obtained.

The face detection method, face detection device, and face detection program according to one aspect of the present invention yield accurate face detection results.

FIG. 1 is a diagram showing the face detection device according to the embodiment.
FIG. 2 is a diagram explaining the feature points.
FIG. 3 is a diagram for explaining the error that occurs in the constraint condition method.
FIG. 4 is a diagram for explaining the error that occurs in the constraint condition method.
FIG. 5 is a diagram explaining the principle of face posture correction.
FIG. 6 is a diagram explaining the principle of face posture correction.
FIG. 7 is a diagram showing the main steps of the face detection method according to the embodiment.
FIG. 8 is a diagram for explaining the effect of the face detection method according to the embodiment.
FIG. 9 is a diagram showing the lens portions of the pupil camera and the nostril camera.
FIG. 10 is a diagram showing the hardware configuration of the image processing device according to the embodiment.
FIG. 11 is a diagram showing the functional configuration of the face detection device according to the embodiment.
FIG. 12 is a diagram for explaining the relationship between the world coordinate system and the camera coordinate system.
FIG. 13 is a diagram showing the positional relationship between the camera coordinate system and the face coordinate system.
FIG. 14 is a diagram showing the relationship between the image plane and the three-dimensional coordinates of the feature points in a two-dimensional coordinate system whose origin is the center of the lens of the nostril camera.
FIG. 15 is a diagram for explaining detection of the line of sight.
FIG. 16 is a diagram for explaining the method of determining the coefficients.
FIG. 17 is a diagram showing the configuration of the face detection program according to the embodiment.

Embodiments of the present invention are described in detail below with reference to the accompanying drawings. In the description of the drawings, identical elements are given identical reference signs, and redundant description is omitted.

The face detection method according to one aspect of the present invention is executed by the face detection device 1 shown in FIG. 1. The face detection device 1 is a computer system that detects the line of sight and the face posture of a subject A, and the face detection method is carried out by this system. The subject A is the person whose line of sight and face posture are to be detected, and may also be called a test subject. The line of sight is the line connecting the center of the subject A's pupil and the subject's gaze point (the point the subject is looking at). The term "line of sight" covers the notions of start point, end point, and direction. The face posture is determined by the direction and the centroid of the face and is represented by a face posture vector described later. In other words, face posture means the position and direction of the skeleton (skull) and is independent of eyeball rotation. The purposes for which the face detection device 1 and the face detection method may be used are not limited in any way; examples include detecting inattentive driving, detecting driver drowsiness, surveying the degree of interest in products, and entering data into a computer.

First, the basic principle of the face detection method according to this embodiment is described. As shown in FIG. 2, the face posture vector V_B, which indicates the face posture of the subject A, is derived using the subject's pupils and nostrils. For example, the face posture vectors V_F and V_B are derived as the normal vector of the plane passing through three points: the left pupil coordinate P_1, the right pupil coordinate P_2, and the inter-nostril center coordinate P_0. The left pupil, right pupil, left nostril center, right nostril center, and inter-nostril center are feature points. There are two ways to calculate the three-dimensional coordinates of the feature points: one uses predetermined constraint conditions, and the other uses stereo matching. For convenience, the former is called the "constraint condition method" and the latter the "stereo method" in the following description.
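The normal-vector construction described above can be sketched in Python as follows. This is a minimal illustration, not the patent's implementation: the function names, the millimetre coordinates, and the sign convention of the normal (the order of the cross product) are assumptions made here.

```python
import math


def subtract(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def normalize(v):
    norm = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if norm == 0.0:
        raise ValueError("feature points are collinear or identical")
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def face_posture(p1, p2, p0):
    """Return (centroid, unit normal) of the plane through the left pupil p1,
    the right pupil p2, and the inter-nostril center p0."""
    normal = normalize(cross(subtract(p2, p1), subtract(p0, p1)))
    centroid = tuple((a + b + c) / 3.0 for a, b, c in zip(p1, p2, p0))
    return centroid, normal


# Hypothetical coordinates in millimetres, camera at the origin,
# face 500 mm away and facing the camera.
P1 = (-32.0, 0.0, 500.0)   # left pupil (which side is "left" is assumed)
P2 = (32.0, 0.0, 500.0)    # right pupil
P0 = (0.0, -40.0, 500.0)   # inter-nostril center
centroid, normal = face_posture(P1, P2, P0)
```

With these sample points the plane is parallel to the image plane, so the normal lies along the z axis and, with the cross-product order chosen here, points back toward the camera.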

In the constraint condition method, the distances between feature points (hereinafter "inter-feature-point distances" or simply "distances") are used as constraint conditions when calculating the three-dimensional coordinates of the feature points. The inter-feature-point distances are, for example, the distance L_1 between the left pupil coordinate P_1 and the right pupil coordinate P_2, the distance L_2 between the left pupil coordinate P_1 and the inter-nostril center coordinate P_0, and the distance L_3 between the right pupil coordinate P_2 and the inter-nostril center coordinate P_0. A face posture derived by the constraint condition method has the advantage of high stability. Stability here refers to the characteristic indicated by the temporal random noise that appears when the face posture is detected with the subject A's head held fixed; high stability means small random noise. In addition, whereas the stereo method requires two optical systems (for example, cameras), the constraint condition method can be carried out with a single optical system. By placing two optical systems apart and applying the constraint condition method with each of them, the face direction detection range can therefore be extended along the direction in which the optical systems are separated. The face direction detection range obtained with two optical systems and the constraint condition method is wider than that obtained with two optical systems and the stereo method. In the following description, the result derived by the constraint condition method is simply called the "face direction" (face posture).

However, in the constraint condition method, the inter-feature-point distances change when the line of sight moves relative to the face direction. This happens, for example, when the subject A moves only the line of sight without moving the head, or moves only the head while gazing at a fixed point. FIG. 3(a) shows the plane B1 defined by the coordinates P_1, P_2, and P_0 of the three feature points, together with its normal vector (face posture vector V_F), when the face direction (the direction of the face posture vector V_B) coincides with the line of sight G. FIG. 3(b) shows the plane B2 defined by the coordinates P_1, P_2, and P_0, together with its normal vector (face posture vector V_F), when the line of sight is changed from the state of FIG. 3(a) with the face direction held fixed. Ideally, because the face direction remains fixed, the face posture vector V_F should not change either. As FIG. 3(b) shows, however, changing the line of sight changes the plane B2 defined by the three feature points; in other words, the distances L_1, L_2, and L_3 change. A discrepancy therefore arises between the actual inter-feature-point distances and those used to derive the face direction, and an error (bias error) appears in the derived face direction. That is, the constraint condition method can produce an error caused by the deviation between the face direction and the line of sight. As shown in FIG. 4, when the subject A keeps the face direction n_a pointed at the point A_1 at the center of the display device 40 and directs only the line of sight G toward the point A_2 set at the lower left of the display device 40, the derived face direction becomes the point n_e. For example, if only the line of sight is moved about ±10 degrees to the left and right with the subject A's face direction fixed, the face direction is detected with a bias of about ±5 degrees to the left and right of the true value. Because the line of sight can be detected up to about ±30 degrees to the left and right, the face direction can be detected with an even larger bias.
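The change in the inter-feature-point distances described above can be illustrated numerically. The sketch below is hypothetical: the coordinates are made-up values in millimetres, the eye rotation is approximated as a pure sideways shift of both pupils, and the function names are not from the source.

```python
import math


def feature_distances(p1, p2, p0):
    """Return (L_1, L_2, L_3): left pupil to right pupil, left pupil to
    inter-nostril center, right pupil to inter-nostril center."""
    return (math.dist(p1, p2), math.dist(p1, p0), math.dist(p2, p0))


def constraint_residuals(p1, p2, p0, calibrated):
    """Difference between the current distances and the stored constraints."""
    current = feature_distances(p1, p2, p0)
    return tuple(c - ref for c, ref in zip(current, calibrated))


# Distances stored as constraints while the gaze matches the face direction.
P1 = (-32.0, 0.0, 500.0)
P2 = (32.0, 0.0, 500.0)
P0 = (0.0, -40.0, 500.0)
calibrated = feature_distances(P1, P2, P0)

# Head fixed, gaze turned sideways: both pupils shift 3 mm (assumed value),
# while the nostrils stay where they were.
shifted = constraint_residuals((-29.0, 0.0, 500.0), (35.0, 0.0, 500.0),
                               P0, calibrated)
```

In this example L_1 is unchanged, L_2 shrinks, and L_3 grows. A solver that still enforces the stored distances would have to explain the shifted image points by rotating the estimated face plane, which is the bias error the text describes.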

In the stereo method, by contrast, the error caused by the deviation between the face direction and the line of sight is almost negligible, because the three-dimensional coordinates of the feature points are detected independently by stereo matching. A face posture vector derived from such coordinates is hardly affected even when the face direction and the line of sight deviate from each other. The reason is that the position of the pupil in the depth direction does not change much even when the eyeball rotates about ±30 degrees in the orbit. In the stereo method, however, the independent noise contained in each face image affects the three-dimensional coordinates of the feature points calculated during stereo matching, and hence the face posture vector. Specifically, noise in the face images can cause random noise in the face posture vector. This random noise can be reduced by averaging. The averaged face posture vector deviates less from the true face direction than the face posture vector derived by the constraint condition method. In the following description, the result derived by the stereo method is called the "reference face direction" (reference face posture).
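The averaging mentioned above could look like the following sketch. The text does not say how the averaging is done, so a plain mean of unit vectors followed by renormalization is assumed here; the function names and the sample vectors are hypothetical.

```python
import math


def normalize(v):
    norm = math.sqrt(v[0] ** 2 + v[1] ** 2 + v[2] ** 2)
    if norm == 0.0:
        raise ValueError("zero-length vector")
    return (v[0] / norm, v[1] / norm, v[2] / norm)


def mean_direction(unit_vectors):
    """Average several unit face posture vectors and renormalize the sum."""
    sx = sum(v[0] for v in unit_vectors)
    sy = sum(v[1] for v in unit_vectors)
    sz = sum(v[2] for v in unit_vectors)
    return normalize((sx, sy, sz))


# Two noisy stereo estimates scattered symmetrically around (0, 0, -1).
samples = [normalize((0.10, 0.02, -1.0)), normalize((-0.10, -0.02, -1.0))]
averaged = mean_direction(samples)
```

Averaging over more frames reduces random noise further but makes the reference face direction slower to follow real head movement, so the window length is a trade-off.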

The inventors found that the first angle, between the reference face direction and the line of sight, is related to the second angle, between the face direction and the reference face direction. The face detection method of this embodiment derives a highly stable and accurate face direction by correcting, in the face direction derived by the constraint condition method, the error caused by the deviation between the face direction and the line of sight. Specifically, the face direction that the stereo method would yield is estimated from the face direction derived by the constraint condition method.

The specific principle is described with reference to FIG. 5. Let G be the line of sight, H_1 the face direction detected by the constraint condition method, and H_2 the reference face direction detected by the stereo method. The angle between the line of sight G and the reference face direction H_2 is ∠G. The angle between the reference face direction H_2 and the face direction H_1 is ∠H.
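The two angles defined above can be computed from direction vectors as in the sketch below. It assumes unsigned three-dimensional angles between vectors and a hypothetical helper that builds directions from a yaw angle; the source does not specify whether the angles are signed or split into horizontal and vertical components, and a signed treatment may be needed in practice.

```python
import math


def angle_between_deg(u, v):
    """Unsigned angle in degrees between two 3-D direction vectors."""
    cx = u[1] * v[2] - u[2] * v[1]
    cy = u[2] * v[0] - u[0] * v[2]
    cz = u[0] * v[1] - u[1] * v[0]
    cross_norm = math.sqrt(cx * cx + cy * cy + cz * cz)
    dot = u[0] * v[0] + u[1] * v[1] + u[2] * v[2]
    # atan2 stays accurate for small angles, unlike acos of the dot product.
    return math.degrees(math.atan2(cross_norm, dot))


def direction_from_yaw(yaw_deg):
    """Hypothetical helper: unit vector turned yaw_deg sideways from -z."""
    a = math.radians(yaw_deg)
    return (math.sin(a), 0.0, -math.cos(a))


G = direction_from_yaw(20.0)    # line of sight
H2 = direction_from_yaw(0.0)    # reference face direction (stereo method)
H1 = direction_from_yaw(8.0)    # face direction (constraint condition method)

angle_G = angle_between_deg(G, H2)   # approximately 20 degrees
angle_H = angle_between_deg(H2, H1)  # approximately 8 degrees
```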

As shown in FIG. 6(a), when the reference face direction H₂ coincides with the line of sight G, the face direction H₁ coincides with the reference face direction H₂. On the other hand, as shown in FIG. 6(b), when the reference face direction H₂ does not coincide with the line of sight G, the face direction H₁ is biased with respect to the reference face direction H₂. Here, a new angle ∠G′ is defined. The angle ∠G′ is given by equation (1); in other words, the angle ∠G′ is the angle between the face direction H₁ and the line of sight G. It is then assumed that a linear relationship, given by equation (2), holds between the angle ∠G′ and the angle ∠H. The coefficients k₁ and k₂ in equation (2) define the relationship between the angle ∠G′ (first angle) between the subject's face direction H₁ and the line of sight G and the angle ∠H (second angle) between the reference face direction H₂ and the face direction H₁.

First, data consisting of pairs of the angle ∠G′ and the angle ∠H are collected while the subject turns the face direction (head direction) and the line of sight G in various directions. As shown in FIG. 16, these data are plotted in two-dimensional coordinates whose horizontal axis is the angle ∠G′ and whose vertical axis is the angle ∠H. Since the angle ∠G′ and the angle ∠H satisfy the relationship of equation (2), the coefficients k₁ and k₂ can be obtained by fitting an approximate expression, for example by the least squares method. The function determined by the coefficients k₁ and k₂ then makes it possible to calculate the angle ∠H from the angle ∠G′. Applying the angle ∠H and the face direction H₁ to the definition of the angle ∠H yields an estimate of the reference face direction H₂ that is close to the true value. When the angle ∠H becomes large, it is desirable to take nonlinear components into account; in that case, the nonlinear relation given by equation (3) may be used.
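The fitting procedure can be sketched as follows. Equations (1) to (3) are not reproduced in this text, so the linear form ∠H = k₁·∠G′ + k₂ and the per-axis sign conventions (∠G′ = G − H₁ and ∠H = H₂ − H₁, all in degrees) are assumptions made for illustration, as are the function names.

```python
def fit_linear(g_prime, h):
    """Least-squares fit of h = k1 * g_prime + k2 (assumed form of equation (2))."""
    n = len(g_prime)
    if n < 2 or n != len(h):
        raise ValueError("need two or more paired samples")
    mean_x = sum(g_prime) / n
    mean_y = sum(h) / n
    sxx = sum((x - mean_x) ** 2 for x in g_prime)
    if sxx == 0.0:
        raise ValueError("all angle_G' samples are identical")
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(g_prime, h))
    k1 = sxy / sxx
    k2 = mean_y - k1 * mean_x
    return k1, k2


def estimate_reference_angle(h1_deg, g_deg, k1, k2):
    """Estimate the per-axis reference face angle H2 from H1 and the gaze G.

    Assumed conventions: angle_G' = G - H1 and angle_H = H2 - H1.
    """
    angle_g_prime = g_deg - h1_deg
    angle_h = k1 * angle_g_prime + k2
    return h1_deg + angle_h


# Hypothetical calibration samples (degrees).
k1, k2 = fit_linear([-10.0, 0.0, 10.0], [-2.5, 0.5, 3.5])
h2_estimate = estimate_reference_angle(5.0, 15.0, k1, k2)
```

A nonlinear relation such as the one the text attributes to equation (3) would need a different model; this sketch covers only the linear case.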

Equations (1) to (3) can be evaluated independently for the X axis and the Y axis. The X axis and Y axis here are axes of the face coordinate system; that is, equations (1) to (3) hold in the face coordinate system. To detect the face direction H₁, the line of sight G and the face direction H₁ are first turned in various directions and measured. This measurement is made with reference to the world coordinate system (first coordinate system). Next, the line of sight G and the face direction H₁ expressed in the world coordinate system are converted into the face coordinate system (second coordinate system). The corrected face direction H₁ is then calculated in the face coordinate system using relational expressions (1) to (3). Finally, the corrected face direction H₁ in the face coordinate system is converted back into the world coordinate system. Through this processing, an accurate face direction H₁ is obtained. The Z axis in FIG. 2 is the face direction, the X axis is a horizontal axis orthogonal to the Z axis, and the Y axis is a vertical axis orthogonal to the Z axis. For example, movement of the face direction along the X axis means that the head rotates about the Y axis.
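The conversion between the two coordinate systems can be sketched as a rotation. The representation used here, a 3×3 matrix `R` whose rows are the face-coordinate X, Y, and Z axes expressed in world coordinates, is an assumption; the text does not specify how the face coordinate system is parameterized.

```python
def world_to_face(R, v):
    """Express world-coordinate vector v in the face coordinate system.

    R is a 3x3 rotation matrix (list of rows); each row is one face axis
    written in world coordinates (assumed representation).
    """
    return tuple(sum(R[i][j] * v[j] for j in range(3)) for i in range(3))


def face_to_world(R, v):
    """Inverse conversion; for a rotation matrix the inverse is the transpose."""
    return tuple(sum(R[j][i] * v[j] for j in range(3)) for i in range(3))


# Hypothetical face frame: the head rolled by 90 degrees about the Z axis.
R = [
    [0.0, 1.0, 0.0],   # face X axis in world coordinates
    [-1.0, 0.0, 0.0],  # face Y axis in world coordinates
    [0.0, 0.0, 1.0],   # face Z axis (face direction)
]
v_world = (1.0, 2.0, 3.0)
v_face = world_to_face(R, v_world)
v_back = face_to_world(R, v_face)
```

In the flow described above, the per-axis correction would be applied to the vector while it is expressed in face coordinates, between the two conversions.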

Correction after conversion into the face coordinate system is particularly effective when the head is tilted to the left or right. This is because, even if the subject moves the line of sight left and right in the world coordinate system, the movement in the face coordinate system is not a pure eyeball rotation about the Y axis but also contains a component of eyeball rotation about the X axis; the line of sight moves obliquely with respect to the skull. In addition, in a human facing forward, the midpoint between the nostrils protrudes forward relative to the two pupils. The face direction defined by the face coordinate system therefore points higher than the direction the subject perceives as the front. In this case, even when the subject moves the line of sight in the X-axis direction, a Y-axis component appears in the face coordinate system.

As described above, the measured face front (face direction) is shifted considerably upward relative to the front perceived by the subject. Consequently, to direct the face direction defined by the face coordinate system at the center of the screen as shown in FIGS. 6 and 8, the subject would have to tilt the head far downward. To resolve this, the face direction (front) is corrected. Specifically, when gaze point (line of sight) calibration is performed, a visual target is presented at the center of the screen and one-point calibration is carried out. For this one-point calibration, the subject turns the front of the face toward the center of the screen beforehand. During this time, the direction vector from the face position to the center of the screen is obtained in the world coordinate system. The face position can be defined, for example, as the center of gravity or as the midpoint between the pupils. The midpoint between the pupils is suitable because the line of sight is better regarded as a straight line from the midpoint between the pupils to the center of the screen than as a separate line for each eye; this is ideal in the sense that the starting points are the same. This vector is then converted into a vector in the face coordinate system, which determines the face front vector representing the front of the face relative to the skull. Thereafter, for each frame, the posture of the face coordinate system in the world coordinate system is obtained, and from it the coordinate conversion from the face coordinate system to the world coordinate system. Converting the previously obtained face front vector from the face coordinate system to the world coordinate system with this conversion yields the face direction vector. A straight line is extended from the face position in the direction of the face direction vector, and the intersection of that line with the screen can be taken as the front position on the screen.
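The last step, finding where the face direction meets the screen, is a ray and plane intersection. The sketch below is a minimal version under assumed conventions: the screen is modeled as an infinite plane given by a point and a normal in world coordinates, and the function name and units (millimeters) are hypothetical.

```python
def front_position_on_screen(face_pos, face_dir, screen_point, screen_normal):
    """Intersect the ray from face_pos along face_dir with the screen plane.

    Returns the 3-D intersection point, or None when the ray is parallel to
    the screen or points away from it.
    """
    denom = sum(d * n for d, n in zip(face_dir, screen_normal))
    if abs(denom) < 1e-12:
        return None  # face direction is parallel to the screen plane
    offset = [p - o for p, o in zip(screen_point, face_pos)]
    t = sum(c * n for c, n in zip(offset, screen_normal)) / denom
    if t < 0.0:
        return None  # the screen lies behind the face
    return tuple(o + t * d for o, d in zip(face_pos, face_dir))


# Hypothetical setup: screen in the plane z = 0, face 750 mm in front of it.
hit = front_position_on_screen(
    face_pos=(0.0, 0.0, 750.0),
    face_dir=(0.1, 0.0, -1.0),
    screen_point=(0.0, 0.0, 0.0),
    screen_normal=(0.0, 0.0, 1.0),
)
```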

As shown in FIG. 7, the face detection method has, as its main steps, an image acquisition step S1, a preprocessing step S2, a function derivation step S3, an image acquisition step S4, a preprocessing step S5, a face posture derivation step S6, and a face posture correction step S7. In the face detection method, the function derivation step S3 is performed in advance, and the face posture derivation step S6 and the face posture correction step S7 are then performed repeatedly using the function derived in step S3 (see equation (2)). Alternatively, the function derivation step S3, the face posture derivation step S6, and the face posture correction step S7 may be repeated in this order. In that case the function is updated each time the processing is repeated, and the detection accuracy of the face direction gradually improves.

First, the image acquisition step S1 is performed. In step S1, the pupil cameras (imaging means) 10 and the nostril cameras 20 are controlled to acquire a plurality of image data (face images). The image data include bright pupil images, dark pupil images, and nostril images. Details of step S1 are described later. After step S1, the preprocessing step S2 is performed. In step S2, the face direction and line of sight of the subject A are derived using the image data (face images) acquired in step S1. The preprocessing step S2 includes a step S2a of acquiring the angle ∠G′ and the angle ∠H in the world coordinate system (first coordinate system) and a step S2b of converting the angle ∠G′ and the angle ∠H into the face coordinate system (second coordinate system). The derived face direction includes both the result derived by the stereo method and the result derived by the constraint condition method. The derived line of sight includes the result derived by the stereo method. Details of step S2 are described later. After step S2, the function derivation step S3 is performed. In step S3, a function for correcting the face direction (see equation (2)) is derived using the face direction and line of sight derived in step S2. Step S3 includes a step S3a of determining the function (equation (2)) by calculating the coefficients k₁ and k₂ of equation (2). As equation (2) shows, the function contains the coefficients k₁ and k₂; deriving the function means determining these coefficients. Details of step S3 are described later.

The image acquisition step S1, the preprocessing step S2, and the function derivation step S3 described above are executed each time the face detection device 1 is started, or at any desired timing.

Next, the image acquisition step S4 is performed to acquire image data (face images), followed by the preprocessing step S5. After step S5, the face posture derivation step S6 is performed. In step S6, the face direction is derived by the constraint condition method, using the distances between feature points as the constraint condition. These distances may be values acquired in advance, or may be computed from the three-dimensional coordinates of the feature points calculated by the stereo method in the preprocessing step S2. This can be done concurrently with line-of-sight detection (gaze point detection), during a gaze of about one second. Details of step S6 are described later.

After the face posture derivation step S6, the face posture correction step S7 is performed. In step S7, the face direction is corrected using the function and the line of sight detected in the preprocessing step S2. Details of step S7 are described later. The image acquisition step S4 is then performed again. The image acquisition step S4, the preprocessing step S5, the face posture derivation step S6, and the face posture correction step S7 are executed repeatedly in this way.
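The control flow of steps S1 to S7 can be summarized in a short sketch. Only the ordering of the steps comes from the text; the function name `run_face_detection`, the callables passed in, and their signatures are hypothetical placeholders, not the device's actual interfaces.

```python
def run_face_detection(acquire, preprocess, derive_function,
                       derive_pose, correct_pose, n_frames):
    """Run S1-S3 once, then repeat S4-S7 for n_frames frames.

    All five callables are placeholders supplied by the caller.
    """
    # Calibration phase, executed once in advance.
    calib_images = acquire()                 # S1: image acquisition
    calib_data = preprocess(calib_images)    # S2: face direction and gaze
    function = derive_function(calib_data)   # S3: determine k1, k2

    corrected = []
    for _ in range(n_frames):
        images = acquire()                   # S4: image acquisition
        features = preprocess(images)        # S5: preprocessing
        h1 = derive_pose(features)           # S6: constraint condition method
        corrected.append(correct_pose(h1, features, function))  # S7
    return corrected
```

The variant in which step S3 is repeated together with steps S6 and S7 would move the call to `derive_function` inside the loop.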

This face detection method derives the face direction H₁ by the constraint condition method, which uses the distances between the parts of the reference part group as a constraint condition. A stable result with little random noise is therefore obtained. The face detection method then corrects the face direction H₁ using the function (equation (2)). Because this function contains the coefficients k₁ and k₂, which define the relationship between the first angle, between the line of sight G of the subject A and the reference face direction H₂, and the second angle, between the face direction H₁ and the reference face direction H₂, it converts the derived face direction H₁ into a result equivalent to the reference face direction H₂. The reference face direction H₂ is derived by stereo matching and therefore has little bias with respect to the true face direction. As a result, the stable, low-noise face direction H₁ is corrected into a face direction with little bias with respect to the true face direction. The accuracy of face detection can thus be improved.

In addition, the face detection method allows the face direction H₁ obtained by the constraint condition method to be corrected for every frame. The face direction obtained by this face detection method combines the high stability of the constraint condition method with the accuracy of the stereo method.

The function derivation step S3 includes a step of deriving the coefficients k₁ and k₂. Step S3 thus derives a function (equation (2) above) containing the coefficients k₁ and k₂ that define the relationship between the first angle and the second angle.

After the function derivation step S3 has been executed once, the face posture derivation step S6 and the face posture correction step S7 are executed repeatedly. With this approach, the function for correction has already been obtained when steps S6 and S7 are executed. Detection accuracy in face detection is therefore improved from immediately after steps S6 and S7 begin.

In the face detection method, the function derivation step S3, the preprocessing step S5, the face posture derivation step S6, and the face posture correction step S7 may instead be executed repeatedly in this order. In this case, the function derivation step S3 is executed every time steps S6 and S7 are executed. Through repetition of step S3, the coefficients k₁ and k₂ are updated and converge to certain values. Steps S6 and S7 can therefore be executed without preparing the function in advance.
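One way such a repeated update could be organized is with running sums, so that the coefficients are refreshed as each new sample arrives. The text does not describe the update rule, so this incremental least-squares scheme, the assumed linear form ∠H = k₁·∠G′ + k₂, and the class name `OnlineLinearFit` are all illustrative assumptions. The sketch also presumes that a pair (∠G′, ∠H) is available for each frame.

```python
class OnlineLinearFit:
    """Incrementally fit h = k1 * g + k2 from a stream of (g, h) samples."""

    def __init__(self):
        self.n = 0
        self.sum_x = 0.0
        self.sum_y = 0.0
        self.sum_xx = 0.0
        self.sum_xy = 0.0

    def update(self, g_prime, h):
        """Add one (angle_G', angle_H) sample to the running sums."""
        self.n += 1
        self.sum_x += g_prime
        self.sum_y += h
        self.sum_xx += g_prime * g_prime
        self.sum_xy += g_prime * h

    def coefficients(self):
        """Return (k1, k2), or None while the fit is still underdetermined."""
        denom = self.n * self.sum_xx - self.sum_x * self.sum_x
        if self.n < 2 or denom == 0.0:
            return None
        k1 = (self.n * self.sum_xy - self.sum_x * self.sum_y) / denom
        k2 = (self.sum_y - k1 * self.sum_x) / self.n
        return k1, k2
```

Until at least two distinct samples have been seen, `coefficients()` returns None, which matches the idea that the correction becomes usable only after a few repetitions.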

In the face detection method, the angle ∠G′ and the angle ∠H acquired with reference to the world coordinate system are converted into the face coordinate system, the function is then determined, and the obtained function is coordinate-converted into the world coordinate system. This method increases the detection accuracy of the face posture.

<Example>
The correction effect on the face direction in the face detection method was verified. First, a stand for fixing the chin of the subject A was placed 75 cm from the center of the display device 40. The subject rested the chin on this stand and faced the display device 40 squarely. In this state, the face direction of the subject A points toward the center of the display device 40. One-point calibration of the gaze point was performed first. The subject A then directed the line of sight in turn at nine visual targets T_G displayed on the display device 40, gazing at each target T_G for approximately one second. FIG. 8(a) shows the result without the correction of the face detection method, and FIG. 8(b) shows the result with the correction. The dashed circles C indicate the face directions detected by the face detection device 1. As shown in FIG. 8(a), without correction the derived face direction varied within the region D1 indicated by the two-dot chain line, even though the actual face direction was fixed. Specifically, the extent of the region D1 was about 40 mm to 50 mm. With correction, as shown in FIG. 8(b), the region D2 indicated by the two-dot chain line shrank. Specifically, the extent of the region D2 was 10 mm or less. It was thus confirmed that the face detection method reduces the error in the face direction.
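To relate these on-screen extents to angles, the following sketch converts a displacement on the screen into an angle as seen from the viewing distance. It is a rough illustration only: it treats the 75 cm stand distance as the distance from the face to the screen and the extent as an offset from the screen center, neither of which the text states exactly. The function name is hypothetical.

```python
import math


def screen_offset_to_degrees(offset_mm, distance_mm):
    """Angle in degrees subtended by an on-screen offset at a given distance."""
    return math.degrees(math.atan2(offset_mm, distance_mm))


# Extents reported in the example, viewed from an assumed 750 mm.
uncorrected_deg = screen_offset_to_degrees(45.0, 750.0)  # middle of 40-50 mm
corrected_deg = screen_offset_to_degrees(10.0, 750.0)
```

Under these assumptions the uncorrected spread corresponds to a few degrees and the corrected spread to under one degree.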

Hereinafter, a specific form of the face detection method according to the present embodiment, and the face detection device 1 and the face detection program for carrying out the face detection method, are described in detail.

<Face posture detection device>
As shown in FIG. 1, the face detection device 1 includes a pair of pupil cameras (imaging means) 10 that function as a stereo camera, a pair of nostril cameras 20, and an image processing device (processing means) 30. In the following, where necessary, the pair of pupil cameras 10 is distinguished as a left pupil camera 10_L on the left side of the subject A and a right pupil camera 10_R on the right side of the subject A. Likewise, the pair of nostril cameras 20 is distinguished as a left nostril camera 20_L on the left side of the subject A and a right nostril camera 20_R on the right side of the subject A. In the present embodiment, the face detection device 1 further includes a display device 40 that the subject A looks at. However, since, as noted above, the intended use of the face detection device 1 is not limited, what lies in the line of sight of the subject A is not limited to the display device 40 and may be, for example, the windshield of an automobile. The display device 40 is therefore not an essential element of the face detection device 1. All four cameras 10 and 20 are connected to the image processing device 30 wirelessly or by wire, and various data and commands are exchanged between each of the cameras 10 and 20 and the image processing device 30. Camera calibration is performed on each of the cameras 10 and 20 in advance.

Both the pupil cameras 10 and the nostril cameras 20 are devices that image the face of the subject A. The pupil camera 10 is used in particular to photograph the pupils of the subject A and their surroundings. The nostril camera 20 is used in particular to photograph the pupils and nostrils of the subject A and their surroundings. The pupil camera 10 is a pupil optical system, and the nostril camera 20 is a nostril optical system. In this specification, an image obtained by the pupil camera 10 is called a pupil image (bright pupil image or dark pupil image), and an image obtained by the nostril camera 20 is called a nostril image.

The pupil cameras 10 and the nostril cameras 20 are placed lower than the face of the subject A in order to prevent reflected light from appearing in the face image when the subject A wears glasses. The pair of pupil cameras 10 is arranged at a predetermined interval along the horizontal direction. The pair of nostril cameras 20 is arranged lower than the pair of pupil cameras 10, also at a predetermined interval along the horizontal direction. The nostril cameras 20 are placed below the pupil cameras 10 so that the nostrils can be detected even when the subject turns the face downward. The elevation angles of the pupil cameras 10 and the nostril cameras 20 with respect to the horizontal are set, for example, in the range of 20 to 35 degrees, taking into account both reliable detection of the pupils and avoidance of obstructing the field of view of the subject A. Alternatively, the elevation angle of the pupil cameras 10 may be set in the range of 20 to 30 degrees and that of the nostril cameras 20 in the range of about 25 to 35 degrees.

In the present embodiment, the pupil cameras 10 and the nostril cameras 20 are NTSC cameras, NTSC being one of the interlaced scanning systems. In the NTSC system, each frame of image data (face image), of which 30 are obtained per second, consists of an odd field made up of the odd-numbered horizontal pixel lines and an even field made up of the even-numbered horizontal pixel lines, and is generated by capturing the odd-field image and the even-field image alternately at intervals of 1/60 second. One frame therefore corresponds to a pair of one odd field and one even field. Each of the pupil cameras 10 and the nostril cameras 20 images the subject A in response to a command from the image processing device 30 and outputs the image data (face image) to the image processing device 30.
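The relationship between a frame and its two fields can be shown with a small sketch. It assumes that a frame is held as a list of horizontal pixel lines and that lines are numbered from 1, so that the first row belongs to the odd field; the function name is hypothetical.

```python
def split_fields(frame):
    """Split an interlaced frame into its odd and even fields.

    `frame` is a list of rows (horizontal pixel lines). Lines are counted
    from 1, so rows at list indices 0, 2, 4, ... form the odd field.
    """
    odd_field = frame[0::2]   # lines 1, 3, 5, ...
    even_field = frame[1::2]  # lines 2, 4, 6, ...
    return odd_field, even_field


# A 480-line frame yields two 240-line fields.
frame = [[line_no] * 4 for line_no in range(1, 481)]
odd, even = split_fields(frame)
```

Because the two fields are captured 1/60 second apart, processing them separately gives two observations per frame at half the vertical resolution.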

The pupil cameras 10 and the nostril cameras 20 each include a light source. As shown in FIG. 9, in each of the pupil camera 10 and the nostril camera 20, an objective lens 11 is housed in a circular opening 12, and a light source 13 is provided outside the opening 12. The light source 13 is a device for irradiating the face of the subject A with illumination light and consists of a plurality of light emitting elements 13a and a plurality of light emitting elements 13b. The light emitting elements 13a are semiconductor light emitting elements (LEDs) whose output light has a center wavelength of 850 nm, arranged in a ring at equal intervals along the edge of the opening 12. The light emitting elements 13b are semiconductor light emitting elements whose output light has a center wavelength of 940 nm, arranged in a ring at equal intervals outside the light emitting elements 13a. The distance from the optical axis of the pupil camera 10 to the light emitting elements 13b is therefore larger than the distance from the optical axis to the light emitting elements 13a. Each of the light emitting elements 13a and 13b is mounted so as to emit illumination light along the optical axis of the pupil camera 10. The arrangement of the light source 13 is not limited to the configuration shown in FIG. 2; other arrangements may be used as long as the camera can be regarded as a pinhole model. The nostril camera 20 need not include the light source 13. In that case, the nostril camera 20 photographs the face of the subject A illuminated by the light source 13 of the pupil camera 10; that is, the nostril camera 20 captures images using light from the light source 13.

Since the nostrils are larger than the corneal reflection described later, they can be detected even if a camera with lower resolution than the pupil camera 10 is used as the nostril camera 20. That is, the resolution of the nostril camera 20 may be lower than that of the pupil camera 10. For example, the resolution of the pupil camera 10 may be 640 × 480 pixels while that of the nostril camera 20 is 320 × 240 pixels.

The image processing device 30 is a computer that controls the pupil cameras 10 and the nostril cameras 20 and calculates (detects) the line of sight and face direction of the subject A. The image processing device 30 may be built from a stationary or portable personal computer (PC), from a workstation, or from another type of computer. Alternatively, the image processing device 30 may be built by combining a plurality of computers of any type. When a plurality of computers is used, they are connected via a communication network such as the Internet or an intranet.

As shown in FIG. 10, the image processing device 30 includes a CPU (processor) 101, a main storage unit 102, an auxiliary storage unit 103, a communication control unit 104, an input device 105, and an output device 106. The CPU 101 executes an operating system, application programs, and the like. The main storage unit 102 consists of ROM and RAM. The auxiliary storage unit 103 consists of a hard disk, flash memory, or the like. The communication control unit 104 consists of a network card or a wireless communication module. The input device 105 includes a keyboard, a mouse, and the like. The output device 106 includes a display, a printer, and the like.

Each functional element of the image processing device 30 described later is realized by loading predetermined software onto the CPU 101 or the main storage unit 102, operating the communication control unit 104, the input device 105, the output device 106, and so on under the control of the CPU 101, and reading and writing data in the main storage unit 102 or the auxiliary storage unit 103. Data and databases required for processing are stored in the main storage unit 102 or the auxiliary storage unit 103.

As shown in FIG. 11, the image processing device 30 includes, as functional components, an image acquisition unit 31, a preprocessing unit 32, a function derivation unit 33, a face posture derivation unit 34, and a face posture correction unit 36. The image acquisition unit 31 is a functional element that acquires image data (face images) from the pupil cameras 10 and the nostril cameras 20 by controlling the imaging timing of the pupil cameras 10 and the nostril cameras 20 and the emission timing of the light sources 13 of the pupil cameras 10. The preprocessing unit 32 is a functional element that calculates the face posture vector and the line of sight from the image data (face images). The preprocessing unit 32 has an angle acquisition unit 32a and a coordinate conversion unit (first coordinate conversion unit) 32b. The angle acquisition unit 32a acquires the angle ∠G′ and the angle ∠H in the world coordinate system. The coordinate conversion unit 32b converts the angle ∠G′ and the angle ∠H acquired in the world coordinate system into the face coordinate system, which differs from the world coordinate system. The function derivation unit 33 is a functional element that derives the function for correcting the face direction on the basis of the three-dimensional coordinates of the feature points. The function derivation unit 33 has a function determination unit 33a, which calculates the coefficients k₁ and k₂ of equation (2) above. The face posture derivation unit 34 is a functional element that derives the face direction using the constraint condition method. The face posture correction unit 36 is a functional element that corrects the face direction derived by the face posture derivation unit 34. The face posture correction unit 36 has a first coordinate conversion unit 36a, an angle acquisition unit 36b, a direction correction unit 36c, and a second coordinate conversion unit 36d. The output destination of the corrected face direction is not limited in any way. For example, the image processing device 30 may display the result on a monitor as an image, a graphic, or text, may store it in a storage device such as a memory or a database, or may transmit it to another computer system via a communication network.

The face detection apparatus 1 obtains the same effects as the face detection method described above. That is, the face detection apparatus 1 derives the face direction H_1 by the constraint condition method, which uses the distances between parts in the reference part group as constraint conditions. A stable result with little random noise is therefore obtained. The face detection apparatus 1 then corrects the face direction H_1 using the function (Equation (2)). This function includes the coefficient k_1, which defines the relationship between a first angle, formed by the line of sight G of the subject A and the reference face direction H_2, and a second angle, formed by the face direction H_1 and the reference face direction H_2. The function therefore converts the derived face direction H_1 into a result corresponding to the reference face direction H_2. Because the reference face direction H_2 is derived by stereo matching, its bias with respect to the true face direction is small. As a result, the stable face direction H_1, which has little random noise, is corrected to a face direction with small bias with respect to the true face direction. The accuracy of face detection can therefore be improved.

In addition, because the face detection apparatus 1 has two or more nostril cameras 20 arranged apart from each other, the detection range of the face direction H_1 can be widened. For example, the face detection apparatus 1 can detect the face direction H_1 of the subject A within a range of ±30 to ±40 degrees in each direction, taking the nostril camera 20 as the front. Suppose that the optical systems for detecting the reference face direction H_2 (the right pupil camera 10_R and the left pupil camera 10_L in FIG. 1) are, for example, about 30 degrees apart. In this case, the left pupil camera 10_L, located 15 degrees to the left as seen from the subject A, cannot detect the face direction when the subject A gazes at a position more than 30 degrees to the right (detection is possible within ±45 degrees to the left and right of the optical system). In such a case, the constraint condition method, which can derive the face direction from a single piece of image data (face image), is advantageous.

Specific aspects of the image acquisition step S1, preprocessing step S2, function derivation step S3, image acquisition step S4, preprocessing step S5, face posture derivation step S6, and face posture correction step S7 of the face detection method are described in detail below.

<Image acquisition step S1>
Light that enters the eye is diffusely reflected by the retina, and the portion of the reflected light that passes back through the pupil returns toward the light source with strong directivity. If the camera is exposed while a light source near the camera aperture emits light, part of the light reflected by the retina enters the aperture, so an image is obtained in which the pupil appears brighter than its surroundings. This image is the bright pupil image. In contrast, if the camera is exposed while a light source located away from the camera aperture emits light, almost none of the light returning from the eye reaches the aperture, so an image is obtained in which the pupil appears dark. This image is the dark pupil image. In addition, when the eye is illuminated with light of a wavelength with high transmittance, more light is reflected by the retina and the pupil appears bright; when the eye is illuminated with light of a wavelength with low transmittance, less light is reflected by the retina and the pupil appears dark.

(Acquisition of bright pupil image and dark pupil image)
The image acquisition unit 31 lights the light emitting element 13a in time with the odd fields of the pupil camera 10 to capture a bright pupil image, and lights the light emitting element 13b in time with the even fields of the pupil camera 10 to capture a dark pupil image. The image acquisition unit 31 slightly offsets the operation timing of the two pupil cameras 10, and the exposure time of each pupil camera 10 is set to be no longer than this offset. During the exposure time of each pupil camera 10, the image acquisition unit 31 makes the corresponding light emitting element 13a and light emitting element 13b emit light alternately, so that light from the light source 13 of one pupil camera 10 does not affect the image of the other pupil camera 10 (that is, so that crosstalk does not occur).

The image acquisition unit 31 acquires the bright pupil images and dark pupil images obtained through this series of controls. Because the acquired image data has valid pixels only in the odd field or only in the even field, the image acquisition unit 31 generates a bright pupil image or a dark pupil image by filling the pixel values of each missing line with the average luminance of the adjacent lines of valid pixels. The image acquisition unit 31 outputs the bright pupil image and the dark pupil image to the preprocessing unit 32.
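The line-filling step described above can be sketched as follows. This is only a minimal illustration, not the apparatus's actual implementation: the function name `fill_missing_lines`, the list-of-rows image representation, and the decision to copy the single available neighbour at the top or bottom border are all assumptions introduced here.

```python
def fill_missing_lines(field_rows, height, first_row):
    """Rebuild a full frame from one interlaced field (hypothetical helper).

    field_rows: the valid pixel rows of the field, each a list of luminances.
    height:     number of rows in the full frame.
    first_row:  0 if the valid rows are rows 0, 2, 4, ...; 1 for rows 1, 3, 5, ...
    """
    frame = [None] * height
    # Place the valid rows on every other line.
    for k, row in enumerate(field_rows):
        frame[first_row + 2 * k] = list(row)
    # Fill each missing row with the average of its valid neighbours.
    for y in range(height):
        if frame[y] is not None:
            continue
        above = frame[y - 1] if y > 0 else None
        below = frame[y + 1] if y + 1 < height else None
        if above is not None and below is not None:
            frame[y] = [(a + b) / 2 for a, b in zip(above, below)]
        else:
            # Border row (assumption): only one neighbour exists, so copy it.
            frame[y] = list(above if above is not None else below)
    return frame
```

For a four-row frame whose field holds rows 0 and 2, row 1 becomes the average of rows 0 and 2, and row 3 is copied from row 2.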

(Acquisition of nostril image)
The image acquisition unit 31 captures a nostril image in synchronization with the lighting of the light source 13. Either of the light emitting elements 13a and 13b may be lit. The image acquisition unit 31 outputs the nostril image to the preprocessing unit 32.

<Preprocessing step S2>
The preprocessing step S2 includes a step of detecting the pupil positions, a step of detecting the nostrils, a step of deriving a face posture vector by the stereo method, a step of deriving a face posture vector by the constraint condition method, and a step of detecting the line of sight.

(Detection of pupil position)
The preprocessing unit 32 generates a difference image by taking the difference between two images, and then calculates the positions of the left and right pupils of the subject A from the difference image. Of two consecutive fields, one is a bright pupil image and the other is a dark pupil image. If the head of the subject A does not move between the capture of the i-th field image and the capture of the (i+1)-th field image, simply taking the difference between the bright pupil image and the dark pupil image yields a difference image in which the pupil portions stand out.

Before taking the difference, the preprocessing unit 32 may align the earlier of the two consecutive field images with the later one (this process is called position correction). Specifically, the preprocessing unit 32 may apply position correction to the bright pupil image and the dark pupil image before obtaining the difference image. If the head of the subject A moves in the short time between the capture of the i-th field image and the capture of the (i+1)-th field image, the pupil positions in the two images are offset from each other, and a good difference image cannot be obtained. Position correction is therefore effective when the frame rates of the pupil camera 10 and the nostril camera 20 are not high. The present embodiment uses two kinds of position correction: position correction based on prediction of the face direction, and position correction based on the corneal reflection, which is performed afterwards.

The pupil detection method falls into the following three cases, depending on the pupil detection result in the previous field (the i-th field), that is, the previous pupil detection result.
(1) Both pupils were detected in the previous field (previous pupil detection).
(2) Only one pupil was detected in the previous field (previous pupil detection).
(3) Neither pupil was detected in the previous field (previous pupil detection).

When both pupils were detected in the previous field, the preprocessing unit 32 determines both pupils by pupil tracking and calculates the center coordinates of the left and right pupils. First, the preprocessing unit 32 converts the three-dimensional coordinates of the predicted pupil positions into two-dimensional coordinates on the imaging plane (pupil image) using Equation (6), described later. The preprocessing unit 32 also acquires the pupil image of the next field (the (i+1)-th field) from the image acquisition unit 31. Next, the preprocessing unit 32 sets, in the pupil image of the next field, a small window (for example, 70 pixels × 70 pixels) centered on the two-dimensional coordinates of the predicted pupil position. In the image of the previous field, the preprocessing unit 32 sets a small window centered on the two-dimensional coordinates that have already been detected. The preprocessing unit 32 then aligns the window of the previous field with the window of the next field and takes the difference between the bright pupil image and the dark pupil image. Next, the preprocessing unit 32 binarizes the resulting difference image with a threshold determined by the P-tile method and then performs isolated-point removal and labeling. The preprocessing unit 32 then selects pupil candidates from the labeled connected components of pixels based on shape parameters such as pupil-like area, size, area ratio, squareness, and pupil feature value. Finally, the preprocessing unit 32 determines a pair of pupil candidates that satisfies a predetermined relationship to be the left and right pupils and obtains the provisional positions of the left and right pupils in the image data. In other words, the preprocessing unit 32 projects the three-dimensional pupil coordinates predicted from the face posture onto the imaging plane using the pinhole model, performs position correction to generate a difference image, and identifies the pupils based on that difference image.
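The P-tile binarization mentioned above picks the threshold so that a fixed fraction of the pixels in the window ends up at or above it. A minimal sketch follows; the function names and the fraction `p` are hypothetical (the source gives no value), and isolated-point removal and labeling are left out.

```python
def p_tile_threshold(img, p):
    """Return a threshold t such that roughly a fraction p of the pixels
    (the brightest ones) satisfy value >= t. Ties can let slightly more through."""
    values = sorted((v for row in img for v in row), reverse=True)
    k = max(1, round(p * len(values)))  # number of pixels to keep as foreground
    return values[k - 1]


def binarize(img, t):
    """Mark pixels at or above the threshold with 1 and all others with 0."""
    return [[1 if v >= t else 0 for v in row] for row in img]
```

For a window in which the pupil is expected to cover about 20% of the pixels, `binarize(img, p_tile_threshold(img, 0.2))` keeps only the brightest fifth of the difference image.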

When only one pupil was detected in the previous field, the preprocessing unit 32 determines the detected pupil by the same pupil tracking as above and obtains its provisional pupil position. For the pupil that was not detected, the preprocessing unit 32 sets a medium window (for example, 150 pixels × 60 pixels) at a position a predetermined distance (for example, 30 pixels) away from the position of the detected pupil and generates a difference image for that medium window. The preprocessing unit 32 then selects pupil candidates from that difference image by the same procedure as above and takes the candidate with the largest area as the provisional position of the other pupil.

When neither pupil was detected in the previous field, the preprocessing unit 32 searches the entire image for the pupils. Specifically, the preprocessing unit 32 selects pupil candidates, by the same procedure as above, from the difference image obtained by taking the difference between the image of the previous field and the image of the next field. The preprocessing unit 32 then determines a pair of pupil candidates that satisfies a predetermined relationship to be the left and right pupils and obtains the provisional positions of the left and right pupils in the image data.

Next, the preprocessing unit 32 determines the final pupil positions, taking the position of the corneal reflection into account. Specifically, for each of the bright pupil image and the dark pupil image, the preprocessing unit 32 sets a small window centered on the provisional pupil position, creates image data in which only the area of that small window is raised to a higher resolution, and detects the corneal reflection from that image data. Within the small window, the preprocessing unit 32 performs binarization by the P-tile method and labeling, and selects corneal reflection candidates from information such as shape and mean luminance. The preprocessing unit 32 then applies a separability filter at the center coordinates of the selected portion and obtains a feature value by multiplying the separability by the luminance. If the feature value is at or above a certain value, the preprocessing unit 32 detects the center coordinates of the small window as the provisional corneal reflection coordinates and calculates the displacement of the corneal reflection between the two small windows as the position correction amount. Next, the preprocessing unit 32 shifts the image of the previous field (the i-th field) toward the image of the next field (the (i+1)-th field) by the position correction amount, so that the corneal reflection points coincide between the bright pupil image and the dark pupil image, and then generates a difference image from the two images. If no corneal reflection is detected, the preprocessing unit 32 generates the difference image by taking the difference between the two images without position correction.
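The corneal-reflection-based position correction described above amounts to translating the earlier field so that its corneal reflection lands on that of the later field, and only then subtracting. The sketch below illustrates this under several assumptions that are not in the source: integer pixel shifts, zero fill for pixels shifted in from outside the image, clipping negative differences to zero, and the hypothetical names `shift_image` and `corrected_difference`.

```python
def shift_image(img, dx, dy, fill=0):
    """Translate img by (dx, dy) pixels; pixels with no source are set to fill."""
    h, w = len(img), len(img[0])
    out = [[fill] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            sy, sx = y - dy, x - dx
            if 0 <= sy < h and 0 <= sx < w:
                out[y][x] = img[sy][sx]
    return out


def corrected_difference(prev_img, next_img, prev_glint, next_glint, prev_is_bright):
    """Align the previous field to the next one using the corneal reflection
    positions (x, y), then return bright minus dark, clipped at zero."""
    dx = next_glint[0] - prev_glint[0]  # position correction amount in x
    dy = next_glint[1] - prev_glint[1]  # position correction amount in y
    aligned = shift_image(prev_img, dx, dy)
    bright, dark = (aligned, next_img) if prev_is_bright else (next_img, aligned)
    return [[max(b - d, 0) for b, d in zip(brow, drow)]
            for brow, drow in zip(bright, dark)]
```

When no corneal reflection is found, passing the same coordinates for both reflections gives a zero shift, which corresponds to the uncorrected difference described in the text.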

Next, the preprocessing unit 32 determines the final pupil positions from the difference image. Specifically, exploiting the fact that luminance does not change greatly from the previous frame, the preprocessing unit 32 takes the mean luminance of the pupil detected in the previous frame, binarizes the difference image using half of that mean luminance as the threshold, and performs labeling. The preprocessing unit 32 then selects pupil candidates from the labeled connected components of pixels based on shape parameters such as pupil-like area, size, area ratio, squareness, and pupil feature value. Finally, the preprocessing unit 32 judges the pupil candidate near the predicted pupil position to be the pupil being sought and calculates the center coordinates of that pupil.
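The two decisions in this final step, the threshold and the choice of candidate, can be written in a few lines. This is a sketch only: the function names are hypothetical, and "near the predicted pupil position" is interpreted here as "closest in Euclidean distance", which the source does not state.

```python
import math


def half_mean_threshold(prev_pupil_pixels):
    """Threshold = half of the mean luminance of the pupil found in the previous frame."""
    return sum(prev_pupil_pixels) / len(prev_pupil_pixels) / 2


def nearest_candidate(candidates, predicted):
    """Pick the candidate centre (x, y) closest to the predicted pupil position."""
    return min(candidates, key=lambda c: math.dist(c, predicted))
```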

(Detection of nostrils)
The preprocessing unit 32 sets a window at the position in the nostril image where the nostrils are estimated to be and detects the nostrils by processing the inside of that window. The preprocessing unit 32 may set this window based on the three-dimensional positions of the pupils. The preprocessing unit 32 detects the nostrils from the nostril image and the dark pupil image, or alternatively from the nostril image alone. The nostril detection method falls into the following three cases, depending on the nostril detection result in the previous field (the previous nostril detection result).
(1) Neither the left nor the right nostril was detected in the previous field (previous nostril detection).
(2) Both the left and right nostrils were detected in the previous field (previous nostril detection).
(3) Only one nostril was detected in the previous field (previous nostril detection).

When neither nostril was detected in the previous field, the preprocessing unit 32 sets a large window of a predetermined size in the nostril image based on the pupil positions, inverts the luminance inside the large window, binarizes it with a threshold set by the P-tile method, and then performs isolated-point removal, erosion, dilation, and labeling. Next, the preprocessing unit 32 selects nostril candidates from the labeled connected components of pixels based on nostril-like area and position within the large window. The preprocessing unit 32 then takes the nostril candidate closest to the center of the large window as the first nostril and the nostril candidate closest to the first nostril as the second nostril. Finally, based on their X coordinates, the preprocessing unit 32 identifies one of the first and second nostrils as the left nostril and the other as the right nostril, and calculates the center coordinates of each nostril.
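The selection of the first and second nostrils can be sketched as below. The function name `pick_nostrils` is hypothetical. Because the source only says that left and right are decided from the X coordinates, the sketch returns the pair ordered by increasing image X and leaves the left/right naming to the caller, since that depends on whether the image is mirrored.

```python
import math


def pick_nostrils(candidates, window_center):
    """candidates: (x, y) centres of nostril-like blobs inside the large window.

    Returns the two chosen nostrils ordered by increasing image X,
    or None if fewer than two candidates are available."""
    if len(candidates) < 2:
        return None
    # First nostril: the candidate closest to the centre of the large window.
    first = min(candidates, key=lambda c: math.dist(c, window_center))
    # Second nostril: the remaining candidate closest to the first nostril.
    others = [c for c in candidates if c is not first]
    second = min(others, key=lambda c: math.dist(c, first))
    return tuple(sorted((first, second), key=lambda c: c[0]))
```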

When both nostrils were detected in the previous field, the preprocessing unit 32 predicts the nostril positions in the field currently being processed from the nostril positions of the previous field using a Kalman filter and sets a small window centered on each predicted nostril position. The small window is smaller than the large window. Then, in the same way as for the large window, the preprocessing unit 32 calculates the center coordinates of each nostril by performing luminance inversion inside the small window, binarization by the P-tile method, isolated-point removal, erosion, dilation, labeling, selection of nostril candidates, and identification of the left and right nostrils.

When only one nostril was detected in the previous field, the preprocessing unit 32 performs nostril estimation. The preprocessing unit 32 holds in advance the coordinates of both pupils and both nostrils obtained when the subject A faces the nostril camera 20 straight on, and from these coordinates it obtains the ratio of the distance between the pupils to the distance between the nostrils. Then, on the assumption that the straight line connecting the two pupils is parallel to the straight line connecting the two nostrils, the preprocessing unit 32 estimates the coordinates of the nostril that was not detected in the previous field from the two pupil coordinates, the coordinates of the one detected nostril, and the ratio, and sets a small window, like the one above, centered on the estimated nostril coordinates. The preprocessing unit 32 then calculates the center coordinates of each nostril by performing luminance inversion inside the small window, binarization by the P-tile method, isolated-point removal, erosion, dilation, labeling, selection of nostril candidates, and identification of the left and right nostrils.
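The nostril estimation above follows directly from the parallel-line assumption: the vector between the nostrils is the vector between the pupils scaled down by the pupil-to-nostril distance ratio. A minimal sketch follows; the function and parameter names are hypothetical, and all coordinates are 2D image coordinates.

```python
def estimate_missing_nostril(right_pupil, left_pupil, detected,
                             detected_is_right, pupil_to_nostril_ratio):
    """Estimate the undetected nostril from both pupils and the detected nostril.

    pupil_to_nostril_ratio: (distance between pupils) / (distance between
    nostrils), measured beforehand with the subject facing the camera."""
    # Vector from the right pupil to the left pupil in the image.
    vx = left_pupil[0] - right_pupil[0]
    vy = left_pupil[1] - right_pupil[1]
    # Nostril spacing vector: same direction, scaled down by the ratio.
    nx = vx / pupil_to_nostril_ratio
    ny = vy / pupil_to_nostril_ratio
    # Step from right to left if the right nostril is known, otherwise back.
    sign = 1 if detected_is_right else -1
    return (detected[0] + sign * nx, detected[1] + sign * ny)
```

With pupils at (100, 100) and (160, 100), a ratio of 4, and the right nostril at (122, 150), the left nostril is estimated at (137, 150).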

(Derivation of face posture vector by stereo method)
In the present embodiment, the face posture vector derived by the stereo method is treated as the true face direction (reference face direction H_2). The stereo method restores the three-dimensional coordinates of an object from image data (face images) captured by a plurality of cameras. The object here is the pupil center. In the present embodiment, the three-dimensional coordinates of the pupil centers are determined by applying the stereo method to at least two face images obtained with the pupil cameras 10. The three-dimensional coordinates of the pupil centers obtained by the stereo method are then used to calculate the face posture vector indicating the reference face direction H_2 of the subject A. That is, the reference face direction H_2 is derived by the stereo method based on the face images of the subject A.

Specifically, as shown in FIG. 12, three coordinate systems are used to determine the three-dimensional coordinates of the pupil centers by the stereo method: the world coordinate system (X_W, Y_W, Z_W), the camera coordinate system (X_C, Y_C, Z_C), and the image coordinate system (X_I, Y_I). The world coordinate system is a coordinate system that specifies arbitrary points shared among the plurality of cameras (for example, the left pupil camera 10_L and the right pupil camera 10_R). The three-dimensional coordinates of the feature points are based on the world coordinate system X_W Y_W Z_W. The relationship between the world coordinate system X_W Y_W Z_W and the camera coordinate system X_C Y_C Z_C is given by Equation (4). The rotation matrix R and the translation vector T_R in Equation (4) are constants obtained by camera calibration. The preprocessing unit 32 calculates the positions of the pupils in the world coordinate system X_W Y_W Z_W based on Equation (4).

Next, the preprocessing unit 32 obtains the face direction based on the three-dimensional positions of the feature-point coordinates P_0, P_1, and P_2. As shown in FIG. 13, a face coordinate system xyz, referenced to the feature-point coordinates P_0, P_1, and P_2 and their centroid M, is defined relative to the camera coordinate system X_C Y_C Z_C. The x, y, and z axes are set so that the origin of the face coordinate system coincides with the centroid M, the face plane coincides with the xy plane, and the z axis coincides with the normal vector. The state in which the centroid M is located at the origin of the face coordinate system xyz and the midpoint between the nostrils lies on the y axis with a negative value is defined as the reference posture in the face coordinate system xyz.

The preprocessing unit 32 obtains the normal vector of the plane passing through the centroid M of the feature-point coordinates P_0, P_1, and P_2 calculated by the stereo method. This normal vector is the face posture vector V_B = (n_X, n_Y, n_Z) indicating the reference face direction H_2 of the subject A.
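The centroid M and the normal of the plane through the three feature points can be computed with a cross product, as in the sketch below. The helper names are hypothetical, and the sign of the normal (which side of the face plane it points to) depends on the order of the points; the order used here is an assumption, not something the source fixes.

```python
import math


def sub(a, b):
    return (a[0] - b[0], a[1] - b[1], a[2] - b[2])


def cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])


def face_posture(p0, p1, p2):
    """p0: midpoint between the nostrils, p1: right pupil, p2: left pupil (3D).

    Returns (M, n): the centroid of the three points and the unit normal
    of the plane through them."""
    m = tuple((p0[i] + p1[i] + p2[i]) / 3 for i in range(3))
    n = cross(sub(p1, p0), sub(p2, p0))
    length = math.sqrt(sum(c * c for c in n))
    if length == 0:
        raise ValueError("feature points are collinear")
    return m, tuple(c / length for c in n)
```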

(Derivation of face posture vector by constraint condition method)
The imaging optical system of the face detection apparatus 1 can be assumed to be a pinhole model with focal length f, as shown in FIG. 14. In the camera coordinate system whose origin O_R is the pinhole (the reference coordinate system X_C-Y_C-Z_C), let the two-dimensional coordinates of the center points of the right pupil, left pupil, left nostril, and right nostril on the nostril image (imaging plane PL) be Q_1(x_1, y_1), Q_2(x_2, y_2), Q_3(x_3, y_3), and Q_4(x_4, y_4), respectively. From the two-dimensional coordinates of these four points, the preprocessing unit 32 obtains the coordinates P_0 of the midpoint between the two nostrils (the inter-nostril center), the coordinates P_1 of the right pupil, and the coordinates P_2 of the left pupil, where P_n = (X_n, Y_n, Z_n) (n = 0, 1, 2).

The length of each side of the triangle connecting the three feature points (the inter-nostril center, the left pupil, and the right pupil) is expressed as the distance L_ij between points i and j, where i is any one of the three points and j is another (Equation (5)).

Once the position vector from the pinhole to each feature point is known, the two-dimensional position on the imaging plane PL corresponding to each feature point is given by Equation (6), using the focal length f of the camera.

The unit vector corresponding to the position vector from the pinhole O to each feature point is given by Equation (7).

The position vector of each feature point is expressed by Equation (8), using the constants a_n (n = 0, 1, 2).

Equation (9) then holds.

This yields the simultaneous equations (10).

The face posture deriving unit 34 solves these simultaneous equations for a_0, a_1, and a_2 and applies the solution to Equation (8) to obtain the position vectors.
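Equations (5) to (10) are not reproduced in this text. Based on the surrounding definitions (pinhole model with focal length f, feature points P_n = (X_n, Y_n, Z_n), image points (x_n, y_n), constants a_n, distances L_ij), one plausible reconstruction is given below. It is an assumption rather than the patent's exact notation: the symbol u_n for the unit vector and the sign convention of the image axes are introduced here.

```latex
% Hedged reconstruction of Equations (5)-(10); u_n and the axis signs are assumptions.
\begin{align}
L_{ij} &= \lvert P_i - P_j \rvert \tag{5} \\
(x_n,\; y_n) &= \left( f\,\frac{X_n}{Z_n},\; f\,\frac{Y_n}{Z_n} \right) \tag{6} \\
\mathbf{u}_n &= \frac{(x_n,\; y_n,\; f)}{\sqrt{x_n^2 + y_n^2 + f^2}} \tag{7} \\
P_n &= a_n\,\mathbf{u}_n \qquad (n = 0, 1, 2) \tag{8} \\
\lvert P_m - P_n \rvert^2 &= a_m^2 + a_n^2 - 2\,a_m a_n\,(\mathbf{u}_m \cdot \mathbf{u}_n) \tag{9} \\
a_m^2 + a_n^2 - 2\,a_m a_n\,(\mathbf{u}_m \cdot \mathbf{u}_n) &= L_{mn}^2,
  \qquad (m, n) \in \{(0,1),\,(1,2),\,(2,0)\} \tag{10}
\end{align}
```

In this form, (9) follows from (8) because each u_n has unit length, and (10) is obtained by equating (9) with the known squared distances from (5), giving three equations in the three unknowns a_0, a_1, and a_2.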

Next, the preprocessing unit 32 obtains the face direction based on the feature-point coordinates P_0, P_1, and P_2. As shown in FIG. 13, a face coordinate system xyz, referenced to the feature-point coordinates P_0, P_1, and P_2 and their centroid M, is defined relative to the camera coordinate system X_C Y_C Z_C. The x, y, and z axes are set so that the origin of the face coordinate system coincides with the centroid M, the face plane coincides with the xy plane, and the z axis coincides with the normal vector. The state in which the centroid M is located at the origin of the face coordinate system xyz and the midpoint between the nostrils lies on the y axis with a negative value is defined as the reference posture in the face coordinate system xyz.

前処理部３２は各特徴点の座標Ｐ_０，Ｐ_１，Ｐ_２の重心Ｍを通る撮像平面ＰＬの法線ベクトルを求める。この法線ベクトルは、対象者Ａの顔方向Ｈ_１を示す顔姿勢ベクトルＶ_Ｆ＝（ｎ_Ｘ，ｎ_Ｙ，ｎ_Ｚ）である。 The preprocessing unit 32 obtains a normal vector of the imaging plane PL that passes through the center of gravity M of the coordinates P ₀ , P ₁ , P ₂ of each feature point. This normal vector is a face posture vector V _F = (n _X , n _Y , n _Z ) indicating the face direction H ₁ of the subject A.
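As a minimal sketch of this step, the center of gravity M and a unit normal can be computed from the three feature-point coordinates with a cross product. The function name and the sample coordinates are hypothetical. Orienting the normal toward a camera at the origin is an assumption made for illustration, because the sign convention is not stated here.

```python
import numpy as np

def face_plane_normal(p0, p1, p2):
    """Return the centroid M and a unit normal of the plane through three feature points."""
    p0, p1, p2 = (np.asarray(p, dtype=float) for p in (p0, p1, p2))
    centroid = (p0 + p1 + p2) / 3.0
    n = np.cross(p1 - p0, p2 - p0)
    n /= np.linalg.norm(n)
    # Assumption: the camera sits at the origin and the normal should point toward it.
    if n @ centroid > 0:
        n = -n
    return centroid, n

# Hypothetical coordinates in mm: two pupils and a nostril on a plane facing the camera.
M, V_F = face_plane_normal([-30.0, 0.0, 600.0], [30.0, 0.0, 600.0], [0.0, -40.0, 600.0])
```

The three points must not be collinear, otherwise the cross product vanishes and the normal is undefined.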

（視線の検出）
前処理部３２は左右の瞳孔の三次元座標に基づいて視線を検出する。この瞳孔の三次元座標には、上記ステレオ法による顔姿勢ベクトルの導出と同様に、瞳孔用カメラ１０を用いて得た少なくとも２枚の顔画像にステレオ法を適用することによって得られた瞳孔中心の三次元座標を利用することができる。すなわち、視線Ｇは、対象者Ａの顔画像に基づいてステレオ法により導出される。図１５に示されるように、瞳孔の三次元位置に基づいて、瞳孔用カメラ１０の開口部１２の中心を原点Ｏ_Ｒとし、その原点Ｏ_Ｒと瞳孔中心Ｐを結ぶ基準線Ｏ_ＲＰを法線とする仮想視点平面Ｘ’−Ｙ’を考える。ここで、Ｘ’軸は、世界座標系のＸｗ−Ｚｗ平面と仮想視点平面Ｘ’−Ｙ’との交線に相当する。 (Gaze detection)
The preprocessing unit 32 detects the line of sight based on the three-dimensional coordinates of the left and right pupils. As in the derivation of the face posture vector by the stereo method described above, the three-dimensional coordinates of the pupil centers obtained by applying the stereo method to at least two face images captured with the pupil camera 10 can be used as these pupil coordinates. That is, the line of sight G is derived by the stereo method based on the face images of the subject A. As shown in FIG. 15, based on the three-dimensional position of the pupil, consider a virtual viewpoint plane X′-Y′ whose origin O_R is the center of the opening 12 of the pupil camera 10 and whose normal is the reference line O_R P connecting the origin O_R and the pupil center P. Here, the X′ axis corresponds to the line of intersection between the Xw-Zw plane of the world coordinate system and the virtual viewpoint plane X′-Y′.

前処理部３２は、画像面Ｓ_Ｇにおける角膜反射点Ｇ_Ｒから瞳孔中心Ｐまでのベクトルｒ_Ｇを算出し、そのベクトルｒ_Ｇを、距離ＯＰから求められたカメラの拡大率を用いて実寸に換算したベクトルｒに変換する。このとき、瞳孔用カメラ１０をピンホールモデルと考え、角膜反射点Ｇ_Ｒと瞳孔中心Ｐとが、仮想視点平面Ｘ’−Ｙ’と平行な平面上にあると仮定する。つまり、前処理部３２は、仮想視点平面Ｘ’−Ｙ’と平行であって瞳孔の三次元座標を含む平面上において、瞳孔中心Ｐと角膜反射点Ｇ_Ｒの相対座標をベクトルｒとして算出し、このベクトルｒは角膜反射点Ｇ_Ｒから瞳孔中心Ｐまでの実距離を表す。 The preprocessing unit 32 calculates a vector r_G from the corneal reflection point G_R to the pupil center P on the image plane S_G, and converts r_G into a vector r expressed in actual size using the camera magnification obtained from the distance OP. Here, the pupil camera 10 is treated as a pinhole model, and the corneal reflection point G_R and the pupil center P are assumed to lie on a plane parallel to the virtual viewpoint plane X′-Y′. That is, on a plane that is parallel to the virtual viewpoint plane X′-Y′ and contains the three-dimensional coordinates of the pupil, the preprocessing unit 32 calculates the relative coordinates of the pupil center P and the corneal reflection point G_R as the vector r. This vector r represents the actual distance from the corneal reflection point G_R to the pupil center P.

前処理部３２は、対象者Ａの仮想視点平面Ｘ’−Ｙ’上の注視点Ｔに関して、直線ＯＴの水平軸Ｘ’に対する傾きφが、ベクトルｒの画像面上の水平軸ＸＧに対する傾きφ’と等しいと仮定する。更に、前処理部３２は、対象者Ａの視線ベクトル、すなわち、瞳孔中心Ｐと注視点Ｔとを結ぶ視線ベクトルＰＴと、基準線ＯＰとのなす傾きθを、ゲイン値ｋ_２を含むパラメータを使った式（１１）により計算する。

For the gazing point T of the subject A on the virtual viewpoint plane X′-Y′, the preprocessing unit 32 assumes that the inclination φ of the straight line OT with respect to the horizontal axis X′ is equal to the inclination φ′ of the vector r with respect to the horizontal axis X_G on the image plane. Furthermore, the preprocessing unit 32 calculates the inclination θ between the line-of-sight vector of the subject A, that is, the line-of-sight vector PT connecting the pupil center P and the gazing point T, and the reference line OP by Expression (11), which uses parameters including a gain value k₂.

このような傾きφ，θの計算は、瞳孔中心Ｐの存在する平面上のベクトルｒを仮想視点平面Ｘ’−Ｙ’上で拡大したものがそのまま対象者Ａの注視点に対応するとみなすことにより行われる。具体的には、対象者Ａの視線ベクトルＰＴの基準線ＯＰに対する傾きθは、瞳孔中心と角膜反射の距離｜ｒ｜との間で線形関係を有すると仮定する。 The inclinations φ and θ are calculated on the assumption that the vector r on the plane containing the pupil center P, when enlarged onto the virtual viewpoint plane X′-Y′, corresponds directly to the gazing point of the subject A. Specifically, the inclination θ of the line-of-sight vector PT of the subject A with respect to the reference line OP is assumed to have a linear relationship with the distance |r| between the pupil center and the corneal reflection.

傾きと距離｜ｒ｜とは線形近似できるという仮定、及び二つの傾きφ，φ’が等しいという仮定を利用することで、（θ，φ）と（｜ｒ｜，φ’）とを一対一に対応させることができる。このとき、前処理部３２は、瞳孔用カメラ１０の開口部１２の中心に設定された原点Ｏ_Ｒと、仮想視点平面Ｘ’−Ｙ’上の注視点Ｔとを結ぶベクトルＯＴを式（１２）により得る。なお、ベクトルＯＰは瞳孔用カメラ１０から得られる。

By using the assumption that the inclination θ and the distance |r| can be linearly approximated and the assumption that the two inclinations φ and φ′ are equal, (θ, φ) and (|r|, φ′) can be put in one-to-one correspondence. The preprocessing unit 32 then obtains, by Expression (12), the vector OT connecting the origin O_R set at the center of the opening 12 of the pupil camera 10 and the gazing point T on the virtual viewpoint plane X′-Y′. The vector OP is obtained from the pupil camera 10.
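The geometry of the virtual viewpoint plane can be sketched as follows. Expression (12) is not reproduced in this text, so the code assumes that OT has length |OP|·tan θ and direction (cos φ, sin φ) in the X′-Y′ plane, which follows from OP being the normal of that plane. The handedness of the X′ and Y′ axes, the function names, and the sample values are assumptions for illustration.

```python
import numpy as np

def virtual_plane_axes(op):
    """Unit axes X', Y' of the virtual viewpoint plane whose normal is the vector OP."""
    n = np.asarray(op, dtype=float)
    n = n / np.linalg.norm(n)
    # X' lies in the virtual plane and in the world Xw-Zw plane (zero Yw component).
    # Degenerate if OP is parallel to the world Yw axis.
    x_axis = np.cross([0.0, 1.0, 0.0], n)
    x_axis /= np.linalg.norm(x_axis)
    y_axis = np.cross(n, x_axis)   # assumed handedness
    return x_axis, y_axis

def gaze_point_on_virtual_plane(op, theta, phi):
    """Vector OT from the camera origin to the gazing point T on the virtual viewpoint plane."""
    x_axis, y_axis = virtual_plane_axes(op)
    length = np.linalg.norm(op) * np.tan(theta)
    return length * (np.cos(phi) * x_axis + np.sin(phi) * y_axis)

# Hypothetical values: pupil center P in mm relative to the camera origin, angles in radians.
op = np.array([50.0, 100.0, 600.0])
theta, phi = np.radians(10.0), np.radians(30.0)
ot = gaze_point_on_virtual_plane(op, theta, phi)

# Consistency check: the angle at P between PT and PO should equal theta.
pt = ot - op
angle = np.arccos((pt @ (-op)) / (np.linalg.norm(pt) * np.linalg.norm(op)))
```

Whatever axis convention is chosen, the constructed T lies in the plane through the origin perpendicular to OP, and the angle between PT and the reference line equals θ.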

最後に、前処理部３２は視線ベクトルＰＴと視対象平面（ディスプレイ装置４０）との交点である注視点Ｑを式（１３）により得る

Finally, the preprocessing unit 32 obtains the gazing point Q, which is the intersection of the line-of-sight vector PT and the viewing target plane (display device 40), using Expression (13).
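Expression (13) is not reproduced in this text. As a hedged illustration, the intersection of the line of sight with a target plane can be computed with the standard line-plane intersection below. Describing the plane by a point and a normal, the function name, and the sample coordinates are assumptions.

```python
import numpy as np

def intersect_gaze_with_plane(p, t, plane_point, plane_normal):
    """Point Q where the line through P and T meets the plane given by a point and a normal."""
    p, t, plane_point, plane_normal = (
        np.asarray(v, dtype=float) for v in (p, t, plane_point, plane_normal)
    )
    direction = t - p                      # line-of-sight vector PT
    denom = plane_normal @ direction
    if abs(denom) < 1e-12:
        raise ValueError("line of sight is parallel to the target plane")
    s = plane_normal @ (plane_point - p) / denom
    return p + s * direction

# Hypothetical setup in mm: pupil at z = 600, display plane at z = -100.
Q = intersect_gaze_with_plane(
    p=[0.0, 0.0, 600.0],
    t=[100.0, 50.0, 0.0],
    plane_point=[0.0, 0.0, -100.0],
    plane_normal=[0.0, 0.0, 1.0],
)
```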

しかし、一般的にヒトの視軸（瞳孔中心および中心窩を通る軸）と光軸（角膜からレンズの中心へと延びる法線）との間には偏りがあり、対象者Ａがカメラを注視した際にも角膜反射と瞳孔中心とは一致しない。そこで、これを補正する原点補正ベクトルｒ_０を定義し、カメラ画像から実測した角膜反射−瞳孔中心ベクトルをｒ’とすると、ベクトルｒはｒ＝ｒ’−ｒ_０で表されるので、式（１１）は式（１４）のように書き換えられる。

However, there is generally a deviation between the human visual axis (the axis passing through the pupil center and the fovea) and the optical axis (the normal extending from the cornea to the center of the lens), so the corneal reflection and the pupil center do not coincide even when the subject A gazes at the camera. Therefore, an origin correction vector r₀ is defined to correct this. If the corneal reflection-pupil center vector measured from the camera image is r′, the vector r is expressed as r = r′ − r₀, and Expression (11) can be rewritten as Expression (14).

計測されたｒ’に対して原点補正を行うことで、（θ，φ）と（｜ｒ｜，φ’）とを一対位置に対応させることができ、精度の高い注視点検出を行うことができる。このような補正は、当業者に周知である一点較正法を用いて実現可能である。 By applying the origin correction to the measured r′, (θ, φ) and (|r|, φ′) can be put in one-to-one correspondence, and the gazing point can be detected with high accuracy. Such correction can be achieved using the one-point calibration method well known to those skilled in the art.
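The mapping from the measured vector r′ to the angles (θ, φ) can be sketched as follows. Expressions (11) and (14) are not reproduced in this text, so the code assumes the purely linear form θ = k·|r′ − r₀| with a single gain k and takes φ as the direction of the corrected vector. The function name, the gain, and the sample vectors are hypothetical.

```python
import math

def gaze_angles(r_measured, r_origin, k):
    """Return (theta, phi) in radians from the measured corneal reflection-pupil center vector.

    r_measured : measured vector r' (x, y) in actual-size units
    r_origin   : origin correction vector r_0 (x, y)
    k          : gain of the assumed linear relation theta = k * |r' - r_0|
    """
    rx = r_measured[0] - r_origin[0]
    ry = r_measured[1] - r_origin[1]
    theta = k * math.hypot(rx, ry)   # angle between line of sight and reference line OP
    phi = math.atan2(ry, rx)         # inclination, assumed equal to that of OT on the X'-Y' plane
    return theta, phi

theta, phi = gaze_angles((3.5, 4.2), (0.5, 0.2), k=0.1)
theta_zero, _ = gaze_angles((0.5, 0.2), (0.5, 0.2), k=0.1)
```

With this form, a measured vector equal to r₀ gives θ = 0, which corresponds to the subject gazing at the camera.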

＜関数導出ステップＳ３＞
関数導出部３３は、式（２）における係数ｋ_１，ｋ_２を決定する。係数ｋ_１，ｋ_２の決定方法は次の通りである。まず、対象者Ａは、顔方向と視線とを自由に動かす。この間に、視線Ｇと、顔方向Ｈ_１と、基準顔方向Ｈ_２とを取得する。そして、図１６に示されるように、角度∠Ｇ’（第１角度）と角度∠Ｈ（第２角度）を二次元座標上にプロットする。これら角度∠Ｇ’と角度∠Ｈとは、世界座標系を基準として表現された角度である。ここで、角度∠Ｇ’と角度∠Ｈとを顔座標系に座標変換する。そして、関数決定部３３ａは、顔座標系を基準とした角度∠Ｇ’と角度∠Ｈを利用して、二次元座標におけるグラフの傾き（係数ｋ_１）および切片（係数ｋ_２）を算出する（ステップＳ３ａ）。これらの算出には、最小二乗法を利用することができる。これら係数ｋ_１，ｋ_２が決定されることにより、関数（式（２））が決定される。 <Function derivation step S3>
The function deriving unit 33 determines the coefficients k₁ and k₂ in Expression (2) as follows. First, the subject A freely moves the face direction and the line of sight. During this time, the line of sight G, the face direction H₁, and the reference face direction H₂ are acquired. Then, as shown in FIG. 16, the angle ∠G′ (first angle) and the angle ∠H (second angle) are plotted on two-dimensional coordinates. These angles ∠G′ and ∠H are expressed with reference to the world coordinate system, so they are coordinate-converted into the face coordinate system. The function determination unit 33a then calculates the slope (coefficient k₁) and the intercept (coefficient k₂) of the graph in the two-dimensional coordinates using the angles ∠G′ and ∠H referenced to the face coordinate system (step S3a). The least squares method can be used for these calculations. Determining the coefficients k₁ and k₂ determines the function (Expression (2)).
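The least-squares step can be illustrated with a first-degree polynomial fit. The sketch assumes that Expression (2) is the straight line ∠H = k₁·∠G′ + k₂, which matches the description of k₁ as a slope and k₂ as an intercept. The function name and the synthetic angle samples are hypothetical.

```python
import numpy as np

def fit_correction_function(angle_g, angle_h):
    """Least-squares fit of angle_h = k1 * angle_g + k2; returns (k1, k2)."""
    k1, k2 = np.polyfit(angle_g, angle_h, deg=1)
    return k1, k2

# Synthetic samples in degrees, generated from a known line for illustration only.
angle_g = np.array([-20.0, -10.0, 0.0, 10.0, 20.0])
angle_h = 0.4 * angle_g + 1.5
k1, k2 = fit_correction_function(angle_g, angle_h)
```

With real measurements the samples scatter around the line, and the fit returns the slope and intercept that minimize the squared error in ∠H.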

＜画像取得ステップＳ４＞
画像取得ステップＳ４では、上記画像取得ステップＳ１と同様の手法により、明瞳孔画像、暗瞳孔画像及び鼻孔画像を取得する。 <Image acquisition step S4>
In the image acquisition step S4, a bright pupil image, a dark pupil image, and a nostril image are acquired by the same method as in the image acquisition step S1.

＜前処理ステップＳ５＞
前処理ステップＳ５では、上記前処理ステップＳ２と同様の手法により、拘束条件法による顔方向Ｈ_１を導出する。また、上記前処理ステップＳ２と同様の手法により、ステレオ法による視線Ｇを導出する。前処理ステップＳ５では、ステレオ法による基準顔方向Ｈ_２は必要に応じて導出すればよい。 <Preprocessing step S5>
In the preprocessing step S5, the face direction H₁ is derived by the constraint condition method in the same manner as in the preprocessing step S2. The line of sight G is also derived by the stereo method in the same manner as in the preprocessing step S2. In the preprocessing step S5, the reference face direction H₂ by the stereo method may be derived as necessary.

（拘束条件法による顔姿勢の導出：顔姿勢導出ステップＳ６）
顔姿勢導出ステップＳ６では、上記「拘束条件法による顔姿勢ベクトルの導出」と同様の手法により、顔方向Ｈ_１を導出する。 (Derivation of face posture by constraint condition method: face posture deriving step S6)
In the face posture deriving step S6, the face direction H₁ is derived by the same method as in "Derivation of face posture vector by the constraint condition method" above.

＜顔姿勢補正ステップＳ７＞
顔姿勢補正部３６は、式（２）と関数導出ステップＳ３で決定された係数ｋ_１，ｋ_２を利用して、顔方向Ｈ_１を補正する。まず、第１の座標変換部３６ａは、フレーム毎に求まる世界座標系における視線Ｇと顔方向Ｈ_１を顔座標系に変換し、顔座標系における視線Ｇと顔方向Ｈ_１を求める（第１の座標変換ステップ：Ｓ７ａ）。次に、角度取得部３６ｂは、顔座標系における視線Ｇと顔方向Ｈ_１を利用して、顔座標系における角度∠Ｇ’（第１角度）を求めた後に、角度∠Ｇ’と、すでに求まっている係数ｋ_１，ｋ_２を含む記式（２）を用いて、顔座標系における角度∠Ｈを求める（角度取得ステップ：Ｓ７ｂ）。次に、方向補正部３６ｃは、この角度∠Ｈと顔座標系における顔方向Ｈ_１から顔座標系における基準顔方向Ｈ_２を求める（方向補正ステップ：Ｓ７ｃ）。そして、第２の座標変換部３６ｄは、顔座標系における基準顔方向Ｈ_２を世界座標系における基準顔方向Ｈ_２に変換する（第２の座標変換ステップ：Ｓ７ｄ）。以上のステップＳ７ａ〜Ｓ７ｄにより、補正された顔方向Ｈ_１を得る。 <Face posture correction step S7>
The face posture correction unit 36 corrects the face direction H₁ using Expression (2) and the coefficients k₁ and k₂ determined in the function derivation step S3. First, the first coordinate conversion unit 36a converts the line of sight G and the face direction H₁ in the world coordinate system, which are obtained for each frame, into the face coordinate system to obtain the line of sight G and the face direction H₁ in the face coordinate system (first coordinate conversion step: S7a). Next, the angle acquisition unit 36b obtains the angle ∠G′ (first angle) in the face coordinate system using the line of sight G and the face direction H₁ in the face coordinate system, and then obtains the angle ∠H in the face coordinate system using the angle ∠G′ and Expression (2) with the previously determined coefficients k₁ and k₂ (angle acquisition step: S7b). Next, the direction correction unit 36c obtains the reference face direction H₂ in the face coordinate system from the angle ∠H and the face direction H₁ in the face coordinate system (direction correction step: S7c). Then, the second coordinate conversion unit 36d converts the reference face direction H₂ in the face coordinate system into the reference face direction H₂ in the world coordinate system (second coordinate conversion step: S7d). Steps S7a to S7d yield the corrected face direction H₁.
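Steps S7a to S7d can be sketched for a single (horizontal) angle component as follows. This is a simplified illustration under several assumptions: the face coordinate system is given as a rotation matrix whose third column is the face direction H₁, the first angle is measured in the face x-z plane, Expression (2) has the linear form ∠H = k₁·∠G′ + k₂ in radians, and the correction is a rotation of H₁ about the face y-axis. None of these details, nor the function name, is confirmed by the text.

```python
import numpy as np

def correct_face_direction(gaze_world, face_axes_world, k1, k2):
    """Return the corrected face direction in world coordinates (horizontal component only).

    gaze_world      : unit line-of-sight vector G in world coordinates
    face_axes_world : 3x3 rotation matrix; columns are the face x, y, z axes in world
                      coordinates, with the z column equal to the face direction H1
    k1, k2          : coefficients of the assumed linear function, k2 in radians
    """
    R = np.asarray(face_axes_world, dtype=float)
    # S7a: world coordinates -> face coordinates (H1 becomes (0, 0, 1)).
    gaze_face = R.T @ np.asarray(gaze_world, dtype=float)
    # S7b: first angle between H1 and the gaze, then second angle from the function.
    angle_g = np.arctan2(gaze_face[0], gaze_face[2])
    angle_h = k1 * angle_g + k2
    # S7c: rotate H1 by angle_h about the face y-axis (assumed sign convention).
    h2_face = np.array([np.sin(angle_h), 0.0, np.cos(angle_h)])
    # S7d: face coordinates -> world coordinates.
    return R @ h2_face

# Hypothetical example: face axes aligned with the world axes, gaze 30 degrees to the side.
gaze = np.array([np.sin(np.radians(30.0)), 0.0, np.cos(np.radians(30.0))])
h_corrected = correct_face_direction(gaze, np.eye(3), k1=0.5, k2=0.0)
h_unchanged = correct_face_direction(gaze, np.eye(3), k1=0.0, k2=0.0)
```

A full implementation would presumably treat the vertical component in the same way and build the rotation matrix from the feature points for each frame.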

［顔検出プログラム］
次に、顔検出装置１を実現するための顔検出プログラムを説明する。 [Face detection program]
Next, a face detection program for realizing the face detection apparatus 1 will be described.

図１７に示されるように、顔検出プログラムＰ１０は、メインモジュールＰ１１、画像取得モジュールＰ１２、前処理モジュールＰ１３、関数導出モジュールＰ１４、顔姿勢導出モジュールＰ１５、及び顔姿勢補正モジュールＰ１６、を備える。 As shown in FIG. 17, the face detection program P10 includes a main module P11, an image acquisition module P12, a preprocessing module P13, a function derivation module P14, a face posture derivation module P15, and a face posture correction module P16.

メインモジュールＰ１１は、顔検出機能を統括的に制御する部分である。画像取得モジュールＰ１２、前処理モジュールＰ１３、関数導出モジュールＰ１４、顔姿勢導出モジュールＰ１５、及び顔姿勢補正モジュールＰ１６を実行することにより実現される機能はそれぞれ、上記の画像取得部３１、前処理部３２、関数導出部３３、顔姿勢導出部３４、及び顔姿勢補正部３６の機能と同様である。 The main module P11 comprehensively controls the face detection function. The functions realized by executing the image acquisition module P12, the preprocessing module P13, the function derivation module P14, the face posture derivation module P15, and the face posture correction module P16 are the same as those of the image acquisition unit 31, the preprocessing unit 32, the function deriving unit 33, the face posture deriving unit 34, and the face posture correcting unit 36 described above, respectively.

顔検出プログラムＰ１０は、例えば、ＣＤ−ＲＯＭやＤＶＤ−ＲＯＭ、半導体メモリなどの有形の記録媒体に固定的に記録された上で提供されてもよい。また、顔検出プログラムＰ１０は、搬送波に重畳されたデータ信号として通信ネットワークを介して提供されてもよい。 The face detection program P10 may be provided after being fixedly recorded on a tangible recording medium such as a CD-ROM, DVD-ROM, or semiconductor memory. The face detection program P10 may be provided via a communication network as a data signal superimposed on a carrier wave.

１…顔検出装置、１０…瞳孔用カメラ（撮像手段）、２０…鼻孔用カメラ、３０…画像処理装置（処理手段）、３１…画像取得部、３２…前処理部、３３…関数導出部、３４…顔姿勢導出部、３６…顔姿勢補正部、Ｈ_１…顔方向、Ｈ_２…基準顔方向、Ｓ１，Ｓ４…画像取得ステップ、Ｓ２，Ｓ５…前処理ステップ、Ｓ３…関数導出ステップ、Ｓ６…顔姿勢導出ステップ、Ｓ７…顔姿勢補正ステップ。 DESCRIPTION OF SYMBOLS 1 ... Face detection apparatus, 10 ... Pupil camera (imaging means), 20 ... Nostril camera, 30 ... Image processing apparatus (processing means), 31 ... Image acquisition part, 32 ... Pre-processing part, 33 ... Function derivation part, 34 ... Face posture deriving unit, 36 ... Face posture correcting unit, H ₁ ... Face direction, H ₂ ... Reference face direction, S1, S4 ... Image acquisition step, S2, S5 ... Preprocessing step, S3 ... Function deriving step, S6 ... face posture deriving step, S7 ... face posture correcting step.

Claims

A face detection method comprising:
a function deriving step of deriving a function for correcting a face posture, the function including a coefficient that defines the relationship between a first angle between the face posture and the line of sight of a subject and a second angle between a reference face posture and the face posture;
a face posture deriving step of deriving the face posture; and
a face posture correcting step of correcting the face posture using the function derived in the function deriving step,
wherein the face posture is derived by calculating the normal of a plane including a reference part group, which is a combination of three parts consisting of the left pupil, the right pupil, and either one of the left and right nostrils of the subject, using the distances between the parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject, and
the line of sight and the reference face posture are derived by stereo matching based on at least two face images of the subject.

The face detection method according to claim 1, wherein the function deriving step includes:
deriving the face posture;
deriving the line of sight and the reference face posture; and
deriving the coefficient based on the first angle and the second angle.

The face detection method according to claim 1, wherein after executing the function deriving step once, the face posture deriving step and the face posture correcting step are repeatedly executed.

The face detection method according to claim 1, wherein the function deriving step, the face posture deriving step, and the face posture correcting step are repeatedly executed in this order.

The face detection method according to claim 1, wherein the face posture correcting step includes:
transforming the line of sight based on a first coordinate system and the face posture based on the first coordinate system so that they are based on a second coordinate system different from the first coordinate system;
acquiring the first angle based on the second coordinate system using the line of sight and the face posture based on the second coordinate system, and acquiring the second angle based on the second coordinate system using the first angle and the function;
correcting the face posture based on the second coordinate system using the second angle based on the second coordinate system; and
transforming the corrected face posture so that it is based on the first coordinate system.

A face detection device comprising:
at least two imaging means for imaging the face of a subject; and
processing means for deriving the face posture of the subject based on the face images captured by the imaging means,
wherein the processing means includes:
a function deriving unit that derives a function for correcting the face posture, the function including a coefficient that defines the relationship between a first angle between the face posture and the line of sight of the subject and a second angle between a reference face posture and the face posture;
a face posture deriving unit that derives the face posture; and
a face posture correcting unit that corrects the face posture using the function derived by the function deriving unit,
the face posture is derived by calculating the normal of a plane including a reference part group, which is a combination of three parts consisting of the left pupil, the right pupil, and either one of the left and right nostrils of the subject, using the distances between the parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject, and
the line of sight and the reference face posture are derived by stereo matching based on at least two face images of the subject.

The face detection device according to claim 6, wherein the face posture correcting unit includes:
a first coordinate conversion unit that transforms the line of sight based on a first coordinate system and the face posture based on the first coordinate system so that they are based on a second coordinate system different from the first coordinate system;
an angle acquisition unit that acquires the first angle based on the second coordinate system using the line of sight and the face posture based on the second coordinate system, and acquires the second angle based on the second coordinate system using the first angle and the function;
a direction correction unit that corrects the face posture based on the second coordinate system using the second angle based on the second coordinate system; and
a second coordinate conversion unit that transforms the corrected face posture so that it is based on the first coordinate system.

A face detection program causing a computer to function as:
a function deriving unit that derives a function for correcting a face posture, the function including a coefficient that defines the relationship between a first angle between the face posture and the line of sight of a subject and a second angle between a reference face posture and the face posture;
a face posture deriving unit that derives the face posture; and
a face posture correcting unit that corrects the face posture using the function derived by the function deriving unit,
wherein the face posture is derived by calculating the normal of a plane including a reference part group, which is a combination of three parts consisting of the left pupil, the right pupil, and either one of the left and right nostrils of the subject, using the distances between the parts in the reference part group and the two-dimensional positions of the reference part group detected in a face image of the subject, and
the line of sight and the reference face posture are derived by stereo matching based on at least two face images of the subject.