JP2023004678A

JP2023004678A - Processing device and control method therefor

Info

Publication number: JP2023004678A
Application number: JP2021106530A
Authority: JP
Inventors: 俊彦杉本; Toshihiko Sugimoto
Original assignee: Canon Inc
Current assignee: Canon Inc
Priority date: 2021-06-28
Filing date: 2021-06-28
Publication date: 2023-01-17

Abstract

PROBLEM TO BE SOLVED: To achieve both stability and quick responsiveness in line-of-sight detection. SOLUTION: A system control unit 50 corrects line-of-sight information corresponding to the position of a person's line of sight by using a Kalman filter based on a low-order polynomial regression. SELECTED DRAWING: Figure 6

Description

The present invention relates to a device that acquires a position corresponding to a person's line of sight.

Methods for detecting a person's line-of-sight position are known. Patent Document 1 discloses a method of detecting a subject's line-of-sight position by illuminating the cornea and photographing the reflected image. Patent Document 1 also discloses means for predicting the current line-of-sight position with a Kalman filter from line-of-sight positions acquired at a plurality of past times.

Patent Document 1: JP 2018-197974 A

However, Patent Document 1 does not disclose in detail the Kalman filter used to predict the line-of-sight position. An object of the present invention is to derive a line-of-sight position in view of the balance between responsiveness and stability of line-of-sight detection in an imaging apparatus having means for detecting the line-of-sight position and means for displaying the line of sight.

The present invention comprises acquisition means for acquiring line-of-sight information corresponding to the position of a person's line of sight, and correction means for correcting the line-of-sight information, wherein the correction means is configured to correct the line-of-sight information by a Kalman filter using a low-order polynomial regression equation.

According to the present invention, both stability and responsiveness can be achieved in line-of-sight detection.
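The text up to this point does not give the filter equations. Purely as an illustration of the general idea, the following minimal sketch assumes a first-order polynomial (position plus velocity) motion model for a single gaze coordinate. The class name `GazeKalman1D`, the sampling interval `dt`, and the noise parameters `q` and `r` are hypothetical and are not taken from the publication.

```python
from dataclasses import dataclass


@dataclass
class GazeKalman1D:
    """Kalman filter for one gaze coordinate (illustrative assumption).

    State: position p and velocity v, i.e. a first-order polynomial in time.
    """
    dt: float = 1 / 30   # assumed sampling interval [s]
    q: float = 50.0      # assumed process-noise intensity
    r: float = 4.0       # assumed measurement-noise variance
    p: float = 0.0       # position estimate
    v: float = 0.0       # velocity estimate
    P00: float = 1e3     # covariance entries (symmetric 2x2 matrix)
    P01: float = 0.0
    P11: float = 1e3

    def update(self, z: float) -> float:
        """Feed one measured gaze position z; return the corrected position."""
        dt = self.dt
        # Predict: extrapolate the first-order polynomial by one step.
        p_pred = self.p + self.v * dt
        # Covariance prediction P <- F P F^T + Q with F = [[1, dt], [0, 1]].
        P00 = self.P00 + 2 * dt * self.P01 + dt * dt * self.P11 + self.q * dt**3 / 3
        P01 = self.P01 + dt * self.P11 + self.q * dt**2 / 2
        P11 = self.P11 + self.q * dt
        # Correct: only the position is measured (H = [1, 0]).
        s = P00 + self.r
        k0, k1 = P00 / s, P01 / s
        innovation = z - p_pred
        self.p = p_pred + k0 * innovation
        self.v = self.v + k1 * innovation
        # Covariance correction P <- (I - K H) P.
        self.P00 = (1 - k0) * P00
        self.P01 = (1 - k0) * P01
        self.P11 = P11 - k1 * P01
        return self.p
```

In such a sketch, a larger `r` relative to `q` favors stability (stronger smoothing), while a larger `q` favors responsiveness. Two independent instances could be used for the horizontal and vertical coordinates.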

FIG. 1 is a block diagram showing the configuration of an imaging apparatus according to an embodiment of the present invention.
FIG. 2 is a diagram showing the correspondence between the pupil plane and the photoelectric conversion units of a pixel of the imaging apparatus according to the embodiment of the present invention.
FIG. 3 is a diagram showing the correspondence between the pupil plane and the aperture of a pixel of the imaging apparatus according to the embodiment of the present invention.
FIG. 4 is a diagram showing the configuration of a line-of-sight input operation unit according to the embodiment of the present invention.
FIG. 5 is a flowchart of the focus detection, line-of-sight detection, and shooting operations of the imaging apparatus according to the embodiment of the present invention.
FIG. 6 is a flowchart of a line-of-sight prediction method of the imaging apparatus according to the embodiment of the present invention.
FIG. 7 is a schematic diagram of the line-of-sight prediction method of the imaging apparatus according to the embodiment of the present invention.

Preferred embodiments of the present invention are described in detail below with reference to the accompanying drawings.

[Description of the configuration of the imaging apparatus]
FIG. 1 is a block diagram showing the configuration of an imaging apparatus according to an embodiment of the present invention. In FIG. 1, a lens unit 150 is a lens unit that carries an interchangeable photographing lens. The lens 103 normally consists of a plurality of lenses, but only one lens is shown here for simplicity. A communication terminal 6 is a terminal through which the lens unit 150 communicates with the digital camera 100, and a communication terminal 10 is a terminal through which the digital camera 100 communicates with the lens unit 150. The lens unit 150 communicates with the system control unit 50 via the communication terminals 6 and 10. Its internal lens system control circuit 4 controls the diaphragm 102 via the diaphragm driving circuit 2, and adjusts the focus by displacing the lens 103 via the AF driving circuit 3.

The shutter 101 is a focal plane shutter that can freely control the exposure time of the imaging unit 22 under the control of the system control unit 50. The imaging unit 22 is an image sensor, such as a CCD or CMOS device, that converts an optical image into an electrical signal. The A/D converter 23 converts the analog signal output from the imaging unit 22 into a digital signal. Signals obtained from the imaging unit 22 are used not only for imaging but also for exposure control and focus detection control. The imaging unit 22 is provided with pixels in which the photoelectric conversion unit under a single microlens is divided. Dividing the photoelectric conversion unit divides the entrance pupil, so that a phase difference detection signal can be obtained from the respective photoelectric conversion units. An imaging signal can also be obtained by adding the signals from the divided photoelectric conversion units.

Such pixels have the advantage that they can serve as both focus detection pixels and imaging pixels.

FIG. 2 shows the configuration of a pixel according to this embodiment and the correspondence between the pupil plane and the photoelectric conversion units. Reference numeral 201 denotes a photoelectric conversion unit, 253 a pupil plane, 251 a microlens, and 252 a color filter. In FIG. 2, two photoelectric conversion units 201 are provided: a photoelectric conversion unit 201a (first focus detection pixel) and a photoelectric conversion unit 201b (second focus detection pixel). Light that has passed through the pupil plane indicated by 253a enters the photoelectric conversion unit 201a, and light that has passed through the pupil plane indicated by 253b enters the photoelectric conversion unit 201b. Focus detection can therefore be performed from the signals obtained from the photoelectric conversion units 201a and 201b. In addition, an imaging signal can be generated by adding the signals obtained from the photoelectric conversion units 201a and 201b.
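The two uses of the divided photoelectric conversion units can be sketched as follows. This is only an illustration under stated assumptions: the function names `image_signal` and `phase_shift`, the sum-of-absolute-differences search, and the search range `max_shift` are hypothetical and are not specified in the publication.

```python
def image_signal(a, b):
    """Imaging signal: per-pixel sum of the outputs of units 201a and 201b."""
    return [x + y for x, y in zip(a, b)]


def phase_shift(a, b, max_shift=4):
    """Return the integer shift s that minimizes mean |a[i] - b[i + s]|.

    a, b: line signals from the first and second focus detection pixels.
    The shift between the two signals is related to the amount of defocus.
    """
    n = len(a)
    best_shift, best_cost = 0, float("inf")
    for s in range(-max_shift, max_shift + 1):
        # Compare only the overlapping part of the two signals.
        pairs = [(a[i], b[i + s]) for i in range(n) if 0 <= i + s < n]
        cost = sum(abs(x - y) for x, y in pairs) / len(pairs)
        if cost < best_cost:
            best_shift, best_cost = s, cost
    return best_shift
```

A shift of zero would correspond to the two signals coinciding. Converting a non-zero shift into a lens drive amount depends on optical parameters that are outside the scope of this sketch.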

In this embodiment, the pixels shown in FIG. 2 are provided over the entire screen area of the imaging unit 22, so that any subject appearing on the screen can be brought into focus by phase difference detection.

This embodiment is described using the focus detection method above, but the focus detection method is not limited to it. For example, the imaging unit 22 may be provided with the dedicated focus detection pixels shown in FIG. 3, described later, to perform focus detection. Alternatively, the imaging unit 22 may be provided with imaging pixels only, without focus detection pixels, and focus detection may be performed by a contrast method.

FIG. 3 shows the configuration of a dedicated focus detection pixel and the correspondence between the pupil plane and the photoelectric conversion unit. Unlike FIG. 2, FIG. 3 shows a pixel dedicated to focus detection. The shape of the pupil plane 253 is determined by the aperture 254. Because this pixel detects only light that has passed through the pupil plane 253, a paired pixel, that is, a pixel that detects light from the right-hand pupil plane (not shown in FIG. 3), must be provided separately to obtain focus detection signals. By providing the focus detection pixels shown in FIG. 3 together with imaging pixels over the entire screen area of the imaging unit 22, any subject appearing on the screen can be brought into focus by phase difference detection.

The image processing unit 24 performs resizing processing, such as predetermined pixel interpolation and reduction, and color conversion processing on data from the A/D converter 23 or data from the memory control unit 15. The image processing unit 24 also performs predetermined arithmetic processing using captured image data, and the system control unit 50 performs exposure control and distance measurement control based on the results. In this way, TTL (through-the-lens) AF (autofocus) processing, AE (automatic exposure) processing, and EF (flash pre-emission) processing are performed. The image processing unit 24 further performs predetermined arithmetic processing using the captured image data and performs TTL AWB (auto white balance) processing based on the results.

Output data from the A/D converter 23 is written into the memory 32 via the image processing unit 24 and the memory control unit 15, or directly via the memory control unit 15. The memory 32 stores image data obtained by the imaging unit 22 and converted into digital data by the A/D converter 23, as well as image data to be displayed on the display unit 28 serving as display means. The memory 32 has sufficient capacity to store a predetermined number of still images and a predetermined duration of moving images and audio.

The memory 32 also serves as an image display memory (video memory). The D/A converter 19 converts the image display data stored in the memory 32 into an analog signal and supplies it to the display unit 28. The display image data written in the memory 32 is thus displayed by the display unit 28 via the D/A converter 19. The display unit 28 presents, on a display device such as an LCD, a display corresponding to the analog signal from the D/A converter 19. Digital signals that have been A/D converted by the A/D converter 23 and stored in the memory 32 are converted to analog by the D/A converter 19 and transferred sequentially to the display unit 28 for display, so that the display unit 28 functions as an electronic viewfinder and can perform through-image display (live view display). The display unit 28 may be an electronic viewfinder viewed through an eyepiece (not shown) or a display provided on the rear surface of the digital camera 100. Both an electronic viewfinder and a rear display may also be provided.

The nonvolatile memory 56 is an electrically erasable and recordable memory; for example, an EEPROM is used. The nonvolatile memory 56 stores constants, programs, and the like for the operation of the system control unit 50. The programs referred to here are programs for executing the various flowcharts described later in this embodiment.

The system control unit 50 controls the entire digital camera 100. It implements each process of this embodiment, described later, by executing the programs recorded in the nonvolatile memory 56. Reference numeral 52 denotes a system memory, for which a RAM is used. Constants and variables for the operation of the system control unit 50, programs read from the nonvolatile memory 56, and the like are loaded into the system memory 52. The system control unit 50 also performs display control by controlling the memory 32, the D/A converter 19, the display unit 28, and the like.

The system timer 53 is a timekeeping unit that measures the time used for various kinds of control and the time of a built-in clock.

The power switch 72 is an operation member that switches the power of the digital camera 100 on and off.

The mode switch 60, the first shutter switch 62, the second shutter switch 64, and the operation unit 70 are operation means for inputting various operation instructions to the system control unit 50.

The mode switch 60 switches the operation mode of the system control unit 50 to one of a still image recording mode, a moving image shooting mode, a playback mode, and the like. The still image recording mode includes an auto shooting mode, an auto scene determination mode, a manual mode, an aperture priority mode (Av mode), and a shutter speed priority mode (Tv mode). It also includes various scene modes that provide shooting settings for each shooting scene, a program AE mode, a custom mode, and the like. The mode switch 60 can switch directly to any of these modes included in the menu button. Alternatively, after first switching to the menu button with the mode switch 60, another operation member may be used to switch to any of the modes included in the menu button. The moving image shooting mode may likewise include a plurality of modes. The first shutter switch 62 turns on partway through the operation of the shutter button 61 provided on the digital camera 100, that is, when the button is pressed halfway (shooting preparation instruction), and generates a first shutter switch signal SW1. The first shutter switch signal SW1 starts operations such as AF (autofocus) processing, AE (automatic exposure) processing, AWB (auto white balance) processing, and EF (flash pre-emission) processing.

The second shutter switch 64 turns on when the operation of the shutter button 61 is completed, that is, when the button is fully pressed (shooting instruction), and generates a second shutter switch signal SW2. In response to the second shutter switch signal SW2, the system control unit 50 starts a series of shooting operations, from reading signals out of the imaging unit 22 to writing image data to the recording medium 200.

Each operation member of the operation unit 70 is assigned an appropriate function for each scene, for example by selecting among the various function icons displayed on the display unit 28, and acts as a function button. The function buttons include, for example, an end button, a return button, an image advance button, a jump button, a narrow-down button, and an attribute change button. For example, when the menu button is pressed, a menu screen on which various settings can be made is displayed on the display unit 28. The user can make various settings intuitively using the menu screen displayed on the display unit 28, the four-direction (up, down, left, right) buttons, and the SET button.

The operation unit 70 comprises various operation members serving as an input unit that accepts operations from the user. The operation unit 70 is provided with electronic buttons, a cross key, and the like for menu selection, mode selection, playback of captured moving images, and so on.

In this embodiment, a line-of-sight input operation unit 701 is provided as one element of the operation unit 70. The line-of-sight input operation unit 701 is an operation member for detecting which part of the display unit 28 the user is looking at.

FIG. 4(a) shows an example of the line-of-sight input operation unit 701. FIG. 4(a) shows a configuration that implements the method disclosed in Patent Document 1, in which the rotation angle of the optical axis of the eyeball 501a of a user looking into the viewfinder field is detected and the user's line of sight is detected from that rotation angle. A live view image captured through the lens unit 150 is displayed on the display unit 28. Reference numeral 701a denotes an image sensor, 701b a light receiving lens, 701c a dichroic mirror, 701d an eyepiece lens, and 701e an illumination light source. The illumination light source 701e projects infrared light onto the eyeball 501a. The infrared light reflected by the eyeball 501a is reflected by the dichroic mirror 701c and captured by the image sensor 701a. The captured eyeball image is converted into a digital signal by an A/D converter (not shown) and sent to the system control unit 50. The system control unit 50, serving as line-of-sight information generation means and line-of-sight position information output means, extracts the pupil region and the like from the captured eyeball image and calculates the user's line of sight.

The line-of-sight input operation unit 701 is not limited to this method and may instead photograph both of the user's eyes to detect the line of sight. FIG. 4(b) shows an example of the line-of-sight input operation unit 701 that differs from FIG. 4(a). In FIG. 4(b), a live view image captured through the lens unit 150 is displayed on the display unit 28 provided on the rear surface of the digital camera 100. A camera 701f that photographs the face 500 of the user observing the display unit 28 is also provided on the rear surface of the digital camera 100. In FIG. 4(b), the angle of view captured by the camera 701f is indicated by dotted lines. An illumination light source 701e (not shown) projects light onto the user's face, and the camera 701f acquires an eyeball image, from which the user's line of sight is calculated. The line-of-sight input operation unit 701 is not limited to this method either, and may have any configuration that can detect which part of the display unit 28 the user is gazing at.

The power supply control unit 80 comprises a battery detection circuit, a DC-DC converter, a switch circuit that switches the blocks to be energized, and the like, and detects whether a battery is installed, the battery type, and the remaining battery level. Based on the detection results and instructions from the system control unit 50, the power supply control unit 80 also controls the DC-DC converter and supplies the necessary voltage to each unit, including the recording medium 200, for the necessary period.

The power supply unit 30 comprises a primary battery such as an alkaline or lithium battery, a secondary battery such as a NiCd, NiMH, or Li battery, an AC adapter, or the like. The recording medium I/F 18 is an interface to the recording medium 200, such as a memory card or hard disk. The recording medium 200 is a recording medium, such as a memory card, for recording captured images, and consists of a semiconductor memory, a magnetic disk, or the like.

The communication unit 54 is connected wirelessly or by a wired cable and transmits and receives video and audio signals. The communication unit 54 can also connect to a wireless LAN (Local Area Network) and the Internet. The communication unit 54 can transmit images captured by the imaging unit 22 (including through images) and images recorded on the recording medium 200, and can receive image data and various other information from external devices.

The orientation detection unit 55 detects the orientation of the digital camera 100 relative to the direction of gravity. Based on the orientation detected by the orientation detection unit 55, it can be determined whether an image captured by the imaging unit 22 was taken with the digital camera 100 held horizontally or vertically. The system control unit 50 can add orientation information corresponding to the detected orientation to the image file of an image captured by the imaging unit 22, or can rotate the image before recording it. An acceleration sensor, a gyro sensor, or the like can be used as the orientation detection unit 55.

The digital camera 100 described above can shoot using center single-point AF or face AF. Center single-point AF performs AF on a single point at the center of the shooting screen. Face AF performs AF on a face in the shooting screen detected by a face detection function.

The face detection function is described next. The system control unit 50 sends the image data subject to face detection to the image processing unit 24. Under the control of the system control unit 50, the image processing unit 24 applies a horizontal band-pass filter to the image data, and then applies a vertical band-pass filter to the processed image data. These horizontal and vertical band-pass filters detect edge components in the image data.
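The two filtering passes can be sketched as below. The publication does not specify the filter coefficients; the 5-tap kernel `BANDPASS` (zero gain at DC and at the Nyquist frequency), the zero padding at the image border, and all function names are assumptions made for illustration.

```python
BANDPASS = (-1, 0, 2, 0, -1)  # assumed kernel; not taken from the publication


def filter_rows(img, kernel=BANDPASS):
    """Apply a 1-D kernel along each row (horizontal direction), zero-padded."""
    half = len(kernel) // 2
    out = []
    for row in img:
        n = len(row)
        out.append([
            sum(k * row[x + j - half]
                for j, k in enumerate(kernel)
                if 0 <= x + j - half < n)
            for x in range(n)
        ])
    return out


def transpose(img):
    """Swap rows and columns of a list-of-lists image."""
    return [list(col) for col in zip(*img)]


def filter_cols(img, kernel=BANDPASS):
    """Apply the same 1-D kernel along each column (vertical direction)."""
    return transpose(filter_rows(transpose(img), kernel))


def edge_response(img):
    """Horizontal pass followed by a vertical pass on the processed data."""
    return filter_cols(filter_rows(img))
```

In a flat region away from the border the response is zero, while a step in brightness produces a non-zero response next to the step; those non-zero locations are the edge components passed to the later pattern matching.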

The system control unit 50 then performs pattern matching on the detected edge components and extracts candidate groups for eyes, nose, mouth, and ears. From the extracted eye candidates, the system control unit 50 judges those that satisfy preset conditions (for example, the distance between the two eyes and their inclination) to be eye pairs, and narrows the eye candidates down to those that form a pair. The system control unit 50 then associates the narrowed-down eye candidates with the other parts forming the corresponding face (nose, mouth, ears) and passes them through a preset non-face condition filter to detect a face. The system control unit 50 outputs the face information according to the face detection result and ends the processing. At this time, feature quantities such as the number of faces are stored in the system memory 52. The face detection function is not limited to the method above; the number, size, and parts of faces may likewise be detected by a known method using machine learning. The type of subject is also not limited to human faces; animals, vehicles, and the like may be detected.
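The narrowing of eye candidates into eye pairs can be illustrated with the short sketch below. The function name `find_eye_pairs` and the threshold values for distance and inclination are hypothetical; the publication only states that preset conditions such as distance and inclination are used.

```python
import math
from itertools import combinations


def find_eye_pairs(candidates, min_dist=20.0, max_dist=120.0, max_tilt_deg=30.0):
    """Return candidate pairs whose distance and inclination meet the conditions.

    candidates: list of (x, y) eye-candidate positions in pixels.
    All threshold values are illustrative assumptions.
    """
    pairs = []
    for (x1, y1), (x2, y2) in combinations(candidates, 2):
        dist = math.hypot(x2 - x1, y2 - y1)
        tilt = abs(math.degrees(math.atan2(y2 - y1, x2 - x1)))
        tilt = min(tilt, 180.0 - tilt)  # the order of the two eyes does not matter
        if min_dist <= dist <= max_dist and tilt <= max_tilt_deg:
            pairs.append(((x1, y1), (x2, y2)))
    return pairs
```

Candidates that appear in no accepted pair would be discarded before the remaining pairs are associated with nose, mouth, and ear candidates.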

As described above, subject information can be detected by analyzing the image data shown in live view or playback display and extracting feature quantities from it. Face information is used as the example of subject information in this embodiment, but subject information also includes various other items such as red-eye determination, eye detection, closed-eye detection, and smile detection.

Face AE, face FE, and face WB can be performed together with face AF. Face AE optimizes the exposure of the entire screen according to the brightness of the detected face. Face FE adjusts the flash output centered on the detected face. Face WB optimizes the white balance of the entire screen according to the color of the detected face.

[Problem]
An imaging apparatus that has means for detecting the line-of-sight position and means for displaying the line of sight faces the following problem. For the photographer to look at the display means and comfortably select a subject or a range-finding point by line of sight, the display of the detected line-of-sight position must be both stable and responsive. The detected line-of-sight position vibrates and fluctuates because of detection errors and involuntary fine movements of the human eye. If the detected position is displayed as, for example, a pointer on the display device at a rate of several tens of frames per second, the photographer finds the pointer unpleasant to look at, or has difficulty selecting a subject or a range-finding point. Filtering that uses a plurality of past line-of-sight positions is therefore needed to keep the line of sight stable. However, when the current or a future line-of-sight position is predicted from a plurality of previously acquired positions, the phase-lag characteristic of the filter introduces a delay and responsiveness suffers. The delay grows as more past line-of-sight position data is used. When a fast-moving subject on the display device is tracked by line of sight, a large delay makes it difficult to capture the subject. Conversely, if the filter strength is reduced to secure responsiveness, the vibration of the line of sight cannot be removed completely and stability suffers. Stability and responsiveness are thus, in principle, in a trade-off relationship.
In such an imaging apparatus, the filter strength could be adjusted to balance stability and responsiveness. However, because the accuracy of line-of-sight detection varies with the photographer and the shooting conditions, it is difficult to achieve both with an approach such as setting the filter strength to a fixed value. The present embodiment addresses this technical problem and achieves both stability and responsiveness in line-of-sight detection.

[Description of line-of-sight detection and shooting operation]
The method of detecting the line-of-sight position according to the first embodiment of the present invention is described below with reference to FIG. 5. FIG. 5 is a flowchart for explaining the focus detection, line-of-sight detection, and shooting operations of the imaging apparatus of this embodiment. The flowchart shows the operation during live view shooting, in which shooting is performed from a live view state (moving image shooting state) such as a shooting standby state, and the operation is carried out mainly by the system control unit 50.

In S1, the imaging unit 22 is driven under the control of the system control unit 50 to acquire imaging data. Because this imaging data is for detection and display rather than for recording (described later), an image smaller than the recorded image is acquired. The image acquired in S1 has sufficient resolution for focus detection, subject detection, or live view display. Since this drive operation is moving image capture for live view display, capture uses a so-called electronic shutter, with charge accumulation and readout performed for a time corresponding to the live view frame rate. The live view display lets the photographer check the shooting range and shooting conditions, and may run at, for example, 30 frames/second (capture interval 33.3 ms) or 60 frames/second (capture interval 16.6 ms).

In S2, the system control unit 50 acquires, from the imaging data obtained in S1, the focus detection data obtained from the first focus detection pixels and the second focus detection pixels included in the focus detection area. The system control unit 50 also generates an imaging signal by adding the output signals of the first and second focus detection pixels, and acquires image data obtained by applying color interpolation and other processing in the image processing unit 24. Image data and focus detection data can thus be acquired from a single capture. If the imaging pixels, the first focus detection pixels, and the second focus detection pixels are configured as separate pixels, the image data is acquired by performing interpolation or similar processing for the focus detection pixels.

In S3, the system control unit 50 uses the image processing unit 24 to generate an image for live view display from the image data obtained in S2, and displays it on the display unit 28. The live view image is, for example, an image reduced to match the resolution of the display unit 28. The reduction may instead be performed by the image processing unit 24 when the image data is generated in S2, in which case the system control unit 50 displays the image data acquired in S2 on the display unit 28. As described above, capture and display are performed at a predetermined frame rate during live view display, so the photographer can adjust the composition and exposure conditions through the display unit 28. Also as described above, this embodiment can detect a person's face, an animal, or the like as a subject. In S3, a frame or the like indicating the area of the detected subject is also displayed when live view display starts.

In S4, the system control unit 50 starts line-of-sight detection and focus detection. From S4 onward, the line-of-sight input operation unit 701 acquires, at predetermined time intervals, the position on the display unit 28 that the photographer is observing (the line-of-sight position), in association with the display image the photographer was observing. The detected line-of-sight position is also displayed on the display unit 28 to notify the photographer.

In S5, the system control unit 50 detects the on/off state of the first shutter switch 62 (Sw1), which indicates the start of shooting preparation. The shutter button 61, which is one of the operation units 70, can detect two stages of on/off according to how far it is pressed, and the on/off of Sw1 corresponds to the first stage of the release (shooting trigger) switch.

If Sw1 is not detected as on (or is detected as off) in S5, the system control unit 50 advances the process to S11 and determines whether the main switch included in the operation unit 70 has been turned off. If Sw1 is detected as on in S5, the system control unit 50 advances the process to S6, sets the focus detection area to be brought into focus, and performs focus detection. Here, the focus detection area is set using both the line-of-sight position whose detection started in S4 and the subject detection position inside the imaging apparatus. The line-of-sight position detected in S4 contains errors, arising from various factors, relative to the position of the subject the photographer intends. In addition, although it varies between individuals, there is a delay of a few tenths of a second between a person seeing something and the eye starting to move. In the present invention, the reliability of the detected line-of-sight information is evaluated according to the shooting conditions and the information is processed accordingly, which makes it possible to obtain more accurate line-of-sight position information even with this delay. Details are described later. In S6, the focus detection area is set using both the line-of-sight position information processed as described later and the subject detection position inside the imaging apparatus. From S6 onward, the setting of the focus detection area using the line-of-sight position information and the focus detection processing are repeated each time an image is captured.

Using the focus detection data corresponding to the set focus detection area, the defocus amount and direction are obtained for each focus detection area. In this embodiment, the system control unit 50 generates the image signals for focus detection, calculates the shift amount (phase difference) between the focus detection signals, and obtains the defocus amount and direction from the calculated shift amount.

Shading correction and filter processing are applied to the first focus detection signal and the second focus detection signal obtained from the set focus detection area as image signals for focus detection. This reduces the light-amount difference between the pair of signals and extracts the signal components at the spatial frequencies used for phase-difference detection. Next, shift processing is performed that shifts the filtered first and second focus detection signals relative to each other in the pupil division direction, and a correlation amount representing the degree of agreement between the signals is calculated.

Let A(k) be the k-th filtered first focus detection signal, B(k) the k-th filtered second focus detection signal, and W the range of the index k corresponding to the focus detection area. With s1 denoting the shift amount of the shift processing and Γ1 the shift range of s1, the correlation amount COR is calculated by Equation (1):

$$\mathrm{COR}(s_1) = \sum_{k \in W} \left| A(k) - B(k - s_1) \right|, \qquad s_1 \in \Gamma_1 \tag{1}$$

In the shift processing with shift amount s1, the k-th first focus detection signal A(k) is paired with the (k − s1)-th second focus detection signal B(k − s1) and the two are subtracted to generate a shift-subtraction signal. The absolute value of the shift-subtraction signal is taken and summed over the index k within the range W corresponding to the focus detection area to give the correlation amount COR(s1). If necessary, the correlation amounts calculated for individual rows may be added over a plurality of rows for each shift amount.
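
For illustration only, the correlation calculation described above may be sketched in Python as follows. The function name `correlation`, the list-based signals, and the handling of indices that fall outside B after shifting (they are skipped) are assumptions made for this sketch and are not taken from the embodiment.

```python
def correlation(a, b, window, shifts):
    """Return {s1: COR(s1)}, where COR(s1) = sum over k in W of |A(k) - B(k - s1)|."""
    cor = {}
    for s1 in shifts:
        total = 0.0
        for k in window:
            j = k - s1
            if 0 <= j < len(b):  # assumption: samples shifted out of range are skipped
                total += abs(a[k] - b[j])
        cor[s1] = total
    return cor


# Hypothetical pair of signals: b has the same profile as a, displaced by one sample.
a = [0, 0, 1, 4, 9, 4, 1, 0, 0, 0]
b = [0, 1, 4, 9, 4, 1, 0, 0, 0, 0]
cor = correlation(a, b, window=range(2, 8), shifts=range(-2, 3))
best = min(cor, key=cor.get)  # integer shift with the smallest correlation amount
```

In this example the smallest correlation amount occurs at s1 = 1, where the two signals coincide.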

Next, a real-valued shift amount at which the correlation amount is minimized is calculated from the correlation amounts by sub-pixel computation and taken as the image shift amount p1. The defocus amount is then obtained by multiplying p1 by a conversion coefficient K1 that depends on the image height of the focus detection area, the F-number of the imaging lens (imaging optical system), and the exit pupil distance.
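
The embodiment does not specify how the sub-pixel computation is performed. As one hypothetical possibility, the sketch below refines the integer shift with the smallest correlation amount by fitting a parabola through it and its two neighbors, and then scales the result by K1. The function names, the parabolic method, and the numerical values are assumptions for illustration.

```python
def subpixel_minimum(cor):
    """Refine the integer shift with the smallest COR by a three-point parabola fit.

    cor maps consecutive integer shifts to correlation amounts.
    The parabola fit is an assumed method, not one stated in the embodiment.
    """
    s0 = min(cor, key=cor.get)
    if s0 - 1 not in cor or s0 + 1 not in cor:
        return float(s0)  # minimum at the edge of the shift range: no refinement
    c_minus, c_zero, c_plus = cor[s0 - 1], cor[s0], cor[s0 + 1]
    curvature = c_minus - 2.0 * c_zero + c_plus
    if curvature <= 0:
        return float(s0)  # flat or degenerate neighborhood: keep the integer shift
    return s0 + 0.5 * (c_minus - c_plus) / curvature


def defocus_amount(p1, k1):
    """Defocus amount = conversion coefficient K1 times image shift amount p1."""
    return k1 * p1


cor = {-1: 9.0, 0: 2.0, 1: 3.0}    # hypothetical correlation amounts
p1 = subpixel_minimum(cor)          # 0 + 0.5 * (9 - 3) / (9 - 4 + 3) = 0.375
defocus = defocus_amount(p1, 2.0)   # K1 = 2.0 is an arbitrary example value
```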

In S7, the system control unit 50 drives the lens based on the defocus amount detected in the selected focus detection area. If the detected defocus amount is smaller than a predetermined value, the lens does not necessarily need to be driven.

Next, in S8, the acquisition of a detection/display image and the live view display performed in S1, and the focus detection processing performed in S6, are carried out again. Information on the detected subject area and line-of-sight position is superimposed on the live view display as described above. The processing of S8 may be performed in parallel with the lens drive of S7. The focus detection area may also be changed to follow the line-of-sight position obtained as the live view display is updated. When the focus detection processing is finished, the process proceeds to S9, where the system control unit 50 detects the on/off state of the second shutter switch 64 (Sw2), which indicates an instruction to start shooting. The release (shooting trigger) switch, which is one of the operation units 70, can detect two stages of on/off according to how far it is pressed, and Sw2 corresponds to the second stage. If Sw2 is not detected as on in S9, the system control unit 50 returns to S5 and detects the on/off state of Sw1.

When Sw2 is detected as on in S9, the system control unit 50 advances the process to S10 and determines whether to perform image recording. In this embodiment, image acquisition during continuous shooting switches between processing for recorded images and processing for imaging/display and focus detection. The two may alternate, or imaging/display and focus detection may be performed, for example, once every three acquisitions. This allows highly accurate focus detection without greatly reducing the number of shots per unit time.

If it is determined in S10 that image recording is to be performed, the process proceeds to S300 and the shooting subroutine is executed. Details of the shooting subroutine are described later. After the shooting subroutine is executed in S300, the process returns to S9, where it is determined whether Sw2 is detected as on, that is, whether continuous shooting is being instructed.

If it is determined in S10 that imaging/display and focus detection are to be performed, the process proceeds to S400 and the imaging/display and focus detection processing during continuous shooting is executed. The content of this processing is the same as in S8. The difference is that the display period, display update rate (interval), and display delay of the image captured in S400 differ from those in S8, depending on the continuous shooting frame rate, the recorded-image generation processing, and so on. The system control unit 50, acting as display control means, performs this display control. When the display period, update rate, or display delay of the displayed image changes during continuous shooting, as in this embodiment, the photographer's line-of-sight position is affected to no small degree. In the present invention, considering that errors arise in the detected line-of-sight position depending on the state and switching of these display specifications, the line-of-sight position is processed and the detection processing is controlled appropriately. This makes it possible to obtain an accurate line-of-sight position regardless of changes in the display specifications. As described above, the obtained line-of-sight position information is used for setting the focus detection area, for association with the detected subject area, and so on. Details are described later. After the processing of S400 is executed, the process returns to S9, where it is determined whether Sw2 is detected as on, that is, whether continuous shooting is being instructed.

If Sw1 is not detected as on (or is detected as off) in S5 and the main switch is detected as off in S11, the focus detection and shooting operations end. If the main switch is not detected as off in S11, the process returns to S2 to acquire image data and focus detection data.

[Description of line-of-sight prediction]
Next, the processing applied to the line-of-sight position in order to perform predictive control using the detected line-of-sight position information is described with reference to FIG. 6. FIG. 6 is a flowchart for explaining the line-of-sight prediction method. The processing of FIG. 6 is executed in parallel from S4 of FIG. 5 onward, mainly by the system control unit 50 and the line-of-sight input operation unit 701.

In step S201, the line-of-sight position information detected within a predetermined period is acquired.

The reliability of the line of sight may also be calculated, for example as follows. One way to obtain the reliability of the line-of-sight detection data is to calculate the variance of the detected line-of-sight positions over a certain past time span and take its reciprocal as the reliability evaluation value. With the reciprocal, a small variance, meaning that the line-of-sight information varies little and is stable (high reliability), gives a large reliability value. Conversely, a large variance, meaning that the information varies widely and is unstable (low reliability), gives a small reliability value.
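
As a minimal sketch of the reciprocal-of-variance evaluation described above, assuming a one-dimensional position history and a small constant `eps` (not part of the embodiment) that avoids division by zero when all samples are identical:

```python
from collections import deque
from statistics import pvariance


def gaze_reliability(history, eps=1e-9):
    """Reliability evaluation value = 1 / variance of the recent gaze positions."""
    samples = list(history)
    if len(samples) < 2:
        return 0.0  # assumption: too few samples are treated as unreliable
    return 1.0 / (pvariance(samples) + eps)


# Hypothetical gaze positions (pixels) over the last four frames.
steady = deque([100.0, 101.0, 99.0, 100.0], maxlen=4)    # variance 0.5
jittery = deque([100.0, 120.0, 80.0, 100.0], maxlen=4)   # variance 200
r_steady = gaze_reliability(steady)
r_jittery = gaze_reliability(jittery)
```

The steady history yields the larger reliability value, in line with the description.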

In addition, the longer the focal length, the more the subject being shot is blurred by the user's camera shake. When the user follows the shaking subject by eye, the line of sight cannot track it correctly and the line-of-sight information oscillates. The reliability may therefore also take the focal length into account. Specifically, the reliability may be rated higher for shorter focal lengths and lower for longer focal lengths.

In addition to the above, information obtained from the line-of-sight detection sensor itself according to the degree of eyelid opening may be factored into the reliability of the line-of-sight information. The reliability changes with the degree of eyelid opening for a reason similar to why detection accuracy differs with the line-of-sight position: part of the pupil is hidden by the eyelid. The change in reliability according to eyelid opening can be obtained from the line-of-sight detection sensor. If the line-of-sight detection sensor cannot provide it, information on the degree of eyelid opening may be acquired from a separate sensor and used to evaluate the reliability.

In steps S202 to S205, line-of-sight prediction is performed using control parameters.

In the present invention, which uses a Kalman filter, the number of regression data points n is preferably a fixed value determined in advance. One conceivable alternative is to determine n from the reliability of the line-of-sight information: n is reduced when the reliability is high, to prioritize responsiveness, and increased when the reliability is low, to prioritize suppression of vibration components. However, it can be difficult to choose an appropriate n from the reliability. The variation in line-of-sight detection differs with the photographer and the shooting conditions. For a photographer with large variation, for example one with narrow eyes, a large n would always be set and a large delay would always occur, which may impair the photographer's comfort. It is therefore preferable to determine the optimum n for each photographer in advance, by calibration or the like, and to determine the line-of-sight position during shooting with that fixed n rather than determining n dynamically. Because a fixed n may reduce the responsiveness of the line of sight, using a Kalman filter is an effective way to improve responsiveness.

[Description of the Kalman filter]
The Kalman filter is known to be the optimal filter when the system errors are assumed to be normally distributed and the state transition model is linear. Applying it to the estimation of the line-of-sight position enables filtering that is stable against small eye movements and line-of-sight detection errors and has little delay.

The Kalman filter consists of two equations, a state equation and an observation equation, together with the associated processing. In this embodiment, the state equation is the state transition model, that is, an equation modeling the movement of the line-of-sight position, and the observation equation describes the line-of-sight position detection system. A detailed example is given later. The Kalman filter has a prediction step and a filtering step. The prediction step calculates, from the value at the previous time and according to the state equation (model) given in advance, a prior estimate, which is an estimate of the value at the current time. The filtering step calculates a posterior estimate by correcting the value through interpolation between the observed value at the current time and the prior estimate. The interpolation weight is called the Kalman gain and is calculated from the variance of the error of the observed value relative to the prior estimate. When the error variance of the observed value is small, the observation is considered highly reliable and the posterior estimate is close to the observed value. Conversely, when the variance is large, the posterior estimate is close to the prior estimate. By repeating the prediction step and the filtering step alternately, the state is predicted automatically. The error variances are also updated automatically by the Kalman filter. Some delay is unavoidable even with a Kalman filter, but when the error variance is small a value close to the current observation is obtained as the posterior estimate, so the delay can be kept to a minimum. As noted above, however, n must be set in advance to a value appropriate for the photographer and the shooting conditions.
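
The filtering step described above can be illustrated for a one-dimensional state that is observed directly (observation y = x + w). The sketch below uses the standard scalar form of the Kalman gain. The function name and the numerical values are assumptions for illustration, not values from the embodiment.

```python
def kalman_update(x_prior, p_prior, y, r):
    """One filtering step of a scalar Kalman filter with observation y = x + w.

    x_prior, p_prior: prior estimate and its error variance
    y, r: observed value and the variance of the observation error
    """
    gain = p_prior / (p_prior + r)             # Kalman gain, between 0 and 1
    x_post = x_prior + gain * (y - x_prior)    # interpolate between prior and observation
    p_post = (1.0 - gain) * p_prior            # error variance of the posterior estimate
    return x_post, p_post, gain


# Same prior estimate and observation, with a small and a large observation variance.
x_small_r, _, g_small_r = kalman_update(x_prior=10.0, p_prior=1.0, y=12.0, r=0.01)
x_large_r, _, g_large_r = kalman_update(x_prior=10.0, p_prior=1.0, y=12.0, r=100.0)
```

With the small observation variance the posterior estimate lies close to the observed value 12, and with the large one it stays close to the prior estimate 10.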

[Description of the overfitting problem that arises when a polynomial of third or higher order is used as the state equation of the Kalman filter]
The state equation of a Kalman filter must be an equation that models the state changes of the system being predicted. Predicting the line-of-sight position requires an equation modeling the movement of the line of sight, but when a photographer follows a subject by eye in the viewfinder, the line of sight depends on the photographer's intention, so no clear model equation such as uniform linear motion exists. Nevertheless, when the camera captures 30 frames per second, the human line of sight often moves continuously over a duration of about 10 frames (about 0.3 seconds). The line-of-sight position at a given time during shooting can therefore often be predicted accurately from the line-of-sight data of roughly the preceding 10 frames by assuming continuity of position. On this basis, a regression equation that uses the history of line-of-sight data as regression data can be used approximately as the state equation for line-of-sight prediction. As for the choice of regression equation, first, considering the processing load on the imaging apparatus, a linear Kalman filter, which can be processed at low computational cost, is preferable. The regression equation used as the state equation is therefore preferably a polynomial regression equation, whose prediction is a linear function of the regression data. Next, considering the stability of line-of-sight prediction, a low-order polynomial is preferable. The human line of sight makes small involuntary movements, and for the photographer to comfortably select a subject or ranging point while watching the line-of-sight position displayed on the camera's display device, the displayed line of sight must be smoothed by a filter. A high-order polynomial regression equation can reduce the error at each point of the regression data, but so-called overfitting causes the predicted values to oscillate. A low-order regression equation of order 0 to 2 is therefore preferable for line-of-sight prediction. A regression equation of third or higher order has one or more inflection points and can have both a local maximum and a local minimum, which may appear as oscillation.
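
The effect can be illustrated numerically. The sketch below, which is an illustration and not part of the embodiment, extrapolates six samples of a linear gaze trajectory carrying a small alternating disturbance one frame ahead, once with a least-squares straight line and once with the fifth-order polynomial that passes exactly through all six points (the extreme case of a high-order fit). The data and function names are contrived assumptions.

```python
def linear_extrapolate(ts, xs, t_next):
    """Least-squares straight line through (ts, xs), evaluated at t_next."""
    n = len(ts)
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    slope = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs)) / s_tt
    return x_mean + slope * (t_next - t_mean)


def interpolating_extrapolate(ts, xs, t_next):
    """Polynomial of order n-1 through all n points (Lagrange form), evaluated at t_next."""
    total = 0.0
    for i, (ti, xi) in enumerate(zip(ts, xs)):
        weight = 1.0
        for j, tj in enumerate(ts):
            if j != i:
                weight *= (t_next - tj) / (ti - tj)
        total += weight * xi
    return total


ts = [0, 1, 2, 3, 4, 5]
xs = [t + 0.1 * (-1) ** t for t in ts]        # trend x = t plus a +/-0.1 disturbance
low = linear_extrapolate(ts, xs, 6)           # approximately 5.94
high = interpolating_extrapolate(ts, xs, 6)   # approximately -0.3
```

The undisturbed trend would give 6 at the next frame. The straight line stays within 0.1 of that value, while the fifth-order polynomial amplifies the 0.1 disturbance into an error of about 6.3.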

[Embodiment using a low-order polynomial regression equation as the state equation of the Kalman filter]
In this embodiment, the case where the line-of-sight position is predicted from past history data of the line-of-sight position using linear regression and a Kalman filter is described with reference to FIGS. 6 and 7. FIG. 7 shows an example with n = 6.

After processing starts, the line-of-sight position is acquired in step S201. At the same time, the history of the past n line-of-sight positions is accumulated. Let m be the number of accumulated data points. While m is less than n, the processing of steps S202 to S205 described below may be performed with n = m, or processing may start only after line-of-sight data have been accumulated until m = n.
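
One way to hold the history is sketched here under the assumption of a fixed-length buffer (the class and method names are hypothetical). It covers both options described above: using n = m while the buffer fills, or waiting until m = n.

```python
from collections import deque


class GazeHistory:
    """Keeps the most recent n gaze positions."""

    def __init__(self, n):
        self.n = n
        self.samples = deque(maxlen=n)  # the oldest sample is dropped automatically

    def add(self, x):
        self.samples.append(x)

    def window(self, wait_until_full=False):
        """Return the regression data, or None if processing should not start yet."""
        m = len(self.samples)
        if m == 0 or (wait_until_full and m < self.n):
            return None
        return list(self.samples)  # while m < n this uses n = m


history = GazeHistory(n=6)
for x in [10.0, 11.0, 12.0, 13.0]:
    history.add(x)
partial = history.window()                        # four samples, used with n = m
deferred = history.window(wait_until_full=True)   # None until six samples exist
```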

In the flowchart of FIG. 6, step S202 immediately follows step S201. However, because steps S201 to S205 are processed repeatedly, the explanation begins with step S203 for convenience.

After processing starts, the initial value of the prior estimate x_q of the line-of-sight position x in step S202 may be the observed value as it is, or may be, for example, the value at the coordinate origin. As is well known, the initial value of the variance P_q of the prior estimate error e_q, described later, may be a small value greater than 0, such as about 10^-2.

Let the line-of-sight coordinates be (x, y). The one-dimensional coordinate x is described here for simplicity; the same applies to y.

Assume that a plurality of frames are captured continuously in live view shooting by the imaging apparatus. Let t(k) be the time of the k-th captured frame (hereinafter, frame k), and let the linear regression equation be Equation (1). Here t is time and is the independent variable.

FIG. 7(a) illustrates the linear regression calculation using the n data points from time t(k−n+1) to t(k). The calculated regression line is L_f(k).

In step S203, the regression coefficients a(k) and b(k) are calculated.

The regression coefficients a(k) and b(k) of Equation (1) are calculated from the n regression data points by the least-squares method using Equations (2) and (3). Alternatively, a(k) and b(k) may be calculated by a known recursive least-squares method or the like.

From this, the prior estimate x_q(k+1) for the next frame, frame k+1, is calculated in step S204 by Equation (4), using the posterior estimates x_p at times t(k−n+1) to t(k) as the regression data, as shown in FIG. 7(b).
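
Steps S203 and S204 can be sketched as follows, assuming that Equation (1) has the form x = a(k)·t + b(k) with a(k) the slope and b(k) the intercept, and that Equations (2) and (3) are the ordinary closed-form least-squares solutions. The equations themselves are not reproduced in this text, so these forms and the function names are assumptions.

```python
def fit_line(ts, xs):
    """Least-squares straight line x = a*t + b through (ts, xs); returns (a, b)."""
    n = len(ts)
    if n < 2:
        raise ValueError("at least two samples are needed to fit a line")
    t_mean = sum(ts) / n
    x_mean = sum(xs) / n
    s_tt = sum((t - t_mean) ** 2 for t in ts)
    s_tx = sum((t - t_mean) * (x - x_mean) for t, x in zip(ts, xs))
    a = s_tx / s_tt            # assumed role of a(k): slope
    b = x_mean - a * t_mean    # assumed role of b(k): intercept
    return a, b


def prior_estimate(ts, xs_post, t_next):
    """Evaluate the fitted line at the time of the next frame (assumed form of S204)."""
    a, b = fit_line(ts, xs_post)
    return a * t_next + b


frame = 1.0 / 30.0                           # 30 frames/second
ts = [i * frame for i in range(6)]           # n = 6, as in the FIG. 7 example
xs_post = [50.0 + 120.0 * t for t in ts]     # hypothetical noise-free posterior estimates
x_q_next = prior_estimate(ts, xs_post, 6 * frame)   # 74.0 up to rounding
```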

From Equations (2) and (3), a(k) and b(k) in Equation (4) contain terms of at most first order in x(k), so Equation (4) is linear in the position x(k).

In this embodiment, the state equation of the Kalman filter is Equation (5).

Equation (5) is a linear state equation in which A(k) and u(k) partly contain the regression coefficients. It is obtained by rearranging Equation (4) with respect to x(k). v(k) is a modeling error component due to fixational eye movement and the like, and is assumed to be a random variable following a normal distribution with mean 0 and variance Q(k).
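Since Equations (4) and (5) are not reproduced in this text, the following is only one possible reading of the rearrangement, assuming an ordinary least-squares line fit over the n most recent a posteriori estimates. The weights w_i(k), the mean time \bar{t}, and the sum S_tt are symbols introduced here for illustration; the patent's own definitions of A(k) and u(k) may differ.

```latex
% Assumed form of the extrapolated regression line (Equation (4)):
x_q(k+1) = a(k)\,t(k+1) + b(k) = \sum_{i=k-n+1}^{k} w_i(k)\,x_p(i),
\qquad
w_i(k) = \frac{1}{n} + \frac{\bigl(t(i)-\bar{t}\bigr)\bigl(t(k+1)-\bar{t}\bigr)}{S_{tt}},

% where
\bar{t} = \frac{1}{n}\sum_{i=k-n+1}^{k} t(i),
\qquad
S_{tt} = \sum_{i=k-n+1}^{k} \bigl(t(i)-\bar{t}\bigr)^{2}.

% Separating the newest sample from the older ones gives a linear state equation:
x(k+1) = A(k)\,x(k) + u(k) + v(k),
\qquad
A(k) = w_k(k),
\qquad
u(k) = \sum_{i=k-n+1}^{k-1} w_i(k)\,x_p(i).
```

Under this reading, A(k) depends only on the frame times, and u(k) collects the contribution of the older estimates, which are treated as known inputs.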

Q(k) may be set to a known average value for human fixational eye movement, or to the variance of the line-of-sight positions measured, after calibration of the line-of-sight detection device, for an index point whose true coordinates are known.

The observation equation of the Kalman filter is Equation (6).

w(k) is the measurement error and is assumed to be a random variable following a normal distribution with mean 0 and variance R(k).

R(k) may be set to the error of the line-of-sight detection device measured in advance using, for example, an index image whose true coordinates are known. Alternatively, it may be updated dynamically, every frame or every few frames, from the variance of the error of the observed values relative to the a posteriori estimates of the line-of-sight position.
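The dynamic update of R(k) might look like the sketch below, which keeps a sliding window of residuals between observed values and a posteriori estimates and uses their sample variance. The class name `ObservationVarianceEstimator`, the window length, the initial value, and the lower bound are all assumptions made for illustration; the text does not specify them.

```python
from collections import deque


class ObservationVarianceEstimator:
    """Sliding-window estimate of the observation error variance R."""

    def __init__(self, window=30, initial_r=1.0, floor=1e-6):
        self.residuals = deque(maxlen=window)  # oldest residuals drop out
        self.r = initial_r
        self.floor = floor  # keeps R strictly positive

    def update(self, observed, posterior):
        """Add one residual and return the current estimate of R."""
        self.residuals.append(observed - posterior)
        m = len(self.residuals)
        if m >= 2:
            mean = sum(self.residuals) / m
            var = sum((e - mean) ** 2 for e in self.residuals) / (m - 1)
            self.r = max(var, self.floor)
        return self.r
```

Calling `update` once per frame corresponds to updating every frame; calling it less often corresponds to updating every few frames.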

The error e_q(k) of the a priori estimate and the error e_p(k) of the a posteriori estimate are defined by Equations (7) and (8). x_p(k) is the a posteriori estimate of the line-of-sight position x.

Let P_q(k) be the variance of e_q(k) and P_p(k) the variance of e_p(k). The error variance is likewise updated by Equation (9).

Next, as shown in FIG. 7(c), the a posteriori estimate x_p(k+1) is calculated in the filtering step of the Kalman filter.

First, the Kalman gain G(k) is calculated by Equation (10).

Then, using the Kalman gain, the a posteriori estimate is updated by Equation (11).

This equation interpolates the a posteriori estimate between the a priori estimate and the observed value, with the Kalman gain as the weight. When the variance R of the observation error is large compared with the error variance P_q of the a priori estimate, the Kalman gain G ≈ 0 and the a posteriori estimate x_p is close to the a priori estimate. Conversely, when the error variance P_q of the a priori estimate is larger than the variance R of the observation error, the Kalman gain G ≈ 1 and the a posteriori estimate x_p is close to the observed value x_m.

The error variance is updated by Equation (12).
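The filtering step can be illustrated with the standard scalar Kalman update, assuming an observation of the form x_m = x + w. Equations (10) to (12) are not reproduced in this text, so this sketch is assumed to correspond to them rather than being copied from them; the function name `kalman_update` is hypothetical.

```python
def kalman_update(x_q, p_q, x_m, r):
    """Standard scalar Kalman filtering step.

    x_q, p_q: a priori estimate and its error variance
    x_m, r:   observed value and observation error variance
    Returns (x_p, p_p, g): a posteriori estimate, its variance, Kalman gain.
    """
    if p_q < 0 or r < 0 or p_q + r == 0:
        raise ValueError("variances must be non-negative and not both zero")
    g = p_q / (p_q + r)            # gain: 0 trusts the prediction, 1 the observation
    x_p = x_q + g * (x_m - x_q)    # interpolate between prediction and observation
    p_p = (1.0 - g) * p_q          # a posteriori error variance
    return x_p, p_p, g
```

With r much larger than p_q the gain approaches 0 and the result stays near the prediction; with r much smaller than p_q the gain approaches 1 and the result follows the observation, matching the behavior described above.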

Next, as shown in FIG. 7(d), the time is advanced by one frame, and the same processing is repeated in steps S201 to S205.

By repeating the above computation for each successive captured frame, the line-of-sight position that minimizes the variance of the a posteriori estimate is calculated and predicted automatically, according to the magnitude of the error in the observed values.
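The per-frame loop can be sketched end to end as follows. This is a simplified illustration under several assumptions that are not taken from the patent: the prediction is an ordinary least-squares line through up to n previous a posteriori estimates, the weight of the newest estimate plays the role of A(k) in the variance prediction P_q = A(k)^2·P_p + Q, Q and R are constants, and the second frame falls back to holding the previous estimate because a line cannot be fitted to a single point. The names `regression_kalman_filter`, `n`, `q`, `r`, and `p0` are hypothetical.

```python
def _prediction_weights(times, t_next):
    """Weights of each past sample in the extrapolated least-squares line."""
    n = len(times)
    t_mean = sum(times) / n
    s_tt = sum((t - t_mean) ** 2 for t in times)
    if s_tt == 0:
        raise ValueError("timestamps must not all be identical")
    return [1.0 / n + (t - t_mean) * (t_next - t_mean) / s_tt for t in times]


def regression_kalman_filter(times, observations, n=5, q=1e-2, r=1.0, p0=1e-2):
    """Return the a posteriori estimate for every frame."""
    if len(times) != len(observations):
        raise ValueError("times and observations must have equal length")
    if n < 2 or q < 0 or r <= 0 or p0 < 0:
        raise ValueError("invalid filter parameters")
    posteriors = []
    p_p = p0
    for k, (t_k, x_m) in enumerate(zip(times, observations)):
        hist_t = times[max(0, k - n):k]
        hist_x = posteriors[max(0, k - n):k]
        if k == 0:
            # Initial a priori estimate: the observed value itself.
            x_q, p_q = x_m, p0
        elif k == 1:
            # Assumed fallback: hold the previous estimate.
            x_q, p_q = hist_x[-1], p_p + q
        else:
            # Prediction step: extrapolate the regression line to time t_k.
            w = _prediction_weights(hist_t, t_k)
            x_q = sum(wi * xi for wi, xi in zip(w, hist_x))
            p_q = w[-1] ** 2 * p_p + q
        # Filtering step: blend prediction and observation.
        g = p_q / (p_q + r)
        x_p = x_q + g * (x_m - x_q)
        p_p = (1.0 - g) * p_q
        posteriors.append(x_p)
    return posteriors
```

A large `r` makes the output follow the regression line and suppress jitter (stability), while a small `r` makes it track each new observation (responsiveness).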

Although the embodiment describes line-of-sight prediction using continuously acquired line-of-sight information, part of the line-of-sight information may be unavailable during acquisition, for example because of blinking or shutter operation. Even in that case, prediction can be performed by taking the missing interval into account and appropriately associating each piece of line-of-sight information with the time at which it was acquired; continuous line-of-sight information is not strictly required.
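One way to handle such gaps is to keep each sample paired with its acquisition time and fit the regression line only to the samples that were actually obtained. The sketch below assumes that missing samples are marked with `None` and that an ordinary least-squares line is used; the function name `predict_with_gaps` and the fallbacks for fewer than two valid samples are hypothetical.

```python
def predict_with_gaps(samples, t_next):
    """Extrapolate a least-squares line to t_next, skipping missing samples.

    samples: list of (time, value) pairs; value is None where detection failed.
    Returns None if no valid sample exists.
    """
    valid = [(t, x) for t, x in samples if x is not None]
    if not valid:
        return None
    if len(valid) == 1:
        return valid[0][1]  # a single point: hold its value
    n = len(valid)
    t_mean = sum(t for t, _ in valid) / n
    x_mean = sum(x for _, x in valid) / n
    s_tt = sum((t - t_mean) ** 2 for t, _ in valid)
    if s_tt == 0:
        return x_mean  # identical timestamps: fall back to the mean
    slope = sum((t - t_mean) * (x - x_mean) for t, x in valid) / s_tt
    return x_mean + slope * (t_next - t_mean)


# Two frames lost to a blink between t = 1 and t = 4.
blink = [(0.0, 1.0), (1.0, 3.0), (2.0, None), (3.0, None), (4.0, 9.0)]
estimate = predict_with_gaps(blink, 5.0)
```

Because the fit uses the true timestamps, the gap does not distort the estimated slope.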

The embodiment shows an example in which the order of the regression equation is first order and is held fixed during shooting. However, the order of the regression equation may be changed dynamically, taking into account, for example, the proportion of a given time span over which detection information was acquired, the line-of-sight position error over a given time span, or the reliability of the line of sight.

Although the embodiment describes an example in which the present invention is implemented in a digital camera, it may be applied to any device that performs line-of-sight detection, such as a head-mounted display, a smartphone, or a PC.

The order of the steps in the operations described with the flowcharts in the above embodiment may be changed as appropriate, provided that the same purpose is achieved.

The present invention can also be realized by supplying a program that implements one or more functions of the above embodiment to a system or device via a network or a storage medium, and having one or more processors in a computer of that system or device read and execute the program. It can also be realized by a circuit (for example, an ASIC) that implements one or more functions.

Claims

1. A processing device comprising:
acquisition means for acquiring line-of-sight information corresponding to the line-of-sight position of a person; and
correction means for correcting the line-of-sight information,
wherein the correction means corrects the line-of-sight information by a Kalman filter using a low-order polynomial regression equation.

3. The processing device according to claim 2, wherein the correction means uses a polynomial of degree 0 to 2 as the low-order polynomial.

4. The processing device according to any one of claims 1 to 3, further comprising display means for displaying an image,
wherein the acquisition means acquires, as the line-of-sight information, information corresponding to the line-of-sight position on the display means at which the person gazes.

A method of controlling a processing device, the method comprising:
an acquisition step of acquiring line-of-sight information corresponding to the line-of-sight position of a person; and
a correction step of correcting the line-of-sight information,
wherein, in the correction step, the line-of-sight information is corrected using a Kalman filter.