WO2024166116A1

WO2024166116A1 - Retina image references for gaze tracking

Info

Publication number: WO2024166116A1
Application number: PCT/IL2024/050158
Authority: WO
Inventors: Ori WEITZ
Original assignee: Immersix Ltd
Current assignee: Immersix Ltd
Priority date: 2023-02-09
Filing date: 2024-02-11
Publication date: 2024-08-15
Anticipated expiration: 2025-08-09
Also published as: IL300550A

Abstract

A system and method for collecting reference images for retina-images-based gaze tracking, include, capturing images of a user's retina, while the user is looking at a known unmoving gaze target with continuous eye movement. Instructions are relayed to the user to move the user's head while looking at the known gaze target and output of a user interface (UI) is changed in accordance with the user's head movement. Data obtained from the images of the user's retina, which are obtained while the user is looking at the unmoving gaze target and moving the user's head, and a direction of gaze associated with each of the images, is stored in a reference database. An under- imaged portion of the retina may be detected and the UI may be controlled to cause the user to move the eye or head in accordance with the detected under-imaged portion of the retina, to expose the under-imaged portion of the retina to a camera.

Description

RETINA IMAGE REFERENCES FOR GAZE TRACKING

FIELD

[001] The present invention relates to gaze tracking based on images of a person’s retina.

BACKGROUND

[002] Eye tracking to determine direction of gaze (also referred to as gaze tracking) may be useful in different fields, including human-machine interaction control of devices such as industrial machines, in aviation, and emergency room situations where both hands are needed for tasks other than operation of a computer, in virtual, augmented or extended reality applications, in computer games, in entertainment applications and also in research, to better understand subjects' behavior and visual processes. In fact, gaze tracking methods can be used in all the ways that people use their eyes.

[003] Some video-based eye trackers use features of the eye, such as, corneal reflection, center of the pupil of the eye and features from inside the eye, such as the retinal blood vessels, as features from which to reconstruct the optical axis of the eye and/or as features to track in order to measure movement of the eye.

[004] In retinal image-based tracking systems, in order to obtain information on a user’s eye properties and on the relationship between the user’s eye properties and the user’s direction of gaze, users are typically asked to look at several known gaze targets, so that images of the eye, when gazing at the known targets, can be recorded and mapped to the known target. However, obtaining user eye properties using a method that requires discontinuous motion of the eye, is usually lengthy and inconvenient due to fixation of the eye each time the user looks at a new gaze target. This is especially bothersome for retinal image-based tracking systems, since a large area of the retina needs to be imaged, which is done by obtaining many images, each image covering only a small portion of the retina.

[005] Some methods of mapping have the user follow a moving target with their eyes as the target moves around a display, thereby avoiding fixation of the eye.

[006] However, none of the existing methods ensure obtaining a large number of retinal images to enable reliable and accurate mapping, in a quick and user-friendly manner. SUMMARY

[007] Embodiments of the invention provide a system and method for gaze tracking and for collecting reference images of a person’s retina, which is shorter and more user-friendly than existing mapping methods and which ensures wide enough coverage of the retina to enable accurate, uninterrupted gaze tracking.

[008] In one embodiment, a gaze tracking system and method include using a camera to capture images of a person’s retina during continuous eye movement of the person, and using a user interface (UI) configured to display a known gaze target for the person to look at with continuous eye movement. A processor of the system compares information of the images of the person’s retina associated with the known gaze target, with image information of the person’s retina while the person is looking at an unknown gaze target, to calculate a location of the unknown gaze target. [009] Thus, reference images (e.g., images of the person’s retina associated with a known gaze target) obtained according to embodiments of the invention can be used to track a person’s gaze (e.g., by comparing the reference images with image information of the person’s retina while the person is looking at an unknown gaze target). Obtaining reference images during continuous eye movement provides a short and user-friendly method for enrolling users.

[0010] In another embodiment, a method for collecting reference images of a person’s retina includes providing a gaze target at one or more known locations, for a person to look at and while the person is looking at the gaze target, obtaining from a camera, a plurality of images of a retina of an eye of the person, each of the images capturing a different portion of the retina and each of the images associated with a direction of gaze. Based on the obtained images and/or based on the directions of gaze associated with the images, an under-imaged portion of the retina may be detected. The method includes controlling a display to cause or motivate the person to move the eye(s) or head such that the under-imaged portion of the retina becomes exposed to the camera. For example, the display may be controlled to provide a visual presentation indicative of one or more under-imaged portions of the retina. In one embodiment, a gaze target is displayed at new locations and/or the display may be controlled to display instructions for head movement of the person, such that when the person looks at the gaze target, rotation of the person’s eye will cause under-imaged portions of the retina to become exposed.

[0011] Detecting areas of the retina that are under-imaged and then causing rotation of the eye so as to enable capturing images of the under-imaged area, contributes to completeness of a reference database. Using a more complete reference database for gaze tracking enables accurate and uninterrupted gaze tracking.

BRIEF DESCRIPTION OF THE FIGURES

[0012] The invention will now be described in relation to certain examples and embodiments with reference to the following illustrative figures so that it may be more fully understood. In the drawings:

[0013] Figs. 1A-C schematically illustrate examples of systems for gaze tracking, according to embodiments of the invention;

[0014] Figs. 2A-C schematically illustrate methods for collecting reference images of a person’s retina, according to embodiments of the invention;

[0015] Figs. 3A-C schematically illustrate methods for collecting reference images of a person’s retina, according to additional embodiments of the invention; and

[0016] Figs. 4A-B schematically illustrate panorama images of the person’s retina, according to embodiments of the invention.

DETAILED DESCRIPTION

[0017] Embodiments of the invention provide systems and methods for gaze tracking using images of a person’s retina and for collecting reference images for retina-images-based gaze tracking. In some embodiments the person’s retina is imaged while the person is continuously looking at a single unmoving gaze target. Although the person is looking at a single and still gaze target, a wide portion of the retina can be imaged in a quick and convenient manner due to movement of the person’s head (while keeping the person’s gaze on the target).

[0018] A ray of gaze corresponding to the person’s gaze includes the origin of the ray and its direction. The origin of the ray can be assumed to be at the optical center of the person’s eye lens whereas the direction of the ray is determined by the line connecting the origin of the ray and the gaze target. The direction of the ray of gaze is derived from the orientation of the eye.

[0019] A change in orientation of a person’s eye from one gaze target to another can be calculated based on comparing to each other different images of the person’s retina associated with different gaze targets. Thus, a change in orientation of a person’s eye between a known gaze target and an unknown gaze target can be calculated by comparing an image of the person’s retina associated with an unknown gaze target to a reference image, namely, an image of the person’s retina which is associated with a known gaze target.

[0020] Some embodiments of the invention provide systems and methods for collecting reference images of a person’s retina, e.g., for gaze tracking. “Reference images” include images or features or information from images of different portions of a retina of a specific eye of a specific person (also termed “user”) and/or images of the retina associated with a known gaze target (i.e., a known direction of gaze of the specific person). So, essentially, reference images provide retinal features to visually identify a region of a person’s retina at a later time, and the relative orientation of the person’s visual axis to these features.

[0021] Systems and methods according to embodiments of the invention, are exemplified below.

[0022] In the following description, various aspects of the present invention will be described. For purposes of explanation, specific configurations and details are set forth in order to provide a thorough understanding of the present invention. However, it will also be apparent to one skilled in the art that the present invention may be practiced without the specific details presented herein. Furthermore, well known features may be omitted or simplified in order not to obscure the present invention.

[0023] Unless specifically stated otherwise, as apparent from the following discussions, it is appreciated that throughout the specification discussions utilizing terms such as “analyzing”, "processing", "computing", "calculating", “comparing”, "determining", “detecting”, “identifying”, “creating”, “producing”, “controlling”, “tracking”, “choosing”, or the like, refer to the action and/or processes of a computer or computing system, or similar electronic computing device, that manipulates and/or transforms data represented as physical, such as electronic, quantities within the computing system's registers and/or memories into other data similarly represented as physical quantities within the computing system's memories, registers or other such information storage, transmission or display devices. Unless otherwise stated, these terms refer to automatic action of a processor, independent of and without any actions of a human operator.

[0024] In one embodiment of the invention, a gaze tracking system includes a camera to capture images of a user’s retina, while the user is looking at a known gaze target with continuous eye movement, and a processor in communication with the camera. The processor may also be in communication with a user interface (UI) and a reference database. The processor may cause display of the known gaze target at a substantially unchanging location. For example, the processor may cause display of the known gaze target on a display of a UI device, e.g., on a monitor of a personal device such as a computer or phone, or on a display of an XR device, as further detailed below.

[0025] The processor may further cause transmission of instructions to the user to move the user’s head while looking at the known gaze target (which is typically not moving, namely, at a substantially unchanging location in the real and/or virtual world). The instructions may be provided to the user as output from the UI device and may include, for example, text, sound and/or instructive graphics. The processor may then cause a change in output of the UI in accordance with the user’s head movement. For example, a visual indication on a UI display may be changed in accordance with the user’s head movement and/or other output of a UI (such as sound or light) may be changed in accordance with the user’s head movement. The instructions output to the user and the change in output of the UI can provide the user with an indication of the particular head movements required from the user to achieve eye rotations that will expose to a camera a wide area (e.g., wide angular area) of the user’s retina.

[0026] The camera captures images of the user’s retina while the user is looking at the substantially unmoving gaze target and while the user is achieving eye rotations by moving the user’s head according to the instructions output from the UI. The processor stores in a reference database data obtained from these images, typically together with a direction of gaze associated with each of the images.

[0027] The known gaze target may include a graphical element, such as a shape and the instructions may indicate to the user how to move the head relative to the shape, to obtain a particular eye rotation that will expose to the camera a plurality of different portions of the retina of the user. A change in output of the UI may include, for example, a visual indication displayed within the shape.

[0028] The eye rotations caused by the head movement in accordance with the instructions, expose to the camera a plurality of different portions of the retina of the user, thereby enabling to capture images of a wide area of the retina.

[0029] In some embodiments, the processor tracks a trajectory of the user’s head and the output of the UI may change in accordance with the trajectory. For example, a visual indication provided by the processor may change in accordance with the trajectory. [0030] The system may include a front-facing camera typically coupled to the user’s head, to capture images of the world. In some embodiments, the system includes an XR device. The frontfacing camera may be part of or connected to the XR device.

[0031] In some embodiments, the processor tracks a trajectory of the user’s head by tracking movement of the front-facing camera.

[0032] In some embodiments, the processor is configured to detect an under-imaged portion of the user’s retina and cause transmission of instructions prompting the user to achieve eye rotations configured to make the under-imaged portion visible to the camera.

[0033] Examples of some aspects of embodiments of the system of the invention are described with reference to Figs. 1 A-C, below.

[0034] In one example, which is schematically illustrated in Fig. 1A, a system is configured to capture images of a person’s retina, e.g., during continuous eye movement of the person. The system includes one or more camera 103 and a user interface (UI) device 106, e.g., a device capable of providing visual, acoustic or other output to a user. In one embodiment UI device 106 is configured to display a target at a known location or locations, for the person to look at continuously. The location of the target is typically at a known location in relation to a frame of reference (e.g., a coordinate system), for example, in relation to a frame of reference of the display of the UI device 106 or the frame of reference of camera 103.

[0035] Continuously looking at a target typically requires keeping an eye (or eyes) on the target. If the target moves continuously, the eye rotates so that it can keep on the target while the target changes locations, enabling a camera to capture a wide area of the retina. A wide range of angles of rays of gaze can be provided and a wide area of the retina can be captured by the camera even if the target does not move, by keeping the eye on the target but moving the head (e.g., back and forth or in a circle). Motion of the head while gazing at a single unmoving target, which changes the angle of ray of gaze, rotates the eye relative to the head. Thus, if camera 103 is coupled to the user’s head (e.g., if camera 103 is located on a head mounted device, such as glasses, e.g., as illustrated in Fig. IB), motion of the head rotates the eye relative to camera 103, which enables capturing images of many different portions of the retina. The rotation of the eye while tracking a continuously moving target or when moving the head while the gaze is fixed on a single unmoving target, provides continuous motion, as opposed to the discontinuous motion of the eye when looking at different targets appearing at different locations. This continuous motion enables a relatively quick and therefore user-friendly process for collecting reference images.

[0036] UI device 106 may include a display, such as a monitor or screen, for displaying targets and instructions and/or notifications to a user (e.g., via text or other content displayed on the monitor).

[0037] A processor 102, which is in communication with camera 103 and with UI 106, can cause a gaze target to be displayed at a known location (or locations) on a display of UI device 106 and can cause the gaze target to move continuously on the display. In some embodiments, processor 102 can cause instructions to be displayed on the display of UI device 106, to prompt a user to keep the user’s gaze on a moving target and/or to move the user’s head while keeping the gaze on an unmoving target.

[0038] In one embodiment, which is schematically illustrated in Fig. IB, processor 102 causes a target 120 (which may be a graphical element, such as a 2-dimensional shape, e.g., a rectangle or circle) to be displayed on UI device 106. In some embodiments target 120 may be a virtual target displayed in a virtual world. The user 104 is prompted (possibly via instructions displayed on UI device 106) to look at a point in target 120 and to move the head while keeping the gaze fixed on the point in target 120. Processor 102 may track a trajectory of the head movement (e.g., as illustrated by the dashed arrows) of the user 104 while the user 104 is continuously looking at target 120 (typically at one point in target 120) and may cause the area or space of target 120 to be filled-in with marking 122 on the UI display, or, for example, may cause marking to be erased from the area of target 120, in accordance with the trajectory. Marking 122 can provide an indication for user 104 of how much more and where to direct head movement in order to fill-in (or erase) more of shape of target 120.

[0039] In the example illustrated in Fig. IB, the system may include an XR device. An XR device may include augmented reality (AR), virtual reality (VR) and/or mixed reality (MR) optical systems, such as Uumus™ DK-Vision, Microsoft™ Hololens, or Magic Ueap One™. The XR device may include, for example a VR headset or, as exemplified in Fig. IB, XR (e.g., VR or AR) glasses 110. A front-facing “world camera” 130 that captures images of the world, is attached to XR glasses 110. The display of UI device 106 may be part of an XR device, such as XR glasses 110. In some embodiments, UI device 106 may include markings 116 that will be visible in images captured by world camera 130 and which can mark the edges of UI 106, assisting in determining the location of a displayed target within a coordinate system of the world camera 130.

[0040] Camera 103 and possibly processor 102 may be attached to XR glasses 110. Camera 103 can obtain an image of a portion of the person’s retina, via the pupil of the eye, with minimal interference or limitation of the person’s field of view. For example, camera 103 may be located at the periphery of a person’s eye (e.g., below, above or at one side of the eye) a couple of centimeters from the eye.

[0041] Processor 102 may track the trajectory of the user’s 104 head movement, e.g., by tracking movement of world camera 130.

[0042] In another embodiment, which is schematically illustrated in Fig. 1C, processor 102 moves a target 142 continuously on a display of UI 106. Target 142 may be moved in a predetermined pattern or in a random pattern. User 104 is prompted (possibly via instructions displayed on UI 106) to look at target 142 without moving the head. Camera 103, which may be attached onto XR glasses 110 can obtain an image of a portion of the person’s retina, as described above, while the person is gazing at target 142.

[0043] As discussed above, having a user continuously gaze at a moving target without moving the head and/or gaze at an unmoving target while moving the head, enables easily capturing many reference images covering a wide typically angular portion of the user’s retina. The reference images may be used by processor 102 to calculate an unknown location of a gaze target by comparing information from reference images (e.g., images of the person’s retina associated with a known location), with image information of the person’s retina while the person is looking at a target at an unknown location.

[0044] Referring back to Fig. 1 A, camera 103 may include a CCD or CMOS or other appropriate image sensor. Camera 103 images the retina by converting rays of light from a particular point of the retina to a pixel on the camera sensor. Camera 103 may include an optical system which may include a lens 107 and possibly additional optics such as mirrors, filters, beam splitters and polarizers.

[0045] In some embodiments, lens 107 has a wide depth of field or an adjustable focus. In some embodiments lens 107 may be a multi-element lens.

[0046] The system may include one or more light source(s) 105 configured to illuminate the person’s eye. Light source 105 may include one or multiple illumination sources and may be arranged, for example, as a circular array of LEDs surrounding the camera 103 and/or lens 107. Light source 105 may illuminate at a wavelength which is undetected by a human eye (and therefore unobtrusive), for example, light source 105 may include an IR LED or other appropriate IR illumination source. The wavelength of the light source (e.g., the wavelength of each individual LED in the light source), may be chosen so as to maximize the contrast of features in the retina and to obtain an image rich with details. In some embodiments, light source 105 may include a miniature light source which may be positioned in close proximity to the camera lens 107, e.g., in front of the lens, on the camera sensor (behind the lens) or inside the lens.

[0047] Processor 102 may be in communication with light source 105 to control, for example, the intensity and/or timing of illumination, e.g., to be synchronized with operation of camera 103. Different LEDs, having different wavelengths can be turned on or off to obtain different wavelength illumination. In one example, the amount of light emitted by light source 105 can be adjusted by processor 102 based on the brightness of the captured image. In another example, light source 105 can be controlled to emit different wavelength lights such that different frames can capture the retina at different wavelengths, and thereby capture more detail. In yet another example, light source 105 can be synchronized with the camera 103 shutter. In some embodiments, short bursts of very bright light can be emitted by light source 105 to prevent motion blur, rolling shutter effect, or reduce overall power consumption.

[0048] Processor 102 may receive image data from camera 103 based on which processor 102 can control light source 105 (e.g., as described above). In additional embodiments, processor 102 receives image data from camera 103 and may calculate a change in orientation of an eye of a person (e.g., a user), and possibly determine the person’s direction of gaze, based on the received image data.

[0049] Image data may include representations of the image as well as partial or full images or videos of the retina or portions of the retina. Representations of the image may include, for example, information describing the image, e.g., key features detailing a location of one or more pixels and information about the image at that location. Information about the image may include, for example, pixel values that represent the intensity of light reflected from a person’s retina, a histogram of gradients around the location of the pixel(s), a blood vessel between two points having a given thickness, a blotch described by an ellipse, etc. [0050] Processor 102 may calculate a direction of gaze associated with each image of the user’s retina, using relative locations of, for example, the camera 103, the world camera 130, the known gaze target and the user’s eye, when each image was captured. For example, a line connecting the world camera 130 and the known gaze target may be used to estimate the direction of gaze of the user while looking at the known gaze target. In another embodiment, a user’s eye pupil may be detected in the image and the center of the pupil may thus be estimated. The three-dimensional location of the gaze target may be transformed from coordinates of the world camera 130 to coordinates of camera 103 (based on the known pose of each camera in relation to the other). Within the coordinate system of camera 103, a line connecting the center of pupil and the gaze target can be estimated to be the direction of gaze.

[0051] In some embodiments processor 102 obtains from camera 103 a plurality of images of a retina of an eye of the person, each of the images capturing a different portion of the retina, each image associated with a direction of gaze. The processor detects an under-imaged portion of the retina based on either one of or on a combination of the obtained images and the directions of gaze associated with the images. UI device 106 provides instructions causing the person to move the eye or head in accordance with the detected under-imaged portion of the retina.

[0052] Processor 102, which may be locally embedded or remote, may include, for example, one or more processing units including a central processing unit (CPU), a graphics processing unit (GPU), a digital signal processor (DSP), a field-programmable gate array (FPGA), a microprocessor, a controller, a chip, a microchip, an integrated circuit (IC), or any other suitable multi-purpose or specific processing or controlling unit.

[0053] In some embodiments, processor 102 may be in communication with a storage device 108 such as a server, including, for example, volatile and/or non-volatile storage media, such as a hard disk drive (HDD) or solid-state drive (SSD). Storage device 108, which may be connected locally or remotely, e.g., in the cloud, may store and allow processor 102 access to a reference database, namely, a database of reference images and maps (e.g., lookup tables) linking image data of the reference images with gaze targets.

[0054] Components of the system may be in wired or wireless communication and may include suitable ports and/or network hubs.

[0055] Processor 102 is typically in communication with a memory unit 112, which may store at least part of the image data received from camera(s) 103. Memory unit 112 may be locally embedded or remote. Memory unit 112 may include, for example, a random access memory (RAM), a dynamic RAM (DRAM), a flash memory, a volatile memory, a non-volatile memory, a cache memory, a buffer, a short term memory unit, a long term memory unit, or other suitable memory units or storage units.

[0056] In some embodiments the memory unit 112 stores executable instructions that, when executed by processor 102, facilitate performance of operations of processor 102, as described herein.

[0057] Aspects of methods according to embodiments of the invention are exemplified in Figs. 2A - C, below.

[0058] In one embodiment, which is schematically illustrated in Fig. 2A, a method for collecting reference images of a person’s retina, is performed by processor 102 and includes providing a known gaze target for a person to look at with continuous eye movement (202). A “known gaze target” is a target located at a known location, typically a known location in relation to a frame of reference e.g., as described herein and/or at a known direction of gaze of a user. While the person is looking at the known gaze target with continuous eye movement, a plurality of images of the person’s retina are obtained (204), each of the images capturing a different portion of the retina. Typically, at least some of the images -at least partially- overlap.

[0059] Image data from the plurality of images is stored in association with the known gaze target (206). For example, image data may be associated with a known target gaze (e.g., with a known location of the target, typically, a location in relation to a reference frame such as a reference frame of the display of UI 106 and/or of the world camera 130) via a lookup table or other suitable indexing methods.

[0060] A signal generated based on the image data (which is stored in association with the known gaze target), may then be used as input to another device, e.g., may be used to control another device (212).

[0061] An example of how the stored image data (or a signal generated based on the stored image data) may be used as input to another device, e.g., for controlling another device, is schematically illustrated in Fig. 2B. In this example, a method for collecting reference images of a person’s retina, is performed by processor 102 and includes providing a known gaze target for a person to look at continuously (202) and while the person is continuously looking at the known gaze target, obtaining a plurality of images of the person’ s retina (204), each of the images capturing a different portion of the retina. The image data captured at step 204 may be included in the term “reference images”.

[0062] Image data from the plurality of reference images, is stored in association with the known gaze target (206).

[0063] The stored image data is then compared with image data of the person’s retina while the person is looking at an unknown gaze target (207), to calculate the unknown gaze target (208). An “unknown gaze target” is a target at an unknown location, (e.g., at an unknown location in relation to a frame of reference) and/or at an unknown direction of gaze of the user, and calculating the unknown gaze target may include calculating the unknown location and/or direction of gaze.

[0064] The comparison at step 207 may include, for example, finding a spatial transformation between an image of the retina at a known direction of gaze (a reference image) and a matching image of the retina at an unknown direction of gaze (an input image). In one embodiment, processor 102 receives an input image and compares the input image to a reference image, to find a spatial transformation (e.g., translation and/or rotation) of the input image relative to the reference image. Typically, a transformation that optimally matches or overlays the retinal features of the reference image on the retinal features of the input image, is found. A change in orientation of the person’s eye, which corresponds to the spatial transformation, is then calculated. The direction of gaze associated with the input image can then be determined based on the calculated transformation (or based on the change in eye orientation). A signal generated based on the change in orientation and/or based on the person’s direction of gaze associated with the input image, may then be output (210).

[0065] For example, the location of the unknown target and/or a change in rotation of the person’s eye caused by looking at the unknown gaze target, may be output (e.g., transmitted) to another device and/or may be displayed and/or used to control another device.

[0066] Devices operating based on gaze tracking may include, for example, industrial machines, devices used in sailing, aviation or driving, devices used in advertising, computer games, devices used in entertainment, XR devices (e.g., devices using virtual, augmented and/or mixed reality), devices used in medical applications, etc. These devices may receive input including a person’s direction of gaze (e.g., determined according to embodiments of the invention) and/or may be controlled based on the person’s direction of gaze. [0067] In another example, a device used in a medical procedure may include components of the system described in Fig.1 A. For example, a system may include a retinal camera consisting of an image sensor to capture images of different portions of a person’s retina. The processor may stitch together the images to create a panorama image of the person’s retina. The processor can cause the panorama image to be displayed, e.g., to be viewed by a professional. In some embodiments, the processor can run a machine learning model to predict a health state of the person’s eye based on the panorama image, the machine learning model trained on panorama images of retinas of eyes in different health states. Alternatively or in addition, a device for biometric user identification and/or authentication may use the stored image data as a reference database by which to identify specific users.

[0068] Fig. 2C schematically illustrates a method that can be performed by processor 102 and the systems described in Figs. 1 A-B.

[0069] In one embodiment, a gaze tracking method includes using a processor to display a known gaze target at a substantially unchanging location (222), e.g., a location in the real or virtual world, and using the processor to transmit instructions to be output to a user (224), e.g., by a UI. The instructions may include a directive to the user to move the user’s head while looking at the unmoving known gaze target and/or the instructions may include graphics to indicate this directive to the user. The processor then causes a change in output of the UI (226) in accordance with the user’s head movement. For example, visual and/or acoustic output of a UI may be modulated in accordance with the user’s head movement.

[0070] The method may further include obtaining images of the user’s retina (228) while the user is looking at the known gaze target in accordance with the instructions and using the processor to store in a reference database, data obtained from the images of the user’s retina and a direction of gaze associated with each image of the user’s retina (230).

[0071] As described above, the method may further include comparing the data stored in the reference database with image information of the user’s retina while the user is looking at a gaze target at an unknown location, to calculate the user’s direction of gaze and generate a signal based on the calculated user’s direction of gaze, the signal used to control a device.

[0072] Collecting reference images and storing reference images in a reference database, such as described herein, may be done at an initial stage (e.g., when enrolling a user, e.g., to use a gaze tracking system) and/or during later stages, e.g., during use of the gaze tracking system. [0073] In one embodiment, a method for collecting reference images of a person’s retina includes providing a gaze target at one or more known locations, for a person to look at, and, while the person is looking at the gaze target, obtaining a plurality of images of a retina of an eye of the person, each of the images capturing a different portion of the retina and each of the images associated with a different direction of gaze. Based on the obtained images and/or based on the directions of gaze associated with the images, an under-imaged portion of the retina may be detected and a UI may be controlled to cause the person to move the eye or head in accordance with the detected under-imaged portion of the retina, to expose the under-imaged portion of the retina to a camera.

[0074] For example, a new location of the gaze target and/or movement of the person’s head that will cause rotation of the eye such that the under-imaged portion of the retina to be exposed, may be calculated. A device (e.g., a display or UI device) may be controlled in accordance with the calculated new location and/or head movement. For example, instructions or gaze targets may be displayed in such a way to motivate the person to move the eye or head so as to expose the underimaged portion of the retina.

[0075] In some embodiments, information obtained from the plurality of images is stored in association with the one or more known locations of the gaze target and the stored information is compared with image information of the person’s retina while the person is looking at a gaze target at an unknown location, to calculate the person’s direction of gaze. A signal may be generated based on the person’s calculated direction of gaze and the signal may be used to control a device, e.g., as described herein.

[0076] The method may include providing the gaze target for the person to look at with continuous eye movement. For example, a moving target may be provided for the person to look at without moving the person’s head relative to the moving target or a still target may be provided for the person to look at while moving the person’s head.

[0077] In one embodiment, controlling a UI may include changing a location of the gaze target. Images of the person’s retina may be obtained while the person is looking at the gaze target after changing the location.

[0078] In another embodiment, controlling a UI may include providing instructions relating to movement of the person’s head. Images of the person’s retina may be obtained while the person is looking at the known gaze target while moving the person’s head according to the instructions. [0079] In some embodiments, detecting an under-imaged portion of the person’s retina includes marking an area on a map representing the person’s retina, in accordance with a number of directions of gaze associated with the images obtained for the area and detecting in the map a relatively sparsely marked area. The relatively sparsely marked area in the map is determined to represent the under-imaged portion.

[0080] In other embodiments, detecting the under-imaged portion of the person’s retina includes stitching the plurality of images together to create a map representing the person’s retina (e.g., a panorama) and determining that a missing or sparse part of the map represents the under-imaged portion.

[0081 ] In some embodiments, the gaze target is provided on a display of the UI. A visual indication may be added on the gaze target on the display, in accordance with a trajectory of the person’s head. A trajectory of the person’s head may be obtained, e.g., by tracking movement of a camera coupled to the person’s head.

[0082] Aspects of methods according to embodiments of the invention are exemplified in Figs. 3A - C, below.

[0083] In one example, which is schematically illustrated in Fig. 3A, a method for collecting reference images of a person’s retina includes obtaining a plurality of images of the person’s retina, while the person is looking at a first known gaze target (302), each of the images capturing a different portion of the retina. Typically, at least some of the images -at least partially- overlap. Directions of gaze associated with the images are also obtained (e.g., as described herein). Some methods of capturing different portions of the retina may include having a person look continuously at a moving target without movement of the head or having the person look at a single unmoving target while moving the head, as described herein, or a combination of both methods.

[0084] Once the plurality of images (and their associated directions of gaze) have been obtained, an under-imaged (e.g., unimaged) portion of the person’s retina may be detected (304). Based on the under-imaged portion, processor 102 may calculate one or more different locations of the gaze target that will require eye rotations and/or angles of rays of gaze that will make under-imaged portions of the retina, visible to the camera 103. Thus, based on the under-imaged portion, the target is provided at one or more different calculated location(s) (306) (e.g., as described herein) and images of the person’s retina are obtained while the person is looking at the known gaze target at the different location(s) (308). [0085] In another embodiment, based on the under-imaged portion, instructions may be provided to the user (e.g., by processor 102) to move the head so as to enable capturing reference images of under-imaged parts of the retina and/or to enable completing a scan of the retina (e.g., as described with reference to Fig. IB).

[0086] Fig. 3B schematically illustrates one example of a method for detecting an under-imaged portion of the retina. The method includes obtaining a plurality of images of the person’s retina, while the person is looking at a first known gaze target (302), each of the images capturing a different portion of the retina. An area of a map representing the retina may be painted or otherwise marked based on if an image of that area has been obtained or based on a number of images obtained of the area and/or based on the directions of gaze associated with the images obtained for that area. In another embodiment, the map may be painted based on a size of a portion covered by the obtained images (314). In one example, a map may be created by using a dot to represent each image and/or associated direction of gaze and placing each dot on the map based on the (possibly approximate) location on the retina of the image it represents. Once all the dots are placed on the map, areas with no dots or a relatively sparse area in the map or an area that is not painted (or only lightly painted), can be detected (316) and the relatively sparse area or lightly painted area in the map may be determined to represent an under- imaged portion (318).

[0087] Fig. 3C schematically illustrates another example of a method for detecting an underimaged portion of the retina. The method includes obtaining a plurality of images of the person’s retina, while the person is looking at a first known gaze target (302), each of the images capturing a different portion of the retina. The images are then stitched together to create a panorama (or another map representing the person’s retina) of the person’s retina (324). The stitching may be done, for example, using standard techniques, such as feature matching and/or finding areas where two images share an area that is very similar (e.g., by Pearson correlation or square difference), merging the two into one image, and repeating until all images are merged to one.

[0088] Once a panorama of the retina is created based on the obtained images, a missing part of the panorama may be detected (326). It can then be determined that the missing part of the panorama represents an under-imaged (e.g., unimaged) portion of the person’s retina (328).

[0089] Fig. 4A shows a schematic example of a panorama 400 of the retina created by stitching (e.g., as described above) images of different (possibly overlapping) portions of the retina 401, 402, 403, etc., up to 409. A missing part 407 represents an under-imaged portion of the person’s retina.

[0090] In some embodiments, as schematically illustrated in Fig. 4B, dots 42 or other markings may be used to represent each captured image of the retina and/or its associated direction of gaze. The dots 42 are placed on the panorama 400 based on the location on the retina of the images they represent. Once all the dots are placed on the panorama, areas (possibly, areas of a predetermined size) with no dots or with a relatively small number of dots (such as area 47), can be detected on the panorama, thereby identifying an under-imaged area of the retina.

[0091] Using the methods schematically illustrated in Figs. 4A-B it is detected, for example, that a bottom left-hand portion of the person’s retina is an under- imaged portion. This portion of the person’s retina will become exposed (to the camera used to obtain the plurality of images making up the panorama) when the person’s gaze is to an upper right-hand direction or when the person’s head moves to a bottom left-hand location, while the eye does not move. In this case, a processor may control a display to display a target of gaze at an upper right-hand location of the display causing the person’s gaze to be directed to an upper right-hand direction. Alternatively or in addition, the processor may control the display to show instruction to the person to move the head to a bottom left-hand direction. For example, if the person is instructed to move the head in order to “fill-in” an unmoving target on the display, a processor may change the shape or the size of the unmoving target enlarging the bottom left-hand corner of the target and/or the processor may remove the marking in the bottom left-hand corner of the target, causing the user to move his head more in this direction.

[0092] These and other methods for detecting under-imaged portions of the retina may be performed during enrollment of a user to a gaze tracking system and/or during later stages, e.g., during use of the gaze tracking system.

[0093] The embodiments described above provide an improved reference database, which includes a complete and broad representation of a person’s retina.

[0094] Embodiments of the invention enable obtaining images of a wide portion of the retina and enable maintaining a high-quality reference database, providing an improved basis for smooth operation of a gaze tracking system.

Y1

Claims

1. A system for collecting reference images for retina images-based gaze tracking, the system comprising: a camera to capture images of a user’s retina, while the user is looking at a known gaze target with continuous eye movement; a processor in communication with: the camera, a user interface (UI) and a reference database; the processor configured to: cause display of the known gaze target at a substantially unchanging location; cause transmission of instructions to the user to move the user’s head while looking at the known gaze target at the substantially unchanging location; cause a change in output of the UI in accordance with the user’s head movement; and store data obtained from the images of the user’s retina and a direction of gaze associated with each image of the user’s retina, in the reference database.

2. The system of claim 1 wherein the known gaze target comprises a graphical element.

3. The system of claim 2 wherein the graphical element comprises a shape and wherein the instructions indicate to the user how to move the head relative to the shape, to obtain a particular eye rotation that will expose to the camera a plurality of different portions of the retina of the user.

4. The system of claim 3 wherein the change in output of the UI comprises a visual indication displayed within the shape.

5. The system of claim 1 wherein the processor tracks a trajectory of the user’s head and wherein the change in output of the UI comprises a visual indication that changes in accordance with the trajectory.

6. The system of claim 1 comprising a front facing camera to capture images of the world, the front facing camera configured to be coupled to the user’s head.

7. The system of claim 6 comprising an XR device.

8. The system of claims 6 - 7 wherein the front facing camera is connected to the XR device.

9. The system of any of claims 1-8 wherein the processor tracks a trajectory of the user’s head by tracking movement of the front facing camera.

10. The system of claim 6 wherein the direction of gaze associated with each image of the user’s retina, is calculated using relative locations of the camera, the front facing camera, the known gaze target and the user’s eye, when each image was captured.

11. The system of claim 1 wherein the processor is configured to detect an under-imaged portion of the user’s retina; and cause transmission of instructions prompting the user to achieve eye rotations configured to make visible to the camera the under-imaged portion.

12. The system of claim 1 wherein the processor is configured to compare the stored data with image information of the user’s retina while the user is looking at a gaze target at an unknown location, to calculate the user’s direction of gaze; and generate a signal based on the calculated user’s direction of gaze, the signal to control a device.

13. A method for collecting reference images for retina images-based gaze tracking, the method comprising: using a processor to display a known gaze target at a substantially unchanging location; using the processor to transmit instructions to a UI, the instructions to be output to a user by the UI, the instructions comprising a directive to a user to move the user’s head while looking at the known gaze target; using the processor to cause a change in output of the UI in accordance with the user’s head movement; obtaining images of the user’s retina while the user is looking at the known gaze target in accordance with the instructions; and using the processor to store data obtained from the images of the user’s retina and a direction of gaze associated with each image of the user’s retina, in a reference database.

14. The method of claim 13 comprising comparing the data stored in the reference database with image information of the user’s retina while the user is looking at a gaze target at an unknown location, to calculate the user’s direction of gaze; and generate a signal based on the calculated user’s direction of gaze, the signal to control a device.

15. A method for collecting reference images of a person’s retina, the method comprising: providing a gaze target at one or more known locations, for a person to look at; while the person is looking at the gaze target, obtaining from a camera a plurality of images of a retina of an eye of the person, each of the images capturing a different portion of the retina and each of the images associated with a different direction of gaze; based on either one of or on a combination of the obtained images and directions of gaze associated with the images, detecting an under-imaged portion of the retina; controlling a user interface (UI) to cause the person to move the eye or head in accordance with the detected under-imaged portion of the retina to expose the underimaged portion of the retina to the camera.

16. The method of claim 15 comprising providing the gaze target for the person to look at with continuous eye movement.

17. The method of claim 16 comprising providing a moving target for the person to look at without moving the person’s head relative to the moving target or providing a still target for the person to look at while moving the person’s head.

18. The method of claim 15 wherein controlling the UI comprises: changing a location of the gaze target; the method comprising obtaining images of the person’s retina while the person is looking at the gaze target after changing the location.

19. The method of claim 16 wherein controlling the UI comprises providing instructions relating to movement of the person’s head, and the method comprises obtaining images of the person’s retina while the person is looking at the known gaze target while moving the person’s head according to the instructions.

20. The method of claim 15 wherein detecting the under-imaged portion of the person’s retina comprises: marking an area on a map representing the person’s retina, in accordance with a number of directions of gaze associated with the images obtained for the area; detecting in the map a relatively sparsely marked area; and determining that the relatively sparsely marked area in the map represents the under-imaged portion.

21. The method of claim 15 wherein detecting the under-imaged portion of the person’s retina comprises: stitching the plurality of images together to create a map representing the person’s retina; and determining that a missing or sparse part of the map represents the under-imaged portion.

22. The method of claim 15 comprising: providing the gaze target on a display of the UI.

23. The method of claim 22 comprising: tracking a trajectory of the person’s head; and adding a visual indication on the gaze target on the display, in accordance with the trajectory.

24. The method of claim 23 wherein tracking the trajectory of the person’s head comprises tracking movement of a camera coupled to the person’s head.

25. The method of claim 15 comprising moving the gaze target continuously on the display.

26. The method of claim 15 comprising: storing information from the plurality of images in association with the one or more known locations; comparing the stored information with image information of the person’s retina while the person is looking at a gaze target at an unknown location, to calculate the person’s direction of gaze; and using a signal generated based on the person’s direction of gaze to control a device.

27. A system for collecting reference images for retina-images-based gaze tracking, the system comprising: a camera to capture a plurality of images of a retina of an eye of the person, each of the images capturing a different portion of the retina; a processor to obtain the images of the retina, each image associated with a direction of gaze, and to detect, based on either one of or on a combination of the obtained images and the directions of gaze associated with the images, an under-imaged portion of the retina; and a UI in communication with the processor, the UI to provide instructions causing the person to move the eye or head in accordance with the detected under-imaged portion of the retina.

28. The system of claim 27 comprising a front facing camera to capture world images, the front facing camera coupled to the person’s head, and wherein a known gaze target is displayed on the UI for the person to look at while the camera is capturing the plurality of images of the retina of the eye of the person.

29. The system of claim 28 wherein the processor is to determine a direction of gaze associated with an image based on locations of the front facing camera and the known gaze target.