
CN119963652B - Calibration method of camera parameters and related equipment - Google Patents

Calibration method of camera parameters and related equipment

Info

Publication number
CN119963652B
Authority
CN
China
Prior art keywords
camera
cameras
lens barrel
images
relationship
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN202311484874.4A
Other languages
Chinese (zh)
Other versions
CN119963652A (en)
Inventor
龙翔
苏丹
王喜龙
王辉
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing Zitiao Network Technology Co Ltd
Original Assignee
Beijing Zitiao Network Technology Co Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing Zitiao Network Technology Co Ltd filed Critical Beijing Zitiao Network Technology Co Ltd
Priority to CN202311484874.4A
Priority to PCT/CN2024/130269
Publication of CN119963652A
Application granted
Publication of CN119963652B

Classifications

    • G: PHYSICS
    • G06: COMPUTING OR CALCULATING; COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T 7/00: Image analysis
    • G06T 7/80: Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration

Landscapes

  • Engineering & Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Studio Devices (AREA)

Abstract


This disclosure provides a method and related apparatus for calibrating camera parameters. The method includes: receiving multiple images of a calibration reference object captured by at least two cameras, the at least two cameras being disposed within the lens barrel of a wearable device; determining the projection relationship between pixels in the multiple images and the calibration reference object; and determining a first parameter set and a second parameter set for the at least two cameras based on the projection relationship; wherein the first parameter set includes multiple first parameters, the first parameters being used to characterize target pixels in the images captured by the camera to be calibrated and the corresponding projection direction of the target pixels; and the second parameter set includes at least one set of second parameters, the at least one set of second parameters being used to indicate the pose relationship between the at least two cameras.

Description

Calibration method of camera parameters and related equipment
Technical Field
The present disclosure relates to the technical field of extended reality, and in particular to a calibration method for camera parameters and related devices.
Background
Extended Reality (XR) refers to a human-computer-interactive virtual environment created by a computer by combining the real and the virtual. XR technology encompasses Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR), in which virtual content and real scenes are fused by combining various technical means with hardware devices.
In general, an extended reality system provides a wearable device, such as a head-mounted wearable device, through which a user achieves human-machine interaction. In some scenarios, the wearable device may capture images of the human eye and perform calculations on them to implement gaze tracking or pupil distance estimation.
The present inventors have found that, in the related art, the camera that captures these images is generally disposed outside the lens barrel, which limits the accuracy of pupil distance estimation and gaze tracking algorithms.
Disclosure of Invention
The disclosure provides a calibration method for camera parameters and related equipment, so as to solve or partially solve the above problems.
In a first aspect of the present disclosure, a method for calibrating a camera parameter is provided, including:
receiving a plurality of images obtained by photographing a calibration reference object with at least two cameras, wherein the at least two cameras are disposed inside a lens barrel of a wearable device, the wearable device further comprises a binocular display module, the lens barrel is arranged on the light emitting side of the binocular display module, an optical component is arranged in the lens barrel, and the at least two cameras are positioned between the binocular display module and the optical component;
determining the projection relation between the pixel points of the plurality of images and the calibration reference object;
determining a first parameter set and a second parameter set of the at least two cameras according to the projection relation;
wherein the first parameter set comprises a plurality of first parameters, the first parameters being used to characterize target pixel points in an image captured by a camera to be calibrated and the projection directions corresponding to the target pixel points; and the second parameter set comprises at least one group of second parameters, the at least one group of second parameters being used to indicate the pose relationship between the at least two cameras.
In a second aspect of the present disclosure, there is provided a wearable device comprising:
A binocular display module;
The two lens barrels are arranged on the light emitting side of the binocular display module, at least one of the two lens barrels comprises at least two cameras and an optical assembly, the at least two cameras are arranged inside the lens barrels and used for collecting human eye images, and the at least two cameras are located between the binocular display module and the optical assembly.
In a third aspect of the present disclosure, there is provided a calibration device for camera parameters, including:
a receiving module configured to receive a plurality of images obtained by photographing a calibration reference object with at least two cameras, wherein the at least two cameras are disposed inside a lens barrel of a wearable device, the wearable device further comprises a binocular display module, the lens barrel is arranged on the light emitting side of the binocular display module, an optical component is arranged in the lens barrel, and the at least two cameras are positioned between the binocular display module and the optical component;
a first determining module configured to determine a projection relationship of pixels of the plurality of images and the calibration reference object;
A second determination module configured to determine a first parameter set and a second parameter set of the at least two cameras according to the projection relationship;
wherein the first parameter set comprises a plurality of first parameters, the first parameters being used to characterize target pixel points in an image captured by a camera to be calibrated and the projection directions corresponding to the target pixel points; and the second parameter set comprises at least one group of second parameters, the at least one group of second parameters being used to indicate the pose relationship between the at least two cameras.
In a fourth aspect of the present disclosure, there is provided a wearable device comprising:
The lens barrel is internally provided with a display module, an optical assembly and at least two cameras, wherein the at least two cameras and the optical assembly are positioned on the light emitting side of the display module, and the at least two cameras are positioned between the optical assembly and the display module;
and the processing module is electrically coupled with the at least two cameras and is configured to acquire a first parameter set and a second parameter set obtained by adopting the method of the first aspect and an image shot by the camera, and solve the position of a target area in the image in space by utilizing the first parameter set and the second parameter set and the image.
In a fifth aspect of the present disclosure, there is provided a computer device comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, the programs comprising instructions for performing the method according to the first aspect.
In a sixth aspect of the present disclosure, there is provided a non-transitory computer readable storage medium containing a computer program which, when executed by one or more processors, causes the processors to perform the method of the first aspect.
In a seventh aspect of the present disclosure, there is provided a computer program product comprising computer program instructions which, when run on a computer, cause the computer to perform the method of the first aspect.
In the wearable device provided by embodiments of the present disclosure, the cameras are arranged inside the lens barrel and are therefore less affected by occlusion, which can improve the accuracy of pupil distance estimation or gaze tracking algorithms; further, providing at least two cameras supplies more observation information, which further improves algorithm accuracy. The calibration method for camera parameters and the related devices provided by embodiments of the present disclosure offer a feasible scheme for calibrating camera parameters in scenarios where the imaging distortion of a camera inside the lens barrel is asymmetric and the camera has no single center of projection.
Drawings
In order to more clearly illustrate the technical solutions of the present disclosure or the related art, the drawings required for describing the embodiments or the related art are briefly introduced below. It is apparent that the drawings in the following description show only embodiments of the present disclosure, and that other drawings may be obtained from these drawings by those of ordinary skill in the art without inventive effort.
Fig. 1A shows a schematic diagram of an exemplary system provided by an embodiment of the present disclosure.
Fig. 1B shows a schematic diagram of an exemplary head-mounted wearable device.
Fig. 1C and 1D show schematic diagrams of exemplary human eye images.
Fig. 2A illustrates a schematic diagram of an exemplary wearable device provided by embodiments of the present disclosure.
Fig. 2B illustrates a schematic diagram of another exemplary wearable device provided by embodiments of the present disclosure.
Fig. 2C illustrates a schematic diagram of yet another exemplary wearable device provided by embodiments of the present disclosure.
Fig. 2D illustrates a schematic diagram of yet another exemplary wearable device provided by embodiments of the present disclosure.
Fig. 3A shows a flow diagram of an exemplary method provided by an embodiment of the present disclosure.
FIG. 3B illustrates a flow diagram of an exemplary method of determining a projected relationship according to an embodiment of the present disclosure.
Fig. 4A shows a schematic diagram of an exemplary image acquisition scenario according to an embodiment of the present disclosure.
Fig. 4B shows a schematic diagram of another exemplary image acquisition scenario according to an embodiment of the present disclosure.
Fig. 4C shows a schematic diagram of the 3 selected target images.
FIG. 4D illustrates a schematic diagram of a projected relationship of pixel points to a calibration reference object, according to an embodiment of the present disclosure.
Fig. 4E shows a schematic diagram of an exemplary first parameter according to an embodiment of the present disclosure.
Fig. 4F illustrates a calibration schematic for the second parameter of exemplary cameras located within the same lens barrel, according to an embodiment of the present disclosure.
Fig. 4G shows a schematic diagram of an image obtained by binarizing an image acquired in an embodiment of the present disclosure.
Fig. 5 shows a schematic diagram of an exemplary wearable device provided by an embodiment of the present disclosure.
Fig. 6 shows a hardware architecture diagram of an exemplary computer device provided by an embodiment of the present disclosure.
Fig. 7 shows a schematic diagram of an exemplary apparatus provided by an embodiment of the present disclosure.
Detailed Description
For the purposes of promoting an understanding of the principles and advantages of the disclosure, reference will now be made to the embodiments illustrated in the drawings and specific language will be used to describe the same.
It should be noted that, unless otherwise defined, technical or scientific terms used in the embodiments of the present disclosure shall have the ordinary meaning understood by those of ordinary skill in the art to which the present disclosure pertains. The terms "first," "second," and the like, as used in embodiments of the present disclosure, do not denote any order, quantity, or importance, but are merely used to distinguish one element from another. The word "comprising," "comprises," or the like means that the element or item preceding the word covers the elements or items listed after the word and their equivalents, without excluding other elements or items. The terms "connected," "coupled," or the like are not limited to physical or mechanical connections, but may include electrical connections, whether direct or indirect. "Upper," "lower," "left," "right," and the like are used merely to indicate relative positional relationships, which may change accordingly when the absolute position of the described object changes.
It will be appreciated that, before the technical solutions disclosed in the embodiments of the present disclosure are used, the user should be informed, in an appropriate manner and in accordance with relevant laws and regulations, of the type, scope of use, and usage scenarios of the personal information involved in the present disclosure, and the user's authorization should be obtained.
For example, in response to receiving an active request from a user, prompt information is sent to the user to explicitly remind the user that the requested operation will require the acquisition and use of the user's personal information. The user can thus autonomously choose, according to the prompt information, whether to provide personal information to the software or hardware, such as an electronic device, application program, server, or storage medium, that performs the operations of the technical solution of the present disclosure.
As an optional but non-limiting implementation, in response to receiving an active request from the user, the prompt information may be sent to the user by way of, for example, a popup window, in which the prompt information may be presented as text. In addition, the popup window may carry a selection control allowing the user to choose "agree" or "disagree" to providing personal information to the electronic device.
It will be appreciated that the above-described notification and user authorization process is merely illustrative and not limiting of the implementations of the present disclosure, and that other ways of satisfying relevant legal regulations may be applied to the implementations of the present disclosure.
It will be appreciated that the data (including but not limited to the data itself, the acquisition or use of the data) involved in the present technical solution should comply with the corresponding legal regulations and the requirements of the relevant regulations.
Fig. 1A shows a schematic diagram of an exemplary augmented reality system 100 provided by an embodiment of the present disclosure.
Extended Reality (XR) refers to a human-computer-interactive virtual environment created by a computer by combining the real and the virtual. XR technology encompasses Augmented Reality (AR), Virtual Reality (VR), and Mixed Reality (MR), in which virtual content and real scenes are fused by combining various technical means with hardware devices.
As shown in fig. 1A, the system 100 may include various types of wearable devices, such as a head-mounted wearable device (e.g., VR/AR glasses or a head-mounted display (HMD)) 104, an operating handle 108, and the like. In some scenarios, a camera/webcam 110 for taking pictures of the operator (user) 102 may also be provided. In some embodiments, when the aforementioned device does not have processing functionality, the system 100 may also include an external control device 112 for providing processing functionality. The control device 112 may be, for example, a computer device such as a mobile phone or a computer. In some embodiments, when any of the foregoing devices is used as a control device or a main control device, information interaction may be implemented with other devices in the system 100 through a wired or wireless communication manner.
In the system 100, a user 102 may interact with the extended reality system 100 by means of the head-mounted wearable device 104 and the operating handle 108. In some scenarios, the system 100 may use images captured by the camera 110 to identify the gestures and postures of the user 102, and then complete interactions with the user 102 based on the identified gestures and postures. In some embodiments, the user 102 may also perform gesture input with a bare hand; the head-mounted wearable device 104 may capture a front-facing image in real time through a video camera or the like disposed at the front of the head-mounted wearable device 104, and recognize the gesture of the user 102 by recognizing the image.
In some embodiments, as shown in fig. 1A, the system 100 may also communicate with the server 114 and may obtain data, e.g., pictures, audio, video, etc., from the server 114 and output such data through the head-mounted wearable device 104, e.g., display pictures or video on a display screen of the head-mounted wearable device 104, play the audio carried by audio and video through a speaker of the head-mounted wearable device 104, etc. In some embodiments, as shown in fig. 1A, the server 114 may retrieve the desired data, e.g., pictures, audio, video, etc., from the database server 116, which stores such data.
In some embodiments, an acquisition unit for acquiring information may be provided on the head wearable device 104. The type of acquisition unit may be varied.
In some embodiments, the acquisition unit may further include an environment acquisition unit, which may be used to acquire environment information around (e.g., in front of) the wearable device 104, and a positioning and tracking unit, which may be used to track the position of the wearable device 104. Optionally, the environment acquisition unit may include, but is not limited to, a three-color camera (e.g., an RGB camera), a depth camera, a binocular camera, a laser, etc., and the positioning and tracking unit may include, but is not limited to, a visual simultaneous localization and mapping (visual SLAM) unit, an inertial measurement unit (IMU), a global positioning system (GPS) unit, an ultra-wideband (UWB) wireless communication unit, a laser, etc.
In some embodiments, the head-mounted wearable device 104 may also be provided with a speed sensor, an acceleration sensor, an angular velocity sensor (e.g., a gyroscope), etc. for collecting speed or acceleration information of the head-mounted wearable device 104. For another example, the operating handle 108 may be provided with a speed sensor, an acceleration sensor, an angular velocity sensor (e.g., a gyroscope), or the like for collecting speed or acceleration information of the operating handle 108. It should be noted that, in addition to being disposed on the wearable device 104 and the operating handle 108, the aforementioned acquisition units may also be attached directly to body parts of the interacting user 102, rather than to the hardware devices, so as to acquire relevant information of those body parts, such as speed, acceleration, or angular velocity information, or information acquired by other sensors or acquisition units.
In some embodiments, the head-mounted wearable device 104 may also be provided with a camera or webcam for taking photographs of the operator (user) 102 (e.g., photographs of hands or feet) and images of the environment.
In some embodiments, the system 100 can identify the gesture, etc. of the user 102 through the collected information, and further can perform corresponding interaction according to the identified gesture, etc.
Fig. 1B shows a schematic diagram of an exemplary head-mounted wearable device 104.
As shown in fig. 1B, the head wearable device 104 may include a lens barrel 1042, and a display screen 1044 for displaying an image, and an optical component 1046 for processing an optical path may be provided inside the lens barrel 1042. Optionally, the optical assembly 1046 may further include a plurality of lenses (e.g., lenses 1046A and 1046B), and the combination of the plurality of lenses may project light emitted by the display screen 1044 into the human eye 1022 such that the human eye 1022 may view the image displayed by the display screen 1044. It will be appreciated that the single-sided structure of the head-mounted wearable device 104 is only exemplarily shown in fig. 1B, and that in order to achieve binocular display, two barrel structures arranged side by side may be included in the head-mounted wearable device 104.
In some embodiments, as shown in fig. 1B, the head-mounted wearable device 104 may also be provided with a camera 1048 for capturing an image of the human eye, which camera 1048 may be a Charge Coupled Device (CCD) image sensor, a Complementary Metal Oxide Semiconductor (CMOS) image sensor, or the like.
Alternatively, the camera 1048 may be an eye-tracking (ET) camera, and the captured eye images may be used to perform pupil distance estimation, gaze tracking, and other functions.
As shown in fig. 1B, in the related art, the camera 1048 is generally provided outside the lens barrel, and generally only one camera is provided for each lens barrel. Moreover, in order to acquire a more complete human eye image without affecting the human eye's view of the picture on the display screen 1044, the camera 1048 is commonly deployed near the outer canthus or near the nose wing. Referring to fig. 1B, if the camera 1048 is near the outer side of the device, the deployment position shown in fig. 1B corresponds to the outer canthus; if the camera 1048 is near the inner side of the device, the deployment position shown in fig. 1B corresponds to the nose wing.
However, the inventors of the present disclosure found that this manner of installing the camera in the related art tends to make the installation inclination angle of the camera 1048 relative to the human eye 1022 large, resulting in a large angle α between the orientation of the camera 1048 and the forward viewing direction of the human eye 1022, so that the captured human eye image can hardly reflect the view from the front of the eye, as shown in fig. 1C and 1D.
In some cases, the user may need to wear glasses and then put on the head-mounted wearable device 104. However, since the camera 1048 is disposed outside the lens barrel 1042, the camera 1048 protrudes beyond the lens barrel 1042 and can press against the glasses, affecting the wearing comfort of the head-mounted wearable device 104. Meanwhile, the imaging of the camera 1048 is easily affected by the edges of the glasses: light is refracted at the edges of the glasses, reducing the sharpness of the image captured by the camera 1048 and producing multiple refraction spots in the image, which affects the accuracy of subsequent algorithms. Such problems are further aggravated when only one camera corresponds to each lens barrel.
In view of this, the embodiments of the present disclosure provide a wearable device, where at least two cameras are disposed inside a lens barrel, which can solve or partially solve the above-mentioned problems to some extent.
Fig. 2A shows a schematic diagram of an exemplary wearable device 200 provided by embodiments of the present disclosure.
As shown in fig. 2A, the wearable device 200 may similarly include a lens barrel 202, as well as a display module 204 and an optical assembly 206 disposed inside the lens barrel 202. The optical assembly 206 may further include a plurality of lenses (e.g., lenses 2062 and 2064), and the combination of the plurality of lenses may project light emitted from the display module 204 into the human eye 1022 such that the human eye 1022 may view the image displayed by the display module 204.
Unlike the wearable device 104 shown in fig. 1B, the wearable device 200 includes two cameras 208A and 208B, both of which are disposed inside the lens barrel 202. Because the cameras 208A and 208B are disposed inside the lens barrel 202, they do not interfere with wearing glasses, so the comfort of the wearable device 200 can be improved. Meanwhile, as shown in fig. 2A, because the cameras 208A and 208B are disposed inside the lens barrel 202, the distance between the cameras 208A and 208B and the human eye 1022 is increased, so the installation inclination angle of the cameras 208A and 208B relative to the human eye 1022 is reduced; the included angle β between the orientations of the cameras 208A and 208B and the forward viewing direction of the human eye 1022 is therefore smaller than the included angle α, giving the cameras 208A and 208B better viewing angles, so that the acquired human eye images can better reflect the view from the front of the eye and the imaging quality is better. Also, because the cameras 208A and 208B are placed inside the lens barrel 202, the glasses do not interfere with the imaging of the cameras 208A and 208B, further improving imaging quality. With improved imaging quality, the accuracy of algorithms such as pupil distance estimation or gaze tracking is also improved. In addition, the two cameras 208A and 208B provide more observation information (more acquired images), which can further improve the accuracy of algorithms such as pupil distance estimation or gaze tracking.
In some embodiments, to improve the accuracy of subsequent pupil distance estimation or gaze tracking algorithms, the two cameras 208A and 208B may be symmetrically disposed within the barrel 202 with respect to the axis of the barrel 202 (center dashed line of fig. 2A), as shown in fig. 2A. In this way, the images acquired by the two cameras 208A and 208B may be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
In some embodiments, as shown in fig. 2A, both cameras 208A and 208B are facing the light-exiting side of the barrel 202 and the cameras 208A and 208B are facing at an equal angle to the axis of the barrel 202 (center dashed line of fig. 2A), e.g., at an angle β. In this way, the images acquired by the two cameras 208A and 208B may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
Fig. 2B shows a schematic diagram of another exemplary wearable device 200 provided by embodiments of the present disclosure.
As shown in fig. 2B, in some embodiments, the wearable device 200 may further include two reflective structures 210A and 210B, where the reflective structures 210A and 210B may be structures with reflective surfaces, such as reflective films or mirrors. The reflective structures 210A and 210B may correspond to the cameras 208A and 208B, respectively, with both cameras 208A and 208B facing the display module 204; the reflective structure 210A is used to reflect light from the human eye 1022 into the camera 208A, and the reflective structure 210B is used to reflect light from the human eye 1022 into the camera 208B. Through the reflection of light by the reflective structures 210A and 210B, the cameras 208A and 208B can still capture images of the human eye. Moreover, because one reflection is added to the optical path, the observation angle γ is further reduced, and the cameras 208A and 208B can image better.
In some embodiments, to improve the accuracy of subsequent pupil estimation or gaze tracking algorithms, as shown in fig. 2B, the two cameras 208A and 208B may be symmetrically disposed within the barrel 202 with respect to the axis of the barrel 202 (the center dashed line of fig. 2B), and the reflective structures 210A and 210B may be symmetrically disposed within the barrel 202 with respect to the axis of the barrel 202 (the center dashed line of fig. 2B). In this way, the images acquired by the two cameras 208A and 208B may be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
In some embodiments, as shown in FIG. 2B, both cameras 208A and 208B are facing the display module 204 and both cameras 208A and 208B are facing the same angle with the axis of the barrel 202 (the center dashed line of FIG. 2B), e.g., both angles are γ, while the reflective structures 210A and 210B are facing the same angle with the axis of the barrel 202 (the center dashed line of FIG. 2B). In this way, the images acquired by the two cameras 208A and 208B may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
The foregoing embodiments are described with two cameras disposed in a single lens barrel, and it will be appreciated that when the number of cameras is further increased, the observed data may be further increased, so as to further improve the accuracy of the algorithm, and therefore, embodiments in which two or more cameras are disposed in a single lens barrel should all fall within the protection scope of the present disclosure.
Only a single-sided structure of the wearable device 200 is exemplarily shown in fig. 2A and 2B, and it can be appreciated that, in order to implement binocular display, two barrel structures arranged side by side may be included in the wearable device 200.
Fig. 2C shows a schematic diagram of yet another exemplary wearable device 200 provided by an embodiment of the present disclosure.
As shown in fig. 2C, the wearable device 200 may include a binocular display module, which may further include a first display module 204A and a second display module 204B. The first display module 204A and the second display module 204B may display an image for viewing by a first eye 1022A (e.g., right eye) and an image for viewing by a second eye 1022B (e.g., left eye), respectively.
In some embodiments, as shown in fig. 2C, the wearable device 200 may further include two lens barrels, e.g., a first lens barrel 202A and a second lens barrel 202B, disposed on the light exit side of the binocular display module. At least one of the two lens barrels may further include at least two cameras disposed inside the lens barrel for collecting human eye images, thereby providing more observation data and improving the accuracy of the subsequent algorithm.
As an alternative embodiment, as shown in fig. 2C, a first lens barrel 202A may be disposed on the light emitting side of the first display module 204A, and a first camera 208A and a second camera 208B for capturing an image of the human eye of the first eye 1022A may be further disposed in the first lens barrel 202A. In this way, for the human eye image acquired for the first eye 1022A (e.g., right eye), more accurate calculation results can be obtained when used later for performing the pupil distance calculation or the line-of-sight tracking. In some embodiments, as shown in fig. 2C, a first optical component 206A may be further disposed in the first lens barrel 202A, for optically projecting the image displayed by the first display module 204A into the first eye 1022A. Optionally, the first optical component 206A may further include a first lens 2062A and a second lens 2064A, and parameters of the first lens 2062A and the second lens 2064A may be different or the same and may be designed according to actual requirements. It will be appreciated that the type and number of lenses in the first optical assembly 206A may vary, and that the particular type and number of lenses employed may be designed according to actual requirements.
Similarly, as another alternative embodiment, as shown in fig. 2C, a second lens barrel 202B may be disposed on the light emitting side of the second display module 204B, and a third camera 208C and a fourth camera 208D for capturing an image of the human eye of the second eye 1022B may be further disposed in the second lens barrel 202B. In this way, for the human eye image acquired for the second eye 1022B (for example, the left eye), a more accurate calculation result can be obtained when it is subsequently used for pupil distance calculation or gaze tracking. Similarly, in some embodiments, as shown in fig. 2C, a second optical component 206B may be further disposed in the second lens barrel 202B, for optically projecting the image displayed by the second display module 204B into the second eye 1022B. Optionally, the second optical component 206B may further include a third lens 2062B and a fourth lens 2064B, and the parameters of the third lens 2062B and the fourth lens 2064B may be the same or different and may be designed according to actual requirements. It will be appreciated that the type and number of lenses in the second optical assembly 206B may vary, and the particular type and number of lenses employed may be designed according to actual requirements.
In some embodiments, as shown in fig. 2C, the first camera 208A and the second camera 208B may be disposed within the first barrel 202A symmetrically with respect to an axis of the first barrel 202A. In this way, the human eye images of the first eye 1022A acquired by the first camera 208A and the second camera 208B may be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
Similarly, in some embodiments, as shown in fig. 2C, third camera 208C and fourth camera 208D are disposed within second barrel 202B symmetrically with respect to the axis of second barrel 202B. Thus, the human eye images of the second eye 1022B acquired by the third camera 208C and the fourth camera 208D may also be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
In some embodiments, as shown in fig. 2C, the first camera 208A and the second camera 208B are both facing the light emitting side of the first display module 204A, and the first camera 208A and the second camera 208B are facing the same angle as the axis of the first lens barrel 202A, for example, the angles are all β. In this way, the human eye images of the first eye 1022A acquired by the first camera 208A and the second camera 208B may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
Similarly, in some embodiments, as shown in fig. 2C, the third camera 208C and the fourth camera 208D are both facing the light emitting side of the second display module 204B and the third camera 208C and the fourth camera 208D are facing the same angle as the axis of the second lens barrel 202B, for example, the angles are all β. In this way, the human eye images of the second eye 1022B acquired by the third camera 208C and the fourth camera 208D may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
Fig. 2D shows a schematic diagram of yet another exemplary wearable device 200 provided by embodiments of the present disclosure.
In some embodiments, as shown in fig. 2D, the first camera 208A and the second camera 208B are both facing the first display module 204A, and a first reflective structure 210A corresponding to the first camera 208A and a second reflective structure 210B corresponding to the second camera 208B are disposed in the first lens barrel 202A; the first reflective structure 210A and the second reflective structure 210B may be structures having reflective surfaces, such as reflective films or mirrors. The first and second reflective structures 210A, 210B may correspond to the first and second cameras 208A, 208B, respectively, the first reflective structure 210A being configured to reflect light rays from the first eye 1022A into the first camera 208A, and the second reflective structure 210B being configured to reflect light rays from the first eye 1022A into the second camera 208B. Through the reflection of light by the first and second reflective structures 210A and 210B, the first and second cameras 208A and 208B can still capture the human eye image of the first eye 1022A. Moreover, because one reflection is added to the optical path, the observation angle γ is further reduced, and the first camera 208A and the second camera 208B can better image the first eye 1022A.
Similarly, in some embodiments, as shown in fig. 2D, the third camera 208C and the fourth camera 208D face the second display module 204B, and a third reflective structure 210C corresponding to the third camera 208C and a fourth reflective structure 210D corresponding to the fourth camera 208D are disposed in the second lens barrel 202B; the third reflective structure 210C and the fourth reflective structure 210D may be structures having reflective surfaces, such as reflective films or mirrors. The third and fourth reflective structures 210C, 210D may correspond to the third and fourth cameras 208C, 208D, respectively, the third reflective structure 210C being configured to reflect light rays from the second eye 1022B into the third camera 208C, and the fourth reflective structure 210D being configured to reflect light rays from the second eye 1022B into the fourth camera 208D. Through the reflection of light by the third and fourth reflective structures 210C and 210D, the third and fourth cameras 208C and 208D can still capture the human eye image of the second eye 1022B. Moreover, because one reflection is added to the optical path, the observation angle γ is further reduced, and the third camera 208C and the fourth camera 208D can better image the second eye 1022B.
In some embodiments, to improve the accuracy of subsequent pupil distance estimation or gaze tracking algorithms, as shown in fig. 2D, the first camera 208A and the second camera 208B may be disposed within the first barrel 202A symmetrically with respect to the axis of the first barrel 202A, and the first reflective structure 210A and the second reflective structure 210B may also be disposed within the first barrel 202A symmetrically with respect to the axis of the first barrel 202A. In this way, the human eye images of the first eye 1022A acquired by the first camera 208A and the second camera 208B may be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
Similarly, in some embodiments, to improve the accuracy of subsequent pupil distance estimation or gaze tracking algorithms, as shown in fig. 2D, the third camera 208C and the fourth camera 208D may be disposed within the second barrel 202B symmetrically with respect to the axis of the second barrel 202B, and the third reflective structure 210C and the fourth reflective structure 210D may also be disposed within the second barrel 202B symmetrically with respect to the axis of the second barrel 202B. In this way, the eye images of the second eye 1022B acquired by the third camera 208C and the fourth camera 208D may be symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
In some embodiments, as shown in fig. 2D, the first camera 208A and the second camera 208B are both facing the first display module 204A and the first camera 208A and the second camera 208B are facing the same angle with the axis of the first barrel 202A, e.g., the angle is γ, while the first reflective structure 210A and the second reflective structure 210B are facing the same angle with the axis of the first barrel 202A. In this way, the human eye images of the first eye 1022A acquired by the first camera 208A and the second camera 208B may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
In some embodiments, as shown in fig. 2D, the third camera 208C and the fourth camera 208D are both facing the second display module 204B and the third camera 208C and the fourth camera 208D are facing the same angle as the axis of the second barrel 202B, e.g., both are at an angle γ, while the third reflective structure 210C and the fourth reflective structure 210D are facing the same angle as the axis of the second barrel 202B. In this way, the human eye images of the second eye 1022B acquired by the third camera 208C and the fourth camera 208D may be strictly symmetrical, and the processing efficiency of the subsequent algorithm may be further improved.
The foregoing embodiments are described with two cameras disposed in each of the two lens barrels. It can be understood that one of the two lens barrels may instead be provided with two or more cameras while the other lens barrel is provided with one camera; this can still improve the algorithm accuracy to a certain extent and should also fall within the protection scope of the present disclosure. In addition, the structures shown in fig. 2A and 2B may be combined in the same wearable device 200, for example, the lens barrel corresponding to the first eye has the structure shown in fig. 2A and the lens barrel corresponding to the second eye has the structure shown in fig. 2B. Of course, the structures corresponding to the first eye and the second eye may be exchanged.
As can be seen from the above embodiments, the embodiments of the present disclosure can obtain better imaging quality and can improve the comfort of the wearable device 200 by disposing at least two cameras 208 inside at least one lens barrel of the wearable device 200. In some embodiments, by placing 2 cameras in each of the left and right lens barrels of the wearable device 200, the adverse effects of the cameras outside the lens barrels can be avoided, and the 2 cameras can provide more observation information and are less affected by occlusion, so that the accuracy of the algorithms such as pupil distance estimation or gaze tracking can be improved.
Further, the inventors of the present disclosure found that, compared with a camera placed outside the lens barrel, a camera placed inside the lens barrel is affected during imaging not only by the camera module itself but also by the optical components in the lens barrel.
As shown in fig. 2A and 2B, in relation to the camera 1048 of fig. 1B, in the optical path of the cameras 208A, 208B to the human eye 1022, there is also an optical component 206 or a part of the optical component 206 within the barrel (which is different according to the relative positional relationship of the cameras 208A, 208B to the lenses in the optical component 206, for example, in addition to the positions shown in fig. 2A and 2B, the cameras 208A, 208B may be disposed between the lenses 2062 and 2064), so that the optical component 206 or a part of the optical component 206 affects or changes the optical path of the cameras 208A, 208B to the human eye 1022 when the cameras 208A, 208B image the human eye 1022, with the result that distortion of the image imaged by the cameras 208A, 208B may no longer be symmetrical.
Moreover, because the light paths of the cameras 208A, 208B reaching the human eye 1022 are changed, the cameras 208A, 208B do not have a uniform projection center, so that the parametric camera model cannot be used to fit the projection process of the camera in the lens barrel to process distortion, which makes the calibration of camera parameters difficult.
In addition, in some cases, as shown in fig. 2C and 2D, because the positions of the different cameras set in the wearable device 200 are different, calibration of the pose relationship between the different cameras is also required, and such calibration is also difficult to achieve. In particular, such calibration may become more complex when different cameras are located within different barrels (e.g., between the first camera 208A and the third camera 208C or the fourth camera 208D, or between the second camera 208B and the third camera 208C or the fourth camera 208D).
In view of this, the embodiments of the present disclosure further provide a method for calibrating parameters of a camera, which can use a non-parametric camera model to calibrate parameters of a camera in a lens barrel. In some embodiments, for cameras in different lens barrels, a camera external parameter calibration algorithm based on a non-common view point is provided, so that the problem of difficult camera calibration in different lens barrels is solved.
Fig. 3A shows a flow diagram of an exemplary method 300 provided by an embodiment of the present disclosure.
The method 300 may be applied to any computer device having data processing capabilities and may be used to calibrate parameters of the cameras 208A, 208B, 208C, or 208D in the wearable device 200 shown in fig. 2A, 2B, 2C, or 2D. As shown in fig. 3A, the method 300 may further include the following steps.
In step 302, a plurality of images obtained by photographing a calibration reference object with at least two cameras may be received. In this step, the plurality of images may be images acquired by different cameras. As an alternative embodiment, to calibrate the first parameters (e.g., the internal parameters) and the second parameters (e.g., the external parameters) of the cameras, multiple images may be acquired for internal parameter calibration and multiple images for external parameter calibration, respectively.
Fig. 4A shows a schematic diagram of an exemplary acquisition scenario 400, according to an embodiment of the present disclosure.
As shown in fig. 4A, in the acquisition scene 400, a calibration reference object 402 may be placed at a fixed location. In some embodiments, the calibration reference object 402 may be a calibration plate or chart on which a black-and-white checkerboard pattern is provided. After the camera photographs the calibration reference object 402 to obtain a corresponding image, the corresponding camera parameters can be calculated according to the correspondence between the checkerboard in the image and the checkerboard on the calibration reference object 402.
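To make this checkerboard correspondence concrete, the following sketch (not part of the patent; a minimal illustration assuming an OpenCV-style pipeline and a hypothetical board with 11 x 8 inner corners and 20 mm squares) detects the checkerboard corners in one captured image and pairs them with the known physical corner coordinates on the calibration reference object.

```python
import cv2
import numpy as np

# Assumed board geometry: 11 x 8 inner corners, 20 mm squares (hypothetical values).
PATTERN_SIZE = (11, 8)
SQUARE_SIZE_MM = 20.0

# Physical corner coordinates on the calibration reference object (z = 0 plane).
board_points = np.zeros((PATTERN_SIZE[0] * PATTERN_SIZE[1], 3), np.float32)
board_points[:, :2] = np.mgrid[0:PATTERN_SIZE[0], 0:PATTERN_SIZE[1]].T.reshape(-1, 2) * SQUARE_SIZE_MM

def detect_correspondences(image_bgr):
    """Return (pixel_corners, board_points) for one image, or None if detection fails."""
    gray = cv2.cvtColor(image_bgr, cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, PATTERN_SIZE)
    if not found:
        return None
    # Refine corner locations to sub-pixel accuracy.
    corners = cv2.cornerSubPix(
        gray, corners, (5, 5), (-1, -1),
        (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3))
    return corners.reshape(-1, 2), board_points
```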
When image acquisition (or image capture) is performed on the calibration reference object 402, the camera to be calibrated (for example, the camera 208A) can be controlled to move along a certain motion trajectory, and the calibration reference object 402 is photographed continuously while the camera moves, so that a plurality of images acquired by the camera in different poses are obtained and used as data for subsequently calculating the camera parameters. Because the acquired images are captured by the camera in multiple poses, the camera parameters obtained by the subsequent calculation are more robust and applicable over a wider range.
In some embodiments, if the at least two cameras provided in the wearable device are cameras of different types or with different camera parameters, then multiple images need to be acquired in the manner described above for each camera, or for each type of camera, to calibrate the first parameters (e.g., the internal parameters) of the respective camera. If the camera parameters of the at least two cameras provided in the wearable device are consistent, one camera can be selected to perform the aforementioned image capturing operation for calibrating the camera's internal parameters, and the calibrated internal parameters are then applicable to the other cameras in the wearable device.
Fig. 4B shows a schematic diagram of another exemplary image acquisition scenario 410, according to an embodiment of the present disclosure.
As shown in fig. 4B, a calibration reference object 402 may also be provided at a fixed location in the acquisition scene 410, similar to the acquisition scene 400.
Unlike the image acquisition scenario 400, in order to calculate the pose relationship (second parameter) between at least two cameras in the wearable device 200, the at least two cameras may first be installed in the lens barrels of the wearable device 200; after the wearable device 200 is completely assembled, the assembled wearable device 200 or a prototype thereof is used to perform the image acquisition for external parameter calibration, and the corresponding external parameters (for example, the pose relationship between the different cameras) are then calculated based on the captured images.
As shown in fig. 4B, taking as an example the case in which two cameras are provided in each of the two lens barrels of the wearable device 200, when image acquisition (or image capture) is performed on the calibration reference object 402, the wearable device 200 can be controlled to move along a certain motion trajectory, and the four cameras continuously photograph the calibration reference object 402 while the wearable device 200 moves, so that, with the relative pose relationship of the four cameras unchanged, multiple images acquired by the four cameras in different poses are obtained as image data for subsequently calculating the second parameter. Because the acquired images are captured by the cameras in multiple poses, the camera parameters obtained by the subsequent calculation are more robust and applicable over a wider range.
From the foregoing, it will be appreciated that in the acquisition scene 410, each camera may acquire a plurality of images that, in addition to being used to calculate the relative pose relationship of the four cameras, may also be used to calculate the intrinsic parameters of each camera itself, and thus, in some embodiments, the acquisition scene 410 may be used only to acquire images and the first and second parameters of the cameras may be calibrated based on the images.
In step 302, the computer device may receive a plurality of images acquired in the aforementioned scene 400 and/or scene 410 for subsequent processing.
After the desired plurality of images are acquired, the camera may be parameter calibrated based on the plurality of images. The parameter calibration may include calibrating a first parameter and a second parameter, where the first parameter may be an internal parameter of the camera and the second parameter may be an external parameter of the camera.
In some embodiments, the camera's internal parameters may be calibrated first. However, as mentioned above, since the camera is disposed in the lens barrel, the imaging of the camera is affected not only by the camera module itself but also by the lenses in the lens barrel, so the imaging distortion is no longer symmetrical about the principal point of the image and the camera does not have a single center of projection; as a result, a parameterized camera model cannot be used to fit the projection process of the camera in the lens barrel and handle the asymmetric distortion. Thus, in some embodiments, a non-parametric camera model is provided to calibrate the internal parameters of the camera.
Then, in step 304, a projection relationship between pixels of the plurality of images and the calibration reference object may be determined. In this step, a projection relationship between the image and the calibration reference object 402 may be established according to the acquired image, so as to establish a correspondence between the pixel point and the projection direction (pixel-ray).
It will be appreciated that the projection relationship between the pixels of an image and the calibration reference object represents a camera's intrinsic characteristics (i.e., its internal parameters); therefore, when performing the projection relationship calculation for a particular camera (e.g., camera 208A), the projection relationship needs to be established using the images acquired by that particular camera. Thus, taking the image capturing scene shown in fig. 4B as an example, the multiple images need to be divided into the plurality of first images captured by the first camera 208A, the plurality of second images captured by the second camera 208B, the plurality of third images captured by the third camera 208C, and the plurality of fourth images captured by the fourth camera 208D.
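Conceptually, the resulting non-parametric model is a per-pixel lookup table rather than a closed-form projection function. The sketch below is an illustrative assumption (not the patent's implementation): it stores, for each target pixel of a camera to be calibrated, a unit projection direction, which is exactly the pairing that a first parameter captures.

```python
import numpy as np

class NonParametricCameraModel:
    """Per-pixel ray table: each 'first parameter' pairs a target pixel with a projection direction."""

    def __init__(self, width, height):
        # directions[v, u] is a unit 3D vector; NaN marks pixels not yet calibrated.
        self.directions = np.full((height, width, 3), np.nan, dtype=np.float64)

    def set_ray(self, u, v, direction):
        d = np.asarray(direction, dtype=np.float64)
        self.directions[v, u] = d / np.linalg.norm(d)

    def ray(self, u, v):
        """Projection direction for pixel (u, v); NaN if the pixel is uncalibrated."""
        return self.directions[v, u]

# Example: a hypothetical 400 x 400 eye-tracking camera.
model = NonParametricCameraModel(400, 400)
model.set_ray(200, 150, (0.01, -0.02, 1.0))
print(model.ray(200, 150))
```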
The projection relationship calculation is performed by taking the first camera 208A as an example.
In some embodiments, as shown in fig. 3B, the step 304 of determining the projection relationship between the pixels of the plurality of images and the calibration reference object may further include the steps of:
in step 3042, a first number of target images is selected from the plurality of images.
In this step, the projection relationship may be established by selecting a certain number of target images from a plurality of first images (images acquired by the first camera 208A) among the plurality of images, and the remaining first images may be used to complement the portion where the projection relationship is not established.
It will be appreciated that the first number is not particularly limited, as long as it is sufficient for the subsequent steps. As an alternative embodiment, the first number may be 3. Fig. 4C shows 3 selected target images 412, 414, 416. As shown in fig. 4C, the target images 412, 414, 416 respectively show images obtained by the first camera 208A capturing the calibration reference object 402 in different poses; it can be understood that calibrating camera parameters from images captured in different camera poses makes the calculated camera parameters more robust.
At step 3044, at least one local region is selected in each of the target images (e.g., a region corresponding to one of the checkerboard squares may be taken as a local region), and a homography transformation matrix between each local region and the calibration reference object is established.
After at least one local region of each target image is selected, a homography transformation matrix with the calibration reference object 402 can be respectively constructed for each local region of each target image, so as to establish a corresponding relationship between each pixel point contained in the local region and the coordinate system of the calibration reference object 402 (pixel-coordinate).
It is known that the coordinate system of the position where the calibration reference object 402 is located is known, that the checkerboard on the calibration reference object 402 corresponds to the checkerboard image in the target image, and that the coordinates of the four vertices of the local area in the camera coordinate system of the target image are also known; from this known information, the homography matrix H between the local area and the calibration reference object 402 can be obtained.
In this way, a correspondence is established between the pixel points contained in the local area and the calibration reference object 402, or the coordinate system in which the calibration reference object 402 is located.
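As a minimal sketch of this step, assuming hypothetical vertex coordinates and OpenCV's general-purpose homography estimator (the patent does not prescribe a particular solver), the homography H between one local region and the plane of the calibration reference object 402 could be obtained as follows.

```python
import cv2
import numpy as np

# Four vertices of the local region in the target image (pixels); hypothetical values.
region_px = np.array([[120.0, 80.0], [160.0, 82.0], [158.0, 121.0], [118.0, 119.0]], np.float32)

# Corresponding corners on the calibration reference object's plane (mm), z = 0.
region_board = np.array([[0.0, 0.0], [20.0, 0.0], [20.0, 20.0], [0.0, 20.0]], np.float32)

# Homography mapping image pixels in this local region to board-plane coordinates.
H, _ = cv2.findHomography(region_px, region_board)
print(H)
```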
In step 3046, a projection relationship between the pixel points included in the local area and the calibration reference object is determined according to the homography transformation matrix.
FIG. 4D illustrates a schematic diagram of a projected relationship of pixel points to a calibration reference object, according to an embodiment of the present disclosure.
As shown in fig. 4D, the region bounded by the 4 vertices connected by the 4 dashed lines in the figure is taken as the local region; once the homography transformation matrix has been constructed, it can be used to calculate the projection relationship between each pixel point included in the local region of the target image and the corresponding point on the calibration reference object.
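For illustration, the following sketch shows how the estimated homography could be applied to every pixel inside the local region to obtain its corresponding point on the calibration reference object; the dense-mask strategy and the function name are illustrative assumptions.

```python
import numpy as np
import cv2

def project_local_region(H, region_mask):
    """Map every pixel inside a local region to calibration-object coordinates.

    H:           3x3 homography from image pixels to board-plane coordinates.
    region_mask: boolean (h, w) array marking the pixels of the local region.
    Returns (pixels, board_points), both (N, 2) arrays in matching order.
    """
    v, u = np.nonzero(region_mask)                       # pixel rows / columns
    pixels = np.stack([u, v], axis=1).astype(np.float64)
    # perspectiveTransform expects points shaped (N, 1, 2)
    board_points = cv2.perspectiveTransform(pixels.reshape(-1, 1, 2), H)
    return pixels, board_points.reshape(-1, 2)
```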
In step 3048, the first number of target images are converted into a reference coordinate system, and the projection relationship between the pixel points of the plurality of images and the calibration reference object is determined according to the regions, corresponding to the local regions, for which the projection relationship has been established.
As shown in fig. 4C, the 3 target images 412, 414, 416 are acquired under different camera poses, and the homography transformation matrix of each local region is constructed in the camera coordinate system corresponding to its target image. It is understood that, in order to unify the projection relationships obtained from the target images under the same reference coordinate system, the 3 target images 412, 414, 416 may be converted into the reference coordinate system in this step.
After the 3 target images 412, 414, 416 have been processed according to the foregoing steps, a similar method may be used to select new target images from the remaining first images, choose local regions in them, and construct homography transformation matrices, until the projection relationship to the calibration reference object has been established for every pixel point of the entire image.
In some embodiments, after all images have been processed and the calibration of the whole image is complete, some pixel points may be associated with several projection directions. The data for such a pixel point may be averaged to obtain a single projection direction, which yields better projection direction data. In addition, only the averaged projection direction needs to be stored subsequently, saving storage space.
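Where a pixel ends up with several candidate projection directions from overlapping local regions, the averaging mentioned above could, for example, normalize each direction and renormalize their mean; the sketch below is written under that assumption and the function name is illustrative.

```python
import numpy as np

def average_directions(directions):
    """Average several projection directions assigned to the same pixel.

    directions: (k, 3) array of direction vectors in the reference frame.
    Returns a single unit vector: the mean of the normalized inputs,
    renormalized, which is stored as the pixel's projection direction.
    """
    d = np.asarray(directions, dtype=np.float64)
    d = d / np.linalg.norm(d, axis=1, keepdims=True)   # make each direction unit length
    mean = d.mean(axis=0)
    return mean / np.linalg.norm(mean)
```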
It can be appreciated that, for the second camera 208B, the third camera 208C, the fourth camera 208D, and possibly more cameras (depending on the number of cameras set in the wearable device), the aforementioned method may be used to establish the projection relationship between the image pixels and the calibration reference object 402, which is not described herein.
After the projection relationship is established, a first parameter set and a second parameter set of the at least two cameras may be determined from the projection relationship in step 306. The first parameter set comprises a plurality of first parameters, each characterizing a target pixel point in an image captured by the camera to be calibrated and the projection direction corresponding to that target pixel point; the second parameter set comprises at least one group of second parameters indicating the pose relationship between the at least two cameras.
In some embodiments, a first set of parameters for the at least two cameras may be determined based on the projection relationship, and then a second set of parameters may be determined.
Fig. 4E shows a schematic diagram of an exemplary first parameter according to an embodiment of the present disclosure.
As shown in fig. 4E, when the aforementioned projection relationship is constructed, at least some pixels of the image in the reference coordinate system are placed in one-to-one correspondence with points on the calibration reference object, so that a straight-line equation can be obtained for each such pixel; the straight line passes through the corresponding pixel, and the ray shown in the figure represents its projection direction. Such a combination of a pixel point and a projection direction may be regarded as a first parameter (which may be considered an intrinsic parameter of the camera). With it, the coordinate system of the image captured by the camera can be converted into the camera coordinate system, and it can therefore be used to calculate the position of certain features (e.g., the pupil) in the captured image in three-dimensional space (the camera coordinate system). In addition, because homography transformation matrices with respect to the calibration reference object are established separately for a plurality of local regions of the first images during the calculation of the camera parameters, the projection direction obtained for a pixel point already incorporates correction of the image distortion at that position, so no additional distortion correction is required when the camera parameters are used for coordinate system conversion.
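To make the role of the first parameters concrete, the following sketch shows how a stored pixel-to-ray table might be used to turn a detected feature (e.g., a pupil center) into a 3D ray in the camera coordinate system; the table layout and the bilinear lookup are illustrative assumptions rather than the exact implementation of the embodiments.

```python
import numpy as np

def pixel_to_ray(ray_table, u, v):
    """Look up the projection direction (first parameter) at a sub-pixel location.

    ray_table: (H, W, 3) array storing one unit direction per pixel.
    (u, v):    feature position in image coordinates (column, row).
    Returns the bilinearly interpolated unit direction in camera coordinates.
    """
    u0, v0 = int(np.floor(u)), int(np.floor(v))
    du, dv = u - u0, v - v0
    patch = ray_table[v0:v0 + 2, u0:u0 + 2]            # 2x2x3 neighbourhood
    w = np.array([[(1 - dv) * (1 - du), (1 - dv) * du],
                  [dv * (1 - du),       dv * du]])
    ray = (patch * w[..., None]).sum(axis=(0, 1))      # bilinear blend of the four rays
    return ray / np.linalg.norm(ray)
```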
In some embodiments, the data obtained above may be compressed before being used as the first parameters of the camera, so that the space required to store the first parameters is reduced. For example, a spline surface may be fitted to the initial set of first parameters obtained by calibration, a bundle adjustment (BA) algorithm may then be used to optimize the control points of the fitted spline surface, and the first parameters corresponding to all the optimized control points of the spline surface are taken as the first parameter sets of the at least two cameras.
When the first parameters need to be used, a complete set of first parameters can be recovered by interpolation from the fitted spline surface and the optimized control points.
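One possible way to realize the spline-surface compression and later interpolation described above is sketched below; SciPy's rectangular bivariate spline is used purely as an illustrative stand-in for the fitted surface and its control points, and the function names are assumptions.

```python
import numpy as np
from scipy.interpolate import RectBivariateSpline

def fit_ray_component(component_map, smoothing=0.0):
    """Fit one component (x, y or z) of the per-pixel direction map with a
    spline surface so that only the spline coefficients need to be stored."""
    h, w = component_map.shape
    rows, cols = np.arange(h), np.arange(w)
    return RectBivariateSpline(rows, cols, component_map, s=smoothing)

def interpolate_ray(splines, u, v):
    """Recover the full first parameter for any pixel (u, v) from the three
    fitted spline surfaces, then renormalize to a unit direction."""
    ray = np.array([s(v, u)[0, 0] for s in splines])   # evaluate at row v, column u
    return ray / np.linalg.norm(ray)
```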
Note that the foregoing describes the calculation of the first parameters for a single camera; the same method may be applied to each camera in the wearable device 200 to calculate its corresponding first parameters, which is not repeated herein.
After the first parameter set is obtained, the second parameter set may be further calibrated.
Returning to fig. 2C, the wearable device 200 includes a first camera 208A and a second camera 208B provided in the first lens barrel 202A for capturing human-eye images of the first eye 1022A, and a third camera 208C and a fourth camera 208D provided in the second lens barrel 202B for capturing human-eye images of the second eye 1022B. Since the four cameras occupy different positions in the wearable device 200, in order to determine the relative relationship of the images acquired by the four cameras, the pose relationships between the four cameras need to be known and used as the second parameters, thereby completing the calibration of the cameras' external parameters.
The following exemplarily describes how to calculate the second parameters, taking the calibration of the four cameras in the wearable device 200 as an example.
As shown in fig. 2C, since the two cameras disposed in the same lens barrel of the wearable device 200 acquire images of the same human eye, the two cameras generally have a common viewpoint (their optical paths intersect), and thus the pose relationship between two cameras disposed in the same lens barrel can be found using a binocular (stereo) calibration algorithm.
Thus, in some embodiments, the second parameters may include the pose relationship between the first camera 208A and the second camera 208B and the pose relationship between the third camera 208C and the fourth camera 208D, both of which are calculated from the projection relationship based on the binocular calibration algorithm.
Optionally, step 306 of determining a first parameter set and a second parameter set of the at least two cameras based on the projection relationship may further comprise: determining the pose relationship between the first camera 208A and the second camera 208B and the pose relationship between the third camera 208C and the fourth camera 208D using the binocular calibration algorithm based on the projection relationship.
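As one illustration of this binocular calibration step, the sketch below uses OpenCV's stereoCalibrate with a pinhole approximation of the per-camera projection relationships; this is purely an illustrative stand-in (the intrinsics handled elsewhere in this disclosure are per-pixel projection directions rather than a pinhole matrix), and the function name and parameters are assumptions.

```python
import cv2

def same_barrel_extrinsics(obj_points, img_points_a, img_points_b,
                           K_a, dist_a, K_b, dist_b, image_size):
    """Estimate the pose relationship (R, t) between two cameras that share a
    lens barrel and therefore observe the same calibration views.

    obj_points:   list of (N, 3) float32 board points, one array per view.
    img_points_*: matching (N, 2) float32 detections per view for each camera.
    K_*, dist_*:  per-camera intrinsics assumed to have been derived beforehand.
    """
    flags = cv2.CALIB_FIX_INTRINSIC          # keep the previously derived intrinsics fixed
    ret, _, _, _, _, R, t, _, _ = cv2.stereoCalibrate(
        obj_points, img_points_a, img_points_b,
        K_a, dist_a, K_b, dist_b, image_size, flags=flags)
    return R, t
```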
With continued reference to fig. 2C, it may be appreciated that because the cameras in different lens barrels capture images of different human eyes, they may not share a common viewpoint; the binocular calibration algorithm may therefore not be applicable, or the pose relationship it produces may be inaccurate. Accordingly, embodiments of the present disclosure provide a method of calculating the pose relationship between cameras located in different lens barrels.
FIG. 4F illustrates a calibration schematic of the second parameter for exemplary cameras located in different lens barrels, in accordance with an embodiment of the present disclosure.
Taking the calculation of the pose relationship between the first camera 208A and the fourth camera 208D as an example, as shown in fig. 4F, the two cameras do not have a common viewpoint. When they are moved to a certain position, the relative positions of the two cameras and the calibration reference object 402 are as shown in fig. 4F: the pose relationship between the first camera 208A and the calibration reference object 402 is T_{1,i}, the pose relationship between the fourth camera 208D and the calibration reference object 402 is T_{4,i}, and the pose relationship between the first camera 208A and the fourth camera 208D is T_{14}. Using the observation data of the first camera 208A and the fourth camera 208D (the first image set and the fourth image set obtained in the image capture step), a BA optimization problem can be constructed, and the pose relationship T_{14} can be obtained by solving this problem.
Thus, in some embodiments, the second parameters may further comprise the pose relationship between the first camera and the third camera, the pose relationship between the first camera and the fourth camera, the pose relationship between the second camera and the third camera, and the pose relationship between the second camera and the fourth camera.
Optionally, step 306 of determining a first parameter set and a second parameter set of the at least two cameras according to the projection relationship may further comprise: determining the pose relationship between the first camera and the third camera, the pose relationship between the first camera and the fourth camera, the pose relationship between the second camera and the third camera, and the pose relationship between the second camera and the fourth camera according to the plurality of images and the projection relationship.
Specifically, the optimization function may be constructed first.
Optionally, the optimization function may include a first formula representing the error between a detection point in a first image acquired by the first camera 208A and the two-dimensional point obtained by projecting the spatial position corresponding to that detection point (its corresponding position on the calibration reference object 402) into the image coordinate system using the first parameter of the first camera 208A corresponding to that detection point. Optionally, the first formula is expressed as:
f_{cam1,1} = π_1(T_{1,i}, P_{cam1}) - d_{cam1}
where π_1 is the projection function corresponding to the first camera 208A (i.e., the projection relationship, obtained above, between pixel points and the calibration reference object 402), T_{1,i} is the external parameter of the first camera 208A with respect to the calibration reference object 402 for the i-th first image (which may be calculated from the first image and the spatial coordinates of the calibration reference object 402), P_{cam1} is the set of 3D points on the calibration reference object 402 corresponding to the i-th first image of the first camera 208A (i.e., the spatial coordinates on the calibration reference object 402 of the points corresponding to the pixel points of the first image), and d_{cam1} is the set of all detection points corresponding to the first camera 208A.
Optionally, the optimization function may further include a second formula representing the error between a detection point in a fourth image acquired by the fourth camera 208D and the two-dimensional point obtained by projecting the spatial position corresponding to that detection point (its corresponding position on the calibration reference object 402) into the image coordinate system using the first parameter of the fourth camera 208D corresponding to that detection point. Optionally, the second formula is expressed as:
f_{cam4,2} = π_4(T_{4,i}, P_{cam4}) - d_{cam4}
where π_4 is the projection function corresponding to the fourth camera 208D (i.e., the projection relationship, obtained above, between pixel points and the calibration reference object 402), T_{4,i} is the external parameter of the fourth camera 208D with respect to the calibration reference object 402 for the i-th fourth image (which may be calculated from the coordinate information of the fourth image and the calibration reference object 402), P_{cam4} is the set of 3D points on the calibration reference object 402 corresponding to the i-th fourth image of the fourth camera 208D (i.e., the spatial coordinates on the calibration reference object 402 of the points corresponding to the pixel points of the fourth image), and d_{cam4} is the set of all detection points corresponding to the fourth camera 208D.
Optionally, the optimization function may further include a third formula representing the error between a detection point in a fourth image acquired by the fourth camera 208D and the two-dimensional point obtained by projecting the spatial position corresponding to that detection point (its corresponding position on the calibration reference object 402) into the image coordinate system using the first parameter of the fourth camera 208D, the external parameter T_{1,i} of the first camera 208A corresponding to that detection point, and the pose relationship between the first camera and the fourth camera. Optionally, the third formula is expressed as:
f_{cam4,1} = π_4(T_{14}·T_{1,i}, P_{cam4}) - d_{cam4}
where T_{14} is the pose relationship between the first camera 208A and the fourth camera 208D, i.e., the external parameter between the first camera and the fourth camera that the embodiments of the present disclosure seek to obtain.
Optionally, the optimization function may further include a fourth formula representing the error between a detection point in a first image acquired by the first camera 208A and the two-dimensional point obtained by projecting the spatial position corresponding to that detection point (its corresponding position on the calibration reference object 402) into the image coordinate system using the first parameter of the first camera 208A, the external parameter T_{4,i} of the fourth camera 208D corresponding to that detection point, and the pose relationship between the first camera and the fourth camera. Optionally, the fourth formula is expressed as:

f_{cam1,2} = π_1(T_{14}^T·T_{4,i}, P_{cam1}) - d_{cam1}
where T_{14}^T, the transpose of T_{14}, is the transposed matrix of the pose relationship between the first camera 208A and the fourth camera 208D.
In some embodiments, the detection points d_{cam1} and d_{cam4} may be obtained in the following manner.
Fig. 4G shows a schematic diagram of an image obtained by binarizing an image acquired in an embodiment of the present disclosure.
As shown in fig. 4G, after binarization every pixel of the image is either black or white, and the vertices of the checkerboard squares can be identified by feature detection. These identifiable vertices can be used as the detection points d, and the detection points detected in the i-th image can be denoted d_i.
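A minimal sketch of how the detection points d_i could be extracted from each image is given below; Otsu binarization and OpenCV's chessboard corner detector are used purely as an illustrative choice, and the pattern size is an assumed value.

```python
import cv2

def detect_points(gray, pattern_size=(9, 6)):
    """Detect checkerboard vertices to use as detection points d_i.

    gray:         single-channel eye-camera image of the calibration object.
    pattern_size: inner-corner count of the checkerboard (an assumed value).
    Returns an (N, 2) array of sub-pixel corner locations, or None if not found.
    """
    _, binary = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    found, corners = cv2.findChessboardCorners(binary, pattern_size)
    if not found:
        return None
    criteria = (cv2.TERM_CRITERIA_EPS + cv2.TERM_CRITERIA_MAX_ITER, 30, 1e-3)
    corners = cv2.cornerSubPix(gray, corners, (5, 5), (-1, -1), criteria)
    return corners.reshape(-1, 2)
```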
Optionally, an error function may further be determined based on the first formula, the second formula, the third formula, and the fourth formula, the error function being expressed as:
f = f_{cam1,1} + f_{cam4,1} + f_{cam1,2} + f_{cam4,2}
Finally, optionally, an optimization function may be constructed based on the error function to determine the final external parameter between the first camera 208A and the fourth camera 208D. Optionally, the optimization function is expressed as follows:
res(π, T) = Σ_{i=1}^{I} ρ(f(O_i))

where ρ is a loss function, f is the error function above, O_i is all the information contained in the i-th image, and I is the number of input images.
It can be seen that the loss function res(π, T) is a function of the pose, and the optimal solution obtained by solving it with the Levenberg-Marquardt (LM) algorithm is the final pose relationship T_{14}.
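The optimization itself could be carried out, for example, with SciPy's Levenberg-Marquardt solver over a 6-DoF parameterization of T_{14}; the sketch below mirrors the four error terms defined above, with pi_1 and pi_4 standing for the per-camera projection functions. The data layout, function names, and the use of the matrix inverse (playing the role of the transposed pose written above) are illustrative assumptions.

```python
import numpy as np
from scipy.optimize import least_squares
from scipy.spatial.transform import Rotation

def solve_T14(views, pi_1, pi_4, x0=np.zeros(6)):
    """Solve the pose relationship T_14 by Levenberg-Marquardt.

    views: list of per-image observations; each is assumed to provide the board
           poses T_1i, T_4i (4x4), the 3D board points P_cam1, P_cam4 and the
           2D detections d_cam1, d_cam4 as attributes.
    pi_1, pi_4: projection functions pi(T, P) -> (N, 2) pixel coordinates.
    """
    def to_matrix(x):
        T = np.eye(4)
        T[:3, :3] = Rotation.from_rotvec(x[:3]).as_matrix()
        T[:3, 3] = x[3:]
        return T

    def residuals(x):
        T14 = to_matrix(x)
        r = []
        for v in views:
            r.append(pi_1(v.T_1i, v.P_cam1) - v.d_cam1)                       # f_cam1,1
            r.append(pi_4(v.T_4i, v.P_cam4) - v.d_cam4)                       # f_cam4,2
            r.append(pi_4(T14 @ v.T_1i, v.P_cam4) - v.d_cam4)                 # f_cam4,1
            r.append(pi_1(np.linalg.inv(T14) @ v.T_4i, v.P_cam1) - v.d_cam1)  # f_cam1,2
        return np.concatenate([e.ravel() for e in r])

    sol = least_squares(residuals, x0, method='lm')   # Levenberg-Marquardt
    return to_matrix(sol.x)
```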
It can be understood that the above method can be used to obtain the pose relationship between any two cameras located in lens barrels on different sides, and the detailed description is omitted here.
Considering that there are many pairings of cameras on different sides, calculating the pose relationship with the above method for every pair would add considerable computation. Thus, in some embodiments, determining, according to the plurality of images and the projection relationship, the pose relationship between the first camera and the third camera, the pose relationship between the first camera and the fourth camera, the pose relationship between the second camera and the third camera, and the pose relationship between the second camera and the fourth camera comprises:
determining a pose relationship between the first camera and the fourth camera according to the plurality of images and the projection relationship;
determining the pose relationship between the first camera and the third camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the first camera and the fourth camera;
determining the pose relationship between the second camera and the third camera according to the pose relationship between the first camera and the second camera and the pose relationship between the first camera and the third camera;
and determining the pose relationship between the second camera and the fourth camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the second camera and the third camera.
In this way, after the pose relationship between the first camera 208A and the fourth camera 208D has been calculated, the pose relationships of the other camera pairs can be obtained by data conversion based on the already-calculated pose relationship between the first camera 208A and the second camera 208B and the pose relationship between the third camera 208C and the fourth camera 208D, thereby saving computation.
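This data conversion can be pure transform composition, as in the sketch below; the naming convention (T_ab as a 4x4 homogeneous transform taking points from camera a's frame to camera b's frame) is assumed for illustration.

```python
import numpy as np

def derive_remaining_poses(T_12, T_34, T_14):
    """Derive the remaining cross-barrel pose relationships from the three
    that were calibrated directly.

    Convention (assumed for illustration): T_ab is a 4x4 transform taking
    points expressed in camera a's frame to camera b's frame.
    """
    inv = np.linalg.inv
    T_13 = inv(T_34) @ T_14          # cam1 -> cam4 -> cam3
    T_23 = T_13 @ inv(T_12)          # cam2 -> cam1 -> cam3
    T_24 = T_34 @ T_23               # cam2 -> cam3 -> cam4
    return T_13, T_23, T_24
```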
It will be appreciated that the above method provides a calculation that can be used when two cameras in different lens barrels have no common viewpoint; when cameras in different lens barrels do share a common viewpoint, the binocular calibration algorithm can still be used to calculate their pose relationship.
It should be noted that the above example only takes the case where each of the two lens barrels of the wearable device 200 is provided with two cameras; it is understood that the number of cameras in the lens barrels may be smaller or larger according to actual needs, but in any case, based on the inventive concept of the embodiments provided by the present disclosure, the corresponding camera parameters can still be calculated, which is not described herein again.
In a more specific embodiment, the method for calibrating camera parameters provided in the embodiments of the present disclosure may include the steps of image capture, internal parameter calibration, external parameter calibration, and parameter optimization and saving, so that better camera parameters can be obtained for use by subsequent algorithms.
As can be seen from the above embodiments, the calibration method for camera parameters provided by the embodiments of the present disclosure offers a feasible scheme for calibrating the internal and external parameters of cameras in a lens barrel in scenarios where the imaging distortion is asymmetric and the projection center of the camera is not unified.
It should be noted that the method of the embodiments of the present disclosure may be performed by a single device, such as a computer or a server. The method may also be applied in a distributed scenario and completed by a plurality of devices cooperating with one another. In such a distributed scenario, one of the devices may perform only one or more steps of the method, with the devices interacting with each other to complete the method.
It should be noted that the foregoing describes some embodiments of the present disclosure. Other embodiments are within the scope of the following claims. In some cases, the actions or steps recited in the claims may be performed in a different order than in the embodiments described above and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In some embodiments, multitasking and parallel processing are also possible or may be advantageous.
The embodiment of the disclosure also provides a wearable device.
Fig. 5 shows a schematic diagram of an exemplary wearable device 500 provided by embodiments of the present disclosure.
As shown in fig. 5, similar to fig. 2A and 2B, the wearable device 500 includes a lens barrel 202 in which a display module 204, cameras 208A, 208B, and an optical assembly 206 are disposed; the cameras 208A, 208B and the optical assembly 206 are located on the light-emitting side of the display module 204, and the cameras 208A, 208B are located between the optical assembly 206 and the display module 204.
In some embodiments, the cameras 208A, 208B are oriented toward the optical assembly 206, as shown in fig. 2A. In other embodiments, as shown in fig. 2B, reflective structures 210A, 210B are disposed between the display module 204 and the cameras 208A, 208B; the reflective surfaces of the reflective structures 210A, 210B face the light-emitting side of the display module 204, and the shooting directions of the cameras 208A, 208B face the reflective structures 210A, 210B.
Further, as shown in fig. 5, the wearable device 500 further includes a processing module 502 electrically coupled to the cameras 208A, 208B and to the display module 204, and configured to obtain the first and second parameter sets produced by the method 300 together with the images captured by the cameras 208A, 208B, and to solve, using the first and second parameter sets and the images, for the position in space of a target region in the images.
As previously described, the first parameter set includes camera parameters (which may be considered internal parameters of the camera), each combining a pixel point with a projection direction. With them, the coordinate system of the image captured by the camera can be converted into the camera coordinate system, and they may therefore be used to calculate the position of certain features (e.g., the pupils) in the captured image in three-dimensional space (the camera coordinate system). In addition, correction of the image distortion is already included in the calculation of these camera parameters, so no additional distortion correction is needed when they are used for coordinate system conversion.
The second parameter set includes a second parameter representing a pose relationship between different cameras, according to which image information acquired by the different cameras may be unified (e.g., unified into the same camera coordinate system).
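For example, a 3D point reconstructed in one camera's coordinate system could be expressed in another camera's coordinate system using the corresponding second parameter; the sketch below assumes the 4x4 homogeneous convention used earlier and is illustrative only.

```python
import numpy as np

def to_reference_camera(points_cam_a, T_ab):
    """Express 3D points from camera a's coordinate system in camera b's system.

    points_cam_a: (N, 3) points in camera a's frame.
    T_ab:         4x4 second parameter mapping camera a's frame to camera b's frame.
    """
    pts = np.hstack([points_cam_a, np.ones((len(points_cam_a), 1))])  # homogeneous coords
    return (T_ab @ pts.T).T[:, :3]
```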
The disclosed embodiments also provide a computer device for implementing the above-described method 300. Fig. 6 shows a hardware architecture diagram of an exemplary computer device 600 provided by an embodiment of the present disclosure. The computer device 600 may be used to implement the head wearable device 104 of fig. 1A, the wearable device 200 of fig. 2A-2D, the external device 112 of fig. 1A, and the server 114 of fig. 1A. In some scenarios, the computer device 600 may also be used to implement the database server 116 of FIG. 1A.
As shown in fig. 6, the computer device 600 may include a processor 602, a memory 604, a network interface 606, a peripheral interface 608, and a bus 610. The processor 602, the memory 604, the network interface 606, and the peripheral interface 608 communicate with one another within the computer device 600 via the bus 610.
The processor 602 may be a central processing unit (CPU), an image processor, a neural network processor (NPU), a microcontroller (MCU), a programmable logic device, a digital signal processor (DSP), an application-specific integrated circuit (ASIC), or one or more integrated circuits. The processor 602 may be used to perform functions related to the techniques described in this disclosure. In some embodiments, the processor 602 may also include multiple processors integrated as a single logical component. For example, as shown in fig. 6, the processor 602 may include a plurality of processors 602a, 602b, and 602c.
The memory 604 may be configured to store data (e.g., instructions, computer code, etc.). As shown in fig. 6, the data stored by the memory 604 may include program instructions (e.g., program instructions for implementing the methods 300 or 500 of embodiments of the present disclosure) as well as data to be processed (e.g., the memory may store configuration files of other modules, etc.). The processor 602 may also access program instructions and data stored in the memory 604 and execute the program instructions to perform operations on the data to be processed. The memory 604 may include volatile memory or nonvolatile memory. In some embodiments, the memory 604 may include Random Access Memory (RAM), Read-Only Memory (ROM), optical disks, magnetic disks, hard disks, Solid State Drives (SSD), flash memory, memory sticks, and the like.
The network interface 606 may be configured to provide communications with other external devices to the computer device 600 via a network. The network may be any wired or wireless network capable of transmitting and receiving data. For example, the network may be a wired network, a local wireless network (e.g., bluetooth, wiFi, near Field Communication (NFC), etc.), a cellular network, the internet, or a combination of the foregoing. It will be appreciated that the type of network is not limited to the specific examples described above.
Peripheral interface 608 may be configured to connect computer apparatus 600 with one or more peripheral devices to enable information input and output. For example, the peripheral devices may include input devices such as keyboards, mice, touchpads, touch screens, microphones, various types of sensors, and output devices such as displays, speakers, vibrators, and indicators.
The bus 610 may be configured to transfer information between the various components of the computer device 600 (e.g., the processor 602, the memory 604, the network interface 606, and the peripheral interface 608), and may be an internal bus (e.g., a processor-memory bus), an external bus (a USB port, a PCI-E bus), etc.
It should be noted that although the architecture of the computer device 600 described above illustrates only the processor 602, the memory 604, the network interface 606, the peripheral interface 608, and the bus 610, in particular implementations, the architecture of the computer device 600 may include other components necessary to achieve proper operation. Moreover, those skilled in the art will appreciate that the architecture of the computer device 600 described above may include only the components necessary to implement the disclosed embodiments, and not all of the components shown in the figures.
The embodiment of the disclosure also provides a calibration device for camera parameters. Fig. 7 shows a schematic diagram of an exemplary apparatus 700 provided by an embodiment of the present disclosure. As shown in fig. 7, the apparatus 700 may be used to implement the method 300 and may further include the following modules.
a receiving module, configured to: receive a plurality of images obtained by at least two cameras capturing a calibration reference object, the at least two cameras being disposed in a lens barrel of a wearable device; the wearable device further comprises a binocular display module, the lens barrel is arranged on the light-emitting side of the binocular display module, an optical component is arranged in the lens barrel, and the at least two cameras are positioned between the binocular display module and the optical component;
a first determining module 704 configured to determine a projection relationship of pixels of the plurality of images with the calibration reference object;
A second determination module 706 is configured to determine a first parameter set and a second parameter set of the at least two cameras from the projection relationship.
The first parameter set comprises a plurality of first parameters, each characterizing a target pixel point in an image captured by the camera to be calibrated and the projection direction corresponding to that target pixel point; the second parameter set comprises at least one group of second parameters indicating the pose relationship between the at least two cameras.
In some embodiments, the at least two cameras include a first camera and a second camera for acquiring human-eye images of a first eye and a third camera and a fourth camera for acquiring human-eye images of a second eye, the first camera and the second camera being disposed in a first lens barrel of the wearable device, and the third camera and the fourth camera being disposed in a second lens barrel of the wearable device;
The second parameters comprise the pose relationship between the first camera and the second camera and the pose relationship between the third camera and the fourth camera, both of which are calculated from the projection relationship based on a binocular calibration algorithm.
In some embodiments, the second parameters further comprise a pose relationship between the first camera and the third camera, a pose relationship between the first camera and the fourth camera, a pose relationship between the second camera and the third camera, and a pose relationship between the second camera and the fourth camera.
In some embodiments, the second determination module 706 is configured to:
determining a pose relationship between the first camera and the fourth camera according to the plurality of images and the projection relationship;
determining the pose relationship between the first camera and the third camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the first camera and the fourth camera;
determining the pose relationship between the second camera and the third camera according to the pose relationship between the first camera and the second camera and the pose relationship between the first camera and the third camera;
and determining the pose relationship between the second camera and the fourth camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the second camera and the third camera.
In some embodiments, the first determination module 704 is configured to:
selecting a first number of target images from the plurality of images;
selecting at least one local area from each target image, and establishing a homography transformation matrix of each local area and the calibration reference object;
determining the projection relation between the pixel points contained in the local area and the calibration reference object according to the homography transformation matrix;
and converting the first number of target images into a reference coordinate system, and determining the projection relation between the pixel points of the plurality of images and the calibration reference object according to the area which corresponds to the local area and has the established projection relation.
In some embodiments, the second determination module 706 is configured to:
determining initial first parameter sets of the at least two cameras according to the projection relation, wherein the initial first parameter sets comprise a plurality of first parameters corresponding to a plurality of pixel points of the image one by one;
fitting the initial first parameter set by adopting a spline surface to obtain a fitted spline surface;
and optimizing the control points of the fitted spline surface based on a bundle adjustment algorithm to obtain the first parameter set of the at least two cameras.
For convenience of description, the above devices are described as being functionally divided into various modules, respectively. Of course, the functions of the various modules may be implemented in the same one or more pieces of software and/or hardware when implementing the present disclosure.
The apparatus of the foregoing embodiments is configured to implement the corresponding method 300 in any of the foregoing embodiments, and has the beneficial effects of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, corresponding to any of the above-described embodiment methods, the present disclosure also provides a non-transitory computer-readable storage medium storing computer instructions for causing the computer to perform the method 300 as described in any of the above-described embodiments.
The computer readable media of the present embodiments, including both permanent and non-permanent, removable and non-removable media, may be used to implement information storage by any method or technology. The information may be computer readable instructions, data structures, modules of a program, or other data. Examples of storage media for a computer include, but are not limited to, phase change memory (PRAM), static random access memory (SRAM), dynamic random access memory (DRAM), other types of random access memory (RAM), read-only memory (ROM), electrically erasable programmable read-only memory (EEPROM), flash memory or other memory technology, compact disc read-only memory (CD-ROM), digital versatile discs (DVD) or other optical storage, magnetic cassettes, magnetic tape, magnetic disk storage or other magnetic storage devices, or any other non-transmission medium, which can be used to store information that can be accessed by a computing device.
The storage medium of the foregoing embodiments stores computer instructions for causing the computer to perform the method 300 as described in any of the foregoing embodiments, and has the advantages of the corresponding method embodiments, which are not described herein.
Based on the same inventive concept, and corresponding to any of the embodiments of the method 300 described above, the present disclosure also provides a computer program product comprising computer program instructions. In some embodiments, the computer program instructions may be executed by one or more processors of a computer to cause the computer and/or the processor to perform the described method 300. Corresponding to the execution entities of the steps in the embodiments of the method 300, the processor executing a given step may belong to the corresponding execution entity.
The computer program product of the above embodiment is configured to enable the computer and/or the processor to perform the method 300 of any one of the above embodiments, and has the advantages of the corresponding method embodiments, which are not described herein.
It will be appreciated by persons skilled in the art that the foregoing discussion of any embodiment is merely exemplary and is not intended to imply that the scope of the disclosure, including the claims, is limited to these examples. Within the spirit of the disclosure, the steps may be performed in a different order, many other variations of the different aspects of the embodiments described above exist (not detailed here for the sake of brevity), and features of the above embodiments, or of different embodiments, may also be combined.
Additionally, well-known power/ground connections to Integrated Circuit (IC) chips and other components may or may not be shown within the provided figures, in order to simplify the illustration and discussion, and so as not to obscure the embodiments of the present disclosure. Furthermore, the devices may be shown in block diagram form in order to avoid obscuring the embodiments of the present disclosure, and this also accounts for the fact that specifics with respect to implementation of such block diagram devices are highly dependent upon the platform on which the embodiments of the present disclosure are to be implemented (i.e., such specifics should be well within purview of one skilled in the art). Where specific details (e.g., circuits) are set forth in order to describe example embodiments of the disclosure, it should be apparent to one skilled in the art that embodiments of the disclosure can be practiced without, or with variation of, these specific details. Accordingly, the description is to be regarded as illustrative in nature and not as restrictive.
While the present disclosure has been described in conjunction with specific embodiments thereof, many alternatives, modifications, and variations of those embodiments will be apparent to those skilled in the art in light of the foregoing description. For example, other memory architectures (e.g., dynamic RAM (DRAM)) may use the embodiments discussed.
The disclosed embodiments are intended to embrace all such alternatives, modifications and variances which fall within the broad scope of the appended claims. Accordingly, any omissions, modifications, equivalents, improvements, and the like, which are within the spirit and principles of the embodiments of the disclosure, are intended to be included within the scope of the disclosure.

Claims (15)

1. A method for calibrating camera parameters, comprising: receiving a plurality of images obtained by at least two cameras capturing a calibration reference object, the at least two cameras being disposed in a lens barrel of a wearable device, wherein the wearable device further comprises a binocular display module, the lens barrel is disposed on a light-emitting side of the binocular display module, an optical component is disposed in the lens barrel, and the at least two cameras are located between the binocular display module and the optical component; determining a projection relationship between pixel points of the plurality of images and the calibration reference object; and determining a first parameter set and a second parameter set of the at least two cameras according to the projection relationship; wherein the first parameter set comprises a plurality of first parameters, each first parameter characterizing a target pixel point in an image captured by a camera to be calibrated and a projection direction corresponding to the target pixel point, and the second parameter set comprises at least one group of second parameters indicating a pose relationship between the at least two cameras.
2. The method of claim 1, wherein the at least two cameras comprise a first camera and a second camera for acquiring human-eye images of a first eye, and a third camera and a fourth camera for acquiring human-eye images of a second eye, the first camera and the second camera being disposed in a first lens barrel of the wearable device, and the third camera and the fourth camera being disposed in a second lens barrel of the wearable device; and the second parameters comprise a pose relationship between the first camera and the second camera and a pose relationship between the third camera and the fourth camera, both of which are calculated from the projection relationship based on a binocular calibration algorithm.
3. The method of claim 2, wherein the second parameters further comprise a pose relationship between the first camera and the third camera, a pose relationship between the first camera and the fourth camera, a pose relationship between the second camera and the third camera, and a pose relationship between the second camera and the fourth camera.
4. The method of claim 3, wherein determining the first parameter set and the second parameter set of the at least two cameras according to the projection relationship comprises: determining the pose relationship between the first camera and the fourth camera according to the plurality of images and the projection relationship; determining the pose relationship between the first camera and the third camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the first camera and the fourth camera; determining the pose relationship between the second camera and the third camera according to the pose relationship between the first camera and the second camera and the pose relationship between the first camera and the third camera; and determining the pose relationship between the second camera and the fourth camera according to the pose relationship between the third camera and the fourth camera and the pose relationship between the second camera and the third camera.
5. The method of claim 1, wherein determining the projection relationship between the pixel points of the plurality of images and the calibration reference object comprises: selecting a first number of target images from the plurality of images; selecting at least one local region in each of the target images, and establishing a homography transformation matrix between each local region and the calibration reference object; determining the projection relationship between the pixel points contained in the local region and the calibration reference object according to the homography transformation matrix; and converting the first number of target images into a reference coordinate system, and determining the projection relationship between the pixel points of the plurality of images and the calibration reference object according to the regions, corresponding to the local regions, for which the projection relationship has been established.
6. The method of claim 1, wherein determining the first parameter set and the second parameter set of the at least two cameras according to the projection relationship comprises: determining an initial first parameter set of the at least two cameras according to the projection relationship, the initial first parameter set comprising a plurality of first parameters in one-to-one correspondence with a plurality of pixel points of the image; fitting the initial first parameter set with a spline surface to obtain a fitted spline surface; and optimizing control points of the fitted spline surface based on a bundle adjustment algorithm to obtain the first parameter set of the at least two cameras.
7. A wearable device, comprising: a binocular display module; and two lens barrels disposed on a light-emitting side of the binocular display module, at least one of the two lens barrels comprising at least two cameras for acquiring human-eye images and an optical component disposed inside the lens barrel, the at least two cameras being located between the binocular display module and the optical component; wherein the wearable device further comprises a processing module electrically coupled to the at least two cameras and to the binocular display module and configured to: acquire a first parameter set, a second parameter set, and images captured by the at least two cameras; and solve, using the first parameter set, the second parameter set, and the images, for a position in space of a target region in the images; wherein the first parameter set comprises a plurality of first parameters, each first parameter characterizing a target pixel point in an image captured by the camera and a projection direction corresponding to the target pixel point, and the second parameter set comprises at least one group of second parameters indicating a pose relationship between the at least two cameras.
8. The wearable device of claim 7, wherein the binocular display module comprises a first display module and a second display module, and the two lens barrels comprise: a first lens barrel disposed on a light-emitting side of the first display module, a first camera and a second camera for capturing human-eye images of a first eye being disposed in the first lens barrel; and a second lens barrel disposed on a light-emitting side of the second display module, a third camera and a fourth camera for capturing human-eye images of a second eye being disposed in the second lens barrel.
9. The wearable device of claim 8, wherein the first camera and the second camera are disposed in the first lens barrel symmetrically with respect to an axis of the first lens barrel; and the third camera and the fourth camera are disposed in the second lens barrel symmetrically with respect to an axis of the second lens barrel.
10. The wearable device of claim 9, wherein the first camera and the second camera both face the light-emitting side of the first display module and the angles between their orientations and the axis of the first lens barrel are equal; and the third camera and the fourth camera both face the light-emitting side of the second display module and the angles between their orientations and the axis of the second lens barrel are equal.
11. The wearable device of claim 9, wherein the first camera and the second camera both face the first display module, a first reflective structure corresponding to the first camera and a second reflective structure corresponding to the second camera are disposed in the first lens barrel, the first reflective structure is configured to reflect light from the first eye into the first camera, and the second reflective structure is configured to reflect light from the first eye into the second camera; and the third camera and the fourth camera both face the second display module, a third reflective structure corresponding to the third camera and a fourth reflective structure corresponding to the fourth camera are disposed in the second lens barrel, the third reflective structure is configured to reflect light from the second eye into the third camera, and the fourth reflective structure is configured to reflect light from the second eye into the fourth camera.
12. An apparatus for calibrating camera parameters, comprising: a receiving module configured to receive a plurality of images obtained by at least two cameras capturing a calibration reference object, the at least two cameras being disposed in a lens barrel of a wearable device, wherein the wearable device further comprises a binocular display module, the lens barrel is disposed on a light-emitting side of the binocular display module, an optical component is disposed in the lens barrel, and the at least two cameras are located between the binocular display module and the optical component; a first determining module configured to determine a projection relationship between pixel points of the plurality of images and the calibration reference object; and a second determining module configured to determine a first parameter set and a second parameter set of the at least two cameras according to the projection relationship; wherein the first parameter set comprises a plurality of first parameters, each first parameter characterizing a target pixel point in an image captured by a camera to be calibrated and a projection direction corresponding to the target pixel point, and the second parameter set comprises at least one group of second parameters indicating a pose relationship between the at least two cameras.
13. A computer device, comprising one or more processors, a memory, and one or more programs, wherein the one or more programs are stored in the memory and executed by the one or more processors, the programs comprising instructions for performing the method of any one of claims 1-6.
14. A non-volatile computer-readable storage medium containing a computer program which, when executed by one or more processors, causes the processors to perform the method of any one of claims 1-6.
15. A computer program product comprising computer program instructions which, when run on a computer, cause the computer to perform the method of any one of claims 1-6.
CN202311484874.4A 2023-11-08 2023-11-08 Calibration method of camera parameters and related equipment Active CN119963652B (en)

Priority Applications (2)

Application Number Priority Date Filing Date Title
CN202311484874.4A CN119963652B (en) 2023-11-08 2023-11-08 Calibration method of camera parameters and related equipment
PCT/CN2024/130269 WO2025098400A1 (en) 2023-11-08 2024-11-06 Camera parameter calibration method and related device

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202311484874.4A CN119963652B (en) 2023-11-08 2023-11-08 Calibration method of camera parameters and related equipment

Publications (2)

Publication Number Publication Date
CN119963652A CN119963652A (en) 2025-05-09
CN119963652B true CN119963652B (en) 2025-11-04

Family

ID=95586553

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202311484874.4A Active CN119963652B (en) 2023-11-08 2023-11-08 Calibration method of camera parameters and related equipment

Country Status (2)

Country Link
CN (1) CN119963652B (en)
WO (1) WO2025098400A1 (en)

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105872527A (en) * 2015-01-21 2016-08-17 成都理想境界科技有限公司 Binocular AR (Augmented Reality) head-mounted display device and information display method thereof
CN112346558A (en) * 2019-08-06 2021-02-09 苹果公司 eye tracking system

Family Cites Families (7)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
WO2015070105A1 (en) * 2013-11-07 2015-05-14 Pelican Imaging Corporation Methods of manufacturing array camera modules incorporating independently aligned lens stacks
CN109118545B (en) * 2018-07-26 2021-04-16 深圳市易尚展示股份有限公司 Three-dimensional imaging system calibration method and system based on rotating shaft and binocular camera
AU2020321023B2 (en) * 2019-07-31 2025-12-18 Xenon Ophthalmics Inc. Ophthalmologic testing systems and methods
CN112308925B (en) * 2019-08-02 2025-01-14 上海肇观电子科技有限公司 Dual-target positioning method, device and storage medium for wearable device
CN113902796B (en) * 2021-08-31 2025-08-08 杭州易现先进科技有限公司 Image-based calibration method and system for head-mounted display device
CN116338941B (en) * 2021-12-22 2025-12-05 华为技术有限公司 Eye-tracking device, display device and storage medium
CN115187658B (en) * 2022-08-29 2023-03-24 合肥埃科光电科技股份有限公司 Multi-camera visual large target positioning method, system and equipment

Patent Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN105872527A (en) * 2015-01-21 2016-08-17 成都理想境界科技有限公司 Binocular AR (Augmented Reality) head-mounted display device and information display method thereof
CN112346558A (en) * 2019-08-06 2021-02-09 苹果公司 eye tracking system

Also Published As

Publication number Publication date
CN119963652A (en) 2025-05-09
WO2025098400A1 (en) 2025-05-15

Similar Documents

Publication Publication Date Title
CN114761909B (en) Head mounted device and method for adjusting rendering of virtual content items
US11080874B1 (en) Apparatuses, systems, and methods for high-sensitivity active illumination imaging
JP6860488B2 (en) Mixed reality system
CN112005548B (en) Method of generating depth information and electronic device supporting the same
US20170134713A1 (en) Image calibrating, stitching and depth rebuilding method of a panoramic fish-eye camera and a system thereof
CN107439002B (en) Depth imaging
CN109557669A (en) It wears the image drift method for determination of amount of display equipment and wears display equipment
US10255664B2 (en) Image processing device and method
KR20240154550A (en) Wide angle eye tracking
WO2019229887A1 (en) Camera apparatus
KR102606835B1 (en) Electronic device for generating depth map and method thereof
JP5805013B2 (en) Captured image display device, captured image display method, and program
CN109084679A (en) A 3D measurement and acquisition device based on a spatial light modulator
CN119963652B (en) Calibration method of camera parameters and related equipment
US20240251176A1 (en) Perspective-correct passthrough architectures for head-mounted displays
CN107566777B (en) Screen processing method, device and storage medium for video chat
CN120088316B (en) Method for determining position of target object and related equipment
TWI393991B (en) Stereoscopic image capture device and its application of symmetrical prism array
CN119963653A (en) Camera parameter calibration method and related equipment
CN120167884B (en) Methods and related equipment for determining corneal center and pupillary distance
CN212163540U (en) Omnidirectional stereoscopic vision camera configuration system
CN108648238B (en) Virtual character driving method and device, electronic device and storage medium
WO2022019128A1 (en) Information processing device, information processing method, and computer-readable recording medium
CN120182385A (en) Parameter calibration method and related equipment for light source in wearable device
US12423916B1 (en) Eye-tracking assisted passthrough rendering

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant