
WO2019205842A1 - Relocation method and device for camera pose tracking process, and storage medium - Google Patents


Info

Publication number
WO2019205842A1
Authority
WO
WIPO (PCT)
Prior art keywords
feature point, image, camera, pose, initial
Application number
PCT/CN2019/078928
Other languages
English (en)
French (fr)
Inventor
林祥凯
凌永根
暴林超
刘威
Original Assignee
腾讯科技(深圳)有限公司
Application filed by 腾讯科技(深圳)有限公司 filed Critical 腾讯科技(深圳)有限公司
Priority to EP19792392.3A priority Critical patent/EP3779883B1/en
Publication of WO2019205842A1 publication Critical patent/WO2019205842A1/zh
Priority to US16/915,798 priority patent/US11205282B2/en


Classifications

    • G06T 7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T 7/74: Determining position or orientation of objects or cameras using feature-based methods involving reference images or patches
    • G06T 7/246: Analysis of motion using feature-based methods, e.g. the tracking of corners or segments
    • G06T 7/248: Analysis of motion using feature-based methods involving reference images or patches
    • G06T 3/02: Affine transformations (geometric image transformations in the plane of the image)
    • G06F 17/16: Matrix or vector computation, e.g. matrix-matrix or matrix-vector multiplication, matrix factorization
    • G06T 2207/10004: Still image; Photographic image
    • G06T 2207/20016: Hierarchical, coarse-to-fine, multiscale or multiresolution image processing; Pyramid transform
    • G06T 2207/30244: Camera pose

Definitions

  • The embodiments of the present application relate to the field of augmented reality, and in particular, to a relocation method, device, and storage medium for a camera pose tracking process.
  • Visual SLAM refers to the technique in which a body carrying a camera estimates its own motion and builds a model of the environment during the motion, without prior information about the environment.
  • SLAM can be used in the field of AR (Augmented Reality), robotics and unmanned driving.
  • the first frame image captured by the camera is usually used as a marker image (Anchor).
  • The device tracks the feature points shared between the current image and the marker image, and calculates the pose change of the camera in the real world according to the change of the feature point positions between the current image and the marker image.
  • However, feature point loss (Lost) may occur in the current image, in which case tracking cannot continue.
  • the current image needs to be relocated using the SLAM relocation method.
  • the embodiment of the present application provides a relocation method, device, and storage medium for a camera attitude tracking process.
  • the technical solution is as follows:
  • a relocation method for a camera pose tracking process, applied to a device having a camera that sequentially performs camera pose tracking of a plurality of marker images, the method including:
  • acquiring a current image acquired after the i-th marker image of the plurality of marker images, and acquiring an initial feature point and an initial pose parameter of the first marker image of the plurality of marker images when the current image meets a relocation condition, the initial pose parameter being used to indicate the camera pose when the camera captures the first marker image;
  • performing feature point tracking on the current image relative to the first marker image to obtain a target feature point matching the initial feature point, and calculating, according to the initial feature point and the target feature point, a pose change amount when the camera changes from the first camera pose to a target camera pose, the target camera pose being the camera pose when the camera captures the current image; and
  • relocating according to the initial pose parameter and the pose change amount to obtain a target pose parameter corresponding to the target camera pose.
  • a relocation apparatus for a camera pose tracking process, the apparatus being configured to sequentially perform camera pose tracking of a plurality of marker images, the apparatus comprising:
  • An image acquisition module configured to acquire a current image acquired after the i-th mark image of the plurality of mark images, where i is an integer greater than one;
  • An information acquiring module configured to acquire an initial feature point and an initial pose parameter of the first one of the plurality of mark images when the current image meets a relocation condition, where the initial pose parameter is used to indicate a camera pose when the camera captures the first marker image;
  • a feature point tracking module configured to perform feature point tracking on the current image with respect to the first marker image, to obtain a target feature point that matches the initial feature point;
  • a change amount calculation module configured to calculate, according to the initial feature point and the target feature point, a pose change amount when the camera changes from the first camera pose to the target camera pose, the target camera pose is the camera a camera pose when acquiring the current image;
  • a relocation module configured to relocate according to the initial pose parameter and the pose change amount to obtain the target pose parameter corresponding to the target camera pose.
  • an electronic device including a memory and a processor
  • At least one instruction is stored in the memory, the at least one instruction being loaded by the processor and executed to implement a relocation method of the camera pose tracking process as described above.
  • a computer readable storage medium having stored therein at least one instruction, the at least one instruction being loaded by a processor and executed to implement the relocation method of the camera pose tracking process as described above.
  • With the above solution, relocation can be implemented in an Anchor-SLAM algorithm that continuously tracks multiple marker images, thereby reducing the possibility that the tracking process is interrupted. Since the relocation process relocates the current image relative to the first marker image, it can also eliminate the cumulative error caused by tracking across multiple marker images, thereby solving the problem in the related art that the SLAM relocation method performs poorly in the AR field.
  • FIG. 1 is a schematic diagram of a scenario of an AR application scenario provided by an exemplary embodiment of the present application
  • FIG. 2 is a schematic diagram of a scenario of an AR application scenario provided by an exemplary embodiment of the present application
  • FIG. 3 is a schematic diagram of a schematic diagram of an Anchor-Switching AR System algorithm provided by an exemplary embodiment of the present application
  • FIG. 4 is a structural block diagram of an electronic device provided by an exemplary embodiment of the present application.
  • FIG. 5 is a flowchart of a method for relocating a camera attitude tracking process according to an exemplary embodiment of the present application
  • FIG. 6 is a flowchart of a method for relocating a camera pose tracking process provided by an exemplary embodiment of the present application
  • FIG. 7 is a schematic diagram of a pyramid image provided by an exemplary embodiment of the present application.
  • FIG. 8 is a flowchart of a method for relocating a camera attitude tracking process according to an exemplary embodiment of the present application.
  • FIG. 9 is a schematic diagram of a principle of a relocation method provided by an exemplary embodiment of the present application.
  • FIG. 10 is a flowchart of a relocation method provided by an exemplary embodiment of the present application.
  • FIG. 11 is a block diagram of a relocation device of a camera pose tracking process provided by an exemplary embodiment of the present application.
  • FIG. 12 is a block diagram of an electronic device provided by an exemplary embodiment of the present application.
  • AR (Augmented Reality): a technique of calculating the camera pose parameters in the real world (or three-dimensional world) in real time while the camera captures images, and adding virtual elements to the captured images according to those camera pose parameters.
  • Virtual elements include, but are not limited to, images, video, and 3D models.
  • the goal of AR technology is to interact with the virtual world on the screen in the real world.
  • the camera pose parameters include a displacement vector used to characterize the displacement distance of the camera in the real world and a rotation matrix used to characterize the angle of rotation of the camera in the real world.
  • the device adds a virtual character image to the image captured by the camera.
  • As the camera moves, the captured image changes and the orientation of the virtual character changes accordingly, simulating the effect that the virtual character remains still in the scene while the camera captures both the scene and the character with changing position and pose, thereby presenting the user with a realistic three-dimensional picture.
  • the camera set by the device in the present application is a monocular camera.
  • Anchor-Switching AR System: an AR system that determines camera pose parameters in a natural scene based on camera pose tracking performed over multiple consecutive marker images (anchors), and superimposes the virtual world on the images captured by the camera according to those camera pose parameters.
  • IMU (Inertial Measurement Unit): typically, an IMU consists of three single-axis accelerometers and three single-axis gyroscopes. The accelerometers detect the acceleration signal of the object along each axis of the three-dimensional coordinate system, from which the displacement vector is calculated; the gyroscopes detect the rotation matrix of the object in the three-dimensional coordinate system.
  • the IMU includes a gyroscope, an accelerometer, and a geomagnetic sensor.
  • Optionally, the three-dimensional coordinate system is established as follows: 1. the X axis is defined by the vector product Y×Z and, at the current position of the device, points east in a direction tangent to the ground; 2. the Y axis, at the current position of the device, points toward the north pole of the Earth's magnetic field in a direction tangent to the ground; 3. the Z axis points toward the sky and is perpendicular to the ground.
  • the present application provides a relocation method suitable for the Anchor-Switching AR System algorithm.
  • The Anchor-Switching AR System algorithm divides the camera's motion process into at least two tracking processes, each tracking process corresponding to its own marker image.
  • During the i-th tracking process, when the tracking effect of the current image relative to the i-th marker image is worse than a preset condition (for example, the number of feature points that can be matched is less than a preset threshold), the previous image of the current image is determined as the (i+1)-th marker image, and the (i+1)-th tracking process is started.
  • i is a positive integer.
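  • For illustration only, the following sketch shows the anchor-switching control flow described above; the helper callbacks track_features and compute_pose are hypothetical stand-ins and not part of this application.

```python
# Minimal sketch of the anchor-switching tracking loop, assuming hypothetical callbacks:
# track_features(anchor, frame) -> list of matched feature point pairs
# compute_pose(anchor_pose, matches) -> camera pose of the frame in the world frame

def anchor_switching_track(frames, track_features, compute_pose, min_matches=50):
    frames = iter(frames)
    anchor = next(frames)            # the first frame becomes the first marker image (anchor)
    anchor_pose = None               # pose accumulated up to the current anchor (None = initial pose)
    prev_frame, prev_pose = anchor, anchor_pose
    poses = []
    for frame in frames:
        matches = track_features(anchor, frame)       # track the current frame against the anchor
        if len(matches) < min_matches:
            # Tracking degraded: the previous frame becomes the (i+1)-th marker image.
            anchor, anchor_pose = prev_frame, prev_pose
            matches = track_features(anchor, frame)   # retry against the new anchor
        pose = compute_pose(anchor_pose, matches)     # camera pose when capturing this frame
        poses.append(pose)
        prev_frame, prev_pose = frame, pose
    return poses
```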
  • FIG. 3 is a schematic diagram showing the principle of an Anchor-Switching AR System algorithm provided by an exemplary embodiment of the present application.
  • In the real world there is an object 320; the device 340 provided with a camera is moved by the user, and during the movement a plurality of frame images 1-6 containing the object 320 are captured.
  • the device determines image 1 as the first marker image (born-anchor or born-image) and records the initial camera pose parameter, which may be acquired by the IMU, and then performs feature point tracking on the image 2 relative to the image 1.
  • The device calculates the camera pose parameter of the camera when capturing image 2 according to the initial camera pose parameter and the feature point tracking result; feature point tracking of image 3 relative to image 1 is then performed, and the camera pose parameter when image 3 is captured is calculated in the same way; feature point tracking of image 4 relative to image 1 is performed, and the camera pose parameter of the camera when image 4 is captured is calculated according to the initial camera pose parameter and the feature point tracking result.
  • Image 5 is then tracked relative to image 1; if the feature point tracking effect is worse than the preset condition (for example, the number of matched feature points is small), image 4 is determined as the second marker image, and image 5 is tracked relative to image 4. The displacement change of the camera between capturing image 4 and image 5 is calculated and combined with the displacement change between capturing image 1 and image 4 and the initial camera pose parameter to calculate the camera pose parameter of the camera when capturing image 5. Image 6 is then tracked relative to image 4, and so on: whenever the feature point tracking effect of the current image deteriorates, the previous frame image of the current image can be determined as a new marker image, and feature point tracking is performed again after switching to the new marker image.
  • Optionally, the preset condition is, for example, that the number of matched feature points is smaller than a preset threshold.
  • Optionally, the feature point tracking may use a visual-odometry-based algorithm, such as optical flow tracking or the direct method.
  • In the above-mentioned Anchor-Switching AR System tracking process, loss (Lost) may occur if the camera undergoes relatively intense motion, faces a strong light source, or faces a white wall during tracking. The loss phenomenon means that not enough feature points can be matched in the current image, resulting in tracking failure.
  • the device includes a processor 420, a memory 440, a camera 460, and an IMU 480.
  • Processor 420 includes one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 420 is configured to execute at least one of instructions, code, code segments, and programs stored in the memory 440.
  • the processor 420 is electrically connected to the memory 440.
  • processor 420 is coupled to memory 440 via a bus.
  • Memory 440 stores one or more instructions, code, code segments, and/or programs. The instructions, code, code segments and/or programs, when executed by processor 420, are used to implement the SLAM relocation method provided in the following embodiments.
  • the processor 420 is also electrically coupled to the camera 460.
  • processor 420 is coupled to camera 460 via a bus.
  • Camera 460 is a sensor device having image acquisition capabilities. Camera 460 may also be referred to as a camera, a photosensitive device, and the like. Camera 460 has the ability to continuously acquire images or acquire images multiple times.
  • camera 460 is located inside or outside the device.
  • the camera 460 is a monocular camera.
  • the processor 420 is also electrically connected to the IMU 480.
  • the IMU 480 is configured to acquire the pose parameters of the camera every predetermined time interval, and record the time stamp of each set of pose parameters at the time of acquisition.
  • The camera's pose parameters include a displacement vector and a rotation matrix. The rotation matrix acquired by the IMU 480 is relatively accurate, while the acquired displacement vector may have a large error due to the actual environment.
  • FIG. 5 a flowchart of a method of relocating a camera pose tracking process provided by an exemplary embodiment of the present application is shown.
  • This embodiment is exemplified by the application of the relocation method to the apparatus shown in FIG. 4 for performing camera attitude tracking of a plurality of marker images in sequence.
  • the method includes:
  • Step 502 Acquire a current image acquired after the i-th mark image in the plurality of mark images.
  • the camera in the device collects a frame image at a preset time interval to form an image sequence.
  • Optionally, during motion (translation and/or rotation), the camera acquires frame images at a preset time interval to form the image sequence.
  • The device determines the first frame image in the image sequence (or a frame image among the first few frames that meets a predetermined condition) as the first marker image, performs feature point tracking on subsequently acquired images relative to the first marker image, and calculates the camera pose parameter according to the feature point tracking result. If the feature point tracking effect of a current frame image is worse than a preset condition, the previous frame image of that current frame image is determined as the second marker image, the subsequently acquired images are tracked relative to the second marker image, the camera pose parameter of the camera is calculated according to the feature point tracking result, and so on.
  • the device can sequentially perform camera attitude tracking of a plurality of marked images in sequence.
  • When the camera is in the i-th tracking process corresponding to the i-th marker image, it captures the current image.
  • the current image is a certain frame image acquired after the i-th mark image, where i is an integer greater than one.
  • the current image refers to an image currently being processed, and is not necessarily an image acquired at the current time.
  • Step 504 Acquire an initial feature point and an initial pose parameter of the first marker image of the plurality of marker images when the current image meets the relocation condition, where the initial pose parameter is used to indicate the camera pose when the camera captures the first marker image.
  • the device determines if the current image meets the relocation criteria.
  • The relocation condition is used to indicate that the tracking process of the current image relative to the i-th marker image has failed, or to indicate that the accumulated error in the historical tracking process is higher than a preset condition.
  • In an optional embodiment, the device performs tracking of the current image relative to the i-th marker image; if there is no feature point in the current image that matches the i-th marker image, or the number of feature points in the current image matching the i-th marker image is less than a first number, it is determined that the tracking process of the current image relative to the i-th marker image has failed, and the relocation condition is met.
  • In another optional embodiment, when the device determines that the number of frames between the current image and the image of the last relocation is greater than a second number, or determines that the number of marker images between the i-th marker image and the first marker image is greater than a third number, it is determined that the accumulated error in the historical tracking process is higher than the preset condition.
  • This embodiment does not limit the specific condition content of the relocation condition.
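  • For illustration, a minimal sketch of such a relocation-condition check is given below; the concrete thresholds (first_number, second_number, third_number) are assumptions, since this embodiment does not fix them.

```python
# Illustrative relocation-condition check. Thresholds are hypothetical example values.

def meets_relocation_condition(matches_with_anchor, frames_since_last_relocation,
                               anchors_since_first, first_number=20,
                               second_number=300, third_number=5):
    # Condition 1: tracking of the current image relative to the i-th marker image failed.
    tracking_failed = len(matches_with_anchor) < first_number
    # Condition 2: the accumulated error of the historical tracking process is assumed too large.
    too_much_drift = (frames_since_last_relocation > second_number
                      or anchors_since_first > third_number)
    return tracking_failed or too_much_drift
```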
  • the device When the current image meets the relocation condition, the device attempts to track the current image relative to the first marker image. At this time, the device acquires an initial feature point in the first marked image of the cache and an initial pose parameter, which is used to indicate a camera pose when the camera captures the first marker image.
  • The initial feature points are feature points extracted from the first marker image; there may be a plurality of them, for example 10 to 500.
  • This initial pose parameter is used to indicate the camera pose when the camera captures the first marker image.
  • the initial pose parameters include a rotation matrix R and a displacement vector T, and the initial pose parameters can be acquired by the IMU.
  • Step 506 Perform feature point tracking on the current image with respect to the first marker image to obtain a target feature point that matches the initial feature point.
  • The feature point tracking can use a visual-odometry-based tracking algorithm, which is not limited in this application.
  • In one embodiment, feature point tracking uses the KLT (Kanade-Lucas-Tomasi) optical flow tracking algorithm; in another embodiment, feature point tracking is performed based on ORB (Oriented FAST and Rotated BRIEF) feature descriptors.
  • the specific algorithm for feature point tracking is not limited in this application, and the feature point tracking process may adopt a feature point method or a direct method.
  • In an embodiment, the device performs feature point extraction on the first marker image to obtain N initial feature points; the device also performs feature point extraction on the current image to obtain M candidate feature points; the M candidate feature points are then matched against the N initial feature points one by one to determine at least one set of matched feature point pairs.
  • Each set of matching feature point pairs includes: an initial feature point and a target feature point.
  • the initial feature point is a feature point on the first marker image, and the target feature point is a candidate feature point having the highest matching degree with the initial feature point on the current image.
  • the number of initial feature points is greater than or equal to the number of target feature points.
  • For example, the number of initial feature points is 450, and 320 pairs of matched feature points are obtained.
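  • By way of illustration, the following sketch performs this extraction-and-matching step with OpenCV's ORB implementation; neither OpenCV nor these parameters are prescribed by this application.

```python
import cv2

# Sketch: extract ORB features from the marker image and the current image and match them
# by Hamming distance. Parameter values are illustrative assumptions.

def match_against_marker(marker_img, current_img, max_features=500):
    orb = cv2.ORB_create(nfeatures=max_features)
    kp_init, des_init = orb.detectAndCompute(marker_img, None)    # N initial feature points
    kp_cand, des_cand = orb.detectAndCompute(current_img, None)   # M candidate feature points
    matcher = cv2.BFMatcher(cv2.NORM_HAMMING, crossCheck=True)    # Hamming distance for binary descriptors
    matches = matcher.match(des_init, des_cand)
    # Each match pairs an initial feature point with its best-matching target feature point.
    return [(kp_init[m.queryIdx].pt, kp_cand[m.trainIdx].pt) for m in matches]
```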
  • Step 508 Calculate, according to the initial feature point and the target feature point, a pose change amount when the camera changes from the first camera pose to the target camera pose, where the target camera pose is a camera pose when the camera captures the current image;
  • In an embodiment, the device calculates a homography matrix between the two frame images according to the initial feature points and the target feature points, and decomposes the homography matrix to obtain the pose change amounts R_relocalize and T_relocalize when the camera changes from the first camera pose to the target camera pose.
  • the homography matrix describes the mapping relationship between two planes. If the feature points in the natural scene (real environment) fall on the same physical plane, the motion estimation can be performed through the homography matrix.
  • In an embodiment, the device estimates the homography matrix with RANSAC and decomposes it to obtain a rotation matrix R_relocalize and a translation vector T_relocalize.
  • R_relocalize is the rotation matrix when the camera changes from the first camera pose to the target camera pose, and T_relocalize is the displacement vector when the camera changes from the first camera pose to the target camera pose.
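  • As an illustration, the following sketch estimates the homography with RANSAC and decomposes it using OpenCV; the handling of the multiple decomposition candidates is an assumption and not prescribed by this application.

```python
import numpy as np
import cv2

# Sketch of estimating and decomposing the homography between matched point sets.
# init_pts / target_pts are 2D points; K is the camera intrinsic matrix.

def estimate_pose_change(init_pts, target_pts, K):
    src = np.asarray(init_pts, dtype=np.float64).reshape(-1, 1, 2)   # points on the first marker image
    dst = np.asarray(target_pts, dtype=np.float64).reshape(-1, 1, 2) # matched points on the current image
    H, inlier_mask = cv2.findHomography(src, dst, cv2.RANSAC, 3.0)   # RANSAC rejects outlier matches
    # Decomposition against the intrinsics K yields up to four (R, t, n) candidates;
    # a full system would select the physically valid one (e.g. positive depths for inliers).
    _, rotations, translations, normals = cv2.decomposeHomographyMat(H, K)
    return H, list(zip(rotations, translations, normals))
```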
  • Step 510 Relocate according to the initial pose parameter and the pose change amount to obtain a target pose parameter corresponding to the target camera pose.
  • The device transforms the initial pose parameter by the pose change amount to obtain, through relocation, the target pose parameter corresponding to the target camera pose, thereby calculating the camera pose when the camera captures the current image.
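  • As an illustration, the following sketch composes the target pose parameter from the initial pose parameter and the pose change amount, assuming a world-to-camera convention; the application does not state the convention, so the composition order shown is an assumption.

```python
import numpy as np

# Sketch: combine the initial pose of the first marker image with the relative change
# obtained by relocation, assuming x_cam = R @ x_world + T for both poses.

def compose_pose(R_init, T_init, R_relocalize, T_relocalize):
    R_target = R_relocalize @ R_init
    T_target = R_relocalize @ T_init + T_relocalize
    return R_target, T_target
```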
  • the terminal determines the current image as the i+1th mark image.
  • Optionally, the terminal continues feature point tracking based on the (i+1)-th marker image.
  • the terminal may continue to generate the i+2th marker image, the i+3th marker image, the i+4th marker image, and the like according to the subsequent feature point tracking situation, and so on.
  • In summary, by relocating the current image relative to the first marker image when the current image meets the relocation condition, the relocation method provided by this embodiment implements relocation in an Anchor-SLAM algorithm that tracks multiple consecutive marker images, thereby reducing the possibility that the tracking process is interrupted and solving the problem in the related art that the SLAM relocation method performs poorly in the AR field.
  • In addition, since the relocation process relocates the current image relative to the first marker image, and the first marker image can be considered to have no cumulative error, this embodiment can also eliminate the cumulative error produced by the tracking process over multiple marker images.
  • Since the first marker image is usually the first frame image captured by the camera and is also the reference image used in the relocation process, the first marker image needs to be preprocessed in order to improve the success rate of feature point matching. As shown in FIG. 6, before step 502, the method further includes the following steps:
  • Step 501a recording an initial pose parameter corresponding to the first marker image
  • the IMU is set in the device, and the camera's pose parameters and time stamps are collected periodically by the IMU.
  • the pose parameters include a rotation matrix and a displacement vector, and the timestamp is used to represent the acquisition time of the pose parameter.
  • the rotation matrix acquired by the IMU is relatively accurate.
  • the shooting time of each frame of image is recorded at the same time.
  • the device queries and records the initial pose parameters of the camera when taking the first marker image based on the shooting time of the first marker image.
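  • As an illustration, the following sketch looks up the IMU pose sample closest to an image's shooting time; the application only states that the initial pose parameter is queried by shooting time, so the lookup itself is an assumption.

```python
import bisect

# Sketch: imu_timestamps is a time-ordered list of sample times and imu_poses the parallel
# list of (R, T) pose parameters recorded by the IMU.

def lookup_pose(imu_timestamps, imu_poses, shooting_time):
    if not imu_timestamps:
        raise ValueError("no IMU samples recorded")
    i = bisect.bisect_left(imu_timestamps, shooting_time)
    candidates = [j for j in (i - 1, i) if 0 <= j < len(imu_timestamps)]
    # Return the sample whose timestamp is closest to the image's shooting time.
    best = min(candidates, key=lambda j: abs(imu_timestamps[j] - shooting_time))
    return imu_poses[best]
```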
  • Step 501b Obtain n pyramid images with different scales corresponding to the first marker image, where n is an integer greater than 1.
  • the device also extracts the initial feature points in the first marker image.
  • The feature extraction algorithm used by the device to extract feature points may be the FAST (Features from Accelerated Segment Test) detection algorithm, the Shi-Tomasi corner detection algorithm, the Harris corner detection algorithm, the SIFT (Scale-Invariant Feature Transform) algorithm, or the ORB (Oriented FAST and Rotated BRIEF) algorithm.
  • An ORB feature point includes a FAST corner point (Key-point) and a BRIEF (Binary Robust Independent Elementary Features) descriptor.
  • the FAST corner point refers to the location of the ORB feature point in the image.
  • the FAST corner point mainly detects the obvious change of the local pixel gray scale, and is known for its fast speed.
  • The idea of the FAST corner is: if a pixel differs greatly from the pixels in its neighborhood (it is much brighter or much darker), the pixel may be a corner.
  • The BRIEF descriptor is a binary description vector that describes the information of the pixels around the keypoint in a hand-designed way.
  • the description vector of the BRIEF descriptor consists of a number of 0's and 1's, where 0's and 1's encode the size relationship of two pixels near the FAST corner.
  • Since the ORB feature is fast to compute, it is suitable for implementation on mobile devices. However, the ORB feature descriptor has no scale invariance, while the scale change when a user holds the camera to capture images is very obvious: the user may well observe the scene of the first marker image from a very far or very close distance. In an alternative implementation, the device therefore generates n pyramid images of different scales for the first marker image.
  • A pyramid image is an image obtained by scaling the first marker image according to a preset ratio. Taking a pyramid of four layers as an example, the first marker image is scaled by ratios of 1.0, 0.8, 0.6 and 0.4, yielding four images of different scales.
  • Step 501c extracting initial feature points for each pyramid image, and recording two-dimensional coordinates of the initial feature points when the pyramid image is scaled to the original size.
  • the device extracts feature points for each layer of pyramid image and calculates an ORB feature descriptor. For the feature points extracted on the pyramid image that is not the original scale (1.0), after the pyramid image is scaled to the original scale, the two-dimensional coordinates of each feature point on the pyramid image of the original scale are recorded.
  • the feature points and two-dimensional coordinates on these pyramid images can be called layer-keypoint.
  • feature points on each layer of pyramid images have a maximum of 500 feature points.
  • the feature points on each pyramid image are determined as initial feature points.
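  • As an illustration, the following sketch builds the scale pyramid and records the layer-keypoints with their coordinates rescaled to the original image size; the OpenCV calls are assumptions, while the scaling ratios and per-level feature budget follow the examples above.

```python
import cv2

# Sketch: extract ORB features on each pyramid level of the first marker image and record
# every keypoint's coordinate as it would appear on the original-scale image.

def build_layer_keypoints(marker_img, scales=(1.0, 0.8, 0.6, 0.4), per_level=500):
    orb = cv2.ORB_create(nfeatures=per_level)
    layer_keypoints = []
    for s in scales:
        h, w = marker_img.shape[:2]
        level = cv2.resize(marker_img, (int(w * s), int(h * s)))
        keypoints, descriptors = orb.detectAndCompute(level, None)
        if descriptors is None:
            continue
        for kp, desc in zip(keypoints, descriptors):
            x, y = kp.pt
            # Rescale the coordinate back to the original image size.
            layer_keypoints.append(((x / s, y / s), desc))
    return layer_keypoints
```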
  • If the current image has a large scale and its high-frequency details are clearly visible, the current image will have a higher matching score with a lower pyramid layer (such as the original image); if the current image has a small scale and only low-frequency information is visible, the current image will have a higher matching score with a higher pyramid layer.
  • As shown in FIG. 7, the first marker image has three pyramid images 71, 72 and 73. Pyramid image 71 is located in the first layer of the pyramid and has the smallest scale of the three images; pyramid image 72 is located in the second layer and has an intermediate scale; pyramid image 73 is located in the third layer and has the largest scale. When the current image 74 is tracked for feature points relative to the first marker image, the device can match the current image 74 against the feature points extracted from each of the three pyramid images. Since the scales of pyramid image 73 and the current image 74 are closer, the feature points extracted from pyramid image 73 obtain a higher matching score.
  • In this way, pyramid images of multiple scales are prepared for the first marker image, and the initial feature points extracted on each pyramid image are used together in the subsequent feature point tracking process; matching feature points on multiple scales jointly is equivalent to automatically adapting the scale of the first marker image, thereby achieving scale invariance.
  • The following describes the feature point tracking process of step 506.
  • In an embodiment, the device extracts feature points, which may be ORB feature descriptors, from the current image. Unlike the multi-layer feature point extraction performed on the first marker image, the device can extract a single layer of feature points (for example, up to 500) from the current image, and match the layer-keypoints extracted on the first marker image against the feature points of the current image using ORB feature descriptors.
  • step 506 optionally includes the following sub-steps 506a through 506c:
  • Step 506a extracting candidate feature points for the current image
  • the feature extraction algorithm used by the device to extract the feature points may adopt at least one of a FAST detection algorithm, a Shi-Tomasi corner detection algorithm, a Harris corner detection algorithm, a SIFT algorithm, and an ORB algorithm.
  • This embodiment exemplifies the use of the ORB algorithm to extract the ORB feature descriptor in the current image.
  • Step 506b obtaining, by the IMU, a reference pose change amount when the camera captures the current image
  • the IMU is set in the device, and the IMU can obtain the amount of reference pose change when the camera captures the current image.
  • The reference pose change amount is used to characterize the pose change of the camera from capturing the first marker image to capturing the current image, and includes a rotation matrix and a displacement vector. Due to the physical characteristics of the IMU, the rotation matrix acquired by the IMU is relatively accurate; the displacement vector acquired by the IMU has a certain accumulated error, but it does not differ too much from the real result and is still of guiding value.
  • Step 506c Perform a rotational translation projection on the initial feature point in the first marker image according to the reference pose change amount, to obtain a projection feature point corresponding to the initial feature point in the current image;
  • this step includes the following sub-steps:
  • 1. The device pre-extracts and caches the two-dimensional coordinates of the initial feature points in the first marker image; the two-dimensional coordinates are represented homogeneously.
  • 2. The device back-projects the two-dimensional coordinates of the initial feature points into three-dimensional space using the camera's intrinsic parameters f_x, f_y, c_x and c_y, obtaining the first three-dimensional coordinates X_born of the initial feature points in three-dimensional space. The two-dimensional coordinate x_born of an initial feature point is the homogeneous representation of a layer-keypoint on the first marker image, and the three-dimensional point X_born is a non-homogeneous representation; the initial depth d of the first marker image is assumed to be 1.
  • 3. The first three-dimensional coordinate X_born is rotated and translated in three dimensions to obtain the second three-dimensional coordinate X_current corresponding to the initial feature point in the coordinate system of the current image, where R is the rotation matrix and T the displacement vector in the reference pose change amount collected by the IMU.
  • 4. The device projects the second three-dimensional coordinate X_current onto the current image using the camera's intrinsic parameters f_x, f_y, c_x and c_y, obtaining the two-dimensional coordinate x_current of the projected feature point in the current image.
  • the positions of these projection feature points in the current image are used to predict the position of the target feature points, and usually the positions of these projected feature points are the same or close to the position of the target feature points.
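  • As an illustration, the following sketch reproduces the 2D-3D-2D projection of the sub-steps above using the standard pinhole model implied by f_x, f_y, c_x, c_y, an assumed depth of 1, and the IMU-provided R and T; the exact formulas of the original are not reproduced in this text, so this is an assumed reconstruction.

```python
import numpy as np

# Sketch of the 2D-3D-2D projection: back-project with depth 1, rotate-translate with the
# IMU-provided (R, T), and re-project into the current image with the intrinsics K.

def project_initial_points(init_pts_2d, K, R, T, depth=1.0):
    K = np.asarray(K, dtype=np.float64)
    K_inv = np.linalg.inv(K)
    projected = []
    for (u, v) in init_pts_2d:
        X_born = depth * (K_inv @ np.array([u, v, 1.0]))   # back-projection: first 3D coordinate X_born
        X_current = R @ X_born + T                         # rotation-translation into the current frame
        x_current = K @ X_current                          # projection back into the current image
        projected.append((x_current[0] / x_current[2], x_current[1] / x_current[2]))
    return projected
```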
  • Step 506d Search for a target feature point that matches the initial feature point in a first range centered on the projected feature point.
  • In an embodiment, the device extracts multiple ORB feature descriptors from the current image. For each projected feature point corresponding to an initial feature point, the candidate ORB feature descriptors located in the first range centered on that projected feature point are selected, and the ORB feature descriptor of the initial feature point is matched against these candidates; when the matching succeeds, the target feature point matching the initial feature point is considered to have been found.
  • the first range is a rectangular box or a square box.
  • the embodiment of the present application does not limit the style of the first range, and the first range may also be other shapes such as a diamond frame, a parallelogram frame, a circular frame, and the like.
  • Step 506e When the number of the target feature points searched is less than the preset threshold, the target feature points matching the initial feature points are re-searched in the second range centered on the projected feature points.
  • the second range is greater than the first range.
  • each of the projected feature points corresponds to a respective initial feature point
  • each of the target feature points corresponds to a respective initial feature point.
  • the total number of projected feature points is less than or equal to the total number of initial feature points
  • the total number of target feature points is less than or equal to the total number of initial feature points.
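  • As an illustration, the following sketch matches each initial feature point only within a window around its projected feature point and re-searches with a wider window when too few target feature points are found; the window sizes and thresholds are assumptions, not values given by this application.

```python
import cv2

# Sketch of steps 506d/506e: windowed ORB matching around each projected feature point,
# with a fallback to a wider window when the match count is below a threshold.

def match_in_window(proj_pts, init_descs, cand_kps, cand_descs, radius, max_hamming=64):
    matches = []
    for i, (px, py) in enumerate(proj_pts):
        best, best_dist = None, max_hamming
        for kp, desc in zip(cand_kps, cand_descs):
            if abs(kp.pt[0] - px) > radius or abs(kp.pt[1] - py) > radius:
                continue                                   # candidate lies outside the search window
            dist = cv2.norm(init_descs[i], desc, cv2.NORM_HAMMING)
            if dist < best_dist:
                best, best_dist = kp, dist
        if best is not None:
            matches.append((i, best))                      # (initial feature point index, target keypoint)
    return matches

def search_target_points(proj_pts, init_descs, cand_kps, cand_descs,
                         first_range=20, second_range=40, min_matches=50):
    matches = match_in_window(proj_pts, init_descs, cand_kps, cand_descs, first_range)
    if len(matches) < min_matches:
        # Too few target feature points: re-search in the larger second range.
        matches = match_in_window(proj_pts, init_descs, cand_kps, cand_descs, second_range)
    return matches
```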
  • In summary, the relocation method provided by this embodiment uses the reference pose change amount acquired by the IMU (which contains some error) to project the initial feature points into the current image by rotation and translation, obtaining the projected feature points, and then matches within a small range around each projected feature point to obtain the target feature points.
  • On the one hand, because the search range becomes smaller, the number of candidate ORB feature descriptors for each initial feature point is reduced, so the number of matching computations is reduced and the matching process is accelerated; on the other hand, because the projected feature points are obtained through a 2D-3D-2D conversion process, which is equivalent to adding a three-dimensional constraint, some interfering matches that have high feature similarity but do not satisfy geometric consistency can be eliminated.
  • The following describes the calculation of the camera pose change amount in step 508.
  • The initial feature points and the target feature points are input into a RANSAC-based algorithm, and the homography matrix of the current image relative to the first marker image is calculated; the homography matrix is then decomposed by a decomposition algorithm into the rotation matrix R_relocalize and the translation vector T_relocalize.
  • Assume that the image coordinate system of the first marker image is F, the image coordinate system of the i-th marker image is A, the image coordinate system of the previous frame image is L, and the image coordinate system of the current image is C. In the ordinary tracking chain, the homography matrix between the first marker image and the i-th marker image is H_af, and H_af is decomposed to obtain a first rotation matrix R_old and a first translation vector T_old; the homography matrix between the i-th marker image and the previous frame image is H_la, the homography matrix between the previous frame image and the current image is H_cl, and H_la and H_cl are decomposed in turn to obtain a second rotation matrix R_ca and a second translation vector T_ca.
  • In an ideal Anchor-SLAM process, the feature points would be tracked directly from the first marker image to the current image, with no lost marker images and no accumulated error caused by factors such as motion blur; the result would likewise be obtained from the matches between the first marker image and the current image and computed by decomposing the homography. Therefore, the relocation in the embodiment of the present application can indeed eliminate the accumulated error, and its result is equivalent to the best case of Anchor-SLAM.
  • FIG. 10 is a flowchart of a method for relocating a camera pose tracking process provided by another exemplary embodiment of the present application. This embodiment is exemplified by applying the method to the apparatus shown in FIG. 4. The method includes:
  • Step 1001 Record an initial pose parameter corresponding to the first marker image
  • the IMU is set in the device, and the camera's pose parameters and time stamps are collected periodically by the IMU.
  • the pose parameters include a rotation matrix and a displacement vector, and the timestamp is used to represent the acquisition time of the pose parameter.
  • the rotation matrix acquired by the IMU is relatively accurate.
  • the shooting time of each frame of image is recorded at the same time.
  • the device queries and records the initial pose parameters of the camera when taking the first marker image based on the shooting time of the first marker image.
  • Optionally, the first marker image is the first frame image collected by the device, or the first marker image is a frame image, among the first few frames collected by the device, in which the number of feature points is greater than a preset threshold.
  • Step 1002 Obtain n pyramid images with different scales corresponding to the first marker image, where n is an integer greater than 1.
  • the device extracts the initial feature points in the first marker image.
  • the device may extract an ORB feature point in the first marker image as an initial feature point.
  • the device generates n pyramid images of different scales for the first marker image, n being a positive integer.
  • A pyramid image is an image obtained by scaling the first marker image according to a preset ratio. Taking a pyramid of four layers as an example, the first marker image is scaled by ratios of 1.0, 0.8, 0.6 and 0.4, yielding four images of different scales.
  • Step 1003 Extract initial feature points for each pyramid image, and record two-dimensional coordinates of the initial feature points when the pyramid image is scaled to the original size.
  • the device extracts feature points for each layer of pyramid image and calculates an ORB feature descriptor. For the feature points extracted on the pyramid image that is not the original scale (1.0), after the pyramid image is scaled to the original scale, the two-dimensional coordinates of each feature point on the pyramid image of the original scale are recorded.
  • the feature points and two-dimensional coordinates on these pyramid images can be called layer-keypoint.
  • feature points on each layer of pyramid images have a maximum of 500 feature points.
  • the feature points on each pyramid image are determined as initial feature points.
  • Step 1004 Acquire a current image acquired after the i-th mark image of the plurality of mark images
  • the camera in the device collects a frame image at a preset time interval to form an image sequence.
  • Optionally, during motion (translation and/or rotation), the camera acquires frame images at a preset time interval to form the image sequence.
  • The device determines the first frame image in the image sequence (or a frame image among the first few frames that meets a predetermined condition) as the first marker image, performs feature point tracking on subsequently acquired images relative to the first marker image, and calculates the camera pose parameter according to the feature point tracking result. If the feature point tracking effect of a current frame image is worse than a preset condition, the previous frame image of that current frame image is determined as the second marker image, the subsequently acquired images are tracked relative to the second marker image, the camera pose parameter of the camera is calculated according to the feature point tracking result, and so on.
  • the device can sequentially perform camera attitude tracking of a plurality of marked images in sequence.
  • When the camera is in the i-th tracking process corresponding to the i-th marker image, it captures the current image.
  • the current image is a certain frame image acquired after the i-th mark image, where i is an integer greater than one.
  • Step 1005 Acquire an initial feature point and an initial pose parameter of the first marker image of the plurality of marker images when the current image meets the relocation condition, where the initial pose parameter is used to indicate the camera pose when the camera captures the first marker image.
  • the device determines if the current image meets the relocation criteria.
  • The relocation condition is used to indicate that the tracking process of the current image relative to the i-th marker image has failed, or to indicate that the accumulated error in the historical tracking process is higher than a preset condition.
  • In an optional embodiment, the device performs tracking of the current image relative to the i-th marker image; if there is no feature point in the current image that matches the i-th marker image, or the number of feature points in the current image matching the i-th marker image is less than a first number, it is determined that the tracking process of the current image relative to the i-th marker image has failed, and the relocation condition is met.
  • In another optional embodiment, when the device determines that the number of frames between the current image and the image of the last relocation is greater than a second number, or determines that the number of marker images between the i-th marker image and the first marker image is greater than a third number, it is determined that the accumulated error in the historical tracking process is higher than the preset condition.
  • This embodiment does not limit the specific condition content of the relocation condition.
  • the device When the current image meets the relocation condition, the device attempts to track the current image relative to the first marker image. At this time, the device acquires an initial feature point in the first marked image of the cache and an initial pose parameter, which is used to indicate a camera pose when the camera captures the first marker image.
  • Step 1006 Extract candidate feature points for the current image.
  • the feature extraction algorithm used by the device to extract the feature points may adopt at least one of a FAST detection algorithm, a Shi-Tomasi corner detection algorithm, a Harris corner detection algorithm, a SIFT algorithm, and an ORB algorithm.
  • This embodiment exemplifies the use of the ORB algorithm to extract an ORB feature descriptor in a current image as a candidate feature point.
  • Step 1007 Acquire, by the IMU, a reference pose change amount when the camera captures the current image
  • the IMU is set in the device, and the IMU can obtain the amount of reference pose change when the camera captures the current image.
  • The reference pose change amount is used to characterize the pose change of the camera from capturing the first marker image to capturing the current image, and includes a rotation matrix and a displacement vector. Due to the physical characteristics of the IMU, the rotation matrix acquired by the IMU is relatively accurate; the displacement vector acquired by the IMU has a certain accumulated error, but it does not differ too much from the real result and is still of guiding value.
  • Step 1008 Perform a rotational translation projection of the initial feature point in the first marker image according to the reference pose change amount, to obtain a projection feature point corresponding to the initial feature point in the current image;
  • this step includes the following sub-steps:
  • 1. The device pre-extracts and caches the two-dimensional coordinates of the initial feature points in the first marker image; the two-dimensional coordinates are represented homogeneously.
  • 2. The device back-projects the two-dimensional coordinates of the initial feature points into three-dimensional space using the camera's intrinsic parameters f_x, f_y, c_x and c_y, obtaining the first three-dimensional coordinates X_born of the initial feature points in three-dimensional space. The two-dimensional coordinate x_born of an initial feature point is the homogeneous representation of a layer-keypoint on the first marker image, and the three-dimensional point X_born is a non-homogeneous representation; the initial depth d of the first marker image is assumed to be 1.
  • 3. The first three-dimensional coordinate X_born is rotated and translated in three dimensions to obtain the second three-dimensional coordinate X_current corresponding to the initial feature point in the coordinate system of the current image, where R is the rotation matrix and T the displacement vector in the reference pose change amount collected by the IMU.
  • 4. The device projects the second three-dimensional coordinate X_current onto the current image using the camera's intrinsic parameters f_x, f_y, c_x and c_y, obtaining the two-dimensional coordinate x_current of the projected feature point in the current image.
  • the positions of these projection feature points in the current image are used to predict the position of the target feature points, and usually the positions of these projected feature points are the same or close to the position of the target feature points.
  • Step 1009 Search for a target feature point that matches the initial feature point in a first range centered on the projected feature point.
  • In an embodiment, the device extracts multiple ORB feature descriptors from the current image. For each projected feature point corresponding to an initial feature point, the candidate ORB feature descriptors located in the first range centered on that projected feature point are selected, and the ORB feature descriptor of the initial feature point is matched against these candidates; when the matching succeeds, the target feature point matching the initial feature point is considered to have been found.
  • the first range is a rectangular box or a square box.
  • the embodiment of the present application does not limit the style of the first range, and the first range may also be other shapes such as a diamond frame, a parallelogram frame, a circular frame, and the like.
  • step 1010 when the number of target feature points searched is less than a preset threshold, the target feature points matching the initial feature points are re-searched in the second range centered on the projected feature points.
  • the second range is greater than the first range.
  • each of the projected feature points corresponds to a respective initial feature point
  • each of the target feature points corresponds to a respective initial feature point.
  • the total number of projected feature points is less than or equal to the total number of initial feature points
  • the total number of target feature points is less than or equal to the total number of initial feature points.
  • Step 1011 Calculate, according to the initial feature point and the target feature point, a pose change amount when the camera changes from the first camera pose to the target camera pose, where the target camera pose is a camera pose when the camera captures the current image;
  • In an embodiment, the device calculates a homography matrix between the two frame images according to the initial feature points and the target feature points, and decomposes the homography matrix to obtain the pose change amounts R_relocalize and T_relocalize when the camera changes from the first camera pose to the target camera pose.
  • the homography matrix describes the mapping relationship between two planes. If the feature points in the natural scene (real environment) fall on the same physical plane, the motion estimation can be performed through the homography matrix.
  • In an embodiment, the device estimates the homography matrix with RANSAC and decomposes it to obtain a rotation matrix R_relocalize and a translation vector T_relocalize.
  • R_relocalize is the rotation matrix when the camera changes from the first camera pose to the target camera pose, and T_relocalize is the displacement vector when the camera changes from the first camera pose to the target camera pose.
  • Step 1012 Relocate according to the initial pose parameter and the pose change amount to obtain the target pose parameter corresponding to the target camera pose.
  • The device transforms the initial pose parameter by the pose change amount to obtain, through relocation, the target pose parameter corresponding to the target camera pose, thereby calculating the camera pose when the camera captures the current image.
  • the terminal determines the current image as the i+1th marker image.
  • Optionally, the terminal continues feature point tracking based on the (i+1)-th marker image.
  • the terminal may continue to generate the i+2th marker image, the i+3th marker image, the i+4th marker image, and the like according to the subsequent feature point tracking situation, and so on.
  • In summary, by relocating the current image relative to the first marker image when the current image meets the relocation condition, the relocation method provided by this embodiment implements relocation in an Anchor-SLAM algorithm that tracks multiple consecutive marker images, thereby reducing the possibility that the tracking process is interrupted and solving the problem in the related art that the SLAM relocation method performs poorly in the AR field.
  • In addition, since the relocation process relocates the current image relative to the first marker image, and the first marker image can be considered to have no cumulative error, this embodiment can also eliminate the cumulative error produced by the tracking process over multiple marker images.
  • Furthermore, the relocation method provided by this embodiment uses the reference pose change amount acquired by the IMU (which contains some error) to project the initial feature points into the current image by rotation and translation, obtaining the projected feature points, and then matches within a small range around each projected feature point to obtain the target feature points.
  • On the one hand, because the search range becomes smaller, the number of candidate ORB feature descriptors for each initial feature point is reduced, so the number of matching computations is reduced and the matching process is accelerated; on the other hand, because the projected feature points are obtained through a 2D-3D-2D conversion process, which is equivalent to adding a three-dimensional constraint, some interfering matches that have high feature similarity but do not satisfy geometric consistency can be eliminated.
  • The relocation method of the camera pose tracking process described above can be used in an AR program, by which the camera pose on the terminal can be tracked in real time according to real-world scene information, and the display position of AR elements in the AR application can be adjusted and modified according to the tracking result.
  • Taking the AR program running on the mobile phone shown in FIG. 1 or FIG. 2 as an example, when a still cartoon character standing on a book needs to be displayed, no matter how the user moves the mobile phone, only the display position of the cartoon character needs to be modified according to the change of the camera pose of the phone, and the standing position of the cartoon character on the book remains unchanged.
  • FIG. 11 shows a block diagram of a relocation device of a camera pose tracking process provided by an exemplary embodiment of the present application.
  • the relocation device can be implemented as all or part of an electronic device or a mobile terminal by software, hardware, or a combination of both.
  • the device is provided with a camera, which may be a monocular camera.
  • the device includes an image acquisition module 1110, an information acquisition module 1120, a feature point tracking module 1130, a change amount calculation module 1140, and a relocation module 1150.
  • the image acquisition module 1110 is configured to acquire a current image acquired after the i-th mark image of the plurality of mark images, where i is an integer greater than one;
  • The information acquiring module 1120 is configured to acquire, when the current image meets the relocation condition, an initial feature point and an initial pose parameter of the first marker image of the plurality of marker images, where the initial pose parameter is used to indicate the camera pose when the camera captures the first marker image;
  • the feature point tracking module 1130 is configured to perform feature point tracking on the current image with respect to the first marker image to obtain a target feature point that matches the initial feature point;
  • a change amount calculation module 1140 configured to calculate, according to the initial feature point and the target feature point, a pose change amount when the camera changes from the first camera pose to a target camera pose, where the target camera pose is a camera pose of the camera when acquiring the current image;
  • the relocation module 1150 is configured to reposition the target pose parameter corresponding to the target camera pose according to the initial pose parameter and the pose change amount.
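  • a minimal sketch of this relocation step is shown below; the exact composition convention is an assumption (the embodiment only states that the initial pose parameter is transformed by the pose change amount), and the function name relocalize_pose is invented for the example.

```python
import numpy as np

def relocalize_pose(R_first, T_first, R_rel, T_rel):
    """Compose the cached pose of the first marker image with the pose change
    (R_rel, T_rel) recovered between the first marker image and the current image.

    Assumed convention: a pose maps world points into camera coordinates,
    x_cam = R @ x_world + T, so chaining two changes multiplies the rotations
    and rotates-then-offsets the translation."""
    R_target = R_rel @ R_first
    T_target = R_rel @ T_first + T_rel
    return R_target, T_target
```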
  • in an optional embodiment, an inertial measurement unit (IMU) is further disposed in the device; the feature point tracking module 1130 includes:
  • an extraction sub-module configured to extract candidate feature points from the current image;
  • a collection sub-module configured to acquire, by the IMU, a reference pose change amount when the camera captures the current image;
  • a projection sub-module configured to perform a rotational-translational projection of the initial feature point in the first marker image according to the reference pose change amount, to obtain a projected feature point corresponding to the initial feature point in the current image;
  • the projection sub-module is configured to: acquire the two-dimensional coordinates of the initial feature point in the first marker image; back-project the two-dimensional coordinates of the initial feature point to obtain a first three-dimensional coordinate X_born of the initial feature point in three-dimensional space; perform a three-dimensional rotational translation on the first three-dimensional coordinate X_born by the following formula to obtain a second three-dimensional coordinate X_current corresponding to the initial feature point on the current image; and project the second three-dimensional coordinate X_current onto the current image to obtain the two-dimensional coordinates of the projected feature point in the current image;
  • X_current = R * X_born + T;
  • where R is the rotation matrix in the reference pose change amount and T is the displacement parameter in the reference pose change amount;
  • a searching sub-module configured to search, within a first range centered on the projected feature point, the candidate feature points for a target feature point that matches the initial feature point;
  • the searching sub-module is further configured to: when the number of target feature points found is less than a preset threshold, re-search, within a second range centered on the projected feature point, for target feature points that match the initial feature points;
  • where the second range is greater than the first range.
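  • these sub-modules can be read as the 2D-3D-2D pipeline sketched below, under the assumptions stated elsewhere in this document (an initial depth d = 1 for the first marker image and pinhole intrinsics f_x, f_y, c_x, c_y); the function names and the concrete window sizes are illustrative, not prescribed by the embodiment.

```python
import numpy as np

def project_with_imu_prior(pts_born, R, T, fx, fy, cx, cy, d=1.0):
    """2D -> 3D -> 2D projection of the first marker image's initial feature
    points into the current image, using the IMU's reference pose change (R, T).

    pts_born : (N, 2) array of pixel coordinates of the initial feature points."""
    # Back-project with the assumed depth d = 1.
    X_born = np.stack([(pts_born[:, 0] - cx) / fx * d,
                       (pts_born[:, 1] - cy) / fy * d,
                       np.full(len(pts_born), d)], axis=1)
    # Rigid transform into the current camera frame: X_current = R X_born + T.
    X_cur = X_born @ R.T + T
    # Project back to pixels to obtain the projected feature points.
    u = fx * X_cur[:, 0] / X_cur[:, 2] + cx
    v = fy * X_cur[:, 1] / X_cur[:, 2] + cy
    return np.stack([u, v], axis=1)

def search_with_fallback(match_fn, proj_pts, first_range=20, second_range=40,
                         min_targets=50):
    """Search in the first (small) range; if too few target feature points are
    found, re-search in the larger second range."""
    matches = match_fn(proj_pts, radius=first_range)
    if len(matches) < min_targets:
        matches = match_fn(proj_pts, radius=second_range)
    return matches
```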
  • the apparatus further includes: an extraction module 1160;
  • the image obtaining module 1110 is further configured to acquire n pyramid images with different scales corresponding to the first marker image, where n is an integer greater than one;
  • the extraction module 1160 is further configured to extract the initial feature points from each of the pyramid images, and to record the two-dimensional coordinates of the initial feature points when the pyramid image is scaled back to the original size.
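  • one possible reading of this pre-processing step is sketched below with OpenCV; the scale factors 1.0/0.8/0.6/0.4 and the per-level cap of 500 feature points follow the examples given in the description, while the function name and the exact extractor settings are assumptions.

```python
import cv2
import numpy as np

def extract_pyramid_keypoints(first_marker_img, scales=(1.0, 0.8, 0.6, 0.4),
                              max_per_level=500):
    """Extract ORB features on each pyramid level of the first marker image and
    record their 2-D coordinates rescaled back to the original image size."""
    orb = cv2.ORB_create(nfeatures=max_per_level)
    initial_points, descriptors = [], []
    for s in scales:
        level = first_marker_img if s == 1.0 else \
                cv2.resize(first_marker_img, None, fx=s, fy=s)
        kps, desc = orb.detectAndCompute(level, None)
        if desc is None:
            continue
        for kp, d in zip(kps, desc):
            # Rescale the keypoint coordinates to the original (scale 1.0) image.
            initial_points.append((kp.pt[0] / s, kp.pt[1] / s))
            descriptors.append(d)
    return np.float32(initial_points), np.uint8(descriptors)
```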
  • the change amount calculation module 1140 is configured to calculate a homography matrix of the camera during the camera pose change process according to the initial feature point and the target feature point;
  • the homography matrix is decomposed to obtain the pose change amounts R_relocalize and T_relocalize when the camera changes from the first camera pose to the target camera pose.
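  • a hedged sketch of this calculation with OpenCV is given below; cv2.decomposeHomographyMat returns several candidate (rotation, translation, plane-normal) solutions, and the geometric-consistency check that selects the physically valid one is omitted here.

```python
import cv2
import numpy as np

def pose_change_from_matches(init_pts, target_pts, K):
    """Estimate the homography between the first marker image and the current
    image from matched points, then decompose it into candidate
    (R_relocalize, T_relocalize) solutions."""
    H, inlier_mask = cv2.findHomography(init_pts, target_pts, cv2.RANSAC, 5.0)
    if H is None:
        return None
    # Up to four candidate decompositions (rotation, translation, plane normal).
    n_solutions, Rs, Ts, normals = cv2.decomposeHomographyMat(H, K)
    return Rs, Ts, inlier_mask
```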
  • the feature point tracking module 1130 is configured to determine the current image as the (i+1)-th marker image and to perform feature point tracking of subsequently acquired images with respect to the (i+1)-th marker image; that is, when the current image is determined as the (i+1)-th marker image, the feature point tracking module 1130 continues feature point tracking based on the (i+1)-th marker image.
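  • the anchor switch after a successful relocation can be pictured as the small state update below; the tracker object and its fields are invented for this sketch and do not correspond to named components of the embodiment.

```python
def on_relocalization_success(tracker, current_image, target_pose):
    """Promote the current image to the (i+1)-th marker image after a
    successful relocalization and keep tracking against it."""
    tracker.marker_images.append(current_image)  # becomes the (i+1)-th marker
    tracker.marker_pose = target_pose            # pose cached for this marker
    tracker.i += 1
    # Subsequent frames are feature-tracked against this new marker image.
```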
  • it should be noted that, when the relocation device of the camera pose tracking process provided by the above embodiment implements relocation, the division of the above functional modules is only used as an example for description; in actual applications, the functions may be allocated to different functional modules as needed, that is, the internal structure of the device may be divided into different functional modules to complete all or part of the functions described above.
  • in addition, the relocation device provided by the above embodiment and the relocation method embodiments belong to the same concept; for the specific implementation process, refer to the method embodiments, and details are not described herein again.
  • FIG. 12 is a block diagram showing the structure of an electronic device 1200 provided by an exemplary embodiment of the present application.
  • the electronic device 1200 may be a smartphone, a tablet computer, an MP3 (Moving Picture Experts Group Audio Layer III) player, an MP4 (Moving Picture Experts Group Audio Layer IV) player, a notebook computer, or a desktop computer.
  • Electronic device 1200 may also be referred to as a user terminal, a portable electronic device, a laptop device, a desktop device, and the like.
  • the electronic device 1200 includes a processor 1201 and a memory 1202.
  • Processor 1201 may include one or more processing cores, such as a 4-core processor, an 8-core processor, and the like.
  • the processor 1201 may be implemented in at least one hardware form of a DSP (Digital Signal Processing), an FPGA (Field-Programmable Gate Array), and a PLA (Programmable Logic Array).
  • the processor 1201 may also include a main processor and a coprocessor.
  • the main processor is a processor for processing data in an awake state, also called a CPU (Central Processing Unit); the coprocessor is a low-power processor for processing data in a standby state.
  • in some embodiments, the processor 1201 may be integrated with a GPU (Graphics Processing Unit), which is responsible for rendering and drawing the content that the display screen needs to display.
  • the processor 1201 may further include an AI (Artificial Intelligence) processor for processing computational operations related to machine learning.
  • Memory 1202 can include one or more computer readable storage media, which can be non-transitory.
  • the memory 1202 may also include high speed random access memory, as well as non-volatile memory such as one or more magnetic disk storage devices, flash memory storage devices.
  • the non-transitory computer-readable storage medium in the memory 1202 is configured to store at least one instruction, the at least one instruction being executed by the processor 1201 to implement the relocation method of the camera pose tracking process provided by the method embodiments of this application.
  • the electronic device 1200 optionally further includes: a peripheral device interface 1203 and at least one peripheral device.
  • the processor 1201, the memory 1202, and the peripheral device interface 1203 may be connected by a bus or a signal line.
  • Each peripheral device can be connected to the peripheral device interface 1203 via a bus, signal line or circuit board.
  • the peripheral device includes at least one of a radio frequency circuit 1204, a touch display screen 1205, a camera 1206, an audio circuit 1207, a positioning component 1208, and a power source 1209.
  • the peripheral device interface 1203 can be used to connect at least one peripheral device associated with an I/O (Input/Output) to the processor 1201 and the memory 1202.
  • in some embodiments, the processor 1201, the memory 1202, and the peripheral device interface 1203 are integrated on the same chip or circuit board; in some other embodiments, any one or two of the processor 1201, the memory 1202, and the peripheral device interface 1203 may be implemented on a separate chip or circuit board, which is not limited in this embodiment.
  • the RF circuit 1204 is configured to receive and transmit an RF (Radio Frequency) signal, also referred to as an electromagnetic signal.
  • Radio frequency circuit 1204 communicates with the communication network and other communication devices via electromagnetic signals.
  • the RF circuit 1204 converts the electrical signal into an electromagnetic signal for transmission, or converts the received electromagnetic signal into an electrical signal.
  • the radio frequency circuit 1204 includes an antenna system, an RF transceiver, one or more amplifiers, a tuner, an oscillator, a digital signal processor, a codec chipset, a subscriber identity module card, and the like.
  • Radio frequency circuitry 1204 can communicate with other electronic devices via at least one wireless communication protocol.
  • the wireless communication protocols include, but are not limited to, the World Wide Web, a metropolitan area network, an intranet, generations of mobile communication networks (2G, 3G, 4G, and 5G), wireless local area networks, and/or WiFi (Wireless Fidelity) networks.
  • the radio frequency circuit 1204 may further include an NFC (Near Field Communication) related circuit, which is not limited in this application.
  • the display screen 1205 is used to display a UI (User Interface).
  • the UI can include graphics, text, icons, video, and any combination thereof.
  • when the display screen 1205 is a touch display screen, the display screen 1205 also has the ability to collect touch signals on or above the surface of the display screen 1205.
  • the touch signal can be input to the processor 1201 as a control signal for processing.
  • the display screen 1205 can also be used to provide virtual buttons and/or virtual keyboards, also referred to as soft buttons and/or soft keyboards.
  • in some embodiments, there may be one display screen 1205, disposed on the front panel of the electronic device 1200; in other embodiments, there may be at least two display screens 1205, respectively disposed on different surfaces of the electronic device 1200 or in a folded design.
  • in still other embodiments, the display screen 1205 may be a flexible display screen disposed on a curved surface or a folded surface of the electronic device 1200; the display screen 1205 may even be set in a non-rectangular irregular shape, that is, a special-shaped screen.
  • the display screen 1205 can be prepared by using an LCD (Liquid Crystal Display) or an OLED (Organic Light-Emitting Diode).
  • Camera component 1206 is used to capture images or video.
  • camera assembly 1206 includes a front camera and a rear camera.
  • the front camera is placed on the front panel of the electronic device and the rear camera is placed on the back of the electronic device.
  • in some embodiments, there are at least two rear cameras, each being any one of a main camera, a depth-of-field camera, a wide-angle camera, and a telephoto camera, so that the main camera and the depth-of-field camera are fused to realize a background blur function, and the main camera and the wide-angle camera are fused to realize panoramic shooting and VR (Virtual Reality) shooting or other fused shooting functions.
  • camera assembly 1206 can also include a flash.
  • the flash can be a monochrome temperature flash or a two-color temperature flash.
  • the two-color temperature flash is a combination of a warm flash and a cool flash that can be used for light compensation at different color temperatures.
  • the audio circuit 1207 can include a microphone and a speaker.
  • the microphone is used to collect sound waves of the user and the environment, and convert the sound waves into electrical signals that are input to the processor 1201 for processing, or input to the radio frequency circuit 1204 for voice communication.
  • the microphones may be multiple, and are respectively disposed at different parts of the electronic device 1200.
  • the microphone can also be an array microphone or an omnidirectional acquisition microphone.
  • the speaker is then used to convert electrical signals from the processor 1201 or the RF circuit 1204 into sound waves.
  • the speaker can be a conventional film speaker or a piezoelectric ceramic speaker.
  • the audio circuit 1207 can also include a headphone jack.
  • the positioning component 1208 is configured to locate the current geographic location of the electronic device 1200 to implement navigation or LBS (Location Based Service).
  • the positioning component 1208 can be a positioning component based on a US-based GPS (Global Positioning System), a Chinese Beidou system, or a Russian Galileo system.
  • Power source 1209 is used to power various components in electronic device 1200.
  • the power source 1209 can be an alternating current, a direct current, a disposable battery, or a rechargeable battery.
  • the rechargeable battery may be a wired rechargeable battery or a wireless rechargeable battery.
  • a wired rechargeable battery is a battery that is charged by a wired line
  • a wireless rechargeable battery is a battery that is charged by a wireless coil.
  • the rechargeable battery can also be used to support fast charging technology.
  • electronic device 1200 also includes one or more sensors 1210.
  • the one or more sensors 1210 include, but are not limited to, an acceleration sensor 1211, a gyro sensor 1212, a pressure sensor 1213, a fingerprint sensor 1214, an optical sensor 1215, and a proximity sensor 1216.
  • the acceleration sensor 1211 can detect the magnitude of the acceleration on the three coordinate axes of the coordinate system established by the electronic device 1200.
  • the acceleration sensor 1211 can be used to detect components of gravity acceleration on three coordinate axes.
  • the processor 1201 can control the touch display 1205 to display the user interface in a landscape view or a portrait view according to the gravity acceleration signal collected by the acceleration sensor 1211.
  • the acceleration sensor 1211 can also be used for the acquisition of game or user motion data.
  • the gyro sensor 1212 can detect the body direction and the rotation angle of the electronic device 1200, and the gyro sensor 1212 can cooperate with the acceleration sensor 1211 to collect the user's 3D motion on the electronic device 1200. Based on the data collected by the gyro sensor 1212, the processor 1201 can implement functions such as motion sensing (such as changing the UI according to the user's tilting operation), image stabilization at the time of shooting, game control, and inertial navigation.
  • the pressure sensor 1213 can be disposed on a side border of the electronic device 1200 and/or a lower layer of the touch display screen 1205.
  • when the pressure sensor 1213 is disposed on the side frame of the electronic device 1200, a holding signal of the user on the electronic device 1200 can be detected, and the processor 1201 performs left/right hand recognition or a shortcut operation according to the holding signal collected by the pressure sensor 1213.
  • when the pressure sensor 1213 is disposed at the lower layer of the touch display screen 1205, the processor 1201 controls an operability control on the UI interface according to the user's pressure operation on the touch display screen 1205.
  • the operability control includes at least one of a button control, a scroll bar control, an icon control, and a menu control.
  • the fingerprint sensor 1214 is configured to collect the fingerprint of the user, and the processor 1201 identifies the identity of the user according to the fingerprint collected by the fingerprint sensor 1214, or the fingerprint sensor 1214 identifies the identity of the user according to the collected fingerprint. Upon identifying that the user's identity is a trusted identity, the processor 1201 authorizes the user to perform related sensitive operations including unlocking the screen, viewing encrypted information, downloading software, paying and changing settings, and the like.
  • the fingerprint sensor 1214 can be disposed on the front, back, or side of the electronic device 1200. When a physical button or a manufacturer logo is provided on the electronic device 1200, the fingerprint sensor 1214 can be integrated with the physical button or the manufacturer logo.
  • Optical sensor 1215 is used to collect ambient light intensity.
  • the processor 1201 can control the display brightness of the touch display screen 1205 based on the ambient light intensity acquired by the optical sensor 1215. Illustratively, when the ambient light intensity is high, the display brightness of the touch display screen 1205 is raised; when the ambient light intensity is low, the display brightness of the touch display screen 1205 is lowered.
  • the processor 1201 can also dynamically adjust the shooting parameters of the camera assembly 1206 based on the ambient light intensity acquired by the optical sensor 1215.
  • the proximity sensor 1216, also referred to as a distance sensor, is typically disposed on the front panel of the electronic device 1200. The proximity sensor 1216 is used to collect the distance between the user and the front of the electronic device 1200. In one embodiment, when the proximity sensor 1216 detects that the distance between the user and the front of the electronic device 1200 gradually decreases, the processor 1201 controls the touch display screen 1205 to switch from the screen-on state to the screen-off state; when the proximity sensor 1216 detects that the distance between the user and the front of the electronic device 1200 gradually increases, the processor 1201 controls the touch display screen 1205 to switch from the screen-off state to the screen-on state.
  • a person skilled in the art may understand that the structure shown in FIG. 12 does not constitute a limitation on the electronic device 1200, which may include more or fewer components than illustrated, or combine certain components, or employ a different component arrangement.
  • a person of ordinary skill in the art may understand that all or part of the steps of implementing the above embodiments may be completed by hardware, or may be completed by a program instructing related hardware; the program may be stored in a computer-readable storage medium.
  • the storage medium mentioned may be a read only memory, a magnetic disk or an optical disk or the like.

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Mathematical Physics (AREA)
  • Mathematical Analysis (AREA)
  • Computational Mathematics (AREA)
  • Mathematical Optimization (AREA)
  • Pure & Applied Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Multimedia (AREA)
  • Computing Systems (AREA)
  • Algebra (AREA)
  • Databases & Information Systems (AREA)
  • Software Systems (AREA)
  • General Engineering & Computer Science (AREA)
  • Image Analysis (AREA)
  • Studio Devices (AREA)

Abstract

A relocation method and apparatus for a camera pose tracking process, and a storage medium, belonging to the field of augmented reality. The method includes: acquiring a current image captured after the i-th marker image among a plurality of marker images (501); when the current image meets a relocation condition, acquiring initial feature points and initial pose parameters of the first marker image among the plurality of marker images, the initial pose parameters being used to indicate the camera pose when the camera captured the first marker image (502); performing feature point tracking of the current image with respect to the first marker image to obtain target feature points matching the initial feature points (503); calculating, according to the initial feature points and the target feature points, a pose change amount of the camera from the first camera pose to a target camera pose, the target camera pose being the camera pose of the camera when capturing the current image (504); and obtaining, through relocation according to the initial pose parameters and the pose change amount, target pose parameters corresponding to the target camera pose (505). The method can implement relocation in an Anchor-SLAM algorithm that tracks across multiple consecutive marker images, thereby reducing the possibility of interruption of the tracking process.

Description

相机姿态追踪过程的重定位方法、装置及存储介质
本申请要求于2018年04月27日提交的申请号为201810391550.9、发明名称为“相机姿态追踪过程的重定位方法、装置及存储介质”的中国专利申请的优先权,其全部内容通过引用结合在本申请中。
技术领域
本申请实施例涉及增强显示领域,特别涉及一种相机姿态追踪过程的重定位方法、装置及存储介质。
背景技术
视觉SLAM(simultaneous Localization and mapping,同时定位与地图构建)是指搭载相机的主体,在没有环境先验信息的情况下,于运动过程中建立环境的模型,同时估计自己的运动的技术。SLAM可以应用在AR(Augmented Reality,增强显示)领域、机器人领域和无人驾驶领域中。
以单目视觉SLAM为例,通常将相机采集的第一帧图像作为标记图像(Anchor)。在相机后续采集到当前图像时,设备对当前图像与标记图像之间共同具有的特征点进行追踪,根据当前图像与标记图像之间的特征点位置变化计算得到相机在现实世界中的位姿变化。但某些场景下会发生当前图像中的特征点丢失(Lost),无法继续追踪的情况。此时,需要使用SLAM重定位方法对当前图像进行重定位。
发明内容
本申请实施例提供了一种相机姿态追踪过程的重定位方法、装置及存储介质。所述技术方案如下:
根据本申请实施例的一个方面,提供了一种相机姿态追踪过程的重定位方法,应用于具有相机的设备中,所述设备用于按序执行多个标记图像的相机姿态追踪,所述方法包括:
获取所述多个标记图像中第i个标记图像之后采集的当前图像,i为大于1的整数;
当所述当前图像符合重定位条件时,获取所述多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,所述初始位姿参数用于指示所述相机采集所述第一个标记图像时的相机姿态;
将所述当前图像相对于所述第一个标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点;
根据所述初始特征点和所述目标特征点,计算所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,所述目标相机姿态是所述相机在采集所述当前图像时的相机姿态;
根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数。
根据本申请实施例的另一方面,提供了一种相机姿态追踪过程的重定位装置,所述装置用于按序执行多个标记图像的相机姿态追踪,所述装置包括:
图像获取模块,用于获取所述多个标记图像中第i个标记图像之后采集的当前图像,i为大于1的整数;
信息获取模块,用于当所述当前图像符合重定位条件时,获取所述多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,所述初始位姿参数用于指示所述相机采集所述第一个标记图像时的相机姿态;
特征点追踪模块,用于将所述当前图像相对于所述第一个标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点;
变化量计算模块,用于根据所述初始特征点和所述目标特征点,计算相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,所述目标相机姿态是所述相机在采集所述当前图像时的相机姿态;
重定位模块,用于根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数。
根据本申请实施例的另一方面,提供了一个电子设备,所述电子设备包括存储器和处理器;
所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如上所述的相机姿态追踪过程的重定位方法。
根据本申请实施例的另一方面,提供了一种计算机可读存储介质,所述存储介质中存储有至少一条指令,所述至少一条指令由处理器加载并执行以实现如上所述的相机姿态追踪过程的重定位方法。
本申请实施例提供的技术方案带来的有益效果至少包括:
通过在当前图像符合重定位条件时,将当前图像与第一个标记图像进行重定位,能够在连续多个标记图像进行追踪的Anchor-SLAM算法中实现重定位,从而减少了追踪过程中断的可能性,由于重定位过程是将当前图像相对于第一个标记图像进行重定位,所以还能消除多个标记图像的追踪过程所产生的累积误差,从而解决相关技术中的SLAM重定位方法在AR领域中重定位效果较差的问题。
附图说明
为了更清楚地说明本申请实施例中的技术方案,下面将对实施例描述中所需要使用的附图作简单地介绍,显而易见地,下面描述中的附图仅仅是本申请的一些实施例,对于本领域普通技术人员来讲,在不付出创造性劳动的前提下,还可以根据这些附图获得其他的附图。
图1是本申请一个示例性实施例提供的AR应用场景的场景示意图;
图2是本申请一个示例性实施例提供的AR应用场景的场景示意图;
图3是本申请一个示例性实施例提供的Anchor-Switching AR System算法的原理示意图;
图4是本申请一个示例性实施例提供的电子设备的结构框图;
图5是本申请一个示例性实施例提供的相机姿态追踪过程的重定位方法的流程图;
图6是本申请一个示例性实施例提供的相机姿态追踪过程的重定位方法的流程图;
图7是本申请一个示例性实施例提供的金字塔图像的示意图;
图8是本申请一个示例性实施例提供的相机姿态追踪过程的重定位方法的流程图;
图9是本申请一个示例性实施例提供的重定位方法的原理示意图;
图10是本申请一个示例性实施例提供的重定位方法的流程图;
图11是本申请一个示例性实施例提供的相机姿态追踪过程的重定位装置的框图;
图12是本申请一个示例性实施例提供的电子设备的框图。
具体实施方式
为使本申请的目的、技术方案和优点更加清楚,下面将结合附图对本申请实施方式作进一步地详细描述。
首先对本申请涉及的若干个名词进行简介:
AR(Augmented Reality,增强现实):一种在相机采集图像的过程中,实时地计算相机在现实世界(或称三维世界、真实世界)中的相机姿态参数,根据该相机姿态参数在相机采集的图像上添加虚拟元素的技术。虚拟元素包括但不限于:图像、视频和三维模型。AR技术的目标是在屏幕上把虚拟世界套接在现实世界上进行互动。该相机姿态参数包括位移向量和旋转矩阵,位移向量用于表征相机在现实世界中发生的位移距离,旋转矩阵用于表征相机在现实世界中发生的旋转角度。
例如,参见图1和参见图2,设备在相机拍摄到的图像中添加了一个虚拟人物形象。随着相机在现实世界中的运动,相机拍摄到的图像会发生变化,虚拟人物的拍摄方位也发生变化,模拟出了虚拟人物在图像中静止不动,而相机随着位置和姿态的变化同时拍摄图像和虚拟人物的效果,为用户呈现了一幅真实立体的画面。
可选地,本申请中的设备所设置的相机为单目相机。
Anchor-Switching AR System:是基于连续的多个标记图像(Anchor)的相机姿态追踪来确定在自然场景下的相机姿态参数,进而根据相机姿态参数在相机采集的图像上叠加虚拟世界的AR系统。
IMU(Inertial Measurement Unit,惯性测量单元):是用于测量物体的三轴姿态角(或角速率)以及加速度的装置。一般的,一个IMU包含了三个单轴的加速度计和三个单轴的陀螺,加速度计用于检测物体在三维坐标系中每个坐标轴上的加速度信号,进而计算得到位移向量;而陀螺用于检测物体在三维坐标系中的旋转矩阵。可选地,IMU包括陀螺仪、加速度计和地磁传感器。
示意性的,三维坐标系的建立方式为:1、X轴使用向量积Y*Z来定义,在X轴在设备当前的位置上,沿与地面相切的方向指向东方;2、Y轴在设备当前的位置上,沿与地面相切的方向指向地磁场的北极;3、Z轴指向天空并垂直于地面。
在AR(Augmented Reality,增强现实)领域进行相机姿态追踪时,比如使用手机拍摄桌面进行AR游戏的场景,由于AR使用场景存在其场景特殊性,通常会对现实世界中的某个固定平面进行持续性拍摄(比如某个桌面或墙面),直接使用相关技术中的SLAM重定位方法的效果较差,尚需提供一种适用于AR领域的重定位解决方案。
本申请提供了一种适用于Anchor-Switching AR System算法的重定位方法。Anchor-Switching AR System算法在确定相机姿态的过程中,将相机的运动过程划分为至少两段追踪过程进行追踪,每段追踪过程对应各自的标记图像。示意性地,当第i个标记图像对应的追踪过程中,当当前图像相对于第i个标记图像的追踪效果差于预设条件(比如能够匹配到的特征点少于预设阈值)时,将当前图像的上一个图像确定为第i+1个标记图像,开启第i+1段追踪过程。其中,i为正整数。示意性的参考图3,其示出了本申请一个示例性实施例提供的Anchor-Switching AR System算法的原理示意图。在现实世界中存在物体320,设置有相机的设备340被用户手持进行移动,在移动过程中拍摄得到包括物体320的多帧图像1-6。设备将图像1确定为第1个标记图像(born-anchor或born-image)并记录初始相机姿态参数,该初始相机姿态参数可以是IMU采集的,然后将图像2相对于图像1进行特征点追踪,根据初始相机姿态参数和特征点追踪结果计算出相机在拍摄图像2时的相机姿态参数;将图像3相对于图像1进行特征点追踪,根据初始相机姿态参数和特征点追踪结果计算出相机在拍摄图像3时的相机姿态参数;将图像4相对于图像1进行特征点追踪,根据初始相机姿态参数和特征点追踪结果计算出相机在拍摄图像4时的相机姿态参数。
然后,将图像5相对于图像1进行特征点追踪,如果特征点追踪效果差于预设条件(比如匹配的特征点数量较少),则将图像4确定为第2个标记图像,将图像5相对于图像4进行特征点追踪,计算出相机在拍摄图像4至图像5之间的位移变化量,再结合相机在拍摄图像4至图像1之间的位移变化量以及初始相机姿态参数,计算出相机在拍摄图像5时的相机姿态参数。然后再将图像6相对于图像4进行特征点追踪,依次类推,若当前图像的特征点追踪效果变差时,即可将当前图像的上一帧图像确定为新的标记图像,切换新的标记图像后重新进行特征点追踪。
可选地,特征点追踪可以采用光流追踪、直接法等基于视觉里程计原理的 算法。若相机在追踪过程中发生较为剧烈的运动、朝向强光源、朝向白色墙壁等各种异常场景时,上述Anchor-Switching AR System追踪过程可能会发生丢失(Lost)现象。丢失现象是指在当前图像中无法匹配到足够多的特征点,导致追踪失败。
参考图4,其示出了本申请一个示例性实施例提供的电子设备的结构框图。该设备包括:处理器420、存储器440、相机460和IMU 480。
处理器420包括一个或多个处理核心,比如4核心处理器、8核心处理器等。处理器420用于执行存储器440中存储的指令、代码、代码片段和程序中的至少一种。
处理器420与存储器440电性相连。可选地,处理器420通过总线与存储器440相连。存储器440存储有一个或多个指令、代码、代码片段和/或程序。该指令、代码、代码片段和/或程序在被处理器420执行时,用于实现如下实施例中提供的SLAM重定位方法。
处理器420还与相机460电性相连。可选地,处理器420通过总线与相机460相连。相机460是具有图像采集能力的传感器件。相机460还可称为摄像头、感光器件等其它名称。相机460具有连续采集图像或多次采集图像的能力。可选地,相机460设置在设备内部或设备外部。可选地,该相机460是单目相机。
处理器420还与IMU480电性相连。可选地,IMU480用于每隔预定时间间隔采集相机的位姿参数,并记录每组位姿参数在采集时的时间戳。相机的位姿参数包括:位移向量和旋转矩阵。其中,IMU480采集的旋转矩阵相对准确,采集的位移向量受实际环境可能会有较大的误差。
参考图5,其示出了本申请一个示例性实施例提供的相机姿态追踪过程的重定位方法的流程图。本实施例以该重定位方法应用于图4所示的设备中来举例说明,该设备用于按序执行多个标记图像的相机姿态追踪。该方法包括:
步骤502,获取多个标记图像中第i个标记图像之后采集的当前图像;
设备内的相机按照预设时间间隔采集一帧帧图像,形成图像序列。可选地,相机是在运动(平移和/或旋转)过程中,按照预设时间间隔采集一帧帧图像形成图像序列。
可选地,设备将图像序列中的第一帧图像(或前几帧图像中符合预定条件 的一帧图像)确定为第一个标记图像,将后续采集的图像相对于第一个标记图像进行特征点追踪,并根据特征点追踪结果计算相机的相机姿态参数;若当前帧图像的特征点追踪效果差于预设条件时,将当前帧图像的上一帧图像确定为第二个标记图像,将后续采集的图像相对于第二个标记图像进行特征点追踪,并根据特征点追踪结果计算相机的相机姿态参数,依次类推。设备可以按序进行连续多个标记图像的相机姿态追踪。
当处于第i个标记图像对应的第i个追踪过程时,相机会采集到当前图像。当前图像是第i个标记图像之后采集的某一帧图像,其中,i为大于1的整数。
需要说明的是,当前图像是指当前正在处理的图像,不一定是当前时刻采集到的图像。
步骤504,当当前图像符合重定位条件时,获取多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,初始位姿参数用于指示相机采集第一个标记图像时的相机姿态;
设备会确定当前图像是否符合重定位条件。重定位条件用于指示当前图像相对于第i个标记图像的追踪过程失败,或者,重定位条件用于指示历史追踪过程中的累积误差已经高于预设条件。
在一个可选的实施例中,设备对当前图像相对于第i个标记图像进行追踪,若当前图像中不存在与第i个标记图像匹配的特征点,或者,当前图像中与第i个标记图像匹配的特征点少于第一数量时,确定当前图像相对于第i个标记图像的追踪过程失败,符合重定位条件。
在另一个可选的实施例中,设备确定当前图像与上一次重定位的图像之间的帧数大于第二数量时,确定历史追踪过程中的累积误差已经高于预设条件,或者,设备确定第i个标记图像和第一个标记图像之间的标记图像数量大于第三数量时,确定历史追踪过程中的累计误差已经高于预设条件。
本实施例对重定位条件的具体条件内容不加以限定。
当当前图像符合重定位条件时,设备尝试将当前图像相对于第一个标记图像进行特征点追踪。此时,设备获取缓存的第一个标记图像中的初始特征点以及初始位姿参数,该初始位姿参数用于指示相机采集第一个标记图像时的相机姿态。
初始特征点是从第一个标记图像上提取到的特征点,初始特征点可以是多个,比如10-500个。该初始位姿参数用于指示相机采集第一个标记图像时的相 机姿态。可选地,初始位姿参数包括旋转矩阵R和位移向量T,初始位姿参数可以由IMU采集得到。
步骤506,将当前图像相对于第一个标记图像进行特征点追踪,得到与初始特征点匹配的目标特征点;
特征点追踪可采用基于视觉里程计的追踪算法,本申请对此不加以限定。在一个实施例中,特征点追踪采用KLT(Kanade-Lucas)光流追踪算法;在另一个实施例中,特征点追踪采用基于ORB(Oriented FAST and Rotated BRIEF,快速特征点提取和描述)算法提取的ORB特征描述子进行特征点跟踪。本申请对特征点追踪的具体算法不加以限定,特征点追踪过程可以采用特征点法或直接法。
在一个实施例中,设备对第一个标记图像进行特征点提取,得到N个初始特征点;设备还对当前图像进行特征点提取,得到M个候选特征点;然后将M个候选特征点逐一与N个初始特征点进行匹配,确定出至少一组匹配特征点对。每组匹配特征点对包括:一个初始特征点和一个目标特征点。初始特征点是第1个标记图像上的特征点,目标特征点是当前图像上与该初始特征点匹配度最高的候选特征点。
可选地,初始特征点的数量大于或等于目标特征点的数量。比如,初始特征点的数量是450个,目标特征点为320组。
步骤508,根据初始特征点和目标特征点,计算相机从第一相机姿态改变至目标相机姿态时的位姿变化量,目标相机姿态是相机在采集当前图像时的相机姿态;
可选地,设备根据初始特征点和目标特征点计算两帧图像之间的单应性矩阵homography;对单应性矩阵homography进行分解,得到相机从第一相机姿态改变至目标相机姿态时的位姿变化量R relocalize和T relocalize
单应性矩阵描述了两个平面之间的映射关系,若自然场景(现实环境)中的特征点都落在同一物理平面上,则可以通过单应性矩阵进行运动估计。当存在至少四对相匹配的初始特征点和目标特征点时,设备通过ransac对该单应性矩阵进行分解,得到旋转矩阵R relocalize和平移向量T relocalize
其中,R relocalize是相机从第一相机姿态改变至目标相机姿态时的旋转矩阵,T relocalize是相机从第一相机姿态改变至目标相机姿态时的位移向量。
步骤510,根据初始位姿参数和位姿变化量,重定位得到目标相机姿态对应 的目标位姿参数。
设备将初始位姿参数利用位姿变化量进行变换后,重定位得到目标相机姿态对应的目标位姿参数,从而计算得到相机在采集当前图像时的相机姿态。
可选地,在对当前图像重定位成功时,终端将当前图像确定为第i+1个标记图像。
终端基于第i+1个标记图像继续进行特征点追踪。终端根据后续的特征点追踪情况,还可以继续生成第i+2个标记图像、第i+3个标记图像、第i+4个标记图像等等,以此类推不再赘述。相关过程可参考上述图3所示的追踪内容。
综上所述,本实施例提供的重定位方法,通过在当前图像符合重定位条件时,将当前图像与第一个标记图像进行重定位,能够在连续多个标记图像进行追踪的Anchor-SLAM算法中实现重定位,从而减少了追踪过程中断的可能性,从而解决相关技术中的SLAM重定位方法在AR领域中重定位效果较差的问题。
另外,由于重定位过程是将当前图像相对于第一个标记图像进行重定位,第一个标记图像可以认为是没有累积误差的,所以本实施例还能消除多个标记图像的追踪过程所产生的累积误差。
以下对上述重定位方法的若干个阶段进行介绍:
预处理阶段:
在基于图5所示的可选实施例中,由于第一个标记图像通常是相机拍摄的第一帧图像,也是重定位过程使用的当前图像,出于提高特征点匹配的成功率的目的,需要对第一个标记图像进行预处理。如图6所示,步骤502之前还包括如下步骤:
步骤501a,记录第一个标记图像对应的初始位姿参数;
设备中设置有IMU,通过IMU定时采集相机的位姿参数以及时间戳。位姿参数包括旋转矩阵和位移向量,时间戳用于表示位姿参数的采集时间。可选地,IMU采集的旋转矩阵是较为准确的。
设备中的相机采集每帧图像时,同时记录有每帧图像的拍摄时间。设备根据第一个标记图像的拍摄时间,查询并记录相机在拍摄第一个标记图像时的初始位姿参数。
步骤501b,获取第一个标记图像对应的n个尺度不同的金字塔图像,n为 大于1的整数;
设备还提取第一个标记图像中的初始特征点。可选地,设备提取特征点时采用的特征提取算法可以为FAST(Features from Accelerated Segment Test,加速段测试特征点)检测算法、Shi-Tomasi(史托马西)角点检测算法、Harris Corner Detection(Harris角点检测)算法、SIFT(Scale-Invariant Feature Transform,尺度不变特征转换)算法、ORB(Oriented FAST and Rotated BRIEF,快速特征点提取和描述)算法等。
由于SIFT特征的实时计算难度较大,为了保证实时性,设备可以提取第一个标记图像中的ORB特征点。一个ORB特征点包括FAST角点(Key-point)和BRIER描述子(Binary Robust Independent Elementary Feature Descirptor)两部分。
FAST角点是指该ORB特征点在图像中所在的位置。FAST角点主要检测局部像素灰度变化明显的地方,以速度快著称。FAST角点的思想时:如果一个像素与邻域的像素差别较大(过亮或过暗),则该像素可能是一个角点。
BRIEF描述子是一个二进制表示的向量,该向量按照某种人为设计的方式描述了该关键点周围像素的信息。BRIEF描述子的描述向量由多个0和1组成,这里的0和1编码了FAST角点附近的两个像素的大小关系。
由于ORB特征的计算速度较快,因此适用于移动设备上实施。但由于ORB特征描述子没有尺度不变性,用户手持相机采集图像时的尺度变化又很明显,用户很可能在很远或很近的尺度下观测到第一个标记图像对应的画面,在一个可选的实现中,设备为第一个标记图像生成n个尺度不同的金字塔图像。
金字塔图像是指对第一个标记图像按照预设比例进行缩放后的图像。以金字塔图像包括四层图像为例,按照缩放比例1.0、0.8、0.6、0.4将第一个标记图像进行缩放后,得到四张不同尺度的图像。
步骤501c,对每个金字塔图像提取初始特征点,并记录初始特征点在金字塔图像缩放至原始尺寸时的二维坐标。
设备对每一层金字塔图像都提取特征点并计算ORB特征描述子。对于不是原始尺度(1.0)的金字塔图像上提取的特征点,将该金字塔图像按照缩放比例放大到原始尺度后,记录每个特征点在原始尺度的金字塔图像上的二维坐标。这些金字塔图像上的特征点以及二维坐标,可称为layer-keypoint。在一个例子中,每层金字塔图像上的特征点最多有500个特征点。
对于第一个标记图像,将每个金字塔图像上的特征点确定为初始特征点。在后续特征点追踪过程中,若当前图像的尺度很大,当前图像上的高频细节都清晰可见,则当前图像与层数较低的金字塔图像(比如原始图像)会有更高的匹配分数;反之,若当前图像的尺度很小,当前图像上只能看到模糊的低频信息,则当前图像与层数较高的金字塔图像有更高的匹配分数。
在如图7所示出的例子中,第一个标记图像具有三个金字塔图像71、72和73,金字塔图像1位于金字塔的第一层,具有三个图像中的最小尺度;金字塔图像2位于金字塔的第二层,具有三个图像中的中间尺度;金字塔图像3位于金字塔的第三层,具有三个图像中的最大尺度,若当前图像74相对于第一个标记图像进行特征点追踪时,设备可以将当前图像74分别与三个金字塔图像中提取的特征点进行匹配,由于金字塔图像3和当前图像74的尺度更接近,则金字塔图像3中提取的特征点具有更高的匹配分数。
本实施例通过对第一个标记图像设置多个尺度的金字塔图像,并进而提取每层金字塔图像上的初始特征点用于后续的特征点追踪过程,通过多个尺度上的特征点共同匹配,自动调节了第一个标记图像的尺度,实现了尺度不变性。
特征点追踪阶段:
在基于图5所示的可选实施例中,对于步骤506所示出的特征点追踪过程。设备对当前图像提取特征点,该特征点可以是ORB特征描述子。与第一个标记图像提取多层特征点不同的是,设备可以对当前图像提取一层特征点(比如最多500个),对于第一个标记图像上预先提取的layer-keypoint和当前图像上提取的特征点通过ORB特征描述子进行匹配。
出于提高匹配速度的目的,本申请实施例还对特征点追踪过程进行加速匹配。如图8所示,步骤506可选包括如下子步骤506a至506c:
步骤506a,对当前图像提取候选特征点;
设备提取特征点时采用的特征提取算法可以采用FAST检测算法、Shi-Tomasi角点检测算法、Harris角点检测算法、SIFT算法、ORB算法中的至少一种。本实施例以采用ORB算法提取当前图像中的ORB特征描述子来举例说明。
步骤506b,通过IMU获取相机采集当前图像时的参考位姿变化量;
设备中设置有IMU,通过IMU能够获取相机采集当前图像时的参考位姿变 化量。参考位姿变化量用于表征相机从采集第一个标记图像至采集当前图像过程中的位姿变化量,该位姿变化量包括旋转矩阵和位移向量。由于IMU的物理特征,IMU所采集的旋转矩阵是较为准确的,IMU采集的位移向量会存在一定的累积误差,但与真实结果相差不会太大,仍然具有指导意义。
步骤506c,根据参考位姿变化量将第一个标记图像中的初始特征点进行旋转平移投影,得到当前图像中与初始特征点对应的投影特征点;
在一个示例性的例子中,本步骤包括如下子步骤:
1、获取第一个标记图像中的初始特征点的二维坐标;
设备预先提取和缓存有第一个标记图像中的初始特征点的二维坐标。该二维坐标采用齐次表示。
2、对初始特征点的二维坐标进行反投影,得到初始特征点在三维空间中的第一三维坐标X born
设备通过如下公式将初始特征点的二维坐标变换至三维空间,得到这些初始特征点在三维空间中的第一三维坐标X born
X_born = d * [ (u_born - c_x)/f_x, (v_born - c_y)/f_y, 1 ]^T，其中 x_born = (u_born, v_born, 1)
其中,f x、f y、c x、c y为相机的内置参数。初始特征点的二维坐标x born是第一个标记图像上的layer-keyPoints的齐次表示,三维点x born是非齐次表示。假设第一个标记图像的初始深度d为1。
3、将第一三维坐标X born通过如下公式进行三维旋转平移,得到初始特征点在当前图像上对应的第二三维坐标X current;
X current=R*X born+T;
其中,R是IMU采集的参考位姿变化量中的旋转矩阵,T是IMU采集的参考位姿变化量中的位移向量。
4、将第二三维坐标X current投影至当前图像,得到投影特征点在当前图像中的二维坐标;
设备通过如下公式将第二三维坐标X current投影至当前图像,得到投影特征点在当前图像中的二维坐标x current
x_current = ( f_x * X_c / Z_c + c_x, f_y * Y_c / Z_c + c_y )，其中 X_current = (X_c, Y_c, Z_c)
其中,f x、f y、c x、c y为相机的内置参数。
这些投影特征点在当前图像中所在的位置用于预测目标特征点的位置,通常这些投影特征点的位置与目标特征点的位置相同或接近。
步骤506d,在以投影特征点为中心的第一范围内,搜索与初始特征点匹配的目标特征点。
设备在当前图像中提取了多个ORB特征描述子。对于每个初始特征点对应的投影特征点,挑选出位于投影特征点为中心的第一范围内的候选ORB特征描述子,然后对投影特征点与候选ORB特征描述子进行匹配,当匹配成功时,认为搜索到与初始特征点匹配的目标特征点。
可选地,第一范围是矩形框或正方形框。本申请实施例对第一范围的样式不加以限定,第一范围还可以是菱形框、平行四边形框、圆形框等其它样式。
步骤506e,当搜索到的目标特征点数量少于预设阈值时,在以投影特征点为中心的第二范围内,重新搜索与初始特征点匹配的目标特征点。
可选地,第二范围大于第一范围。
需要说明的是,每个投影特征点对应各自的初始特征点,每个目标特征点对应各自的初始特征点。但投影特征点的总数小于或等于初始特征点的总数,目标特征点的总数小于或等于初始特征点的总数。
综上所述,本实施例提供的重定位方法,通过利用IMU采集的存在一定误差的参考位姿变化量,将初始特征点通过旋转平移投影至当前图像中,得到投影特征点;进而根据投影特征点在较小的范围内匹配得到目标特征点。一方面,由于搜索范围变小,减小了每个初始特征点的候选ORB特征描述子的个数,使得需要进行的匹配计算次数变少,加速了匹配过程;另一方面,由于投影特征点是基于2D-3D-2D的转换过程得到的,相当于附加入了3D约束条件,可以排除掉一些特征匹配度高但不满足几何一致性的干扰匹配点。
重定位计算过程:
在基于图5所示的可选实施例中,对于步骤508所示出的相机姿态的位姿变化量计算过程。设备得到当前图像相对于第一个标记图像的多个目标特征点之后,将初始特征点和目标特征点输入至ransac的算法中,计算得到当前图像 相对于第一个标记图像的单应性矩阵homography,通过IMU中的分解算法对单应性矩阵homography可以分解得到旋转矩阵
R_relocalize和平移向量T_relocalize。
下面对比一下未进行重定位和重定位两种情况下的误差情况:
1、未进行重定位的场景下:
如图9所示,假设第1个标记图像的图像坐标系F、第i个标记图像的图像坐标系A、前一帧图像的图像坐标系L、当前图像的图像坐标系C。在追踪过程较为顺利时,第1个标记图像和第i个标记图像之间的单应性矩阵为H af,对H af进行分解后得到第一旋转矩阵R_old和第一平移向量T_old;第i个标记图像和前一帧图像之间的单应性矩阵为H la,前一帧图像与当前图像之间的单应性矩阵为H cl,对H la和H cl进行迭代分解后,得到第二旋转矩阵R_ca和第一平移向量T_ca。
也即,最终得到当前图像相对于第一个标记图像的旋转矩阵R和位移向量T如下:
R = R_ca * R_old，T = R_ca * T_old + T_ca
2、重定位的场景下,由于直接对当前图像相对于第一个标记图像进行重定位,消除了多次标记图像的追踪过程中不停累积的误差,直接得到结果如下:
R = R_relocalize，T = T_relocalize
因为当前图像上的点不一定满足平面性假设,虽然通过直接分解homography得到的结果可能会有假设误差,但Anchor-SLAM的最理想情况就是从第一个标记图像光流追踪到当前图像的过程中没有出现lost也没有切换过标记图像,也没有出现运动模糊等累积误差的情况,那也是得到第一个标记图像到当前图像上的目标匹配点,进而通过分解homography的方法计算结果。因此通过本申请实施例的重定位可以确实地消除掉累积误差,且结果等同于Anchor-SLAM的最好情况。
请参考图10,其示出了本申请另一个示例性实施例提供的相机姿态追踪过程的重定位方法的流程图。本实施例以该方法应用于图4所示的设备中来举例说明。该方法包括:
步骤1001,记录第一个标记图像对应的初始位姿参数;
设备中设置有IMU,通过IMU定时采集相机的位姿参数以及时间戳。位姿 参数包括旋转矩阵和位移向量,时间戳用于表示位姿参数的采集时间。可选地,IMU采集的旋转矩阵是较为准确的。
设备中的相机采集每帧图像时,同时记录有每帧图像的拍摄时间。设备根据第一个标记图像的拍摄时间,查询并记录相机在拍摄第一个标记图像时的初始位姿参数。
可选地,第一个标记图像是设备采集的第一帧图像,或者,第一个标记图像是设备采集的前几帧图像中,特征点数量大于预设阈值的一帧图像。
步骤1002,获取第一个标记图像对应的n个尺度不同的金字塔图像,n为大于1的整数;
设备提取第一个标记图像中的初始特征点。在本实施例中,设备可以提取第一个标记图像中的ORB特征点作为初始特征点。
在一个可选的实现中,设备为第一个标记图像生成n个尺度不同的金字塔图像,n为正整数。
金字塔图像是指对第一个标记图像按照预设比例进行缩放后的图像。以金字塔图像包括四层图像为例,按照缩放比例1.0、0.8、0.6、0.4将第一个标记图像进行缩放后,得到四张不同尺度的图像。
步骤1003,对每个金字塔图像提取初始特征点,并记录初始特征点在金字塔图像缩放至原始尺寸时的二维坐标。
设备对每一层金字塔图像都提取特征点并计算ORB特征描述子。对于不是原始尺度(1.0)的金字塔图像上提取的特征点,将该金字塔图像按照缩放比例放大到原始尺度后,记录每个特征点在原始尺度的金字塔图像上的二维坐标。这些金字塔图像上的特征点以及二维坐标,可称为layer-keypoint。在一个例子中,每层金字塔图像上的特征点最多有500个特征点。
对于第一个标记图像,将每个金字塔图像上的特征点确定为初始特征点。
步骤1004,获取多个标记图像中第i个标记图像之后采集的当前图像;
设备内的相机按照预设时间间隔采集一帧帧图像,形成图像序列。可选地,相机是在运动(平移和/或旋转)过程中,按照预设时间间隔采集一帧帧图像形成图像序列。
可选地,设备将图像序列中的第一帧图像(或前几帧图像中符合预定条件的一帧图像)确定为第一个标记图像,将后续采集的图像相对于第一个标记图像进行特征点追踪,并根据特征点追踪结果计算相机的相机姿态参数;若当前 帧图像的特征点追踪效果差于预设条件时,将当前帧图像的上一帧图像确定为第二个标记图像,将后续采集的图像相对于第二个标记图像进行特征点追踪,并根据特征点追踪结果计算相机的相机姿态参数,依次类推。设备可以按序进行连续多个标记图像的相机姿态追踪。
当处于第i个标记图像对应的第i个追踪过程时,相机会采集到当前图像。当前图像是第i个标记图像之后采集的某一帧图像,其中,i为大于1的整数。
步骤1005,当当前图像符合重定位条件时,获取多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,初始位姿参数用于指示相机采集第一个标记图像时的相机姿态;
设备会确定当前图像是否符合重定位条件。重定位条件用于指示当前图像相对于第i个标记图像的追踪过程失败,或者,重定位条件用于指示历史追踪过程中的累积误差已经高于预设条件。
在一个可选的实施例中,设备对当前图像相对于第i个标记图像进行追踪,若当前图像中不存在与第i个标记图像匹配的特征点,或者,当前图像中与第i个标记图像匹配的特征点少于第一数量时,确定当前图像相对于第i个标记图像的追踪过程失败,符合重定位条件。
在另一个可选的实施例中,设备确定当前图像与上一次重定位的图像之间的帧数大于第二数量时,确定历史追踪过程中的累积误差已经高于预设条件,或者,设备确定第i个标记图像和第一个标记图像之间的标记图像数量大于第三数量时,确定历史追踪过程中的累计误差已经高于预设条件。
本实施例对重定位条件的具体条件内容不加以限定。
当当前图像符合重定位条件时,设备尝试将当前图像相对于第一个标记图像进行特征点追踪。此时,设备获取缓存的第一个标记图像中的初始特征点以及初始位姿参数,该初始位姿参数用于指示相机采集第一个标记图像时的相机姿态。
步骤1006,对当前图像提取候选特征点;
设备提取特征点时采用的特征提取算法可以采用FAST检测算法、Shi-Tomasi角点检测算法、Harris角点检测算法、SIFT算法、ORB算法中的至少一种。本实施例以采用ORB算法提取当前图像中的ORB特征描述子作为候选特征点来举例说明。
步骤1007,通过IMU获取相机采集当前图像时的参考位姿变化量;
设备中设置有IMU,通过IMU能够获取相机采集当前图像时的参考位姿变化量。参考位姿变化量用于表征相机从采集第一个标记图像至采集当前图像过程中的位姿变化量,该位姿变化量包括旋转矩阵和位移向量。由于IMU的物理特征,IMU所采集的旋转矩阵是较为准确的,IMU采集的位移向量会存在一定的累积误差,但与真实结果相差不会太大,仍然具有指导意义。
步骤1008,根据参考位姿变化量将第一个标记图像中的初始特征点进行旋转平移投影,得到当前图像中与初始特征点对应的投影特征点;
在一个示例性的例子中,本步骤包括如下子步骤:
1、获取第一个标记图像中的初始特征点的二维坐标;
设备预先提取和缓存有第一个标记图像中的初始特征点的二维坐标。该二维坐标采用齐次表示。
2、对初始特征点的二维坐标进行反投影,得到初始特征点在三维空间中的第一三维坐标X born
设备通过如下公式将初始特征点的二维坐标变换至三维空间,得到这些初始特征点在三维空间中的第一三维坐标X born
X_born = d * [ (u_born - c_x)/f_x, (v_born - c_y)/f_y, 1 ]^T，其中 x_born = (u_born, v_born, 1)
其中,f x、f y、c x、c y为相机的内置参数。初始特征点的二维坐标x born是第一个标记图像上的layer-keyPoints的齐次表示,三维点x born是非齐次表示。假设第一个标记图像的初始深度d为1。
3、将第一三维坐标X born通过如下公式进行三维旋转平移,得到初始特征点在当前图像上对应的第二三维坐标X current;
X current=R*X born+T;
其中,R是IMU采集的参考位姿变化量中的旋转矩阵,T是IMU采集的参考位姿变化量中的位移向量。
4、将第二三维坐标X current投影至当前图像,得到投影特征点在当前图像中的二维坐标;
设备通过如下公式将第二三维坐标X current投影至当前图像,得到投影特征 点在当前图像中的二维坐标x current
x_current = ( f_x * X_c / Z_c + c_x, f_y * Y_c / Z_c + c_y )，其中 X_current = (X_c, Y_c, Z_c)
其中,f x、f y、c x、c y为相机的内置参数。
这些投影特征点在当前图像中所在的位置用于预测目标特征点的位置,通常这些投影特征点的位置与目标特征点的位置相同或接近。
步骤1009,在以投影特征点为中心的第一范围内,搜索与初始特征点匹配的目标特征点。
设备在当前图像中提取了多个ORB特征描述子。对于每个初始特征点对应的投影特征点,挑选出位于投影特征点为中心的第一范围内的候选ORB特征描述子,然后对投影特征点与候选ORB特征描述子进行匹配,当匹配成功时,认为搜索到与初始特征点匹配的目标特征点。
可选地,第一范围是矩形框或正方形框。本申请实施例对第一范围的样式不加以限定,第一范围还可以是菱形框、平行四边形框、圆形框等其它样式。
步骤1010,当搜索到的目标特征点数量少于预设阈值时,在以投影特征点为中心的第二范围内,重新搜索与初始特征点匹配的目标特征点。
可选地,第二范围大于第一范围。
需要说明的是,每个投影特征点对应各自的初始特征点,每个目标特征点对应各自的初始特征点。但投影特征点的总数小于或等于初始特征点的总数,目标特征点的总数小于或等于初始特征点的总数。
步骤1011,根据初始特征点和目标特征点,计算相机从第一相机姿态改变至目标相机姿态时的位姿变化量,目标相机姿态是相机在采集当前图像时的相机姿态;
可选地,设备根据初始特征点和目标特征点计算两帧图像之间的单应性矩阵homography;对单应性矩阵homography进行分解,得到相机从第一相机姿态改变至目标相机姿态时的位姿变化量R relocalize和T relocalize
单应性矩阵描述了两个平面之间的映射关系,若自然场景(现实环境)中的特征点都落在同一物理平面上,则可以通过单应性矩阵进行运动估计。当存在至少四对相匹配的初始特征点和目标特征点时,设备通过ransac对该单应性矩阵进行分解,得到旋转矩阵R relocalize和平移向量T relocalize
其中,R relocalize是相机从第一相机姿态改变至目标相机姿态时的旋转矩阵, T relocalize是相机从第一相机姿态改变至目标相机姿态时的位移向量。
步骤1012,根据初始位姿参数和位姿变化量,重定位得到目标相机姿态对应的目标位姿参数。
设备将初始位姿参数利用位姿变化量进行变换后,重定位得到目标相机姿态对应的目标位姿参数,从而计算得到相机在采集当前图像时的相机姿态。
在对当前图像重定位成功时,终端将当前图像确定为第i+1个标记图像。
终端基于第i+1个标记图像继续进行特征点追踪。终端根据后续的特征点追踪情况,还可以继续生成第i+2个标记图像、第i+3个标记图像、第i+4个标记图像等等,以此类推不再赘述。相关过程可参考上述图3所示的追踪内容。
综上所述,本实施例提供的重定位方法,通过在当前图像符合重定位条件时,将当前图像与第一个标记图像进行重定位,能够在连续多个标记图像进行追踪的Anchor-SLAM算法中实现重定位,从而减少了追踪过程中断的可能性,从而解决相关技术中的SLAM重定位方法在AR领域中重定位效果较差的问题。
另外,由于重定位过程是将当前图像相对于第一个标记图像进行重定位,第一个标记图像可以认为是没有累积误差的,所以本实施例还能消除多个标记图像的追踪过程所产生的累积误差。
本实施例提供的重定位方法,通过利用IMU采集的存在一定误差的参考位姿变化量,将初始特征点通过旋转平移投影至当前图像中,得到投影特征点;进而根据投影特征点在较小的范围内匹配得到目标特征点。一方面,由于搜索范围变小,减小了每个初始特征点的候选ORB特征描述子的个数,使得需要进行的匹配计算次数变少,加速了匹配过程;另一方面,由于投影特征点是基于2D-3D-2D的转换过程得到的,相当于附加入了3D约束条件,可以排除掉一些特征匹配度高但不满足几何一致性的干扰匹配点。
在一个示意性的例子中,上述相机姿态追踪过程的重定位方法可以用于AR程序中,通过该重定位方法能够实时根据现实世界的场景信息,对终端上的相机姿态进行追踪,并根据追踪结果调整和修改AR应用程序中的AR元素的显示位置。以图1或图2所示的运行在手机上的AR程序为例,当需要显示一个站立在书籍上的静止卡通人物时,不论用户如何移动该手机,只需要根据该手机上的相机姿态变化修改该卡通人物的显示位置,即可使该卡通人物在书籍上的站立位置保持不变。
以下为本申请的装置实施例,对于装置实施例中未详细描述的技术细节,可以参考上述方法实施例中的对应描述。
请参考图11,其示出了本申请一个示例性实施例提供的相机姿态追踪过程的重定位装置的框图。该重定位装置可以通过软件、硬件或者两者的结合实现成为电子设备或移动终端的全部或一部分。该装置上设置有相机,该相机可以是单目相机。该装置包括:图像获取模块1110、信息获取模块1120、特征点追踪模块1130、变化量计算模块1140和重定位模块1150。
图像获取模块1110,用于获取所述多个标记图像中第i个标记图像之后采集的当前图像,i为大于1的整数;
信息获取模块1120,用于当所述当前图像符合重定位条件时,获取所述多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,所述初始位姿参数用于指示所述相机采集所述第一个标记图像时的相机姿态;
特征点追踪模块1130,用于将所述当前图像相对于所述第一个标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点;
变化量计算模块1140,用于根据所述初始特征点和所述目标特征点,计算所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,所述目标相机姿态是所述相机在采集所述当前图像时的相机姿态;
重定位模块1150,用于根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数。
在一个可选的实施例中,所述设备中还设置有惯性测量单元IMU;所述特征点追踪模块1130包括:
提取子模块,用于对所述当前图像提取候选特征点;
采集子模块,用于通过所述IMU获取所述相机采集所述当前图像时的参考位姿变化量;
投影子模块,用于根据所述参考位姿变化量将所述第一个标记图像中的所述初始特征点进行旋转平移投影,得到所述当前图像中与所述初始特征点对应的投影特征点;
搜索子模块,用于在以所述投影特征点为中心的第一范围内,从所述候选特征点中搜索与所述初始特征点匹配的目标特征点。
在一个可选的实施例中,所述投影子模块用于:
获取所述第一个标记图像中的所述初始特征点的二维坐标;
对所述初始特征点的二维坐标进行反投影,得到所述初始特征点在三维空间中的第一三维坐标X born
将所述第一三维坐标X born通过如下公式进行三维旋转平移,得到所述初始特征点在所述当前图像上对应的第二三维坐标X current;
X current=R*X born+T;
将所述第二三维坐标X current投影至当前图像,得到所述投影特征点在所述当前图像中的二维坐标;
其中,R是所述参考位姿变化量中的旋转矩阵,T是所述参考位姿变化量中的位移参数。
在一个可选的实施例中,所述搜索子模块,还用于当搜索到的所述目标特征点数量少于预设阈值时,在以所述投影特征点为中心的第二范围内,重新搜索与所述初始特征点匹配的目标特征点;
其中,所述第二范围大于所述第一范围。
在一个可选的实施例中,所述装置还包括:提取模块1160;
所述图像获取模块1110,还用于获取所述第一个标记图像对应的n个尺度不同的金字塔图像,n为大于1的整数;
所述提取模块1160,还用于对每个所述金字塔图像提取所述初始特征点,并记录所述初始特征点在所述金字塔图像缩放至原始尺寸时的二维坐标。
在一个可选的实施例中,所述变化量计算模块1140,用于根据所述初始特征点和所述目标特征点计算所述相机在相机姿态改变过程时的单应性矩阵;对所述单应性矩阵进行分解,得到所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量R relocalize和T relocalize
在一个可选的实施例中,所述特征点追踪模块1130,用于将所述当前图像确定为第i+1个标记图像,将后续采集的图像相对于所述第i+1个标记图像进行特征点追踪。也即,所述特征点追踪模块1130,用于将当所述当前图像确定为第i+1个标记图像,基于所述第i+1个标记图像继续进行特征点追踪。
需要说明的是:上述实施例提供的相机姿态追踪过程的重定位装置在实现重定位时,仅以上述各功能模块的划分进行举例说明,实际应用中,可以根据需要而将上述功能分配由不同的功能模块完成,即将设备的内部结构划分成不同的功能模块,以完成以上描述的全部或者部分功能。另外,上述实施例提供的重定位装置与重定位方法实施例属于同一构思,其具体实现过程详见方法实 施例,这里不再赘述。
图12示出了本申请一个示例性实施例提供的电子设备1200的结构框图。该电子设备1200可以是:智能手机、平板电脑、MP3播放器(Moving Picture Experts Group Audio Layer III,动态影像专家压缩标准音频层面3)、MP4(Moving Picture Experts Group Audio Layer IV,动态影像专家压缩标准音频层面4)播放器、笔记本电脑或台式电脑。电子设备1200还可能被称为用户终端、便携式电子设备、膝上型设备、台式设备等其他名称。
通常,电子设备1200包括有:处理器1201和存储器1202。
处理器1201可以包括一个或多个处理核心,比如4核心处理器、8核心处理器等。处理器1201可以采用DSP(Digital Signal Processing,数字信号处理)、FPGA(Field-Programmable Gate Array,现场可编程门阵列)、PLA(Programmable Logic Array,可编程逻辑阵列)中的至少一种硬件形式来实现。处理器1201也可以包括主处理器和协处理器,主处理器是用于对在唤醒状态下的数据进行处理的处理器,也称CPU(Central Processing Unit,中央处理器);协处理器是用于对在待机状态下的数据进行处理的低功耗处理器。在一些实施例中,处理器1201可以在集成有GPU(Graphics Processing Unit,图像处理器),GPU用于负责显示屏所需要显示的内容的渲染和绘制。一些实施例中,处理器1201还可以包括AI(Artificial Intelligence,人工智能)处理器,该AI处理器用于处理有关机器学习的计算操作。
存储器1202可以包括一个或多个计算机可读存储介质,该计算机可读存储介质可以是非暂态的。存储器1202还可包括高速随机存取存储器,以及非易失性存储器,比如一个或多个磁盘存储设备、闪存存储设备。在一些实施例中,存储器1202中的非暂态的计算机可读存储介质用于存储至少一个指令,该至少一个指令用于被处理器1201所执行以实现本申请中方法实施例提供的相机姿态追踪过程的重定位方法。
在一些实施例中,电子设备1200还可选包括有:外围设备接口1203和至少一个外围设备。处理器1201、存储器1202和外围设备接口1203之间可以通过总线或信号线相连。各个外围设备可以通过总线、信号线或电路板与外围设备接口1203相连。示意性地,外围设备包括:射频电路1204、触摸显示屏1205、摄像头1206、音频电路1207、定位组件1208和电源1209中的至少一种。
外围设备接口1203可被用于将I/O(Input/Output,输入/输出)相关的至少一个外围设备连接到处理器1201和存储器1202。在一些实施例中,处理器1201、存储器1202和外围设备接口1203被集成在同一芯片或电路板上;在一些其他实施例中,处理器1201、存储器1202和外围设备接口1203中的任意一个或两个可以在单独的芯片或电路板上实现,本实施例对此不加以限定。
射频电路1204用于接收和发射RF(Radio Frequency,射频)信号,也称电磁信号。射频电路1204通过电磁信号与通信网络以及其他通信设备进行通信。射频电路1204将电信号转换为电磁信号进行发送,或者,将接收到的电磁信号转换为电信号。可选地,射频电路1204包括:天线系统、RF收发器、一个或多个放大器、调谐器、振荡器、数字信号处理器、编解码芯片组、用户身份模块卡等等。射频电路1204可以通过至少一种无线通信协议来与其它电子设备进行通信。该无线通信协议包括但不限于:万维网、城域网、内联网、各代移动通信网络(2G、3G、4G及5G)、无线局域网和/或WiFi(Wireless Fidelity,无线保真)网络。在一些实施例中,射频电路1204还可以包括NFC(Near Field Communication,近距离无线通信)有关的电路,本申请对此不加以限定。
显示屏1205用于显示UI(User Interface,用户界面)。该UI可以包括图形、文本、图标、视频及其它们的任意组合。当显示屏1205是触摸显示屏时,显示屏1205还具有采集在显示屏1205的表面或表面上方的触摸信号的能力。该触摸信号可以作为控制信号输入至处理器1201进行处理。此时,显示屏1205还可以用于提供虚拟按钮和/或虚拟键盘,也称软按钮和/或软键盘。在一些实施例中,显示屏1205可以为一个,设置电子设备1200的前面板;在另一些实施例中,显示屏1205可以为至少两个,分别设置在电子设备1200的不同表面或呈折叠设计;在再一些实施例中,显示屏1205可以是柔性显示屏,设置在电子设备1200的弯曲表面上或折叠面上。甚至,显示屏1205还可以设置成非矩形的不规则图形,也即异形屏。显示屏1205可以采用LCD(Liquid Crystal Display,液晶显示屏)、OLED(Organic Light-Emitting Diode,有机发光二极管)等材质制备。
摄像头组件1206用于采集图像或视频。可选地,摄像头组件1206包括前置摄像头和后置摄像头。通常,前置摄像头设置在电子设备的前面板,后置摄像头设置在电子设备的背面。在一些实施例中,后置摄像头为至少两个,分别为主摄像头、景深摄像头、广角摄像头、长焦摄像头中的任意一种,以实现主摄像头和景深摄像头融合实现背景虚化功能、主摄像头和广角摄像头融合实现 全景拍摄以及VR(Virtual Reality,虚拟现实)拍摄功能或者其它融合拍摄功能。在一些实施例中,摄像头组件1206还可以包括闪光灯。闪光灯可以是单色温闪光灯,也可以是双色温闪光灯。双色温闪光灯是指暖光闪光灯和冷光闪光灯的组合,可以用于不同色温下的光线补偿。
音频电路1207可以包括麦克风和扬声器。麦克风用于采集用户及环境的声波,并将声波转换为电信号输入至处理器1201进行处理,或者输入至射频电路1204以实现语音通信。出于立体声采集或降噪的目的,麦克风可以为多个,分别设置在电子设备1200的不同部位。麦克风还可以是阵列麦克风或全向采集型麦克风。扬声器则用于将来自处理器1201或射频电路1204的电信号转换为声波。扬声器可以是传统的薄膜扬声器,也可以是压电陶瓷扬声器。当扬声器是压电陶瓷扬声器时,不仅可以将电信号转换为人类可听见的声波,也可以将电信号转换为人类听不见的声波以进行测距等用途。在一些实施例中,音频电路1207还可以包括耳机插孔。
定位组件1208用于定位电子设备1200的当前地理位置,以实现导航或LBS(Location Based Service,基于位置的服务)。定位组件1208可以是基于美国的GPS(Global Positioning System,全球定位系统)、中国的北斗系统或俄罗斯的伽利略系统的定位组件。
电源1209用于为电子设备1200中的各个组件进行供电。电源1209可以是交流电、直流电、一次性电池或可充电电池。当电源1209包括可充电电池时,该可充电电池可以是有线充电电池或无线充电电池。有线充电电池是通过有线线路充电的电池,无线充电电池是通过无线线圈充电的电池。该可充电电池还可以用于支持快充技术。
在一些实施例中,电子设备1200还包括有一个或多个传感器1210。该一个或多个传感器1210包括但不限于:加速度传感器1211、陀螺仪传感器1212、压力传感器1213、指纹传感器1214、光学传感器1215以及接近传感器1216。
加速度传感器1211可以检测以电子设备1200建立的坐标系的三个坐标轴上的加速度大小。比如,加速度传感器1211可以用于检测重力加速度在三个坐标轴上的分量。处理器1201可以根据加速度传感器1211采集的重力加速度信号,控制触摸显示屏1205以横向视图或纵向视图进行用户界面的显示。加速度传感器1211还可以用于游戏或者用户的运动数据的采集。
陀螺仪传感器1212可以检测电子设备1200的机体方向及转动角度,陀螺 仪传感器1212可以与加速度传感器1211协同采集用户对电子设备1200的3D动作。处理器1201根据陀螺仪传感器1212采集的数据,可以实现如下功能:动作感应(比如根据用户的倾斜操作来改变UI)、拍摄时的图像稳定、游戏控制以及惯性导航。
压力传感器1213可以设置在电子设备1200的侧边框和/或触摸显示屏1205的下层。当压力传感器1213设置在电子设备1200的侧边框时,可以检测用户对电子设备1200的握持信号,由处理器1201根据压力传感器1213采集的握持信号进行左右手识别或快捷操作。当压力传感器1213设置在触摸显示屏1205的下层时,由处理器1201根据用户对触摸显示屏1205的压力操作,实现对UI界面上的可操作性控件进行控制。可操作性控件包括按钮控件、滚动条控件、图标控件、菜单控件中的至少一种。
指纹传感器1214用于采集用户的指纹,由处理器1201根据指纹传感器1214采集到的指纹识别用户的身份,或者,由指纹传感器1214根据采集到的指纹识别用户的身份。在识别出用户的身份为可信身份时,由处理器1201授权该用户执行相关的敏感操作,该敏感操作包括解锁屏幕、查看加密信息、下载软件、支付及更改设置等。指纹传感器1214可以被设置电子设备1200的正面、背面或侧面。当电子设备1200上设置有物理按键或厂商Logo时,指纹传感器1214可以与物理按键或厂商Logo集成在一起。
光学传感器1215用于采集环境光强度。在一个实施例中,处理器1201可以根据光学传感器1215采集的环境光强度,控制触摸显示屏1205的显示亮度。示意性地,当环境光强度较高时,调高触摸显示屏1205的显示亮度;当环境光强度较低时,调低触摸显示屏1205的显示亮度。在另一个实施例中,处理器1201还可以根据光学传感器1215采集的环境光强度,动态调整摄像头组件1206的拍摄参数。
接近传感器1216,也称距离传感器,通常设置在电子设备1200的前面板。接近传感器1216用于采集用户与电子设备1200的正面之间的距离。在一个实施例中,当接近传感器1216检测到用户与电子设备1200的正面之间的距离逐渐变小时,由处理器1201控制触摸显示屏1205从亮屏状态切换为息屏状态;当接近传感器1216检测到用户与电子设备1200的正面之间的距离逐渐变大时,由处理器1201控制触摸显示屏1205从息屏状态切换为亮屏状态。
本领域技术人员可以理解,图12中示出的结构并不构成对电子设备1200 的限定,可以包括比图示更多或更少的组件,或者组合某些组件,或者采用不同的组件布置。
上述本申请实施例序号仅仅为了描述,不代表实施例的优劣。
本领域普通技术人员可以理解实现上述实施例的全部或部分步骤可以通过硬件来完成,也可以通过程序来指令相关的硬件完成,所述的程序可以存储于一种计算机可读存储介质中,上述提到的存储介质可以是只读存储器,磁盘或光盘等。
以上所述仅为本申请的较佳实施例,并不用以限制本申请,凡在本申请的精神和原则之内,所作的任何修改、等同替换、改进等,均应包含在本申请的保护范围之内。

Claims (16)

  1. 一种相机姿态追踪过程的重定位方法,其特征在于,应用于具有相机的设备中,所述设备用于按序执行多个标记图像的相机姿态追踪,所述方法包括:
    获取所述多个标记图像中第i个标记图像之后采集的当前图像,i为大于1的整数;
    当所述当前图像符合重定位条件时,获取所述多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,所述初始位姿参数用于指示所述相机采集所述第一个标记图像时的相机姿态;
    将所述当前图像相对于所述第一个标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点;
    根据所述初始特征点和所述目标特征点,计算所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,所述目标相机姿态是所述相机在采集所述当前图像时的相机姿态;
    根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数。
  2. 根据权利要求1所述的方法,其特征在于,所述设备中还设置有惯性测量单元IMU;
    所述将所述当前图像相对于所述第一标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点,包括:
    对所述当前图像提取候选特征点;
    通过所述IMU获取所述相机采集所述当前图像时的参考位姿变化量;
    根据所述参考位姿变化量将所述第一个标记图像中的所述初始特征点进行旋转平移投影,得到所述当前图像中与所述初始特征点对应的投影特征点;
    在以所述投影特征点为中心的第一范围内,从所述候选特征点中搜索与所述初始特征点匹配的目标特征点。
  3. 根据权利要求2所述的方法,其特征在于,所述根据所述参考位姿变化量将所述第一个标记图像中的所述初始特征点进行旋转平移投影,得到所述当前图像中与所述初始特征点对应的投影特征点,包括:
    获取所述第一个标记图像中的所述初始特征点的二维坐标;
    对所述初始特征点的二维坐标进行反投影,得到所述初始特征点在三维空间中的第一三维坐标X born
    将所述第一三维坐标X born通过如下公式进行三维旋转平移,得到所述初始特征点在所述当前图像上对应的第二三维坐标X current
    X current=R*X born+T;
    将所述第二三维坐标X current投影至所述当前图像,得到所述投影特征点在所述当前图像中的二维坐标;
    其中,R是所述参考位姿变化量中的旋转矩阵,T是所述参考位姿变化量中的位移参数。
  4. 根据权利要求2所述的方法,其特征在于,所述以所述投影特征点为中心的第一范围内,搜索与所述初始特征点匹配的目标特征点之后,还包括:
    当搜索到的所述目标特征点数量少于预设阈值时,在以所述投影特征点为中心的第二范围内,重新搜索与所述初始特征点匹配的目标特征点;
    其中,所述第二范围大于所述第一范围。
  5. 根据权利要求1至4任一所述的方法,其特征在于,所述方法还包括:
    获取所述第一个标记图像对应的n个尺度不同的金字塔图像,n为大于1的整数;
    对每个所述金字塔图像提取所述初始特征点,并记录所述初始特征点在所述金字塔图像缩放至原始尺寸时的二维坐标。
  6. 根据权利要求1至4任一所述的方法,其特征在于,所述根据所述初始特征点和所述目标特征点,计算所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,包括:
    根据所述初始特征点和所述目标特征点计算所述相机在相机姿态改变过程时的单应性矩阵;
    对所述单应性矩阵进行分解,得到所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量R relocalize和T relocalize
  7. 根据权利要求1至4任一所述的方法,其特征在于,所述根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数之后,还包括:
    将所述当前图像确定为第i+1个标记图像;
    基于所述第i+1个标记图像继续进行特征点追踪。
  8. 一种相机姿态追踪过程的重定位装置,其特征在于,所述装置用于按序执行多个标记图像的相机姿态追踪,所述装置包括:
    图像获取模块,用于获取所述多个标记图像中第i个标记图像之后采集的当前图像,i为大于1的整数;
    信息获取模块,用于当所述当前图像符合重定位条件时,获取所述多个标记图像中的第一个标记图像的初始特征点和初始位姿参数,所述初始位姿参数用于指示所述相机采集所述第一个标记图像时的相机姿态;
    特征点追踪模块,用于将所述当前图像相对于所述第一个标记图像进行特征点追踪,得到与所述初始特征点匹配的目标特征点;
    变化量计算模块,用于根据所述初始特征点和所述目标特征点,计算相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量,所述目标相机姿态是所述相机在采集所述当前图像时的相机姿态;
    重定位模块,用于根据所述初始位姿参数和所述位姿变化量,重定位得到所述目标相机姿态对应的目标位姿参数。
  9. 根据权利要求8所述的装置,其特征在于,所述设备中还设置有惯性测量单元IMU;所述特征点追踪模块包括:
    提取子模块,用于对所述当前图像提取候选特征点;
    采集子模块,用于通过所述IMU获取所述相机采集所述当前图像时的参考位姿变化量;
    投影子模块,用于根据所述参考位姿变化量将所述第一个标记图像中的所述初始特征点进行旋转平移投影,得到所述当前图像中与所述初始特征点对应的投影特征点;
    搜索子模块,用于在以所述投影特征点为中心的第一范围内,从所述候选特征点中搜索与所述初始特征点匹配的目标特征点。
  10. 根据权利要求9所述的装置,其特征在于,所述投影子模块用于:
    获取所述第一个标记图像中的所述初始特征点的二维坐标;
    对所述初始特征点的二维坐标进行反投影,得到所述初始特征点在三维空间中的第一三维坐标X born
    将所述第一三维坐标X born通过如下公式进行三维旋转平移,得到所述初始特征点在所述当前图像上对应的第二三维坐标X current
    X current=R*X born+T;
    将所述第二三维坐标X current投影至当前图像,得到所述投影特征点在所述当前图像中的二维坐标;
    其中,R是所述参考位姿变化量中的旋转矩阵,T是所述参考位姿变化量中的位移参数。
  11. 根据权利要求9所述的装置,其特征在于,
    所述搜索子模块,还用于当搜索到的所述目标特征点数量少于预设阈值时,在以所述投影特征点为中心的第二范围内,重新搜索与所述初始特征点匹配的目标特征点;
    其中,所述第二范围大于所述第一范围。
  12. 根据权利要求8至11任一所述的装置,其特征在于,所述装置还包括:提取模块;
    所述图像获取模块,还用于获取所述第一个标记图像对应的n个尺度不同的金字塔图像,n为大于1的整数;
    所述提取模块,还用于对每个所述金字塔图像提取所述初始特征点,并记录所述初始特征点在所述金字塔图像缩放至原始尺寸时的二维坐标。
  13. 根据权利要求8至11任一所述的装置,其特征在于,
    所述变化量计算模块,用于根据所述初始特征点和所述目标特征点计算所 述相机在相机姿态改变过程时的单应性矩阵;对所述单应性矩阵进行分解,得到所述相机从所述第一相机姿态改变至目标相机姿态时的位姿变化量R relocalize和T relocalize
  14. 根据权利要求8至11任一所述的装置,其特征在于,
    所述特征点追踪模块,用于将所述当前图像确定为第i+1个标记图像;基于所述第i+1个标记图像继续进行特征点追踪。
  15. 一种电子设备,其特征在于,所述电子设备包括存储器和处理器;
    所述存储器中存储有至少一条指令,所述至少一条指令由所述处理器加载并执行以实现如权利要求1至7任一所述的相机姿态追踪过程的重定位方法。
  16. 一种计算机可读存储介质,其特征在于,所述存储介质中存储有至少一条指令,所述至少一条指令由处理器加载并执行以实现如权利要求1至7任一所述的相机姿态追踪过程的重定位方法。
PCT/CN2019/078928 2018-04-27 2019-03-20 相机姿态追踪过程的重定位方法、装置及存储介质 WO2019205842A1 (zh)

Priority Applications (2)

Application Number Priority Date Filing Date Title
EP19792392.3A EP3779883B1 (en) 2018-04-27 2019-03-20 Method and device for repositioning in camera orientation tracking process, and storage medium
US16/915,798 US11205282B2 (en) 2018-04-27 2020-06-29 Relocalization method and apparatus in camera pose tracking process and storage medium

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
CN201810391550.9A CN108648235B (zh) 2018-04-27 2018-04-27 相机姿态追踪过程的重定位方法、装置及存储介质
CN201810391550.9 2018-04-27

Related Child Applications (1)

Application Number Title Priority Date Filing Date
US16/915,798 Continuation US11205282B2 (en) 2018-04-27 2020-06-29 Relocalization method and apparatus in camera pose tracking process and storage medium

Publications (1)

Publication Number Publication Date
WO2019205842A1 true WO2019205842A1 (zh) 2019-10-31

Family

ID=63748232

Family Applications (1)

Application Number Title Priority Date Filing Date
PCT/CN2019/078928 WO2019205842A1 (zh) 2018-04-27 2019-03-20 相机姿态追踪过程的重定位方法、装置及存储介质

Country Status (4)

Country Link
US (1) US11205282B2 (zh)
EP (1) EP3779883B1 (zh)
CN (2) CN110555883B (zh)
WO (1) WO2019205842A1 (zh)

Cited By (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN112907662A (zh) * 2021-01-28 2021-06-04 北京三快在线科技有限公司 特征提取方法、装置、电子设备及存储介质
CN111582385B (zh) * 2020-05-11 2023-10-31 杭州易现先进科技有限公司 Slam质量的量化方法、系统、计算机设备和存储介质
EP4198874A4 (en) * 2020-08-11 2024-02-14 ZTE Corporation IMAGE PROCESSING METHOD AND DEVICE AS WELL AS ELECTRONIC DEVICE AND STORAGE MEDIUM
US12014520B2 (en) 2021-08-26 2024-06-18 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and methods for detecting objects within an image in a wide-view format

Families Citing this family (30)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN110555883B (zh) * 2018-04-27 2022-07-22 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置及存储介质
CN108876854B (zh) 2018-04-27 2022-03-08 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN110544280B (zh) 2018-05-22 2021-10-08 腾讯科技(深圳)有限公司 Ar系统及方法
CN110148178B (zh) * 2018-06-19 2022-02-22 腾讯科技(深圳)有限公司 相机定位方法、装置、终端及存储介质
CN109544615B (zh) * 2018-11-23 2021-08-24 深圳市腾讯信息技术有限公司 基于图像的重定位方法、装置、终端及存储介质
CN111260779B (zh) * 2018-11-30 2022-12-27 华为技术有限公司 地图构建方法、装置及系统、存储介质
CN111354042B (zh) * 2018-12-24 2023-12-01 深圳市优必选科技有限公司 机器人视觉图像的特征提取方法、装置、机器人及介质
CN111696157B (zh) * 2019-03-12 2024-06-18 北京京东尚科信息技术有限公司 图像重定位的确定方法、系统、设备和存储介质
CN111949112A (zh) 2019-05-14 2020-11-17 Oppo广东移动通信有限公司 对象交互方法及装置、系统、计算机可读介质和电子设备
CN110296686B (zh) * 2019-05-21 2021-11-09 北京百度网讯科技有限公司 基于视觉的定位方法、装置及设备
CN110414353B (zh) * 2019-06-24 2023-06-20 炬星科技(深圳)有限公司 机器人开机定位、运行重定位方法、电子设备及存储介质
CN110310333B (zh) * 2019-06-27 2021-08-31 Oppo广东移动通信有限公司 定位方法及电子设备、可读存储介质
CN110335317B (zh) * 2019-07-02 2022-03-25 百度在线网络技术(北京)有限公司 基于终端设备定位的图像处理方法、装置、设备和介质
CN111768443A (zh) * 2019-07-23 2020-10-13 北京京东尚科信息技术有限公司 基于移动摄像头的图像处理方法和装置
CN112406608B (zh) * 2019-08-23 2022-06-21 国创移动能源创新中心(江苏)有限公司 充电桩及其自动充电装置和方法
WO2021051227A1 (zh) * 2019-09-16 2021-03-25 深圳市大疆创新科技有限公司 三维重建中图像的位姿信息确定方法和装置
CN110942007B (zh) * 2019-11-21 2024-03-05 北京达佳互联信息技术有限公司 手部骨骼参数确定方法、装置、电子设备和存储介质
CN112585956B (zh) * 2019-11-29 2023-05-19 深圳市大疆创新科技有限公司 轨迹复演方法、系统、可移动平台和存储介质
CN113033590B (zh) * 2019-12-25 2024-08-09 杭州海康机器人股份有限公司 图像特征匹配方法、装置、图像处理设备及存储介质
CN115023743B (zh) * 2020-02-13 2024-12-31 Oppo广东移动通信有限公司 在增强现实会话中进行表面检测和追踪的方法和系统
US11397869B2 (en) * 2020-03-04 2022-07-26 Zerofox, Inc. Methods and systems for detecting impersonating social media profiles
CN111862150B (zh) * 2020-06-19 2024-06-14 杭州易现先进科技有限公司 图像跟踪的方法、装置、ar设备和计算机设备
CN111798489B (zh) * 2020-06-29 2024-03-08 北京三快在线科技有限公司 一种特征点跟踪方法、设备、介质及无人设备
CN112084041B (zh) * 2020-09-30 2025-02-25 汉海信息技术(上海)有限公司 资源处理方法、装置、电子设备及存储介质
CN112489224B (zh) * 2020-11-26 2025-01-10 北京字跳网络技术有限公司 图像绘制方法、装置、可读介质及电子设备
CN112562047B (zh) * 2020-12-16 2024-01-19 北京百度网讯科技有限公司 三维模型的控制方法、装置、设备以及存储介质
CN112945240B (zh) * 2021-03-16 2022-06-07 北京三快在线科技有限公司 特征点位置的确定方法、装置、设备及可读存储介质
CN113409388A (zh) * 2021-05-18 2021-09-17 深圳市乐纯动力机器人有限公司 扫地机位姿确定方法、装置、计算机设备和存储介质
CN113689484B (zh) * 2021-08-25 2022-07-15 北京三快在线科技有限公司 深度信息的确定方法、装置、终端及存储介质
CN115937305A (zh) * 2022-06-28 2023-04-07 北京字跳网络技术有限公司 图像处理方法、装置及电子设备

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160327395A1 (en) * 2014-07-11 2016-11-10 Regents Of The University Of Minnesota Inverse sliding-window filters for vision-aided inertial navigation systems
CN106885574A (zh) * 2017-02-15 2017-06-23 北京大学深圳研究生院 一种基于重跟踪策略的单目视觉机器人同步定位与地图构建方法
CN107193279A (zh) * 2017-05-09 2017-09-22 复旦大学 基于单目视觉和imu信息的机器人定位与地图构建系统
CN107808395A (zh) * 2017-10-31 2018-03-16 南京维睛视空信息科技有限公司 一种基于slam的室内定位方法
CN108615248A (zh) * 2018-04-27 2018-10-02 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN108648235A (zh) * 2018-04-27 2018-10-12 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置及存储介质

Family Cites Families (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN101464134B (zh) * 2009-01-16 2010-08-11 哈尔滨工业大学 一种空间目标三维位姿视觉测量方法
WO2011048497A2 (en) * 2009-10-19 2011-04-28 National University Of Singapore Computer vision based hybrid tracking for augmented reality in outdoor urban environments
CN102506757B (zh) * 2011-10-10 2014-04-23 南京航空航天大学 双目立体测量系统多视角测量中的自定位方法
US10860683B2 (en) * 2012-10-25 2020-12-08 The Research Foundation For The State University Of New York Pattern change discovery between high dimensional data sets
US9940553B2 (en) * 2013-02-22 2018-04-10 Microsoft Technology Licensing, Llc Camera/object pose from predicted coordinates
CN103247075B (zh) * 2013-05-13 2015-08-19 北京工业大学 基于变分机制的室内环境三维重建方法
US11051000B2 (en) * 2014-07-14 2021-06-29 Mitsubishi Electric Research Laboratories, Inc. Method for calibrating cameras with non-overlapping views
US10152825B2 (en) * 2015-10-16 2018-12-11 Fyusion, Inc. Augmenting multi-view image data with synthetic objects using IMU and image data
US11232583B2 (en) * 2016-03-25 2022-01-25 Samsung Electronics Co., Ltd. Device for and method of determining a pose of a camera
CN105953796A (zh) * 2016-05-23 2016-09-21 北京暴风魔镜科技有限公司 智能手机单目和imu融合的稳定运动跟踪方法和装置
CN106092104B (zh) * 2016-08-26 2019-03-15 深圳微服机器人科技有限公司 一种室内机器人的重定位方法及装置
KR102267482B1 (ko) * 2016-08-30 2021-06-22 스냅 인코포레이티드 동시 로컬화 및 매핑을 위한 시스템 및 방법
CN106446815B (zh) * 2016-09-14 2019-08-09 浙江大学 一种同时定位与地图构建方法
CN106875450B (zh) * 2017-02-20 2019-09-20 清华大学 用于相机重定位的训练集优化方法及装置
CN106996769B (zh) * 2017-03-22 2020-02-14 天津大学 一种无需相机标定的主动式位姿快速重定位方法
CN107301661B (zh) * 2017-07-10 2020-09-11 中国科学院遥感与数字地球研究所 基于边缘点特征的高分辨率遥感图像配准方法
CN107888828B (zh) * 2017-11-22 2020-02-21 杭州易现先进科技有限公司 空间定位方法及装置、电子设备、以及存储介质

Patent Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20160327395A1 (en) * 2014-07-11 2016-11-10 Regents Of The University Of Minnesota Inverse sliding-window filters for vision-aided inertial navigation systems
CN106885574A (zh) * 2017-02-15 2017-06-23 北京大学深圳研究生院 一种基于重跟踪策略的单目视觉机器人同步定位与地图构建方法
CN107193279A (zh) * 2017-05-09 2017-09-22 复旦大学 基于单目视觉和imu信息的机器人定位与地图构建系统
CN107808395A (zh) * 2017-10-31 2018-03-16 南京维睛视空信息科技有限公司 一种基于slam的室内定位方法
CN108615248A (zh) * 2018-04-27 2018-10-02 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN108648235A (zh) * 2018-04-27 2018-10-12 腾讯科技(深圳)有限公司 相机姿态追踪过程的重定位方法、装置及存储介质

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
See also references of EP3779883A4 *

Cited By (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111582385B (zh) * 2020-05-11 2023-10-31 杭州易现先进科技有限公司 Slam质量的量化方法、系统、计算机设备和存储介质
EP4198874A4 (en) * 2020-08-11 2024-02-14 ZTE Corporation IMAGE PROCESSING METHOD AND DEVICE AS WELL AS ELECTRONIC DEVICE AND STORAGE MEDIUM
CN112907662A (zh) * 2021-01-28 2021-06-04 北京三快在线科技有限公司 特征提取方法、装置、电子设备及存储介质
CN112907662B (zh) * 2021-01-28 2022-11-04 北京三快在线科技有限公司 特征提取方法、装置、电子设备及存储介质
US12014520B2 (en) 2021-08-26 2024-06-18 Toyota Motor Engineering & Manufacturing North America, Inc. Systems and methods for detecting objects within an image in a wide-view format

Also Published As

Publication number Publication date
CN108648235A (zh) 2018-10-12
CN110555883A (zh) 2019-12-10
US20200327694A1 (en) 2020-10-15
EP3779883A1 (en) 2021-02-17
US11205282B2 (en) 2021-12-21
CN108648235B (zh) 2022-05-17
EP3779883B1 (en) 2024-10-16
EP3779883A4 (en) 2021-12-22
CN110555883B (zh) 2022-07-22

Similar Documents

Publication Publication Date Title
WO2019205842A1 (zh) 相机姿态追踪过程的重定位方法、装置及存储介质
US11481923B2 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
CN110544280B (zh) Ar系统及方法
WO2019205853A1 (zh) 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN108596976B (zh) 相机姿态追踪过程的重定位方法、装置、设备及存储介质
WO2019205851A1 (zh) 位姿确定方法、装置、智能设备及存储介质
CN109947886B (zh) 图像处理方法、装置、电子设备及存储介质
US11276183B2 (en) Relocalization method and apparatus in camera pose tracking process, device, and storage medium
WO2019205850A1 (zh) 位姿确定方法、装置、智能设备及存储介质
CN108682038B (zh) 位姿确定方法、装置及存储介质
CN110148178B (zh) 相机定位方法、装置、终端及存储介质
WO2019154231A1 (zh) 图像处理方法、电子设备及存储介质
CN111738220A (zh) 三维人体姿态估计方法、装置、设备及介质
CN109086709A (zh) 特征提取模型训练方法、装置及存储介质
CN114170349B (zh) 图像生成方法、装置、电子设备及存储介质
CN110544272A (zh) 脸部跟踪方法、装置、计算机设备及存储介质
CN108776822B (zh) 目标区域检测方法、装置、终端及存储介质
CN108682037B (zh) 相机姿态追踪过程的重定位方法、装置、设备及存储介质
CN110570460A (zh) 目标跟踪方法、装置、计算机设备及计算机可读存储介质
WO2019134305A1 (zh) 确定姿态的方法、装置、智能设备、存储介质和程序产品
CN113033590B (zh) 图像特征匹配方法、装置、图像处理设备及存储介质
CN113298040A (zh) 关键点检测方法、装置、电子设备及计算机可读存储介质
CN113409235B (zh) 一种灭点估计的方法及装置
CN114254687A (zh) 钻井轨道匹配度的确定方法、装置、设备及存储介质
CN111369566A (zh) 确定路面消隐点位置的方法、装置、设备及存储介质

Legal Events

Date Code Title Description
121 Ep: the epo has been informed by wipo that ep was designated in this application

Ref document number: 19792392

Country of ref document: EP

Kind code of ref document: A1

NENP Non-entry into the national phase

Ref country code: DE

ENP Entry into the national phase

Ref document number: 2019792392

Country of ref document: EP

Effective date: 20201026