
CN113012298A - Curved MARK three-dimensional registration augmented reality method based on region detection - Google Patents

Curved MARK three-dimensional registration augmented reality method based on region detection

Info

Publication number
CN113012298A
Authority
CN
China
Prior art keywords
mark
point
camera
pose
natural texture
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202011563089.4A
Other languages
Chinese (zh)
Other versions
CN113012298B (en)
Inventor
张明敏
陈忠庆
潘志庚
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Zhejiang University ZJU
Original Assignee
Zhejiang University ZJU
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Zhejiang University ZJU filed Critical Zhejiang University ZJU
Priority to CN202011563089.4A priority Critical patent/CN113012298B/en
Publication of CN113012298A publication Critical patent/CN113012298A/en
Application granted granted Critical
Publication of CN113012298B publication Critical patent/CN113012298B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00Manipulating 3D models or images for computer graphics
    • G06T19/006Mixed reality
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T15/003D [Three Dimensional] image rendering
    • G06T15/005General purpose rendering architectures
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/70Determining position or orientation of objects or cameras
    • G06T7/73Determining position or orientation of objects or cameras using feature-based methods
    • G06T7/75Determining position or orientation of objects or cameras using feature-based methods involving models
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00Image analysis
    • G06T7/80Analysis of captured images to determine intrinsic or extrinsic camera parameters, i.e. camera calibration
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/10Image acquisition modality
    • G06T2207/10024Color image
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20081Training; Learning
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/20Special algorithmic details
    • G06T2207/20084Artificial neural networks [ANN]
    • GPHYSICS
    • G06COMPUTING; CALCULATING OR COUNTING
    • G06TIMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00Indexing scheme for image analysis or image enhancement
    • G06T2207/30Subject of image; Context of image processing
    • G06T2207/30204Marker
    • G06T2207/30208Marker matrix

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Graphics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a curved MARK three-dimensional registration augmented reality method based on region detection. The curved MARK can be partially occluded without affecting the final effect. The method overcomes the problems that a traditional planar MARK cannot be bent, which destroys the visual consistency of cylindrical objects in an augmented reality scene, and that natural texture MARKs suffer from low robustness and poor real-time performance.

Description

Curved MARK three-dimensional registration augmented reality method based on region detection
Technical Field
The invention belongs to the field of intersection of computer vision technology and graphics, and particularly relates to a curved MARK three-dimensional registration augmented reality method based on region detection.
Background
With the continuing development and maturity of the internet and the iterative update of multimedia technology, Augmented Reality (AR) has become more and more common in our daily life and learning. Augmented reality is a highly practical technology combining computer graphics and computer vision: it can overlay information such as virtual objects, videos and text onto a real scene, so that users acquire more information from the scene and understand it more deeply and clearly.
Augmented reality has wide applications in daily life, such as teaching demonstrations, tour navigation, virtual shopping and workshop guidance. In the teaching field, augmented reality can provide students with safer and more interesting experiments and improve their interest and practical ability. Because many experiments are dangerous or hard to observe, they are often skipped in the teaching process; this has promoted the use of augmented reality in the classroom, where students can complete dangerous experiments and observe more detailed experimental effects through the interaction of superimposed virtual objects with real objects, greatly improving both hands-on ability and theoretical understanding. For the tourism industry, augmented reality lets users obtain more direct and vivid explanations on a mobile terminal, enhancing the interest and interactivity of navigation.
An augmented reality system mainly involves technologies such as three-dimensional registration, user interaction and virtual-real fusion, among which three-dimensional registration plays a decisive role in the development and popularization of the system: its main function is to estimate the relative pose of the camera in the scene so that a virtual object can be superimposed on the real scene. At present, three-dimensional registration still falls short of user expectations because of problems with real-time performance, robustness, stability and visual quality; deep exploration of three-dimensional registration technology is therefore a hot topic in augmented reality and has far-reaching significance for its development.
In an augmented reality system, to produce the visual effect of virtual-real fusion, registration alignment of the virtual and real environments must first be ensured. The most common approach is to let the virtual and real environments share the same spatial coordinate system, so that virtual objects can be rendered in the scene to achieve virtual-real interaction. Augmented reality systems generally use the camera as the main sensor and obtain the positions at which virtual objects are to be rendered by estimating the camera's relative pose in real time through three-dimensional registration.
The three-dimensional registration technique most commonly used today is vision-based registration, with planar MARKs being the most common. However, attaching a planar MARK to a curved object such as a cylinder damages the object's appearance and greatly reduces the user's immersion, so three-dimensional registration based on a curved MARK is of great significance for the development of augmented reality. MARKs fall mainly into artificial MARKs and MARKs based on natural texture; artificial MARKs such as Hamming codes and two-dimensional codes cannot be occluded and cannot yield a correct pose after being bent, so the natural texture MARK becomes the only viable choice for realizing a curved MARK.
Disclosure of Invention
The invention aims to apply curved MARK three-dimensional registration technology to the field of augmented reality, and provides a curved MARK three-dimensional registration augmented reality method based on region detection. The region where the MARK is located is obtained through a neural network model and three-dimensional registration is carried out on the curved MARK; the MARK can be partially occluded without harming the attractiveness or the real-time performance of the augmented reality.
The method is based on a region detection technique: the region where the curved MARK lies in the scene is obtained, and a three-dimensional model of the MARK bent onto the cylinder is built from the cylinder radius and the coordinates of the MARK's planar feature points. The coordinates of the curved MARK feature points are acquired in the scene, and the relative pose between the camera and the MARK is recovered through the PnP algorithm, so that the virtual object can be rendered into the scene.
The technical scheme adopted by the invention for solving the technical problems is as follows:
a curved MARK three-dimensional registration augmented reality method based on region detection comprises the following steps:
step (1), calibrating a camera:
acquiring the internal parameters and distortion parameters of the RGB monocular camera by using the Zhang Zhengyou camera calibration method;
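As an illustrative sketch (not part of the patent text), this calibration step can be carried out with OpenCV's chessboard routine; the 9x6 board size and the calib/*.jpg paths are assumptions:

```python
import glob

import cv2
import numpy as np

# Assumed setup: a 9x6 inner-corner chessboard photographed from several angles.
pattern = (9, 6)
objp = np.zeros((pattern[0] * pattern[1], 3), np.float32)
objp[:, :2] = np.mgrid[0:pattern[0], 0:pattern[1]].T.reshape(-1, 2)

obj_points, img_points = [], []
for path in glob.glob("calib/*.jpg"):          # hypothetical image folder
    gray = cv2.cvtColor(cv2.imread(path), cv2.COLOR_BGR2GRAY)
    found, corners = cv2.findChessboardCorners(gray, pattern)
    if found:
        obj_points.append(objp)
        img_points.append(corners)

# K holds fx, fy, cx, cy; dist holds the distortion parameters.
ret, K, dist, rvecs, tvecs = cv2.calibrateCamera(
    obj_points, img_points, gray.shape[::-1], None, None)
```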
step (2), constructing a data set:
more than 300 pictures of each natural texture MARK to be identified are taken under different angles, different distances, different illumination, partial occlusion and non-occlusion conditions; 80% of the pictures are used as the training set and the remaining 20% as the validation set. The bounding box and classes of the natural texture MARK in each picture are calibrated with labelImg software to generate a corresponding xml format file; during calibration, the region where the natural texture MARK is located is first framed with a rectangle, and then the class corresponding to that natural texture MARK is labelled;
step (3), the Yolov5 neural network model is used as the natural texture MARK target detection model; it is trained on the training set constructed in step (2), its accuracy is verified on the validation set, and the trained model is extracted. The trained natural texture MARK target detection model can recognize the bounding box of a MARK in the scene and identify its specific class;
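For illustration only, a Yolov5 model trained this way is commonly loaded for inference through torch.hub; the weights path below is hypothetical:

```python
import cv2
import torch

# Load user-trained weights ('custom') from the ultralytics/yolov5 hub repo.
# The path to best.pt is a placeholder for wherever training saved the weights.
model = torch.hub.load("ultralytics/yolov5", "custom",
                       path="runs/train/exp/weights/best.pt")

frame = cv2.imread("scene.jpg")                # any scene image
results = model(frame)
# Each detection row: x1, y1, x2, y2, confidence, class index.
detections = results.xyxy[0].cpu().numpy()
```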
step (4), the MARK is printed and attached to a cylindrical object; the cylinder radius r and the width and height of the natural texture MARK picture, marker_w and marker_h, are measured. The MARK picture feature points are extracted with the Fast algorithm, and the three-dimensional coordinates of each feature point relative to the MARK center point are calculated. For a feature point at MARK picture coordinates (x, y), the angle between it and the line through the cylinder center is given by equation (1), with pixel2mm denoting the pixel-to-millimeter conversion scale, and the corresponding three-dimensional coordinates, in millimeters, are given by equations (2), (3) and (4). (Equations (1)-(4) appear only as embedded images in the original publication.) The three-dimensional coordinates of all feature points are stored in a dictionary whose keys are the MARK picture coordinates (x, y) and whose values are the coordinates obtained from equations (2), (3) and (4);
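Because equations (1)-(4) survive only as embedded images, the following sketch is an assumed reconstruction based on the standard geometry of wrapping a planar MARK onto a cylinder of radius r; it uses the quantities the text defines (r, marker_w, marker_h, pixel2mm) but should not be read as the patent's exact formulas:

```python
import math

def feature_point_3d(x, y, r, marker_w, marker_h, pixel2mm):
    """Assumed stand-in for equations (1)-(4): map a planar MARK pixel (x, y)
    to 3D coordinates (in mm) relative to the MARK center after wrapping."""
    # Assumed equation (1): arc length from the vertical center line -> angle.
    theta = (x - marker_w / 2.0) * pixel2mm / r
    # Assumed equations (2)-(4): point on the cylinder surface.
    return (r * math.sin(theta),               # sideways offset
            (y - marker_h / 2.0) * pixel2mm,   # along the cylinder axis
            r * (1.0 - math.cos(theta)))       # depth away from the MARK plane

# Dictionary keyed by MARK picture coordinates, as step (4) describes.
feature_points = [(120, 80), (260, 150)]       # illustrative Fast corners
coords_3d = {(x, y): feature_point_3d(x, y, r=40.0, marker_w=400,
                                      marker_h=300, pixel2mm=0.25)
             for (x, y) in feature_points}
```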
step (5), for a scene picture obtained by the camera, the natural texture MARK is extracted with the natural texture MARK target detection model to produce a region of interest (ROI). All feature points in the ROI are extracted with the Fast algorithm and their descriptors are computed with the ORB algorithm; on the basis of the descriptors' Hamming distances, the extracted ROI feature points are matched against the original MARK feature points using RANSAC and the K-nearest-neighbor (KNN) classification algorithm to obtain the 30 best-matching feature point pairs. The three-dimensional coordinates of these MARK picture feature points are obtained from the dictionary of step (4), and the relative pose of the camera and the curved MARK is estimated with the PnP algorithm, implemented as follows:
A point of the world coordinate system, X_w = (x_w, y_w, z_w, 1), is related to its projection on the image plane, X_i = (x_i, y_i, 1), by the following formula, where fx, fy, cx and cy are the camera intrinsics calibrated with the Zhang Zhengyou method, r_ij are the entries of the rotation matrix and t_i are the translation components:

$$\lambda \begin{bmatrix} x_i \\ y_i \\ 1 \end{bmatrix} = \begin{bmatrix} f_x & 0 & c_x \\ 0 & f_y & c_y \\ 0 & 0 & 1 \end{bmatrix} \begin{bmatrix} r_{11} & r_{12} & r_{13} & t_1 \\ r_{21} & r_{22} & r_{23} & t_2 \\ r_{31} & r_{32} & r_{33} & t_3 \end{bmatrix} \begin{bmatrix} x_w \\ y_w \\ z_w \\ 1 \end{bmatrix} \quad (5)$$

which is simplified as:

$$\lambda X_i = K M X_w \quad (6)$$
where λ represents a scale factor, the matrix K is the camera intrinsic parameter matrix, and the matrix M is the model-view matrix. From the 30 feature point matching pairs, 4 matching pairs are selected at random; 3 of them are used to compute 4 candidate solutions, the remaining pair is substituted into the formula, and the solution with the smallest reprojection error is taken as the final solution. During this process the random sample consensus (RANSAC) algorithm is used to optimize the final solution;
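A minimal sketch of this step, assuming OpenCV's built-in RANSAC-wrapped PnP solver in place of the hand-rolled 4-point hypothesis loop described above; the reprojection threshold is an assumption:

```python
import cv2
import numpy as np

# obj_pts: Nx3 coordinates from the step (4) dictionary; img_pts: Nx2 matched
# scene coordinates (the 30 best pairs); K, dist from the calibration step.
obj_pts = np.asarray(obj_pts, dtype=np.float32)
img_pts = np.asarray(img_pts, dtype=np.float32)

ok, rvec, tvec, inliers = cv2.solvePnPRansac(
    obj_pts, img_pts, K, dist, reprojectionError=3.0)

R, _ = cv2.Rodrigues(rvec)      # rotation matrix (the r_ij entries)
M = np.hstack([R, tvec])        # model-view matrix M of equation (6)
```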
step (6), while the MARK moves, the motion of the feature points is tracked with the optical flow method, and the number of feature points that moved between two successive frames is counted. When the number of moved points is less than or equal to ten percent of the total number of feature points, the marker pose is considered unchanged relative to the previous frame; when it is greater than ten percent, the marker pose is considered to have changed, and step (5) is followed to obtain the current natural texture MARK pose for three-dimensional registration. The specific implementation is as follows:

Let I and J be the grayscale images of the previous frame and the current frame (the defining equation (7) appears only as an embedded image in the original). Point A is an arbitrary point of the image with coordinate vector (x, y)^T. For a point u = [u_x, u_y]^T in the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame, where d = [d_x, d_y]^T is the image velocity at point A, i.e. the optical flow at A. Because of the aperture problem, similarity is defined over a two-dimensional neighborhood: with ω_x and ω_y two integer values, the residual function minimized with respect to the velocity vector d is

$$\varepsilon(d) = \sum_{x=u_x-\omega_x}^{u_x+\omega_x}\ \sum_{y=u_y-\omega_y}^{u_y+\omega_y} \big( I(x, y) - J(x + d_x, y + d_y) \big)^2 \quad (8)$$

This defines the similarity over an image neighborhood of size (2ω_x + 1) × (2ω_y + 1); solving for d gives the corresponding position of point u in image J. ω_x and ω_y take the value 2, 3, 4, 5, 6 or 7.
The feature point positions computed in the current frame are compared with the corresponding positions in the previous frame to judge whether the feature points moved between the two adjacent camera frames, and the number of moved feature points is counted. If the number of moved feature points in the current frame is less than or equal to ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous frame's pose is used directly; if it is greater than ten percent, the object is considered to have moved and the relative pose of the camera with respect to the object is recomputed;
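A minimal sketch of this tracking-and-threshold logic with OpenCV's pyramidal Lucas-Kanade tracker; the window size, pyramid depth and the 1-pixel movement tolerance are assumptions:

```python
import cv2
import numpy as np

# prev_gray, curr_gray: grayscale frames I and J; prev_pts: Nx1x2 float32
# feature point array from the previous frame.
curr_pts, status, err = cv2.calcOpticalFlowPyrLK(
    prev_gray, curr_gray, prev_pts, None, winSize=(15, 15), maxLevel=3)

ok = status.ravel() == 1
shift = np.linalg.norm((curr_pts - prev_pts)[ok], axis=2).ravel()
moved = (shift > 1.0).sum()            # assumed 1-pixel movement tolerance

if moved <= 0.1 * ok.sum():
    pose = last_pose                   # reuse the previous frame's pose
else:
    pose = estimate_pose_step5()       # hypothetical hook into step (5)
```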
step (7), in the process of estimating the relative camera pose in step (5), Kalman filtering is used to predict and correct the 6D pose of the MARK.
First the displacement (t_x, t_y, t_z) and the rotation angles (ψ, θ, φ) of the camera relative to the natural texture MARK are defined. The first derivative of the coordinates is (t_x′, t_y′, t_z′) and the second derivative is (t_x″, t_y″, t_z″), where the first derivative represents the speed of the natural texture MARK's movement and the second its acceleration; the first derivative of the rotation angles is (ψ′, θ′, φ′) and the second is (ψ″, θ″, φ″), where the first derivative represents the speed of the MARK's rotation and the second its rotational acceleration. Kalman filtering is used for estimation and correction; the specific state vector is:
Kalman = (t_x, t_y, t_z, t_x′, t_y′, t_z′, t_x″, t_y″, t_z″, ψ, θ, φ, ψ′, θ′, φ′, ψ″, θ″, φ″)   (9)
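A sketch of how an 18-state constant-acceleration filter matching equation (9) might be set up with OpenCV's cv2.KalmanFilter; the state here is reordered as (pose, velocity, acceleration) for compact indexing, and the time step and noise covariances are assumptions:

```python
import cv2
import numpy as np

dt = 1.0 / 30.0                      # assumed camera frame interval
kf = cv2.KalmanFilter(18, 6)         # 18 states, 6 measured pose values

F = np.eye(18, dtype=np.float32)     # constant-acceleration transition model
for i in range(6):
    F[i, 6 + i] = dt                 # pose += velocity * dt
    F[i, 12 + i] = 0.5 * dt * dt     # pose += 0.5 * acceleration * dt^2
    F[6 + i, 12 + i] = dt            # velocity += acceleration * dt
kf.transitionMatrix = F

H = np.zeros((6, 18), dtype=np.float32)
H[:, :6] = np.eye(6)                 # only the 6 pose values are measured
kf.measurementMatrix = H
kf.processNoiseCov = np.eye(18, dtype=np.float32) * 1e-4
kf.measurementNoiseCov = np.eye(6, dtype=np.float32) * 1e-2

predicted = kf.predict()             # prediction step
# measured: 6x1 float32 vector (tx, ty, tz, psi, theta, phi) from the PnP step
corrected = kf.correct(measured)
```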
step (8), a sliding window is used to eliminate frames with a wrong pose estimate.

Whether the currently estimated camera pose is correct is judged from the camera pose coordinates of the two frames after and the two frames before the current frame, so as to eliminate frames whose camera pose estimate is wrong because of blur during the motion of the natural texture MARK. Let the displacement in the current frame's 6D camera pose be (x_t, y_t, z_t); the average camera displacement of the two preceding frames, (x', y', z'), and of the two following frames, (x", y", z"), are computed, and the current displacement must satisfy:

x" - d_t < x_t < x' + d_t  or  x' - d_t < x_t < x" + d_t
y" - d_t < y_t < y' + d_t  or  y' - d_t < y_t < y" + d_t   (10)
z" - d_t < z_t < z' + d_t  or  z' - d_t < z_t < z" + d_t

If these conditions hold, the current pose is considered a valid pose; otherwise the current frame is considered a blurred frame and the last valid pose continues to be used, where d_t is the translation threshold, d_t = 3.
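The window test of equation (10) is simple enough to state directly; a minimal sketch, with the per-axis check factored out:

```python
def pose_is_valid(current, prev_avg, next_avg, d_t=3.0):
    """Check the step (8) condition for each displacement axis: the current
    value must lie between the two window averages, widened by d_t."""
    def in_band(v, a, b):
        return (b - d_t < v < a + d_t) or (a - d_t < v < b + d_t)
    return all(in_band(v, a, b) for v, a, b in zip(current, prev_avg, next_avg))

# Example: current frame displacement vs. the averages of the two frames
# before and the two frames after it. If invalid, reuse the last valid pose.
valid = pose_is_valid((1.2, -0.4, 9.8), (1.0, -0.5, 9.5), (1.5, -0.3, 10.1))
```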
And (9) after the relative pose between the camera and the natural texture MARK is obtained through the steps (5), (6), (7) and (8), the virtual object needing three-dimensional registration is subjected to translation and rotation transformation, and the virtual object is rendered into a scene through OpenGL and OpenCV to achieve the effect of augmented reality.
The invention has the beneficial effects that:
the method comprises the steps of attaching a two-dimensional natural texture MARK to a cylinder to form a curved MARK, processing the curved MARK through a neural network model to obtain the region where the curved MARK is located in the current scene, calculating the relative pose between a camera and an object through feature point matching, and rendering a virtual object into an augmented reality scene. Partial occlusion can be done for the curved MARK without affecting the final effect. The method solves the problems that the traditional plane MARK can not be bent, so that the consistency of the cylindrical object to the augmented reality scene is damaged, the robustness of the natural texture MARK is low, the real-time performance is low and the like.
Drawings
FIG. 1 is a picture of a natural texture MARK according to an embodiment of the present invention;
FIG. 2 is a diagram illustrating feature points detected in a MARK picture according to an embodiment of the present invention;
FIG. 3 is a schematic diagram illustrating the calculation of three-dimensional coordinates of feature points on a MARK according to an embodiment of the present invention;
FIG. 4 is a comparison diagram of feature points between two adjacent frames according to an embodiment of the present invention;
FIG. 5 is an effect diagram of a virtual object rendered to an assigned pose in a scene according to an embodiment of the present invention;
FIG. 6 is a flowchart of a method according to an embodiment of the present invention.
Detailed Description
The method of the present invention is further described below with reference to the accompanying drawings.
The experimental environment is a monocular RGB camera (640 x 480) and a cylindrical object; a natural texture MARK picture is printed and attached to the cylindrical object, and during the experiment the MARK portion of the cylinder always faces the monocular camera.
As shown in fig. 1, a picture with sharp corners and irregular, multi-feature natural texture is used as the MARK. A symmetrical picture is not selected; the chosen picture contains a large number of shapes with obvious differences between them.
As shown in fig. 2, all feature points in the MARK picture are calculated by using the Fast algorithm; the specific steps are as follows:
step (a), a pixel Q is selected from the MARK picture; to judge whether it is a feature point, its brightness value is first set as I_q;
Step (b), a Bresenham circle is obtained by taking the pixel point Q as the center and the radius of the Bresenham circle as 3, and the circle has 16 pixels;
step (c), on this circle of 16 pixels, if 9 consecutive pixels all have values greater than I_q + t or all less than I_q - t, pixel Q is considered a feature point, where t is a set threshold;
Step (d), to improve the efficiency of corner judgment by excluding non-corner pixels from the image, the pixels at the four positions 1, 9, 5 and 13 are checked first: when pixel Q is a corner, at least 3 of these four pixels must have values greater than I_q + t or less than I_q - t. If the pixels at the four positions do not satisfy this condition, Q is not a corner. All pixels are screened in this way to exclude non-corners, and the remaining pixels then undergo the full test of step (c) to obtain the final corners.
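For reference, OpenCV ships a FAST detector implementing this segment test, including the high-speed check on positions 1, 9, 5 and 13; the threshold value below is illustrative:

```python
import cv2

mark_gray = cv2.imread("mark.png", cv2.IMREAD_GRAYSCALE)   # hypothetical path
fast = cv2.FastFeatureDetector_create(threshold=20, nonmaxSuppression=True)
keypoints = fast.detect(mark_gray, None)   # the corners of steps (a)-(d)
```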
As in fig. 3, the three-dimensional coordinates of each feature point relative to the MARK center point are calculated. After the MARK is attached to the cylindrical object, a three-dimensional model can be obtained. Given the cylinder radius r, the angle between a feature point at MARK picture coordinates (x, y) and the line through the cylinder center is given by equation (1), with pixel2mm the pixel-to-millimeter conversion scale, and the corresponding three-dimensional coordinates, in millimeters, are given by equations (2), (3) and (4). The three-dimensional coordinates of all feature points are stored in a dictionary whose keys are the MARK picture coordinates (x, y) and whose values are the obtained three-dimensional coordinates (3d_x, 3d_y, 3d_z).
The training set pictures are taken under different angles, different distances, different illumination, partial occlusion and non-occlusion conditions; 300 pictures are taken in total, of which 240 are used as the training set and the rest as the validation set. The bounding box and classes of each picture are calibrated with labelImg to generate the xml files, and the pictures together with the annotated xml files are placed in the corresponding paths of the Yolov5 model code (step (3)).
The natural texture MARK in the scene is detected with the Yolov5 target detection model, which extracts in real time the region where the MARK lies together with the region's confidence. If the confidence is less than 20, the region is not considered to contain the MARK; if it is greater than or equal to 20, the bounding box of the MARK in the scene is obtained, and a masking operation is applied to the image with OPENCV so that the RGB values of all pixels outside the MARK region become (0, 0, 0).
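A minimal sketch of the masking operation, assuming the detector returned a bounding box (x1, y1, x2, y2):

```python
import cv2
import numpy as np

# frame: BGR scene image; (x1, y1, x2, y2): MARK bounding box from Yolov5.
mask = np.zeros(frame.shape[:2], dtype=np.uint8)
mask[y1:y2, x1:x2] = 255
# Pixels outside the MARK region become (0, 0, 0), as the text describes.
roi_only = cv2.bitwise_and(frame, frame, mask=mask)
```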
As shown in fig. 4, all feature points in the scene are tracked with the optical flow method; the specific implementation is as follows:

Let I and J be the grayscale images of the previous frame and the current frame. As in step (6) above, for a point u = [u_x, u_y]^T in the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame, where d = [d_x, d_y]^T is the image velocity, i.e. the optical flow, at that point. Because of the aperture problem, similarity is defined over a two-dimensional neighborhood of size (2ω_x + 1) × (2ω_y + 1), with ω_x and ω_y two integer values, and the residual function of equation (8) is minimized with respect to the velocity vector d; solving for d gives the corresponding position of point u in image J. Typical values for ω_x and ω_y are 2, 3, 4, 5, 6 and 7.

The feature point positions computed in the current frame are compared with the corresponding positions in the previous frame to judge whether the feature points moved between the two adjacent camera frames, and the number of moved feature points is counted. If the number of moved feature points in the current frame is less than or equal to ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous frame's pose is used directly; if it is greater than ten percent, the object is considered to have moved and the relative pose of the camera with respect to the object is recomputed.
The descriptors of the obtained feature points are computed with the ORB algorithm; the specific steps are as follows:
step (e), a circle is drawn with the keypoint O as its center and a radius of O_r pixels;

step (f), N point pairs are taken inside the circle, with N = 512;

step (g), the operation M is defined, where I_A denotes the grayscale of A and I_B the grayscale of B:

$$M(A, B) = \begin{cases} 1, & I_A > I_B \\ 0, & \text{otherwise} \end{cases}$$

step (h), the operation of step (g) is performed on the selected keypoints to obtain a descriptor composed of 0s and 1s.
The ORB implementation in OPENCV uses an image pyramid to give the descriptor scale invariance. For rotation invariance, the principal direction of each feature point is computed with the grayscale centroid method: the grayscale centroid is calculated within the circular region of radius r around the feature point, and the direction vector from the center to the centroid is taken as the principal direction.

The feature point descriptors of the natural texture MARK in the scene are matched for similarity against the original MARK feature point descriptors. The Hamming distance is used to compute the similarity between two descriptors, where d_k denotes the Hamming distance between the rBRIEF descriptors of feature points A and B, D_A and D_B denote the descriptors of A and B, and i indexes the bit at position i of a descriptor:

$$d_k = \sum_{i} D_A(i) \oplus D_B(i)$$

After the feature point pairs matching the natural texture MARK in the scene with the reference MARK are obtained, outliers are rejected with a ratio test: for a feature point p of the natural texture MARK in the scene, let d1 and d2 be the distances to the two closest feature points of the reference image; when d1/d2 > ratio (ratio is preferably 0.8), p is considered an outlier and rejected. The random sample consensus (RANSAC) algorithm is then applied to the valid feature points (inliers) that passed the ratio test to further eliminate possible outliers. During matching, cross validation (feature points p and q must each be the other's best match) and the nearest neighbor algorithm are used to further screen out wrongly matched pairs, and finally the camera pose is computed with the PnP algorithm. Fig. 5 shows the augmented reality effect of rendering the virtual object into the scene: the virtual object completely covers the cup to which the natural texture MARK is attached, replacing the cup in the scene.
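A compact sketch of this matching pipeline with OpenCV; it uses ORB for both detection and description, a simplification of the text's Fast-plus-ORB combination, and the feature budget is an assumption:

```python
import cv2

orb = cv2.ORB_create(nfeatures=1000)
kp_s, des_s = orb.detectAndCompute(roi_gray, None)    # scene ROI, grayscale
kp_m, des_m = orb.detectAndCompute(mark_gray, None)   # reference MARK

# KNN matching on Hamming distance, followed by the 0.8 ratio test.
bf = cv2.BFMatcher(cv2.NORM_HAMMING)
pairs = bf.knnMatch(des_s, des_m, k=2)
good = [m for m, n in (p for p in pairs if len(p) == 2)
        if m.distance < 0.8 * n.distance]
good = sorted(good, key=lambda m: m.distance)[:30]    # keep the 30 best pairs
```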
Fig. 6 is a flow chart of the practical application of the method of the present invention; the steps are as follows:
step (1) acquiring the natural texture MARK region (ROI) in the scene by using Yolov5;
step (2) comparing the feature points with those of the previous frame using the optical flow method; if the object is judged not to have moved, the camera pose of the previous frame is used directly and step (4) is executed; otherwise step (3) is executed;
step (3) extracting the feature points of the ROI with the Fast algorithm and matching them against the MARK picture feature points; finding the 30 best-matching feature points with the KNN (K-nearest-neighbor) algorithm; recovering the three-dimensional coordinates of the 30 feature points from the dictionary; computing the MARK pose with the PnP and RANSAC (random sample consensus) algorithms; comparing the MARK pose with the average pose of the sliding window to judge whether the current pose is valid; if valid, updating the sliding window and executing step (4), otherwise using the camera pose of the previous frame;
and (4) rendering the virtual object to the acquired pose through OPENGL and OPENCV to perform augmented reality.

Claims (6)

1. A curved MARK three-dimensional registration augmented reality method based on region detection, characterized in that the steps are as follows:

Step (1), calibrating the camera: the internal parameters and distortion parameters of an RGB monocular camera are obtained with the Zhang Zhengyou camera calibration method;

Step (2), constructing the data set: more than 300 pictures in total of each natural texture MARK to be identified are taken under different angles, different distances, different illumination, partial occlusion and non-occlusion conditions, of which 80% are used as the training set and the remaining 20% as the validation set; the bounding box and classes of the natural texture MARK in each picture are calibrated with labelImg software to generate the corresponding xml format file; during calibration, the region where the natural texture MARK is located is first framed with a rectangle, and then the class corresponding to that natural texture MARK is labelled;

Step (3), the Yolov5 neural network model is used as the natural texture MARK target detection model, trained on the training set constructed in step (2) and verified for accuracy on the validation set; the trained model is extracted, and the trained natural texture MARK target detection model can recognize the bounding box of the MARK in the scene and identify its specific class;

Step (4), the MARK is printed and attached to a cylindrical object, and the cylinder radius r and the width and height of the natural texture MARK picture, marker_w and marker_h, are measured; the MARK picture feature points are extracted with the Fast algorithm, and for each feature point its three-dimensional coordinates relative to the MARK center point are calculated; for a feature point at MARK picture coordinates (x, y), the angle between it and the line through the cylinder center is given by equation (1), with pixel2mm the pixel-to-millimeter conversion scale, and the corresponding three-dimensional coordinates, in millimeters, are given by equations (2), (3) and (4) (equations (1)-(4) appear only as embedded images in the original publication); the three-dimensional coordinates of all feature points are stored in a dictionary whose keys are the MARK picture coordinates (x, y) and whose values are the coordinates obtained from equations (2), (3) and (4);

Step (5), for a scene picture obtained by the camera, the natural texture MARK is extracted with the natural texture MARK target detection model to produce a region of interest (ROI); all feature points in the ROI are extracted with the Fast algorithm, descriptors are computed with the ORB algorithm, and on the basis of the descriptors' Hamming distances the extracted ROI feature points are matched against the original MARK feature points with RANSAC and the K-nearest-neighbor (KNN) classification algorithm to obtain the 30 best-matching feature point pairs; the three-dimensional coordinates of the MARK picture feature points are obtained from the dictionary of step (4), and the relative pose of the camera and the curved MARK is estimated with the PnP algorithm, implemented as follows: a point of the world coordinate system X_w = (x_w, y_w, z_w, 1) is related to its projection on the image plane X_i = (x_i, y_i, 1) by equation (5), simplified to λ X_i = K M X_w (equation (6)), where fx, fy, cx and cy are the camera intrinsics calibrated with the Zhang Zhengyou method, r_ij are the rotation terms, t_i the translation terms, λ is a scale factor, the matrix K is the camera intrinsic parameter matrix and the matrix M is the model-view matrix; from the 30 feature point matching pairs, 4 matching pairs are selected at random, 3 of them are used to compute 4 candidate solutions, the remaining pair is substituted into the formula, and the solution with the smallest reprojection error is taken as the final solution; during this process the random sample consensus (RANSAC) algorithm is used to optimize the final solution;

Step (6), while the MARK moves, the motion of the feature points is tracked with the optical flow method, and the number of feature points that moved between two successive frames is counted; when the number of moved points is less than or equal to ten percent of the total number of feature points, the marker pose is considered unchanged relative to the previous frame; when it is greater than ten percent, the marker pose is considered to have changed, and step (5) is followed to obtain the current natural texture MARK pose for three-dimensional registration;

Step (7), during the camera relative pose estimation of step (5), Kalman filtering is used to predict and correct the 6D pose of the natural texture MARK; first the displacement (t_x, t_y, t_z) and rotation angles (ψ, θ, φ) of the camera relative to the natural texture MARK are defined, with coordinate first derivatives (t_x′, t_y′, t_z′) and second derivatives (t_x″, t_y″, t_z″), where the first derivative represents the speed of the natural texture MARK's movement and the second its acceleration, and rotation angle first derivatives (ψ′, θ′, φ′) and second derivatives (ψ″, θ″, φ″), where the first derivative represents the speed of the MARK's rotation and the second its rotational acceleration; Kalman filtering is used for estimation and correction with the state vector:

Kalman = (t_x, t_y, t_z, t_x′, t_y′, t_z′, t_x″, t_y″, t_z″, ψ, θ, φ, ψ′, θ′, φ′, ψ″, θ″, φ″)   (7)

Step (8), a sliding window is used to eliminate frames with a wrong pose estimate; whether the currently estimated camera pose is correct is judged from the camera pose coordinates of the two frames after and the two frames before the current frame, so as to eliminate frames whose camera pose estimate is wrong because of blur during the motion of the natural texture MARK; with the displacement in the current frame's 6D camera pose denoted (x_t, y_t, z_t), the average camera displacement of the two preceding frames, (x', y', z'), and of the two following frames, (x", y", z"), are computed; when the current displacement satisfies

x" - d_t < x_t < x' + d_t  or  x' - d_t < x_t < x" + d_t
y" - d_t < y_t < y' + d_t  or  y' - d_t < y_t < y" + d_t
z" - d_t < z_t < z' + d_t  or  z' - d_t < z_t < z" + d_t

the current pose is considered a valid pose; otherwise the current frame is considered a blurred frame and the last valid pose continues to be used, where d_t is the translation threshold, d_t = 3;

Step (9), after the relative pose between the camera and the natural texture MARK has been obtained through steps (5), (6), (7) and (8), the virtual object requiring three-dimensional registration undergoes translation and rotation transformation, and the virtual object is rendered into the scene through OpenGL and OpenCV to complete the augmented reality effect.

2. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 1, characterized in that a picture with sharp corners and irregular, multi-feature natural texture is used as the MARK; a symmetrical picture is not selected, and the chosen picture contains a large number of shapes with obvious differences between them.

3. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 1, characterized in that all feature points in the MARK picture are calculated with the Fast algorithm, the specific steps being as follows:

Step (a), a pixel Q is selected from the MARK picture; to judge whether it is a feature point, its brightness value is first set as I_q;

Step (b), with pixel Q as the center, a Bresenham circle of radius 3 is obtained, on which there are 16 pixels;

Step (c), on this circle of 16 pixels, if 9 consecutive pixels all have values greater than I_q + t or all less than I_q - t, pixel Q is considered a feature point, where t is a set threshold;

Step (d), to improve the efficiency of corner judgment by excluding non-corner pixels from the image, the pixels at the four positions 1, 9, 5 and 13 are checked first: when pixel Q is a corner, at least 3 of these four pixels have values greater than I_q + t or less than I_q - t; if the pixels at the four positions do not satisfy this condition, Q is not a corner; all pixels are screened in this way to exclude non-corners, and the remaining pixels then undergo the operation of step (c) to obtain the final corners.

4. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 1 or 3, characterized in that the descriptors of the feature points are calculated with the ORB algorithm, the specific steps being as follows:

Step (e), a circle is drawn with the keypoint O as its center and a radius of O_r pixels;

Step (f), N point pairs are taken inside the circle, with N = 512;

Step (g), the operation M is defined, where I_A denotes the grayscale of A and I_B the grayscale of B (the defining equation appears only as an embedded image in the original publication);

Step (h), the operation of step (g) is performed on the selected keypoints to obtain a descriptor composed of 0s and 1s.

5. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 4, characterized in that step (6) is implemented as follows:

Let I and J be the grayscale images of the previous frame and the current frame; point A is an arbitrary point of the image with coordinate vector (x, y)^T; for a point u = [u_x, u_y]^T in the previous frame I, the purpose of feature point tracking is to find its position v = u + d = [u_x + d_x, u_y + d_y]^T in the current frame, where d = [d_x, d_y]^T is the image velocity at point A, i.e. the optical flow at A; because of the aperture problem, similarity is defined over a two-dimensional neighborhood: with ω_x and ω_y two integer values, the residual function minimized with respect to the velocity vector d defines the similarity over an image neighborhood of size (2ω_x + 1) × (2ω_y + 1), and solving for d gives the corresponding position of point u in image J;

the feature point positions computed in the current frame are compared with the corresponding positions in the previous frame to judge whether the feature points moved between the two adjacent camera frames, and the number of moved feature points is counted; if the number of moved feature points in the current frame is less than or equal to ten percent of the total number of feature points, the object is considered not to have moved relative to the previous frame and the previous frame's pose is used directly; if it is greater than ten percent, the object is considered to have moved and the relative pose of the camera with respect to the object is recomputed.

6. The curved MARK three-dimensional registration augmented reality method based on region detection according to claim 5, characterized in that ω_x and ω_y take the value 2, 3, 4, 5, 6 or 7.
CN202011563089.4A 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection Active CN113012298B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202011563089.4A CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202011563089.4A CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Publications (2)

Publication Number Publication Date
CN113012298A true CN113012298A (en) 2021-06-22
CN113012298B CN113012298B (en) 2022-04-08

Family

ID=76383738

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011563089.4A Active CN113012298B (en) 2020-12-25 2020-12-25 Curved MARK three-dimensional registration augmented reality method based on region detection

Country Status (1)

Country Link
CN (1) CN113012298B (en)

Cited By (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643367A (en) * 2021-07-22 2021-11-12 上海智能网联汽车技术中心有限公司 Indoor positioning method and system based on augmented reality
CN118365765A (en) * 2024-06-14 2024-07-19 中南大学 A method for loading large-scale building scene models based on mixed reality

Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110098056A1 (en) * 2009-10-28 2011-04-28 Rhoads Geoffrey B Intuitive computing methods and systems
US20110098029A1 (en) * 2009-10-28 2011-04-28 Rhoads Geoffrey B Sensor-based mobile search, related methods and systems
CN102142055A (en) * 2011-04-07 2011-08-03 上海大学 True three-dimensional design method based on augmented reality interactive technology
US20170206691A1 (en) * 2014-03-14 2017-07-20 Magic Leap, Inc. Augmented reality systems and methods utilizing reflections
CN110288657A (en) * 2019-05-23 2019-09-27 华中师范大学 A 3D Registration Method for Augmented Reality Based on Kinect

Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US20110098056A1 (en) * 2009-10-28 2011-04-28 Rhoads Geoffrey B Intuitive computing methods and systems
US20110098029A1 (en) * 2009-10-28 2011-04-28 Rhoads Geoffrey B Sensor-based mobile search, related methods and systems
CN102142055A (en) * 2011-04-07 2011-08-03 上海大学 True three-dimensional design method based on augmented reality interactive technology
US20170206691A1 (en) * 2014-03-14 2017-07-20 Magic Leap, Inc. Augmented reality systems and methods utilizing reflections
CN110288657A (en) * 2019-05-23 2019-09-27 华中师范大学 A 3D Registration Method for Augmented Reality Based on Kinect

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
MINGMIN ZHANG等: "Dream-Experiment: A MR User Interface with Natural Multi-channel Interaction for Virtual Experiments", 《IEEE TRANSACTIONS ON VISUALIZATION AND COMPUTER GRAPHICS》 *

Cited By (3)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN113643367A (en) * 2021-07-22 2021-11-12 上海智能网联汽车技术中心有限公司 Indoor positioning method and system based on augmented reality
CN118365765A (en) * 2024-06-14 2024-07-19 中南大学 A method for loading large-scale building scene models based on mixed reality
CN118365765B (en) * 2024-06-14 2024-08-16 中南大学 Large-scale building scene model loading method based on mixed reality

Also Published As

Publication number Publication date
CN113012298B (en) 2022-04-08

Similar Documents

Publication Publication Date Title
CN107292965B (en) Virtual and real shielding processing method based on depth image data stream
CN110458161B (en) Mobile robot doorplate positioning method combined with deep learning
CN110288657B (en) A Kinect-based Augmented Reality 3D Registration Method
CN111401266B (en) Method, equipment, computer equipment and readable storage medium for positioning picture corner points
CN103106688B (en) Based on the indoor method for reconstructing three-dimensional scene of double-deck method for registering
CN108122256B (en) A method of it approaches under state and rotates object pose measurement
CN110246168A (en) A kind of feature matching method of mobile crusing robot binocular image splicing
CN103839277A (en) Mobile augmented reality registration method of outdoor wide-range natural scene
CN112053447A (en) Augmented reality three-dimensional registration method and device
CN109919971B (en) Image processing method, apparatus, electronic device, and computer-readable storage medium
CN112819892B (en) Image processing method and device
CN107038758B (en) An Augmented Reality 3D Registration Method Based on ORB Operator
CN109003307B (en) Size design method of fishing nets based on underwater binocular vision measurement
CN117593650B (en) Moving point filtering vision SLAM method based on 4D millimeter wave radar and SAM image segmentation
CN111998862A (en) Dense binocular SLAM method based on BNN
CN113614735A (en) Dense 6-DoF gesture object detector
CN113012298B (en) Curved MARK three-dimensional registration augmented reality method based on region detection
CN113240656B (en) Visual positioning method and related device and equipment
Yang et al. Research on Tracking and Registration Algorithm Based on Natural Feature Point.
CN113723432B (en) Intelligent identification and positioning tracking method and system based on deep learning
CN115830135A (en) Image processing method and device and electronic equipment
CN113570535A (en) Visual positioning method and related device and equipment
Brunken et al. Incorporating Plane-Sweep in Convolutional Neural Network Stereo Imaging for Road Surface Reconstruction.
CN118089753B (en) Monocular semantic SLAM positioning method and system based on three-dimensional target
CN113920160B (en) Method, device, computer-readable medium and program product for estimating initial pose

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant