Disclosure of Invention
The invention provides a method and a device for positioning a three-dimensional object, which solve the prior-art problem that, in a single-camera scene, a virtual object cannot be adaptively adjusted to follow the dynamic changes of the three-dimensional object.
To achieve the above object, the present invention provides a method for positioning a three-dimensional object, including:
acquiring an image of a three-dimensional object with a single camera, and identifying the pose and the position of the three-dimensional object, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas have different colors;
displaying a virtual object corresponding to the three-dimensional object; and
after detecting that the pose and/or the position of the three-dimensional object in three-dimensional space has changed, adjusting the virtual object according to the amount of change of the pose and/or the position.
In one embodiment, the adjusting the virtual object includes:
rotating the virtual object in a virtual space, wherein the rotation angle and the angular velocity of the virtual object correspond to the amount of pose change of the three-dimensional object; or
moving the virtual object in the virtual space, wherein the displacement of the virtual object corresponds to the amount of position change of the three-dimensional object.
In one embodiment, after the adjusting the virtual object, the method further includes:
when the camera captures different combinations of the differently colored faces of the three-dimensional object, entering different virtual scenes according to preset instructions; or
when the rotational angular velocity of the virtual object exceeds a first preset threshold, displaying a first virtual scene; or
when the rotational angular velocity of the virtual object falls below a second preset threshold, displaying a second virtual scene, wherein the first preset threshold is greater than the second preset threshold.
In one embodiment, the adjusting the virtual object according to the amount of change of the pose and/or the position includes:
adjusting the size of the virtual object according to the depth distance between the three-dimensional object and the camera.
In one embodiment, identifying the pose and the position of the three-dimensional object includes:
performing color-patch segmentation on the image, decomposing it into a plurality of regions of different colors;
averaging the color of each region, and traversing all adjacent color-patch pairs;
screening the color-patch pairs by table lookup, retaining the regions matching a preset model; and
calculating the orientation data of the matched regions, and obtaining the position and the pose of the three-dimensional object.
In one embodiment, after calculating the orientation data of the matched regions, the method further includes:
calculating candidate solutions corresponding to the matched regions;
comparing the candidate solutions pairwise for compatibility, and, for any two compatible candidate solutions, discarding one of them;
identifying edge pixels between the color-patch pairs using an edge detection algorithm; and
optimizing the position and the pose of the three-dimensional object using an optimization formula, the optimization formula being:

P* = argmin_P Σ_i E(f(P, X_i), x_i, θ_i)

wherein P denotes the position and pose parameters of the three-dimensional object, comprising position coordinates (x, y, z) and pose angles (q_w, q_x, q_y, q_z); f is a projection function that computes where a point X_i on the surface of the three-dimensional object projects when the object is in pose P; E is a cost function that measures the difference between the projected position and the observed position; and x_i and θ_i describe the edge pixel points detected in the image, x_i being the image coordinates of an edge point and θ_i the tangent angle at that edge point.
In one embodiment, after the color-patch segmentation of the image, the method further comprises:
establishing a model of the three-dimensional object;
traversing adjacent faces in the model, and recording the color pairs and face orientations of the adjacent faces; and
recording the coordinate information of the boundary lines of all adjacent faces.
An embodiment of the invention also provides a method for positioning a three-dimensional object, comprising:
acquiring an image of a three-dimensional object with a single camera, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas have different colors;
recording two or more adjacent faces of the three-dimensional object;
performing color-patch segmentation on the image, decomposing it into a plurality of regions of different colors;
averaging the color of each region, and traversing all adjacent color-patch pairs;
screening the color-patch pairs by table lookup, retaining the regions matching a preset model;
calculating the orientation data of the matched regions, and obtaining the position and the pose of the three-dimensional object; and
displaying the virtual object corresponding to the three-dimensional object.
In one embodiment, after calculating the orientation data of the matched regions, the method further includes:
calculating candidate solutions corresponding to the matched regions;
comparing the candidate solutions pairwise for compatibility, and, for any two compatible candidate solutions, discarding one of them;
identifying edge pixels between the color-patch pairs using an edge detection algorithm; and
optimizing the position and the pose of the three-dimensional object using an optimization formula, the optimization formula being:

P* = argmin_P Σ_i E(f(P, X_i), x_i, θ_i)

wherein P denotes the position and pose parameters of the three-dimensional object, comprising position coordinates (x, y, z) and pose angles (q_w, q_x, q_y, q_z); f is a projection function that computes where a point X_i on the surface of the three-dimensional object projects when the object is in pose P; E is a cost function that measures the difference between the projected position and the observed position; and x_i and θ_i describe the edge pixel points detected in the image, x_i being the image coordinates of an edge point and θ_i the tangent angle at that edge point.
An embodiment of the invention also provides a device for positioning a three-dimensional object, the device comprising a processor and a memory storing a computer program runnable on the processor, wherein the processor, when running the computer program, performs the method for positioning a three-dimensional object described above.
An embodiment of the invention also provides a computer-readable storage medium storing computer-executable instructions for performing the method for positioning a three-dimensional object described above.
Embodiments of the invention provide a method and a device for positioning a three-dimensional object. The method identifies the differently colored regions of the three-dimensional object through a single camera, determines the spatial pose and position of the object with fine granularity, and displays the corresponding virtual object. In addition, the depth of the three-dimensional object can be measured with the single camera, giving low cost and high precision.
Detailed Description
In order to make the objects, technical solutions, and advantages of the present invention more apparent, the present invention is further described in detail below with reference to the accompanying drawings and examples. It should be understood that the specific embodiments described herein are for purposes of illustration only and are not intended to limit the scope of the invention. In addition, the technical features of the embodiments of the present invention described below may be combined with each other as long as they do not conflict with each other.
To achieve the above object, as shown in Fig. 1, an embodiment of the present invention provides a method for positioning a three-dimensional object, including:
S101, acquiring an image of a three-dimensional object with a single camera, and identifying the pose and the position of the three-dimensional object, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas have different colors;
In the embodiment of the invention, three-dimensional object positioning and virtual object display can be realized by a three-dimensional object positioning device. Optionally, the positioning device comprises a single camera, a processing unit, and a display unit: the single camera acquires an image of the three-dimensional object within a detection area, the processing unit performs the image processing and finally determines the pose and position of the three-dimensional object, and a virtual object identical or corresponding to the three-dimensional object is created in a virtual scene by means of augmented reality (AR) technology. The three-dimensional object may be a polyhedron or another solid, such as a sphere, cylinder, tetrahedron, or hexahedron. During rotation or movement of the three-dimensional object, its orientation and pose relative to the camera can be determined from the color combination of the visible faces. For ease of explanation, the embodiments below take a hexahedron as an example; other three-dimensional objects (with combinations of differently colored faces) also fall within the scope of the embodiments of the invention.
As shown in Fig. 2, in one embodiment, identifying the pose and the position of the three-dimensional object may specifically comprise:
S201, performing color-patch segmentation on the image, decomposing it into a plurality of regions of different colors;
As shown in Fig. 3, the image may be decomposed into a yellow region, a green region, a blue region, a black region, a white region, and a red region.
In addition, after the color-patch segmentation, a model of the three-dimensional object can be built: adjacent faces in the model are traversed, the color pairs of the adjacent faces (such as "green-red", "red-yellow", etc.) and the orientations of the faces are recorded, and a lookup table is built for use in the subsequent algorithm. At the same time, the coordinate information of the boundary lines of all adjacent faces (represented by a series of discrete sampling points) is recorded for subsequent accurate measurement.
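As an illustration of this table-building step, the sketch below builds such a lookup table for a hexahedral (cube) marker. The face colors and outward normals are illustrative assumptions, not the patent's actual model data; on a cube, two faces share an edge exactly when their normals are perpendicular.

```python
# Face colors and outward normals for a cube marker; both are
# illustrative assumptions, not the patent's actual model data.
CUBE_FACES = {
    "yellow": (0, 0, 1),
    "white": (0, 0, -1),
    "red": (1, 0, 0),
    "green": (-1, 0, 0),
    "blue": (0, 1, 0),
    "black": (0, -1, 0),
}

def build_adjacency_table(faces):
    """Record every adjacent face pair together with both face normals.

    On a cube two faces share an edge exactly when their normals are
    perpendicular (dot product 0); opposite faces are never adjacent.
    """
    table = {}
    colors = list(faces)
    for i, a in enumerate(colors):
        for b in colors[i + 1:]:
            na, nb = faces[a], faces[b]
            if sum(x * y for x, y in zip(na, nb)) == 0:
                table[(a, b)] = (na, nb)  # record both orderings so
                table[(b, a)] = (nb, na)  # lookup is order-independent
    return table

table = build_adjacency_table(CUBE_FACES)
print(("red", "white") in table)     # True: adjacent faces
print(("yellow", "white") in table)  # False: opposite faces
```

Each of the cube's 12 edges yields two ordered entries, so the table holds 24 color pairs that can be checked in constant time during screening.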
When the camera can see two or more faces at the same time, the position and pose of the marker can be judged preliminarily from the color combination (i.e., the orientation), and more accurate position and pose data are then obtained through an iterative algorithm.
S202, averaging the color of each region, and traversing all adjacent color-patch pairs;
S203, screening the color-patch pairs by table lookup, retaining the regions matching a preset model;
The color-patch pairs are screened by table lookup. For example, "red-white" matches the three-dimensional object model, but "red-blue" and "green-purple" do not (the former because red and blue are not adjacent in the model, the latter because the model contains no purple). Non-matching color-patch pairs are discarded, leaving the regions (color-patch pairs) that match the preset model.
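The screening step can be sketched as a simple set-membership test. `VALID_PAIRS` below is a hypothetical stand-in for the table built from the marker model; in this toy model red and blue are not adjacent and purple does not occur, matching the example above.

```python
# Hypothetical subset of valid adjacent color pairs from the marker
# model; in this toy model red and blue are opposite faces (so the
# pair is absent) and purple does not appear in the model at all.
VALID_PAIRS = {
    ("red", "white"), ("white", "red"),
    ("red", "black"), ("black", "red"),
    ("black", "white"), ("white", "black"),
    ("green", "yellow"), ("yellow", "green"),
}

def screen_pairs(observed_pairs):
    """Keep only the observed color-patch pairs that match the model."""
    return [p for p in observed_pairs if p in VALID_PAIRS]

observed = [("red", "white"), ("red", "blue"), ("green", "purple")]
print(screen_pairs(observed))  # [('red', 'white')]
```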
S204, calculating the orientation data of the matched regions, and obtaining the position and the pose of the three-dimensional object.
After screening, some candidate color-patch pairs remain; for example, in Fig. 4, "red-white", "red-black", "black-white", and so on are obtained. The orientation data of these faces can be looked up in the table, giving the approximate orientation of the camera relative to the marker. At the same time, because data for two faces are available, the approximate rotation angle of the camera (the rotation angle about the line connecting the camera and the marker) can be calculated, and thus the approximate position and pose of the three-dimensional object.
The steps above yield a rough position and pose of the three-dimensional object, and this result may not yet be accurate. The refinement proceeds as follows: after calculating the orientation data of the matched regions, the embodiment of the invention further comprises:
S2041, calculating candidate solutions corresponding to the matched regions;
S2042, comparing the candidate solutions pairwise for compatibility, and, for any two compatible candidate solutions, discarding one of them;
When the marker positions and poses obtained from two candidate solutions are sufficiently close (the distance and the angle difference are each below a certain threshold), the two candidate solutions can be considered compatible, i.e., their color-patch pairs belong to the same marker. One of the two solutions may then be discarded.
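A minimal sketch of this compatibility test, assuming each candidate solution is a position triple plus a unit quaternion and using illustrative thresholds (5 cm, 10 degrees) that are not taken from the patent:

```python
import math

def quat_angle(q1, q2):
    """Rotation angle in radians between two unit quaternions (w, x, y, z)."""
    dot = abs(sum(a * b for a, b in zip(q1, q2)))
    return 2.0 * math.acos(min(1.0, dot))

def compatible(sol_a, sol_b, pos_tol=0.05, ang_tol=math.radians(10.0)):
    """Two candidate solutions describe the same marker when both their
    positions and their orientations agree within tolerance (the
    tolerance values here are illustrative assumptions)."""
    (pa, qa), (pb, qb) = sol_a, sol_b
    return math.dist(pa, pb) < pos_tol and quat_angle(qa, qb) < ang_tol

a = ((0.10, 0.20, 0.50), (1.0, 0.0, 0.0, 0.0))
b = ((0.11, 0.20, 0.50), (0.999, 0.032, 0.0, 0.0))  # ~5 degrees apart
c = ((0.50, 0.20, 0.50), (1.0, 0.0, 0.0, 0.0))      # 40 cm away
print(compatible(a, b))  # True  -> discard one of the two
print(compatible(a, c))  # False -> keep both
```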
S2043, identifying edge pixels between the color-patch pairs using an edge detection algorithm;
S2044, optimizing the position and the pose of the three-dimensional object using an optimization formula, the optimization formula being:

P* = argmin_P Σ_i E(f(P, X_i), x_i, θ_i)

wherein P denotes the position and pose parameters of the three-dimensional object, comprising position coordinates (x, y, z) and pose angles (q_w, q_x, q_y, q_z); f is a projection function that computes where a point X_i on the surface of the three-dimensional object projects when the object is in pose P; E is a cost function that measures the difference between the projected position and the observed position; and x_i and θ_i describe the edge pixel points detected in the image, x_i being the image coordinates of an edge point and θ_i the tangent angle at that edge point. After iterative optimization of the above formula with the Levenberg-Marquardt algorithm, an improved value P* is obtained, namely the optimized position and pose of the marker.
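To illustrate the Levenberg-Marquardt iteration, the sketch below minimizes a simplified cost E(P) = Σ_i ‖f(P, X_i) − x_i‖² with a toy 2-D translation model standing in for the real camera projection (the edge tangent angles θ_i are omitted). It is a sketch of the damping scheme only, not the patent's actual implementation.

```python
# Minimal Levenberg-Marquardt sketch for a cost of the form
#   E(P) = sum_i || f(P, X_i) - x_i ||^2
# where the projection f is replaced by a toy 2-D translation
# f(P, X) = X + P. This stands in for the real camera model.

def residuals(P, model_pts, observed_pts):
    """Stacked reprojection residuals f(P, X_i) - x_i."""
    r = []
    for (mx, my), (ox, oy) in zip(model_pts, observed_pts):
        r += [mx + P[0] - ox, my + P[1] - oy]
    return r

def lm_optimize(P, model_pts, observed_pts, lam=1e-3, iters=20):
    """Damped Gauss-Newton (Levenberg-Marquardt) with a numeric Jacobian."""
    eps = 1e-6
    P = list(P)
    cost = sum(v * v for v in residuals(P, model_pts, observed_pts))
    for _ in range(iters):
        r = residuals(P, model_pts, observed_pts)
        J = []  # one column of partial derivatives per parameter
        for j in range(len(P)):
            Pj = list(P)
            Pj[j] += eps
            rj = residuals(Pj, model_pts, observed_pts)
            J.append([(b - a) / eps for a, b in zip(r, rj)])
        # normal equations (J^T J + lam*I) dP = -J^T r, two parameters
        a00 = sum(v * v for v in J[0]) + lam
        a11 = sum(v * v for v in J[1]) + lam
        a01 = sum(u * v for u, v in zip(J[0], J[1]))
        g0 = -sum(u * v for u, v in zip(J[0], r))
        g1 = -sum(u * v for u, v in zip(J[1], r))
        det = a00 * a11 - a01 * a01
        dP = [(g0 * a11 - g1 * a01) / det, (a00 * g1 - a01 * g0) / det]
        trial = [p + d for p, d in zip(P, dP)]
        new_cost = sum(v * v for v in residuals(trial, model_pts, observed_pts))
        if new_cost < cost:   # accept the step and reduce damping
            P, cost, lam = trial, new_cost, lam * 0.5
        else:                 # reject the step and increase damping
            lam *= 10.0
    return P

model = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0)]
obs = [(2.0, 3.0), (3.0, 3.0), (2.0, 4.0)]  # model shifted by (2, 3)
P_opt = lm_optimize([0.0, 0.0], model, obs)
print(round(P_opt[0], 3), round(P_opt[1], 3))  # 2.0 3.0
```

In the real problem P would additionally carry the quaternion components and f would be a perspective projection, but the accept/reject damping loop is the same.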
After the pose optimization is completed by the iterative algorithm, accurate pose data of the three-dimensional object are obtained.
In addition, for video data, the poses of successive frames may be smoothed using a Kalman filter, resulting in a relatively stable pose sequence.
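A minimal scalar Kalman filter applied to one pose component might look as follows; the process and measurement noise values are illustrative assumptions, not parameters from the patent.

```python
def kalman_smooth(measurements, q=1e-3, r=0.05):
    """Scalar Kalman filter over one pose component (e.g. one angle).
    q is the process noise and r the measurement noise; both values
    here are illustrative assumptions."""
    x, p = measurements[0], 1.0
    out = [x]
    for z in measurements[1:]:
        p += q               # predict: uncertainty grows between frames
        k = p / (p + r)      # Kalman gain
        x += k * (z - x)     # update: blend prediction and measurement
        p *= (1.0 - k)
        out.append(x)
    return out

noisy = [0.00, 0.12, 0.08, 0.21, 0.18, 0.30]
smooth = kalman_smooth(noisy)
print(len(smooth) == len(noisy))  # True
```

A full implementation would filter all seven pose parameters (position and quaternion) jointly, typically with a constant-velocity motion model.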
S102, displaying a virtual object corresponding to the three-dimensional object;
In the embodiment of the invention, after the pose and the position of the three-dimensional object are identified, AR technology can be used to display the virtual object corresponding to the three-dimensional object in a virtual scene. For example, a hexahedron of the same shape and size as the three-dimensional object is displayed, with the color of each face matching the actual three-dimensional object.
S103, after detecting that the pose and/or the position of the three-dimensional object in three-dimensional space has changed, adjusting the virtual object according to the amount of change of the pose and/or the position.
In one embodiment, the adjusting the virtual object may specifically be:
rotating the virtual object in a virtual space, wherein the rotation angle and the angular velocity of the virtual object correspond to the amount of pose change of the three-dimensional object; or
moving the virtual object in the virtual space, wherein the displacement of the virtual object corresponds to the amount of position change of the three-dimensional object.
In one embodiment, after the adjusting the virtual object, the method further includes:
when the camera captures different combinations of the differently colored faces of the three-dimensional object, entering different virtual scenes according to preset instructions; or
when the rotational angular velocity of the virtual object exceeds a first preset threshold, displaying a first virtual scene; or
when the rotational angular velocity of the virtual object falls below a second preset threshold, displaying a second virtual scene, wherein the first preset threshold is greater than the second preset threshold.
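The threshold logic above can be sketched as a small selector; the threshold values and scene names below are illustrative assumptions.

```python
def select_scene(angular_velocity, hi=6.0, lo=1.0):
    """Choose a virtual scene from the rotational angular velocity
    (rad/s); the thresholds hi > lo are illustrative preset values."""
    if angular_velocity > hi:
        return "first_scene"
    if angular_velocity < lo:
        return "second_scene"
    return "current_scene"

print(select_scene(8.0))  # first_scene
print(select_scene(0.2))  # second_scene
```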
In one embodiment, the adjusting the virtual object according to the amount of change of the pose and/or the position includes:
adjusting the size of the virtual object according to the depth distance between the three-dimensional object and the camera.
For example, in the field of games, the three-dimensional object may be shaped as a polyhedral "magic wand", and the three-dimensional positioning method described above identifies the combinations of face colors to determine which pose currently faces the camera. The virtual wand changes adaptively as the user operates the real one: if the user rotates the wand, the virtual wand rotates synchronously; when the user moves the wand, the virtual wand moves synchronously. At the same time, the depth of the real wand relative to the camera can be measured with the single camera; as the depth varies, the size of the virtual wand changes. Users can exploit this to create different game experiences, for example operating wands at different distances in three-dimensional space, with a wand near the camera producing a different game effect from one far away. This improves the user's game experience while reducing manufacturing cost (schemes on the market that measure depth with a TOF camera or multiple cameras are expensive).
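The single-camera depth measurement can be sketched with the pinhole-camera relation Z = f·L/l, where f is the focal length in pixels, L the real edge length, and l the observed edge length in pixels; the numbers below are illustrative assumptions.

```python
def depth_from_size(focal_px, real_edge_m, observed_edge_px):
    """Pinhole-camera depth estimate from apparent size: Z = f * L / l."""
    return focal_px * real_edge_m / observed_edge_px

def virtual_scale(depth_m, ref_depth_m=0.5):
    """Scale factor for the virtual object relative to a reference depth
    (the reference depth is an illustrative assumption)."""
    return ref_depth_m / depth_m

# A 5 cm marker edge seen as 80 px by a camera with 800 px focal length:
z = depth_from_size(800.0, 0.05, 80.0)
print(round(z, 6))                 # 0.5 (metres)
print(round(virtual_scale(z), 6))  # 1.0
```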
Meanwhile, different game experiences can be designed around different characteristics of the wand. For example, by rotating the wand, the user can turn different faces toward the camera (i.e., the camera captures different color combinations), enabling new interactions: turning to face A enters one game, turning to face B enters another, and turning from face A to face C generates an interactive virtual scene. Likewise, different rotation speeds and angles can trigger different games or different gameplay.
Embodiments of the invention provide a method and a device for positioning a three-dimensional object. The method identifies the differently colored regions of the three-dimensional object through a single camera, determines the spatial pose and position of the object with fine granularity, and displays the corresponding virtual object. In addition, the depth of the three-dimensional object can be measured with the single camera, giving low cost and high precision.
In addition, an embodiment of the invention provides a method for positioning a three-dimensional object in which marker information is entered into the positioning algorithm in advance, the information comprising the coordinates of all vertices, the parametric equations of the edges, and the colors of the faces. When the model is entered, the following steps are performed:
1. traversing adjacent faces in the model, recording the color pairs of the adjacent faces (such as "green-red", "red-yellow", etc.) and the orientations of the faces, and building a lookup table for the subsequent algorithm;
2. recording the coordinate information of the boundary lines of all adjacent faces (represented by a series of discrete sampling points) for subsequent accurate measurement.
When the camera can see two or more faces at the same time, the position and pose of the marker (i.e., its orientation) can be determined preliminarily from the color combination, and more accurate position and pose data are then obtained through an iterative algorithm.
The method specifically comprises the following steps:
S501, acquiring an image of a three-dimensional object with a single camera, wherein the outer surface of the three-dimensional object is divided into a plurality of areas and adjacent areas have different colors;
S502, recording two or more adjacent faces of the three-dimensional object;
S503, performing color-patch segmentation on the image, decomposing it into a plurality of regions of different colors;
S504, averaging the color of each region, and traversing all adjacent color-patch pairs;
S505, screening the color-patch pairs by table lookup, retaining the regions matching a preset model;
For example, "red-white" matches the model, but "red-blue" and "green-purple" do not (the former because red is not adjacent to blue in the model, the latter because there is no purple in the model).
S506, calculating the orientation data of the matched regions, and obtaining the position and the pose of the three-dimensional object;
For example, in Fig. 4, "red-white", "red-black", "black-white", and so on are retained; the orientation data of these faces can be looked up in the table to obtain the approximate orientation of the camera relative to the marker. At the same time, because data for two faces are available, the approximate rotation angle of the camera (the rotation angle about the line connecting the camera and the marker) can be calculated, and the approximate position and pose of the marker obtained.
S507, displaying the virtual object corresponding to the three-dimensional object.
In one embodiment, after calculating the orientation data of the matched regions, the method further includes:
S5061, calculating candidate solutions corresponding to the matched regions;
S5062, comparing the candidate solutions pairwise for compatibility, and, for any two compatible candidate solutions, discarding one of them;
S5063, identifying edge pixels between the color-patch pairs using an edge detection algorithm;
S5064, optimizing the position and the pose of the three-dimensional object using an optimization formula, the optimization formula being:

P* = argmin_P Σ_i E(f(P, X_i), x_i, θ_i)

wherein P denotes the position and pose parameters of the three-dimensional object, comprising position coordinates (x, y, z) and pose angles (q_w, q_x, q_y, q_z), the pose angles being represented as a quaternion; f is a projection function that computes where a point X_i on the marker appears in the image captured by the camera when the marker is in pose P; E is a cost function that measures the difference between the projected position and the observed position (the larger the difference, the higher the cost); and x_i and θ_i describe the edge pixel points detected in the image, x_i being the image coordinates of an edge point and θ_i the tangent angle at that edge point. After iterative optimization of the above formula with the Levenberg-Marquardt algorithm, an improved value P* is obtained, namely the optimized position and pose of the marker.
Furthermore, for video data, the poses of successive frames may be smoothed using a Kalman filter, resulting in a relatively stable pose sequence.
Fig. 5 is a schematic diagram of optimizing the pose using the color-patch edges of a three-dimensional object according to an embodiment of the present invention, in which the dotted line is the color-patch edge computed from the currently estimated pose and the solid line is the color-patch edge actually captured. During optimization, the pose is adjusted so that the dotted line is drawn toward the solid line.
In addition, in the embodiment of the invention, the virtual object can be combined with a remote-control module for human-computer interaction. A conventional remote controller has a fixed set of operation instructions: pressing a button generates an instruction signal, and the signal is transmitted so that a terminal such as a television or an air conditioner responds to the instruction. In the embodiment of the invention, a series of operation protocols can be defined that combine remote-control instructions with the spatial pose/position of the three-dimensional object, forming a new human-computer interaction protocol. For example, a series of instruction sets may be provided in the remote-control module, distinguishing interactions that occur at different spatial poses/positions. After a "move right" key is pressed, if no update of the three-dimensional object's position is detected, the virtual object moves right by one grid cell; if an update of the position is detected, pressing the same key moves the virtual object right by three grid cells.
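The interaction protocol of this example can be sketched as follows; the command name and grid-step counts follow the example above, and everything else is an assumption.

```python
def handle_command(command, pose_updated):
    """Toy interaction protocol: the same remote-control command moves
    the virtual object by a different number of grid cells depending on
    whether the 3-D object's pose/position was updated. The command
    name and step counts are illustrative."""
    if command == "move_right":
        return 3 if pose_updated else 1
    return 0

print(handle_command("move_right", False))  # 1
print(handle_command("move_right", True))   # 3
```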
FIG. 6 is a schematic diagram of a three-dimensional positioning device according to an embodiment of the present invention, which a user may use to operate an application on a device such as a mobile phone, tablet, or computer. The device may include information-input components such as a camera, a microphone, an application prop, and/or an AR device, and may also include output components such as a display device and a speaker.
An embodiment of the invention also provides a storage medium having computer instructions stored thereon which, when executed by a processor, implement the method for positioning a three-dimensional object described above.
Fig. 7 is a schematic diagram of a system structure according to an embodiment of the present invention. The system may include one or more central processing units (CPUs) 610 (e.g., one or more processors), a memory 620, and one or more storage media 630 (e.g., one or more mass storage devices) storing applications or data. The memory 620 and the storage medium 630 may be transitory or persistent storage. A program stored on the storage medium 630 may include one or more modules (not shown in the figure), each of which may include a series of instruction operations for the device. Further, the central processing unit 610 may be configured to communicate with the storage medium 630 and execute, on the system, the series of instruction operations in the storage medium 630. The system may also include one or more power supplies 640, one or more wired or wireless network interfaces 650, and one or more input/output interfaces 660. The steps performed by the above-described embodiments of the three-dimensional positioning method may be based on the system architecture shown in Fig. 7.
It should be understood that, in the various embodiments of the present application, the sequence numbers of the processes do not imply an order of execution; the order of execution of the processes should be determined by their functions and internal logic, and should not constitute any limitation on the implementation of the embodiments of the present application.
Those of ordinary skill in the art will appreciate that the modules and method steps of the examples described in connection with the embodiments disclosed herein can be implemented as electronic hardware, or as a combination of computer software and electronic hardware. Whether such functionality is implemented as hardware or software depends upon the particular application and design constraints imposed on the solution. Skilled artisans may implement the described functionality in varying ways for each particular application, but such implementation decisions should not be interpreted as causing a departure from the scope of the present application. It will be clearly understood by those skilled in the art that, for convenience and brevity of description, specific working procedures of the above-described system, apparatus and module may refer to corresponding procedures in the foregoing method embodiments, which are not repeated herein.
The parts of this specification are described in a progressive manner; for identical or similar parts of the embodiments, reference may be made between them, and each embodiment focuses on its differences from the others. In particular, the device and system embodiments are described relatively briefly because they are substantially similar to the method embodiments; for relevant details, refer to the description of the method embodiments.
Finally, it should be noted that the above are only preferred embodiments of the technical solution of the present application and are not intended to limit its scope. It will be apparent to those skilled in the art that various modifications and variations can be made to the present application without departing from its scope. Any modification, equivalent replacement, or improvement made within the scope of the claims of the present application and their technical equivalents shall be included in the scope of protection of the present application.