
CN113763569B - Image labeling method and device used in three-dimensional simulation and electronic equipment - Google Patents


Info

Publication number
CN113763569B
Authority
CN
China
Prior art keywords
target object
image
virtual camera
dimensional
boundary frame
Prior art date
Legal status
Active
Application number
CN202111003690.2A
Other languages
Chinese (zh)
Other versions
CN113763569A (en)
Inventor
陈培俊
朱永东
赵志峰
赵旋
时强
刘云涛
朱凯男
杨斌
Current Assignee
Zhejiang Lab
Original Assignee
Zhejiang Lab
Priority date
Filing date
Publication date
Application filed by Zhejiang Lab
Priority to CN202111003690.2A
Publication of CN113763569A
Application granted
Publication of CN113763569B
Legal status: Active
Anticipated expiration


Classifications

    • G: PHYSICS
    • G06: COMPUTING; CALCULATING OR COUNTING
    • G06T: IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T19/00: Manipulating 3D models or images for computer graphics
    • G06T7/00: Image analysis
    • G06T7/70: Determining position or orientation of objects or cameras
    • G06T7/73: Determining position or orientation of objects or cameras using feature-based methods
    • G06T2219/00: Indexing scheme for manipulating 3D models or images for computer graphics
    • G06T2219/012: Dimensioning, tolerancing

Landscapes

  • Engineering & Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Theoretical Computer Science (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Computer Graphics (AREA)
  • Computer Hardware Design (AREA)
  • General Engineering & Computer Science (AREA)
  • Software Systems (AREA)
  • Processing Or Creating Images (AREA)

Abstract

The invention discloses an image labeling method and device used in three-dimensional simulation and electronic equipment, wherein the method comprises the following steps: constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object; obtaining an image in a visual field range through a virtual camera; judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object; and if any detection point is not blocked by the blocking object, marking the target object in the image.

Description

Image labeling method and device used in three-dimensional simulation and electronic equipment
Technical Field
The present application relates to the field of data labeling, and in particular, to an image labeling method and apparatus used in three-dimensional simulation, and an electronic device.
Background
Image annotation, in short, marks the type, position, speed and other attributes of target objects on an image so that the image can be used as training data for a deep model or other classifier. To train an excellent algorithm, it is important to have more data and to obtain more accurate annotation information.
Existing training data mainly come from field collection by physical sensors; data collection is costly and difficult, and later labeling is mainly performed manually. The advantage of this approach is that the environment in which the training data are collected is highly consistent with the working environment of the final model, which greatly helps the recognition accuracy of the model. Recently, three-dimensional simulation technology has also been used to generate training data. Three-dimensional simulation has the advantages of low data-generation cost and high speed; its drawback is that the simulation data differ from the real environment to some extent, which can affect the recognition accuracy of the model to a certain degree. Using simulation data as a supplement is therefore a good choice.
In the process of implementing the present invention, the inventors found at least the following problems in the prior art: the existing labeling methods mainly rely on manual labeling, assisted by semi-automatic labeling. In semi-automatic labeling, a less accurate preliminary model is trained with a small amount of manually labeled data; the preliminary model then automatically identifies the targets in the images and outputs the type, position and other attribute data of the target objects, and the inaccurate output labels are corrected manually. With more data a more accurate model can be trained, and this can be repeated multiple times, spiraling upward.
Disclosure of Invention
The embodiments of the application aim to provide a method, an apparatus and an electronic device for solving the technical problem in the related art that images cannot be labeled fully automatically.
According to a first aspect of an embodiment of the present application, there is provided an image labeling method used in three-dimensional simulation, including:
constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
obtaining an image in a visual field range through a virtual camera;
judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object;
and if any detection point is not blocked by the blocking object, marking the target object in the image.
Further, determining whether the target object is in the field of view of the virtual camera, and if the target object is in the field of view of the virtual camera, generating a detection point of the target object includes:
acquiring a three-dimensional boundary frame of a target object, and calculating physical coordinates of each vertex on the three-dimensional boundary frame in an image according to the three-dimensional boundary frame;
calculating to obtain a physical coordinate boundary of the virtual camera according to the image and the view angle of the virtual camera;
if the physical coordinates of the image of any vertex are within the physical coordinate boundary of the virtual camera, the target object is within the visual field of the virtual camera;
and generating a detection point of the target object according to the physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the image.
Specifically, a three-dimensional boundary box of a target object is obtained, and according to the three-dimensional boundary box, the physical coordinates of each vertex on the three-dimensional boundary box in an image are calculated, wherein the method comprises the following steps:
acquiring a first world coordinate, a first direction and a length, width and height of a three-dimensional boundary frame of a target object;
Calculating a second world coordinate of the vertex on the three-dimensional boundary frame of the target object according to the first world coordinate, the first direction and the length, width and height of the three-dimensional boundary frame;
Converting the second world coordinates into camera coordinates;
And calculating the physical coordinates of the image of the vertex according to the coordinates of the camera.
Specifically, generating a detection point of the target object according to the physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the image, including:
calculating the image pixel coordinates of the vertex according to the image physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the width and height of the output image pixel;
Calculating the length and width of a two-dimensional boundary frame of the target object according to the image pixel coordinates;
according to the length, width and height of the three-dimensional boundary frame, calculating and obtaining the body diagonal length of the three-dimensional boundary frame;
Calculating to obtain a contraction step length according to the length and width of the two-dimensional boundary frame and the body diagonal length of the three-dimensional boundary frame;
Shrinking the three-dimensional boundary frame according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
and generating a detection point of the target object according to the contracted three-dimensional boundary frame and the center point of the target object.
Further, shrinking the three-dimensional bounding box according to the shrinkage step size and the body diagonal length of the three-dimensional bounding box, including:
calculating to obtain shrinkage parameters according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
setting a shrinkage ratio;
According to the shrinkage proportion, shrinking the three-dimensional boundary frame along the body diagonal direction of the target object; updating the shrinkage ratio, and taking the difference of the shrinkage ratio minus the shrinkage parameter as a new shrinkage ratio; this step is repeated until the shrinkage ratio is less than or equal to zero.
Specifically, if any one of the detection points is not blocked by the blocking object, labeling the target object in the image includes:
acquiring a third world coordinate of the virtual camera;
judging whether a shielding object exists between the virtual camera and the detection point according to the third world coordinate and the world coordinate of the detection point, and if any detection point is not shielded, marking a two-dimensional boundary box of the target object in the image.
According to a second aspect of an embodiment of the present application, there is provided an image labeling apparatus for use in three-dimensional simulation, including:
the construction module is used for constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
The acquisition module acquires an image in a visual field range through the virtual camera;
The generation module is used for judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object;
and the labeling module is used for labeling the target object in the image if any detection point is not blocked by the blocking object.
According to a third aspect of an embodiment of the present application, there is provided an electronic apparatus including:
One or more processors;
A memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of the first aspect.
According to a fourth aspect of embodiments of the present application there is provided a computer readable storage medium having stored thereon computer instructions which when executed by a processor perform the steps of the method according to the first aspect.
The technical scheme provided by the embodiment of the application can comprise the following beneficial effects:
As can be seen from the above embodiments, the present invention is an automatic image labeling method used in three-dimensional simulation applications: a target object, a virtual camera and a shielding object are added to a three-dimensional scene; an image within the field of view is obtained through the virtual camera; whether the target object is within the field of view of the virtual camera is calculated, and if so, detection points of the target object are generated; whether shielding exists between the virtual camera and each detection point of the target object is calculated, and if any detection point of the target object is not shielded by the shielding object, the target object appearing in the image is labeled. The invention realizes continuous output of labeling information and image data through low-cost three-dimensional simulation, which is an important source of training data for intelligent models. Compared with the traditional method, this labeling method has the advantages of low cost, high precision and high speed, and can be applied to fields such as autonomous driving, vehicle-road cooperation, digital twins and intelligent traffic intersections.
It is to be understood that both the foregoing general description and the following detailed description are exemplary and explanatory only and are not restrictive of the application as claimed.
Drawings
The accompanying drawings, which are incorporated in and constitute a part of this specification, illustrate embodiments consistent with the application and together with the description, serve to explain the principles of the application.
FIG. 1 is a flow chart illustrating a method of image annotation for use in three-dimensional simulation, according to an exemplary embodiment.
FIG. 2 is a flowchart illustrating step S103, according to an exemplary embodiment.
FIG. 3 is a flowchart illustrating step S201, according to an exemplary embodiment.
FIG. 4 is a schematic diagram illustrating the perspective of a virtual camera according to an exemplary embodiment;
FIG. 5 is a flowchart illustrating step S204, according to an exemplary embodiment;
FIG. 6 is a flowchart illustrating step S405, according to an exemplary embodiment;
FIG. 7 is a flowchart illustrating step S104, according to an exemplary embodiment;
FIG. 8 is a block diagram illustrating an image annotation device for use in three-dimensional simulation, according to an exemplary embodiment.
Detailed Description
Reference will now be made in detail to exemplary embodiments, examples of which are illustrated in the accompanying drawings. When the following description refers to the accompanying drawings, the same numbers in different drawings refer to the same or similar elements, unless otherwise indicated. The implementations described in the following exemplary examples do not represent all implementations consistent with the application. Rather, they are merely examples of apparatus and methods consistent with aspects of the application as detailed in the accompanying claims.
The terminology used herein is for the purpose of describing particular embodiments only and is not intended to be limiting of the application. As used in this specification and the appended claims, the singular forms "a," "an," and "the" are intended to include the plural forms as well, unless the context clearly indicates otherwise. It should also be understood that the term "and/or" as used herein refers to and encompasses any or all possible combinations of one or more of the associated listed items.
It should be understood that although the terms first, second, third, etc. may be used herein to describe various information, this information should not be limited by these terms. These terms are only used to distinguish one type of information from another. For example, first information may also be referred to as second information, and similarly, second information may also be referred to as first information, without departing from the scope of the application. The word "if" as used herein may be interpreted as "when," "upon," or "in response to determining," depending on the context.
FIG. 1 is a flow chart illustrating a method of image annotation used in three-dimensional simulation, according to an exemplary embodiment. As shown in FIG. 1, the method may include the following steps:
Step S101: constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
Step S102: obtaining an image in a visual field range through a virtual camera;
step S103: judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object;
step S104: and if any detection point is not blocked by the blocking object, marking the target object in the image.
As can be seen from the above embodiments, the present application is an automatic image labeling method used in three-dimensional simulation applications: a target object, a virtual camera and a shielding object are added to a three-dimensional scene; an image within the field of view is obtained through the virtual camera; whether the target object is within the field of view of the virtual camera is calculated, and if so, detection points of the target object are generated; whether shielding exists between the virtual camera and each detection point of the target object is calculated, and if any detection point of the target object is not shielded by the shielding object, the target object appearing in the image is labeled. The application realizes continuous output of labeling information and image data through low-cost three-dimensional simulation. Compared with the traditional method, this labeling method has the advantages of low cost, high precision and high speed, and can be applied to fields such as autonomous driving, vehicle-road cooperation, digital twins and intelligent traffic intersections.
In the implementation of step S101, a three-dimensional scene is constructed, where the three-dimensional scene includes a target object, a virtual camera, and a shielding object;
Specifically, a three-dimensional scene, either purely virtual or reconstructed from a real environment, is created and imported into a game engine; this embodiment uses Unreal Engine 4 (UE4). A plurality of movable target objects (mainly people, non-motor vehicles and motor vehicles) are added to the scene; the target objects have autonomous movement capability, for example moving along a set route, to ensure the diversity of the output images. A virtual camera is installed, its camera parameters (including image resolution, field angle, etc.) are set, and its installation position (including coordinates and orientation) is set.
The purpose of this step is to provide rich scenes and accurately annotated image data. Creating virtual scenes enriches the imagery in the dataset, while modeling a specific real environment can improve the algorithm's performance in that scenario. The autonomous movement of the target objects ensures the diversity of the images, so that the training data are rich enough.
In the implementation of step S102, an image in a field of view is acquired by a virtual camera;
Specifically, a virtual camera is placed at a specific position in the three-dimensional scene through UE4, and the corresponding image is output after rendering on the graphics card.
In the implementation of step S103, determining whether the target object is within the field of view of the virtual camera, and if the target object is within the field of view of the virtual camera, generating a detection point of the target object; specifically, as shown in fig. 2, this step includes the sub-steps of:
step S201: acquiring a three-dimensional boundary frame of a target object, and calculating physical coordinates of each vertex on the three-dimensional boundary frame in an image according to the three-dimensional boundary frame; specifically, as shown in fig. 3, this step may include the following process:
Step S301: acquiring a first world coordinate, a first direction and a length, width and height of a three-dimensional boundary frame of a target object;
Specifically, the first world coordinate (X_i, Y_i, Z_i), the orientation (Pitch_i, Yaw_i, Roll_i) and the length, width and height (W_i, H_i, L_i) of the three-dimensional bounding box of the target object are acquired. The three-dimensional bounding box is specified when the target object is added in step S101 (for example, the length, width and height of a vehicle), and the coordinates and orientation of the target object are updated continuously during the simulation.
Step S302: calculating a second world coordinate of the vertex on the three-dimensional boundary frame of the target object according to the first world coordinate, the first direction and the length, width and height of the three-dimensional boundary frame;
Specifically, the second world coordinates are calculated from the first world coordinate (X_i, Y_i, Z_i), the orientation (Pitch_i, Yaw_i, Roll_i) and the length, width and height (W_i, H_i, L_i) of the three-dimensional bounding box of the target object. The target object is first converted to its local coordinate system:
in the local coordinate system, the center point (X_i, Y_i, Z_i) of the target object is translated by half the length, width and height along the positive and negative directions of the X, Y and Z axes respectively, and the resulting points are transformed back to the world coordinate system.
This yields the world coordinates of the 8 vertices of the three-dimensional bounding box. The bounding box is the smallest cuboid surrounding the target, and using the three-dimensional bounding box of the target object makes the subsequent calculation of whether the target object is within the camera's field of view more accurate.
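As an illustration only (the patent's own equations are not reproduced above), the following Python sketch shows one common way to obtain the 8 vertex world coordinates from the centre (X_i, Y_i, Z_i), the orientation (Pitch_i, Yaw_i, Roll_i) and the dimensions (W_i, H_i, L_i); the Euler-angle convention and the helper names are assumptions, not definitions taken from the patent.

import numpy as np

def rotation_matrix(pitch, yaw, roll):
    # Rotation matrix R from Euler angles in degrees.
    # The Z(yaw) * Y(pitch) * X(roll) composition is an assumption; the
    # engine's exact convention may differ.
    p, y, r = np.radians([pitch, yaw, roll])
    rz = np.array([[np.cos(y), -np.sin(y), 0.0],
                   [np.sin(y),  np.cos(y), 0.0],
                   [0.0, 0.0, 1.0]])
    ry = np.array([[ np.cos(p), 0.0, np.sin(p)],
                   [0.0, 1.0, 0.0],
                   [-np.sin(p), 0.0, np.cos(p)]])
    rx = np.array([[1.0, 0.0, 0.0],
                   [0.0, np.cos(r), -np.sin(r)],
                   [0.0, np.sin(r),  np.cos(r)]])
    return rz @ ry @ rx

def bbox_vertices_world(center, orientation, size):
    # World coordinates of the 8 vertices of an oriented 3D bounding box:
    # offsets of +/- half the size in the local frame, rotated by R and
    # translated by the box centre.
    cx, cy, cz = center              # (X_i, Y_i, Z_i)
    w, h, l = size                   # (W_i, H_i, L_i)
    rot = rotation_matrix(*orientation)
    verts = []
    for sx in (0.5, -0.5):
        for sy in (0.5, -0.5):
            for sz in (0.5, -0.5):
                local = np.array([sx * w, sy * h, sz * l])
                verts.append(np.array([cx, cy, cz]) + rot @ local)
    return np.stack(verts)           # shape (8, 3)

In the simulation this computation is repeated every frame, because the target object's coordinates and orientation are updated continuously.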
Step S303: converting the second world coordinates into camera coordinates;
Specifically, the position of the camera is the third world coordinate (X_s, Y_s, Z_s); taking vertex (X_0, Y_0, Z_0) as an example, it is converted to camera coordinates,
where R is the same rotation matrix as in step S302.
The other vertices are transformed in the same way, so that all vertices are expressed in camera coordinates.
Step S304: and calculating the physical coordinates of the image of the vertex according to the coordinates of the camera.
Specifically, the virtual camera by default orients its viewing direction along the x-axis; as shown in fig. 4, the imaging plane of the camera corresponds to the yz plane, and according to the camera imaging principle the image physical coordinates (x_j, y_j) of the vertex are calculated.
The image physical coordinates are obtained so that the image pixel coordinates can be calculated subsequently.
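A minimal sketch of steps S303 and S304 under assumed conventions: the camera looks along its local x-axis, the image physical coordinates lie on the normalized plane at unit depth, and the camera rotation matrix is built from the camera's own orientation in the same way as R in step S302 (one reading of "R is the same as S302"); the function names are illustrative only.

import numpy as np

def world_to_camera(p_world, cam_pos, r_cam):
    # Express a world-space point in the camera frame. r_cam is assumed to map
    # camera-frame axes to world-frame axes, so its transpose brings world
    # vectors into the camera frame.
    return r_cam.T @ (np.asarray(p_world, dtype=float) - np.asarray(cam_pos, dtype=float))

def camera_to_image_physical(p_cam):
    # Project a camera-frame point onto the yz imaging plane; the camera looks
    # along +x, so the x component plays the role of depth.
    xc, yc, zc = p_cam
    if xc <= 0:                  # behind the camera: no valid projection
        return None
    return yc / xc, zc / xc      # image physical coordinates (x_j, y_j)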
Step S202: calculating to obtain a physical coordinate boundary of the virtual camera according to the image and the view angle of the virtual camera;
Specifically, according to the camera imaging principle, the physical coordinate boundary (x_limit, y_limit) of the virtual camera is calculated from the field angle and the size of the output image.
Here (W_s, H_s) is the pixel width and height of the output image, an adjustable parameter that must be set manually when the virtual camera is initialized; (1920, 1080), (1280, 720) and the like are typical choices. Fov is the field angle, also a setting parameter; it may be any value greater than 0 and less than 180 degrees, with 60 degrees or 90 degrees being common choices. These parameters must be set when the camera is placed at the start of the simulation. The physical coordinate boundary is calculated so that the coordinates of the target object can be compared against it to decide whether they fall within the camera's range.
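The physical coordinate boundary can then be sketched as follows, assuming Fov is the horizontal field angle, square pixels, and the same unit-depth plane as above; this is an interpretation consistent with the text rather than a formula quoted from the patent.

import math

def physical_boundary(fov_deg, width_px, height_px):
    # Half-extents (x_limit, y_limit) of the imaging plane for a camera with
    # horizontal field angle fov_deg and output pixel size (W_s, H_s).
    x_limit = math.tan(math.radians(fov_deg) / 2.0)
    y_limit = x_limit * height_px / width_px    # square pixels assumed
    return x_limit, y_limit

# Example: a 1920x1080 image with a 90-degree field angle.
# x_lim, y_lim = physical_boundary(90.0, 1920, 1080)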
Step S203: if the physical coordinates of the image of any vertex are within the physical coordinate boundary of the virtual camera, the target object is within the visual field of the virtual camera;
Specifically, if x_j is in the range [-x_limit, x_limit] and y_j is in the range [-y_limit, y_limit], then the current vertex is within the camera's field of view;
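For completeness, the in-view test of step S203 can be written as a small helper, continuing the conventions of the sketches above.

def vertex_in_view(phys_coords, x_limit, y_limit):
    # True if a vertex's image physical coordinates fall inside the camera's
    # physical coordinate boundary (step S203).
    if phys_coords is None:          # the vertex was behind the camera
        return False
    xj, yj = phys_coords
    return -x_limit <= xj <= x_limit and -y_limit <= yj <= y_limit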
Step S204: and generating a detection point of the target object according to the physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the image. Specifically, as shown in fig. 5, this step may include the sub-steps of:
step S401: calculating the image pixel coordinates of the vertex according to the image physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the width and height of the output image pixel;
Specifically, according to the similar-triangle principle of camera imaging, the image pixel coordinates (u_j, v_j) of the vertex are deduced from the image physical coordinates (x_j, y_j) of the vertex, the physical coordinate boundary x_limit, y_limit of the virtual camera, and the pixel width and height (W_s, H_s) of the output image.
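The conversion from image physical coordinates to pixel coordinates can be sketched as a linear rescaling of [-x_limit, x_limit] × [-y_limit, y_limit] onto the pixel grid; placing the pixel origin at the top-left corner and flipping the vertical axis is an assumption, not a convention stated in the patent.

def physical_to_pixel(xj, yj, x_limit, y_limit, width_px, height_px):
    # Map image physical coordinates (x_j, y_j) to pixel coordinates (u_j, v_j).
    u = (xj + x_limit) / (2.0 * x_limit) * width_px
    v = (y_limit - yj) / (2.0 * y_limit) * height_px   # assumed top-left origin
    return u, v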
Step S402: calculating the length and width of a two-dimensional boundary frame of the target object according to the image pixel coordinates;
Specifically, the two-dimensional bounding box is represented as [u_min, v_min, u_max, v_max],
where C_F is the number of vertices of the target object within the virtual camera's field of view and (u_j, v_j) are the image pixel coordinates of those vertices; that is, in the two-dimensional image, the minimum rectangle that can surround the target object is taken, and any portion exceeding the image boundary is truncated at the boundary.
The length and width of the two-dimensional bounding box of the target object are then computed from the maximum and minimum coordinates of the rectangle in the lateral and longitudinal directions, i.e. L_u = u_max - u_min and L_v = v_max - v_min.
The length and width of the two-dimensional bounding box are calculated for use in generating detection points; the larger these values, the closer the target object is to the camera, and the fewer detection points need to be generated.
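A sketch of step S402 under the same assumptions: the two-dimensional bounding box is the smallest axis-aligned rectangle around the projected in-view vertices, clipped to the image, and L_u and L_v follow directly from it.

def bbox_2d(pixel_points, width_px, height_px):
    # Smallest axis-aligned rectangle [u_min, v_min, u_max, v_max] around the
    # projected vertices, clipped to the image; also returns (L_u, L_v).
    us = [u for u, _ in pixel_points]
    vs = [v for _, v in pixel_points]
    u_min, u_max = max(min(us), 0.0), min(max(us), float(width_px))
    v_min, v_max = max(min(vs), 0.0), min(max(vs), float(height_px))
    return [u_min, v_min, u_max, v_max], (u_max - u_min, v_max - v_min)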
Step S403: according to the length, width and height of the three-dimensional boundary frame, calculating and obtaining the body diagonal length of the three-dimensional boundary frame;
Specifically, the body diagonal length of the three-dimensional bounding box of the target object may be calculated as D = sqrt(W_i^2 + H_i^2 + L_i^2),
where W_i is the length of the three-dimensional bounding box, H_i is the width of the three-dimensional bounding box, and L_i is the height of the three-dimensional bounding box.
Step S404: calculating to obtain a contraction step length according to the length and width of the two-dimensional boundary frame and the body diagonal length of the three-dimensional boundary frame;
Specifically, the contraction step d is calculated from the length and width L_u, L_v of the two-dimensional bounding box, the body diagonal length of the three-dimensional bounding box, and a parameter α.
Here α may take any positive value; in this embodiment α takes a fixed value in [10, 100].
The smaller the value of α, the smaller d is, the more detection points are generated, the higher the calculation precision, and the larger the calculation amount; conversely, the larger the value of α, the larger d is, the fewer detection points are generated, the lower the calculation accuracy, and the smaller the calculation amount. Meanwhile, the smaller L_u and L_v are, the larger d is and the fewer detection points are generated; otherwise, more detection points are generated. It can be seen that α is a parameter for manually adjusting the number of detection points, while the imaging size of the target object (i.e., its distance from the camera) adjusts the number of detection points automatically.
Step S405: according to the contraction step length, the three-dimensional boundary frame is contracted; further, as shown in fig. 6, this step may further include:
Step S501: calculating to obtain shrinkage parameters according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
Specifically, the shrinkage parameter k is calculated from the contraction step and the body diagonal length,
where d is the contraction step size and D is the body diagonal length of the three-dimensional bounding box.
Step S502: setting a shrinkage ratio;
Specifically, the shrinkage ratio is initialized to r = 1 at the start of the loop, i.e., no shrinkage.
Step S503: according to the shrinkage proportion, shrinking the three-dimensional boundary frame along the body diagonal direction of the target object; updating the shrinkage ratio, and taking the difference of the shrinkage ratio minus the shrinkage parameter as a new shrinkage ratio; repeating this step until the shrinkage ratio is less than or equal to zero;
Specifically, the three-dimensional bounding box is contracted from (W_i, H_i, L_i) to (r×W_i, r×H_i, r×L_i) along the body diagonal direction of the target object; after the shrinkage is completed, the shrinkage ratio is updated as r = r - k; this process is repeated until r is less than or equal to 0. The bounding box is shrunk to generate more evenly distributed detection points, so that partially occluded objects can still be detected.
Step S406: generating a detection point of the target object according to the contracted three-dimensional boundary frame or the center point of the target object;
Specifically, the 8 vertices of each contracted three-dimensional bounding box, or the center point of the target object (once r is less than or equal to 0), are used as detection points of the target object; the generated detection points are the prerequisite for detecting whether shielding exists between the camera and each detection point.
After each contraction in step S503, the contracted three-dimensional bounding box generates detection points of the target object as described in step S406; that is, if the target object is contracted n times in step S503, 8×n+1 detection points are generated in step S406.
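Steps S403 to S406 can be sketched together as below, reusing bbox_vertices_world from the earlier sketch. The body diagonal D = sqrt(W_i^2 + H_i^2 + L_i^2) follows from the stated dimensions; the contraction step d is taken as an input because the patent's own formula (a function of α, L_u, L_v and the body diagonal) is not reproduced here, and k = d/D is an assumed reading of step S501.

import math
import numpy as np

def detection_points(center, orientation, size, shrink_step_d):
    # Detection points of the target object: 8 vertices of each shrunken
    # bounding box plus the object's centre point, reached once the shrink
    # ratio r drops to zero or below. shrink_step_d is assumed positive.
    w, h, l = size
    diag = math.sqrt(w * w + h * h + l * l)     # body diagonal D of the 3D box
    k = shrink_step_d / diag                    # assumed shrink parameter (step S501)
    points = []
    r = 1.0                                     # initial ratio: no shrinkage
    while r > 0.0:
        shrunk = (r * w, r * h, r * l)          # shrink along the body diagonal
        points.extend(bbox_vertices_world(center, orientation, shrunk))
        r -= k                                  # update the ratio (step S503)
    points.append(np.asarray(center, dtype=float))   # centre point once r <= 0
    return points                               # 8 vertices per pass, plus the centre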
In the implementation of step S104, if any one of the detection points is not blocked by the blocking object, the target object in the image is marked. Specifically, as shown in fig. 7, this step includes the following sub-steps:
Step S601: acquiring a third world coordinate of the virtual camera;
Specifically, the third world coordinate (X_s, Y_s, Z_s) of the virtual camera is acquired; the camera is placed when the three-dimensional simulation environment is created, so its coordinates are already determined.
Step S602: judging whether a shielding object exists between the virtual camera and the detection point according to the third world coordinate and the world coordinate of the detection point, and if any detection point is not shielded, marking a two-dimensional boundary box of the target object in the image.
Specifically, a ray collision detection algorithm in the game engine is used to detect whether a shielding object exists between the virtual camera and a detection point; for transparent shielding objects (such as glass) and mesh shielding objects (such as wire mesh), the detection can be configured to treat them as non-shielding, so that objects behind them can still be detected. If any one of all the detection points of the target object is not shielded, the target object is not completely shielded, and the two-dimensional bounding box of the target object is marked in the image obtained in step S102.
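A sketch of the occlusion test and labeling decision of step S104, with a placeholder ray_blocked callback standing in for the game engine's ray collision query (in UE4 this would be a line trace configured so that transparent or mesh occluders do not block); the names and signatures here are illustrative, not the engine's API.

def is_target_visible(camera_pos, det_points, ray_blocked):
    # True if at least one detection point is not occluded. ray_blocked(a, b)
    # is a placeholder callback expected to return True when a shielding object
    # lies between world points a and b; glass or wire mesh can be configured
    # as non-blocking inside the callback.
    return any(not ray_blocked(camera_pos, p) for p in det_points)

def label_if_visible(camera_pos, det_points, bbox2d, annotations, ray_blocked):
    # Record the 2D bounding box as a label when the target object is not
    # completely occluded in the image.
    if is_target_visible(camera_pos, det_points, ray_blocked):
        annotations.append(bbox2d)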
The application also provides an embodiment of an image labeling device used in the three-dimensional simulation, corresponding to the embodiment of the image labeling method used in the three-dimensional simulation.
FIG. 8 is a block diagram illustrating an image annotation device for use in three-dimensional simulation, according to an exemplary embodiment. Referring to fig. 8, the apparatus includes:
The construction module 21 constructs a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
The acquisition module 22 acquires an image in a visual field range through the virtual camera;
A generating module 23, configured to determine whether the target object is within the field of view of the virtual camera, and if the target object is within the field of view of the virtual camera, generate a detection point of the target object;
and the labeling module 24 is used for labeling the target object in the image if any detection point is not blocked by the blocking object.
The specific manner in which the various modules perform their operations in the apparatus of the above embodiments has been described in detail in the embodiments of the method and will not be elaborated here.
For the device embodiments, since they substantially correspond to the method embodiments, reference is made to the description of the method embodiments for the relevant points. The apparatus embodiments described above are merely illustrative: the units described as separate components may or may not be physically separate, and the components shown as units may or may not be physical units; they may be located in one place or distributed over a plurality of network elements. Some or all of the modules may be selected according to actual needs to achieve the purposes of the present application. Those of ordinary skill in the art can understand and implement the application without undue burden.
Correspondingly, the application also provides electronic equipment, which comprises: one or more processors; a memory for storing one or more programs; the one or more programs, when executed by the one or more processors, cause the one or more processors to implement an image annotation method for use in three-dimensional simulation as described above.
Correspondingly, the application also provides a computer readable storage medium, on which computer instructions are stored, characterized in that the instructions, when executed by a processor, implement an image labeling method for use in three-dimensional simulation as described above.
Other embodiments of the application will be apparent to those skilled in the art from consideration of the specification and practice of the disclosure herein. This application is intended to cover any variations, uses, or adaptations of the application following, in general, the principles of the application and including such departures from the present disclosure as come within known or customary practice within the art to which the application pertains. It is intended that the specification and examples be considered as exemplary only, with a true scope and spirit of the application being indicated by the following claims.
It is to be understood that the application is not limited to the precise arrangements and instrumentalities shown in the drawings, which have been described above, and that various modifications and changes may be effected without departing from the scope thereof. The scope of the application is limited only by the appended claims.

Claims (5)

1. An image labeling method for use in three-dimensional simulation, comprising:
constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
obtaining an image in a visual field range through a virtual camera;
Judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object; comprising the following steps:
Acquiring a three-dimensional boundary frame of a target object, and calculating physical coordinates of each vertex on the three-dimensional boundary frame in an image according to the three-dimensional boundary frame; comprising the following steps:
acquiring a first world coordinate, a first direction and a length, width and height of a three-dimensional boundary frame of a target object;
Calculating a second world coordinate of the vertex on the three-dimensional boundary frame of the target object according to the first world coordinate, the first direction and the length, width and height of the three-dimensional boundary frame;
Converting the second world coordinates into camera coordinates;
calculating the physical coordinates of the image of the vertex according to the coordinates of the camera;
calculating to obtain a physical coordinate boundary of the virtual camera according to the image and the view angle of the virtual camera;
if the physical coordinates of the image of any vertex are within the physical coordinate boundary of the virtual camera, the target object is within the visual field of the virtual camera;
generating a detection point of the target object according to the physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the image; comprising the following steps:
calculating the image pixel coordinates of the vertex according to the image physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the width and height of the output image pixel;
Calculating the length and width of a two-dimensional boundary frame of the target object according to the image pixel coordinates;
according to the length, width and height of the three-dimensional boundary frame, calculating and obtaining the body diagonal length of the three-dimensional boundary frame;
Calculating to obtain a contraction step length according to the length and width of the two-dimensional boundary frame and the body diagonal length of the three-dimensional boundary frame;
Shrinking the three-dimensional boundary frame according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
generating a detection point of the target object according to the contracted three-dimensional boundary frame and the center point of the target object;
If any detection point is not blocked by the blocking object, marking the target object in the image, including:
acquiring a third world coordinate of the virtual camera;
judging whether a shielding object exists between the virtual camera and the detection point according to the third world coordinate and the world coordinate of the detection point, and if any detection point is not shielded, marking a two-dimensional boundary box of the target object in the image.
2. The method of claim 1, wherein shrinking the three-dimensional bounding box according to the shrink step size and the body diagonal length of the three-dimensional bounding box comprises:
calculating to obtain shrinkage parameters according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
setting a shrinkage ratio;
According to the shrinkage proportion, shrinking the three-dimensional boundary frame along the body diagonal direction of the target object; updating the shrinkage ratio, and taking the difference of the shrinkage ratio minus the shrinkage parameter as a new shrinkage ratio; this step is repeated until the shrinkage ratio is less than or equal to zero.
3. An image annotation device for use in three-dimensional simulation, comprising:
the construction module is used for constructing a three-dimensional scene, wherein the three-dimensional scene comprises a target object, a virtual camera and a shielding object;
The acquisition module acquires an image in a visual field range through the virtual camera;
the generation module is used for judging whether the target object is in the visual field of the virtual camera or not, and if the target object is in the visual field of the virtual camera, generating a detection point of the target object; comprising the following steps:
Acquiring a three-dimensional boundary frame of a target object, and calculating physical coordinates of each vertex on the three-dimensional boundary frame in an image according to the three-dimensional boundary frame; comprising the following steps:
acquiring a first world coordinate, a first direction and a length, width and height of a three-dimensional boundary frame of a target object;
Calculating a second world coordinate of the vertex on the three-dimensional boundary frame of the target object according to the first world coordinate, the first direction and the length, width and height of the three-dimensional boundary frame;
Converting the second world coordinates into camera coordinates;
calculating the physical coordinates of the image of the vertex according to the coordinates of the camera;
calculating to obtain a physical coordinate boundary of the virtual camera according to the image and the view angle of the virtual camera;
if the physical coordinates of the image of any vertex are within the physical coordinate boundary of the virtual camera, the target object is within the visual field of the virtual camera;
generating a detection point of the target object according to the physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the image; comprising the following steps:
calculating the image pixel coordinates of the vertex according to the image physical coordinates of the vertex, the physical coordinate boundary of the virtual camera and the width and height of the output image pixel;
Calculating the length and width of a two-dimensional boundary frame of the target object according to the image pixel coordinates;
according to the length, width and height of the three-dimensional boundary frame, calculating and obtaining the body diagonal length of the three-dimensional boundary frame;
Calculating to obtain a contraction step length according to the length and width of the two-dimensional boundary frame and the body diagonal length of the three-dimensional boundary frame;
Shrinking the three-dimensional boundary frame according to the shrinkage step length and the body diagonal length of the three-dimensional boundary frame;
generating a detection point of the target object according to the contracted three-dimensional boundary frame and the center point of the target object;
The labeling module is used for labeling the target object in the image if any detection point is not blocked by the blocking object, and comprises the following steps:
acquiring a third world coordinate of the virtual camera;
judging whether a shielding object exists between the virtual camera and the detection point according to the third world coordinate and the world coordinate of the detection point, and if any detection point is not shielded, marking a two-dimensional boundary box of the target object in the image.
4. An electronic device, comprising:
One or more processors;
A memory for storing one or more programs;
the one or more programs, when executed by the one or more processors, cause the one or more processors to implement the method of any of claims 1-2.
5. A computer readable storage medium having stored thereon computer instructions which, when executed by a processor, implement the steps of the method according to any of claims 1-2.
CN202111003690.2A 2021-08-30 2021-08-30 Image labeling method and device used in three-dimensional simulation and electronic equipment Active CN113763569B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202111003690.2A CN113763569B (en) 2021-08-30 2021-08-30 Image labeling method and device used in three-dimensional simulation and electronic equipment

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202111003690.2A CN113763569B (en) 2021-08-30 2021-08-30 Image labeling method and device used in three-dimensional simulation and electronic equipment

Publications (2)

Publication Number Publication Date
CN113763569A CN113763569A (en) 2021-12-07
CN113763569B true CN113763569B (en) 2024-10-01

Family

ID=78791902

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202111003690.2A Active CN113763569B (en) 2021-08-30 2021-08-30 Image labeling method and device used in three-dimensional simulation and electronic equipment

Country Status (1)

Country Link
CN (1) CN113763569B (en)

Families Citing this family (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN114333489A (en) * 2021-12-30 2022-04-12 广州小鹏汽车科技有限公司 Remote driving simulation method, device and simulation system
CN114898076B (en) * 2022-03-29 2023-04-21 北京城市网邻信息技术有限公司 Model label adding method and device, electronic equipment and storage medium
CN116363085B (en) * 2023-03-21 2024-01-12 江苏共知自动化科技有限公司 Industrial part target detection method based on small sample learning and virtual synthesized data
CN116012843B (en) * 2023-03-24 2023-06-30 北京科技大学 Virtual scene data annotation generation method and system

Citations (2)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN103489214A (en) * 2013-09-10 2014-01-01 北京邮电大学 Virtual reality occlusion handling method, based on virtual model pretreatment, in augmented reality system
CN109840947A (en) * 2017-11-28 2019-06-04 广州腾讯科技有限公司 Implementation method, device, equipment and the storage medium of augmented reality scene

Family Cites Families (8)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106126579B (en) * 2016-06-17 2020-04-28 北京市商汤科技开发有限公司 Object identification method and device, data processing device and terminal equipment
US10699165B2 (en) * 2017-10-30 2020-06-30 Palo Alto Research Center Incorporated System and method using augmented reality for efficient collection of training data for machine learning
JP2020008917A (en) * 2018-07-03 2020-01-16 株式会社Eidea Augmented reality display system, augmented reality display method, and computer program for augmented reality display
US10733742B2 (en) * 2018-09-26 2020-08-04 International Business Machines Corporation Image labeling
CN111160261A (en) * 2019-12-30 2020-05-15 北京每日优鲜电子商务有限公司 Sample image labeling method and device for automatic sales counter and storage medium
CN112258610B (en) * 2020-10-10 2023-12-01 万物镜像(北京)计算机系统有限公司 Image labeling method and device, storage medium and electronic equipment
CN112150575B (en) * 2020-10-30 2023-09-01 深圳市优必选科技股份有限公司 Scene data acquisition method, model training method and device and computer equipment
CN112819804B (en) * 2021-02-23 2024-07-12 西北工业大学 Insulator defect detection method based on improved YOLOv convolutional neural network


Also Published As

Publication number Publication date
CN113763569A (en) 2021-12-07


Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant