
CN111223053A - Data enhancement method based on depth image - Google Patents

Info

Publication number
CN111223053A
CN111223053A CN201911128481.3A CN201911128481A CN111223053A CN 111223053 A CN111223053 A CN 111223053A CN 201911128481 A CN201911128481 A CN 201911128481A CN 111223053 A CN111223053 A CN 111223053A
Authority
CN
China
Prior art keywords
coordinate system
transformation
point cloud
image
depth image
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN201911128481.3A
Other languages
Chinese (zh)
Inventor
叶平
孙亮
张治广
徐煜秾
王树义
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Beijing University of Posts and Telecommunications
Original Assignee
Beijing University of Posts and Telecommunications
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Beijing University of Posts and Telecommunications
Priority to CN201911128481.3A
Publication of CN111223053A
Legal status: Pending

Classifications

    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T5/00 Image enhancement or restoration
    • G06T5/70 Denoising; Smoothing
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10028 Range image; Depth image; 3D point clouds
    • G PHYSICS
    • G06 COMPUTING OR CALCULATING; COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning

Landscapes

  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Image Processing (AREA)
  • Length Measuring Devices By Optical Means (AREA)

Abstract

The invention proposes a data enhancement method based on depth images, suitable for depth-image-based algorithms in computer vision such as recognition, target detection, and behavior recognition. The method consists of four parts: conversion of pixel coordinates to a three-dimensional point cloud, spatial transformation of the point cloud, conversion of the point cloud back to pixel coordinates, and minimum value filtering. The pixel-to-point-cloud conversion uses the transformation relationships among the pixel, image, camera, and world coordinate systems to convert the planar pixel coordinate points of the depth image into a three-dimensional point cloud in the world coordinate system. The spatial transformation then applies a random translation and a random rotation to that point cloud to form a new point cloud. Using the transformation from the world coordinate system to the pixel coordinate system, the new point cloud is projected back into a depth image. After minimum value filtering, the new, data-enhanced depth image is obtained. The method provides a means of data expansion for depth-image-based research in the field of computer vision and can greatly improve the generalization ability of the network model.

Figure 201911128481

Description

Data enhancement method based on depth image
Technical field:
The invention provides a data enhancement method based on depth images. It applies to computer vision algorithms that operate on depth images, such as recognition, target detection, and behavior recognition.
Background art:
In recent years, deep learning has been used more and more widely in computer vision, and its strong performance on many problems in the field has drawn a growing number of researchers to it. Deep learning performs well because deep convolutional networks have strong expressive capacity and can be trained toward a chosen training objective. For the same reason, however, a network model needs a large, even massive, amount of data to drive training; without it the model is prone to overfitting. In practice, not every data set has a large number of training samples, so data enhancement becomes an important step in model training. Effective data expansion increases both the number and the diversity of training samples, which helps avoid overfitting and can improve model performance. Common image data enhancement methods include horizontal flipping, random rotation, random scaling, random cropping, and random translation. These methods are designed for RGB images and are not applicable to other image types.
With the development of binocular cameras in recent years, their cost has fallen steadily, and more and more researchers apply the depth images they acquire to computer vision tasks such as human skeleton key point detection, human behavior recognition, and gesture recognition. The data enhancement methods common for RGB images do not carry over to depth images: each pixel of a depth image stores the depth distance from that position to the camera, so applying an RGB-style enhancement directly distorts the picture. To address this problem, the invention provides a data enhancement method based on depth images. Following the imaging principle of a depth image, it converts the pixel points of the image into three-dimensional space through the transformation from the image coordinate system to the world coordinate system, forming a three-dimensional point cloud. It then applies a pose transformation to the point cloud about the world coordinate origin, and finally converts the transformed point cloud back to pixel points through the transformation from the world coordinate system to the image coordinate system, forming a new image. In addition, a smoothing method based on minimum value filtering is proposed to remove noise from the depth image and to fill the blank spots produced by the transformation. The method can improve the generalization capability and accuracy of the network model.
Summary of the invention:
The invention aims to provide a data enhancement method for deep learning on depth images in the field of computer vision. The method enhances depth image data during network model training so as to improve the generalization capability and accuracy of the trained model.
The invention mainly adopts the following scheme:
The data enhancement method based on depth images consists of four parts: conversion of pixel coordinates to a three-dimensional point cloud, spatial transformation of the point cloud, conversion of the point cloud to pixel coordinates, and minimum value filtering. The algorithm flow is shown in fig. 1. The pixel-to-point-cloud conversion comprises the conversions from the pixel coordinate system to the image coordinate system, from the image coordinate system to the camera coordinate system, and from the camera coordinate system to the world coordinate system. Multiplying the conversion matrices among these four coordinate systems in sequence converts the pixel points of the depth image into three-dimensional space, forming a three-dimensional point cloud. The spatial transformation comprises a random translation and a random rotation of the point cloud. The rotation comprises rotations about the X, Y, and Z axes; the rotation angles are generated from random numbers, and the rotation is then carried out by the corresponding transformation matrix. The random translation comprises translations along the X, Y, and Z axes; the translation distances are generated from random numbers, and the translation is carried out by a translation transformation matrix.
The point-cloud-to-pixel conversion comprises the conversions from the world coordinate system to the camera coordinate system, from the camera coordinate system to the image coordinate system, and from the image coordinate system to the pixel coordinate system. Combining the transformation matrices among the four coordinate systems projects the transformed point cloud into the depth image. When the position of the point cloud relative to the camera changes, some points are occluded by others, so the converted depth image has missing data. Minimum value filtering fills in the lost pixel points, so that the final output depth image is clearer.
The pixel-to-point-cloud conversion takes a depth image as input and converts its pixel points into a three-dimensional point cloud using the conversion relationships among the pixel plane coordinate system (u, v), the image coordinate system (x, y), the camera coordinate system (X_c, Y_c, Z_c), and the world coordinate system (X_w, Y_w, Z_w). First, each pixel point is converted to planar image coordinates by the conversion matrix from the pixel coordinate system to the image coordinate system; the relationship between the two systems is shown in fig. 2, and the matrix transformation between (x, y) and (u, v) is shown in fig. 3. Next, the image plane points are projected into the camera coordinate system according to the geometry of camera imaging, which is shown in fig. 4; the transformation matrix from (x, y) to (X_c, Y_c, Z_c) is shown in fig. 5. The points in the camera coordinate system are then expressed relative to the established world coordinate system; the transformation matrix from (X_c, Y_c, Z_c) to (X_w, Y_w, Z_w) is shown in fig. 6. Finally, combining these coordinate system transformations gives the transformation from the depth image to the world coordinate system; its matrix is shown in fig. 7, where M is the projection matrix. Through the projection matrix, the pixel points of the depth map are converted into a three-dimensional point cloud in the world coordinate system; the result is shown in fig. 8.
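The chain of conversions described above can be sketched in a few lines. This is a minimal illustration rather than the patented implementation: it assumes a pinhole camera whose intrinsics are given in pixel units (`fx`, `fy`, `cx`, `cy` are hypothetical names standing for f/dx, f/dy, u_0, v_0), it assumes the world coordinate system coincides with the camera coordinate system, and it treats a depth of 0 as a missing measurement.

```python
from typing import List, Tuple

Point = Tuple[float, float, float]


def depth_to_points(depth: List[List[float]], fx: float, fy: float,
                    cx: float, cy: float) -> List[Point]:
    """Back-project every valid pixel (u, v) with depth Z to a 3D point (X, Y, Z)."""
    points: List[Point] = []
    for v, row in enumerate(depth):        # v is the row index
        for u, z in enumerate(row):        # u is the column index
            if z <= 0:                     # assumption: 0 marks missing data
                continue
            x = (u - cx) * z / fx          # pixel -> image -> camera, X component
            y = (v - cy) * z / fy          # pixel -> image -> camera, Y component
            points.append((x, y, z))       # world == camera (assumed R = I, t = 0)
    return points
```

If the world coordinate system differs from the camera coordinate system, each returned point would additionally be mapped through the camera-to-world rotation and translation.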
The spatial transformation of the point cloud consists of rotations about the X, Y, and Z axes of the world coordinate system and translations along those axes, expressed as a rotation transformation matrix R and a translation transformation matrix T. R is obtained by combining three rotation matrices: one about the X axis, one about the Y axis, and one about the Z axis, shown in figs. 9, 10, and 11 respectively. The rotation angles [ψ, ω, θ]^T about the X, Y, and Z axes of the world coordinate system are generated from random numbers. The translation transformation matrix T, which covers translation along the X axis, the Y axis, and the Z axis, is shown in fig. 12. The translation distances [x, y, z]^T along the X, Y, and Z axes of the world coordinate system are likewise generated from random numbers.
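A rough sketch of this pose transformation follows. The three axis rotations use the standard right-handed rotation matrices; the composition order R = R_z · R_y · R_x is an assumption, since the text does not state the order, and the translation is applied as a vector addition, which is equivalent to a homogeneous translation matrix. Function names are illustrative only, and angles are in radians.

```python
import math
from typing import List, Sequence, Tuple

Matrix = List[List[float]]
Point = Tuple[float, ...]


def rot_x(psi: float) -> Matrix:
    c, s = math.cos(psi), math.sin(psi)
    return [[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]]


def rot_y(omega: float) -> Matrix:
    c, s = math.cos(omega), math.sin(omega)
    return [[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]]


def rot_z(theta: float) -> Matrix:
    c, s = math.cos(theta), math.sin(theta)
    return [[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]]


def matmul(a: Matrix, b: Matrix) -> Matrix:
    """Product of two 3x3 matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(3)) for j in range(3)]
            for i in range(3)]


def transform_points(points: Sequence[Sequence[float]],
                     angles: Sequence[float],
                     shift: Sequence[float]) -> List[Point]:
    """Rotate each point by (psi, omega, theta), then translate by shift."""
    psi, omega, theta = angles
    r = matmul(rot_z(theta), matmul(rot_y(omega), rot_x(psi)))  # assumed order
    return [
        tuple(sum(r[i][k] * p[k] for k in range(3)) + shift[i] for i in range(3))
        for p in points
    ]
```

For example, rotating the point (1, 0, 0) by 90 degrees about the Z axis and shifting it by 1 along Z yields approximately (0, 1, 1).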
The point-cloud-to-pixel conversion comprises the conversions from the world coordinate system (X_w, Y_w, Z_w) to the camera coordinate system (X_c, Y_c, Z_c), from the camera coordinate system to the image coordinate system (x, y), and from the image coordinate system to the pixel coordinate system (u, v). Inverting the previously obtained conversion matrix relationships among the four coordinate systems projects the transformed point cloud into the depth image, forming the transformed depth image.
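The re-projection step might look like the sketch below. It makes several assumptions that the text does not spell out: intrinsics are given in pixel units (`fx`, `fy`, `cx`, `cy` are hypothetical names), the world and camera coordinate systems coincide, projected coordinates are rounded to the nearest pixel, and when several points land on the same pixel the nearest one is kept. Pixels that receive no point stay at 0, which is the kind of gap the later filtering step is meant to fill.

```python
from typing import Iterable, List, Sequence


def points_to_depth(points: Iterable[Sequence[float]], fx: float, fy: float,
                    cx: float, cy: float, width: int, height: int) -> List[List[float]]:
    """Project 3D points to a depth image; 0.0 marks pixels with no data."""
    depth = [[0.0] * width for _ in range(height)]
    for x, y, z in points:
        if z <= 0:                              # behind or on the camera plane
            continue
        u = int(round(fx * x / z + cx))         # column index
        v = int(round(fy * y / z + cy))         # row index
        if 0 <= u < width and 0 <= v < height:
            # assumption: the point closest to the camera occludes the others
            if depth[v][u] == 0.0 or z < depth[v][u]:
                depth[v][u] = z
    return depth
```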
When the position of the point cloud relative to the camera changes, some points may be occluded by others, so part of the data in the converted depth image is lost. For example, fig. 13 shows the result of projecting the point cloud back to image pixel coordinates after changing its distance along the Z axis of the world coordinate system; the black lines in the figure are lost data. Minimum value filtering fills in the lost pixel points, giving the converted depth image a better imaging effect. The designed minimum value filter sorts the values of a coordinate point and its surrounding points and replaces the value of the coordinate point with the minimum non-zero value; the filter kernel is a 3 × 3 matrix. The final output depth image is obtained after this filtering. A before-and-after comparison of the filtered image is shown in fig. 14. The overall effect of the data enhancement method on an input depth image is shown in fig. 15; that example was produced by translating the input depth image along the Z axis of the world coordinate system and rotating it about the Z axis and about the Y axis of the world coordinate system.
① When the point cloud is rotated about the X, Y, and Z axes of the world coordinate system and randomly translated along those axes, the random rotation angle is taken between -5° and 5° and the translation distance between -0.3 m and 0.3 m; beyond this range, the point cloud is severely lost when it is re-projected into the depth image.
② After the transformed point cloud is projected into the depth image, filtering the depth image twice with the minimum value filter gives a good image effect.
The data enhancement method based on the depth image has the following advantages:
1. The method provides a means of data expansion for depth-image-based research in the field of computer vision, so that training a network model on a depth image data set is no longer limited to the original data set.
2. Each coordinate point of a depth image stores a distance value, so the data distribution of the captured images varies greatly with the installation position of the depth camera. By converting the image points into a three-dimensional point cloud and then applying a pose transformation to it, the method lets the network model learn, during training, the data distributions of different camera installation positions, which greatly improves its generalization capability.
3. Filtering the enhanced depth image with the minimum value filter effectively removes the data loss caused by points being occluded after the pose transformation of the point cloud, so the depth image produced by the enhancement is of better quality.
Drawings
FIG. 1 is a flow chart of a method for enhancing depth image data;
FIG. 2 is a diagram of a relationship between a pixel coordinate system and an image coordinate system;
FIG. 3 is a matrix transformation equation from a pixel coordinate system to an image coordinate system;
FIG. 4 is a diagram showing a transformation relationship between a camera coordinate system and an image coordinate system;
FIG. 5 is a matrix transformation equation from an image coordinate system to a camera coordinate system;
FIG. 6 is a matrix transformation equation from a camera coordinate system to a world coordinate system;
FIG. 7 is a matrix transformation equation from a pixel coordinate system to a world coordinate system;
FIG. 8 shows depth images and the corresponding three-dimensional point clouds after conversion from the image coordinate system to the world coordinate system;
FIG. 9 is an angular rotation transformation matrix equation about the X-axis of the world coordinate system;
FIG. 10 is an angular rotation transformation matrix equation about the Y-axis of the world coordinate system;
FIG. 11 is an angular rotation transformation matrix equation about the Z-axis of the world coordinate system;
FIG. 12 is a translation transformation matrix equation along world coordinate system XYZ axes;
FIG. 13 is a transformed depth image without minimum filtering;
FIG. 14 is a before-and-after comparison of a data-enhanced depth image processed by minimum value filtering;
FIG. 15 is a diagram of the data enhancement effect on depth images.
Detailed description:
The invention is further described below with reference to the figures and examples.
The data enhancement method based on depth images consists of four parts: conversion of pixel coordinates to a three-dimensional point cloud, spatial transformation of the point cloud, conversion of the point cloud to pixel coordinates, and minimum value filtering. The pixel-to-point-cloud conversion comprises the conversions from the pixel coordinate system to the image coordinate system, from the image coordinate system to the camera coordinate system, and from the camera coordinate system to the world coordinate system. Combining the conversion matrices among these four coordinate systems converts the image points of the depth image into three-dimensional space, forming a three-dimensional point cloud. The spatial transformation comprises a random translation and a random rotation of the point cloud; data expansion is achieved by applying this pose transformation to the point cloud. The rotation comprises rotations about the X, Y, and Z axes; the rotation angles are generated from random numbers, and the rotation is then carried out by the corresponding transformation matrix. The random translation comprises translations along the X, Y, and Z axes; the translation distances are generated from random numbers, and the translation is carried out by a translation transformation matrix.
The point-cloud-to-pixel conversion comprises the conversions from the world coordinate system to the camera coordinate system, from the camera coordinate system to the image coordinate system, and from the image coordinate system to the pixel coordinate system. Multiplying the transformation matrices among the four coordinate systems in sequence projects the transformed point cloud into the depth image. When the position of the point cloud relative to the camera changes, some points are occluded by others, so the converted depth image has missing data. Minimum value filtering fills in the lost pixel points, so that the converted depth image is clearer.
The pixel-to-point-cloud conversion comprises the conversions from the pixel coordinate system to the image coordinate system, from the image coordinate system to the camera coordinate system, and from the camera coordinate system to the world coordinate system. An input depth image is converted into a three-dimensional point cloud using the conversion relationships among the pixel plane coordinate system (u, v), the image coordinate system (x, y), the camera coordinate system (X_c, Y_c, Z_c), and the world coordinate system (X_w, Y_w, Z_w). Consider first the conversion from the pixel coordinate system to the image coordinate system. Both systems lie on the imaging plane and differ only in origin and unit of measurement; their relationship is shown in fig. 2. Since (u, v) gives only the column and row numbers of a pixel and does not express its position in physical units, an image coordinate system x-y in physical units (e.g., millimeters) is established. The intersection of the camera optical axis with the image plane is defined as the origin O_1 of this coordinate system, with the x axis parallel to the u axis and the y axis parallel to the v axis. Let (u_0, v_0) be the coordinates of O_1 in the u-v coordinate system, and let dx and dy be the physical dimensions of each pixel along the horizontal axis x and the vertical axis y. The conversion between the u-v coordinates and the x-y coordinates of each pixel is then expressed in matrix form as shown in fig. 3, and this matrix relation completes the conversion between the pixel coordinate system and the image coordinate system.
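Fig. 3 is not reproduced in this text. Under the definitions given here (origin O_1 at (u_0, v_0), pixel sizes dx and dy, and parallel axes), the relation would commonly be written as follows. This is the standard form and is an assumption about what the figure shows:

```latex
\[
u = \frac{x}{dx} + u_0, \qquad v = \frac{y}{dy} + v_0
\quad\Longleftrightarrow\quad
\begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\begin{bmatrix}
\frac{1}{dx} & 0 & u_0 \\
0 & \frac{1}{dy} & v_0 \\
0 & 0 & 1
\end{bmatrix}
\begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
\]
```

Inverting the relation gives x = (u - u_0) dx and y = (v - v_0) dy, which is the direction needed when starting from a depth image.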
Consider next the conversion from the image coordinate system to the camera coordinate system. The geometry of camera imaging is shown in fig. 4. Point O is the optical center (projection center) of the camera; the X_c and Y_c axes are parallel to the x and y axes of the imaging plane coordinate system; and the Z_c axis is the optical axis of the camera, perpendicular to the image plane. The intersection of the optical axis with the image plane is the principal point O_1 of the image. The rectangular coordinate system formed by point O and the X_c, Y_c, and Z_c axes is called the camera coordinate system, and OO_1 is the camera focal length. A point P(X_c, Y_c, Z_c) is projected onto the image plane by the light ray passing through the projection center; the corresponding image point is p(x, y). The transformation matrix equation derived from similar triangles is shown in fig. 5 and completes the conversion between the image coordinate system and the camera coordinate system. Finally, consider the conversion from the camera coordinate system to the world coordinate system. The world coordinate system is introduced to describe the position of the camera, and the relationship between the two systems can be represented by a rotation matrix R and a translation vector t. Suppose the homogeneous coordinates of a spatial point P are (X_w, Y_w, Z_w, 1)^T in the world coordinate system and (X_c, Y_c, Z_c, 1)^T in the camera coordinate system. The transformation matrix equation between the world coordinate system and the camera coordinate system is then shown in fig. 6, where R is a 3 × 3 rotation matrix and t is a 3 × 1 translation vector. Combining the above, the transformation matrix equation from the world coordinate system to the pixel plane coordinate system is shown in fig. 7, where M is the projection matrix.
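The figures with these equations are not reproduced in this text. Under the standard pinhole model, and writing f for the focal length OO_1 (the symbol f is an assumption, as the text does not name it), the similar-triangles relation and the world-to-camera relation would commonly be written as:

```latex
\[
Z_c \begin{bmatrix} x \\ y \\ 1 \end{bmatrix}
=
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix},
\qquad
\begin{bmatrix} X_c \\ Y_c \\ Z_c \\ 1 \end{bmatrix}
=
\begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
\]
\[
Z_c \begin{bmatrix} u \\ v \\ 1 \end{bmatrix}
=
\underbrace{
\begin{bmatrix} \frac{1}{dx} & 0 & u_0 \\ 0 & \frac{1}{dy} & v_0 \\ 0 & 0 & 1 \end{bmatrix}
\begin{bmatrix} f & 0 & 0 & 0 \\ 0 & f & 0 & 0 \\ 0 & 0 & 1 & 0 \end{bmatrix}
\begin{bmatrix} R & t \\ 0^{T} & 1 \end{bmatrix}
}_{M}
\begin{bmatrix} X_w \\ Y_w \\ Z_w \\ 1 \end{bmatrix}
\]
```

Here M is a 3 × 4 matrix. Because a depth image supplies Z_c at every pixel, the first relation can be solved for X_c = x Z_c / f and Y_c = y Z_c / f, which is what makes the pixel-to-point-cloud direction well defined.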
The conversion from the pixel coordinate system to the world coordinate system can be realized through the matrix equation, so that the pixel points in the depth image are converted into a three-dimensional space point cloud image under the world coordinate system, and the effect image is shown in fig. 8.
The spatial transformation of the point cloud consists of rotations about the X, Y, and Z axes of the world coordinate system and translations along those axes, expressed as a rotation transformation matrix R and a translation transformation matrix T. R is obtained by combining three rotation matrices: one about the X axis, one about the Y axis, and one about the Z axis, shown in figs. 9, 10, and 11 respectively. The rotation angles [ψ, ω, θ]^T about the X, Y, and Z axes of the world coordinate system are generated from random numbers within the range [-a, a]; the formula is [ψ, ω, θ]^T = (-1 + 2 × random(0, 1)) × a. The translation transformation matrix T, which covers translation along the X axis, the Y axis, and the Z axis, is shown in fig. 12. The translation distances [x, y, z]^T along the X, Y, and Z axes of the world coordinate system are generated from random numbers within the range [-b, b]; the formula is [x, y, z]^T = (-1 + 2 × random(0, 1)) × b.
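The sampling formula can be illustrated as follows. The defaults a = 5 degrees and b = 0.3 m follow the ranges stated elsewhere in this description; drawing an independent random number for each axis is an assumption, since the formula could also be read as one draw shared by all three axes. Function names are illustrative, and angles are converted to radians for use with trigonometric functions.

```python
import math
import random
from typing import Optional, Tuple

Triple = Tuple[float, ...]


def sample_uniform3(limit: float, rng: random.Random) -> Triple:
    """Three independent draws of (-1 + 2 * random(0, 1)) * limit."""
    return tuple((-1.0 + 2.0 * rng.random()) * limit for _ in range(3))


def sample_pose(a_deg: float = 5.0, b_m: float = 0.3,
                seed: Optional[int] = None) -> Tuple[Triple, Triple]:
    """Return (angles in radians, translation in meters) for one augmentation."""
    rng = random.Random(seed)
    angles = tuple(math.radians(v) for v in sample_uniform3(a_deg, rng))
    shift = sample_uniform3(b_m, rng)
    return angles, shift
```

Passing a fixed `seed` makes an augmentation reproducible, which can be convenient when debugging a training pipeline.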
The point-cloud-to-pixel conversion comprises the conversions from the world coordinate system (X_w, Y_w, Z_w) to the camera coordinate system (X_c, Y_c, Z_c), from the camera coordinate system to the image coordinate system (x, y), and from the image coordinate system to the pixel coordinate system (u, v). Using the conversion matrix relationships obtained among the four coordinate systems, the transformed point cloud is projected onto the pixel points of the depth image, forming the transformed depth image.
When the position of the point cloud relative to the camera changes, some points may be occluded by others, so part of the data in the converted depth image is lost. For example, fig. 13 shows the result of projecting the point cloud back to image pixel coordinates after changing its distance along the Z axis of the world coordinate system; the black lines in the figure are lost data. Minimum value filtering fills in the lost pixel points, giving the converted depth image a better imaging effect. The filter finds the minimum non-zero pixel value within the filter window and compares the center pixel with it; if the center pixel is less than this minimum, the center pixel is replaced by the minimum. The filter kernel is a 3 × 3 matrix, and a pixel whose depth value is less than 100 is treated as having depth 0. The final output depth image is obtained after this filtering. A before-and-after comparison of the filtered image is shown in fig. 14. The overall effect of the data enhancement method on an input depth image is shown in fig. 15; that example was produced by translating the input depth image along the Z axis of the world coordinate system and rotating it about the Z axis and about the Y axis of the world coordinate system.
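One way to read this filter is sketched below. It is an interpretation, not the authors' code: depth values below the threshold of 100 are first zeroed (the unit is not stated in the text), and each pixel is then replaced by the smallest non-zero value in its 3 × 3 neighborhood, with windows clipped at the image border. Windows that contain no valid value leave the pixel at 0.

```python
from typing import List, Sequence


def min_filter(depth: Sequence[Sequence[float]],
               threshold: float = 100.0) -> List[List[float]]:
    """3x3 minimum filter over non-zero values; fills holes left by re-projection."""
    height = len(depth)
    width = len(depth[0]) if height else 0
    # assumption: values below the threshold are treated as missing (0)
    src = [[v if v >= threshold else 0.0 for v in row] for row in depth]
    out = [row[:] for row in src]
    for r in range(height):
        for c in range(width):
            window = [
                src[rr][cc]
                for rr in range(max(0, r - 1), min(height, r + 2))
                for cc in range(max(0, c - 1), min(width, c + 2))
                if src[rr][cc] > 0
            ]
            if window:                      # keep 0 if no valid neighbor exists
                out[r][c] = min(window)
    return out
```

Applying the function twice in succession, as in `min_filter(min_filter(image))`, corresponds to the two filtering passes the description recommends.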

Claims (3)

1. A data enhancement method based on depth images, consisting mainly of four parts: conversion of pixel coordinates to a three-dimensional point cloud, spatial transformation of the three-dimensional point cloud, conversion of the three-dimensional point cloud to pixel coordinates, and minimum-value filtering, characterized in that:

The conversion of pixel coordinates to a three-dimensional point cloud comprises conversion from the pixel coordinate system to the image coordinate system, from the image coordinate system to the camera coordinate system, and from the camera coordinate system to the world coordinate system. By multiplying together the transformation matrices between the four coordinate systems, the pixels of the depth image are converted into three-dimensional space to form a three-dimensional point cloud. The spatial transformation of the three-dimensional point cloud comprises a random translation and a random rotation in three-dimensional point cloud space. The random pose transformation of the three-dimensional point cloud produces a new three-dimensional point cloud image. The rotation in three-dimensional point cloud space comprises rotations about the X, Y, and Z axes of the world coordinate system: rotation angles about the X, Y, and Z axes are generated from random numbers, and the corresponding transformation matrices then carry out the spatial rotation. The random translation in three-dimensional point cloud space comprises translations along the X, Y, and Z axes of the world coordinate system: translation distances along the X, Y, and Z axes are generated from random numbers, and the translation matrix carries out the spatial translation. The conversion of the three-dimensional point cloud to pixel coordinates comprises conversion from the world coordinate system to the camera coordinate system, from the camera coordinate system to the image coordinate system, and from the image coordinate system to the pixel coordinate system. By combining the transformation matrices between the four coordinate systems, the transformed three-dimensional point cloud is projected into a depth image. The minimum-value filtering further processes the depth image obtained after data enhancement: it fills in pixels that may have been lost during the data enhancement process and also smooths the depth image, so that the imaging effect is better. The invention provides a data expansion method for depth-image-based research in the field of computer vision and can improve the generalization ability and accuracy of network models trained on depth images.

2. The data enhancement method based on depth images according to claim 1, in which the spatial transformation of the three-dimensional point cloud consists of a random rotation and a random translation, characterized in that:

The random rotation selects, according to the strategy proposed by the invention, random rotation angles [ψ, ω, θ]^T about the X, Y, and Z axes of the world coordinate system within a certain range, and applies the rotation matrix R to the generated three-dimensional point cloud image in order to change the pose of the point cloud relative to the origin of the world coordinate system. The rotation matrix R comprises a rotation matrix about the X axis, a rotation matrix about the Y axis, and a rotation matrix about the Z axis; combining the three rotation matrices gives R. The random translation selects, according to the strategy proposed by the invention, random translation distances [x, y, z]^T along the X, Y, and Z axes of the world coordinate system within a certain range, and applies the translation matrix T to the generated three-dimensional point cloud image in order to change the position of the point cloud relative to the origin of the world coordinate system. The random translation and random rotation change the pose of the three-dimensional point cloud relative to the world coordinate system; when the three-dimensional point cloud is projected into the planar pixel-coordinate image, a new depth image is formed, which achieves the purpose of data expansion.

3. The data enhancement method based on depth images according to claim 1, in which the minimum-value filtering is characterized in that:

The minimum-value filtering further processes the depth image after a new depth image has been generated by data enhancement. During data enhancement, changing the position of the three-dimensional point cloud relative to the origin of the world coordinate system may cause some points to be occluded by other points, so that data is lost at some positions when the point cloud is projected to planar pixel coordinates; the screening strategy proposed by the invention fills in those positions. The minimum-value filtering also smooths the data in the depth image, so that the imaging effect is better.
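
The geometric steps named in claims 1 and 2 (pixels to point cloud, random rotation and translation, point cloud back to pixels) can be sketched as below. This is a minimal sketch under stated assumptions, not the patented procedure: it assumes a pinhole camera with intrinsics `fx`, `fy`, `cx`, `cy`, a world coordinate system that coincides with the camera coordinate system, the composition order R = Rz · Ry · Rx, and uniform sampling ranges `max_angle` and `max_shift`. None of these values or names are given in the claims; all function and parameter names are hypothetical.

```python
import numpy as np

def depth_to_points(depth, fx, fy, cx, cy):
    """Back-project non-zero pixels to 3D points (N, 3) with a pinhole model (assumed)."""
    v, u = np.nonzero(depth)                 # row index v, column index u
    z = depth[v, u].astype(np.float64)
    x = (u - cx) * z / fx
    y = (v - cy) * z / fy
    return np.stack([x, y, z], axis=1)

def rotation_matrix(psi, omega, theta):
    """Rotations about X (psi), Y (omega), Z (theta); the order Rz @ Ry @ Rx is assumed."""
    c, s = np.cos(psi), np.sin(psi)
    rx = np.array([[1.0, 0.0, 0.0], [0.0, c, -s], [0.0, s, c]])
    c, s = np.cos(omega), np.sin(omega)
    ry = np.array([[c, 0.0, s], [0.0, 1.0, 0.0], [-s, 0.0, c]])
    c, s = np.cos(theta), np.sin(theta)
    rz = np.array([[c, -s, 0.0], [s, c, 0.0], [0.0, 0.0, 1.0]])
    return rz @ ry @ rx

def points_to_depth(points, shape, fx, fy, cx, cy):
    """Project points to pixels; if several points hit one pixel, keep the nearest."""
    h, w = shape
    pts = points[points[:, 2] > 0]           # discard points behind the camera
    u = np.round(pts[:, 0] * fx / pts[:, 2] + cx).astype(int)
    v = np.round(pts[:, 1] * fy / pts[:, 2] + cy).astype(int)
    inside = (u >= 0) & (u < w) & (v >= 0) & (v < h)
    depth = np.full((h, w), np.inf)
    # Unbuffered minimum: handles repeated pixel indices correctly.
    np.minimum.at(depth, (v[inside], u[inside]), pts[inside, 2])
    depth[np.isinf(depth)] = 0.0             # pixels that received no point stay empty
    return depth

def augment_depth(depth, fx, fy, cx, cy, max_angle, max_shift, rng):
    """One random rigid transform of the point cloud, then re-projection."""
    pts = depth_to_points(depth, fx, fy, cx, cy)
    psi, omega, theta = rng.uniform(-max_angle, max_angle, size=3)
    shift = rng.uniform(-max_shift, max_shift, size=3)
    moved = pts @ rotation_matrix(psi, omega, theta).T + shift
    return points_to_depth(moved, depth.shape, fx, fy, cx, cy)
```

A call such as `augment_depth(depth, fx, fy, cx, cy, max_angle=0.1, max_shift=50.0, rng=np.random.default_rng())` returns one new depth image per invocation. Pixels that receive no projected point are left at 0, which is the kind of gap the minimum-value filtering of claim 3 is meant to fill.
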
CN201911128481.3A 2019-11-18 2019-11-18 Data enhancement method based on depth image Pending CN111223053A (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201911128481.3A CN111223053A (en) 2019-11-18 2019-11-18 Data enhancement method based on depth image

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201911128481.3A CN111223053A (en) 2019-11-18 2019-11-18 Data enhancement method based on depth image

Publications (1)

Publication Number Publication Date
CN111223053A true CN111223053A (en) 2020-06-02

Family

ID=70829013

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201911128481.3A Pending CN111223053A (en) 2019-11-18 2019-11-18 Data enhancement method based on depth image

Country Status (1)

Country Link
CN (1) CN111223053A (en)

Cited By (24)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639626A (en) * 2020-06-11 2020-09-08 深圳市泰沃德自动化技术有限公司 Three-dimensional point cloud data processing method, device, computer equipment and storage medium
CN111862185A (en) * 2020-07-24 2020-10-30 唯羲科技有限公司 Method for extracting plane from image
CN111996883A (en) * 2020-08-28 2020-11-27 四川长虹电器股份有限公司 Method for detecting width of road surface
CN112184806A (en) * 2020-09-14 2021-01-05 国家电网有限公司 Space distance measurement method based on three-dimensional live-action transformer substation
CN112767442A (en) * 2021-01-18 2021-05-07 中山大学 Pedestrian three-dimensional detection tracking method and system based on top view angle
CN113012210A (en) * 2021-03-25 2021-06-22 北京百度网讯科技有限公司 Method and device for generating depth map, electronic equipment and storage medium
CN113052797A (en) * 2021-03-08 2021-06-29 江苏师范大学 BGA solder ball three-dimensional detection method based on depth image processing
CN113282168A (en) * 2021-05-08 2021-08-20 青岛小鸟看看科技有限公司 Information input method and device of head-mounted display equipment and head-mounted display equipment
CN113920020A (en) * 2021-09-26 2022-01-11 中国舰船研究设计中心 Human point cloud real-time repairing method based on depth generation model
CN113971835A (en) * 2021-09-23 2022-01-25 深圳市联洲国际技术有限公司 Control method and device of household appliance, storage medium and terminal device
CN114119826A (en) * 2021-11-12 2022-03-01 苏州挚途科技有限公司 Image processing method and device and electronic equipment
CN114418903A (en) * 2022-01-21 2022-04-29 支付宝(杭州)信息技术有限公司 Human-computer interaction method and human-computer interaction device based on privacy protection
CN114627008A (en) * 2022-03-02 2022-06-14 厦门聚视智创科技有限公司 Depth image data enhancement method adopting quantitative deflection
CN114972694A (en) * 2022-05-16 2022-08-30 北京深度搜索科技有限公司 Object labeling method and device, device and storage medium
CN115061656A (en) * 2022-06-06 2022-09-16 中国电信股份有限公司 Random number generation method and device, electronic equipment and storage medium
CN115497242A (en) * 2022-09-07 2022-12-20 东南大学 Intelligent monitoring system and monitoring method for foreign matter invasion in railway business line construction
CN115511961A (en) * 2022-09-19 2022-12-23 忘平(广东)科技有限公司 Three-dimensional space positioning method, system and storage medium
CN115741687A (en) * 2022-11-15 2023-03-07 深圳市泰达机器人有限公司 Method, system and storage medium for visual recognition, tracking and processing of welding line
WO2023029969A1 (en) * 2021-08-31 2023-03-09 上海商汤智能科技有限公司 Image processing method and apparatus, and electronic device and computer-readable storage medium
CN115984801A (en) * 2023-03-07 2023-04-18 安徽蔚来智驾科技有限公司 Point cloud object detection method, computer equipment, storage medium and vehicle
CN116239062A (en) * 2023-05-11 2023-06-09 临工重机股份有限公司 Telescopic arm construction machine operation control method and telescopic arm construction machine
CN116862997A (en) * 2023-07-14 2023-10-10 安徽科力信息产业有限责任公司 Method, device, equipment and storage medium for calculating and verifying camera calibration
CN118505591A (en) * 2024-02-02 2024-08-16 中国医学科学院北京协和医院 CPR (CPR) period chest cross section based on depth camera
CN118521646A (en) * 2024-07-25 2024-08-20 中国铁塔股份有限公司江西省分公司 Image processing-based multi-machine type unmanned aerial vehicle power receiving frame alignment method and system

Citations (6)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106780592A (en) * 2016-06-30 2017-05-31 华南理工大学 Kinect depth reconstruction algorithms based on camera motion and image light and shade
CN107016704A (en) * 2017-03-09 2017-08-04 杭州电子科技大学 A kind of virtual reality implementation method based on augmented reality
WO2018094932A1 (en) * 2016-11-23 2018-05-31 北京清影机器视觉技术有限公司 Method and device for generating human eye observation image presented in stereoscopic vision
CN108470323A (en) * 2018-03-13 2018-08-31 京东方科技集团股份有限公司 A kind of image split-joint method, computer equipment and display device
WO2018185104A1 (en) * 2017-04-06 2018-10-11 B<>Com Method for estimating pose, associated device, system and computer program
CN110349251A (en) * 2019-06-28 2019-10-18 深圳数位传媒科技有限公司 A kind of three-dimensional rebuilding method and device based on binocular camera

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
BISHWAJIT PAL 等: "3D Point Cloud Generation from 2D Depth Camera Images Using Successive Triangulation" *

Cited By (33)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN111639626B (en) * 2020-06-11 2021-09-17 深圳市泰沃德技术有限公司 Three-dimensional point cloud data processing method and device, computer equipment and storage medium
CN111639626A (en) * 2020-06-11 2020-09-08 深圳市泰沃德自动化技术有限公司 Three-dimensional point cloud data processing method, device, computer equipment and storage medium
CN111862185A (en) * 2020-07-24 2020-10-30 唯羲科技有限公司 Method for extracting plane from image
CN111996883A (en) * 2020-08-28 2020-11-27 四川长虹电器股份有限公司 Method for detecting width of road surface
CN111996883B (en) * 2020-08-28 2021-10-29 四川长虹电器股份有限公司 Method for detecting width of road surface
CN112184806A (en) * 2020-09-14 2021-01-05 国家电网有限公司 Space distance measurement method based on three-dimensional live-action transformer substation
CN112767442A (en) * 2021-01-18 2021-05-07 中山大学 Pedestrian three-dimensional detection tracking method and system based on top view angle
CN112767442B (en) * 2021-01-18 2023-07-21 中山大学 A three-dimensional detection and tracking method and system for pedestrians based on top view
CN113052797A (en) * 2021-03-08 2021-06-29 江苏师范大学 BGA solder ball three-dimensional detection method based on depth image processing
CN113052797B (en) * 2021-03-08 2024-01-05 江苏师范大学 Three-dimensional detection method of BGA solder balls based on depth image processing
CN113012210A (en) * 2021-03-25 2021-06-22 北京百度网讯科技有限公司 Method and device for generating depth map, electronic equipment and storage medium
CN113282168A (en) * 2021-05-08 2021-08-20 青岛小鸟看看科技有限公司 Information input method and device of head-mounted display equipment and head-mounted display equipment
WO2023029969A1 (en) * 2021-08-31 2023-03-09 上海商汤智能科技有限公司 Image processing method and apparatus, and electronic device and computer-readable storage medium
CN113971835A (en) * 2021-09-23 2022-01-25 深圳市联洲国际技术有限公司 Control method and device of household appliance, storage medium and terminal device
CN113920020B (en) * 2021-09-26 2023-07-18 中国舰船研究设计中心 A real-time inpainting method of human body point cloud based on deep generative model
CN113920020A (en) * 2021-09-26 2022-01-11 中国舰船研究设计中心 Human point cloud real-time repairing method based on depth generation model
CN114119826A (en) * 2021-11-12 2022-03-01 苏州挚途科技有限公司 Image processing method and device and electronic equipment
CN114418903A (en) * 2022-01-21 2022-04-29 支付宝(杭州)信息技术有限公司 Human-computer interaction method and human-computer interaction device based on privacy protection
CN114627008A (en) * 2022-03-02 2022-06-14 厦门聚视智创科技有限公司 Depth image data enhancement method adopting quantitative deflection
CN114972694A (en) * 2022-05-16 2022-08-30 北京深度搜索科技有限公司 Object labeling method and device, device and storage medium
CN115061656A (en) * 2022-06-06 2022-09-16 中国电信股份有限公司 Random number generation method and device, electronic equipment and storage medium
CN115061656B (en) * 2022-06-06 2024-10-08 中国电信股份有限公司 Random number generation method and device, electronic equipment and storage medium
CN115497242A (en) * 2022-09-07 2022-12-20 东南大学 Intelligent monitoring system and monitoring method for foreign matter invasion in railway business line construction
CN115497242B (en) * 2022-09-07 2023-11-17 东南大学 An intelligent monitoring system and monitoring method for foreign matter intrusion for railway business line construction
CN115511961A (en) * 2022-09-19 2022-12-23 忘平(广东)科技有限公司 Three-dimensional space positioning method, system and storage medium
CN115741687A (en) * 2022-11-15 2023-03-07 深圳市泰达机器人有限公司 Method, system and storage medium for visual recognition, tracking and processing of welding line
CN115984801A (en) * 2023-03-07 2023-04-18 安徽蔚来智驾科技有限公司 Point cloud object detection method, computer equipment, storage medium and vehicle
CN116239062B (en) * 2023-05-11 2023-07-07 临工重机股份有限公司 Telescopic arm construction machine operation control method and telescopic arm construction machine
CN116239062A (en) * 2023-05-11 2023-06-09 临工重机股份有限公司 Telescopic arm construction machine operation control method and telescopic arm construction machine
CN116862997A (en) * 2023-07-14 2023-10-10 安徽科力信息产业有限责任公司 Method, device, equipment and storage medium for calculating and verifying camera calibration
CN118505591A (en) * 2024-02-02 2024-08-16 中国医学科学院北京协和医院 CPR (CPR) period chest cross section based on depth camera
CN118505591B (en) * 2024-02-02 2024-11-15 中国医学科学院北京协和医院 A depth camera-based chest cross-section during CPR
CN118521646A (en) * 2024-07-25 2024-08-20 中国铁塔股份有限公司江西省分公司 Image processing-based multi-machine type unmanned aerial vehicle power receiving frame alignment method and system

Similar Documents

Publication Publication Date Title
CN111223053A (en) Data enhancement method based on depth image
Gao et al. A general deep learning based framework for 3D reconstruction from multi-view stereo satellite images
CN110070025B (en) Monocular image-based three-dimensional target detection system and method
CN111079545A (en) Three-dimensional target detection method and system based on image restoration
CN104778656B (en) Fisheye image correcting method based on spherical perspective projection
TW202004679A (en) Image feature extraction method and saliency prediction method including the same
CN1702693A (en) Image providing method and equipment
CN108592884B (en) A kind of general linear array satellite core line image generating method
CN114549669B (en) Color three-dimensional point cloud acquisition method based on image fusion technology
CN115330935A (en) A 3D reconstruction method and system based on deep learning
CN114266900B (en) Monocular 3D target detection method based on dynamic convolution
CN116363290A (en) A texture map generation method for 3D reconstruction of large-scale scenes
CN111105452A (en) High-low resolution fusion stereo matching method based on binocular vision
CN117522803A (en) Precise positioning method of bridge components based on binocular vision and target detection
CN113781305A (en) Point cloud fusion method of double-monocular three-dimensional imaging system
CN117611438A (en) A reconstruction method from 2D lane lines to 3D lane lines based on monocular images
CN107958489B (en) Surface reconstruction method and device
Luo et al. A defocused calibration method using ROI-patched MAM-UNet and progressive training for froth flotation process
CN120782937B (en) Optimization method and apparatus for sparse-view 3D Gaussian splashing
CN116912158A (en) Workpiece quality inspection methods, devices, equipment and readable storage media
CN119693543A (en) A new perspective synthesis 3D reconstruction method and system based on a single panoramic image
CN113240584A (en) Multitask gesture picture super-resolution method based on picture edge information
CN120125746A (en) A video 3D reconstruction method and system for high-quality digitization of cultural relics
CN116129036B (en) A Depth-Information-Guided Method for Automatic 3D Structure Restoration of Omnidirectional Images
JP5249157B2 (en) Method and apparatus for improving accuracy of three-dimensional model at high speed

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
WD01 Invention patent application deemed withdrawn after publication

Application publication date: 20200602