Pose state-based laser radar point cloud inter-frame encoding and decoding method
Technical Field
The invention belongs to the technical field of image coding, and particularly relates to a pose state-based laser radar point cloud inter-frame coding and decoding method.
Background
A point cloud is a massive set of points that expresses the spatial distribution and surface characteristics of a target under the same spatial reference system. A point cloud generally consists of two parts, geometric information and attribute information: the geometric information refers to position information in three-dimensional space, and the attribute information includes reflectance, color, and the like. Because a point cloud can accurately express the information of a scene or object, point clouds are widely applied in fields such as intelligent driving, virtual reality, augmented reality, and map modeling. However, the data volume of a point cloud is huge, and its sparsity and disorder make storage and transmission very challenging, so exploring and researching efficient point cloud coding methods is very important.
In a continuous point cloud sequence, the interval between two adjacent point cloud frames is short, and the laser radar moves only a small distance. Therefore, adjacent point cloud frames share largely similar spatial structures, objects in the scene undergo only small positional changes, and a large amount of redundancy exists in the time dimension. The Chinese patent with publication number CN111899152A discloses a point cloud data compression method based on projection and video stitching. That method analyzes the geometric and attribute characteristics of the point cloud data to determine a corresponding projection strategy and an optimal projection angle, obtains a group of spatially correlated two-dimensional pictures by projecting a projection area multiple times at the same projection angle, and stitches the group of two-dimensional pictures into a video file. However, that method is suited to dense point cloud data and generalizes poorly across point cloud types.
Disclosure of Invention
In order to solve the technical problems, the invention provides a pose state-based laser radar point cloud inter-frame coding method, which can reduce time domain redundancy in a point cloud sequence and improve data compression performance by means of pose relations of adjacent frame point clouds.
The technical scheme of the invention is as follows:
A pose state-based laser radar point cloud inter-frame coding method comprises the following steps:
The encoding process comprises: mapping the geometric information of the three-dimensional point cloud into a two-dimensional distance map; obtaining the pose state and calculating a transformation matrix M from the pose state; operating on the I frame with the transformation matrix M to obtain a predicted distance map; obtaining residual coding information for the P frame and I frame coding information from the predicted distance map; and synthesizing the transformation matrix M and the coding information into an output code stream;
The decoding process sequentially recovers the transformation matrix, the residual coding information, and the reference point cloud coding information from the code stream to obtain the original distances of the reference point cloud and the point cloud to be encoded on the two-dimensional distance map, and finally converts the distance maps back into a three-dimensional point cloud.
Further, the pose relationship is obtained according to the IMU information in the Kitti dataset.
Further, the reference point cloud is an I-frame point cloud and the point cloud to be encoded is a P-frame point cloud, and the point cloud sequence is encoded in the form "IPPPIPPP…".
Further, the specific steps of the encoding process are as follows:
S1, dividing the point cloud sequence into a reference point cloud and a point cloud to be encoded, and mapping the geometric information of the reference point cloud and the point cloud to be encoded into two-dimensional distance maps, respectively obtaining an I frame original distance map of the reference point cloud and a P frame original distance map of the point cloud to be encoded; the point cloud sequence is encoded in the "IPPPIPPP…" form;
S2, acquiring the pose state according to the IMU information in the Kitti dataset, and calculating a transformation matrix M from the pose state;
S3, operating on the reference point cloud I frame with the transformation matrix M to obtain a predicted distance map of the P frame point cloud to be encoded;
S4, taking the difference between the predicted distance map obtained in step S3 and the original distance map of the P frame in step S1 to obtain a prediction residual;
S5, carrying out quantization processing and JPEG-LS lossless coding on the prediction residual obtained in step S4 to obtain residual coding information;
S6, the original distance map of the I frame in the step S1 is also encoded by JPEG-LS to obtain I frame encoding information;
S7, synthesizing the transformation matrix M in step S2, the residual coding information in step S5, and the I frame coding information in step S6 into an output code stream.
Further, the specific steps of the decoding process are as follows:
M1, after receiving the code stream, taking out the I frame coding information in the code stream and obtaining the I frame distance map through JPEG-LS decoding;
M2, taking out the transformation matrix M in the code stream and applying it to the I frame distance map obtained in step M1 to obtain a predicted distance map;
M3, taking out the residual coding information in the code stream and performing inverse quantization and JPEG-LS decoding on it to obtain the prediction residual;
M4, reconstructing the P frame distance map by combining the prediction residual obtained in step M3 with the predicted distance map obtained in step M2;
M5, converting the I frame distance map obtained in step M1 and the P frame distance map obtained in step M4 into point clouds, finally recovering the two-dimensional distance maps into the three-dimensional point cloud.
Further, the formula for operating on the reference point cloud I frame with the transformation matrix M in the encoding process and the decoding process is as follows:

p' = M · i,  with  M = [ R3×3  T3×1 ; 0  1 ]

where i is any point of the I-frame point cloud (in homogeneous coordinates), p' is the corresponding point of the predicted point cloud, M is the transformation matrix, R3×3 is the rotation matrix, and T3×1 is the translation matrix.
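As a minimal illustration of this prediction formula (assuming NumPy; the function and variable names are not from the patent), a point of the I frame is moved into the predicted frame by one homogeneous matrix multiplication:

```python
import numpy as np

def predict_point(i_point, R, T):
    """Apply the homogeneous transform M = [[R, T], [0, 1]] to an
    I-frame point i, yielding the predicted point p' = M * i."""
    M = np.eye(4)
    M[:3, :3] = R
    M[:3, 3] = np.asarray(T).ravel()
    i_h = np.append(i_point, 1.0)      # homogeneous coordinates
    return (M @ i_h)[:3]

# Identity rotation plus a pure translation just shifts the point by T.
p = predict_point(np.array([1.0, 2.0, 3.0]),
                  np.eye(3), np.array([0.5, 0.0, -1.0]))
```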
Compared with the prior art, the invention has the beneficial effects that:
1. The invention maps the geometric information of a three-dimensional point cloud into a two-dimensional distance map and divides the point cloud sequence into a reference point cloud and a point cloud to be encoded. Using the pose relationship between the reference point cloud and the point cloud to be encoded, an inter-frame prediction method for the point cloud to be encoded is designed: the point cloud to be encoded is predicted with a transformation matrix, so that only the difference between the prediction and the original point cloud needs to be encoded. The whole encoding and decoding process achieves data compression of a continuous point cloud sequence, reduces the amount of data required for transmission, reduces the time-domain redundancy of the point cloud sequence, and improves compression performance;
2. The method provided by the invention achieves high coding performance when encoding and decoding laser radar point cloud sequences.
Drawings
FIG. 1 is a schematic diagram of the encoding process of the present invention;
FIG. 2 is a schematic diagram of the decoding process of the present invention.
Detailed Description
The invention is further described below with reference to the drawings and preferred embodiments.
The invention provides a pose state-based laser radar point cloud inter-frame encoding and decoding method. In the encoding process, the geometric information of the three-dimensional point cloud is first mapped into a two-dimensional distance map, and the point cloud sequence is divided into a reference point cloud and a point cloud to be encoded. An inter-frame prediction method for the point cloud to be encoded is designed using the pose relationship between the reference point cloud and the point cloud to be encoded; finally the prediction residual is quantized and encoded, and the transformation matrix, the residual coding information, and the reference point cloud coding information are synthesized into an output code stream. The decoding process sequentially recovers the transformation matrix, the residual coding information, and the reference point cloud coding information from the code stream to obtain the original distances of the reference point cloud and the point cloud to be encoded on the two-dimensional distance map, and finally converts the distance maps back into the three-dimensional point cloud.
The encoding process comprises the following steps:
S1, dividing the point cloud sequence into a reference point cloud and a point cloud to be encoded, and mapping their geometric information into two-dimensional distance maps to respectively obtain an I frame original distance map of the reference point cloud and a P frame original distance map of the point cloud to be encoded. The P frame point cloud is predicted from the I frame point cloud combined with the pose information; the number of frames of point cloud to be encoded that are predicted from one frame of reference point cloud is determined by a parameter n. In this embodiment the parameter n = 3 is set, i.e., the point cloud sequence is encoded in the "IPPPIPPP…" form;
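The mapping in step S1 from 3-D geometry to a 2-D distance map can be sketched as a spherical (range-image) projection. This is a common LiDAR projection offered here only as a plausible sketch: the image resolution and vertical field-of-view values are illustrative (roughly those of a 64-beam sensor) and are not specified by the patent.

```python
import numpy as np

def points_to_range_image(points, h=64, w=1024,
                          fov_up=np.deg2rad(2.0),
                          fov_down=np.deg2rad(-24.8)):
    """Map Nx3 LiDAR points to an h x w distance (range) image.
    Each point's azimuth selects a column, its elevation a row, and
    the pixel stores the point's range (distance to the sensor)."""
    x, y, z = points[:, 0], points[:, 1], points[:, 2]
    r = np.linalg.norm(points, axis=1)
    yaw = np.arctan2(y, x)                                  # [-pi, pi]
    pitch = np.arcsin(np.clip(z / np.maximum(r, 1e-9), -1.0, 1.0))
    u = ((yaw + np.pi) / (2 * np.pi) * w).astype(int) % w   # column
    v = ((fov_up - pitch) / (fov_up - fov_down) * h).astype(int)
    v = np.clip(v, 0, h - 1)                                # row
    img = np.zeros((h, w), dtype=np.float32)
    img[v, u] = r                                           # store range
    return img
```

The inverse mapping (distance map back to points) follows by reversing the same angular grid, which is what the decoder's step M5 relies on.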
S2, acquiring the pose state according to the IMU information in the Kitti dataset, and calculating the transformation matrix M from the pose state. The transformation matrix is calculated from the translational acceleration a and the rotation rate ω in the IMU information, via a rotation matrix R3×3 and a translation matrix T3×1. The translation matrix is calculated as shown in Equation 1, where the translation parameters (Δx, Δy, Δz) are obtained from a by first-order Runge-Kutta numerical integration:

T3×1 = (Δx, Δy, Δz)^T    (Equation 1)

Similarly, the rotation matrix R3×3 is calculated as shown in Equation 2, where the rotation angles (Δφ, Δθ, Δψ) are obtained from ω by first-order Runge-Kutta numerical integration:

R3×3 = Rz(Δψ) Ry(Δθ) Rx(Δφ)    (Equation 2)

Finally, the transformation matrix M is obtained from the translation matrix T3×1 and the rotation matrix R3×3, as shown in Equation 3:

M = [ R3×3  T3×1 ; 0  1 ]    (Equation 3)
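Under the stated first-order Runge-Kutta (i.e., single Euler-step) integration, the construction of M might be sketched as follows. The initial velocity v0, the Z-Y-X Euler convention, and all function names are illustrative assumptions, not details fixed by the patent:

```python
import numpy as np

def pose_to_transform(accel, omega, v0, dt):
    """Build a 4x4 transform M between two frames from IMU readings:
    translational acceleration `accel`, rotation rates `omega`
    (roll, pitch, yaw rates), assumed initial velocity `v0`, and
    inter-frame interval `dt`."""
    # Translation from one Euler step of the acceleration.
    T = v0 * dt + 0.5 * accel * dt ** 2
    # Angles from one Euler step of the rotation rates.
    roll, pitch, yaw = omega * dt
    cr, sr = np.cos(roll), np.sin(roll)
    cp, sp = np.cos(pitch), np.sin(pitch)
    cy, sy = np.cos(yaw), np.sin(yaw)
    Rx = np.array([[1, 0, 0], [0, cr, -sr], [0, sr, cr]])
    Ry = np.array([[cp, 0, sp], [0, 1, 0], [-sp, 0, cp]])
    Rz = np.array([[cy, -sy, 0], [sy, cy, 0], [0, 0, 1]])
    R = Rz @ Ry @ Rx              # assumed Z-Y-X Euler composition
    M = np.eye(4)                 # M = [[R, T], [0, 1]]
    M[:3, :3] = R
    M[:3, 3] = T
    return M
```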
S3, operating on the reference point cloud I frame with the transformation matrix M to obtain a predicted distance map of the P frame point cloud to be encoded;
S4, taking the difference between the predicted distance map obtained in step S3 and the original distance map of the P frame in step S1 to obtain a prediction residual;
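Steps S3 and S4 together can be sketched as: move the I-frame points with M, re-project them to a distance map, and subtract. Here `project` stands for any 3-D-to-distance-map mapping (an assumed helper, e.g. a spherical projection); the function names are illustrative:

```python
import numpy as np

def predict_and_residual(i_points, M, p_range_image, project):
    """Predict the P-frame distance map by transforming the I-frame
    points with M (p' = M * i) and re-projecting them, then return the
    prediction and its residual against the original P-frame map."""
    ones = np.ones((i_points.shape[0], 1))
    pred_points = (np.hstack([i_points, ones]) @ M.T)[:, :3]  # step S3
    pred_image = project(pred_points)
    residual = p_range_image - pred_image                      # step S4
    return pred_image, residual
```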
S5, carrying out quantization processing and JPEG-LS lossless coding on the prediction residual obtained in step S4 to obtain residual coding information; the quantization processing and JPEG-LS lossless coding operations are described in detail in the following papers:
Quantization processing: Tu C., Takeuchi E., Carballo A., et al. Real-time streaming point cloud compression for 3D LiDAR sensor using U-NET. IEEE Access, 2019, 7: 113616-113625.
JPEG-LS: Papadonikolakis M. E., Kakarountas A. P., et al. Efficient high-performance implementation of JPEG-LS encoder. Journal of Real-Time Image Processing, 2008, 3(4): 303-310.
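A uniform scalar quantizer is one plausible reading of the quantization step in S5 (the cited papers define the exact scheme; the step size here is illustrative). The lossless coding stage itself is not reproduced, since JPEG-LS is specified by the external reference:

```python
import numpy as np

def quantize(residual, step=1.0):
    """Uniform scalar quantization of the prediction residual prior to
    lossless coding; smaller steps mean lower distortion, larger code."""
    return np.round(residual / step).astype(np.int32)

def dequantize(q, step=1.0):
    """Inverse quantization, as used on the decoder side (step M3)."""
    return q.astype(np.float32) * step
```

With a uniform quantizer, the reconstruction error per sample is bounded by half the step size, which is the usual trade-off controlled by the residual quantization parameter mentioned in the experiments.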
S6, the original distance map of the I frame in the step S1 is also encoded by JPEG-LS to obtain I frame encoding information;
S7, synthesizing the transformation matrix M in step S2, the residual coding information in step S5, and the I frame coding information in step S6 into an output code stream.
The decoding process comprises the following specific steps:
M1, after receiving the code stream, taking out the I frame coding information in the code stream and obtaining the I frame distance map through JPEG-LS decoding;
M2, taking out the transformation matrix M in the code stream and applying it to the I frame distance map obtained in step M1 to obtain a predicted distance map;
M3, taking out the residual coding information in the code stream and performing inverse quantization and JPEG-LS decoding on it to obtain the prediction residual;
M4, reconstructing the P frame distance map by combining the prediction residual obtained in step M3 with the predicted distance map obtained in step M2;
M5, converting the I frame distance map obtained in step M1 and the P frame distance map obtained in step M4 into point clouds, finally recovering the two-dimensional distance maps into the three-dimensional point cloud.
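Steps M2 to M4 can be sketched as a single reconstruction routine: the decoder re-runs the same prediction the encoder used, de-quantizes the residual, and adds the two. `predict` here is an assumed helper shared with the encoder, and the names are illustrative:

```python
import numpy as np

def decode_p_frame(i_range_image, M, q_residual, step, predict):
    """Reconstruct one P-frame distance map on the decoder side from
    the decoded I-frame map, the transmitted transform M, and the
    quantized residual."""
    pred_image = predict(i_range_image, M)            # step M2
    residual = q_residual.astype(np.float32) * step   # step M3 (inverse quant.)
    return pred_image + residual                      # step M4
```

Because prediction is driven only by data the decoder also has (the I frame and M), the encoder and decoder stay in sync without transmitting the predicted map itself.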
The formula for operating on the reference point cloud I frame with the transformation matrix M in the encoding process and the decoding process is as follows:

p' = M · i,  with  M = [ R3×3  T3×1 ; 0  1 ]

where i is any point of the I-frame point cloud (in homogeneous coordinates), p' is the corresponding point of the predicted point cloud, M is the transformation matrix, R3×3 is the rotation matrix, and T3×1 is the translation matrix.
In this embodiment, experimental tests were performed on four scene sequences of the Kitti dataset using the proposed method, and the experimental results are shown in Table 1. In the experiment, 100 frames of each scene sequence are selected for encoding and decoding, the data in the table are results averaged over one frame of point cloud, and the residual quantization parameter is set to its minimum value;
where CR represents the compression rate, Size_encoded represents the size of the encoded output code stream, Size_original represents the size of the original data, RMSE is the root mean square error, and MAX = 13000;
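The two reported metrics might be computed as below; the direction of the CR ratio (encoded size over original size) is an assumption, since the patent's formula is not reproduced here:

```python
import numpy as np

def compression_rate(size_encoded, size_original):
    """CR as the ratio of coded stream size to original data size
    (assumed direction; smaller is better under this convention)."""
    return size_encoded / size_original

def rmse(original, reconstructed):
    """Root mean square error between original and reconstructed data."""
    a = np.asarray(original, dtype=np.float64)
    b = np.asarray(reconstructed, dtype=np.float64)
    return float(np.sqrt(np.mean((a - b) ** 2)))
```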
Table 1 experimental test results
From the test results, the method provided by the invention has higher coding performance when coding and decoding the laser radar point cloud sequence.
Although the present invention has been described in terms of preferred embodiments, it is not limited to those embodiments. Any person skilled in the art may, without departing from the spirit and scope of the present invention, make variations and modifications to the technical solution of the invention using the methods and technical content disclosed above; therefore, any simple modification, equivalent variation, or refinement of the above embodiments according to the technical substance of the present invention falls within the protection scope of the technical solution of the present invention. The foregoing describes only preferred embodiments of the invention, and all changes and modifications that come within the meaning and range of equivalency of the claims are intended to be embraced therein.