
CN109101966B - Workpiece recognition, positioning and pose estimation system and method based on deep learning - Google Patents

Info

Publication number
CN109101966B
CN109101966B (application CN201810591858.8A)
Authority
CN
China
Prior art keywords
workpiece
positioning
training
loss function
network
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Active
Application number
CN201810591858.8A
Other languages
Chinese (zh)
Other versions
CN109101966A (en)
Inventor
卜伟
张波
徐显兵
彭成斌
肖江剑
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ningbo Institute of Material Technology and Engineering of CAS
Original Assignee
Ningbo Institute of Material Technology and Engineering of CAS
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Ningbo Institute of Material Technology and Engineering of CAS filed Critical Ningbo Institute of Material Technology and Engineering of CAS
Priority to CN201810591858.8A priority Critical patent/CN109101966B/en
Publication of CN109101966A publication Critical patent/CN109101966A/en
Application granted granted Critical
Publication of CN109101966B publication Critical patent/CN109101966B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F 18/00 Pattern recognition
    • G06F 18/20 Analysing
    • G06F 18/21 Design or setup of recognition systems or techniques; Extraction of features in feature space; Blind source separation
    • G06F 18/214 Generating training patterns; Bootstrap methods, e.g. bagging or boosting
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V 10/00 Arrangements for image or video recognition or understanding
    • G06V 10/20 Image preprocessing
    • G06V 10/24 Aligning, centring, orientation detection or correction of the image
    • G06V 10/242 Aligning, centring, orientation detection or correction of the image by image rotation, e.g. by 90 degrees

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • General Physics & Mathematics (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Evolutionary Biology (AREA)
  • Evolutionary Computation (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • General Engineering & Computer Science (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)
  • Image Processing (AREA)

Abstract

The invention provides a deep-learning-based workpiece recognition, positioning and pose estimation system and method. The system comprises a network construction module, a data acquisition module, a model training module and a workpiece recognition, positioning and pose estimation module connected in sequence. With this system, classification and position determination of different types of workpieces and spatial pose estimation of a single workpiece can be performed simultaneously, greatly improving the automation efficiency of a production line.

Description

Workpiece recognition, positioning and pose estimation system and method based on deep learning
Technical Field
The invention relates to a system and method for workpiece recognition, positioning and pose estimation, in particular to one based on deep learning, and belongs to the field of object recognition and detection.
Background
With the advancement of science and technology, more and more industrial robots are deployed in production to take over repetitive tasks from human workers. An industrial robot is a multi-joint manipulator or multi-degree-of-freedom machine designed for industrial applications; it executes work automatically, relying on its own power and control capability to realize various functions. Such robots accept human commands and run pre-programmed routines, and modern industrial robots can also act according to strategies formulated with artificial intelligence techniques.
To raise the degree of automation, an industrial robot must intelligently recognize, position and estimate the pose of workpieces during production, so that it can adapt its motion trajectory and grasping angle to workpieces in different poses and sort them accordingly.
In recent years, deep learning has achieved major breakthroughs across computer vision, and excellent algorithms for object detection and classification, such as GoogLeNet, VGG, Faster R-CNN and YOLO, have emerged in large numbers. Applying these powerful deep learning algorithms to workpiece detection, recognition and positioning can therefore effectively improve algorithm reliability and raise detection and positioning accuracy and dimensionality, thereby increasing the automation level of industrial robots and greatly enhancing actual production efficiency. However, the prior art still has shortcomings in workpiece detection: classification, position determination and spatial pose estimation of different workpieces on the same production line cannot all be delivered simultaneously with satisfactory results.
Disclosure of Invention
The main object of the invention is to provide a deep-learning-based workpiece recognition, positioning and pose estimation system and method that overcome the deficiencies of the prior art.
To achieve the foregoing object, an embodiment of the invention provides a deep-learning-based workpiece recognition, positioning and pose estimation system, which includes:
a network construction module, used at least for designing a workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network, the design including an output item added after the fully connected layer for obtaining angle information;
a data acquisition module, used at least for constructing a training set, the construction including collecting pictures of workpieces in different poses as training samples and annotating the training samples with angle information, classification information and position information;
a model training module, used at least for training the workpiece recognition, positioning and pose estimation network on the training set constructed by the data acquisition module, training ending and a workpiece recognition, positioning and pose estimation model being obtained when the loss value reaches a preset threshold; and
a workpiece recognition, positioning and pose estimation module, used at least for recognizing, positioning and estimating the pose of a workpiece in a real picture according to the workpiece recognition, positioning and pose estimation model.
Preferably, the model training module further includes a loss value calculation submodule, configured to calculate the loss value of the workpiece recognition, positioning and pose estimation network currently being trained, where the loss value is computed with a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error.
An embodiment of the invention also provides a deep-learning-based workpiece recognition, positioning and pose estimation method, which includes the following steps:
S1, designing a workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network, wherein an output item is added after the fully connected layer for obtaining angle information;
S2, collecting pictures of workpieces in different poses as training samples to construct a training set, wherein the training samples are annotated with angle information, classification information and position information;
S3, training the workpiece recognition, positioning and pose estimation network with the training set constructed in step S2; when the loss value reaches a preset threshold, training ends and a workpiece recognition, positioning and pose estimation model is obtained;
S4, invoking the workpiece recognition, positioning and pose estimation model to recognize, position and estimate the pose of a workpiece in a real picture.
Preferably, the angle information annotation includes:
selecting a certain workpiece pose as the reference and defining it as (0°, 0°, 0°) about the x, y and z axes, setting angle intervals for rotation about the x, y and z axes, and labeling each training sample picture with the midpoint of the interval in which its rotation angle about each axis falls.
Preferably, the classification information annotation and position information annotation include: labeling the classification information with numbers to distinguish categories, and obtaining the bounding box of the workpiece by computing its minimum enclosing rectangle.
Preferably, the loss value is computed with a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error.
Preferably, the training of the workpiece recognition, positioning and pose estimation network in step S3 specifically includes:
S31, training the YOLO deep learning network before its structure is modified, optimizing the variables with a gradient descent optimizer, training repeatedly until the loss value reaches a preset threshold, and obtaining the updated weights;
S32, loading the weights trained in step S31 into the modified workpiece recognition, positioning and pose estimation network, optimizing the variables related to angle prediction with a gradient descent optimizer, and training repeatedly until the loss value reaches a preset threshold.
Preferably, the rotation angles of the training sample pictures about the x-axis and y-axis lie in the range [-15°, 14°], and the rotation angle about the z-axis lies in the range [0°, 90°].
Preferably, the angle interval is set to 5° for rotation of the training sample pictures about the x and y axes, and to 10° for rotation about the z-axis.
Preferably, the loss function includes an angle error loss function, a coordinate error loss function, an IoU error loss function and a classification error loss function;
the angle error loss function formula is:
Figure GDA0003459100520000031
wherein the input image is divided into S x S grids, Ax, Ay, Az are respectively the angular values of rotation around x, y, z axes predicted by the grids,
Figure GDA0003459100520000032
are respectively the corresponding labeled values, and are,
Figure GDA0003459100520000033
indicating that the object center falls within grid i;
the coordinate error loss function is:

$$L_c=\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[(x_i-\hat x_i)^2+(y_i-\hat y_i)^2\right]+\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[\left(\sqrt{w_i}-\sqrt{\hat w_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat h_i}\right)^2\right];$$
the IoU error loss function is:

$$L_{\text{IoU}}=\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left(C_i-\hat C_i\right)^2+\lambda_{\text{noobj}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{noobj}}\left(C_i-\hat C_i\right)^2;$$
the classification error loss function is:

$$L_{\text{cls}}=\sum_{i=0}^{S^2}\mathbb{1}_i^{\text{obj}}\sum_{c\in\text{classes}}\left(p_i(c)-\hat p_i(c)\right)^2$$

where $B$ is the number of boxes to be predicted per grid cell, $x$, $y$, $w$, $h$, $C$ and $p$ are network predictions, $\hat x$, $\hat y$, $\hat w$, $\hat h$, $\hat C$ and $\hat p$ are the annotated values, $\mathbb{1}_i^{\text{obj}}$ indicates that an object center falls within grid cell $i$, and $\mathbb{1}_{ij}^{\text{obj}}$ and $\mathbb{1}_{ij}^{\text{noobj}}$ indicate whether or not an object center falls within the $j$th prediction box of grid cell $i$;
the overall loss function is $L = L_a + L_c + L_{\text{IoU}} + L_{\text{cls}}$.
Further, the deep-learning-based workpiece recognition, positioning and pose estimation method is implemented on the basis of the deep-learning-based workpiece recognition, positioning and pose estimation system.
Compared with the prior art, the invention has the advantage that, with the provided technical solution, classification and position determination of different types of workpieces and spatial pose estimation of a single workpiece can be performed simultaneously, greatly improving the automation efficiency of a production line.
Drawings
FIG. 1 is a flow chart of the deep-learning-based workpiece recognition, positioning and pose estimation method in an exemplary embodiment of the invention;
FIG. 2 is a schematic diagram of the workpiece recognition, positioning and pose estimation network, improved from the YOLO deep learning network, in an exemplary embodiment of the invention.
Detailed Description
In view of the deficiencies of the prior art, the inventors have, through extensive study and practice, arrived at the technical solution of the present invention. The technical solution, its implementation and its principles are further explained below.
An embodiment of the invention provides a deep-learning-based workpiece recognition, positioning and pose estimation system, which comprises:
a network construction module, used at least for designing a workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network, the design including an output item added after the fully connected layer for obtaining angle information;
a data acquisition module, used at least for constructing a training set, the construction including collecting pictures of workpieces in different poses as training samples and annotating the training samples with angle information, classification information and position information;
a model training module, used at least for training the workpiece recognition, positioning and pose estimation network on the training set constructed by the data acquisition module, training ending and a workpiece recognition, positioning and pose estimation model being obtained when the loss value reaches a preset threshold; and
a workpiece recognition, positioning and pose estimation module, used at least for recognizing, positioning and estimating the pose of a workpiece in a real picture according to the workpiece recognition, positioning and pose estimation model.
Furthermore, the network construction module, the data acquisition module, the model training module and the workpiece recognition, positioning and pose estimation module are connected in sequence to form the deep-learning-based workpiece recognition, positioning and pose estimation system.
Furthermore, the model training module further comprises a loss value calculation submodule, used for calculating the loss value of the workpiece recognition, positioning and pose estimation network currently being trained, where the loss value is computed with a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error.
Referring to FIG. 1, an embodiment of the invention further provides a deep-learning-based workpiece recognition, positioning and pose estimation method, which includes the following steps:
Step 101, designing the workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network;
the YOLO deep learning network is improved so that it additionally outputs angle information.
Step 102, acquiring and annotating training sample pictures of workpieces in different poses;
pictures of workpieces in different poses are collected and annotated with angle information, classification information and position information.
Step 103, training the workpiece recognition, positioning and pose estimation model with the training set constructed in step 102;
a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error is used during training.
Step 104, invoking the workpiece recognition, positioning and pose estimation model to recognize, position and estimate the pose of a workpiece in a real picture.
To make the objects, technical solutions and advantages of the invention clearer, the invention is described in further detail below with reference to the accompanying drawings and specific embodiments. It should be understood that the specific embodiments described here are merely illustrative and do not limit the invention.
In some more specific embodiments, a workpiece recognition, positioning and pose estimation method may comprise the following steps.
First, designing the workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network
Referring to FIG. 2, which shows the workpiece recognition, positioning and pose estimation network improved from the YOLO deep learning network in an exemplary embodiment of the invention: the original YOLO network already yields the classification and position information of a workpiece. In this embodiment, the pose of the workpiece, i.e., its angle information, must be obtained in addition to its type and position, so the original network is extended to output angle values; the modified structure is shown in FIG. 2. As can be seen from FIG. 2, the improved YOLO network essentially retains the original structure for computing classification and location information; the improvement consists of adding an output item after the fully connected layer, i.e., attaching a further fully connected layer to the layer that outputs the 4096-dimensional vector, in order to obtain the angle information. Its output size is 7 × 7 × 3, where 3 corresponds to the three output angles, i.e., anglex, angley and anglez.
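As a concrete illustration of this modification (a minimal PyTorch sketch, not code from the patent; the module and variable names are hypothetical, while the 4096-dimensional feature, the 7 × 7 grid and the three angles follow the description above):

```python
import torch
import torch.nn as nn

class AngleHead(nn.Module):
    """Extra fully connected branch attached after YOLO's 4096-d FC layer.

    Predicts three rotation angles (about x, y, z) for every cell of the
    S x S grid, alongside YOLO's original classification/box outputs."""

    def __init__(self, feat_dim: int = 4096, grid: int = 7, n_angles: int = 3):
        super().__init__()
        self.grid = grid
        self.n_angles = n_angles
        self.fc = nn.Linear(feat_dim, grid * grid * n_angles)

    def forward(self, feats: torch.Tensor) -> torch.Tensor:
        # feats: (batch, 4096) activations of the existing fully connected layer
        out = self.fc(feats)
        # reshape to (batch, S, S, 3): one (anglex, angley, anglez) triple per cell
        return out.view(-1, self.grid, self.grid, self.n_angles)
```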
Second, acquiring and annotating training images of workpieces in different poses
Suppose the workpieces tested here are of three types, differing in shape and size and exhibiting no high degree of symmetry. For workpieces produced on a production line, several types are not mixed together in a way that would increase sorting difficulty, so in accordance with the actual application scenario this embodiment considers only one workpiece type at a time. Taking the first workpiece as an example: to obtain its pose information, pictures of each of its poses must be collected when the training set is made, and a certain pose is chosen as the reference and defined as (0°, 0°, 0°) about the x, y and z axes. Preferably, pictures of the three-dimensional CAD model rotated about the x, y and z axes can be rendered with OpenGL.
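The patent names only OpenGL for this rendering step. As one hedged way to realize it in Python, the CAD mesh could be rotated and rendered offscreen with trimesh and pyrender (both libraries, the camera placement and the 448 × 448 output size are assumptions of this sketch, not prescriptions of the patent):

```python
import numpy as np
import trimesh
import pyrender
from scipy.spatial.transform import Rotation

def render_pose(mesh_path: str, ax: float, ay: float, az: float,
                size: int = 448) -> np.ndarray:
    """Render one training picture of the CAD model rotated by (ax, ay, az) degrees."""
    mesh = trimesh.load(mesh_path)
    pose = np.eye(4)
    pose[:3, :3] = Rotation.from_euler("xyz", [ax, ay, az], degrees=True).as_matrix()

    scene = pyrender.Scene()
    scene.add(pyrender.Mesh.from_trimesh(mesh), pose=pose)
    cam_pose = np.eye(4)
    cam_pose[2, 3] = 0.5  # back the camera off along +z; tune to the model's size
    scene.add(pyrender.PerspectiveCamera(yfov=np.pi / 3.0), pose=cam_pose)
    scene.add(pyrender.DirectionalLight(intensity=3.0), pose=cam_pose)

    color, _ = pyrender.OffscreenRenderer(size, size).render(scene)
    return color  # (size, size, 3) uint8 image for the training set
```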
In an exemplary embodiment of the invention, a subset of the poses is used for training and testing, standing in for the remaining poses. The rotation angles about the x-axis and y-axis each lie in the range [-15°, 14°], and the rotation angle about the z-axis lies in the range [0°, 90°]. For rotation about the x and y axes, the angle interval is set to 5°, and each picture is labeled with the midpoint of the interval in which its rotation angle falls; because rotation about the z-axis amounts to an in-plane rotation that does not change the workpiece's appearance much, its angle interval is set to 10°, and the picture is likewise labeled with the midpoint of the interval containing its z-rotation angle. For example, a workpiece rotated between 6° and 10° about the x and y axes and between 11° and 20° about the z-axis is uniformly labeled (8, 8, 15); all workpieces within this pose range are regarded as having the same pose, with rotation angles (8, 8, 15).
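This midpoint labeling can be sketched as follows (the interval boundaries are inferred from the worked example above and are an assumption; the patent's experiments may bin differently):

```python
import math

def quantize_angle(angle: float, step: int) -> int:
    """Label an angle with the midpoint of the step-degree interval containing it,
    matching the example in the text: 6..10 deg with step=5 -> 8,
    11..20 deg with step=10 -> 15 (boundaries assumed, see lead-in)."""
    bin_index = math.ceil(angle / step)   # e.g. angles 6..10 -> bin 2 for step=5
    return step * bin_index - step // 2   # 5*2 - 2 = 8; 10*2 - 5 = 15

assert quantize_angle(7, 5) == 8 and quantize_angle(13, 10) == 15
```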
In a typical embodiment of the invention, 250 training pictures are acquired per pose. After the training-set pictures have been collected, they must be annotated: the classification information and the target box to be trained are extracted for each picture. The classification information is labeled 1, denoting the first class; the bounding box of the workpiece is obtained by computing its minimum enclosing rectangle, and the four values Xmin, Xmax, Ymin and Ymax are written into an annotation file in a uniform format.
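A hedged sketch of this annotation step, assuming OpenCV for the enclosing rectangle and a simple whitespace-separated annotation line (the patent does not specify the file format):

```python
import cv2
import numpy as np

def annotation_line(image: np.ndarray, class_id: int,
                    angles: tuple[int, int, int]) -> str:
    """One annotation record: class id, bounding box (Xmin Xmax Ymin Ymax)
    from the largest foreground contour, and the quantized rotation angles."""
    gray = cv2.cvtColor(image, cv2.COLOR_BGR2GRAY)
    _, mask = cv2.threshold(gray, 0, 255, cv2.THRESH_BINARY + cv2.THRESH_OTSU)
    contours, _ = cv2.findContours(mask, cv2.RETR_EXTERNAL, cv2.CHAIN_APPROX_SIMPLE)
    x, y, w, h = cv2.boundingRect(max(contours, key=cv2.contourArea))
    ax, ay, az = angles
    return f"{class_id} {x} {x + w} {y} {y + h} {ax} {ay} {az}"
```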
Third, designing a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error
Because an angle regression branch has been introduced, an angle error loss function must be added to the original loss function; its formula is:
$$L_a=\sum_{i=0}^{S^2}\mathbb{1}_i^{\text{obj}}\left[(A_x-\hat A_x)^2+(A_y-\hat A_y)^2+(A_z-\hat A_z)^2\right]$$

where the input image is divided into $S\times S$ grid cells, $A_x$, $A_y$ and $A_z$ are the rotation angles about the x, y and z axes predicted by a grid cell, $\hat A_x$, $\hat A_y$ and $\hat A_z$ are the corresponding annotated values, and $\mathbb{1}_i^{\text{obj}}$ indicates that an object center falls within grid cell $i$.
In addition to the angle error loss function, the overall loss includes the coordinate error loss function, the IoU error loss function and the classification error loss function, with the following formulas:
coordinate error loss function:

$$L_c=\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[(x_i-\hat x_i)^2+(y_i-\hat y_i)^2\right]+\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[\left(\sqrt{w_i}-\sqrt{\hat w_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat h_i}\right)^2\right]$$
IoU error loss function:

$$L_{\text{IoU}}=\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left(C_i-\hat C_i\right)^2+\lambda_{\text{noobj}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{noobj}}\left(C_i-\hat C_i\right)^2$$
classification error loss function:

$$L_{\text{cls}}=\sum_{i=0}^{S^2}\mathbb{1}_i^{\text{obj}}\sum_{c\in\text{classes}}\left(p_i(c)-\hat p_i(c)\right)^2$$

where $B$ is the number of boxes to be predicted per grid cell, $x$, $y$, $w$, $h$, $C$ and $p$ are network predictions, $\hat x$, $\hat y$, $\hat w$, $\hat h$, $\hat C$ and $\hat p$ are the annotated values, $\mathbb{1}_i^{\text{obj}}$ indicates that an object center falls within grid cell $i$, $\mathbb{1}_{ij}^{\text{obj}}$ and $\mathbb{1}_{ij}^{\text{noobj}}$ indicate whether or not an object center falls within the $j$th prediction box of grid cell $i$, and $\lambda_{\text{coord}}$ and $\lambda_{\text{noobj}}$ are the weighting coefficients carried over from the standard YOLO loss.
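In the confidence term, the target $\hat C$ is conventionally the IoU between a predicted box and the ground-truth box. A minimal sketch of that computation (the standard formulation, not quoted from the patent):

```python
def iou(box_a, box_b):
    """Intersection over union of two boxes given as (xmin, ymin, xmax, ymax)."""
    ix = max(0.0, min(box_a[2], box_b[2]) - max(box_a[0], box_b[0]))
    iy = max(0.0, min(box_a[3], box_b[3]) - max(box_a[1], box_b[1]))
    inter = ix * iy
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    return inter / (area_a + area_b - inter + 1e-9)  # guard against empty boxes
```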
For the whole workpiece recognition, positioning and pose estimation network, the total loss is:

$$L = L_a + L_c + L_{\text{IoU}} + L_{\text{cls}}$$
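A hedged PyTorch sketch of the fused total loss (the formulas above follow the standard YOLOv1 loss plus the angle term; the tensor layout, the mask names and the λ values are assumptions of this sketch):

```python
import torch

def fused_loss(pred: dict, target: dict,
               lambda_coord: float = 5.0, lambda_noobj: float = 0.5) -> torch.Tensor:
    """pred/target hold tensors over an S x S grid with B boxes per cell:
    'xywh' (N,S,S,B,4), 'conf' (N,S,S,B), 'cls' (N,S,S,K), 'angles' (N,S,S,3);
    target additionally holds masks 'obj_ij' (N,S,S,B) and 'obj_i' (N,S,S)."""
    obj_ij, obj_i = target["obj_ij"], target["obj_i"]

    # La: angle regression error for cells containing an object center
    l_angle = (obj_i * ((pred["angles"] - target["angles"]) ** 2).sum(-1)).sum()

    # Lc: x, y plus square-rooted w, h errors for responsible boxes
    xy_err = ((pred["xywh"][..., :2] - target["xywh"][..., :2]) ** 2).sum(-1)
    wh_err = ((pred["xywh"][..., 2:].clamp(min=0).sqrt()
               - target["xywh"][..., 2:].sqrt()) ** 2).sum(-1)
    l_coord = lambda_coord * (obj_ij * (xy_err + wh_err)).sum()

    # LIoU: confidence error, down-weighted where no object is present
    conf_err = (pred["conf"] - target["conf"]) ** 2
    l_iou = (obj_ij * conf_err).sum() + lambda_noobj * ((1 - obj_ij) * conf_err).sum()

    # Lcls: per-cell classification error
    l_cls = (obj_i * ((pred["cls"] - target["cls"]) ** 2).sum(-1)).sum()

    return l_angle + l_coord + l_iou + l_cls
```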
fourthly, training a workpiece recognition positioning and posture estimation model by using the acquired training images
Because an angle regression layer has been added to the network, training and optimizing all of its variables at once makes the loss function hard to converge, so a two-step training scheme can be adopted. First, the YOLO network is trained before its structure is modified, with an initial learning rate of 0.01, a batch size of 30 and a period of 11 epochs; the variables are optimized with a gradient descent optimizer, and after repeated training a reasonably accurate test result, without angle prediction, is finally obtained.
After this first-step training and testing is complete, the trained weights are loaded into the modified workpiece recognition, positioning and pose estimation network and training is resumed, still with a gradient descent optimizer, but now optimizing only the newly added variables related to angle prediction. The batch size remains 30, the learning rate starts at 0.01 and decays gradually to 0.0001, the period is again 11, and training is repeated.
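A hedged PyTorch sketch of this two-stage scheme (the patent fixes only "gradient descent", the learning rates, the batch size and the period of 11; the SGD call, the exponential decay and the parameter-group names are assumptions):

```python
import torch

def train_stage(model, loader, params, loss_fn, lr0=0.01, epochs=11, decay=False):
    """One stage of the two-step scheme: plain gradient descent over `params` only."""
    opt = torch.optim.SGD(params, lr=lr0)
    # gamma ~ (1e-4 / 1e-2) ** (1 / 11): decays 0.01 to ~0.0001 over 11 epochs
    sched = torch.optim.lr_scheduler.ExponentialLR(opt, gamma=0.66) if decay else None
    for _ in range(epochs):
        for images, targets in loader:  # batch size 30 per the text
            opt.zero_grad()
            loss_fn(model(images), targets).backward()
            opt.step()
        if sched is not None:
            sched.step()

# stage 1: optimize the original YOLO variables only (accessor names hypothetical)
# train_stage(model, loader, yolo_base_params, yolo_loss)
# stage 2: reuse the stage-1 weights, then optimize only the angle-head variables
# train_stage(model, loader, angle_head_params, fused_loss, decay=True)
```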
Fifth, invoking the model to recognize, position and estimate the pose of workpieces
For the test set, real workpiece images captured by a camera are used. Their size differs from that of the training images, and surface conditions such as illumination and rust also differ, but the pose range is still -15° to 14° of rotation about the x and y axes and 0° to 90° about the z-axis. With 1600 images per workpiece type, there are 4800 test images in total. The training and test results for each workpiece are compared in Table 1. The statistics in Table 1 show that the classification results are excellent in both training and testing, and the bounding-box errors in the x and y directions are very low, about 1 mm on the test set. The mean rotation-angle errors about the x, y and z axes on the training set plateau at 4.038°, 4.334° and 8.464°, respectively, at which point the loss value no longer decreases. This is because the training set labels the x- and y-axis rotation angles with interval midpoints at a 5° spacing and the z-axis at a 10° spacing, so the corresponding labeling error bounds are 5°, 5° and 10°. The test-set errors may be further increased by surface rust, illumination and workpiece size, which make the test images differ from the training images, but they remain within an acceptable range and have little influence on the test results.
Table 1: training and test result statistics for the three workpiece test samples (reproduced as an image in the original publication).
By adopting the deep-learning-based workpiece recognition, positioning and pose estimation system provided by the invention, classification and position determination of different types of workpieces on the same production line and spatial pose estimation of a single workpiece can be delivered simultaneously. This facilitates automated operations in which an industrial robot adapts its motion trajectory and grasping angle to different workpieces for tasks such as sorting, and can greatly improve the production efficiency of a production line.
It should be understood that the above embodiments merely illustrate the technical concept and features of the invention and are intended to enable those skilled in the art to understand and implement it; they do not thereby limit the scope of protection of the invention. All equivalent changes and modifications made according to the spirit of the invention shall fall within its scope of protection.

Claims (1)

1. A deep-learning-based workpiece recognition, positioning and pose estimation method, characterized by comprising:

S1. designing a workpiece recognition, positioning and pose estimation network based on the YOLO deep learning network, including adding an output item after the fully connected layer for obtaining angle information;

S2. collecting pictures of workpieces in different poses as training samples to construct a training set, including annotating the training samples with angle information, classification information and position information,

the angle information annotation comprising: selecting a certain workpiece pose as the reference, defining it as (0°, 0°, 0°) about the x, y and z axes, setting angle intervals for rotation about the x, y and z axes, and labeling each training sample picture with the midpoint of the interval in which its rotation angle about each axis falls, wherein the rotation angles of the training sample pictures about the x-axis and y-axis lie in the range [-15°, 14°], the rotation angle about the z-axis lies in the range [0°, 90°], the angle interval is set to 5° for rotation about the x and y axes, and the angle interval is set to 10° for rotation about the z-axis;

the classification information annotation and position information annotation comprising: labeling the classification information with numbers to distinguish categories, and obtaining the bounding box of the workpiece by computing its minimum enclosing rectangle; the loss value being computed with a loss function that simultaneously fuses the workpiece classification error, the workpiece position coordinate error and the workpiece pose error, the loss function comprising an angle error loss function, a coordinate error loss function, an IoU error loss function and a classification error loss function;

the angle error loss function being:

$$L_a=\sum_{i=0}^{S^2}\mathbb{1}_i^{\text{obj}}\left[(A_x-\hat A_x)^2+(A_y-\hat A_y)^2+(A_z-\hat A_z)^2\right]$$

where the input image is divided into $S\times S$ grid cells, $A_x$, $A_y$ and $A_z$ are the rotation angles about the x, y and z axes predicted by the network, $\hat A_x$, $\hat A_y$ and $\hat A_z$ are the corresponding annotated values, and $\mathbb{1}_i^{\text{obj}}$ indicates that an object center falls within grid cell $i$;

the coordinate error loss function being:

$$L_c=\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[(x_i-\hat x_i)^2+(y_i-\hat y_i)^2\right]+\lambda_{\text{coord}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left[\left(\sqrt{w_i}-\sqrt{\hat w_i}\right)^2+\left(\sqrt{h_i}-\sqrt{\hat h_i}\right)^2\right];$$

the IoU error loss function being:

$$L_{\text{IoU}}=\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{obj}}\left(C_i-\hat C_i\right)^2+\lambda_{\text{noobj}}\sum_{i=0}^{S^2}\sum_{j=0}^{B}\mathbb{1}_{ij}^{\text{noobj}}\left(C_i-\hat C_i\right)^2;$$

the classification error loss function being:

$$L_{\text{cls}}=\sum_{i=0}^{S^2}\mathbb{1}_i^{\text{obj}}\sum_{c\in\text{classes}}\left(p_i(c)-\hat p_i(c)\right)^2$$

where $B$ is the number of prediction boxes to be predicted per grid cell, $x$, $y$, $w$, $h$, $C$ and $p$ are network predictions, $\hat x$, $\hat y$, $\hat w$, $\hat h$, $\hat C$ and $\hat p$ are the annotated values, $\mathbb{1}_i^{\text{obj}}$ indicates that an object center falls within grid cell $i$, and $\mathbb{1}_{ij}^{\text{obj}}$ and $\mathbb{1}_{ij}^{\text{noobj}}$ indicate whether or not an object center falls within the $j$th prediction box of grid cell $i$;

the loss function being $L=L_a+L_c+L_{\text{IoU}}+L_{\text{cls}}$;

S3. training the workpiece recognition, positioning and pose estimation network with the training set constructed in step S2, wherein training ends and a workpiece recognition, positioning and pose estimation model is obtained when the loss value reaches a preset threshold, the training specifically comprising:

S31. training the YOLO deep learning network before its structure is modified, optimizing the variables with a gradient descent optimizer, training repeatedly until the loss value reaches a preset threshold, and obtaining the updated weights;

S32. loading the weights trained in step S31 into the modified workpiece recognition, positioning and pose estimation network, optimizing the variables related to angle prediction with a gradient descent optimizer, and training repeatedly until the loss value reaches a preset threshold;

S4. invoking the workpiece recognition, positioning and pose estimation model to recognize, position and estimate the pose of a workpiece in a real picture.
CN201810591858.8A 2018-06-08 2018-06-08 Workpiece recognition, positioning and pose estimation system and method based on deep learning Active CN109101966B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN201810591858.8A CN109101966B (en) 2018-06-08 2018-06-08 Workpiece recognition, positioning and pose estimation system and method based on deep learning

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN201810591858.8A CN109101966B (en) 2018-06-08 2018-06-08 Workpiece recognition, positioning and pose estimation system and method based on deep learning

Publications (2)

Publication Number Publication Date
CN109101966A (en) 2018-12-28
CN109101966B (en) 2022-03-08

Family

ID=64796782

Family Applications (1)

Application Number Title Priority Date Filing Date
CN201810591858.8A Active CN109101966B (en) 2018-06-08 2018-06-08 Workpiece recognition, positioning and pose estimation system and method based on deep learning

Country Status (1)

Country Link
CN (1) CN109101966B (en)

Families Citing this family (17)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN109858530B (en) * 2019-01-14 2022-06-28 苏州长风航空电子有限公司 Composite pyramid-based rotating target detection method
CN109902629A (en) * 2019-03-01 2019-06-18 成都康乔电子有限责任公司 A Real-time Vehicle Object Detection Model in Complex Traffic Scenarios
CN109948514A (en) * 2019-03-15 2019-06-28 中国科学院宁波材料技术与工程研究所 Fast workpiece identification and localization method based on single-target 3D reconstruction
CN110223352B (en) * 2019-06-14 2021-07-02 浙江明峰智能医疗科技有限公司 Medical image scanning automatic positioning method based on deep learning
CN110338835B (en) * 2019-07-02 2023-04-18 深圳安科高技术股份有限公司 Intelligent scanning three-dimensional monitoring method and system
CN110826499A (en) * 2019-11-08 2020-02-21 上海眼控科技股份有限公司 Object space parameter detection method and device, electronic equipment and storage medium
CN110948489B (en) * 2019-12-04 2022-11-04 国电南瑞科技股份有限公司 Method and system for limiting safe working space of live working robot
CN111784767B (en) * 2020-06-08 2024-06-18 珠海格力电器股份有限公司 Method and device for determining target position
CN111667510A (en) * 2020-06-17 2020-09-15 常州市中环互联网信息技术有限公司 Rock climbing action evaluation system based on deep learning and attitude estimation
CN114385322A (en) * 2020-10-21 2022-04-22 沈阳中科数控技术股份有限公司 Edge collaborative data distribution method applied to industrial Internet of things
CN112800856A (en) * 2021-01-06 2021-05-14 南京通盛弘数据有限公司 Livestock position and posture recognition method and device based on YOLOv3
CN113111712A (en) * 2021-03-11 2021-07-13 稳健医疗用品股份有限公司 AI identification positioning method, system and device for bagged product
CN113102882B (en) * 2021-06-16 2021-08-24 杭州景业智能科技股份有限公司 Geometric error compensation model training method and geometric error compensation method
CN113723217B (en) * 2021-08-09 2025-01-14 南京邮电大学 An improved object intelligent detection method and system based on Yolo
CN114708484B (en) * 2022-03-14 2023-04-07 中铁电气化局集团有限公司 Pattern analysis method suitable for identifying defects
CN116468998A (en) * 2022-09-09 2023-07-21 国网湖北省电力有限公司超高压公司 Visual characteristic-based power transmission line small part and hanging point part detection method
CN117368000B (en) * 2023-10-13 2024-05-07 昆山美仑工业样机有限公司 Static torsion test stand provided with self-adaptive clamping mechanism

Patent Citations (4)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
US5459636A (en) * 1994-01-14 1995-10-17 Hughes Aircraft Company Position and orientation estimation neural network system and method
CN106683091A (en) * 2017-01-06 2017-05-17 北京理工大学 Target classification and attitude detection method based on depth convolution neural network
CN107451568A (en) * 2017-08-03 2017-12-08 重庆邮电大学 Use the attitude detecting method and equipment of depth convolutional neural networks
CN108121986A (en) * 2017-12-29 2018-06-05 深圳云天励飞技术有限公司 Object detection method and device, computer installation and computer readable storage medium

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Joseph Redmon et al., "You Only Look Once: Unified, Real-Time Object Detection," 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016; cited passages: p. 781, col. 2, last paragraph and p. 786, col. 1, last paragraph. *

Also Published As

Publication number Publication date
CN109101966A (en) 2018-12-28

Similar Documents

Publication Publication Date Title
CN109101966B (en) Workpiece recognition, positioning and pose estimation system and method based on deep learning
CN112297013B (en) A robot intelligent grasping method based on digital twin and deep neural network
RU2700246C1 (en) Method and system for capturing an object using a robot device
CN109807882A (en) Holding system, learning device and holding method
Zhou et al. Imitating tool-based garment folding from a single visual observation using hand-object graph dynamics
CN115330734A (en) Automatic robot repair welding system based on three-dimensional target detection and point cloud defect completion
CN110065068A (en) A kind of robotic asssembly operation programming by demonstration method and device based on reverse-engineering
CN109948514A (en) Fast workpiece identification and localization method based on single-target 3D reconstruction
CN117769724A (en) Synthetic dataset creation using deep-learned object detection and classification
Yang et al. Automation of SME production with a Cobot system powered by learning-based vision
Deng et al. A human–robot collaboration method using a pose estimation network for robot learning of assembly manipulation trajectories from demonstration videos
Zhang et al. Deep learning-based robot vision: High-end tools for smart manufacturing
Zhang et al. Vision-based six-dimensional peg-in-hole for practical connector insertion
RU2745380C1 (en) Method and system for capturing objects using robotic device
Frank et al. Stereo-vision for autonomous industrial inspection robots
Wu et al. A novel approach for porcupine crab identification and processing based on point cloud segmentation
CN118322214A (en) Mechanical arm imitation learning method and device based on single teaching
CN118097790A (en) Manual operation method for AI training of robot
CN116580084B (en) Industrial part rapid pose estimation method based on deep learning and point cloud
CN116690988A (en) 3D printing system and method for large building model
Hosseini et al. Multi-modal robust geometry primitive shape scene abstraction for grasp detection
CN115270399A (en) An industrial robot attitude recognition method, device and storage medium
CN111002292B (en) Robot arm humanoid motion teaching method based on similarity measurement
Naik et al. Robotic task success evaluation under multi-modal non-parametric object pose uncertainty
Chatterjee et al. Utilizing Inpainting for Training Keypoint Detection Algorithms Towards Markerless Visual Servoing

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant