CN115222652A

CN115222652A - Method for identifying, counting and centering end faces of bundled steel bars and memory thereof

Info

Publication number: CN115222652A
Application number: CN202210478695.9A
Authority: CN
Inventors: 黄思博; 邱嘉伟; 黄剑锋; 崔晗; 魏晓慧; 蔡昭权; 罗中良
Original assignee: Huizhou University
Current assignee: Huizhou University
Priority date: 2022-05-05
Filing date: 2022-05-05
Publication date: 2022-10-21

Abstract

The invention relates to the technical field of machine vision, in particular to a method for identifying, counting and centering end faces of bundled reinforcing steel bars and a memory thereof, wherein the method comprises the steps of S1, shooting images of the end faces of the reinforcing steel bars, and obtaining images to be identified after processing; s2, performing data enhancement operation on the image to be recognized by adopting a first preset algorithm; s3, forming final detection frames in the image to be recognized by adopting a second preset algorithm with a lightweight convolutional neural network, and calculating the number of the final detection frames; and S4, generating a counting result. The invention solves the problems that the counting result is inaccurate and the practical requirement cannot be met because the conventional reinforcing steel bar end face identification technology is generally carried out by adopting a common machine vision algorithm.

Description

A method for identifying, counting and centering the end face of bundled steel bars and its memory

技术领域technical field

本发明涉及机器视觉技术领域，特别涉及一种成捆钢筋端面的识别计数及中心定位方法与其存储器。The invention relates to the technical field of machine vision, in particular to a method for identifying, counting and centering the end faces of bundled steel bars and a memory thereof.

背景技术Background technique

机器视觉，是用机器人代替人眼做测量和判断。机器视觉系统是通过及其视觉产品将被摄取目标转换成图像信号，传送给专用的图像处理系统，得到被摄目标的形态信息，根据像素分布和亮度、颜色等信息，转变成数字化信号；图像系统对这些信号进行各种运算来抽取目标的特征，进而根据判别的结果来控制现场的设备动作。Machine vision is the use of robots to replace human eyes for measurement and judgment. The machine vision system converts the captured target into an image signal through its visual products, and transmits it to a dedicated image processing system to obtain the morphological information of the captured target, and converts it into a digital signal according to pixel distribution, brightness, color and other information; image The system performs various operations on these signals to extract the characteristics of the target, and then controls the on-site equipment actions according to the results of the discrimination.

YOLOv3算法，可用于解决“如何检测两个距离很近的同类的物体，或者是距离很近的不同类的物体”，其对距离很近的物体或者小物体有很好的鲁棒性。The YOLOv3 algorithm can be used to solve "how to detect two objects of the same type that are close to each other, or objects of different types that are close to each other", which has good robustness to objects that are close to each other or small objects.

现有的钢筋端面识别技术通常采用普通机器视觉算法进行，但由于该算法无法精准地得出结果，因此导致钢筋端面计数领域应用机器视觉不能得出准确的结果，无法满足实用性需要，因此一种成捆钢筋端面的识别计数及中心定位方法与其存储器应运而生。The existing reinforcement end face recognition technology is usually carried out by ordinary machine vision algorithms, but because the algorithm cannot accurately obtain the results, the application of machine vision in the field of reinforcement end face counting cannot obtain accurate results and cannot meet the practical needs. A method of identifying, counting and centering the end face of bundled steel bars and its storage came into being.

发明内容SUMMARY OF THE INVENTION

本发明的发明内容在于提供一种成捆钢筋端面的识别计数及中心定位方法与其存储器，主要解决了现有的钢筋端面识别技术通常采用普通机器视觉算法进行，导致的计数结果不准确，无法满足实用性需要的问题。The content of the present invention is to provide a method for identifying, counting and centering the end faces of bundled steel bars and a memory thereof, which mainly solves the problem that the existing technology for identifying the end faces of steel bars is usually carried out by using ordinary machine vision algorithms, resulting in inaccurate counting results, which cannot meet the requirements of A question of practicality.

本发明提出了一种成捆钢筋端面的识别计数及中心定位方法，包括以下步骤：The invention provides a method for identifying, counting and centering the end faces of bundled steel bars, comprising the following steps:

S1，拍摄钢筋端面的图像，处理后获得待识别图像；S1, take an image of the end face of the steel bar, and obtain an image to be recognized after processing;

S2，采用第一预设算法对所述待识别图像进行数据增强操作；S2, using a first preset algorithm to perform a data enhancement operation on the to-be-recognized image;

S3，采用具有轻量级卷积神经网络的第二预设算法在所述待识别图像中形成最终检测框，并计算所述最终检测框的数量；S3, using a second preset algorithm with a lightweight convolutional neural network to form a final detection frame in the to-be-recognized image, and calculate the number of the final detection frame;

S4，生成计数结果。S4, generating a counting result.

优选地，所述步骤S3具体包括：Preferably, the step S3 specifically includes:

S31，预先形成包括具有轻量级卷积神经网络的第二预设算法；S31, pre-forming a second preset algorithm including a lightweight convolutional neural network;

S32，采用所述第二预设算法在所述待识别图像中形成最终检测框，并计算所述最终检测框的数量。S32, using the second preset algorithm to form a final detection frame in the to-be-recognized image, and calculate the number of the final detection frame.

其中，步骤S31中的第二预设算法具体为，通过改进其骨干特征提取网络，将YoloV3原网络中的Darknet53骨干特征提取网络替换成Shfflenetv2骨干特征提取网络。Specifically, the second preset algorithm in step S31 is to replace the Darknet53 backbone feature extraction network in the original YoloV3 network with the Shfflenetv2 backbone feature extraction network by improving its backbone feature extraction network.

优选地，所述步骤S31具体包括：Preferably, the step S31 specifically includes:

S311，对训练图像进行聚类操作形成锚框；S311, performing a clustering operation on the training image to form an anchor frame;

S312，对训练图像进行划分，并形成多个小分块；S312, dividing the training image and forming a plurality of small blocks;

S313，在每一小分块中生成多个矩形框，且所述矩形框的长宽由所述锚框确定；S313, generating a plurality of rectangular frames in each small block, and the length and width of the rectangular frames are determined by the anchor frame;

S314，对同一所述小分块下的多个矩形框进行微调后形成初级检测框；S314, after fine-tuning a plurality of rectangular frames under the same small sub-block, a primary detection frame is formed;

S315，判断任一所述小分块中是否包含目标检测误，若是则计算当前所述小分块中的多个初级检测框与训练图像的真实框间的IOU 值，若所有所述初级检测框都大于设定阈值时，选取IOU值最大的所述初级检测框作为正样本；S315, determine whether any of the small blocks contains target detection errors, and if so, calculate the IOU values between the multiple primary detection frames in the current small block and the real frame of the training image, if all the primary detection frames When the frame is greater than the set threshold, the primary detection frame with the largest IOU value is selected as a positive sample;

S316，保存所述正样本的框形后生成具备轻量级卷积神经网络的第二预设算法。S316, after saving the frame shape of the positive sample, a second preset algorithm with a lightweight convolutional neural network is generated.

优选地，所述步骤S314具体包括：Preferably, the step S314 specifically includes:

S314a，获取所述锚框的多个参数值；S314a, obtaining multiple parameter values of the anchor frame;

S314b，根据获取的多个所述参数值对所述矩形框进行调整后形成初级检测框；S314b, forming a primary detection frame after adjusting the rectangular frame according to a plurality of the obtained parameter values;

其中，所述步骤S314a中的所述多个参数值包括四个拐角的坐标值，并分别记为t_x，t_y，t_w，t_h，还包括所述锚框相对训练图像的偏移量(c_x，c_y)；Wherein, the plurality of parameter values in the step S314a include the coordinate values of the four corners, which are respectively denoted as t _x , _{ty , t w} _, _th , and also include the offset of the anchor frame relative to the training image quantity(c _x , c _y );

所述步骤S314b中，对所述矩形框进行调整包括以下公式，In the step S314b, the adjustment of the rectangular frame includes the following formula:

b_x＝σ(t_x)+c_x；b _x =σ(t _x )+c _x ;

b_y＝σ(t_y)+c_y；b _y =σ( _ty )+ _cy ;

记当前所述矩形框的宽度为p_w，所述矩形框的高度为p_h；记所述矩形框坐标的真实值为

所述预设坐标值为t^*。Note that the current width of the rectangular frame is p _w , and the height of the rectangular frame is p _h ; note that the real value of the coordinates of the rectangular frame is

The preset coordinate value is t ^* .

优选地，在所述步骤S315后，步骤S316之前，设置有步骤：Preferably, after the step S315 and before the step S316, there are steps:

SX，采用以下公式进行损失计算，SX, the loss is calculated using the following formula,

Wbox＝2.0-tw*thWbox=2.0-tw*th

Loss＝lossbox+lossconf+lossclassLoss=lossbox+lossconf+lossclass

其中，其中S²代表训练图像大小为S乘S，B代表box，

代表若在坐标[i,j]处的box有目标则其值为1否则为0，主要计算3个损失函数，分别是与真实框中心坐标，宽高之间的损失，以及预测框中是否包含检测物的损失，以及预测类别损失最终将3种损失进行相加作为一个层级的损失函数值，最终计算损失时采用三个层级损失函数的平均作为最终损失值。Among them, where S ² represents the training image size is S times S, B represents the box,

Represents that if the box at coordinates [i, j] has a target, its value is 1, otherwise it is 0, and three loss functions are mainly calculated, which are the loss between the center coordinates of the real box, the width and height, and whether the prediction box is Including the loss of the detection object and the loss of the prediction category, the three losses are finally added as the loss function value of one level, and the average of the three levels of loss functions is used as the final loss value when calculating the loss.

优选地，所述步骤S1中，所述处理后获得待识别图像，包括，图像随机翻转，随机缩放，随机裁剪，随机改变亮暗。Preferably, in the step S1, the image to be recognized is obtained after the processing, including randomly flipping the image, randomly scaling, randomly cropping, and randomly changing the brightness and darkness.

优选地，所述步骤S2中，所述采用第一预设算法对所述待识别图像进行数据增强操作，包括，对脏污或纹理等数据集进行Fmix增强混合。Preferably, in the step S2, the first preset algorithm is used to perform a data enhancement operation on the to-be-recognized image, including performing Fmix enhancement and mixing on data sets such as dirt or texture.

本发明还提出了一种计算机可读存储器，所述计算机可读存储器包括存储的计算机程序，其中在所述计算机程序运行时控制所述计算机可读存储器所在设备执行如前述权利要求所述的方法。The present invention also provides a computer-readable memory, the computer-readable memory comprising a stored computer program, wherein when the computer program runs, a device in which the computer-readable memory is located is controlled to execute the method described in the preceding claims .

由上可知，应用本发明提供的技术方案可以得到以下有益效果：As can be seen from the above, the following beneficial effects can be obtained by applying the technical solution provided by the present invention:

本发明提出的方法中，通过模型训练，令轻量级卷积神经网络构成的第二预设算法可针对性地用于形成检测框，并确保技术结果准确。In the method proposed by the present invention, through model training, the second preset algorithm composed of the lightweight convolutional neural network can be used to form the detection frame in a targeted manner, and the technical result is ensured to be accurate.

具体实施方式Detailed ways

下面对本发明实施例中的技术方案进行清楚、完整地描述。显然，所描述的实施例仅仅是本发明部分实施例，而不是全部的实施例。基于本发明中的实施例，本领域普通技术人员在没有付出创造性劳动前提下所获得的所有其他实施例，都属于本发明保护的范围。The technical solutions in the embodiments of the present invention will be clearly and completely described below. Obviously, the described embodiments are only some, but not all, embodiments of the present invention. Based on the embodiments of the present invention, all other embodiments obtained by those of ordinary skill in the art without creative efforts shall fall within the protection scope of the present invention.

现有的钢筋端面识别技术通常采用普通机器视觉算法进行，导致的计数结果不准确，无法满足实用性需要的问题。Existing steel bar end face recognition technology is usually carried out by common machine vision algorithm, resulting in inaccurate counting results, which cannot meet the problem of practicality.

应强调的是，本实施例提出的计数及中心定位方法不仅仅用于钢筋端面，可用于不同背景、识别对象下的计数领域。It should be emphasized that the counting and center positioning method proposed in this embodiment is not only used for the end face of the steel bar, but can be used in the counting field under different backgrounds and recognition objects.

为了解决上述问题，本实施例提出了一种成捆钢筋端面的识别计数及中心定位方法，其主要包括以下步骤：In order to solve the above problems, the present embodiment proposes a method for identifying, counting and centering the end faces of bundled steel bars, which mainly includes the following steps:

S4，生成计数结果。S4, generating a counting result.

优选但不限定的是，本实施例中步骤S1中对拍摄的钢筋端面图像进行的处理包括图像随机翻转，随机缩放，随机裁剪，随机改变亮暗等，以确保钢筋端面中多个钢筋结构的边沿清晰，有助于准确计数。Preferably, but not limitedly, in this embodiment, the processing of the shot image of the steel bar end face in step S1 includes random image flipping, random scaling, random cropping, random changing of brightness and darkness, etc. Sharp edges help with accurate counting.

优选但不限定的是，本实施例中步骤S2的第一预设算法对待识别图像进行数据增强操作，包括，对脏污或纹理等数据集进行Fmix 增强混合。其中混合函数为Fmix函数，Fmix函数实现过程如下：1) 随机从脏污数据集中抽出一张图片；2)通过傅里叶空间采样的低频灰度图像进行阈值处理得到掩膜；3)将第一步随机获取的图像与第二步得出的掩膜进行掩膜混合。其中，傅里叶变化与掩膜混合函数包括：Preferably, but not limitedly, in this embodiment, the first preset algorithm in step S2 performs a data enhancement operation on the image to be recognized, including performing Fmix enhancement and mixing on data sets such as dirt or texture. The mixing function is the Fmix function, and the implementation process of the Fmix function is as follows: 1) randomly extract a picture from the dirty data set; 2) perform threshold processing on the low-frequency grayscale image sampled by Fourier space to obtain a mask; The randomly acquired image in one step is mask-blended with the mask obtained in the second step. Among them, the Fourier transform and mask mixing function include:

也即，首先对一个实部和虚部都独立且为高斯分布的随机复张量进行采样；再通过参数λ根据其频率对每个分量进行缩放，使得λ的较高值对于高频信息的增加衰减；在对复张量进行傅里叶逆变换，并取实部得到灰度图像；最后通过设定的阈值(图像的顶部比例)，将图像变成二进制掩码，高于阈值的设为1，低于阈值为0。That is, first sample a random complex tensor whose real and imaginary parts are independent and Gaussian distributed; then each component is scaled according to its frequency by the parameter λ, so that the higher value of λ is more important for the high frequency information. Increase the attenuation; perform inverse Fourier transform on the complex tensor, and take the real part to obtain a grayscale image; finally, through the set threshold (the top ratio of the image), the image is turned into a binary mask, and the setting higher than the threshold is 1, below the threshold is 0.

更具体地，所述步骤S3具体包括：More specifically, the step S3 specifically includes:

S32，采用第二预设算法在待识别图像中形成最终检测框，并计算最终检测框的数量。S32, a second preset algorithm is used to form a final detection frame in the image to be recognized, and the number of final detection frames is calculated.

优选地，步骤S31中的网络是由yolov3原基础上进行改进的网络，其主要改进其骨干特征提取网络，通过引入channel split使得网络输入输出通道数相同从而使得模型内存访问成本降低；引入逐点分组卷积即带分组且卷积核为1×1的卷积从而减少了计算量，其中逐点卷积的作用是减少参数量，而分组逐点卷积是在逐点卷积基础上进一步减少计算量；为解决过多分组会降低网络平行度，引入channel shuffle，其作用是丰富各个组的得到的信息，从而提取更多的特征；Preferably, the network in step S31 is an improved network based on the original yolov3, which mainly improves its backbone feature extraction network. By introducing channel split, the number of input and output channels of the network is the same, thereby reducing the memory access cost of the model; introducing point-by-point The grouped convolution is the convolution with grouping and the convolution kernel is 1×1, which reduces the amount of calculation. The function of the point-by-point convolution is to reduce the amount of parameters, and the grouped point-by-point convolution is based on the point-by-point convolution. Reduce the amount of calculation; in order to solve the problem that too many groups will reduce the network parallelism, channel shuffle is introduced, and its function is to enrich the information obtained by each group, thereby extracting more features;

优选地，步骤S31具体包括：Preferably, step S31 specifically includes:

S313，在每一小分块中生成多个矩形框，且矩形框的长宽由锚框确定；S313, generate a plurality of rectangular frames in each small block, and the length and width of the rectangular frame are determined by the anchor frame;

S314，对同一小分块下的多个矩形框进行微调后形成初级检测框；S314, after fine-tuning a plurality of rectangular frames under the same small block, a primary detection frame is formed;

S315，判断任一小分块中是否包含目标检测物，若是则计算当前小分块中的多个初级检测框与训练图像的真实框间的IOU值，若所有初级检测框都大于设定阈值时，选取IOU值最大的初级检测框作为正样本；S315, determine whether any small block contains the target detection object, if so, calculate the IOU value between the multiple primary detection frames in the current small block and the real frame of the training image, if all the primary detection frames are greater than the set threshold value When , select the primary detection frame with the largest IOU value as the positive sample;

S316，保存正样本的框形后生成具备轻量级卷积神经网络的第二预设算法。S316, after saving the frame shape of the positive sample, a second preset algorithm with a lightweight convolutional neural network is generated.

优选但不限定的是，本实施例中在S3之前，还设置有步骤，用于将预设数据转换为卷积网络能处理的格式。在本实施例中，预设原始数据集采用VOC格式进行标注，步骤S1在给数据进行加强时，当图像大小发生变化时，其对应的标注坐标也随之改变，在将数据放入卷积神经网络时，需要分别对图片数据和标注数据进行处理，图片数据需要进行整体归一化处理，并且转换成NCHW，即数量，通道数，高，宽的格式。而标签数据需要先进行kmeans聚类，生成9个聚类中心作为锚框的长宽，从而得到训练时用到的锚框。Preferably, but not limitedly, in this embodiment, before S3, a step is further provided for converting the preset data into a format that can be processed by the convolutional network. In this embodiment, the preset original data set is marked in VOC format. When the data is enhanced in step S1, when the size of the image changes, the corresponding marking coordinates also change. When the data is put into the convolution When using a neural network, image data and label data need to be processed separately. Image data needs to be normalized as a whole and converted into NCHW, that is, the format of number, number of channels, height, and width. The label data needs to be clustered by kmeans first, and 9 cluster centers are generated as the length and width of the anchor frame, so as to obtain the anchor frame used in training.

优选但不限定的是，本实施例中步骤S312中的小分块可设置为 N×N的格式，且步骤S313中选择的矩形框数量有三个，三个矩形框的形状都不相同，对应的，锚框大小根据网络下采样倍率，从小到大分成三组，每组有三个从小到大的矩形框。Preferably, but not limitedly, in this embodiment, the small blocks in step S312 can be set to an N×N format, and the number of rectangular frames selected in step S313 is three, and the shapes of the three rectangular frames are different, corresponding to , the anchor box size is divided into three groups from small to large according to the downsampling ratio of the network, and each group has three rectangular boxes from small to large.

在本实施例中，步骤S315中初级检测款小于设定阈值时，认为该部分与真实框重合度不高，以及小方块不包含物体是则设为负样本。In this embodiment, when the primary detection amount in step S315 is smaller than the set threshold, it is considered that the overlap of the part with the real frame is not high, and the small square does not contain an object, and it is set as a negative sample.

更具体地，步骤S314具体包括：More specifically, step S314 specifically includes:

S314a，获取锚框的多个参数值；S314a, obtaining multiple parameter values of the anchor frame;

S314b，根据获取的多个参数值对矩形框进行调整后形成初级检测框；S314b, after adjusting the rectangular frame according to the obtained multiple parameter values, a primary detection frame is formed;

其中，步骤S314a中的多个参数值包括四个拐角的坐标值，并分别记为t_x，t_y，t_w，t_h，还包括锚框相对训练图像的偏移量(c_x，c_y)；Among them, the multiple parameter values in step S314a include the coordinate values of the four corners, which are respectively recorded as t _x , _{ty , t w} _, _th , and also include the offset of the anchor frame relative to the training image (c _x , c _y );

步骤S314b中，对矩形框进行调整包括以下公式，In step S314b, the adjustment of the rectangular frame includes the following formula:

b_x＝σ(t_x)+c_x；b _x =σ(t _x )+c _x ;

b_y＝σ(t_y)+c_y；b _y =σ( _ty )+ _cy ;

b＝p e^t；b= ^pet ;

记当前矩形框的宽度为p_w，矩形框的高度为p_h；记矩形框坐标的真实值为

预设坐标值为t^*。Note that the width of the current rectangular frame is p _w , and the height of the rectangular frame is p _h ; note that the real value of the coordinates of the rectangular frame is

The preset coordinate value is t ^* .

更具体地，在步骤S315后，步骤S316之前，设置有步骤：More specifically, after step S315 and before step S316, there are steps:

Wbox＝2.0-tw*thWbox=2.0-tw*th

Loss＝lossbox+lossconf+lossclassLoss=lossbox+lossconf+lossclass

其中，其中S²代表训练图像大小为S乘S，B代表box，

在本实施例中，根据多个参数值得出的坐标值可用于标定当前检测框内的钢筋坐标，预设坐标值为在当前钢筋端面的图像已人工检验或其他方式检验得出的确定坐标值，两者间的差值可用于判定当前初级检测框的偏差是否在允许范围内，若是则可选用当前的初级检测框作为检测标准，若否则抛弃该初级检测框。In this embodiment, the coordinate values obtained according to a plurality of parameter values can be used to calibrate the coordinates of the steel bars in the current detection frame, and the preset coordinate values are determined coordinate values obtained by manual inspection or other methods of inspection on the image of the end face of the current steel bar. , the difference between the two can be used to determine whether the deviation of the current primary detection frame is within the allowable range, if so, the current primary detection frame can be selected as the detection standard, otherwise the primary detection frame is discarded.

应强调的，搭载有前述方法的存储器也属于本实施例的保护范围。It should be emphasized that the memory equipped with the aforementioned method also belongs to the protection scope of this embodiment.

综上所述，本实施例提出了一种成捆钢筋端面的识别计数及中心定位方法，其主要通过多个不同尺度的检测框迭代更替后得出最合适的检测框，并根据经检测框在当前图像中的识别数量判断得出当前图像中的钢筋端面数量，其计数更为准确。To sum up, this embodiment proposes a method for identifying, counting and centering the end faces of bundles of steel bars, which mainly obtains the most suitable detection frame by iteratively replacing a plurality of detection frames of different scales. The number of recognized bars in the current image is judged to obtain the number of steel bar end faces in the current image, and its count is more accurate.

以上所述的实施方式，并不构成对该技术方案保护范围的限定。任何在上述实施方式的精神和原则之内所作的修改、等同替换和改进等，均应包含在该技术方案的保护范围之内。The above-mentioned embodiments do not constitute a limitation on the protection scope of the technical solution. Any modifications, equivalent replacements and improvements made within the spirit and principles of the above-mentioned embodiments shall be included within the protection scope of this technical solution.

Claims

1. A method for identifying, counting and centering end faces of bundles of reinforcing steel bars is characterized by comprising the following steps:

s1, shooting an image of the end face of a steel bar, and processing to obtain an image to be identified;

s2, performing data enhancement operation on the image to be recognized by adopting a first preset algorithm;

s3, forming final detection frames in the image to be recognized by adopting a second preset algorithm with a lightweight convolutional neural network, and calculating the number of the final detection frames;

and S4, generating a counting result.

2. The method for identifying, counting and centering end faces of bundles of steel bars according to claim 1, wherein the step S3 specifically comprises:

s31, a second preset algorithm with a lightweight convolutional neural network is formed in advance;

s32, forming a final detection frame in the image to be recognized by adopting the second preset algorithm, and calculating the number of the final detection frames;

the second preset algorithm in step S31 is specifically to replace the Darknet53 backbone feature extraction network in the YoloV3 original network with the Shfflenetv2 backbone feature extraction network by improving the backbone feature extraction network.

3. The method as claimed in claim 2, wherein the step S31 specifically includes:

s311, clustering the training images to form an anchor frame;

s312, dividing the training image and forming a plurality of small blocks;

s313, generating a plurality of rectangular frames in each small block, wherein the length and the width of each rectangular frame are determined by the anchor frame;

s314, fine-tuning a plurality of rectangular frames under the same small block to form a primary detection frame;

s315, judging whether any small block contains a target detection object, if so, calculating IOU values between a plurality of primary detection frames in the small block and a real frame of a training image, and if all the primary detection frames are larger than a set threshold value, selecting the primary detection frame with the largest IOU value as a positive sample;

and S316, generating a second preset algorithm with the light-weight convolutional neural network after the frame shape of the positive sample is saved.

4. The method as claimed in claim 3, wherein the step S314 specifically includes:

s314a, acquiring a plurality of parameter values of the anchor frame;

s314b, adjusting the rectangular frame according to the acquired multiple parameter values to form a primary detection frame;

wherein, theThe parameter values in step S314a include coordinate values of four corners, which are respectively denoted as t _x ，t _y ，t _w ，t _h And an offset (c) of the anchor frame with respect to the training image _x ，c _y )；

In step S314b, the adjustment of the rectangular frame includes the following formula,

b _x ＝σ(t _x )+c _x ；

b _y ＝σ(t _y )+c _y ；

note that the width of the rectangular frame is p _w The height of the rectangular frame is p _h (ii) a Recording the real value of the coordinates of the rectangular frame as

The preset coordinate value is t ^* 。

5. The method for identifying, counting and centering end faces of reinforcing steel bundles according to claim 4, wherein after the step S315 and before the step S316, the steps of:

SX, which performs loss calculation by adopting the following formula,

Wbox＝2.0-tw*th

Loss＝lossbox+lossconf+lossclass

wherein S is ² Representing the training image size as S times S, B representing box,

represents if at coordinate [ i, j]And if the box has a target, the value of the box is 1, otherwise the box is 0, 3 loss functions are mainly calculated, namely the loss between the box and the center coordinate of the real box and the width and the height, whether the loss of the detection object is contained in the prediction box or not, and the 3 losses are finally added to be used as a loss function value of one level in the prediction type loss, and the average of the three-level loss functions is used as a final loss value in the final loss calculation.

6. The method for identifying, counting and centering the end faces of the steel bars in the bundle according to any one of claims 1 to 5, wherein the method comprises the following steps:

in the step S1, the image to be recognized is obtained after the processing, which includes randomly turning the image, randomly scaling, randomly cutting, and randomly changing brightness and darkness.

7. The method for identifying, calculating and centering end faces of bundles of steel bars according to any one of claims 1 to 5, wherein the method comprises the following steps:

in the step S2, the data enhancement operation is performed on the image to be recognized by using a first preset algorithm, which includes performing Fmix enhancement mixing on a dirty or texture data set.

8. A computer-readable memory, characterized in that: the computer-readable memory comprises a stored computer program, wherein the computer program when executed controls an apparatus in which the computer-readable memory is located to perform the method of any of claims 1-7.