CN112070784A - Perception edge detection method based on context enhancement network - Google Patents
- Publication number: CN112070784A
- Application number: CN202010965729.8A
- Authority: CN (China)
- Prior art keywords: feature, output, convolution, unit, CLSTM
- Prior art date: 2020-09-15
- Legal status: Granted (the legal status is an assumption and is not a legal conclusion; Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed)
Classifications
- G06T7/13—Image data processing; Image analysis; Segmentation; Edge detection
- G06F18/241—Pattern recognition; Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
- G06N3/045—Neural networks; Architecture, e.g. interconnection topology; Combinations of networks
- G06N3/048—Neural networks; Activation functions
- G06N3/049—Neural networks; Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
- G06N3/08—Neural networks; Learning methods
- G06V10/44—Extraction of image or video features; Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
- G06T2207/10004—Image acquisition modality; Still image; Photographic image
- G06T2207/20081—Special algorithmic details; Training; Learning
- G06T2207/20084—Special algorithmic details; Artificial neural networks [ANN]
Abstract
A perceptual edge detection method based on a context-enhancement network. The method uses a bidirectional recurrent neural network to capture the intrinsic connections between multi-scale contextual information, enhancing the expressive power of the feature pyramid and improving edge detection accuracy, with the advantage of high speed.
Description
Technical Field
The present invention relates to the field of image processing, and in particular to a perceptual edge detection method based on a context-enhancement network.
Background Art
An image edge is the set of pixels around which the gray level exhibits step or roof changes, and it is one of the most basic features of an image. Image edges often carry most of the information in an image. Edge detection plays an important role in computer vision, image processing, and related applications, and is a key step in image analysis and recognition; image edge detection has therefore long been an active research topic.
Traditional edge detection methods, such as the Canny and Sobel operators, usually locate edge points using features such as color and brightness. However, because natural images are complex, accurate edge detection is hard to achieve from gradient and color features alone. Some researchers have therefore combined multiple low-level features such as gradient, color, and brightness in a data-driven way, as in gPb+UCM and Structured Edges. Although these methods improve on purely gradient-based ones, they still rely only on low-level image features and struggle to detect edges robustly in special scenes. To capture high-level image features, other researchers have used convolutional neural networks to extract image-patch features, yielding classic edge detection methods such as N4-Fields and DeepEdge. Thanks to the strong feature-extraction capability of convolutional neural networks, these methods improved detection accuracy, but patch-wise processing makes it difficult to capture global image features. Building on this, Xie et al. proposed an end-to-end edge detection method that combines multi-level, multi-scale features and greatly improves accuracy; Liu et al. then fused receptive-field features of different sizes within each scale to strengthen the representation at each scale and further improve accuracy.
Although the multi-scale edge detection methods proposed in recent years have improved detection accuracy, they generally suffer from weak semantic information and frequent misclassification at large scales, and from sparse spatial information and imprecise localization at small scales. Simply weighting the outputs of different scales does not solve these problems; how to make the multi-scale features of an image complement one another and how to improve the classification accuracy at each scale remain open difficulties.
Summary of the Invention
In existing multi-scale edge detection methods, the features of the individual scales are independent of one another, so large-scale features carry little semantic information and are frequently misclassified, while small-scale features carry little spatial information and lose small-object detail severely. To address these problems, the present invention provides a perceptual edge detection method based on a context-enhancement network. The method improves edge detection accuracy by capturing the intrinsic connections between multi-scale contextual information, and it has the advantage of high speed.
The technical solution that achieves the object of the present invention is as follows:
A perceptual edge detection method based on a context-enhancement network comprises the following steps:
(1) Obtain an image training data set and train the edge detection model. The edge detection model comprises a feature extraction stage, a bidirectional recurrent neural network stage, and a classification stage, as follows:
Feature extraction: a trainable feature extraction network maps a sample image x to five groups of d-dimensional features (here d = 21), i.e. the feature groups [x1, x2, x3, x4, x5]. The network consists of five CSU modules, each containing a vertical branch and a lateral branch; the number of vertical convolutions N in each CSU module is, from front to back, {2, 2, 3, 3, 3}. The vertical branch extracts high-dimensional image features, and the lateral branch performs feature aggregation and upsampling; the process can be expressed by formulas (1) and (2),
where i indexes the i-th CSU module, n indexes the n-th vertical convolution within the module, W denotes the vertical convolution kernel parameters (kernel size 3×3 throughout, likewise below), dW denotes the lateral convolution kernel parameters (kernel size 1×1), Φ is the ReLU activation function, up(·) is the bilinear interpolation function, and the aggregation operation is implemented with a 1×1 convolution;
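The exact forms of formulas (1) and (2) appear as images in the original filing, so the PyTorch sketch below is only one plausible reading of a CSU module under the assumptions stated above: a vertical branch of N 3×3 convolutions with ReLU, a lateral 1×1 aggregation, and bilinear upsampling to a 21-channel side output. The class and argument names are illustrative, not from the source.

```python
import torch.nn as nn
import torch.nn.functional as F

class CSU(nn.Module):
    """One CSU module: a vertical branch of N 3x3 conv+ReLU layers
    (formula (1)) extracts high-dimensional features; a lateral 1x1
    convolution aggregates them and bilinear upsampling restores the
    input resolution (formula (2))."""
    def __init__(self, in_ch, mid_ch, n_convs, out_ch=21, scale=1):
        super().__init__()
        layers = []
        ch = in_ch
        for _ in range(n_convs):                     # vertical branch
            layers += [nn.Conv2d(ch, mid_ch, 3, padding=1),
                       nn.ReLU(inplace=True)]
            ch = mid_ch
        self.vertical = nn.Sequential(*layers)
        self.lateral = nn.Conv2d(mid_ch, out_ch, 1)  # 1x1 aggregation
        self.scale = scale                           # stage's downsampling factor

    def forward(self, x):
        feat = self.vertical(x)                      # high-dimensional features
        side = self.lateral(feat)                    # aggregate to 21 channels
        if self.scale > 1:                           # up(.) = bilinear interpolation
            side = F.interpolate(side, scale_factor=self.scale,
                                 mode="bilinear", align_corners=False)
        return feat, side                            # feat feeds the next stage
```

Chaining five such modules with n_convs = {2, 2, 3, 3, 3} and per-stage scales would yield the five 21-channel feature groups [x1, ..., x5].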
(2) Feed the five feature groups [x1, x2, x3, x4, x5] into two recurrent neural networks, one in forward order and one in reverse order, to obtain the five feature outputs h_1^f, ..., h_5^f of the forward recurrent neural network and the five outputs h_1^b, ..., h_5^b of the backward recurrent neural network. Each recurrent neural network consists of five CLSTM units connected in series; a CLSTM unit contains three gates, namely the input gate i_t, the output gate o_t, and the forget gate f_t. The recursion proceeds as follows:
Input: the memory feature c_{t-1} of the previous CLSTM unit, the current input x_t, and the output h_{t-1} of the previous CLSTM unit,
2-1) The forget gate f_t filters the memory feature c_{t-1} of the previous CLSTM unit; the output c' is given by formula (3):
$c' = f_t \cdot c_{t-1}$ (3),
$f_t = \sigma(W_f * [x_t, h_{t-1}] + b_f)$ (4),
where W_f is the 1×1 convolution weight, b_f is a bias term, x_t is the current input, h_{t-1} is the output of the previous CLSTM unit, σ is the Sigmoid activation function, and c_{t-1} is the memory feature of the previous CLSTM unit. The forget gate selects and filters the memory features of the previous unit: the feature information of x_t and h_{t-1} is fused by a 1×1 convolution, a weight map is generated through the Sigmoid activation function, and this weight map is used to select and filter the memory features;
2-2) The input gate i_t filters the input features and weights them together with the filtered memory feature c', giving the memory feature c_t of the current CLSTM unit as formula (5):
$c_t = c' + (1 + i_t) \times \Phi(W_x * [x_t, h_{t-1}] + b_c)$ (5),
$i_t = \sigma(W_i * [x_t, h_{t-1}] + b_i)$ (6),
where W_x and W_i are convolution weights, b_i and b_c are bias terms, Φ is the ReLU function, σ is the Sigmoid function, and * denotes the convolution operation. The input gate models the importance of the input features and generates a feature weight matrix that selectively enhances them: the current CLSTM unit fuses the feature information of x_t and h_{t-1} by a 1×1 convolution, generates a weight map through the Sigmoid activation function, and uses this weight map to selectively enhance the fused input features;
2-3) The output gate o_t selectively enhances the memory feature, giving the current CLSTM output h_t as formula (7):
$h_t = (o_t + 1) \times \Phi(c_t)$ (7),
$o_t = \sigma(W_o * [x_t, h_{t-1}] + b_o)$ (8),
where Φ is the ReLU function, σ is the Sigmoid function, * denotes the convolution operation, W_o is the convolution weight, and b_o is a bias term;
Output: the memory feature c_t of the current CLSTM unit and the output h_t of the current CLSTM unit;
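The gate equations (3)-(8) above translate directly into a convolutional LSTM cell. The sketch below assumes 1×1 convolutions over the channel-concatenated pair [x_t, h_{t-1}], as the text states; the class and layer names are illustrative.

```python
import torch
import torch.nn as nn

class CLSTMCell(nn.Module):
    """Convolutional LSTM unit implementing formulas (3)-(8): the forget
    gate filters the previous memory, the input gate selectively enhances
    the fused input, and the output gate enhances the new memory."""
    def __init__(self, channels):
        super().__init__()
        # each gate fuses [x_t, h_{t-1}] with a 1x1 convolution (bias included)
        self.w_f = nn.Conv2d(2 * channels, channels, 1)  # forget gate
        self.w_i = nn.Conv2d(2 * channels, channels, 1)  # input gate
        self.w_o = nn.Conv2d(2 * channels, channels, 1)  # output gate
        self.w_x = nn.Conv2d(2 * channels, channels, 1)  # candidate features

    def forward(self, x_t, h_prev, c_prev):
        xh = torch.cat([x_t, h_prev], dim=1)
        f_t = torch.sigmoid(self.w_f(xh))                # formula (4)
        c_filtered = f_t * c_prev                        # formula (3)
        i_t = torch.sigmoid(self.w_i(xh))                # formula (6)
        cand = torch.relu(self.w_x(xh))
        c_t = c_filtered + (1 + i_t) * cand              # formula (5)
        o_t = torch.sigmoid(self.w_o(xh))                # formula (8)
        h_t = (o_t + 1) * torch.relu(c_t)                # formula (7)
        return h_t, c_t
```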
(3) Apply a convolution operation to each of the forward-branch feature tensors h_1^f, ..., h_5^f and backward-branch feature tensors h_1^b, ..., h_5^b to reduce the number of feature channels to 1, obtaining s_1^f, ..., s_5^f and s_1^b, ..., s_5^b, as expressed by formulas (9) and (10);
(4) Concatenate the feature tensors s_1^f, ..., s_5^f and s_1^b, ..., s_5^b along dimension 1 to obtain the tensor s_o, fuse s_o with a weighted 1×1 convolution, and pass the result through the Sigmoid activation function to obtain the final output s_out, as in formula (11):
$s_{out} = W_{s_o} * s_o$ (11),
(5) Pass s_1^f, ..., s_5^f and s_1^b, ..., s_5^b through the Sigmoid activation function separately to obtain the ten branch outputs;
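Steps (2)-(5) can be read as a single bidirectional module. The sketch below chains the CLSTMCell above over the five scale features in forward and reverse order, reduces each of the ten outputs to one channel, and fuses them with a 1×1 convolution as in formula (11). Sharing the CLSTM weights across time steps and using zero initial states are assumptions, not statements from the filing.

```python
import torch
import torch.nn as nn

class BiRecursiveFusion(nn.Module):
    """Steps (2)-(5): forward and backward CLSTM chains over the five
    scale features, per-branch 1x1 reduction to one channel, channel
    concatenation, and 1x1 weighted fusion. Assumes the CLSTMCell
    sketch defined earlier."""
    def __init__(self, channels=21):
        super().__init__()
        self.fwd = CLSTMCell(channels)   # forward chain (shared weights: assumption)
        self.bwd = CLSTMCell(channels)   # backward chain (same assumption)
        self.reduce = nn.ModuleList([nn.Conv2d(channels, 1, 1) for _ in range(10)])
        self.fuse = nn.Conv2d(10, 1, 1)  # weighted fusion of the ten maps

    def _chain(self, cell, feats):
        h = torch.zeros_like(feats[0])   # zero initial state (assumption)
        c = torch.zeros_like(feats[0])
        outs = []
        for x_t in feats:
            h, c = cell(x_t, h, c)       # gate equations (3)-(8)
            outs.append(h)
        return outs

    def forward(self, feats):            # feats: [x1..x5], equal shapes
        hf = self._chain(self.fwd, feats)              # forward order
        hb = self._chain(self.bwd, feats[::-1])[::-1]  # reverse order
        side = [conv(h) for conv, h in zip(self.reduce, hf + hb)]
        s_o = torch.cat(side, dim=1)                   # concat on dimension 1
        s_out = torch.sigmoid(self.fuse(s_o))          # formula (11) + Sigmoid
        branches = [torch.sigmoid(s) for s in side]    # ten branch outputs
        return s_out, branches
```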
(6) Compute the loss between all outputs and the labels using the cross entropy l_w(X_i), and optimize the whole network by minimizing the loss L_w. Because the labels in the data set are produced by multiple annotators, this technical solution takes a weighted average of the multiple labels and partitions the pixels by a threshold θ: pixels with 0 < y_i < θ are ambiguous points for which no loss is computed, pixels with y_i = 0 are non-edge points, and pixels with y_i ≥ θ are edge points,
where |Y+| and |Y-| denote the total numbers of positive and negative samples in each batch, respectively, the hyperparameter λ is the positive/negative sample balance parameter, X denotes the network output, and y_i is a label pixel; the final loss function is therefore given by formula (16), where h_k^f is the output of the k-th stage of the forward recurrent neural network, h_k^b is the output of the k-th stage of the backward recurrent neural network, s_out is the output value of the final network, and |I| is the total number of pixels in image I;
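Formulas (12)-(16) likewise appear as images in the filing; the sketch below implements a class-balanced cross entropy of the kind described, with a λ-weighted positive/negative balance, ambiguous pixels masked out, and the losses of the ten branch outputs and the fused output summed and normalized by the pixel count |I|. The exact weighting in the filing may differ, and the default values of theta and lam are illustrative.

```python
import torch

def weighted_bce(pred, y, theta=0.5, lam=1.1):
    """Class-balanced cross entropy l_w for one sigmoid output map.
    y holds the weighted average of the multi-annotator labels in [0, 1];
    pixels with 0 < y < theta are ambiguous and contribute no loss."""
    pos = (y >= theta).float()                    # edge pixels
    neg = (y == 0).float()                        # non-edge pixels
    n_pos, n_neg = pos.sum(), neg.sum()
    alpha = lam * n_neg / (n_pos + n_neg + 1e-8)  # weight on positives
    beta = n_pos / (n_pos + n_neg + 1e-8)         # weight on negatives
    eps = 1e-8
    return -(alpha * pos * torch.log(pred + eps)
             + beta * neg * torch.log(1.0 - pred + eps)).sum()

def total_loss(branches, s_out, y):
    """Formula (16), approximately: sum l_w over the ten branch outputs
    and the fused output s_out, normalized by the pixel count |I|."""
    loss = sum(weighted_bce(s, y) for s in branches)
    loss = loss + weighted_bce(s_out, y)
    return loss / y.numel()
```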
(7) Repeat steps (1) to (6) until the network converges.
This technical solution uses a bidirectional recurrent neural network to strengthen the intrinsic connections between multi-scale contexts, addressing the insufficient use of per-scale information and the limited edge detection accuracy of current multi-scale methods. Exploiting the fact that the multi-scale contextual receptive fields grow from small to large in temporal order, the method analyzes and models the multi-scale contextual information along the time dimension. Following the idea of recurrent neural networks, top-down and bottom-up recurrent branches progressively capture contextual connections in different directions and over different ranges, enhancing the expressive power of the feature pyramid. The method offers both high detection accuracy and high speed.
Brief Description of the Drawings
Figure 1 is a schematic diagram of the internal structure of the CLSTM unit in the embodiment;
Figure 2 is a schematic diagram of the overall network structure in the embodiment;
Figure 3 is a schematic flowchart of the method in the embodiment.
Detailed Description of Embodiments
The content of the present invention is further described below with reference to the accompanying drawings and an embodiment, which are not intended to limit the present invention.
Embodiment:
Referring to Figures 2 and 3, a perceptual edge detection method based on a context-enhancement network comprises the following steps:
(1) Obtain an image training data set and train the edge detection model. The edge detection model comprises a feature extraction stage, a bidirectional recurrent neural network stage, and a classification stage, as follows:
Feature extraction: a trainable feature extraction network maps a sample image x to five groups of d-dimensional features (here d = 21), i.e. the feature groups [x1, x2, x3, x4, x5]. The network consists of five CSU modules, each containing a vertical branch and a lateral branch; the number of vertical convolutions N in each CSU module is, from front to back, {2, 2, 3, 3, 3}. The vertical branch extracts high-dimensional image features, and the lateral branch performs feature aggregation and upsampling; the process can be expressed by formulas (1) and (2),
where i indexes the i-th CSU module, n indexes the n-th vertical convolution within the module, W denotes the vertical convolution kernel parameters (kernel size 3×3 throughout, likewise below), dW denotes the lateral convolution kernel parameters (kernel size 1×1), Φ is the ReLU activation function, up(·) is the bilinear interpolation function, and the aggregation operation is implemented with a 1×1 convolution;
(2) Feed the five feature groups [x1, x2, x3, x4, x5] into two recurrent neural networks, one in forward order and one in reverse order, to obtain the five feature outputs h_1^f, ..., h_5^f of the forward recurrent neural network and the five outputs h_1^b, ..., h_5^b of the backward recurrent neural network. Each recurrent neural network consists of five CLSTM units connected in series; a CLSTM unit contains three gates, namely the input gate i_t, the output gate o_t, and the forget gate f_t. The internal structure of the CLSTM unit is shown in Figure 1, and the recursion proceeds as follows:
Input: the memory feature c_{t-1} of the previous CLSTM unit, the current input x_t, and the output h_{t-1} of the previous CLSTM unit,
2-1) The forget gate f_t filters the memory feature c_{t-1} of the previous CLSTM unit; the output c' is given by formula (3):
$c' = f_t \cdot c_{t-1}$ (3),
$f_t = \sigma(W_f * [x_t, h_{t-1}] + b_f)$ (4),
where W_f is the 1×1 convolution weight, b_f is a bias term, x_t is the current input, h_{t-1} is the output of the previous CLSTM unit, σ is the Sigmoid activation function, and c_{t-1} is the memory feature of the previous CLSTM unit. The forget gate selects and filters the memory features of the previous unit: the feature information of x_t and h_{t-1} is fused by a 1×1 convolution, a weight map is generated through the Sigmoid activation function, and this weight map is used to select and filter the memory features;
2-2) The input gate i_t filters the input features and weights them together with the filtered memory feature c', giving the memory feature c_t of the current CLSTM unit as formula (5):
$c_t = c' + (1 + i_t) \times \Phi(W_x * [x_t, h_{t-1}] + b_c)$ (5),
$i_t = \sigma(W_i * [x_t, h_{t-1}] + b_i)$ (6),
where W_x and W_i are convolution weights, b_i and b_c are bias terms, Φ is the ReLU function, σ is the Sigmoid function, and * denotes the convolution operation. The input gate models the importance of the input features and generates a feature weight matrix that selectively enhances them: the current CLSTM unit fuses the feature information of x_t and h_{t-1} by a 1×1 convolution, generates a weight map through the Sigmoid activation function, and uses this weight map to selectively enhance the fused input features;
2-3) The output gate o_t selectively enhances the memory feature, giving the current CLSTM output h_t as formula (7):
$h_t = (o_t + 1) \times \Phi(c_t)$ (7),
$o_t = \sigma(W_o * [x_t, h_{t-1}] + b_o)$ (8),
where Φ is the ReLU function, σ is the Sigmoid function, * denotes the convolution operation, W_o is the convolution weight, and b_o is a bias term;
Output: the memory feature c_t of the current CLSTM unit and the output h_t of the current CLSTM unit;
(3) Apply a convolution operation to each of the forward-branch feature tensors h_1^f, ..., h_5^f and backward-branch feature tensors h_1^b, ..., h_5^b to reduce the number of feature channels to 1, obtaining s_1^f, ..., s_5^f and s_1^b, ..., s_5^b, as expressed by formulas (9) and (10);
(4) Concatenate the feature tensors s_1^f, ..., s_5^f and s_1^b, ..., s_5^b along dimension 1 to obtain the tensor s_o, fuse s_o with a weighted 1×1 convolution, and pass the result through the Sigmoid activation function to obtain the final output s_out, as in formula (11):
$s_{out} = W_{s_o} * s_o$ (11),
(5) Pass s_1^f, ..., s_5^f and s_1^b, ..., s_5^b through the Sigmoid activation function separately to obtain the ten branch outputs;
(6) Compute the loss between all outputs and the labels using the cross entropy l_w(X_i), and optimize the whole network by minimizing the loss L_w. Because the labels in the data set are produced by multiple annotators, this technical solution takes a weighted average of the multiple labels and partitions the pixels by a threshold θ: pixels with 0 < y_i < θ are ambiguous points for which no loss is computed, pixels with y_i = 0 are non-edge points, and pixels with y_i ≥ θ are edge points,
where |Y+| and |Y-| denote the total numbers of positive and negative samples in each batch, respectively, the hyperparameter λ is the positive/negative sample balance parameter, X denotes the network output, and y_i is a label pixel; the final loss function is therefore given by formula (16), where h_k^f is the output of the k-th stage of the forward recurrent neural network, h_k^b is the output of the k-th stage of the backward recurrent neural network, s_out is the output value of the final network, and |I| is the total number of pixels in image I;
(7) Repeat steps (1) to (6) until the network converges.
Claims (1)
Priority Applications (1)
Application Number | Priority Date | Filing Date | Title |
---|---|---|---|
CN202010965729.8A | 2020-09-15 | 2020-09-15 | Perception edge detection method based on context enhancement network |
Publications (2)
Publication Number | Publication Date |
---|---|
CN112070784A (en) | 2020-12-11 |
CN112070784B (en) | 2022-07-01 |
Family
ID=73696724
Family Applications (1)
Application Number | Title | Priority Date | Filing Date |
---|---|---|---|
CN202010965729.8A (granted as CN112070784B, active) | Perception edge detection method based on context enhancement network | 2020-09-15 | 2020-09-15 |
Country Status (1)
Country | Link |
---|---|
CN (1) | CN112070784B (en) |
Patent Citations (10)
Publication number | Priority date | Publication date | Assignee | Title |
---|---|---|---|---|
CN106570497A (en) * | 2016-10-08 | 2017-04-19 | Shenzhen Institutes of Advanced Technology, Chinese Academy of Sciences | Text detection method and device for scene images |
US10701394B1 (en) * | 2016-11-10 | 2020-06-30 | Twitter, Inc. | Real-time video super-resolution with spatio-temporal networks and motion compensation |
CN107180248A (en) * | 2017-06-12 | 2017-09-19 | Guilin University of Electronic Technology | Hyperspectral image classification method based on an associated-loss enhancement network |
US10289903B1 (en) * | 2018-02-12 | 2019-05-14 | Avodah Labs, Inc. | Visual sign language translation training device and method |
CN108595632A (en) * | 2018-04-24 | 2018-09-28 | Fuzhou University | Hybrid neural network text classification method fusing abstract and body features |
CN109886971A (en) * | 2019-01-24 | 2019-06-14 | Xi'an Jiaotong University | Method and system for image segmentation based on a convolutional neural network |
CN110322009A (en) * | 2019-07-19 | 2019-10-11 | Nanjing Meihua Software System Co., Ltd. | Image prediction method based on multilayer convolutional long short-term memory neural networks |
CN110705457A (en) * | 2019-09-29 | 2020-01-17 | Beijing Research Institute of Uranium Geology | Building change detection method for remote sensing images |
CN111222580A (en) * | 2020-01-13 | 2020-06-02 | Southwest University of Science and Technology | High-precision crack detection method |
CN111539916A (en) * | 2020-04-08 | 2020-08-14 | Sun Yat-sen University | Adversarially robust image saliency detection method and system |
Non-Patent Citations (10)
Title |
---|
H. Cho et al., "Biomedical named entity recognition using deep neural networks with contextual information," BMC Bioinformatics * |
Jinzheng Cai et al., "Improving deep pancreas segmentation in CT and MRI images via recurrent neural contextual learning and direct loss function," Computer Vision and Pattern Recognition * |
Runmin Wu et al., "A mutual learning method for salient object detection with intertwined multi-supervision," Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR) * |
Liu Pengli, "Research on an edge detection algorithm based on feature re-extraction from deep convolutional neural networks," China Master's Theses Full-text Database (Information Science and Technology) * |
Yang Guohua, "Research and implementation of dialogue state tracking based on cascaded neural networks," China Doctoral Dissertations Full-text Database (Information Science and Technology) * |
Ouyang Ning et al., "Image super-resolution reconstruction combining perceptual edge constraints and a multi-scale fusion network," Journal of Computer Applications * |
Wang Shuaishuai et al., "Lane line detection based on fully convolutional neural networks," Digital Manufacturing Science * |
Wang Yaling, "Research on text classification based on word-sense disambiguation convolutional neural networks," China Master's Theses Full-text Database (Information Science and Technology) * |
Qin Feng et al., "Topic-fused CLSTM sentiment classification of short texts," Journal of Anhui University of Technology (Natural Science Edition) * |
Ma Huizhu et al., "Research directions and keywords of computer-aided project acceptance: 2012 acceptance results and notes for 2013," Journal of Electronics & Information Technology * |
Also Published As
Publication number | Publication date |
---|---|
CN112070784B (en) | 2022-07-01 |
Legal Events
Date | Code | Title | Description |
---|---|---|---|
| PB01 | Publication | |
| SE01 | Entry into force of request for substantive examination | |
| GR01 | Patent grant | |
2024-09-27 | TR01 | Transfer of patent right | Patentee after: Guangxi Zhengshichang Information Technology Co., Ltd., Room 2, 9th Floor, Unit 1, Building 15, No. 63-1 Taoyuan Road, Qingxiu District, Nanning, Guangxi Zhuang Autonomous Region 530000, China. Patentee before: Guilin University of Electronic Technology, 1 Jinji Road, Qixing District, Guilin, Guangxi Zhuang Autonomous Region 541004, China. |