
CN112070784A - Perception edge detection method based on context enhancement network - Google Patents

Perception edge detection method based on context enhancement network

Info

Publication number
CN112070784A
CN112070784A (application CN202010965729.8A; granted publication CN112070784B)
Authority
CN
China
Prior art keywords
feature
output
convolution
unit
clstm
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Granted
Application number
CN202010965729.8A
Other languages
Chinese (zh)
Other versions
CN112070784B (en)
Inventor
欧阳宁
韦羽
周宏敏
林乐平
莫建文
袁华
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Guangxi Zhengshichang Information Technology Co ltd
Original Assignee
Guilin University of Electronic Technology
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Guilin University of Electronic Technology filed Critical Guilin University of Electronic Technology
Priority to CN202010965729.8A priority Critical patent/CN112070784B/en
Publication of CN112070784A publication Critical patent/CN112070784A/en
Application granted granted Critical
Publication of CN112070784B publication Critical patent/CN112070784B/en
Active legal-status Critical Current
Anticipated expiration legal-status Critical

Classifications

    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T7/00 Image analysis
    • G06T7/10 Segmentation; Edge detection
    • G06T7/13 Edge detection
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06F ELECTRIC DIGITAL DATA PROCESSING
    • G06F18/00 Pattern recognition
    • G06F18/20 Analysing
    • G06F18/24 Classification techniques
    • G06F18/241 Classification techniques relating to the classification model, e.g. parametric or non-parametric approaches
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/045 Combinations of networks
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/048 Activation functions
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/04 Architecture, e.g. interconnection topology
    • G06N3/049 Temporal neural networks, e.g. delay elements, oscillating neurons or pulsed inputs
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06N COMPUTING ARRANGEMENTS BASED ON SPECIFIC COMPUTATIONAL MODELS
    • G06N3/00 Computing arrangements based on biological models
    • G06N3/02 Neural networks
    • G06N3/08 Learning methods
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06V IMAGE OR VIDEO RECOGNITION OR UNDERSTANDING
    • G06V10/00 Arrangements for image or video recognition or understanding
    • G06V10/40 Extraction of image or video features
    • G06V10/44 Local feature extraction by analysis of parts of the pattern, e.g. by detecting edges, contours, loops, corners, strokes or intersections; Connectivity analysis, e.g. of connected components
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/10 Image acquisition modality
    • G06T2207/10004 Still image; Photographic image
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20081 Training; Learning
    • G PHYSICS
    • G06 COMPUTING; CALCULATING OR COUNTING
    • G06T IMAGE DATA PROCESSING OR GENERATION, IN GENERAL
    • G06T2207/00 Indexing scheme for image analysis or image enhancement
    • G06T2207/20 Special algorithmic details
    • G06T2207/20084 Artificial neural networks [ANN]

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Data Mining & Analysis (AREA)
  • Evolutionary Computation (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Artificial Intelligence (AREA)
  • General Engineering & Computer Science (AREA)
  • Computing Systems (AREA)
  • Software Systems (AREA)
  • Molecular Biology (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Mathematical Physics (AREA)
  • General Health & Medical Sciences (AREA)
  • Health & Medical Sciences (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Evolutionary Biology (AREA)
  • Multimedia (AREA)
  • Image Analysis (AREA)

Abstract

The invention discloses a perceptual edge detection method based on a context enhancement network, which comprises the following steps: 1) acquiring an image training data set and training an edge detection model; 2) feature extraction; 3) dimension reduction and weighting; 4) bidirectional recursion; 5) classification; 6) output. The method improves the accuracy of edge detection by capturing the intrinsic relations among multi-scale context information and has the advantage of high speed.

Description

A Perceptual Edge Detection Method Based on a Context Enhancement Network

Technical Field

The invention relates to the field of image processing, and in particular to a perceptual edge detection method based on a context enhancement network.

Background

An image edge is the set of pixels around which the gray level shows a step or roof change, and it is one of the most basic features of an image. Edges often carry most of the information in an image; edge detection plays an important role in computer vision, image processing and related applications and is a key step in image analysis and recognition, so image edge detection has long been an active research topic.

Traditional edge detection methods, such as the Canny and Sobel operators, usually look for edge points using features such as color and brightness. Because natural images are complex, gradient and color features alone rarely yield accurate edge detection, so some researchers have combined several low-level features (gradient, color, brightness) in a data-driven way, as in gPb+UCM and StructuredEdge. Although such methods improve on gradient-based approaches, they still use only low-level features and are not robust in special scenes. To capture high-level image features, other researchers applied convolutional neural networks to image patches, producing classic edge detection methods such as N4-Fields and DeepEdge; thanks to the strong feature extraction ability of convolutional networks, these methods raised detection accuracy, but patch-wise processing is ill suited to capturing global image features. Building on this observation, Xie et al. proposed an end-to-end edge detection method that combines multi-level, multi-scale features and greatly improves accuracy, and Liu et al. further improved accuracy by fusing receptive fields of different sizes within each scale to strengthen the representation at every scale.

Although the multi-scale edge detection methods proposed in recent years have improved detection accuracy, they generally suffer from the same problems: large-scale features carry little semantic information and produce many misclassifications, while small-scale features carry little spatial information and localize edges imprecisely. Simply weighting the outputs of the different scales does not solve these problems, and how to make the multi-scale features of an image complement one another and improve the classification accuracy of every scale remains a difficulty.

Summary of the Invention

In existing multi-scale edge detection methods the features of the individual scales are independent of one another, so large-scale features carry little semantic information and produce many misclassifications, small-scale features carry little spatial information, and small targets are easily lost. To address these problems, the invention provides a perceptual edge detection method based on a context enhancement network. The method improves the accuracy of edge detection by capturing the intrinsic relations among multi-scale context information and has the advantage of high speed.

The technical scheme that realizes the object of the invention is as follows:

A perceptual edge detection method based on a context enhancement network comprises the following steps:

(1) Acquire an image training data set and train an edge detection model. The edge detection model comprises a feature extraction process, a bidirectional neural-network recursion process and a classification process, specifically:

Feature extraction: the sample image x is mapped to 5 groups of d-dimensional features. The feature extraction network consists of 5 CSU modules and, as a trainable network, maps the input sample image x to 5 groups of 21-dimensional features x_i, i ∈ {1~5} (the symbol appears as a formula image in the original). Each CSU module contains a lateral branch and a vertical branch; the number of vertical convolutions N in the 5 CSU modules is {2, 2, 3, 3, 3} from front to back. The vertical branch extracts high-dimensional image features, and the lateral branch aggregates features and performs upsampling; the process is expressed by formula (1) and formula (2) (shown only as images in the original), where i indexes the i-th CSU module, n indexes the n-th vertical convolution inside a CSU module, W denotes the vertical convolution kernel parameters (kernel size 3×3 throughout, likewise below), dW denotes the lateral convolution kernel parameters (kernel size 1×1), Φ is the ReLU activation function, up(·) is the bilinear interpolation function, and the aggregation operation is implemented by a 1×1 convolution;
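Formulas (1) and (2) are reproduced only as images, so the following is a minimal PyTorch sketch of how one CSU module could be read from the text above: N vertical 3×3 convolutions with ReLU, a lateral 1×1 aggregation of the vertical outputs, and bilinear upsampling back to the input resolution. The class name, channel widths, the concatenation-based aggregation and the omission of any pooling between modules are assumptions, not taken from the patent.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CSUModule(nn.Module):
    """Sketch of a CSU module: a vertical branch of N 3x3 convolutions and
    a lateral branch that aggregates them with a 1x1 convolution and
    upsamples bilinearly (channel widths and aggregation are assumed)."""

    def __init__(self, in_ch, mid_ch, n_vertical, out_ch=21, scale=1):
        super().__init__()
        convs, ch = [], in_ch
        for _ in range(n_vertical):                       # vertical branch
            convs.append(nn.Conv2d(ch, mid_ch, kernel_size=3, padding=1))
            ch = mid_ch
        self.vertical = nn.ModuleList(convs)
        self.lateral = nn.Conv2d(mid_ch * n_vertical, out_ch, kernel_size=1)
        self.scale = scale                                # factor back to image size

    def forward(self, x):
        feats = []
        for conv in self.vertical:
            x = F.relu(conv(x))                           # Phi = ReLU
            feats.append(x)
        agg = self.lateral(torch.cat(feats, dim=1))       # lateral aggregation
        if self.scale > 1:                                # up(.) = bilinear interpolation
            agg = F.interpolate(agg, scale_factor=self.scale,
                                mode='bilinear', align_corners=False)
        return x, agg        # x feeds the next module, agg is the 21-channel output
```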

(2) The 5 feature groups [x1, x2, x3, x4, x5] are fed, in forward and reverse order respectively, into two recurrent neural networks, giving the 5 feature outputs of the forward recurrent network and the 5 outputs of the backward recurrent network (both denoted by formula images in the original). Each recurrent network consists of 5 CLSTM units connected in series; a CLSTM unit has three parts, an input gate i_t, an output gate o_t and a forget gate f_t, and the recursion proceeds as follows:

Input: the memory feature c_{t-1} of the previous CLSTM unit, the input x_t at the current step, and the output h_{t-1} of the previous CLSTM unit.

2-1) The forget gate f_t filters the memory feature c_{t-1} of the previous CLSTM unit; the output c' is given by formula (3):

c' = f_t · c_{t-1}   (3),

f_t = σ(W_f * [x_t, h_{t-1}] + b_f)   (4),

where W_f is a 1×1 convolution weight, b_f is a bias term, x_t is the input at the current step, h_{t-1} is the output of the previous CLSTM unit, σ is the Sigmoid activation function and c_{t-1} is the memory feature of the previous CLSTM unit. The forget gate selects and filters the memory features of the previous unit: the feature information of x_t and h_{t-1} is fused by a 1×1 convolution and passed through the Sigmoid activation to produce a weight map, which is then used to select and filter the memory features;

2-2) The input gate i_t filters the input features and weights them together with the filtered memory feature c', giving the memory feature c_t of the current CLSTM unit by formula (5):

c_t = c' + (1 + i_t) × Φ(W_x * [x_t, h_{t-1}] + b_c)   (5),

i_t = σ(W_i * [x_t, h_{t-1}] + b_i)   (6),

where W_x and W_i are convolution weights, b_i and b_c are bias terms, Φ is the ReLU function, σ is the Sigmoid function and * denotes convolution. The input gate models the importance of the input features and generates a feature weight matrix that selectively enhances them: the current CLSTM unit fuses the feature information of x_t and h_{t-1} through a 1×1 convolution and produces a weight map via the Sigmoid activation, which selectively enhances the fused input features;

2-3) The output gate o_t selectively enhances the memory feature, giving the current CLSTM output h_t by formula (7):

h_t = (o_t + 1) × Φ(c_t)   (7),

o_t = σ(W_o * [x_t, h_{t-1}] + b_o)   (8),

where Φ is the ReLU function, σ is the Sigmoid function, * denotes convolution, W_o is a convolution weight and b_o is a bias term.

Output: the memory feature c_t of the current CLSTM unit and the output h_t of the current CLSTM unit;
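Formulas (3)-(8) map almost directly onto code. Below is a minimal sketch of one CLSTM unit under those equations, assuming PyTorch and 1×1 convolutions for every gate; the class name, the channel width and the way [x_t, h_{t-1}] is formed by channel concatenation are assumptions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class CLSTMUnit(nn.Module):
    """Sketch of the CLSTM unit of formulas (3)-(8): forget, input and output
    gates computed from [x_t, h_{t-1}] with 1x1 convolutions."""

    def __init__(self, channels=21):
        super().__init__()
        self.conv_f = nn.Conv2d(2 * channels, channels, 1)  # forget gate, eq. (4)
        self.conv_i = nn.Conv2d(2 * channels, channels, 1)  # input gate, eq. (6)
        self.conv_o = nn.Conv2d(2 * channels, channels, 1)  # output gate, eq. (8)
        self.conv_c = nn.Conv2d(2 * channels, channels, 1)  # candidate features W_x, eq. (5)

    def forward(self, x_t, h_prev, c_prev):
        xh = torch.cat([x_t, h_prev], dim=1)          # [x_t, h_{t-1}]
        f_t = torch.sigmoid(self.conv_f(xh))          # eq. (4)
        c_filtered = f_t * c_prev                     # eq. (3): c' = f_t . c_{t-1}
        i_t = torch.sigmoid(self.conv_i(xh))          # eq. (6)
        c_t = c_filtered + (1 + i_t) * F.relu(self.conv_c(xh))   # eq. (5)
        o_t = torch.sigmoid(self.conv_o(xh))          # eq. (8)
        h_t = (o_t + 1) * F.relu(c_t)                 # eq. (7)
        return h_t, c_t
```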

(3) The forward-branch feature tensors and the backward-branch feature tensors obtained in step (2) are each passed through a convolution that reduces the number of feature channels to 1, giving the forward and backward score maps; the process is shown in formula (9) and formula (10) (shown only as images in the original);

(4) The score maps of both branches are concatenated along channel dimension 1 to obtain the tensor so; a 1×1 convolution performs a weighted fusion of so, and the Sigmoid activation then gives the final output sout, as in formula (11):

sout = Wso * so   (11),

together with formula (12), which appears only as an image in the original;

(5) The forward and backward score maps are each passed through the Sigmoid activation, giving 10 branch outputs (denoted by formula images in the original);
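Steps (3)-(5) together form a small output head. Since formulas (9)-(12) are shown only as images, the sketch below is an interpretation of the surrounding text, assuming PyTorch; the class name and the use of separate 1×1 score convolutions per stage are assumptions.

```python
import torch
import torch.nn as nn

class OutputHead(nn.Module):
    """Sketch of steps (3)-(5): per-branch 1-channel score maps, channel
    concatenation, 1x1 weighted fusion and Sigmoid outputs."""

    def __init__(self, channels=21, n_stages=5):
        super().__init__()
        self.score_f = nn.ModuleList(nn.Conv2d(channels, 1, 1) for _ in range(n_stages))
        self.score_b = nn.ModuleList(nn.Conv2d(channels, 1, 1) for _ in range(n_stages))
        self.fuse = nn.Conv2d(2 * n_stages, 1, kernel_size=1)   # weighted fusion, formula (11)

    def forward(self, h_fwd, h_bwd):
        # step (3): reduce every branch feature tensor to a single channel
        s_f = [conv(h) for conv, h in zip(self.score_f, h_fwd)]
        s_b = [conv(h) for conv, h in zip(self.score_b, h_bwd)]
        # step (4): concatenate along dim 1, fuse with a 1x1 conv, apply Sigmoid
        so = torch.cat(s_f + s_b, dim=1)
        sout = torch.sigmoid(self.fuse(so))
        # step (5): the 10 side outputs, each through a Sigmoid
        side = [torch.sigmoid(s) for s in s_f + s_b]
        return sout, side
```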

(6) The cross entropy l_w(X_i) is used to compute the loss between all outputs and the labels, and the whole network is optimized by minimizing the loss L_w. Because the labels in the data set are annotated by several people, the technical scheme takes a weighted average of the multiple labels and divides them according to a threshold θ: pixels with y_i < θ are treated as ambiguous points for which no loss is computed, y_i = θ marks non-edge points, and y_i ≥ θ marks edge points. The per-pixel loss and the weighting coefficients are given by formulas (13)-(15) (shown only as images in the original), where |Y+| and |Y-| denote the total numbers of positive and negative samples in each batch, the hyper-parameter λ balances positive and negative samples, X denotes the network output and y_i is a label pixel. The final loss function is formula (16) (shown only as an image in the original); in formula (16), the two image symbols denote the outputs of the k-th stages of the forward and backward recurrent networks respectively, sout is the output value of the final network, and |I| is the number of pixels of image I;
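A sketch of the class-balanced cross entropy described in this step, assuming PyTorch. Formulas (13)-(16) appear only as images, so the α/β weighting and the treatment of ambiguous pixels (labels strictly between 0 and θ ignored, label 0 negative) follow the usual convention for multi-annotator edge labels and should be read as assumptions rather than the exact patent formulas.

```python
import torch
import torch.nn.functional as F

def weighted_bce_loss(pred, label, theta=0.5, lam=1.1):
    """Sketch of the balanced cross entropy of step (6).

    pred  : network output in (0, 1), shape (B, 1, H, W)
    label : weighted-average annotation in [0, 1], same shape
    Pixels with 0 < label < theta are treated as ambiguous and ignored
    (an assumption; the patent shows formulas (13)-(16) only as images).
    """
    pos = (label >= theta).float()                 # edge pixels
    neg = (label == 0).float()                     # non-edge pixels
    n_pos, n_neg = pos.sum(), neg.sum()
    alpha = lam * n_neg / (n_pos + n_neg + 1e-6)   # weight on positives
    beta = n_pos / (n_pos + n_neg + 1e-6)          # weight on negatives
    weight = alpha * pos + beta * neg              # ambiguous pixels get weight 0
    return F.binary_cross_entropy(pred, pos, weight=weight, reduction='sum')
```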

(7) Steps (1) to (6) are repeated until the network converges.
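A minimal training-loop sketch of this step, assuming the weighted_bce_loss sketch above and a model that returns the fused output sout together with the 10 side outputs; the optimizer, learning rate and epoch count are assumptions.

```python
import torch

def train(model, loader, epochs=30, lr=1e-4):
    """Sketch of step (7): repeat steps (1)-(6) until the network converges."""
    opt = torch.optim.Adam(model.parameters(), lr=lr)
    for _ in range(epochs):
        for image, label in loader:
            sout, side_outputs = model(image)             # steps (2)-(5)
            loss = weighted_bce_loss(sout, label)         # step (6), fused output
            for s in side_outputs:
                loss = loss + weighted_bce_loss(s, label)  # step (6), side outputs
            opt.zero_grad()
            loss.backward()
            opt.step()
```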

The technical scheme uses a bidirectional recurrent neural network to strengthen the intrinsic relations among multi-scale contexts, addressing the problem that current multi-scale methods make insufficient use of the information at each scale and therefore achieve limited edge detection accuracy. Exploiting the fact that the receptive fields of the multi-scale contexts grow from small to large when ordered as a sequence, the method analyses and models the multi-scale context information along the time dimension: following the idea of recurrent neural networks, top-down and bottom-up recurrent branches progressively capture context relations in different directions and over different ranges, thereby enhancing the representational power of the feature pyramid. The method has the advantages of high detection accuracy and fast speed.
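To make the bidirectional recursion concrete, the following sketch shows how the 5 CSU feature groups could be run through the forward and backward CLSTM chains (using the CLSTMUnit sketch above); the zero initial state, the function name and the assumption that all 5 feature maps share the same spatial size are not stated in the patent.

```python
import torch

def bidirectional_recursion(units_fwd, units_bwd, feats):
    """feats: list [x1, ..., x5] of (B, C, H, W) tensors from the CSU modules.
    units_fwd / units_bwd: the 5 CLSTM units of the forward / backward chain.
    Returns the two lists of 5 hidden states, both ordered x1 -> x5."""
    def run(units, xs):
        h = torch.zeros_like(xs[0])          # assumed zero initial output
        c = torch.zeros_like(xs[0])          # assumed zero initial memory
        outs = []
        for unit, x in zip(units, xs):
            h, c = unit(x, h, c)
            outs.append(h)
        return outs
    h_fwd = run(units_fwd, feats)                 # small -> large context
    h_bwd = run(units_bwd, feats[::-1])[::-1]     # large -> small, re-ordered
    return h_fwd, h_bwd
```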

By capturing the intrinsic relations among multi-scale context information, the method improves the accuracy of edge detection and has the advantage of high speed.

Brief Description of the Drawings

Fig. 1 is a schematic diagram of the internal structure of the CLSTM unit in the embodiment;

Fig. 2 is a schematic diagram of the overall network structure in the embodiment;

Fig. 3 is a schematic flowchart of the method in the embodiment.

Detailed Description of the Embodiments

The content of the invention is further described below with reference to the drawings and the embodiment, which do not limit the invention.

Embodiment:

Referring to Figs. 2 and 3, the embodiment carries out the perceptual edge detection method based on a context enhancement network described above; steps (1) to (7) are performed exactly as set out in the Summary of the Invention, and the internal structure of the CLSTM unit used in step (2) is shown in Fig. 1.

Claims (1)

1. A perceptual edge detection method based on a context enhancement network, characterized by comprising the following steps:

(1) acquiring an image training data set and training an edge detection model, the edge detection model comprising a feature extraction process, a bidirectional neural-network recursion process and a classification process, specifically:

feature extraction: the sample image x is mapped to 5 groups of d-dimensional features; the feature extraction network consists of 5 CSU modules and, as a trainable network, maps the input sample image x to 5 groups of 21-dimensional features x_i, i ∈ {1~5}; each CSU module contains a lateral branch and a vertical branch, the number of vertical convolutions N of the 5 CSU modules is {2, 2, 3, 3, 3} from front to back, the vertical branch extracts high-dimensional image features and the lateral branch aggregates features and performs upsampling, the process being expressed by formula (1) and formula (2) (shown only as images in the original), where i indexes the i-th CSU module, n indexes the n-th vertical convolution inside a CSU module, W denotes the vertical convolution kernel parameters (kernel size 3×3 throughout, likewise below), dW denotes the lateral convolution kernel parameters (kernel size 1×1), Φ is the ReLU activation function, up(·) is the bilinear interpolation function, and the aggregation operation is implemented by a 1×1 convolution;

(2) feeding the 5 feature groups [x1, x2, x3, x4, x5], in forward and reverse order respectively, into two recurrent neural networks to obtain the 5 feature outputs of the forward recurrent network and the 5 outputs of the backward recurrent network, wherein each recurrent network consists of 5 CLSTM units connected in series, a CLSTM unit having three parts, an input gate i_t, an output gate o_t and a forget gate f_t, and the recursion proceeding as follows:

input: the memory feature c_{t-1} of the previous CLSTM unit, the input x_t at the current step, and the output h_{t-1} of the previous CLSTM unit,

2-1) the forget gate f_t filters the memory feature c_{t-1} of the previous CLSTM unit, the output c' being given by formula (3):

c' = f_t · c_{t-1}   (3),

f_t = σ(W_f * [x_t, h_{t-1}] + b_f)   (4),

where W_f is a 1×1 convolution weight, b_f is a bias term, x_t is the input at the current step, h_{t-1} is the output of the previous CLSTM unit, σ is the Sigmoid activation function and c_{t-1} is the memory feature of the previous CLSTM unit; the forget gate selects and filters the memory features of the previous unit by fusing the feature information of x_t and h_{t-1} with a 1×1 convolution and producing a weight map through the Sigmoid activation, which is used to select and filter the memory features;

2-2) the input gate i_t filters the input features and weights them with the filtered memory feature c', giving the memory feature c_t of the current CLSTM unit by formula (5):

c_t = c' + (1 + i_t) × Φ(W_x * [x_t, h_{t-1}] + b_c)   (5),

i_t = σ(W_i * [x_t, h_{t-1}] + b_i)   (6),

where W_x and W_i are convolution weights, b_i and b_c are bias terms, Φ is the ReLU function, σ is the Sigmoid function and * denotes convolution; the input gate models the importance of the input features and generates a feature weight matrix that selectively enhances them, the current CLSTM unit fusing the feature information of x_t and h_{t-1} through a 1×1 convolution and producing a weight map via the Sigmoid activation, which selectively enhances the fused input features;

2-3) the output gate o_t selectively enhances the memory feature, giving the current CLSTM output h_t by formula (7):

h_t = (o_t + 1) × Φ(c_t)   (7),

o_t = σ(W_o * [x_t, h_{t-1}] + b_o)   (8),

where Φ is the ReLU function, σ is the Sigmoid function, * denotes convolution, W_o is a convolution weight and b_o is a bias term;

output: the memory feature c_t of the current CLSTM unit and the output h_t of the current CLSTM unit;

(3) passing the obtained forward-branch feature tensors and backward-branch feature tensors each through a convolution that reduces the number of feature channels to 1, the process being shown in formula (9) and formula (10) (shown only as images in the original);

(4) concatenating the resulting feature tensors along channel dimension 1 to obtain the tensor so, performing a weighted fusion of so with a 1×1 convolution and obtaining the final output sout through the Sigmoid activation, as in formula (11):

sout = Wso * so   (11);

(5) passing the forward and backward branch score maps each through the Sigmoid activation to obtain 10 branch outputs;

(6) computing the loss between all outputs and the labels with the cross entropy l_w(X_i), optimizing the whole network by minimizing the loss L_w, taking a weighted average of the multiple labels and dividing them according to a threshold θ, pixels with y_i < θ being ambiguous points for which no loss is computed, y_i = θ being non-edge points and y_i ≥ θ being edge points; the per-pixel loss and the weighting coefficients are given by formulas (13)-(15) (shown only as images in the original), where |Y+| and |Y-| denote the total numbers of positive and negative samples in each batch, the hyper-parameter λ balances positive and negative samples, X denotes the network output and y_i is a label pixel; the final loss function is formula (16) (shown only as an image in the original), in which the two image symbols denote the outputs of the k-th stages of the forward and backward recurrent networks respectively, sout is the output value of the final network, and |I| is the number of pixels of image I;

(7) repeating steps (1) to (6) until the network converges.
CN202010965729.8A 2020-09-15 2020-09-15 Perception edge detection method based on context enhancement network Active CN112070784B (en)

Priority Applications (1)

Application Number Priority Date Filing Date Title
CN202010965729.8A CN112070784B (en) 2020-09-15 2020-09-15 Perception edge detection method based on context enhancement network

Applications Claiming Priority (1)

Application Number Priority Date Filing Date Title
CN202010965729.8A CN112070784B (en) 2020-09-15 2020-09-15 Perception edge detection method based on context enhancement network

Publications (2)

Publication Number Publication Date
CN112070784A true CN112070784A (en) 2020-12-11
CN112070784B CN112070784B (en) 2022-07-01

Family

ID=73696724

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202010965729.8A Active CN112070784B (en) 2020-09-15 2020-09-15 Perception edge detection method based on context enhancement network

Country Status (1)

Country Link
CN (1) CN112070784B (en)

Citations (10)


Patent Citations (10)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
CN106570497A (en) * 2016-10-08 2017-04-19 中国科学院深圳先进技术研究院 Text detection method and device for scene image
US10701394B1 (en) * 2016-11-10 2020-06-30 Twitter, Inc. Real-time video super-resolution with spatio-temporal networks and motion compensation
CN107180248A (en) * 2017-06-12 2017-09-19 桂林电子科技大学 Strengthen the hyperspectral image classification method of network based on associated losses
US10289903B1 (en) * 2018-02-12 2019-05-14 Avodah Labs, Inc. Visual sign language translation training device and method
CN108595632A (en) * 2018-04-24 2018-09-28 福州大学 A kind of hybrid neural networks file classification method of fusion abstract and body feature
CN109886971A (en) * 2019-01-24 2019-06-14 西安交通大学 A method and system for image segmentation based on convolutional neural network
CN110322009A (en) * 2019-07-19 2019-10-11 南京梅花软件系统股份有限公司 Image prediction method based on the long Memory Neural Networks in short-term of multilayer convolution
CN110705457A (en) * 2019-09-29 2020-01-17 核工业北京地质研究院 Remote sensing image building change detection method
CN111222580A (en) * 2020-01-13 2020-06-02 西南科技大学 High-precision crack detection method
CN111539916A (en) * 2020-04-08 2020-08-14 中山大学 Image significance detection method and system for resisting robustness

Non-Patent Citations (10)

* Cited by examiner, † Cited by third party
Title
H CHO et al.: "Biomedical named entity recognition using deep neural networks with contextual information", 《BMC BIOINFORMATICS》 *
JINZHENG CAI et al.: "Improving deep pancreas segmentation in CT and MRI images via recurrent neural contextual learning and direct loss function", 《COMPUTER VISION AND PATTERN RECOGNITION》 *
RUNMIN WU et al.: "A mutual learning method for salient object detection with intertwined multi-supervision", 《PROCEEDINGS OF THE IEEE/CVF CONFERENCE ON COMPUTER VISION AND PATTERN RECOGNITION (CVPR)》 *
刘鹏里: "Research on an edge detection algorithm based on feature re-extraction with deep convolutional neural networks", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 *
杨国花: "Research and implementation of dialogue state tracking based on cascaded neural networks", 《中国博士学位论文全文数据库 (信息科技辑)》 *
欧阳宁 et al.: "Image super-resolution reconstruction combining perceptual edge constraints and a multi-scale fusion network", 《计算机应用》 *
王帅帅 et al.: "Lane line detection based on fully convolutional neural networks", 《数字制造科学》 *
王雅玲: "Research on text classification based on word-sense-disambiguation convolutional neural networks", 《中国优秀硕士学位论文全文数据库 (信息科技辑)》 *
秦锋 et al.: "Topic-fused CLSTM sentiment classification of short texts", 《安徽工业大学学报(自然科学版)》 *
马惠珠 et al.: "Research directions and keywords of computer-aided project acceptance: 2012 acceptance and notes for 2013", 《电子与信息学报》 *

Also Published As

Publication number Publication date
CN112070784B (en) 2022-07-01

Similar Documents

Publication Publication Date Title
CN108830855B (en) Full convolution network semantic segmentation method based on multi-scale low-level feature fusion
Chen et al. Crowd counting with crowd attention convolutional neural network
Wang et al. FE-YOLOv5: Feature enhancement network based on YOLOv5 for small object detection
CN109993220B (en) Multi-source remote sensing image classification method based on two-way attention fusion neural network
CN110852383B (en) Target detection method and device based on attention mechanism deep learning network
CN105701508B (en) Global local optimum model and conspicuousness detection algorithm based on multistage convolutional neural networks
CN112686304B (en) Target detection method and device based on attention mechanism and multi-scale feature fusion and storage medium
CN113052210A (en) Fast low-illumination target detection method based on convolutional neural network
CN110298387A (en) Incorporate the deep neural network object detection method of Pixel-level attention mechanism
CN110533041B (en) Regression-based multi-scale scene text detection method
CN110378288A (en) A kind of multistage spatiotemporal motion object detection method based on deep learning
CN110633708A (en) Deep network significance detection method based on global model and local optimization
CN113743505A (en) An improved SSD object detection method based on self-attention and feature fusion
CN113435254A (en) Sentinel second image-based farmland deep learning extraction method
CN115631369A (en) A fine-grained image classification method based on convolutional neural network
CN113128308B (en) Pedestrian detection method, device, equipment and medium in port scene
CN112733942A (en) Variable-scale target detection method based on multi-stage feature adaptive fusion
CN108664968B (en) An Unsupervised Text Localization Method Based on Text Selection Model
CN117372898A (en) Unmanned aerial vehicle aerial image target detection method based on improved yolov8
CN112528058B (en) Fine-grained image classification method based on image attribute active learning
CN115459996B (en) Network intrusion detection method based on gated convolution and feature pyramid
CN110675405B (en) One-shot image segmentation method based on attention mechanism
CN113065426A (en) Gesture image feature fusion method based on channel perception
CN116503726A (en) Multi-scale light smoke image segmentation method and device
Wang et al. SLMS-SSD: Improving the balance of semantic and spatial information in object detection

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
GR01 Patent grant
TR01 Transfer of patent right

Effective date of registration: 20240927

Address after: Room 2, 9th Floor, Unit 1, Building 15, No. 63-1 Taoyuan Road, Qingxiu District, Nanning City, Guangxi Zhuang Autonomous Region 530000

Patentee after: Guangxi Zhengshichang Information Technology Co.,Ltd.

Country or region after: China

Address before: 541004 1 Jinji Road, Qixing District, Guilin, the Guangxi Zhuang Autonomous Region

Patentee before: Guilin University of Electronic Technology

Country or region before: China