U-Shape Network For Chip Surface Defect Detection
Abstract—Deep learning methods have achieved significant progress in image segmentation and have been successfully applied in many practical settings. However, common methods and models are usually suitable only for situations where the segmentation target occupies a large portion of the input image and differs clearly from the background, and they require large, high-quality datasets with balanced categories. In industrial defect detection, defects often occupy a small area of the image and are difficult to distinguish from the background; defect edges are often blurry, and the differences in shape and color between defect categories are small. Moreover, acquiring and labeling large amounts of defect data in industrial settings is expensive, so the training data available in practice is often small and imbalanced. These characteristics make it difficult to train common image detection models on industrial defect datasets. To address these issues, we propose a novel network architecture based on the Unet [25] segmentation model. The architecture effectively analyzes and exploits the image features extracted by commonly used backbone networks, so the model can detect defects that closely resemble the background and occupy small regions. We also introduce a practical method for training segmentation models on low-quality datasets. Compared with segmentation models commonly used in image segmentation, the proposed architecture performs better across various evaluation metrics. This research aims to provide a viable solution for industrial defect detection.

Index Terms—Image Segmentation, Defect Detection, Camouflaged Object Detection, Small Object Detection

I. INTRODUCTION

With the development and expansion of China's industry, the demand for electronic components in various sectors has been increasing. As one of the most commonly used electronic components, capacitors are now mass-produced by major manufacturers to meet the growing market demand. However, because the manufacturing process of capacitor components is complex, their quality directly affects the performance and reliability of electronic devices. The appearance inspection, electrical performance testing, and sorting and packaging of capacitors after production are therefore critical steps in ensuring product quality, and accurate, efficient detection of semiconductor capacitor components has become an important market demand.

In recent years, deep learning has been shown to be robust to background, illumination, color, shape, size, and intensity variations, which makes it particularly well suited to detecting complex surface defects in industrial environments. Because surface defects take many forms, the surface to be inspected may exhibit multiple defect types simultaneously. Several methods have emerged to address defect detection. Some supervised learning methods use datasets such as DAGM2007, road crack datasets [1], railway datasets [2], fabric datasets [3], silicon steel strip datasets [4], and railway defect datasets [5]. Work on supervised methods focuses mainly on obtaining efficient feature representations of images, and many network modules have been proposed for this purpose. For example, [6] uses an autoencoder network to learn representations of local anomalies and to find common features between different defects. Another method [7] proposes a multi-modal Gaussian pyramid scheme using patches, combined with convolutional denoising autoencoder (CDAE) networks at each level, for defect detection on textured surfaces; it learns the pattern distribution in a reference image. Method [8] detects insulators in traction power systems connected to electrified railways through segmentation. In addition to designing new network modules with CNNs [36], attention mechanisms have proven useful in many works. For example, [9] adds attention mechanisms to FPN [26] to
A. Image Segmentation

In 2015, Long et al. from the University of California, Berkeley, proposed the Fully Convolutional Network (FCN) [11], which performs pixel-level classification on input images

x_{i,j} represents the influence of the j-th channel on the i-th channel. Then, X is multiplied by V to obtain the channel-attention-enhanced feature F'.
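The surviving fragment above describes a channel-attention step in which an affinity matrix X, whose entry x_{i,j} weighs the influence of channel j on channel i, is applied to a value tensor V. Since the surrounding definitions were lost in extraction, the following is only a minimal PyTorch sketch of that pattern; the softmax normalization and the residual connection are assumptions, and all names are hypothetical.

```python
import torch
import torch.nn as nn

class ChannelAttention(nn.Module):
    """Sketch of the channel attention described in the fragment: an affinity
    matrix X, with x_ij weighing channel j's influence on channel i, is
    multiplied by a value tensor V to produce the enhanced feature F'."""

    def forward(self, f):                      # f: (B, C, H, W)
        b, c, h, w = f.shape
        q = f.view(b, c, -1)                   # (B, C, HW)
        k = f.view(b, c, -1).transpose(1, 2)   # (B, HW, C)
        x = torch.softmax(q @ k, dim=-1)       # (B, C, C) affinity matrix X
        v = f.view(b, c, -1)                   # (B, C, HW) values V
        out = (x @ v).view(b, c, h, w)         # X multiplied by V
        return out + f                         # residual (assumed), giving F'
```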
Fig. 1. The overall architecture of our model extracts features from each layer of the input image through the backbone. As the features progress from higher
layers to lower layers, the semantic information gradually strengthens while the spatial information gradually weakens. These features are then input into the
proposed MS fusion module to enhance the semantic and spatial information of each layer. The enhanced features are subsequently fed into the PFNet [32]
decoder, which outputs the final result.
by comparing patterns, such as texture and semantics, of ambiguous regions with confident regions to make the final decision. This paper proposes using the focus module in PFNet [32] to mimic this behavior and identify chip defects that are highly similar to the background.

First, a convolution reduces the number of channels of the next layer's output feature F_h from 512 to 1, giving a rough estimate P_{i+1} of the defect target's position. Then, in the focus module (FM), the current layer's feature F_c is multiplied by (1 − P_{i+1}), the complement of the rough position estimate from the next layer, and passed through a spatial convolution pyramid (CE) block; this matches Eq. (4) below. The spatial convolution pyramid applies 1x1, 3x3, 5x5, and 7x7 convolutions to give the same input feature different receptive fields; the outputs of the four convolutions are concatenated and projected back to the original channel count by a 1x1 convolution. This yields the feature F_fnd, which enhances the feature's ability to perceive contextual information of non-defective areas.

Next, F_c is multiplied by the single-channel map P_{i+1} itself and passed through another CE block to enhance the feature's ability to capture contextual information of the defect target area, yielding the feature F_fpd. The next layer's output F_h is then upsampled by interpolation to the current layer's feature size; F_fpd is subtracted element-wise from the upsampled feature, and F_fnd is added element-wise, enhancing both false-positive and false-negative identification. The result is the current layer's enhanced feature F_i, which serves as the input to the previous layer's focus module. The input to the second layer's focus module is the output feature of the bottom layer's attention mechanism module.

For F_fpd, F_fnd, and the output feature F_i:

F_fpd = CE(F_c ∗ Up(P_{i+1}))    (3)

F_fnd = CE(F_c ∗ Up(1 − P_{i+1}))    (4)

F_i = BR(BR(Up(CBR(F_h)) − F_fpd) + F_fnd)    (5)
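A minimal PyTorch sketch of the focus-module pipeline described above and in Eqs. (3)-(5). The CE block is the 1x1/3x3/5x5/7x7 spatial convolution pyramid; module and argument names are ours, the sigmoid applied to the rough position map is an assumption, and this is an illustration of the described computation rather than the PFNet reference implementation.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

def cbr(cin, cout, k=3):
    """3x3 convolution - batch normalization - ReLU block used in the text."""
    return nn.Sequential(nn.Conv2d(cin, cout, k, padding=k // 2),
                         nn.BatchNorm2d(cout), nn.ReLU(inplace=True))

class CE(nn.Module):
    """Spatial convolution pyramid: 1x1/3x3/5x5/7x7 branches are concatenated,
    then a 1x1 convolution restores the original channel count."""
    def __init__(self, channels):
        super().__init__()
        self.branches = nn.ModuleList(
            [nn.Conv2d(channels, channels, k, padding=k // 2) for k in (1, 3, 5, 7)])
        self.fuse = nn.Conv2d(4 * channels, channels, 1)

    def forward(self, x):
        return self.fuse(torch.cat([b(x) for b in self.branches], dim=1))

class Focus(nn.Module):
    """Eqs. (3)-(5): F_fpd = CE(F_c * Up(P)), F_fnd = CE(F_c * Up(1 - P)),
    F_i = BR(BR(Up(CBR(F_h)) - F_fpd) + F_fnd)."""
    def __init__(self, channels):
        super().__init__()
        self.ce_fpd, self.ce_fnd = CE(channels), CE(channels)
        self.cbr_h = cbr(channels, channels)
        self.br1 = nn.Sequential(nn.BatchNorm2d(channels), nn.ReLU(inplace=True))
        self.br2 = nn.Sequential(nn.BatchNorm2d(channels), nn.ReLU(inplace=True))

    def forward(self, f_c, f_h, p_next):
        size = f_c.shape[-2:]
        # Upsample the rough 1-channel position estimate; sigmoid is assumed.
        p = torch.sigmoid(F.interpolate(p_next, size=size,
                                        mode='bilinear', align_corners=False))
        f_fpd = self.ce_fpd(f_c * p)          # eq. (3): defect-area context
        f_fnd = self.ce_fnd(f_c * (1 - p))    # eq. (4): background context
        f_h_up = F.interpolate(self.cbr_h(f_h), size=size,
                               mode='bilinear', align_corners=False)
        return self.br2(self.br1(f_h_up - f_fpd) + f_fnd)   # eq. (5)
```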
III. METHODOLOGY

Our motivation is to leverage camouflaged object detection methods to accomplish the task of chip surface defect detection. Since PFNet [32] achieves both high detection accuracy and fast inference, we chose to modify the PFNet architecture to suit our task. To address the three main challenges of this task—significant variability among segmentation target categories, the non-fixed positions of segmentation targets, and the extreme similarity between targets and the background—we add a multi-scale feature fusion (MS fusion) module to the PFNet architecture.
This module enhances the semantic and spatial integration of the features input to the decoder. Additionally, we introduce a category loss and a position loss to force the model to develop a clearer understanding of the edges between defect categories and of their locations relative to the background, thereby improving segmentation accuracy.

For an input image I, we first perform feature extraction using PVT [35] (Pyramid Vision Transformer), obtaining four intermediate feature outputs {f_k}_{k=1}^4 with channel sizes of 512, 320, 128, and 64, respectively. To reduce the computational load, these four features are first channel-reduced to 512, 256, 128, and 64, respectively, before being input into the multi-scale feature fusion module.

A. Multi-Scale Feature Fusion

For the task of camouflaged object detection, it is crucial to enhance shallow, information-rich features with deep, semantically rich features. This combination allows the network to leverage semantic information for accurate classification of ambiguous targets while maintaining clear boundaries between targets and the background, and it is also effective for detecting small defects in chip defect datasets. According to research by Guang Chen et al. [22], the fundamental challenge in small object detection is the limited coverage area, which restricts the detector's ability to capture sufficient information. Small objects often exist in specific contexts or alongside other objects, so contextual information can supplement the limited features they provide; contextual-information methods improve small object detection accuracy by exploiting the relationships between small objects and other objects or the background. Based on this insight, we propose a multi-scale feature fusion module to integrate the features {f_k}_{k=1}^4 and enhance the contextual and semantic information of each feature. For the output features f_k and f_{k+1} from adjacent layers, we first resize f_{k+1} through interpolation to match the size of f_k. Each feature then undergoes a 3x3 convolution–batch normalization–ReLU (CBR) operation, and the output features are concatenated along the channel dimension. Finally, another 3x3 CBR operation adjusts the number of channels to match that of f_k, producing the output feature f'_{k+1}.

The output feature f'_{k+1} of the multi-scale feature fusion module is computed as:

f'_{k+1} = CBR(concat(CBR(f_k), CBR(interpolate(f_{k+1}))))    (6)

Through the above operations, f'_{k+1} captures the semantic information and underlying features of the input image better than f_k. This enhances the model's understanding of the boundaries between the segmentation target and the background. Additionally, the larger f_k provides more contextual information to the smaller but more semantically rich f_{k+1}, laying a better foundation for identifying small-area chip defects.
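Eq. (6) translates directly into a small fusion block. The sketch below is our reading of the module under assumed bilinear interpolation, with hypothetical names; the CBR helper is the 3x3 convolution–batch normalization–ReLU operation defined in the text.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MSFusion(nn.Module):
    """Eq. (6): f'_{k+1} = CBR(concat(CBR(f_k), CBR(interpolate(f_{k+1})))),
    fusing two adjacent features with channel counts c_k and c_k1."""
    def __init__(self, c_k, c_k1):
        super().__init__()
        def cbr(cin, cout):   # 3x3 conv - batch norm - ReLU
            return nn.Sequential(nn.Conv2d(cin, cout, 3, padding=1),
                                 nn.BatchNorm2d(cout), nn.ReLU(inplace=True))
        self.cbr_k, self.cbr_k1 = cbr(c_k, c_k), cbr(c_k1, c_k1)
        self.cbr_out = cbr(c_k + c_k1, c_k)   # match the channel count of f_k

    def forward(self, f_k, f_k1):
        # Resize the deeper feature to the shallower feature's resolution.
        f_k1 = F.interpolate(f_k1, size=f_k.shape[-2:],
                             mode='bilinear', align_corners=False)
        return self.cbr_out(torch.cat([self.cbr_k(f_k), self.cbr_k1(f_k1)], dim=1))
```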
Humans, after careful analysis, can effectively distinguish distraction regions by performing contextual reasoning: they compare patterns such as texture and semantics between ambiguous regions and confident regions to make a final decision. This paper proposes using the decoder from PFNet [32] to mimic this behavior, identifying chip defects that are highly similar to the background and acquiring the enhanced features {f''_k}_{k=1}^4.

Unlike camouflaged object detection, a single-class task that only needs to separate camouflaged objects from the background, the chip defect dataset in this paper further requires classifying the camouflaged objects. For the image segmentation task, although only each pixel of the input image needs to be classified, the relationships between pixels also play an important role in classifying each pixel. A method [23] has therefore been proposed that divides semantic segmentation into two tasks, pixel prediction and pixel grouping: after the network predicts the segmented image, the segmentation results are further processed to explore the relationships between pixels of the same category, aiming to further enhance segmentation capability.

The feature output of each layer of the feature extraction network, {f''_k}_{k=1}^4, is processed through the SA [38] (Spatial Attention) module. Assuming the input to the module is X_in and the output is X_out, the module is computed as follows:

X_attn = Up(CBR(APool(X_in)))    (7)

X_out = X_attn ∗ CBR(X_in) + X_attn    (8)
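A sketch of the SA module as given by Eqs. (7)-(8). The paper does not state the pooling type or stride behind APool, so average pooling with stride 2 is assumed here; the names are hypothetical.

```python
import torch.nn as nn
import torch.nn.functional as F

class SA(nn.Module):
    """Eqs. (7)-(8): X_attn = Up(CBR(APool(X_in))),
    X_out = X_attn * CBR(X_in) + X_attn."""
    def __init__(self, channels, pool_stride=2):
        super().__init__()
        def cbr(c):   # 3x3 conv - batch norm - ReLU
            return nn.Sequential(nn.Conv2d(c, c, 3, padding=1),
                                 nn.BatchNorm2d(c), nn.ReLU(inplace=True))
        self.pool = nn.AvgPool2d(pool_stride)   # APool: assumed average pooling
        self.cbr_attn, self.cbr_main = cbr(channels), cbr(channels)

    def forward(self, x):
        x_attn = F.interpolate(self.cbr_attn(self.pool(x)), size=x.shape[-2:],
                               mode='bilinear', align_corners=False)   # eq. (7)
        return x_attn * self.cbr_main(x) + x_attn                      # eq. (8)
```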
To improve the model's ability to learn input features, this paper proposes a multi-scale supervised training method to train the proposed defect detection network. Let the output features of the PFNet [32] decoder be {F_i}_{i=1}^4. These features are passed through a 3x3 convolution that compresses the number of channels to the number of chip defect classes C, yielding the per-pixel predicted features {F_i}_{pre}. Then {F_i}_{pre} is input into the SA module [38] for pixel aggregation, giving the network output features {F_i}_{MergePre}.

For each output feature {F_i}_{MergePre}, global pooling is applied to predict the categories contained in the input image, giving {F_i}_{CategoryPre}. Additionally, {F_i}_{MergePre} is passed through a 3x3 convolution that compresses the channels to 1, giving a prediction of all defect target locations in the input image, denoted {F_i}_{PosPre}.

Therefore, to train the network we require twelve outputs: {F_i}_{MergePre}, {F_i}_{PosPre}, and {F_i}_{CategoryPre} for i = 1, ..., 4. The overall loss is calculated as follows:
Loss_overall = Loss_predict + Loss_pos + Loss_category    (9)

Loss^i_predict = OHEMCrossEntropy(F^i_MergePre, GT) + LovaszLoss(F^i_MergePre, GT)    (10)

Loss^i_category = CrossEntropy(F^i_CategoryPre, GT_category)    (11)
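The multi-scale supervision of Eqs. (9)-(11) can be sketched as follows. The OHEM cross-entropy [33] and Lovasz softmax losses are taken as given (any standard implementation can be injected), the position loss is assumed to be a binary cross-entropy since its exact form is not reproduced here, and ground truths are resized to each scale.

```python
import torch
import torch.nn.functional as F

def overall_loss(merge_pre, pos_pre, cat_pre, gt, gt_pos, gt_cat,
                 ohem_ce, lovasz_softmax):
    """Eq. (9): sums eq. (10), a position loss, and eq. (11) over the four
    decoder scales (the twelve supervised outputs). ohem_ce / lovasz_softmax
    are injected loss callables; shapes: merge_pre[i] (B, C, h, w),
    pos_pre[i] (B, 1, h, w), cat_pre[i] (B, C), gt (B, H, W) long."""
    loss = 0.0
    for m, p, c in zip(merge_pre, pos_pre, cat_pre):   # i = 1..4
        # Resize ground truths to each scale's resolution.
        g = F.interpolate(gt.unsqueeze(1).float(), size=m.shape[-2:],
                          mode='nearest').squeeze(1).long()
        gp = F.interpolate(gt_pos, size=p.shape[-2:], mode='nearest')
        loss = loss + ohem_ce(m, g) + lovasz_softmax(m, g)       # eq. (10)
        loss = loss + F.binary_cross_entropy_with_logits(p, gp)  # Loss_pos (assumed form)
        loss = loss + F.cross_entropy(c, gt_cat)                 # eq. (11)
    return loss
```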
TABLE I: COMPARISON RESULTS

TABLE II: ABLATION RESULTS
comparison models due to human intervention. The compared models include FPN [26], Unet++ [25], DeepLabv3+ [27], MANet [28], LinkNet [29], PSPNet [30], and PAN [31]. We also compared our method with the model currently used by Nuoding Intelligent Technology Co., Ltd. Because the platform on which this model is deployed is confidential, we could not obtain its implementation and training parameters, so the comparison is limited to MIOU, Precision, and Recall.

All compared networks, except the company's model, were trained on the same RTX 2080 Ti GPU with 11GB of memory, and the AdamW optimizer was used throughout. Since the comparison models were not specifically designed for chip defect detection, their original training methods, such as loss functions, are not suitable for this task. To showcase the feature extraction capability of the proposed model for chip defects, the loss functions used to train the comparison models, except for the proposed method and the company's model, were Online Hard Example Mining (OHEM) [33] cross-entropy loss and Lovasz softmax loss. This aimed to maximize the adaptability of the comparison models to the chip defect dataset.
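For reference, the reported MIOU, Precision, and Recall can be computed from a per-class confusion matrix as sketched below. The paper does not spell out its exact averaging convention, so this sketch uses the common per-class means.

```python
import numpy as np

def miou_precision_recall(conf):
    """Metrics from a (C x C) confusion matrix whose entry [t, p] counts
    pixels of true class t predicted as class p."""
    tp = np.diag(conf).astype(float)
    fp = conf.sum(axis=0) - tp          # predicted as class c but wrong
    fn = conf.sum(axis=1) - tp          # class-c pixels that were missed
    iou = tp / np.maximum(tp + fp + fn, 1)
    precision = tp / np.maximum(tp + fp, 1)
    recall = tp / np.maximum(tp + fn, 1)
    return iou.mean(), precision.mean(), recall.mean()
```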
For fairness, the implementations of the multi-category segmentation networks were all taken from publicly available sources. Only the training code was modified; all other implementation code was left unchanged. All networks were tested on the same chip defect detection dataset, with identical test code and metric-processing code. The results are shown in Table I.

From the results in Table I, the proposed method outperforms common multi-class segmentation networks across the comparison metrics on the chip defect dataset. Compared with the best-performing Unet++ [25], our method achieves a 5.321% improvement in the MIOU metric. Figure 4 illustrates some prediction results of the common models alongside those of our method.

For larger defect targets, our method evidently achieves segmentation results that are closer to the actual defect targets, with clearer segmentation boundaries, than the other methods, which often struggle to accurately capture the relationship between defect targets and the background. This indicates that our method effectively leverages the features of the input image to accurately locate defects, distinguishing defect targets that closely resemble the background in shape and color.

Regarding smaller defect targets, while the other comparison models often fail to identify such small defects, our method can predict their approximate locations, and its predictions overlap strongly with the actual segmentation results. This demonstrates that our method effectively integrates features across different scales and can analyze contextual information near defect targets: it identifies differences between small defect targets and background features, thereby achieving accurate segmentation predictions for small defect targets.
E. Ablation Analysis

To demonstrate the differences and effectiveness of the method proposed in this paper relative to the original PFNet, we conducted corresponding ablation experiments. Since our network is an improved version of the camouflaged object detection method PFNet [32], we used PFNet as the baseline model. However, camouflaged object detection is a binary image segmentation task and cannot be directly transferred to the task addressed in this paper. We therefore set the baseline model to a self-modified multi-class PFNet trained with the training method proposed in this paper.

The specific modifications include adding a 7x7 convolutional layer after each focus module to reduce the number of channels of the focus module's output to the number of defect classes in the dataset; a sketch of this head is given at the end of this section. Additionally, the loss function uses Online Hard Example Mining (OHEM) [33] cross-entropy loss and Lovasz softmax loss to supervise the outputs at all scales. The backbone, as in the other ablation comparisons, is pvt_v2_b2 [35] pre-trained on ImageNet-1k [34]. Apart from the added convolutional layer and the training loss, optimizer, learning rate, and feature extraction backbone, the other implementation details remain consistent with the original PFNet [32] implementation. The results are shown in Table II.

Effectiveness of the Category Loss and Position Loss: By reducing the update distance from the predicted map to the position prediction, the similarity between the position prediction and the actual segmentation position can be effectively improved. This also effectively guides the focus module in identifying false positives and false negatives, ultimately enhancing the final prediction accuracy. Additionally, aggregating the features of each predicted pixel through the SA [38] module strengthens the relationships between pixels of the same defect category, further enhancing the prediction output features and improving prediction accuracy. The class loss suppresses the model's random predictions, allowing it to focus more on overall class predictions and to consider the relationships between pixels of the same defect category. Incorporating these methods yields a 1% improvement in the mIoU metric. The performance of the self-modified multi-class PFNet [32] is close to that of the best-performing comparison model, Unet++ [25]. This can be attributed to PFNet's U-shaped network architecture, which can be seen as an improvement over Unet; however, it lacks steps for fusing and enhancing features across different scales. Unet++, on the other hand, modifies Unet with many lateral connections during down-sampling and up-sampling, allowing better feature fusion across scales. However, Unet++ is not directly suitable for the chip defect detection task because, unlike PFNet, it cannot effectively distinguish defect targets from the background. Both models therefore have advantages and disadvantages for the chip defect detection task, resulting in similar performance.

Effectiveness of the Multi-Scale Feature Fusion Module: Combining features across different scales enhances the contextual information contained in the features while also enhancing semantic information and low-level features. This improves the model's ability to use contextual information to distinguish large defect targets from the background. Additionally, it strengthens the combination of semantic information and low-level features across scales, allowing the model to segment defect target positions while correctly classifying defect targets. More precise intermediate features also enable the model to determine defect positions more accurately during up-sampling, facilitating better learning and making the designed class loss and position loss more effective in assisting the model's training.
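For clarity, the baseline modification described in the ablation setup, a 7x7 convolution mapping each focus module's output to the defect classes, can be sketched as follows; the padding choice (preserving spatial size) is an assumption.

```python
import torch.nn as nn

class AblationHead(nn.Module):
    """Ablation-baseline head: a 7x7 convolution after each focus module,
    mapping its output channels to the number of defect classes so every
    scale can be supervised directly."""
    def __init__(self, channels, num_classes):
        super().__init__()
        self.conv = nn.Conv2d(channels, num_classes, kernel_size=7, padding=3)

    def forward(self, focus_out):
        return self.conv(focus_out)   # per-scale class logits
```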
V. CONCLUSION

To address the challenge of accurately segmenting chip defect objects, this paper proposes a chip defect detection and segmentation method based on a U-shaped network architecture. The method combines camouflaged target detection and small target detection techniques, together with training on imbalanced sample datasets. Specifically, by integrating the camouflaged target detection method PFNet [32], the proposed approach enhances the ability to identify defect targets that are highly similar to the background. A multi-scale feature fusion module is introduced to merge features of different scales, mitigating PFNet's difficulty in detecting small targets and improving its ability to detect camouflaged defect targets. Additionally, a multi-scale output supervision training method that includes a class loss, a position loss, and the actual prediction loss is proposed to enhance the model's ability to learn defect features. Comparative experiments and ablation studies demonstrate the superiority of the proposed method over common multi-class image segmentation models, as well as the effectiveness of the proposed multi-scale feature fusion module and training methods.

REFERENCES

[1] Gan J, Li Q, Wang J, et al. A hierarchical extractor-based visual rail surface inspection system[J]. IEEE Sensors Journal, 2017, 17(23): 7935-7944.
[2] Silvestre-Blanes J, Albero-Albero T, Miralles I, et al. A public fabric database for defect detection methods and results[J]. Autex Research Journal, 2019, 19(4): 363-374.
[3] Song K, Yan Y. Micro surface defect detection method for silicon steel strip based on saliency convex active contour model[J]. Mathematical Problems in Engineering, 2013, 2013.
[4] Faghih-Roohi S, Hajizadeh S, Núñez A, et al. Deep convolutional neural networks for detection of rail surface defects[C]//2016 International Joint Conference on Neural Networks (IJCNN). IEEE, 2016: 2584-2589.
[5] Tao X, Zhang D, Ma W, et al. Automatic metallic surface defect detection and recognition with convolutional neural networks[J]. Applied Sciences, 2018, 8(9): 1575.
[6] Mei S, Yang H, Yin Z. An unsupervised-learning-based approach for automated defect inspection on textured surfaces[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 67(6): 1266-1277.
[7] Kang G, Gao S, Yu L, et al. Deep architecture for high-speed railway insulator surface defect detection: Denoising autoencoder with multitask learning[J]. IEEE Transactions on Instrumentation and Measurement, 2018, 68(8): 2679-2690.
[8] Liu Q, Liu M, Wang C, et al. An efficient CNN-based detector for photovoltaic module cells defect detection in electroluminescence images[J]. Solar Energy, 2024, 267: 112245.
[9] Lu Q, Lin J, Luo L, et al. A supervised approach for automated surface defect detection in ceramic tile quality control[J]. Advanced Engineering Informatics, 2022, 53: 101692.
[10] Zavrtanik V, Kristan M, Skočaj D. Reconstruction by inpainting for visual anomaly detection[J]. Pattern Recognition, 2021, 112: 107706.
[11] Long J, Shelhamer E, Darrell T. Fully convolutional networks for semantic segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2015: 3431-3440.
[12] Ronneberger O, Fischer P, Brox T. U-Net: Convolutional networks for biomedical image segmentation[C]//Medical Image Computing and Computer-Assisted Intervention–MICCAI 2015. Springer, 2015: 234-241.
[13] Cott H B. Adaptive Coloration in Animals. Methuen & Co. Ltd, 1940.
[14] Fan D P, Ji G P, Sun G, et al. Camouflaged object detection[C]//CVPR, 2020.
[15] Le T N, Nguyen T V, Nie Z, et al. Anabranch network for camouflaged object segmentation[J]. CVIU, 2019.
[16] Yin J, Han Y, Hou W, Li J. Detection of the mobile object with camouflage color under dynamic background based on optical flow[J]. Procedia Engineering, 2011.
[17] Pan Y, Chen Y, Fu Q, et al. Study on the camouflaged target detection method based on 3D convexity[J]. Modern Applied Science, 2011.
[18] Yan J, Le T N, Nguyen K D, et al. MirrorNet: Bio-inspired adversarial attack for camouflaged object segmentation[J]. arXiv:2007.12881, 2020.
[19] Sengottuvelan P, Wahi A, Shanmugam A. Performance of decamouflaging through exploratory image analysis[C]//ETET, 2008.
[20] Stevens M, Merilaita S. Animal camouflage: current issues and new perspectives[J]. Philosophical Transactions of the Royal Society B, 2009.
[21] Thayer G H, Thayer A H. Concealing-Coloration in the Animal Kingdom: An Exposition of the Laws of Disguise Through Color and Pattern, Being a Summary of Abbott H. Thayer's Discoveries. New York: The Macmillan Co, 1909.
[22] Chen G, Wang H, Chen K, et al. A survey of the four pillars for small object detection: Multiscale representation, contextual information, super-resolution, and region proposal[J]. IEEE Transactions on Systems, Man, and Cybernetics: Systems, 2020, 52(2): 936-953.
[23] Zhong Z, Lin Z Q, Bidart R, et al. Squeeze-and-attention networks for semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 13065-13074.
[24] Loshchilov I, Hutter F. Decoupled weight decay regularization[J]. arXiv preprint arXiv:1711.05101, 2017.
[25] Zhou Z, Siddiquee M M R, Tajbakhsh N, et al. UNet++: Redesigning skip connections to exploit multiscale features in image segmentation[J]. IEEE Transactions on Medical Imaging, 2019, 39(6): 1856-1867.
[26] Lin T Y, Dollár P, Girshick R, et al. Feature pyramid networks for object detection[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2117-2125.
[27] Firdaus-Nawi M, Noraini O, Sabri M Y, et al. DeepLabv3+ encoder-decoder with atrous separable convolution for semantic image segmentation[J]. Pertanika J. Trop. Agric. Sci., 2011, 34(1): 137-143.
[28] He P, Jiao L, Shang R, et al. MANet: Multi-scale aware-relation network for semantic segmentation in aerial scenes[J]. IEEE Transactions on Geoscience and Remote Sensing, 2022, 60: 1-15.
[29] Chaurasia A, Culurciello E. LinkNet: Exploiting encoder representations for efficient semantic segmentation[C]//2017 IEEE Visual Communications and Image Processing (VCIP). IEEE, 2017: 1-4.
[30] Zhao H, Shi J, Qi X, et al. Pyramid scene parsing network[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2017: 2881-2890.
[31] Liu S, Qi L, Qin H, et al. Path aggregation network for instance segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2018: 8759-8768.
[32] Mei H, Ji G P, Wei Z, et al. Camouflaged object segmentation with distraction mining[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2021: 8772-8781.
[33] Shrivastava A, Gupta A, Girshick R. Training region-based object detectors with online hard example mining[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. 2016: 761-769.
[34] Deng J, Dong W, Socher R, et al. ImageNet: A large-scale hierarchical image database[C]//2009 IEEE Conference on Computer Vision and Pattern Recognition. IEEE, 2009: 248-255.
[35] Wang W, Xie E, Li X, et al. Pyramid vision transformer: A versatile backbone for dense prediction without convolutions[C]//Proceedings of the IEEE/CVF International Conference on Computer Vision. 2021: 568-578.
[36] Chen Y. Convolutional neural network for sentence classification[D]. University of Waterloo, 2015.
[37] Goodfellow I, Pouget-Abadie J, Mirza M, et al. Generative adversarial nets[J]. Advances in Neural Information Processing Systems, 2014, 27.
[38] Zhong Z, Lin Z Q, Bidart R, et al. Squeeze-and-attention networks for semantic segmentation[C]//Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition. 2020: 13065-13074.