
2019 4th International Conference on Computing, Communications and Security (ICCCS)

Facial Mask Detection using Semantic Segmentation


Toshanlal Meenpal
Dept. of Electronics and Telecomm.
National Institute of Technology
Raipur, India
tmeenpal.etc@nitrr.ac.in

Ashutosh Balakrishnan
Dept. of Electronics and Telecomm.
National Institute of Technology
Raipur, India
abalakrishnan1909@gmail.com

Amit Verma
Dept. of Electronics and Telecomm.
National Institute of Technology
Raipur, India
averma.phd2016.etc@nitrr.ac.in

Abstract—Face detection has evolved as a very popular problem in image processing and computer vision. Many new algorithms are being devised using convolutional architectures to make the algorithm as accurate as possible. These convolutional architectures have made it possible to extract even pixel-level details. We aim to design a binary face classifier which can detect any face present in the frame irrespective of its alignment. We present a method to generate accurate face segmentation masks from an input image of any arbitrary size. Beginning with the RGB image of any size, the method uses predefined training weights of the VGG-16 architecture for feature extraction. Training is performed through a Fully Convolutional Network to semantically segment out the faces present in the image. Gradient Descent is used for training, while Binomial Cross Entropy is used as the loss function. Further, the output image from the FCN is processed to remove unwanted noise, avoid false predictions (if any), and draw bounding boxes around the faces. Furthermore, the proposed model has also shown great results in recognizing non-frontal faces. Along with this, it is also able to detect multiple facial masks in a single frame. Experiments were performed on the Multi Parsing Human Dataset, obtaining a mean pixel-level accuracy of 93.884% for the segmented face masks.

Index Terms—Fully Convolutional Network, Semantic Segmentation, Face Segmentation and Detection

I. INTRODUCTION

Face detection has emerged as a very interesting problem in image processing and computer vision. It has a range of applications, from facial motion capture to face recognition, which at the start needs the face to be detected with very good accuracy. Face detection is more relevant today because it is used not only on images but also in video applications like real-time surveillance and face detection in videos. High-accuracy image classification is now possible with the advancement of convolutional networks. Pixel-level information is often required after face detection, which most face detection methods fail to provide. Obtaining pixel-level details has been a challenging part of semantic segmentation. Semantic segmentation is the process of assigning a label to each pixel of the image. In our case the labels are either face or non-face. Semantic segmentation is thus used to separate out the face by classifying each pixel of the image as face or background. Also, most of the widely used face detection algorithms tend to focus on the detection of frontal faces.

This paper proposes a model for face detection using semantic segmentation in an image by classifying each pixel as face or non-face, i.e. effectively creating a binary classifier and then detecting the segmented area. The model works very well not only for images having frontal faces but also for non-frontal faces. The paper also focuses on removing the erroneous predictions which are bound to occur. Semantic segmentation of the human face is performed with the help of a fully convolutional network.

The next section discusses the related work done in the domain of face detection. In Section III we describe the method followed for face segmentation and detection using semantic segmentation on any arbitrary RGB image. Finally, the generated facial masks are demonstrated in the experimental results in Section IV. Post-processing on the predicted images has also been discussed at length, which also entails the removal of erroneous predictions.

II. RELATED WORKS

Initially, researchers focused on the edge and gray values of the face image. [1] was based on a pattern recognition model having prior information of the face model. Adaboost [2] was a good training classifier. Face detection technology got a breakthrough with the famous Viola-Jones detector [3], which greatly improved real-time face detection. The Viola-Jones detector optimized Haar features [4], but failed to tackle real-world problems and was influenced by various factors like face brightness and face orientation. Viola-Jones could only detect frontal, well-lit faces. It failed to work well in dark conditions and with non-frontal images. These issues have made independent researchers work on developing new face detection models based on deep learning, to obtain better results for different facial conditions. We have developed our face detection model using the Multi Human Parsing Dataset [5], based on fully convolutional networks, such that it can detect the face in any geometric condition, frontal or non-frontal. Convolutional networks have long been used for image classification tasks. Typical architectures like AlexNet [6] and VGGNet [7] comprise stacked convolutional layers. AlexNet, with 5 convolutional layers and 3 fully connected layers, was the winner of the ImageNet LSVRC-2012 competition, while VGGNet is an improvement over AlexNet as it replaces large kernels with multiple consecutive 3×3 kernels. The ILSVRC-2014 winning architecture GoogleNet [8] uses parallel convolution kernels and concatenates the resulting feature maps; in it, 1×1, 3×3 and 5×5 convolutions and 3×3 max-pooling have been used. Smaller convolutions extract the

978-1-7281-0875-9/19/$31.00 © 2019 IEEE

local features whereas larger convolutions extract high-level features. More recent architectures such as ResNet [9] have introduced skip connections, which allow deeper networks to avoid saturation of training accuracy. These architectures are often used for initial feature extraction in face detection networks. In our method, we use the VGG-16 architecture as the base network for face detection and a Fully Convolutional Network for segmentation. The VGG-16 network is sufficiently deep to extract features and is computationally inexpensive for our case. Though the majority of segmentation architectures rely on downsampling and consecutive upsampling of the input image, Fully Convolutional Networks [10], [11], [12] remain a modest and significantly accurate approach for segmentation.

Fig. 1. Flowchart of the proposed method.

III. METHODOLOGY

We propose this paper with the twin objectives of creating a binary face classifier which can detect faces in any orientation irrespective of alignment, and training it in an appropriate neural network to get accurate results. The model takes an RGB image of any arbitrary size as input. The model's basic function is feature extraction and class prediction. The output of the model is a feature vector which is optimized using Gradient Descent, and the loss function used is Binomial Cross Entropy. Figure 1 represents the end-to-end pipeline of our method along with a sample demonstration of the output obtained at each step.

A. Proposed Work Flow

We propose a method of obtaining segmentation masks directly from images containing one or more faces in different orientations. The input image of any arbitrary size is resized to 224 × 224 × 3 and fed to the FCN for feature extraction and prediction. The output of the network is then subjected to post-processing. Initially, the pixel values of the face and background are subjected to global thresholding. The result is then passed through a median filter to remove high-frequency noise and subjected to a Closing operation to fill the gaps in the segmented area. Finally, a bounding box is drawn around the segmented area.

B. Architecture

Feature extraction and prediction are performed using pre-trained weights of the VGG-16 architecture. The basic VGG-16 architecture is depicted in Figure 2. Our proposed model consists of a total of 17 convolutional layers and 5 max-pooling layers. The initial image fed to the model is of size 224 × 224 × 3. As the image is processed through the layers for feature extraction, it passes through convolutional layers and max-pooling layers.

A convolutional layer convolves the input with a sliding kernel, while the max-pooling operation ensures that the size of the feature vector produced in every layer is halved so as to reduce the number of parameters. This is a very crucial step in feature extraction; if the number of parameters is not reduced, it becomes very difficult to predict the classes of each pixel in a fully convolutional network. The initial layers extract the lower-level features, while the subsequent layers extract the mid-level and higher-level features. The segmentation task requires that the spatial information be preserved for pixel-wise classification; we have achieved this by converting the fully connected layers of VGG to convolutional layers. After the final max-pooling layer, the image size is reduced to 28 × 28 × 2. This is further upsampled to bring the image back to the standard size of 224 × 224 × 2; since it is a binary classifier, two channels are created, one for each class, face and background.

C. Face Detection and Avoiding Erroneous Prediction

Post-processing is performed on the predicted mask so that irregularities in the segmented region can be filled and unwanted errors (which may have crept in during processing) removed. We perform this by first passing the mask through a median filter and then performing the Closing operation.
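As a concrete illustration, the post-processing chain just described (global thresholding, median filtering, morphological Closing, then a bounding box per connected region) can be sketched as below. This is a minimal sketch using SciPy's ndimage module, not the authors' implementation; the threshold value, filter size, and structuring element are illustrative assumptions.

```python
import numpy as np
from scipy import ndimage

def postprocess(prob_map, thresh=0.5):
    """Post-process a face-probability map: global threshold,
    median filter, morphological closing, then bounding boxes."""
    # Global thresholding separates face from background pixels
    mask = prob_map > thresh
    # Median filter suppresses high-frequency (salt-and-pepper) noise
    mask = ndimage.median_filter(mask.astype(np.uint8), size=3).astype(bool)
    # Closing fills small gaps inside the segmented face regions
    mask = ndimage.binary_closing(mask, structure=np.ones((5, 5)))
    # Label connected regions and compute a bounding box for each
    labels, _ = ndimage.label(mask)
    boxes = []
    for rows, cols in ndimage.find_objects(labels):
        boxes.append((rows.start, cols.start, rows.stop, cols.stop))
    return mask, boxes
```

On a probability map containing a single high-confidence blob, this returns one bounding box (top, left, bottom, right) covering the blob.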


Fig. 2. The complete architecture of the Fully Convolutional Network used for generating segmentation masks.

Fig. 3. (a) Actual Image (b) Erroneous Prediction (c) False Face Detection (d) Correct Face Detection.

This ensures that the gaps in the segmented region are filled and most of the unwanted erroneous predictions are removed. In spite of this, there is a possibility that some large error may not have been removed. We have designed the model such that all such erroneous predictions are not considered while showing the final detected faces. We compute the following parameters for each region: Centroid, Major Axis Length and Minor Axis Length. These values for Figure 3 are listed in Table I for all the facial (segmented) regions detected (including false predictions).

TABLE I
SEGMENTED REGION PARAMETER VALUES

S. No.   Centroid   Major Axis Length   Minor Axis Length
1.       9.414      11.62               7.2
2.       18.00      22.65               13.36
3.       13.18      14.84               11.51
4.       22.81      32.09               13.52
5.       18.07      27.35               8.8
6.       20.67      30.55               10.7
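For reference, the per-region parameters of Table I (centroid, major and minor axis lengths) can be measured from a binary region mask via the second-order central moments of its pixel coordinates. The sketch below uses the common equivalent-ellipse convention (axis length = 4·√eigenvalue of the coordinate covariance); the paper does not state its exact formula, so this convention is an assumption.

```python
import numpy as np

def region_params(mask):
    """Centroid and equivalent-ellipse major/minor axis lengths of a
    binary region, from second-order central moments of its pixels."""
    ys, xs = np.nonzero(mask)
    cy, cx = ys.mean(), xs.mean()                    # centroid (row, col)
    # Covariance of the pixel coordinates (np.cov centers them itself)
    cov = np.cov(np.stack([ys, xs]))
    evals = np.sort(np.linalg.eigvalsh(cov))[::-1]   # descending
    # Ellipse with the same second moments: length = 4 * sqrt(eigenvalue)
    major, minor = 4.0 * np.sqrt(evals)
    return (cy, cx), major, minor
```

For an elongated region the major axis length exceeds the minor axis length, which is what the filtering step below exploits.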
In Figure 3(b), even after post-processing through the median filter and dilation, the unwanted erroneous predictions have not completely disappeared. This results in the false face detection shown in Figure 3(c).

Using the centroid c_x, major axis length ma_x and minor axis length mi_x of each segmented region, we calculate the diameter d_x of each region. We compute the mean D̄ and standard deviation η_D of the diameter vector D. Finally, we keep the most

probable diameters, i.e. those lying within the first standard deviation. The detailed procedure is shown in Algorithm 1.

Algorithm 1 Detailed Distance Computing Procedure
    X ← number of regions
    D ∈ ℝ^X
    for x ← 1, X do
        d_x ← (ma_x + mi_x) / 2
        D[x] ← d_x
    end for
    D̄ ← (1/X) Σ_{x=1}^{X} D[x]
    η_D ← sqrt( (1/X) Σ_{x=1}^{X} (D[x] − D̄)² )
    D_true ← [ ]
    c ← 0
    for x ← 1, X do
        if D̄ − η_D < D[x] < D̄ + η_D then
            c ← c + 1
            D_true[c] ← D[x]
        end if
    end for

Fig. 4. Pixel level accuracy for predicted facial masks.

IV. EXPERIMENTAL RESULTS

All the experiments have been performed on the Multi Human Parsing Dataset, containing about 5000 images, each with at least two persons. Out of these, 2500 images were used for training and validation, while the remaining were used for testing the model. Figure 4 shows the true and predicted class for a given input image of any arbitrary size. It also shows the detected faces inside a bounding circle with the respective pixel-level accuracy. We have also shown the refined predicted mask after it is subjected to post-processing. The designed FCN semantically segments out the facial spatial locations with a specific label. Furthermore, the proposed model has also shown great results in recognizing non-frontal faces. Along with this, it is also able to detect multiple facial masks in a single frame. The post-processing provides a large boost to pixel-level accuracy. The mean pixel-level accuracy for the facial masks is 93.884%.

V. CONCLUSION

We were able to generate accurate face masks for human subjects from RGB-channel images containing localized objects. We demonstrated our results on the Multi Human Parsing Dataset with mean pixel-level accuracy. The problem of erroneous predictions has also been solved, and a proper bounding box has been drawn around the segmented region. The proposed network can detect non-frontal faces and multiple faces from a single image. The method can find applications in advanced tasks such as facial part detection.

REFERENCES

[1] T. Ojala, M. Pietikainen, and T. Maenpaa, "Multiresolution gray-scale and rotation invariant texture classification with local binary patterns," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 24, no. 7, pp. 971–987, July 2002.
[2] T.-H. Kim, D.-C. Park, D.-M. Woo, T. Jeong, and S.-Y. Min, "Multi-class classifier-based adaboost algorithm," in Proceedings of the Second Sino-foreign-interchange Conference on Intelligent Science and Intelligent Data Engineering (IScIDE'11). Berlin, Heidelberg: Springer-Verlag, 2012, pp. 122–127.
[3] P. Viola and M. J. Jones, "Robust real-time face detection," International Journal of Computer Vision, vol. 57, no. 2, pp. 137–154, May 2004.
[4] P. Viola and M. Jones, "Rapid object detection using a boosted cascade of simple features," in Proceedings of the 2001 IEEE Computer Society Conference on Computer Vision and Pattern Recognition (CVPR 2001), vol. 1, Dec 2001, pp. I–I.
[5] J. Li, J. Zhao, Y. Wei, C. Lang, Y. Li, and J. Feng, "Towards real world human parsing: Multiple-human parsing in the wild," CoRR, vol. abs/1705.07206.
[6] A. Krizhevsky, I. Sutskever, and G. E. Hinton, "ImageNet classification with deep convolutional neural networks," in Advances in Neural Information Processing Systems 25, F. Pereira, C. J. C. Burges, L. Bottou, and K. Q. Weinberger, Eds. Curran Associates, Inc., 2012, pp. 1097–1105.
[7] K. Simonyan and A. Zisserman, "Very deep convolutional networks for large-scale image recognition," CoRR, vol. abs/1409.1556, 2014.
[8] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, "Going deeper with convolutions," 2015.
[9] K. He, X. Zhang, S. Ren, and J. Sun, "Deep residual learning for image recognition," in 2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016, pp. 770–778.
[10] K. Li, G. Ding, and H. Wang, "L-FCN: A lightweight fully convolutional network for biomedical semantic segmentation," in 2018 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Dec 2018, pp. 2363–2367.
[11] X. Fu and H. Qu, "Research on semantic segmentation of high-resolution remote sensing image based on full convolutional neural network," in 2018 12th International Symposium on Antennas, Propagation and EM Theory (ISAPE), Dec 2018, pp. 1–4.
[12] S. Kumar, A. Negi, J. N. Singh, and H. Verma, "A deep learning for brain tumor MRI images semantic segmentation using FCN," in 2018 4th International Conference on Computing Communication and Automation (ICCCA), Dec 2018, pp. 1–4.
