Search Results (4)

Search Parameters:
Keywords = FDDB dataset

16 pages, 6361 KiB  
Article
Improved Face Detection Method via Learning Small Faces on Hard Images Based on a Deep Learning Approach
by Dilnoza Mamieva, Akmalbek Bobomirzaevich Abdusalomov, Mukhriddin Mukhiddinov and Taeg Keun Whangbo
Sensors 2023, 23(1), 502; https://doi.org/10.3390/s23010502 - 2 Jan 2023
Cited by 33 | Viewed by 6544
Abstract
Most facial recognition and face analysis systems start with face detection. Early techniques, such as Haar cascades and histograms of oriented gradients, rely mainly on hand-crafted features extracted from particular images, and they fail to generalize to images taken in unconstrained conditions. Deep learning's rapid progress in computer vision has, however, spurred a number of deep learning-based face detection frameworks, many of which have significantly improved accuracy in recent years. Detecting faces that are small, vary in scale and position, or are blurred or partially occluded in uncontrolled conditions is a problem that has been studied for many years but is still not entirely resolved. In this paper, we propose a RetinaNet-based single-stage face detector to handle this challenging face detection problem. We made network improvements that boosted detection speed and accuracy. In our experiments, we used two popular datasets, WIDER FACE and FDDB. Specifically, on the WIDER FACE benchmark, our proposed method achieves an AP of 41.0 at 11.8 FPS with a single-scale inference strategy and an AP of 44.2 with a multi-scale inference strategy, competitive results among one-stage detectors. We implemented and trained our model using the PyTorch framework, which yielded an accuracy of 95.6% on the successfully detected faces. Experimental results show that our proposed model delivers seamless detection and recognition, as assessed with standard performance evaluation metrics.
(This article belongs to the Special Issue Application of Semantic Technologies in Sensors and Sensing Systems)
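As a rough illustration of the one-stage pipeline described in the abstract, the sketch below runs single-scale inference with a stock RetinaNet detector from torchvision. It is not the authors' modified network; the random input tensor and the 0.5 score threshold are illustrative assumptions.

```python
# Minimal sketch: single-scale inference with a generic RetinaNet detector
# from torchvision. This illustrates the one-stage detection flow only;
# it is NOT the paper's improved network.
import torch
from torchvision.models.detection import retinanet_resnet50_fpn

model = retinanet_resnet50_fpn(weights="DEFAULT")  # COCO-pretrained weights
model.eval()

image = torch.rand(3, 480, 640)  # stand-in for a real (C, H, W) image in [0, 1]

with torch.no_grad():
    pred = model([image])[0]  # dict with 'boxes', 'scores', 'labels'

keep = pred["scores"] > 0.5  # assumed confidence threshold
print(f"{int(keep.sum())} detections above the 0.5 score threshold")
```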
Show Figures

Figure 1: Architecture of the proposed method for face detection.
Figure 2: Detection results on the FDDB dataset.
Figure 3: Expression image results (WIDER FACE dataset).
Figure 4: Makeup image results (WIDER FACE dataset).
Figure 5: Occlusion image results (WIDER FACE dataset).
Figure 6: Pose image results (WIDER FACE dataset).
Figure 7: Scale image results (WIDER FACE dataset).
Fig">
Figure 8: Visible results of false-positive detection experiments.
13 pages, 4985 KiB  
Article
An Efficient Multi-Scale Anchor Box Approach to Detect Partial Faces from a Video Sequence
by Dweepna Garg, Priyanka Jain, Ketan Kotecha, Parth Goel and Vijayakumar Varadarajan
Big Data Cogn. Comput. 2022, 6(1), 9; https://doi.org/10.3390/bdcc6010009 - 11 Jan 2022
Cited by 8 | Viewed by 3968
Abstract
In recent years, face detection has received considerable attention in the field of computer vision, using both traditional machine learning and deep learning techniques. Deep learning is used to build the most recent and powerful face detection algorithms. However, partial face detection has yet to achieve comparable performance. Partial faces are occluded by hair, hats, glasses, hands, or mobile phones, or appear in side-angle captures, and fewer facial features can be identified from such images. In this paper, we present a deep convolutional neural network face detection method using an anchor box selection strategy. We limited the number of anchor boxes and scales, choosing only those relevant to the face shape. The proposed model was trained and tested on a popular and challenging face detection benchmark, the Face Detection Dataset and Benchmark (FDDB), and it can also detect partially covered faces with better accuracy and precision. Extensive experiments were performed, with evaluation metrics including accuracy, precision, recall, F1 score, inference time, and FPS. The results show that the proposed model detects faces in images, including those with occluded features, more precisely than other state-of-the-art approaches, achieving 94.8% accuracy and 98.7% precision on the FDDB dataset at 21 frames per second (FPS).
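To make the anchor box selection strategy concrete, the following sketch generates a small, face-oriented anchor set with eight anchors per cell on a 19 × 19 grid, matching the counts in the paper's figures; the particular scales, aspect ratios, and input size are illustrative assumptions, not the paper's values.

```python
# Minimal sketch of anchor-box generation on a 19x19 feature grid, keeping
# only a few scales and near-square aspect ratios suited to faces. The
# scales, ratios, and 608-pixel input size are assumptions for illustration.
import numpy as np

def face_anchors(grid=19, image_size=608, scales=(32, 64, 128, 256),
                 ratios=(1.0, 1.3)):  # faces: roughly square to slightly tall
    stride = image_size / grid
    anchors = []
    for gy in range(grid):
        for gx in range(grid):
            cx, cy = (gx + 0.5) * stride, (gy + 0.5) * stride  # cell center
            for s in scales:
                for r in ratios:
                    w, h = s, s * r  # height stretched by the aspect ratio
                    anchors.append([cx - w / 2, cy - h / 2,
                                    cx + w / 2, cy + h / 2])
    return np.array(anchors)  # boxes in (x1, y1, x2, y2)

print(face_anchors().shape)  # (2888, 4): 19*19 cells x 8 anchors per cell
```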
Show Figures

Figure 1: Examples of partial faces.
Figure 2: Proposed face detection pipeline.
Figure 3: Samples of eight anchor boxes per cell in a 19 × 19 grid over the input image.
Figure 4: Facial parts with important features for predicting the face.
Figure 5: Types of anchor box scales.
Figure 6: Samples from the FDDB dataset with variations in pose, expression, scale, illumination, and occlusion.
Figure 7: Comparison of the proposed work with other state-of-the-art face detectors in terms of AP.
Figure 8: Face detection results of the proposed method.
Figure 9: FPS comparison of the proposed method with existing DCNN detection approaches.
19 pages, 7139 KiB  
Article
An Efficient Deep Convolutional Neural Network Approach for Object Detection and Recognition Using a Multi-Scale Anchor Box in Real-Time
by Vijayakumar Varadarajan, Dweepna Garg and Ketan Kotecha
Future Internet 2021, 13(12), 307; https://doi.org/10.3390/fi13120307 - 29 Nov 2021
Cited by 10 | Viewed by 3328
Abstract
Deep learning is a relatively new branch of machine learning in which computers are taught to recognize patterns in massive volumes of data. It primarily describes learning at various levels of representation, which aids in understanding data that includes text, voice, and visuals. Convolutional neural networks have been used to solve challenges in computer vision, including object identification, image classification, semantic segmentation, and more. Object detection in videos involves confirming the presence of an object in the image or video and then locating it accurately for recognition. Video modelling techniques suffer from high computation and memory costs, which can degrade measures such as accuracy and efficiency when identifying objects in real time. Current deep convolutional neural network detectors execute multilevel convolution and pooling operations over the entire image to extract deep semantic features. Such models can provide superior results for large objects, but they fail on objects of varying sizes that have low resolution or are heavily affected by noise, because the features produced after repeated convolutions no longer fully represent the objects' essential characteristics in real time. With the help of a multi-scale anchor box, the approach proposed in this paper enhances detection accuracy by extracting features at multiple convolution levels of the object. The major contribution of this paper is a model designed to better understand the parameters and hyper-parameters that affect the detection and recognition of objects of varying sizes and shapes, and to achieve real-time object detection and recognition speeds while improving accuracy. The proposed model achieved 84.49 mAP at 11 FPS on the test set of the Pascal VOC-2007 dataset, which is comparatively better than other real-time object detection models.
(This article belongs to the Special Issue Big Data Analytics, Privacy and Visualization)
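Detections like these are matched to ground truth through intersection over union (IoU), which the paper also illustrates (Figure 6 below). Here is a generic, textbook sketch of the computation; it is not code from the paper.

```python
# Generic intersection-over-union (IoU) between two axis-aligned boxes
# given as (x1, y1, x2, y2). Standard formulation, not the paper's code.
def iou(a, b):
    # Corners of the intersection rectangle.
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    # Union = sum of the two areas minus the intersection.
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    return inter / (area_a + area_b - inter)

print(iou((0, 0, 10, 10), (5, 5, 15, 15)))  # 25 / 175 ≈ 0.143
```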
Show Figures

Figure 1: Building blocks of CNN.
Figure 2: Classification pipeline using CNN.
Figure 3: Detection pipeline using CNN.
Figure 4: Architecture of R-CNN.
Figure 5: S × S grid distribution and class distribution map of YOLO.
Figure 6: IoU calculation [14].
Figure 7: Architecture of the proposed model.
Figure 8: Efficient multi-scale anchor box (Example 1).
Figure 9: Efficient multi-scale anchor box (Example 2).
Figure 10: Flowchart of the training and detection phases of the proposed model.
Figure 11: Comparison of loss vs. epoch at different dropout values.
Figure 12: Comparison of IoU mAP vs. epoch.
Figure 13: Comparison of loss vs. epoch at different learning rates.
Figure 14: Comparison of FPS and mAP at various resolutions using different CPU and GPU configurations.
Figure 15: Comparison of precision vs. recall.
Figure 16: Object detection and recognition from a video sequence (Example 1).
Figure 17: Object detection and recognition from a video sequence (Example 2).
Figure 18: Minimum size of object detection.
Figure 19: Object detection and recognition from an image (Example 3).
14 pages, 5005 KiB  
Article
A Fast and Lightweight Method with Feature Fusion and Multi-Context for Face Detection
by Lei Zhang and Xiaoli Zhi
Future Internet 2018, 10(8), 80; https://doi.org/10.3390/fi10080080 - 17 Aug 2018
Viewed by 3858
Abstract
Convolutional neural networks (CNNs) have made great progress in face detection. Most take computation-intensive networks as the backbone in order to obtain high precision, and they cannot achieve good detection speeds without the support of high-performance GPUs (Graphics Processing Units). This limits CNN-based face detection algorithms in real applications, especially speed-dependent ones. To alleviate this problem, we propose a lightweight face detector that takes a fast residual network as its backbone. Our method can run fast even on cheap, ordinary GPUs. To guarantee detection precision, multi-scale features and multi-level context are fully exploited in efficient ways: feature fusion first produces semantically strong multi-scale features, and then both local and global context are added to these features without extra computational burden. Local context is added through a depthwise-separable-convolution-based approach, and global context through simple global average pooling. Experimental results show that our method can run at about 110 fps on VGA (Video Graphics Array)-resolution images, while still maintaining competitive precision on the WIDER FACE and FDDB (Face Detection Data Set and Benchmark) datasets compared with its state-of-the-art counterparts.
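The two context mechanisms the abstract describes can be sketched in PyTorch as follows: local context via a depthwise separable convolution and global context via global average pooling. The channel width and the fusion by addition are illustrative assumptions, not the paper's exact design.

```python
# Minimal sketch of the two context mechanisms: a depthwise separable
# convolution for local context and global average pooling for global
# context. Channel count and additive fusion are assumptions.
import torch
import torch.nn as nn

class ContextBlock(nn.Module):
    def __init__(self, channels=128):
        super().__init__()
        # Depthwise 3x3 conv (per-channel local context) followed by a 1x1
        # pointwise conv that mixes channels cheaply; each conv is followed
        # by batch norm and ReLU, as the paper's Figure 4 caption notes.
        self.local = nn.Sequential(
            nn.Conv2d(channels, channels, 3, padding=1, groups=channels),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
            nn.Conv2d(channels, channels, 1),
            nn.BatchNorm2d(channels), nn.ReLU(inplace=True),
        )
        self.pool = nn.AdaptiveAvgPool2d(1)  # global average pooling

    def forward(self, x):
        # Broadcast the pooled global descriptor back over the feature map.
        return self.local(x) + self.pool(x).expand_as(x)

y = ContextBlock()(torch.rand(1, 128, 40, 40))
print(y.shape)  # torch.Size([1, 128, 40, 40])
```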
Show Figures

Figure 1: Two context-adding approaches. (a) The red window is the original bounding box and the dashed windows are the enlarged windows; (b) using global pooling to incorporate global pixel information.
Figure 2: Structure of ResNet-18. The dotted shortcuts increase dimensions.
Figure 3: Overall architecture of our method.
Figure 4: Context incorporation. Depthwise separable convolution (dw for short) is used to reduce computation; each convolutional layer is followed by a batch-norm layer and a ReLU layer.
Figure 5: Precision-recall curves on the WIDER FACE validation set: (a) 'easy' subset; (b) 'medium' subset; (c) 'hard' subset.
Figure 6: Receiver operating characteristic (ROC) curves on the FDDB dataset: (a) discontinuous ROC curves; (b) continuous ROC curves.
Figure 7: Qualitative results under various challenging conditions (illumination, pose changes, occlusion, race, etc.). Ground-truth bounding boxes are in red and predicted bounding boxes are in green.