DeepIris: Iris Recognition Using A Deep Learning Approach
Abstract—Iris recognition has been an active research area during the last few decades, because of its wide applications in security.
I. INTRODUCTION
To personalize an experience or make an application more
secure and less accessible to undesired people, we need to
be able to distinguish a person from everyone else. There are
various ways to identify a person, and biometrics are one of
the most secure options so far. They can be divided into two
categories: behavioral and physiological features. Behavioral
features are those actions that a person can uniquely create or express, such as signatures and walking rhythm, while physiological features are those characteristics that a person possesses, such as fingerprints and iris patterns. Many works have revolved around the recognition and categorization of such data, including, but not limited to, fingerprints, faces, palmprints and iris patterns [1]-[5].
Iris recognition systems are widely used for security applications, since they rely on a rich set of features that do not change significantly over time. They are also virtually impossible to fake. One of the first modern algorithms for iris recognition was developed by John Daugman and used the 2D Gabor wavelet transform [6]. Since then, there have been
various works proposing different approaches for iris recognition. Many of the traditional approaches follow the two-step machine learning approach, where in the first step a set of hand-crafted features is derived from iris images, and in the second step a classifier is used to recognize the iris images. Here we discuss some of the previous works proposed for iris recognition.

Fig. 1. The images from the first (on top) and second layers of the scattering transform [10] for a sample iris image. Each image captures the wavelet energies along a specific orientation and scale.

In a more recent work, Kumar [6] proposed an algorithm based on a combination of Log-Gabor, Haar wavelet, DCT and FFT features, and achieved high accuracy. In [7], Farouk proposed a scheme which uses elastic graph matching and Gabor wavelet for iris recognition, where each iris is represented as a labeled graph.

Although many of the previous works on iris recognition achieve high accuracy rates, they involve a lot of pre-processing (including iris segmentation and unwrapping the original iris into a rectangular area) and rely on hand-crafted features, which may not be optimal for different iris datasets (collected under different lighting and environmental conditions). In recent years, there has been a lot of focus on developing models that jointly learn the features while doing prediction.
Along this direction, convolutional neural networks [11] have been very successful in various computer vision and natural language processing (NLP) tasks [12]. Their success is mainly due to three key factors: the availability of large-scale manually labeled datasets, powerful processing tools (such as Nvidia's GPUs), and good regularization techniques (such as dropout) that can prevent the overfitting problem.

Deep learning has been used for various problems in computer vision, such as image classification, image segmentation, super-resolution, image captioning, emotion analysis, face recognition, and object detection, and has significantly improved the performance over traditional approaches [13]-[20]. It has also been used heavily for various NLP tasks, such as sentiment analysis, machine translation, named-entity recognition, and question answering [21]-[24].

More interestingly, it has been shown that the features learned by some of these deep architectures can be transferred to other tasks very well. In other words, one can take the features from a model trained for a specific task and use them for a different task (by training a classifier/predictor on top of them) [25]. Inspired by [25], Minaee et al. [26] explored the application of learned convolutional features for iris recognition and showed that features learned by training a ConvNet on a general image classification task can be directly used for iris recognition, beating all the previous approaches.

For the iris recognition task, there are several public datasets with a reasonable number of samples, but for most of them the number of samples per class is limited, which makes it difficult to train a convolutional neural network from scratch on these datasets. In this work we propose a deep learning framework for iris recognition for the case where only a few samples are available for each class (few-shot learning).

The structure of the rest of this paper is as follows. Section II provides the description of the overall proposed framework. Section III provides the experimental studies and the comparison with previous works. Finally, the paper is concluded in Section IV.

II. THE PROPOSED FRAMEWORK

A. Transfer Learning

There are two main ways in which a pre-trained model is used for a different task. In one approach, the pre-trained model is treated as a feature extractor, and a classifier/regressor model is trained on top of it to perform the second task. In this approach the internal weights of the pre-trained model are not adapted to the new task. One can think of using a pre-trained language model for deriving word representations used in another task (such as sentiment analysis, NER, etc.) as an example of the first approach. In the second approach, the whole network (or a subset of its layers/parameters) is fine-tuned on the new task; the pre-trained model weights are therefore treated as the initial values for the new task, and are updated during the training procedure.

B. Iris Image Classification Using ResNet

In this work, we focus on the iris recognition task, choose a dataset with a large number of subjects but a limited number of images per subject, and propose a transfer learning approach to perform identity recognition using a deep residual convolutional network. We use a ResNet50 [13] model pre-trained on the ImageNet dataset, and fine-tune it on our training images. ResNet is a popular CNN architecture which was the winner of the ImageNet 2015 visual recognition competition. It provides easier gradient flow for more efficient training. The core idea of ResNet is the introduction of a so-called identity shortcut connection that skips one or more layers, as shown in Figure 2. This gives the network a direct path to the very early layers, making the gradient updates for those layers much easier.

Fig. 2. The residual block used in the ResNet model.
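As a rough illustration of the two transfer-learning modes and of the model adaptation described above, the following PyTorch sketch loads an ImageNet pre-trained ResNet50 from torchvision, replaces its final fully-connected layer to match the number of iris classes, and optionally freezes the backbone for the feature-extractor setting. The paper does not provide its training code, so the function name `build_iris_model`, the `freeze_backbone` flag, and the choice of 224 classes are illustrative assumptions.

```python
# Minimal sketch (not the authors' code): adapting an ImageNet pre-trained
# ResNet50 to an iris classification task with PyTorch / torchvision.
import torch
import torch.nn as nn
from torchvision import models


def build_iris_model(num_classes: int, freeze_backbone: bool = False) -> nn.Module:
    """Load a pre-trained ResNet50 and replace its last fully-connected layer.

    freeze_backbone=True  -> approach 1: pre-trained model as a fixed feature extractor
    freeze_backbone=False -> approach 2: fine-tune all layers on the new task
    """
    model = models.resnet50(pretrained=True)  # weights trained on ImageNet

    if freeze_backbone:
        # Keep the pre-trained weights fixed; only the new classifier is trained.
        for param in model.parameters():
            param.requires_grad = False

    # ResNet50 produces a 2048-dimensional feature vector after average pooling;
    # the final layer is replaced to output one score per iris class.
    model.fc = nn.Linear(model.fc.in_features, num_classes)
    return model


if __name__ == "__main__":
    # e.g. 224 subjects, matching the IIT Delhi dataset used later in the paper
    model = build_iris_model(num_classes=224, freeze_backbone=False)
    dummy = torch.randn(1, 3, 224, 224)   # one RGB image of size 224x224
    print(model(dummy).shape)             # -> torch.Size([1, 224])
```

Newer torchvision releases replace the `pretrained=True` flag with a `weights=` argument, but the overall setup is the same.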
Fig. 3. The architecture of ResNet50 neural network [13], and how it is transferred for iris recognition. The last layer is changed to match the number of
classes in our dataset.
III. EXPERIMENTAL RESULTS

In this section we provide the experimental results for the proposed algorithm, and the comparison with the previous works on this dataset.

Before presenting the results of the proposed model, let us first discuss the hyper-parameters used in our training procedure. We train the proposed model for 100 epochs using an Nvidia Tesla GPU. The batch size is set to 24, and the Adam optimizer is used to optimize the loss function, with a learning rate of 0.0002. All images are down-sampled to 224x224 before being fed to the neural network. All our implementations are done in PyTorch [28]. We present the details of the dataset used in our work in the next section, followed by quantitative and visual experimental results.
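To make this training setup concrete, the following PyTorch sketch wires the stated hyper-parameters (Adam, learning rate 0.0002, batch size 24, 100 epochs, 224x224 inputs) around an ImageNet pre-trained ResNet50 with its last layer replaced, as described in Section II-B. The data-loading part is an illustrative assumption rather than the authors' actual pipeline: the directory `iris_data/train` and its one-folder-per-subject layout are hypothetical.

```python
# Schematic training loop using the hyper-parameters reported in this section.
# The dataset path/layout below is an assumption: one sub-folder per subject.
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torchvision import datasets, models, transforms

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

preprocess = transforms.Compose([
    transforms.Resize((224, 224)),   # all images are down-sampled to 224x224
    transforms.ToTensor(),
])
train_dataset = datasets.ImageFolder("iris_data/train", transform=preprocess)
train_loader = DataLoader(train_dataset, batch_size=24, shuffle=True)

# ImageNet pre-trained ResNet50 with its last layer replaced (see Section II-B).
model = models.resnet50(pretrained=True)
model.fc = nn.Linear(model.fc.in_features, len(train_dataset.classes))
model = model.to(device)

criterion = nn.CrossEntropyLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.0002)

for epoch in range(100):
    model.train()
    for images, labels in train_loader:
        images, labels = images.to(device), labels.to(device)
        optimizer.zero_grad()
        loss = criterion(model(images), labels)
        loss.backward()
        optimizer.step()
```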
A. Dataset

We have tested our algorithm on the IIT Delhi iris database, which contains 2240 iris images captured from 224 different people. The resolution of these images is 320x240 pixels [29]. Six sample images from this dataset are shown in Fig. 4. As we can see, the iris images in this dataset have slightly different color distributions, as well as different sizes. For each person, 4 images are randomly used as test samples, and the rest are used for training and validation.

Fig. 4. Six sample iris images from the IIT Delhi dataset [30].

The recognition accuracy of the proposed algorithm, compared with a previous approach on this dataset, is summarized below:

Method                                  Accuracy Rate
Multiscale Morphologic Features [9]     87.94%
The proposed algorithm                  95.5%

C. Important Regions Visualization

Here we provide a simple approach to visualize the most important regions for iris recognition with a convolutional network, inspired by the work in [31]. We start from the top-left corner of an image, each time zero out a square region of size NxN inside the image, and make a prediction using the trained model on the occluded image. If occluding that region makes the model mis-label that iris image, the region is considered important for iris recognition. On the other hand, if removing that region does not impact the model's prediction, we infer that the region is not as important. If we repeat this procedure for different sliding windows of size NxN, each time shifting them with a stride of S, we can get a saliency map of the most important regions for recognizing iris images. The saliency maps for four example iris images are shown in Figure 5. As can be seen, most regions inside the iris area seem to be important for iris recognition.
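A minimal sketch of this occlusion procedure is given below. It assumes a trained classifier `model` and a preprocessed image tensor `image` of shape (3, 224, 224) with a known class index `true_label`; the names and the default window size N and stride S are illustrative choices, not values taken from the paper.

```python
# Occlusion-based saliency sketch: zero out NxN windows with stride S and
# record where the occlusion changes the model's predicted class.
import torch


def occlusion_saliency(model, image, true_label, window=16, stride=8):
    """Return a 2D map with 1.0 where occluding the window flips the prediction."""
    model.eval()
    _, height, width = image.shape
    saliency = torch.zeros((height - window) // stride + 1,
                           (width - window) // stride + 1)
    with torch.no_grad():
        for i, top in enumerate(range(0, height - window + 1, stride)):
            for j, left in enumerate(range(0, width - window + 1, stride)):
                occluded = image.clone()
                # Zero out one NxN square region of the image.
                occluded[:, top:top + window, left:left + window] = 0.0
                pred = model(occluded.unsqueeze(0)).argmax(dim=1).item()
                # Important region: occluding it makes the model mis-label the iris.
                saliency[i, j] = float(pred != true_label)
    return saliency
```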
IV. CONCLUSION

In this work we propose a deep learning framework for iris recognition, by fine-tuning a convolutional model pre-trained on ImageNet. This framework is applicable to other biometric recognition problems, and is especially useful for cases where only a few labeled images are available for each class. We apply the proposed framework to a well-known iris dataset, IIT Delhi, and achieve promising results which outperform previous approaches on this dataset. We train these models with very few original images per class. We also present a visualization technique for detecting the most important regions during iris recognition.