
CN112597466A - User authentication method and system - Google Patents

User authentication method and system

Info

Publication number
CN112597466A
Authority
CN
China
Prior art keywords
user
extracted
feature
image
face
Prior art date
Legal status (The legal status is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the status listed.)
Pending
Application number
CN202011403435.2A
Other languages
Chinese (zh)
Inventor
Xu Yan (徐炎)
Current Assignee (The listed assignees may be inaccurate. Google has not performed a legal analysis and makes no representation or warranty as to the accuracy of the list.)
Ant Shield Co., Ltd.
Original Assignee
Alipay Labs Singapore Pte Ltd
Priority date (The priority date is an assumption and is not a legal conclusion. Google has not performed a legal analysis and makes no representation as to the accuracy of the date listed.)
Filing date
Publication date
Application filed by Alipay Labs Singapore Pte Ltd
Publication of CN112597466A
Legal status: Pending

Classifications

    • G06F21/32: User authentication using biometric data, e.g. fingerprints, iris scans or voiceprints
    • G06F18/22: Pattern recognition; matching criteria, e.g. proximity measures
    • G06F18/2415: Classification techniques based on parametric or probabilistic models, e.g. based on likelihood ratio or false acceptance rate versus a false rejection rate
    • G06N3/045: Neural networks; combinations of networks
    • G06N3/08: Neural networks; learning methods
    • G06V40/166: Human faces; detection, localisation, normalisation using acquisition arrangements
    • G06V40/171: Human faces; local features and components; facial parts; occluding parts, e.g. glasses; geometrical relationships
    • G06V40/172: Human faces; classification, e.g. identification
    • G06V40/19: Eye characteristics, e.g. of the iris; sensors therefor
    • G06V40/193: Eye characteristics; preprocessing; feature extraction
    • G06V40/197: Eye characteristics; matching; classification
    • G06V40/45: Spoof detection, e.g. liveness detection; detection of the body part being alive

Landscapes

  • Engineering & Computer Science (AREA)
  • Theoretical Computer Science (AREA)
  • Physics & Mathematics (AREA)
  • General Physics & Mathematics (AREA)
  • Health & Medical Sciences (AREA)
  • General Health & Medical Sciences (AREA)
  • Human Computer Interaction (AREA)
  • Multimedia (AREA)
  • Data Mining & Analysis (AREA)
  • Oral & Maxillofacial Surgery (AREA)
  • General Engineering & Computer Science (AREA)
  • Evolutionary Computation (AREA)
  • Computer Vision & Pattern Recognition (AREA)
  • Artificial Intelligence (AREA)
  • Life Sciences & Earth Sciences (AREA)
  • Ophthalmology & Optometry (AREA)
  • Software Systems (AREA)
  • Evolutionary Biology (AREA)
  • Molecular Biology (AREA)
  • Computing Systems (AREA)
  • Mathematical Physics (AREA)
  • Computational Linguistics (AREA)
  • Biophysics (AREA)
  • Biomedical Technology (AREA)
  • Bioinformatics & Computational Biology (AREA)
  • Bioinformatics & Cheminformatics (AREA)
  • Computer Security & Cryptography (AREA)
  • Probability & Statistics with Applications (AREA)
  • Computer Hardware Design (AREA)
  • Collating Specific Patterns (AREA)

Abstract

A user authentication method and system are provided. The method includes: extracting a first feature associated with a first eye pupil image of a user using a trained convolutional neural network; extracting a second feature associated with a second eye pupil image of the user using the trained convolutional neural network, wherein the second eye pupil image of the user is captured while white light is projected onto the user's face; generating a similarity score based on the extracted first feature and the extracted second feature; and authenticating the user if the similarity score indicates a difference between the extracted first feature and the extracted second feature.

Description

User authentication method and system
Technical Field
This document relates generally, but not exclusively, to user authentication methods and user authentication systems.
Background
"electronically know your customer (eKYC)" is a digital due diligence process performed by business entities or service providers to verify the identity of their customers to prevent identity fraud.
In a typical eKYC process, a customer is asked to take a picture of their face (i.e., a "selfie"). A face anti-spoofing method is implemented to prevent false face verification by an attacker using a photo, high-resolution screenshot, 2D/3D mask, or other substitute for a live face.
Current face anti-spoofing methods may be effective in detecting faces spoofed with photographs, screenshots, or 2D masks. However, they may be ineffective against faces spoofed with 3D masks.
Thus, there is a need for an improved way of detecting spoofed faces.
Disclosure of Invention
Embodiments seek to provide a user authentication method that involves a flash-based face anti-spoofing method to detect faces spoofed with photos, screenshots, and 2D and 3D masks. For a live face, when the flash is projected onto the pupil region, white spots are expected to be seen in and/or around the pupil region. For a photo, screenshot, or 2D or 3D mask spoofing a face, no white spots are expected in and/or around the pupil region.
According to one embodiment, there is provided a user authentication method including: extracting a first feature associated with a first pupil image of a user using a trained Convolutional Neural Network (CNN); extracting a second feature associated with a second eye pupil image of the user using the trained CNN, wherein the second eye pupil image of the user was taken while white light was projected onto a face of the user; generating a similarity score based on the extracted first feature and the extracted second feature; and the user is authenticated if the similarity score indicates a difference between the extracted first feature and the extracted second feature.
According to another embodiment, there is provided a user authentication system including: an extraction device configured to: extracting first features associated with a first eye pupil image of a user using a trained Convolutional Neural Network (CNN), and extracting second features associated with a second eye pupil image of the user using the trained CNN, wherein the second eye pupil image of the user is captured when white light is projected to a face of the user; a score generation device configured to generate a similarity score based on the extracted first feature and the extracted second feature; and an authentication device configured to authenticate the user if the similarity score indicates a difference between the extracted first feature and the extracted second feature.
Drawings
The embodiments are provided by way of example only and will be better understood and readily appreciated by those of ordinary skill in the art from the following written description, read in conjunction with the accompanying drawings, in which:
Fig. 1 is a flowchart illustrating an example of a user authentication method according to an embodiment.
Fig. 2a shows an eye pupil region image with no flash projected onto the eye; Figs. 2b, 2c and 2d show examples of eye pupil region images (of a live face) with a flash projected onto the eyes.
Fig. 3 shows a schematic diagram of a computer system suitable for performing at least some of the steps of a user authentication method.
Fig. 4 is a schematic diagram showing an example of a user authentication system according to an embodiment.
Detailed Description
Embodiments will now be described, by way of example only, with reference to the accompanying drawings. Like reference numbers and characters in the drawings indicate like elements or equivalents.
Some portions of the following description are presented in terms of algorithms and functional or symbolic representations of operations on data within a computer memory. These algorithmic descriptions and functional or symbolic representations are the means used by those skilled in the data processing arts to most effectively convey the substance of their work to others skilled in the art. An algorithm is here, and generally, considered to be a self-consistent sequence of steps leading to a desired result. These steps are those requiring physical manipulations of physical quantities such as electrical, magnetic or optical signals capable of being stored, transferred, combined, compared, and otherwise manipulated.
Unless specifically stated otherwise, and as will be apparent from the following, it is appreciated that throughout the present document, discussions utilizing terms such as "receiving," "scanning," "computing," "determining," "replacing," "generating," "initializing," "outputting," or the like, refer to the action and processes of a computer system, or similar electronic device, that manipulates and transforms data represented as physical quantities within the computer system into other data similarly represented as physical quantities within the computer system or other information storage, transmission or display devices.
Also disclosed herein are apparatuses for performing the operations of the methods. Such apparatus may be specially constructed for the required purposes, or may comprise a computer or other device selectively activated or reconfigured by a computer program stored in the computer. The algorithms and displays presented herein are not inherently related to any particular computer or other apparatus. Various machines may be used with programs in accordance with the teachings herein. Alternatively, the construction of a more specialized apparatus for carrying out the required method steps may be appropriate. The structure of a computer adapted to perform the various methods/processes described herein will appear from the description below.
Further, a computer program is implicitly disclosed herein, since it is clear to a person skilled in the art that the individual steps of the methods described herein can be implemented by computer code. The computer program is not intended to be limited to any particular programming language and implementation thereof. It will be appreciated that a variety of programming languages and code therefor may be used to implement the teachings of the disclosure as contained herein. Moreover, the computer program is not intended to be limited to any particular control flow. There are many other variations of computer programs that may use different control flows without departing from the spirit or scope of the present invention.
Furthermore, one or more steps of a computer program may be executed in parallel rather than sequentially. Such a computer program may be stored on any computer-readable medium. The computer-readable medium may include a storage device such as a magnetic or optical disk, a memory chip, or another storage device suitable for interfacing with a computer. The computer-readable medium may also include hard-wired media, such as exemplified by Internet systems, or wireless media, such as exemplified by the GSM mobile phone system and other wireless systems such as Bluetooth, ZigBee and Wi-Fi. Such a computer program, when loaded and executed on a computer, effectively creates means for implementing the steps of the preferred method.
"electronically know your customer (eKYC)" is a digital due diligence process performed by business entities or service providers to verify the identity of their customers to prevent identity fraud. Authentication may be considered a form of fraud detection in which the user's legitimacy is verified and a potential fraudster may be detected before fraud is carried out. Effective authentication can enhance the data security of the system, thereby protecting the digital data from unauthorized users.
The techniques described herein produce one or more technical effects. In particular, by calculating a similarity/difference between a first eye pupil image of a user and a second eye pupil image of the user (wherein the second eye pupil image of the user is photographed when white light is projected onto the face of the user), the user authentication method and system may reduce an attack success rate on the eKYC process, and may be particularly effective for recognizing an attack using a 3D mask. If the first eye pupil image of the user and the second eye pupil image of the user are determined to be similar or identical, an attack on the eKYC process may be recognized.
Further, the user authentication methods and systems may provide greater accuracy in detecting an attack by calculating a confidence score based on extracted supplemental features associated with a facial image of a user (where one or more facial images of the user were taken while projecting one or more non-white lights onto the user's face).
Embodiments seek to provide a user authentication method involving a flash-based face anti-spoofing method to detect faces spoofed with photos, screenshots, and 2D and 3D masks. For a live face, when the flash is projected onto the pupil region, white spots are expected to be seen in and/or around the pupil region. For a photo, screenshot, or 2D or 3D mask spoofing a human face, no white spots are expected in and/or around the pupil region.
An image capturing device (e.g., an RGB camera) is used to capture an image of the user's face without any flash projected onto it. The RGB camera is equipped with a CMOS sensor through which an image of the user's face (or any image presented by an attacker for authentication, such as a photograph, high-resolution screenshot, 2D/3D printed mask, etc.) can be acquired. The eye region of the captured face image is cropped so as to focus on the eye pupil region, and the cropped pupil region image (labeled "A") is collected. In image "A", no flash is projected onto the user's face/eyes.
The eye region can be cropped by first extracting face landmark points, including eye points; a technique for extracting such points is disclosed in "Facial Landmark Detection: A Literature Survey" by Yue Wu et al., published 15 May 2018. An eye region/pupil region is then cropped based on the eye points. For example, a rectangular area containing one eye may be cropped by taking the leftmost, rightmost, uppermost and lowermost eye points. The eye pupil region may be cropped in a similar manner, as in the sketch below.
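As an illustration of this cropping step, the following Python sketch builds the bounding rectangle from the extreme eye points; the function name, the margin parameter, and the assumption that landmarks arrive as (x, y) tuples are illustrative choices, not part of the patent:

```python
import numpy as np

def crop_eye_region(image: np.ndarray, eye_points, margin: int = 5) -> np.ndarray:
    """Crop a rectangular eye/pupil region from a face image.

    `image` is an H x W x 3 array; `eye_points` is a sequence of (x, y)
    landmark coordinates for one eye from any facial-landmark detector.
    """
    xs = [x for x, _ in eye_points]
    ys = [y for _, y in eye_points]
    # Rectangle spanning the leftmost, rightmost, uppermost and lowermost eye points.
    left = max(min(xs) - margin, 0)
    right = min(max(xs) + margin, image.shape[1])
    top = max(min(ys) - margin, 0)
    bottom = min(max(ys) + margin, image.shape[0])
    return image[top:bottom, left:right]
```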
The image capturing device is also used to photograph a face image of the user while a flash is projected onto the user's face. The eye region of the captured face image is cropped so as to focus on the eye pupil region, and the corresponding eye pupil region image (labeled "B") is collected.
White dots are expected to be clearly visible in the pupil region of image B (taken with the flash). In contrast, no white spots are expected in the pupil region of image A (taken without the flash).
For each of the pupil region images A and B, a convolutional neural network (CNN), denoted N1, is used to extract features. The CNN N1 may be trained using a large dataset of pupil region images, both with and without a flash projected onto the eyes. In one embodiment, the feature extractor is trained using resnet18 as the network structure.
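A minimal sketch of such a feature extractor, assuming the torchvision implementation of resnet18 with its classification head removed; the patent names the architecture but not the weights, preprocessing, or training procedure, so those details below are assumptions:

```python
import torch
import torchvision.models as models
import torchvision.transforms as T

# resnet18 backbone with the final classification layer replaced by an
# identity, so it emits a 512-dimensional embedding per pupil-region image.
backbone = models.resnet18(weights=None)  # weights would come from fine-tuning N1
backbone.fc = torch.nn.Identity()
backbone.eval()

preprocess = T.Compose([
    T.ToTensor(),
    T.Resize((224, 224)),
    T.Normalize(mean=[0.485, 0.456, 0.406], std=[0.229, 0.224, 0.225]),
])

@torch.no_grad()
def extract_features(pupil_image) -> torch.Tensor:
    """Embed one pupil-region image (H x W x 3 uint8 array or PIL image)."""
    batch = preprocess(pupil_image).unsqueeze(0)  # shape (1, 3, 224, 224)
    return backbone(batch).squeeze(0)             # shape (512,)
```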
A similarity score S1 is calculated based on the features extracted from the eye pupil region images A and B. In one embodiment, the cosine similarity method is used to calculate the similarity of the two feature vectors. Cosine similarity measures the similarity between two vectors of an inner product space: it is the cosine of the angle between the two vectors and indicates whether the two vectors point in approximately the same direction. The formula for cosine similarity is as follows:
$$\text{similarity} = \cos(\theta) = \frac{A \cdot B}{\lVert A \rVert \, \lVert B \rVert} = \frac{\sum_{i=1}^{n} A_i B_i}{\sqrt{\sum_{i=1}^{n} A_i^2}\,\sqrt{\sum_{i=1}^{n} B_i^2}}$$

where $A_i$ and $B_i$ are the components of vectors $A$ and $B$, respectively.
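A direct transcription of the formula as a Python sketch (the patent does not prescribe an implementation):

```python
import numpy as np

def cosine_similarity(a: np.ndarray, b: np.ndarray) -> float:
    """Cosine of the angle between feature vectors a and b, in [-1, 1]."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Usage: S1 compares the embeddings of pupil images A and B.
# s1 = cosine_similarity(features_a, features_b)
```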
If the similarity score S1 is greater than the predetermined threshold T1, the two images A and B are similar. In other words, the result indicates that image B is a spoofed image (e.g., an image of a 3D mask or a screenshot), because no white dots are observed in the eye pupil region. On the other hand, if S1 is less than T1, the two images A and B are different; in other words, white spots are observed in the eye pupil region of image B, as expected for a live face.
In an embodiment, to make fraud detection more robust and accurate, an additional step is performed: a series of colored lights is projected onto the user's face to obtain multiple frames with different colors (e.g., without limitation, red, blue, yellow, green). The colored light is projected onto the entire face, and the multiple frames capture the entire face, not only the eye pupil regions (i.e., not only the eye pupil region images A and B).
For the multiple frames with different colors, features are extracted using a second CNN, denoted N2. The CNN N2 may be trained using a large-scale set of multi-color frames of both live and spoofed faces. In one embodiment, resnet18 is employed as the network structure to train the binary classifier N2.
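A sketch of how the binary classifier N2 might be set up and trained; the patent specifies only the resnet18 architecture, so the loss, optimizer hookup, class labels, and input layout below are assumptions:

```python
import torch
import torch.nn as nn
import torchvision.models as models

# Binary live/spoof classifier over colour-sequence frames.
model = models.resnet18(weights=None)
model.fc = nn.Linear(model.fc.in_features, 2)  # classes: 0 = spoof, 1 = live

criterion = nn.CrossEntropyLoss()

def train_step(batch: torch.Tensor, labels: torch.Tensor,
               optimizer: torch.optim.Optimizer) -> float:
    """One gradient step on a batch of (N, 3, 224, 224) frames and 0/1 labels."""
    optimizer.zero_grad()
    loss = criterion(model(batch), labels)
    loss.backward()
    optimizer.step()
    return loss.item()
```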
The confidence score S2 is calculated based on features extracted from multiple frames having different colors.
The similarity score S1 and the confidence score S2 may be fused to obtain the final authentication result. The decision function may be defined as: if S1 < T1 or S2 > T2, the result is a live face; otherwise, the result is a spoofed face. The predetermined thresholds T1 and T2 may be calculated based on a validation data set.
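The fused decision can be written directly; this is a transcription of the rule above, with thresholds T1 and T2 assumed to come from the validation set:

```python
def is_live_face(s1: float, s2: float, t1: float, t2: float) -> bool:
    """Accept as live when the two pupil images differ (s1 < t1) or the
    colour-sequence classifier is sufficiently confident (s2 > t2)."""
    return s1 < t1 or s2 > t2
```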
In summary, the authentication method can be divided into two stages. The first, "flash" stage involves obtaining a similarity score S1 based on features extracted from the eye pupil region image A (no flash projected onto the user's eye pupil region) and the eye pupil region image B (flash projected onto the user's eye pupil region). The second, "color sequence" stage involves obtaining a confidence score S2 based on features extracted from multiple frames captured while lights of different colors are projected onto the user's face.
The color sequence stage can effectively detect most spoofing attacks, such as high-resolution photos/screenshots and 2D paper masks. However, for a 3D mask, the color sequence stage may not work properly, because the surface or material of the 3D mask may be very similar to a human face. During the "flash" stage, when a flash (very bright white light) is projected onto the human eye, especially the pupil region, a significant difference can be observed between the two frames (i.e., with/without flash); if spoofing is done with a 3D mask, no such difference is found, because the pupil of a human eye and the material of a 3D mask respond to the flash differently. To make fraud detection more robust and cover more types of attacks, the similarity score S1 and the confidence score S2 are fused together to make the final decision.
Fig. 1 is a flowchart 100 illustrating an example of a user authentication method according to an embodiment. At step 102, a trained convolutional neural network (CNN) extracts one or more first features associated with a first eye pupil image of a user. At step 104, the same trained CNN extracts one or more second features associated with a second eye pupil image of the user. The second eye pupil image is taken while white light (a flash) is projected onto the user's face; the first eye pupil image is taken without white light projected onto the user's face.
The CNN may be trained using a large dataset of eye pupil images in which white light is projected onto the pupil, together with a large dataset of eye pupil images in which it is not. The CNN may employ a resnet18 network architecture.
The light source for the white flash may be the built-in flash of a smartphone with an integrated camera. The smartphone's camera may be used to capture the first eye pupil image of the user (i.e., with the built-in flash off) and the second eye pupil image of the user (i.e., with the built-in flash on).
In particular, the method may comprise capturing, using an image sensor (of the smartphone's camera), a first face image of the user containing the first pupil image. An eye detection method is then applied to the first face image of the user to generate a first eye detection frame; the area of the first face image within the first eye detection frame corresponds to the first pupil image. Similarly, the method may include capturing, using the image sensor, a second face image of the user containing the second eye pupil image, and applying the eye detection method to the second face image to generate a second eye detection frame; the region of the second face image within the second eye detection frame corresponds to the second eye pupil image.
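One possible realization of the eye detection step, sketched with dlib's 68-point landmark model; the library choice, the model file path, and the landmark indices follow dlib's conventions, not the patent:

```python
import dlib
import numpy as np

detector = dlib.get_frontal_face_detector()
# Standard dlib 68-point landmark model; the file path is an assumption.
predictor = dlib.shape_predictor("shape_predictor_68_face_landmarks.dat")

LEFT_EYE = range(36, 42)  # left-eye landmark indices in the 68-point scheme

def eye_detection_frame(face_image: np.ndarray):
    """Return (left, top, right, bottom) of the detected eye region,
    i.e. the 'eye detection frame' described above."""
    face = detector(face_image)[0]       # first detected face
    shape = predictor(face_image, face)
    xs = [shape.part(i).x for i in LEFT_EYE]
    ys = [shape.part(i).y for i in LEFT_EYE]
    return min(xs), min(ys), max(xs), max(ys)
```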
At step 106, a similarity score is generated based on the extracted first feature and the extracted second feature.
At step 108, the user is authenticated if the similarity score indicates a difference between the extracted first feature and the extracted second feature.
In the case of a photograph, screenshot, or 2D or 3D mask spoofing a human face, when a flash is projected onto the eye pupil region, white dots are not visible in and/or around the eye pupil region.
On the other hand, in the case of a live face, when the flash is projected onto the eye pupil region, white spots can be seen in and/or around the eye pupil region. Fig. 2a shows an example of an eye pupil area image 202 with no flash projected onto the eye. For comparison, Figs. 2b, 2c and 2d show examples of eye pupil region images (of a live face) with a flash projected onto the eyes. In Fig. 2b, a white spot 204 can be seen in the pupil region. In Fig. 2c, two white dots 206 can be seen in the eye pupil region. In Fig. 2d, white dots 208 can be seen around the eye pupil area. Depending on the number of flashes and their angle with respect to the eye pupil, white spots may appear at different locations in and/or around the eye pupil area.
The presence of white spots in and/or around the pupil area of a live face when a flash is projected onto the eyes is due to light reflecting from the optic nerve. This occurs when the flash enters the eye at an angle, producing a white-eye effect, also referred to as "disc reflection" or "white reflection".
In addition to the white-eye effect, red dots may appear in and/or around the pupil area of a live face when a flash is projected onto the eyes. This is known as the red-eye effect: the flash occurs too quickly for the pupil to close, so much of the light enters the eye through the pupil, reflects off the back of the eye (where there is a large amount of blood), and exits back through the pupil. The camera records this reflected light.
The white eye effect or the red eye effect causes a difference between the first eye pupil image of the user and the second eye pupil image of the user. Therefore, there is a difference between the extracted first feature and the extracted second feature. If the similarity score (determined at step 106) indicates a difference between the extracted first feature and the extracted second feature, the user may be authenticated.
In an embodiment, to make fraud detection more robust and accurate, the method may further comprise the following steps. First, a trained supplemental convolutional neural network (CNN) is used to extract one or more supplemental features associated with a supplemental facial image of the user, where the supplemental facial image is taken while non-white light is projected onto the user's face. A confidence score is then generated based on the extracted supplemental features.
In one embodiment, the confidence score is predicted using the softmax function of the trained supplemental CNN. The softmax function is used in the last layer of a neural-network-based classifier to map the unnormalized outputs of the network to a probability distribution over the predicted output classes. The supplemental CNN is trained on a large-scale set of facial image sequences, both live and spoofed. During the authentication phase, the confidence score is generated by the trained supplemental CNN with the softmax function, based on a face image taken while non-white light is projected onto the user's face.
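A sketch of the confidence-score computation, assuming the classifier's last layer emits two logits and that class index 1 denotes "live" (an illustrative assumption; the patent does not fix the class layout):

```python
import torch

def confidence_score(logits: torch.Tensor, live_class: int = 1) -> float:
    """Map the supplemental CNN's raw outputs to a live-face probability."""
    probs = torch.softmax(logits, dim=-1)
    return float(probs[live_class])
```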
To make fraud detection more robust and accurate, the user is authenticated in the event that (i) the similarity score indicates a discrepancy between the extracted first feature and the extracted second feature, or (ii) the confidence score is greater than a predetermined confidence threshold.
The supplemental CNN is different from the CNN described at step 102 above. It may be trained using a large dataset of facial images, including live facial images and spoofed facial images captured while projecting non-white light. The supplemental CNN may employ a resnet18 network structure.
To make fraud detection more robust and accurate, multiple supplemental facial images of the user may be taken while multiple non-white lights are projected onto the user's face in sequence (e.g., blue, then red, then yellow). The non-white light source may be the display screen of the smartphone, configured to display a blue screen, then a red screen, then a yellow screen, and so on. An image sensor (of the smartphone's camera) may be used to capture an image of the user's face while each color is displayed. In other words, if three different non-white lights are displayed and thus projected onto the user's face, three different facial images of the user are captured.
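As a sketch, the capture loop might look like the following; show_fullscreen_color and capture_frame are hypothetical stand-ins for the platform's display and camera APIs, which the patent does not name:

```python
COLOR_SEQUENCE = ["blue", "red", "yellow"]  # example order from the text

def capture_color_sequence(show_fullscreen_color, capture_frame):
    """Capture one face image per projected display colour."""
    frames = []
    for color in COLOR_SEQUENCE:
        show_fullscreen_color(color)    # illuminate the face via the screen
        frames.append(capture_frame())  # grab a frame under that colour
    return frames
```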
The method also includes extracting a plurality of supplemental features associated with a plurality of supplemental facial images of the user using the trained supplemental CNN. Thereafter, a confidence score is generated based on the extracted plurality of supplemental features.
A similarity score greater than a predetermined similarity threshold indicates a difference between the extracted first feature and the extracted second feature. The similarity threshold and the confidence threshold may be determined separately based on a validation data set, which includes live face images and spoofed face images; a Receiver Operating Characteristic (ROC) curve may be calculated from the validation data and the predicted labels. In one embodiment, the threshold is set at the point on the ROC curve where the FAR (false acceptance rate) equals 0.01 or 0.001.
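A sketch of this threshold selection using scikit-learn's ROC utilities on a validation set; the labeling convention (1 = live, 0 = spoof) and the exact score orientation are assumptions:

```python
import numpy as np
from sklearn.metrics import roc_curve

def threshold_at_far(labels: np.ndarray, scores: np.ndarray,
                     target_far: float = 0.01) -> float:
    """Pick the threshold whose false-acceptance rate (FPR on the
    validation set, where accepting a spoof is a false positive)
    does not exceed target_far."""
    fpr, tpr, thresholds = roc_curve(labels, scores)
    idx = int(np.searchsorted(fpr, target_far, side="right")) - 1
    return float(thresholds[max(idx, 0)])
```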
Fig. 3 shows a schematic diagram of a computer system suitable for performing at least some of the steps of a user authentication method.
The following description of computing system/computing device 300 is provided by way of example only and is not intended to be limiting.
As shown in fig. 3, the exemplary computing device 300 includes a processor 304 for executing software routines. Although a single processor is shown for clarity, computing device 300 may also include a multi-processor system. The processor 304 is connected to a communication infrastructure 306 to communicate with other components of the computing device 300. The communication infrastructure 306 may include, for example, a communication bus, a crossbar, or a network.
Computing device 300 also includes a main memory 308, such as Random Access Memory (RAM), and a secondary memory 310. The secondary memory 310 may include, for example, a hard disk drive 312 and/or a removable storage drive 314, where the removable storage drive 314 may include a magnetic tape drive, an optical disk drive, etc. The removable storage drive 314 reads from and/or writes to a removable storage unit 318 in a well known manner. Removable storage unit 318 may comprise a magnetic tape, an optical disk, etc. which is read by and written to by removable storage drive 314. As will be appreciated by one skilled in the relevant art, removable storage unit 318 includes a computer-readable storage medium having stored therein computer-executable program code instructions and/or data.
In alternative embodiments, secondary memory 310 may additionally or alternatively include other similar means for allowing computer programs or other instructions to be loaded into computing device 300. Such means may include, for example, a removable storage unit 322 and an interface 320. Examples include a removable storage chip (e.g., an EPROM or PROM) and its associated socket, and other removable storage units 322 and interfaces 320 that allow software and data to be transferred from the removable storage unit 322 to computer system 300.
Computing device 300 also includes at least one communication interface 324. Communications interface 324 allows software and data to be transferred between computing device 300 and external devices via communications path 326. In various embodiments, communication interface 324 allows data to be transferred between computing device 300 and a data communication network, such as a public or private data communication network. The communication interface 324 may be used to exchange data between different computing devices 300, which computing devices 300 form part of an interconnected computer network. Examples of communication interface 324 may include a modem, a network interface (such as an ethernet card), a communication port, an antenna with associated circuitry, and the like. The communication interface 324 may be wired or may be wireless. Software and data transferred via communications interface 324 are in the form of signals which may be electrical, electromagnetic, optical or other signals capable of being received by communications interface 324. These signals are provided to the communications interface via communications path 326.
Optionally, the computing device 300 further comprises: a display interface 302 that performs operations for presenting images to an associated display 330; and an audio interface 332 that performs operations for playing audio content via an associated speaker 334.
As used herein, the term "computer program product" may refer, in part, to removable storage unit 318, removable storage unit 322, a hard disk installed in hard disk drive 312, or a carrier wave carrying software to communication interface 324 over communication path 326 (a wireless link or cable). Computer-readable storage media refers to any non-transitory tangible storage medium that provides recorded instructions and/or data to computing device 300 for execution and/or processing. Examples of such storage media include floppy disks, magnetic tape, CD-ROMs, DVDs, Blu-ray™ discs, hard drives, ROM or integrated circuits, USB memory, magneto-optical disks, and computer-readable cards such as PCMCIA cards, whether internal or external to computing device 300. Examples of transitory or intangible computer-readable transmission media that may also participate in providing software, applications, instructions, and/or data to computing device 300 include radio or infrared transmission channels, network connections to another computer or networked device, and the Internet or Ethernet, etc., including information recorded in e-mail transmissions, on websites, and the like.
Computer programs (also called computer program code) are stored in main memory 308 and/or secondary memory 310. Computer programs may also be received via communications interface 324. Such computer programs, when executed, enable computing device 300 to perform one or more features of embodiments discussed herein. In various embodiments, the computer programs, when executed, enable the processor 304 to perform the features of the embodiments described above. Accordingly, such computer programs represent controllers of the computer system 300.
The software is stored in a computer program product and may be loaded into computing device 300 using removable storage drive 314, hard drive 312, or interface 320. Alternatively, the computer program product may be downloaded to computer system 300 over communications path 326. The software, when executed by the processor 304, causes the computing device 300 to perform the functions of the embodiments described herein.
It should be understood that the embodiment of fig. 3 is given by way of example only. Thus, in some embodiments, one or more features of computing device 300 may be omitted. Also, in some embodiments, one or more features of computing device 300 may be combined together. Additionally, in some embodiments, one or more features of computing device 300 may be separated into one or more components.
Fig. 4 is a schematic diagram illustrating an example of a user authentication system 400 according to an embodiment. The user authentication system includes an extraction device 402, a score generation device 404, and an authentication device 406. The extraction device 402 extracts one or more first features associated with a first pupil image of a user using a trained Convolutional Neural Network (CNN). The extraction device 402 also extracts one or more second features associated with a second eye pupil image of the user using the trained CNN. The second eye pupil image of the user is taken when the white light is projected onto the face of the user.
The score generation device 404 generates a similarity score based on the extracted first feature and the extracted second feature. The authentication device 406 authenticates the user if the similarity score indicates a difference between the extracted first feature and the extracted second feature.
The extraction device 402 may also extract one or more supplemental features associated with a supplemental facial image of the user using a trained supplemental Convolutional Neural Network (CNN). The supplemental facial image of the user is taken while projecting non-white light onto the user's face. The score generation device 404 may generate a confidence score based on the extracted supplemental features. The authentication device 406 authenticates the user if: (i) the similarity score indicates a difference between the extracted first feature and the extracted second feature, or (ii) the confidence score is greater than a predetermined confidence threshold.
The system 400 may also include an image sensor 408 to capture a first face image (containing a first pupil image) of the user and apply an eye detection method to the first face image of the user to generate a first eye detection box. An area of the first face image of the user within the first eye detection frame corresponds to the first pupil image. The image sensor 408 also captures a second face image (including a second eye pupil image) of the user, and applies the eye detection method to the second face image of the user to generate a second eye detection frame. The region of the second face image of the user within the second eye detection frame corresponds to the second eye pupil image.
The CNN may be trained using a large data set of eye pupil images that include an eye pupil upon which white light is projected. The supplemental CNN may be trained using a large data set comprising live face images captured by projecting non-white light and spoofed face images. Both CNN and supplementary CNN may employ a resnet18 network architecture.
Multiple supplemental facial images of the user may be taken while multiple non-white lights are sequentially projected onto the user's face. The extraction device 402 may use the trained supplemental CNN to extract a plurality of supplemental features associated with a plurality of supplemental facial images of the user, and the score generation device may generate a confidence score based on the plurality of extracted supplemental features.
A similarity score greater than a predetermined similarity threshold indicates a difference between the extracted first feature and the extracted second feature. The similarity threshold and the confidence threshold are determined separately based on the validation dataset.
The term "configured" is used herein in connection with systems, devices, and computer program components. For a system of one or more computers configured to perform a particular operation or action, it is meant that the system has installed thereon software, firmware, hardware, or a combination thereof that in operation causes the system to perform the operation or action. For one or more computer programs configured to perform specific operations or actions, it is meant that the one or more programs include instructions, which when executed by a data processing apparatus, cause the apparatus to perform the operations or actions. By dedicated logic circuitry configured to perform a particular operation or action is meant that the circuitry has electronic logic to perform the operation or action.
It will be appreciated by persons skilled in the art that numerous variations and/or modifications may be made to the invention as shown in the specific embodiments herein without departing from the spirit or scope of the invention as broadly described. The described embodiments are, therefore, to be considered in all respects as illustrative and not restrictive.

Claims (14)

1. A user authentication method, comprising:
extracting a first feature associated with a first eye pupil image of a user using a trained convolutional neural network;
extracting a second feature associated with a second eye pupil image of the user using the trained convolutional neural network, wherein the second eye pupil image of the user is captured while white light is projected onto the user's face;
generating a similarity score based on the extracted first feature and the extracted second feature; and
authenticating the user if the similarity score indicates that there is a difference between the extracted first feature and the extracted second feature.

2. The method of claim 1, further comprising:
extracting a supplemental feature associated with a supplemental face image of the user using a trained supplemental convolutional neural network, wherein the supplemental face image of the user is captured while non-white light is projected onto the user's face;
generating a confidence score based on the extracted supplemental feature; and
authenticating the user if: (i) the similarity score indicates a difference between the extracted first feature and the extracted second feature, or (ii) the confidence score is greater than a predetermined confidence threshold.

3. The method of claim 1 or 2, further comprising:
capturing, using an image sensor, a first face image of the user containing the first eye pupil image;
applying an eye detection method to the first face image of the user to generate a first eye detection frame, wherein the region of the first face image of the user within the first eye detection frame corresponds to the first eye pupil image;
capturing, using the image sensor, a second face image of the user containing the second eye pupil image; and
applying the eye detection method to the second face image of the user to generate a second eye detection frame, wherein the region of the second face image of the user within the second eye detection frame corresponds to the second eye pupil image.

4. The method of any preceding claim, wherein:
the convolutional neural network is trained using a large dataset of eye pupil images, the eye pupil images including pupils onto which the white light is projected; and
the convolutional neural network employs a resnet18 network structure.

5. The method of claim 2, wherein:
the supplemental convolutional neural network is trained using a large dataset of face images, the face images including live face images and spoofed face images captured while projecting the non-white light; and
the supplemental convolutional neural network employs a resnet18 network structure.

6. The method of claim 2, wherein a plurality of supplemental face images of the user are captured while a plurality of non-white lights are sequentially projected onto the user's face, the method further comprising:
extracting, using the trained supplemental convolutional neural network, a plurality of supplemental features associated with the plurality of supplemental face images of the user; and
generating the confidence score based on the extracted plurality of supplemental features.

7. The method of any preceding claim, wherein:
the similarity score being greater than a predetermined similarity threshold indicates that there is a difference between the extracted first feature and the extracted second feature; and
the similarity threshold and the confidence threshold are respectively determined based on a validation data set.

8. A user authentication system, comprising:
an extraction device configured to:
extract a first feature associated with a first eye pupil image of a user using a trained convolutional neural network; and
extract a second feature associated with a second eye pupil image of the user using the trained convolutional neural network, wherein the second eye pupil image of the user is captured while white light is projected onto the user's face;
a score generation device configured to generate a similarity score based on the extracted first feature and the extracted second feature; and
an authentication device configured to authenticate the user if the similarity score indicates that there is a difference between the extracted first feature and the extracted second feature.

9. The system of claim 8, wherein the extraction device is further configured to extract a supplemental feature associated with a supplemental face image of the user using a trained supplemental convolutional neural network, the supplemental face image of the user being captured while non-white light is projected onto the user's face; a confidence score is generated based on the extracted supplemental feature; and the user is authenticated if: (i) the similarity score indicates a difference between the extracted first feature and the extracted second feature, or (ii) the confidence score is greater than a predetermined confidence threshold.

10. The system of claim 8 or 9, further comprising an image sensor configured to:
capture a first face image of the user containing the first eye pupil image;
apply an eye detection method to the first face image of the user to generate a first eye detection frame, wherein the region of the first face image of the user within the first eye detection frame corresponds to the first eye pupil image;
capture a second face image of the user containing the second eye pupil image; and
apply the eye detection method to the second face image of the user to generate a second eye detection frame, wherein the region of the second face image of the user within the second eye detection frame corresponds to the second eye pupil image.

11. The system of any one of claims 8 to 10, wherein the convolutional neural network is trained using a large dataset of eye pupil images, the eye pupil images including pupils onto which the white light is projected, and the convolutional neural network employs a resnet18 network structure.

12. The system of claim 9, wherein the supplemental convolutional neural network is trained using a large dataset of face images, the face images including live face images and spoofed face images captured while projecting the non-white light, and the supplemental convolutional neural network employs a resnet18 network structure.

13. The system of claim 9, wherein a plurality of supplemental face images of the user are captured while a plurality of non-white lights are sequentially projected onto the user's face;
the extraction device is further configured to extract, using the trained supplemental convolutional neural network, a plurality of supplemental features associated with the plurality of supplemental face images of the user; and
the score generation device is further configured to generate the confidence score based on the extracted plurality of supplemental features.

14. The system of any one of claims 8 to 13, wherein the similarity score being greater than a predetermined similarity threshold indicates that there is a difference between the extracted first feature and the extracted second feature, and wherein the similarity threshold and the confidence threshold are respectively determined based on a validation data set.
CN202011403435.2A 2020-04-30 2020-12-04 User authentication method and system Pending CN112597466A (en)

Applications Claiming Priority (2)

Application Number Priority Date Filing Date Title
SG10202003994RA SG10202003994RA (en) 2020-04-30 2020-04-30 A User Authentication Method And System
SG10202003994R 2020-04-30

Publications (1)

Publication Number Publication Date
CN112597466A true CN112597466A (en) 2021-04-02

Family

ID=72643816

Family Applications (1)

Application Number Title Priority Date Filing Date
CN202011403435.2A Pending CN112597466A (en) 2020-04-30 2020-12-04 User authentication method and system

Country Status (2)

Country Link
CN (1) CN112597466A (en)
SG (1) SG10202003994RA (en)


Patent Citations (5)

* Cited by examiner, † Cited by third party
Publication number Priority date Publication date Assignee Title
JP2007072861A (en) * 2005-09-08 2007-03-22 Omron Corp Impersonation detector and face authentication device
CN108345818A (en) * 2017-01-23 2018-07-31 北京中科奥森数据科技有限公司 A kind of human face in-vivo detection method and device
CN107292290A (en) * 2017-07-17 2017-10-24 广东欧珀移动通信有限公司 Face vivo identification method and Related product
CN108009531A (en) * 2017-12-28 2018-05-08 北京工业大学 A kind of face identification method of more tactful antifraud
CN110969077A (en) * 2019-09-16 2020-04-07 成都恒道智融信息技术有限公司 Living body detection method based on color change

Non-Patent Citations (1)

* Cited by examiner, † Cited by third party
Title
Li Deyi et al.: "Introduction to Artificial Intelligence" (人工智能导论), China Science and Technology Press, 31 August 2018, pages 163-165 *

Also Published As

Publication number Publication date
SG10202003994RA (en) 2020-09-29

Similar Documents

Publication Title
JP7365445B2 (en) Computing apparatus and method
KR102299847B1 (en) Face verifying method and apparatus
US12014571B2 (en) Method and apparatus with liveness verification
KR102655949B1 (en) Face verifying method and apparatus based on 3d image
EP3623995A1 (en) Periocular face recognition switching
US8908977B2 (en) System and method for comparing images
US20180034852A1 (en) Anti-spoofing system and methods useful in conjunction therewith
CN110163899A (en) Image matching method and image matching apparatus
CN108280418A (en) The deception recognition methods of face image and device
KR101810190B1 (en) User authentication method and apparatus using face identification
CN111144277B (en) Face verification method and system with living body detection function
KR102079952B1 (en) Method of managing access using face recognition and apparatus using the same
KR101724971B1 (en) System for recognizing face using wide angle camera and method for recognizing face thereof
EP4343689A1 (en) Body part authentication system and authentication method
CN115082992B (en) Human face liveness detection method, device, electronic device and readable storage medium
CN111066023A (en) Detection system, detection device and method thereof
CN113642639B (en) Living body detection method, living body detection device, living body detection equipment and storage medium
JP7264308B2 (en) Systems and methods for adaptively constructing a three-dimensional face model based on two or more inputs of two-dimensional face images
WO2018179723A1 (en) Facial authentication processing apparatus, facial authentication processing method, and facial authentication processing system
KR102380426B1 (en) Method and apparatus for verifying face
CN113033243A (en) Face recognition method, device and equipment
Wang et al. Enhancing QR Code System Security by Verifying the Scanner's Gripping Hand Biometric
CN112597466A (en) User authentication method and system
WO2018133584A1 (en) Identity authentication method and device
CN112613345A (en) User authentication method and system

Legal Events

Date Code Title Description
PB01 Publication
SE01 Entry into force of request for substantive examination
TA01 Transfer of patent application right

Effective date of registration: 20240921

Address after: Guohao Times City # 20-01, 128 Meizhi Road, Singapore

Applicant after: Ant Shield Co., Ltd.

Country or region after: Singapore

Address before: 45-01 Anson Building, 8 Shanton Avenue, Singapore

Applicant before: Alipay Labs Singapore Pte Ltd

Country or region before: Singapore