HK40011433A

HK40011433A - Systems and methods for spoof detection and liveness analysis

Info

Publication number: HK40011433A
Application number: HK42020001080.9A
Authority: HK
Inventors: R‧R‧德拉赫沙尼; J‧特普利
Original assignee: 居米奥公司
Priority date: 2015-06-16
Filing date: 2018-08-24
Publication date: 2020-07-17

Description

System and method for counterfeit detection and liveness analysis

The application is a divisional application of an original Chinese invention patent application with the application number of CN201680041628.3, the name of the original application is 'system and method for counterfeit detection and liveness analysis', and the application date of the original application is 2016, 5 and 31.

CROSS-REFERENCE TO RELATED APPLICATIONS

The present application claims priority and benefit of U.S. provisional patent application No. 62/180,481 entitled "Liveness Analysis Using vitamins Detection Using vital organs", filed on 16/6/2015, which is incorporated herein by reference in its entirety.

Technical Field

The present disclosure relates generally to image, sonic signal, and vibration signal analysis, and in particular, to image and signal processing techniques for detecting whether a subject depicted in an image is active.

Background

It may be generally desirable to limit certain individual access attributes or resources. Biometric systems may be used to authenticate the identity of an individual to grant or deny access to a resource. For example, an iris scanner may be used by a biometric security system to identify an individual based on unique structures in the individual's iris. However, if an imposter presents a pre-recorded image or video of the face of an authorized individual for scanning, then such a system may erroneously authorize the imposter. This false image or video may be displayed on a monitor, such as a Cathode Ray Tube (CRT) or Liquid Crystal Display (LCD) screen, in the form of a plain photograph or the like held in front of the camera for scanning. Other counterfeiting techniques include the use of a photo-realistic three-dimensional mask of the face of a legitimate user.

One category of existing anti-counterfeiting measures focuses primarily on still imaging (e.g., photo-based) attacks. These measures assume that: static spoofing attacks cannot reproduce the naturally occurring and distinct movements of different parts of an image (mostly within the face). The measures also assume that: each of the above motions in the live scan occurs at a different scale in terms of the natural agility and frequency of the associated muscle group. However, these measures can only detect static (e.g., picture-based) spoofing attacks, and require some window of time to observe at a high frame rate sufficient to be able to resolve the motion vectors described above to their desired velocity and frequency distribution (if present). The measures can also falsely reject a living body that remains completely stationary during scanning or falsely accept a static copy with additional motion, for example, by bending and shaking a fake photograph in some way.

The second category of existing anti-counterfeiting measures assumes: the quality of the photo or video copy of the biometric sample is poor and the image texture analysis method can therefore identify counterfeiters. However, the assumption of discernable low-quality counterfeit copies is unreliable, especially when advanced high-quality and extremely common high-definition recording and display technologies are emerging that can be found even in modern smart phones and tablet computers. Surprisingly, by relying on specific and technology-dependent counterfeit replication artifacts, such technologies have been shown to be data set dependent and have shown substandard generalization capability. The same disadvantages exist for another category of anti-counterfeiting measures (which is based on reference or non-reference image quality metrics) related to the second category.

Disclosure of Invention

In various implementations described herein, detecting physical attributes indicative of the presence of a living person is used to distinguish a living person's real face from images/videos, duress verification and other disguise and fraud authentication methods, and/or to identify a counterfeiter, for example, by detecting the presence of a device for replaying a legitimate user's recorded images/videos/other physical reconstruction to spoof a biometric system. This is accomplished, in part, by (a) detecting the characteristics of counterfeiters and (b) using three-dimensional face detection and two-factor pulse recognition to verify the liveness and physical presence of an individual.

Accordingly, in one aspect, a computer-implemented method comprises the steps of: transmitting one or more audio signals using an audio output component of a user device; receiving, using an audio input component of the user device, one or more reflections of the audio signal from a target; determining whether the target includes at least one of facial structures and facial tissues based on the one or more reflections; and determining whether the target is a counterfeit based at least in part on the determination of whether the target includes at least one of facial structures and facial tissues. The user device may be, for example, a mobile device including a smartphone, tablet computer, or laptop computer. The one or more audio signals may include short coded pulse sources, short-term continuous-conversion signals, or CTFM sources. One or more characteristics of the one or more audio signals may be randomized.

In one embodiment, the method further comprises the steps of: training a classifier to identify physical features of the target; and providing information based on the one or more reflections of the audio signal from the target as an input to the classifier, wherein determining whether the target is an imposter is further based at least in part on an output of the classifier received in response to the provided input.

In another embodiment, the method further comprises the steps of: receiving a plurality of images of the target; and determining whether the object includes a three-dimensional face structure based on the detected light reflections in the image.

In another embodiment, the method further comprises the steps of: receiving a plurality of images of the target; and identifying whether the target has a first pulse based on the image, wherein determining whether the target is a counterfeiter is further based at least in part on the identification of whether the target has a pulse. The first pulse may be identified using remote photoplethysmography.

In yet another embodiment, a second pulse of the target is identified by physical contact with the target, wherein determining whether the target is a counterfeiter is further based at least in part on a measurement of the second pulse. A determination may be made as to whether the second pulse correlates with the first pulse, wherein determining whether the target is a spoofer is further based at least in part on the correlation. Measuring the second pulse may include: receiving information associated with the second pulse from the user device or another handheld or wearable device. The information associated with the second pulse may include a ballistocardiographic signal.

The details of one or more embodiments of the subject matter described in this specification are set forth in the accompanying drawings and the description below. Other features, aspects, and advantages of the subject matter will become apparent from the description, the drawings, and the claims.

Drawings

Fig. 1A to 1C depict various use cases for counterfeit prevention and liveness detection.

FIG. 2 depicts a method for counterfeit prevention and liveness detection according to an embodiment.

Fig. 3 depicts example direct and indirect acoustic paths of acoustic detection pulses between a telephone headset and a microphone.

Fig. 4 and 5 depict demodulated echoes of an example matched filter that exhibits reflections of a monitor screen and a real human face, respectively.

Fig. 6 and 7 depict reflections from different facets of a human face and a monitor screen, respectively.

Like reference symbols and designations in the various drawings indicate like elements.

Detailed Description

Described in various implementations herein are systems and accompanying methods for providing software-based multi-level anti-spoofing and "liveness" detection techniques that combine three-dimensional (3D) "face authenticity" sensing of sound waves using face modulation sound reflections with multi-source/multi-path life detection. As used herein, "liveness" refers to a characteristic that tends to indicate the presence of a living person (not a counterfeiter or imitator of a living person, such as an image or prerecorded video of an eye or face, a three-dimensional model head, etc.). Such characteristics may include, for example, identifiable physical attributes such as face, pulse, breathing pattern, and the like. "face authenticity" refers to characteristics that tend to indicate the presence of a real face, such as the presence of eyes, nose, mouth, chin and/or other facial features and tissues arranged in a recognizable pattern. This definition of face authenticity may be augmented by including passive or active acoustic, light sensitive and/or electromagnetic features of real faces (non-counterfeit faces).

The present invention provides a new physics-based solution that can be fully implemented in software and particularly detects counterfeit screen playback (regardless of its quality). It overcomes the shortcomings of existing anti-counterfeiting solutions based on vision by assessing the likelihood of presenting a real 3D face to a user device by examining the sonic (and/or photometric) characteristics of the real 3D face, all in a manner transparent to the user. Advantageously, this technique only uses typical mobile phone headsets/sound transducers and microphones in various everyday environments to detect counterfeiters for biometric authentication. The acoustic signature obtained using existing hardware of the mobile device is weak and faces the challenges of multiple confounding factors that the described method needs to overcome. The causes of such poor acoustic signal-to-noise ratios include unwanted echoes, as well as acoustic path nonlinearities and bandwidth limitations (which include acoustic path nonlinearities and bandwidth limitations of the transducer), microphone/earpiece directivity and sensitivity, and internal reverberation of the device. Furthermore, because of the longer wavelength of the audio frequency band utilized, spatial resolution is reduced over existing ultrasonic sonar systems and most of the target reflections are instead dissipated via scattering, providing indirect detection of embedded acoustic features, as detailed herein.

In one embodiment, the anti-counterfeiting and liveness detection technique comprises: verifying the existence of the three-dimensional face shape structure; and measuring the pulses of the target using the plurality of sources. Three-dimensional face sensing may be performed using face modulated sound reflections (e.g., reflections from coded high-pitch probe signals emitted by a telephone headset or other sound transducer (similar to sonar), and signals received from a telephone microphone or other audio input) and/or structured light stereo vision (e.g., fast patterned illumination from a telephone screen). Life detection, such as pulse detection by the user, can be measured from the cardiac pumping action that induces changes in complexion and hand/body vibration. Heart rate detection can be achieved through a number of approaches: the heartbeat of the human body induces mechanical vibrations (also known as a heart attack) and detects pulses from skin color changes recorded by a red-green-blue (RGB) camera (also known as a remote photoplethysmogram (remote PPG or rPPG) — pulses of the user can also be detected via other wearable/mobile devices with heart rate transducers.

1A-1C illustrate various uses of the anti-counterfeiting and liveness analysis techniques as described herein. For example, in fig. 1A, the target user 104 uses his mobile device 102 (e.g., smartphone, tablet computer, etc.) to self-authenticate using biometric readings (e.g., eye scans) captured by the mobile device camera. In addition to a camera, the mobile device 102 may also utilize other sensors such as accelerometers, gyroscopes, fingertip heartbeat sensors, vibration sensors, audio output components (such as speakers, headphones, or other sound transducers), audio input components (such as microphones), and the like to verify the physical presence of the user using the presently described techniques. In FIG. 1B, the mobile device 106 captures an image or video of a target on an LCD monitor 106 or other display screen. Software executing on the mobile device 102 may use current techniques, such as three-dimensional face detection, evaluation of reflected light and/or sound signals, and pulse detection, to determine that a target is physically absent. FIG. 1C depicts the second user 110 holding the mobile device 102 and facing the mobile device 102 to the target user 104. In this example, although the physical presence of the target user 104 will be determined (e.g., by three-dimensional facial structures and visual pulse recognition), the second pulse reading taken by the mobile device 102 via physical contact between the device 102 and the second user 110 will not correspond to the visual pulse identified for the target user 104, and thus, the verification of the user identity will fail.

Other techniques for anti-counterfeiting and liveness analysis may be used with the techniques described herein. These techniques include those described in U.S. patent application No. 14/480,802 entitled "system and method for Liveness Analysis (Systems and Methods for Liveness Analysis)" filed 2014 9 and U.S. patent application No. 14/672,629 entitled "User-authenticated biobelts (Bio left for User Authentication)" filed 2015 3 30, which are incorporated herein by reference in their entirety.

An embodiment of a method for spoofing and liveness detection is depicted in FIG. 2. Beginning at step 202, a user device, such as the mobile device 102 or other cellular telephone, smartphone, tablet computer, virtual reality device, or other device used in biometric augmented user interaction (e.g., logging into a banking application when using biometric eye authentication) detects whether a facetted three-dimensional (3D) object is positioned in front of the device, rather than a fake such as a prerecorded video on a flat display.

The 3D face detection in step 202 may be accomplished using various methods or combinations thereof and may be based on the availability of certain sensors and transmitters on the user device. In one implementation, acoustic waves (e.g., high frequency acoustic waves) are used to determine whether to present a three-dimensional face or an alternative ground plane display or a non-face shaped 3D object to a biometric sensor (which may include, for example, a mobile device camera for image-based biometric techniques using any of the faces or sub-regions thereof, including sub-regions of the eyes). An example of a sound wave (acoustic wave) based technique is Continuous Transmission Frequency Modulation (CTFM), where the distance to different facets/surfaces of a human face is measured based on the time-varying frequency detected by a measuring device, e.g. an audio output component of the device (headphones, speakers, sound transducers) in combination with an audio input component of the same or a different device (microphone). In the case of biometric authentication, acoustic distance measurements can also be used to verify: the measured interocular distance corresponds to the expected interocular distance determined at the time of target registration. The above is an example of a true dimensional measurement check, but it should be understood that other device-to-face distance measurements may also be used, such as measurements from the camera's focus mechanism. Techniques for 3D face detection will be described in further detail below.

In another implementation, the presence and extent of photometric stereo is analyzed for characteristics that tend to indicate the presence of three-dimensional face shaped objects. The efficacy of the photometric effect can also be combined with the sonic measurement distance mentioned earlier and optionally compared to photometric stereo data gathered during the biometric enrollment phase. If the device screen can be driven at a higher frame rate to make the screen-induced temporal changes in the photoreception detection more difficult for the user to perceive, then the photoreception measurement can be operated with a camera having a lower frame rate using aliasing. It should be noted that if the above three-dimensional characteristics are measured with a higher accuracy, the 3D contour of the user's face determined using sonic and/or photometric measurements when making a valid enrollment may become user-specific to some extent and may introduce more specificity (as a soft biometric technique) into the anti-counterfeiting measurements described herein.

If a face-shaped 3D structure is detected, the device may optionally further verify liveness by detecting whether the face-shaped structure has pulses present and within an expected range (using, for example, facial rPPG based on images captured by the device camera) (step 208). Otherwise, if no 3D facial structure is detected, liveness rejection is disabled and targets are rejected (step 230). If a valid pulse is detected, a 3D face object with apparent blood circulation is determined as the first stage of liveness detection and counterfeit prevention. This phase limits the spoofing attack to facial 3D structures with pulsating skin that can withstand the detour rPPG test, which is a high threshold.

In the second phase, the system may optionally attempt to correlate primary pulses detected from the face structure (e.g., the face rPPG after a sono-and/or photometric 3D face examination) with secondary pulse measurements obtained by different methods for stronger liveness detection/anti-counterfeiting (steps 212 and 216). Secondary pulse measurements may be accomplished, for example, by ballistocardiographic signals, which may be captured based on handheld device shaking induced by cardiac pumping and measured by a device motion transducer or pulse sensing wearable device (if available), or other suitable secondary approaches for examining heart rate or its harmonics. If the secondary pulse is not detected or is otherwise invalid (e.g., falls outside of an expected range), or if the correlation fails (e.g., the system detects that the pulse does not match the heart rate or other characteristic), the target is rejected (step 230). Conversely, if the foregoing steps verify liveness, then the target may be accepted as a live legitimate user (step 220). It should be appreciated that the verification stages described in this implementation need not be performed in the order described; rather, alternative step sequences are also contemplated. For example, one or more pulse measurements may be first taken, followed by using 3D face detection to reinforce the conclusion of liveness versus impersonation determined based on the pulse measurements. Furthermore, all steps need not be performed (e.g., whether a counterfeiter is present may be determined based solely on 3D face detection).

Acoustic 3D face authenticity measurement

This acoustic wave technique detects the presence of a human face (intended for legitimate eye-facing or face biometric scanning of a device) or other non-facial structure object (such as a flat screen or other fake situation) displayed to a biometric sensor (such as a front-facing camera of a mobile phone). The techniques are for image-based biometric techniques using faces or sub-regions thereof, including sub-regions of the eyes. Examples of sonic sources that can be used for 3D face authenticity measurements include, but are not limited to, short coded pulse sources, short term continuously variable signals, and CTFM.

Short coded pulse audio sources include audio sources in which a maximum correlation code (e.g., a Barker code (Barker) 2-13 pattern, whether in its original form or a binary phase shift keying code) and/or a short-term continuously variable frequency signal (e.g., a linear frequency sweep having an envelope such as a kaser window) are transmitted through an audio output component such as a telephone headset or other on-board sound transducer. If multiple audio output components are present, beamforming can be used to preferentially spatially concentrate the sound source. Matched filtering or autocorrelation decoding of the echoes from the pulse compression techniques described above allows the reconstruction of coarse 3D features of the target (which also reflects its texture and material structure due to the acoustic impedance of the impacted facets). This information is presented to the user device by the time of flight and morphology of the received echoes, similar to what is seen in sonar and radar systems. Matched filtering requires cross-correlating the received echoes with the original source signals. The autocorrelation of the echo with itself may alternatively be used, where the instant received replica of the forward signal effectively becomes the detection template. In any case, further post-processing, such as computing the amplitude of the analyzed version of the decoded signal, is performed prior to feature selection and classification.

For CTFM sound sources, the distance to different facets/surfaces of the target (here the user's face or a fake screen) is measured based on the time-varying frequency detected by the high-pitched sound waves emitted by the device (e.g. through the phone's earpiece).

In some implementations, acoustic distance measurements are also used to check the total face distance to ensure proper correspondence to the expected interocular distance measured via imaging at the time of biometric enrollment (true dimension measurement check). In some implementations, the low signal-to-noise ratio of the echoes can be further overcome by averaging multiple sources and/or multi-microphone beamforming and noise cancellation.

It should be noted that there are two aspects to this technique: (i) rejecting non-face objects (e.g., counterfeit screens); and (ii) accepting a 3D acoustic profile of a face, particularly a profile similar to that of an enrolled user (e.g., a user-specific acoustic face template established during enrollment), to improve accuracy of counterfeit prevention by exploring object specificity. The latter aspect utilizes learning facial features from acoustic reflections (appearance learning), which can be performed using well-known machine learning techniques such as classifier integration and deep learning. The accuracy of acoustic 3D face contour recognition can be further improved by including an auxiliary signal from the image sensor. For example, if a user wears glasses or hides a portion of their face with a scarf, the echo profile will change. Image analysis may reveal these variations and adjust the classification module accordingly, for example, by using templates and thresholds appropriate for these situations.

Photometric 3D face authenticity measurement

In some implementations, the 3D face authenticity measurement is further enhanced by checking the presence and extent of facial 3D structures from interrogating illumination changes, such as photometric stereo vision induced by high frequency patterns and colors of the mobile device screen (structured screen illumination) encoded using illumination intensity, phase and frequency, after (or before or simultaneously with) acoustic face structure detection. The photometric stereo effect generally depends on the light source distance and can therefore be combined with the sonar measurement distance mentioned earlier.

In further implementations, the verification photometric features can be compared to one or more photometric features taken during enrollment of the user to make these measurements specific to a more sensitive and specific subject. By combining improved acoustic and photometric 3D face profiles, the combination can not only detect counterfeiters with higher accuracy while continuing to avoid rejecting real users, but can detect user-specific acoustic photometric face features as soft biometric and thus further improve the performance of the primary biometric modality, which is an additional soft recognition modality.

The sensitization measurement may also utilize the imaging sensor aliasing for a better user experience if, for example, the device screen can be driven at a higher frame rate to make the screen-induced temporal variation of the sensitization detection more imperceptible. That is, if the camera is driven at a lower frame rate than the screen, the aliasing frequency components of the structured light can be used and proceed normally.

Heartbeat measurement

In some implementations, if face authenticity is confirmed acoustically and/or photometrically, the presence of facial pulses (and values in some examples) can be detected/measured from a front-facing camera of the mobile device within a shorter observation period than is required for full rPPG pulse rate calculations. This rapid inspection limits the spoofing attack to facial 3D structures with pulsating skin, which is a very high threshold. This pulse recognition step can serve as a supplemental measure of anti-counterfeiting protection after acoustic (and optionally photometric) face authenticity measurements.

In further embodiments, the proposed method measures and cross-validates the multipath cardiac activity of the user for even more rigorous liveness checks. The heartbeat signal can be determined based on, for example, the 3D confirmed face rPPG mentioned earlier. The additional heartbeat signal (or its major harmonics) can be recovered from ballistocardiographic signals (e.g., handheld device vibrations and their harmonics, as induced by the mechanical pumping action of the heart and as measured by the device motion transducer, and optionally as corroborated by the associated small vibrations fed into the detection from the device camera after rigorous signal processing and motion amplification). These additional heartbeat signals may be acquired by other heart rate sensors, if available, such as a health monitoring wearable device or other heart rate sensors embedded in the user's mobile device. In some embodiments, the motion sensor signal is pre-processed by filtering bandpass in the target heart rate frequency range and its harmonics. In other embodiments, heart rate harmonics are used as the primary ballistocardiographic signal. In further embodiments, the ballistocardiogram is augmented by a heart-induced magnification-related motion, as seen by, for example, a camera of the mobile device.

Once important real-time correlations between pulses and multiple heartbeat signals (e.g., rPPG and ballistocardiogram measurements) are detected, a greater likelihood of liveness can be guaranteed. This cardiac cycle activity score may be, for example, the real-time correlation/similarity strength between two heartbeat signals (ballistocardiogram and rPPG). This additional anti-counterfeiting measure uses the heartbeat of a user seeking biometric verification to cause the cardiac activity verification cycle to form a dead cycle (from handshaking (mechanical path) to perceptually confirming human faces/eyes (optical and acoustic paths)).

The presently described techniques may incorporate various heart rate detection techniques known in the art and described, for example, in: united states patent No. 8,700,137 entitled "Cardiac Performance Monitoring System for Use with Mobile Communications Devices," published 4/14 2014, "bio-phone: in accordance with physiological Monitoring of Peripheral smart phone Motion (Biophone from Peripheral smart phone movements), "hel nan de (Hernandez), McDuff (McDuff) and Picard (Picard), medical and biological Society Engineering (Engineering in Medicine and biological Society)," the 37 th annual international conference of the Institute of Electrical and Electronics Engineers (IEEE), 2015), and "implementation of Robust Motion ppg using Spatial Redundancy of Image sensors" (2015), Wang (Wang), saury (Stuijk) and german (de Haan), the Institute of Electrical and Electronics Engineers (IEEE) biological Engineering (IEEE Transactions biological), volume 62, 2, incorporated herein by reference in its entirety.

Additional embodiments

Referring now to fig. 3, during a biometric scan of a user's face and/or eye region using a front-facing camera, a telephone headset 302 (and/or other sound transducer including a plurality of speakers in a beamforming arrangement focused on a target face region) emits a series of signals to audibly interrogate the perceived authenticity of the interacting user's face, in accordance with the techniques described herein. If the authentication is a living body, the telephone microphone 304 collects mainly the reflection of the signal from the human face. However, during a spoofing attack, other copies of the screen or face may alternatively be presented. In some implementations, when the bottom microphone 304 of the device is used, the starting point of the sounding signal transmission is detected by the timestamp of the first transmission heard by the microphone 304 as the first and loudest received replica of the sound wave sounding (route 0), in view of the sound velocity and sound wave impedance of the sound source as it travels through/across the phone body. The raw signal is used for matched filtering along with its echo received by the phone's microphone 304 (which may include signals/echoes received via external route 1 (where the signal propagates from the headset 302 to the microphone through the air) and external route 2 (where the signal is reflected by the target and received by the microphone 304)). Examples of acoustic sources include pulse compression and/or maximum correlation sequences, such as short continuous-converted signals or barker/M-sequence codes.

In some implementations, a front-facing microphone (if available) is used to improve directionality, background noise suppression, and probe signal incipient detection. The directional polar pattern of the device microphone (e.g., cardiac line) may be selected for preferred directional reception. Multiple microphones on the device (if available) may be used for beamforming to improve directionality and thus improve reception of the face truth echo.

In some embodiments, the autocorrelation of the reflected sound is used to decode the face/false echo component of the reflected sound. This approach can yield better demodulation because the matched filter kernel is essentially the actual transmitted version of the probe waveform herein. In a further embodiment, the probe signals are of the CTFM type, and thus heterodyning is used to resolve the spatial profile and range of the target structure. Finally, the classifier can judge the perceived authenticity of the face based on features extracted from the demodulated echoes from any number of the above methods.

Based on the characteristics of the echoes recorded by the device's microphone, there are different ways to determine whether the sound is reflected by the user's face rather than a fake screen or other fake object, noting the shape of a particular multifaceted 3D shape of a human face and its absorptive/reflective properties versus, for example, the shape of a two-dimensional fake (e.g., an LCD copy of the face or eye region of interest) and its absorptive/reflective properties.

Fig. 4 and 5 depict example matched-filter demodulated echoes within the first 20cm of the acoustic path flight using barker 2 code sequences, with various acoustic reflections through routes 0,1, and 2 clearly observed (see fig. 3). More particularly, FIG. 4 depicts reflections caused by a monitor screen about 10cm to about 12cm from the phone that transmits the pulses, while FIG. 5 depicts different echo characteristics caused by a real human face about 10cm to about 14cm in front of the phone.

In some embodiments, the acoustic sounding signal is a maximum correlation signal, such as a barker 2 to 13 code (in its original form or using Binary Phase Shift Keying (BPSK) modulation, where the carrier frequency is shifted in phase by 180 degrees for each bit level change) or a pseudorandom M-sequence. In some implementations, the acoustic detection signal is composed of a short continuous variable frequency signal (which has various frequency ranges and sweep and amplitude envelopes). The probing signal may be, for example, a CTFM signal. These short high frequency signals are transmitted from an audio output component (e.g. a headset) which, in the case of e.g. a smartphone or tablet computer, naturally faces the target when captured using a front-facing camera. However, in some implementations, other or multiple device sound transducers are used for beamforming to focus acoustic detection preferentially on the biometric target.

In implementations of the disclosed technology, the acoustic detection signal may take various forms. For example, in one embodiment, the acoustic detection signal is a CTFM signal having a Hanning window linearly continuously variable frequency signal sweeping 16kHz to 20 kHz. In another embodiment, the sounding signal is a maximum correlation sequence, such as a barker 2 sequence with 180 degree shifted sinusoids BPSK at an 11.25kHz carrier frequency sampled at 44100 Hz. In another embodiment, the detection signal is a windowed, continuously variable frequency signal. The continuously variable frequency signal may be, for example, a cosine signal with a start-up frequency of 11.25kHz, which is linearly swept to 22.5kHz in 10ms and sampled at 44100 Hz. The windowing function may be a Keys window of length 440 samples (10ms at 44.1kHz sampling rate) with a beta value of 6. The foregoing values represent detected signal parameters that provide reasonably accurate results. However, it should be appreciated that the probe signal parameters that provide accurate results may vary based on device and audio input/output component characteristics. Accordingly, other value ranges are contemplated for use with the techniques of the present invention.

In some implementations, the initial phase, frequency, and exact playback start point of the transmitted probe signal, or even the encoding type itself, may be randomized by the mobile biometric module (e.g., for PSK encoding barker probe bursts). This randomization can thwart hypothetical (but extensive and elegant) attacks, where the false echoes are reproduced back on the phone to defeat the proposed sonic face plausibility checker. Since the attacker does not know the real-time randomized phase/type/starting point/frequency of PSK modulation of the encoded sonic sequence or other dynamic properties of the outgoing probe, it is assumed that the injected echo will not be demodulated by the matched filter and will not follow the exact expected pattern.

During the basic barker code/continuous variable signal/CTFM procedure, reflections of the probe signal delayed based on its round trip distance (and hence the frequency delay of the CTFM) are recorded by the device's microphone or other audio input component. The original continuous frequency signal or otherwise coded sound wave sounding may be detected by matched filtering or autocorrelation (for barker codes and short continuous frequency signals) or demodulated to baseband by multiplying the echo with the original frequency slope and getting a lower frequency by-product (heterodyne). Each impacted facet of the target reflects the probe pulse in a manner related to its texture and structural properties, such as the difference in acoustic wave impedance between air and the impacted surface and its size and shape, and to the distance of the acoustic wave source (acoustic round trip delay). Thus, in brief (assuming no noise and no useful background echo), a human face will have multiple reflections of lower magnitude (as reflected by its multiple major facets at the air-skin and soft tissue-bone interfaces), while a fake monitor screen, for example, will have a single stronger reflection (compare fig. 4 and 5).

In view of the round trip delay of each reflection, the distance of each reflecting target facet can be translated into a time delay of the matched filter/autocorrelation response or a dispersion of frequencies of the power spectral density or PSD (see fig. 6 and 7, which will be described further below), providing a target specific echo morphology. Different methods can be used to calculate the PSD signature from the CTFM signal. In some implementations, multiple clipping is applied to the demodulated echoes spanning 0Hz to 200Hz, and the trellis output is used as an input to a classifier (which may be, for example, a linear or gaussian kernel support vector machine or the like).

More specifically, in various implementations, one or more of the following steps are taken for the continuous variable frequency signal/coded pulse demodulation and target classification. In one example, sonic detection avoids loud ambient noise by frequently (e.g., every 100ms, every 500ms, every 1s, etc.) checking microphone readings, thereby listening for potentially interfering noise. This check may include: calculating a correlation (convolution based on the inversion time) continuous frequency converted signal/coded pulse detection signal; and setting the trigger threshold to the threshold obtained in a substantially quiet environment. In some implementations, an additional similarity check is made in real time after the sound detection signal is played to determine if the interfering noise occurs just after the sound source. If the interfering noise occurs just after the acoustic source, the conversation may be discarded. Multiple continuously variable signals may also be averaged (or median processed) at the signal or decision score level to improve the results.

In one embodiment, the pre-processing involves high-pass filtering of the received signal to allow only frequencies associated with transmitting the continuously variable signal/encoded signal. Such a high pass filter may be, for example, an iso-ripple finite impulse response filter with a stop band frequency of 9300Hz, a pass band frequency of 11750Hz, a stop band attenuation of 0.015848931925, a pass band ripple of 0.037399555859, and a density factor of 20.

In some implementations, demodulation includes normalized cross-correlation (equivalent to normalized convolution based on a time-reversed version of the sonic detection) of the received high-pass echo with the original sonic chirped signal/encoded signal. The maximum response is considered as the starting point/origin of the decoded signal. The demodulation may comprise, for example, an autocorrelation of a portion of the signal 0.227ms before the above-mentioned starting point to 2.27ms + the time length of the continuous frequency converted signal/encoded signal after the starting point marker. Post-processing the demodulated signal may include: the magnitude of its analysis signal (the complex spiral sequence consisting of the actual signal + its imaginary number (its 90 degree phase shifted version)) is calculated to further elucidate the envelope of the interceding varying echo. In one implementation, assuming a 44100Hz sampling rate, the first 100 samples of the above magnitude of the analysis signal are further multiplied by a piecewise linear weighting factor (which is 1 for the first 20 samples and increases linearly to 5 for samples 21-100) to compensate for the sound attenuation due to travel distance. Other weighting factors may be used, such as 1 after the second order interval.

Fig. 6 depicts multiple reflections from different facets of a human face (three samples are shown using CTFM techniques). These echoes reveal the specific spatial face structure (rather than the spoof features). This is due to the different delays (and magnitudes) of the different acoustic paths detected by the demodulated acoustic probe echoes. In contrast, the counterfeit display shown in fig. 7 mainly causes a single large peak during demodulation. The challenge may arise from the low spatial resolution and high scatter of a typical human face due to the upper frequency limit of 20KHz imposed by the audio circuitry of some phones. Other challenges include variations caused by user behavior and background noise and motion/reflection artifacts induced by uncontrolled environments and overall lower SNR due to device audio circuit limitations, all of which can be addressed by the techniques described herein.

In some embodiments, the feature set of the above-mentioned classifier is a set of subsets selected for optimal classification performance using a random subset integration classification technique. The set of random subspace classifiers may be, for example, a summation-rule fused set of k-nearest neighbor classifiers or a summation-rule fused set of support vector machines operating based on decoding a set of feature vectors of the analysis signal. Appendices a and B provide classifiers and input spaces experimentally obtained using a random subspace set building method. Appendix a lists an example set of 80 feature vectors selected using a large training/testing dataset consisting of over 18,000 echoes (from real users and various fake screen recordings using random subspace sampling in conjunction with kNN integrated classifiers). The subspace is obtained based on the average cross-validation performance (measured via ROC curve analysis) of the different subspace configurations (i.e., input sample positions and sizes and number of participating classifiers). The column position of each digit shows the number of digital signal samples from the decoding start point of the continuous variable frequency signal/encoded signal transmission using a sampling rate of 44100 Hz. In another implementation, the subspace set is a set of support vector machine classifiers (where the gaussian kernel receives the set of 40 feature vectors of the decoded analytic signal listed in appendix B) and is selected as a subset of the 80 features in appendix a based on its Fisher (Fisher) discriminant ratio (from Fisher discriminant linear classification using a larger dataset). Furthermore, the column position of each digit shows the number of digital signal samples from the decoding start point of the continuous variable frequency signal/encoded signal transmission using a sampling rate of 44100 Hz.

In some implementations, the sonar classifier is trained to be subject-specific (and may be device-specific, as the following methods are suitable for combined user-device features) to accurately identify a representation of a particular face in echo space (rather than just a generic face versus a counterfeiter). A classifier can be trained to distinguish user acoustic signatures obtained during biometric enrollment from acoustic signatures of a representative population of imposters (rather than just subjects-specific counterfeiters). Another advantage of this method is: the features collected during enrollment also reflect the characteristics of the device used for enrollment, and thus, the classifier is tailored to the acoustic characteristics of the particular device. The resulting user (and device) specific acoustic pattern detector can be used as part of a more accurate user (and device) tuned counterfeit-prevention classifier, where this subject-specific classification is combined with the counterfeit detection classifier mentioned earlier. In some implementations, the user-specific acoustic profile detector itself can be used as soft biometric.

The above-described acoustic wave interrogation of eye/facial biometric targets may be enhanced by a facially photometric response to structured light imparted by a scene interrogation mobile device to improve counterfeit resistance. In some implementations, the structured light is in the form of coded intensity, coded color variation, coded spatial distribution, and/or coded phase variation of the light imparted by the device, e.g., via an embedded LCD or LED light source. The above-described encoding may be defined in terms of frequency patterns and particular maximum correlation sequences (e.g., barker or M-sequences). In other embodiments, the photometric profile of the user's face is pre-computed based on the user's general population profile versus the impersonator (user-agnostic photometric face authenticity).

Further, in some embodiments, the classifier learns, for user specificity, the 3D contour of the user's face detected by the user's photometric reflections when confirming the registration. These user-specific selections can be used together with or alone as soft biometric techniques that also introduce more subject specificity and therefore better accuracy into these anti-counterfeiting measures.

The systems and techniques described here can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a web browser through which a user can interact with an implementation of the systems and techniques described here), or any combination of such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include a local area network ("LAN"), a wide area network ("WAN"), and the internet.

The computing system may include a client and a server. The client and server are generally remote from each other and may interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. Several embodiments have been described herein. Nevertheless, it will be understood that various modifications may be made without departing from the spirit and scope of the invention.

Embodiments of the subject matter and the operations described in this specification can be implemented in each of the following: a digital electronic circuit; or computer software, firmware, or hardware, including the structures disclosed in this specification and their structural equivalents; or a combination of one or more of the foregoing. Embodiments of the subject matter described in this specification can be implemented as one or more computer programs, i.e., one or more modules of computer program instructions, encoded on computer storage media for execution by, or to control the operation of, data processing apparatus. Alternatively or in addition, the program instructions may be encoded in an artificially generated propagated signal, such as a machine-generated electrical, optical, or electromagnetic signal, that is generated to encode information for transmission to suitable receiver apparatus for execution by a data processing apparatus. The computer storage medium may be or be included in: a computer-readable storage device, a computer-readable storage substrate, a random or serial access memory array or device, or a combination of one or more thereof. Moreover, although a computer storage medium is not a propagated signal, a computer storage medium can be a source or destination of computer program instructions encoded in an artificially generated propagated signal. The computer storage medium may also be or be included in one or more separate physical components or media, such as multiple CDs, disks, or other storage devices.

The operations described in this specification may be implemented as operations performed by data processing apparatus on data stored on one or more computer-readable storage devices or received from other sources.

The term "data processing apparatus" encompasses all kinds of apparatus, devices, and machines for processing data, including for example programmable processors, computers, systems on a chip, or multiple ones or combinations of the foregoing. The apparatus can comprise special purpose logic circuitry, e.g., an FPGA (field programmable gate array) or an ASIC (application-specific integrated circuit). The apparatus can include, in addition to hardware, code that produces an execution environment for the computer program in question, e.g., code that constitutes processor firmware, a protocol stack, a database management system, an operating system, a cross-platform runtime environment, a virtual machine, or a combination of one or more of them. The apparatus and execution environment may implement a variety of different computing model infrastructures, such as web services, distributed computing, and grid computing infrastructures.

A computer program (also known as a program, software application, script, or program code) can be written in any form of programming language, including compiled or interpreted languages, declarations, or programming languages, and it can be deployed in any form, including as a stand-alone program or as a module, component, subroutine, object, or other unit suitable for use in a computing environment. A computer program may, but need not, correspond to a file in a file system. A program can be stored in a portion of a file that holds other programs or data (e.g., one or more scripts stored in a markup language resource), in a single file dedicated to the program in question, or in multiple coordinated files (e.g., files that store one or more modules, sub-programs, or portions of program code). A computer program can be deployed to be executed on one computer or on multiple computers that are located at one site or distributed across multiple sites and interconnected by a communication network.

Embodiments of the subject matter described in this specification can be implemented in a computing system that includes a back-end component (e.g., as a data server), or that includes a middleware component (e.g., an application server), or that includes a front-end component (e.g., a client computer having a graphical user interface or a Web browser through which a user can interact with an implementation of the subject matter described in this specification), or any combination of one or more such back-end, middleware, or front-end components. The components of the system can be interconnected by any form or medium of digital data communication, e.g., a communication network. Examples of communication networks include local area networks ("LANs") and wide area networks ("WANs"), the internet (e.g., the internet), and peer-to-peer networks (e.g., private peer-to-peer networks).

The computing system may include a client and a server. The client and server are generally remote from each other and may interact through a communication network. The relationship of client and server arises by virtue of computer programs running on the respective computers and having a client-server relationship to each other. In some embodiments, the server transmits data (e.g., HTML pages) to the client device (e.g., for displaying data to a user interacting with the client device and receiving user input from the user). Data generated at the client device (e.g., a result of the user interaction) may be received at the server from the client device.

A system of one or more computers may be configured to perform particular operations or actions by having software, firmware, hardware, or a combination thereof installed on the system to cause the system to perform the actions in operation. One or more computer programs may be configured to perform particular operations or actions by including instructions that, when executed by data processing apparatus, cause the apparatus to perform the actions.

Appendix A

Feature vector set 1:

the classifier 1: 7,9,14,15,18,20,24,27,35,37,40,45,55,58,60,64,65,70,80,81,98,100

A classifier 2: 6,12,13,23,26,36,44,47,50,52,58,59,63,64,67,76,77,85,86,87,89,92

A classifier 3: 10,21,22,25,31,32,34,37,38,46,49,62,72,73,80,82,83,84,86,90,93,95

The classifier 4: 1,2,5,8,15,17,20,22,23,28,29,30,41,42,51,56,61,78,83,94,96,99

The classifier 5: 3,4,12,16,28,30,32,37,39,43,45,54,57,60,63,66,76,78,84,87,88,97

The classifier 6: 4,11,13,19,27,31,39,44,47,48,49,53,58,69,71,74,75,91,93,94,99,100

The classifier 7: 1,2,4,6,8,9,11,13,26,33,36,41,50,51,54,67,68,69,73,79,85,90 classifier 8: 10,14,17,18,19,24,33,34,36,38,41,43,52,55,59,60,68,92,93,96,98,100

The classifier 9: 8,17,22,23,24,25,27,30,35,40,46,56,57,62,63,70,71,72,79,88,89,99

The classifier 10: 3,5,9,11,29,42,58,61,62,63,66,71,75,77,80,81,82,90,94,95,96,97

The classifier 11: 1,3,6,14,16,21,25,32,34,35,38,39,48,49,53,55,66,70,75,78,80,97

The classifier 12: 7,10,15,20,24,31,33,36,40,43,44,50,52,65,67,74,76,85,91,96,98,99

The classifier 13: 9,16,19,20,26,41,46,47,48,49,51,68,69,73,77,82,83,84,87,89,91,95

The classifier 14: 2,6,8,11,18,23,26,28,29,35,38,42,45,57,61,62,64,72,88,93,96,100

The classifier 15: 6,12,19,20,21,37,42,43,53,54,58,59,61,70,73,74,77,78,79,83,86,93

The classifier 16: 3,5,6,7,18,28,30,35,39,47,51,54,55,56,65,72,82,85,86,89,90,92

The classifier 17: 1,2,7,31,33,34,36,39,46,56,59,64,65,66,67,69,75,79,81,86,87,92

The classifier 18: 9,12,13,14,15,16,17,21,27,41,44,45,49,52,57,74,76,77,81,88,91,95

The classifier 19: 5,17,26,29,30,45,46,48,63,65,67,68,71,72,74,75,76,88,92,96,97,98

The classifier 20: 1,9,13,19,21,22,25,27,37,47,50,51,53,60,61,66,70,78,79,84,95,98

The classifier 21: 1,2,11,12,16,18,29,32,40,42,48,50,57,62,71,73,83,84,87,90,94,100

The classifier 22: 3,4,7,10,15,23,25,26,31,32,33,41,43,52,56,58,76,82,88,91,92,99

The classifier 23: 3,4,5,7,8,12,13,22,23,33,34,38,40,44,54,60,62,63,64,89,94,97

The classifier 24: 10,14,15,16,20,21,27,30,42,45,47,53,68,69,72,74,79,80,81,84,89,97

The classifier 25: 10,11,24,28,29,32,43,44,52,64,65,66,70,71,75,77,85,87,90,94,95,100

The classifier 26: 5,8,16,29,33,36,37,40,52,53,54,55,56,57,59,60,69,73,82,86,91,97

The classifier 27: 2,5,6,12,17,22,25,34,35,39,46,48,55,59,61,64,73,75,78,79,90,99

The classifier 28: 2,4,9,18,24,27,31,34,36,37,42,43,44,66,78,80,81,83,85,93,96,98

The classifier 29: 4,5,8,13,14,17,18,19,22,26,28,38,45,46,49,51,58,60,61,72,89,93

The classifier 30: 20,21,27,29,31,38,40,41,50,54,58,64,65,67,68,69,81,82,92,94,98,100

The classifier 31: 3,4,7,9,11,19,25,26,28,30,33,53,54,55,57,65,67,71,76,80,83,86

The classifier 32: 2,8,10,12,14,21,23,32,35,36,47,49,56,62,69,70,77,82,84,91,95,99

The classifier 33: 1,14,17,18,24,28,34,39,48,51,53,59,63,67,74,85,87,88,89,95,97,100

The classifier 34: 3,10,11,13,15,23,28,31,35,43,46,50,51,55,60,63,68,71,77,85,88,98

The classifier 35: 1,6,19,38,41,42,44,45,46,47,56,57,58,61,70,73,79,81,84,90,92,100

The classifier 36: 16,24,25,30,32,35,37,40,48,50,52,56,64,65,66,68,72,75,76,80,87,94

The classifier 37: 6,7,8,39,48,54,55,57,59,63,67,74,78,79,82,86,87,89,91,93,96,99

The classifier 38: 4,13,15,20,23,29,31,39,40,41,42,43,47,49,50,53,59,72,73,75,82,84

The classifier 39: 7,15,16,17,20,22,25,27,49,51,60,62,65,76,77,80,86,91,92,93,95,97

The classifier 40: 1,11,14,22,24,26,28,30,35,36,38,41,49,52,56,61,78,83,90,92,96,99

The classifier 41: 2,9,12,18,21,30,33,34,44,47,49,61,69,71,74,76,77,81,84,85,93,94

The classifier 42: 3,8,12,19,22,26,31,32,42,48,50,51,64,66,67,70,79,83,87,91,98,100

The classifier 43: 4,6,10,21,23,34,37,44,45,46,52,55,57,58,59,60,63,68,75,78,79,94

The classifier 44: 2,5,7,11,13,23,24,39,41,43,57,62,70,72,74,77,80,84,88,94,97,100

The classifier 45: 3,5,10,14,16,21,32,33,34,39,45,64,70,73,74,83,87,88,89,90,96,99

The classifier 46: 10,15,18,19,20,25,26,29,40,52,55,58,62,68,78,81,85,86,89,93,96,98

The classifier 47: 1,8,10,15,27,30,32,33,36,38,48,53,54,66,67,69,70,71,85,95,97,98

The classifier 48: 2,3,5,7,9,14,22,28,43,47,50,51,53,54,65,71,73,76,81,82,83,92

The classifier 49: 4,6,16,17,25,31,35,41,42,45,50,51,55,62,68,77,79,80,83,86,87,95

The classifier 50: 1,5,9,12,13,17,18,21,24,28,37,38,39,40,61,63,69,70,73,75,82,91

The classifier 51: 2,3,11,15,19,26,27,29,32,34,36,37,44,48,56,59,62,66,69,71,90,93

The classifier 52: 8,12,14,20,22,35,47,52,54,57,60,63,64,65,69,72,78,81,84,88,91,96

The classifier 53: 4,8,17,29,31,42,43,46,48,53,56,58,60,61,62,65,66,68,75,76,86,94

The classifier 54: 7,13,15,16,19,20,21,24,25,33,36,49,70,80,86,89,90,94,95,98,99,100

The classifier 55: 2,6,7,10,13,18,19,22,23,29,30,40,57,58,65,66,67,72,73,88,92,99

The classifier 56: 1,6,9,11,18,20,27,30,38,44,59,74,75,78,82,84,85,86,89,91,92,97

The classifier 57: 5,12,26,33,37,38,39,42,45,46,49,52,54,56,60,66,71,73,77,90,91,94

The classifier 58: 6,8,16,26,28,34,35,41,44,45,46,49,50,63,68,72,79,83,87,96,97,99

The classifier 59: 1,4,17,23,27,29,30,31,40,43,50,51,61,64,67,68,74,76,81,93,95,100

The classifier 60: 2,3,11,13,23,24,25,35,47,49,52,56,57,59,71,74,75,79,81,88,96,98

The classifier 61: 1,7,9,12,16,17,22,32,34,36,37,46,53,72,76,77,82,85,87,88,92,95

The classifier 62: 3,4,11,14,17,18,22,24,25,31,50,51,54,55,57,63,78,80,87,89,92,97

The classifier 63: 5,6,20,21,24,32,33,36,37,38,39,43,44,46,47,60,64,66,67,69,83,90

The classifier 64: 7,10,14,15,19,27,28,35,40,45,48,53,54,59,61,78,82,84,85,96,98,100

The classifier 65: 1,8,12,15,27,29,34,40,41,44,47,52,53,55,58,59,66,70,80,89,93,97

The classifier 66: 2,5,6,9,10,14,26,28,31,42,43,56,60,62,63,74,80,81,90,95,98,99

The classifier 67: 11,13,18,20,21,27,37,38,41,42,45,51,61,62,70,76,77,82,83,88,91,93

The classifier 68: 2,3,9,11,12,15,19,25,27,32,36,40,49,68,69,71,72,75,85,90,98,99

The classifier 69: 13,16,17,18,26,29,30,32,36,39,41,46,48,55,58,61,64,65,67,79,86,100

The classifier 70: 1,4,23,25,30,33,34,44,45,54,60,73,77,79,84,86,89,93,94,96,98,100

The classifier 71: 2,4,10,13,20,22,28,34,37,38,44,45,50,58,67,69,73,81,87,91,92,94

The classifier 72: 8,9,11,18,19,31,47,48,54,56,57,58,62,64,68,72,74,75,84,88,97,99

The classifier 73: 3,4,5,21,24,33,35,40,42,43,53,55,59,63,64,65,78,83,84,85,95,97

The classifier 74: 7,9,16,17,20,29,32,36,39,47,51,52,53,58,59,70,71,76,80,89,93,94

The classifier 75: 5,10,12,14,19,23,26,33,41,44,56,57,59,60,62,69,72,75,91,92,95,99

The classifier 76: 22,25,31,35,38,42,43,46,50,65,66,67,78,81,83,85,86,87,89,90,97,99

The classifier 77: 1,2,3,8,10,11,37,49,54,61,63,66,68,69,71,75,76,77,78,79,83,100

The classifier 78: 1,5,8,14,20,23,24,26,28,32,35,39,46,48,52,53,55,73,80,84,88,93

The classifier 79: 3,6,7,14,16,21,29,30,37,47,52,55,60,61,62,70,74,79,81,82,92,100

The classifier 80: 7,15,22,25,31,34,35,36,41,44,45,48,49,51,53,56,72,73,77,80,81,82

Appendix B

Feature vector set 2:

A classifier 2: 1,2,5,8,15,17,20,22,23,28,29,30,41,42,51,56,61,78,83,94,96,99

A classifier 3: 3,4,12,16,28,30,32,37,39,43,45,54,57,60,63,66,76,78,84,87,88,97

The classifier 4: 4,11,13,19,27,31,39,44,47,48,49,53,58,69,71,74,75,91,93,94,99,100

The classifier 5: 1,2,4,6,8,9,11,13,26,33,36,41,50,51,54,67,68,69,73,79,85,90 classifier 6: 3,5,9,11,29,42,58,61,62,63,66,71,75,77,80,81,82,90,94,95,96,97

The classifier 7: 7,10,15,20,24,31,33,36,40,43,44,50,52,65,67,74,76,85,91,96,98,99

The classifier 8: 2,6,8,11,18,23,26,28,29,35,38,42,45,57,61,62,64,72,88,93,96,100

The classifier 9: 3,5,6,7,18,28,30,35,39,47,51,54,55,56,65,72,82,85,86,89,90,92

The classifier 10: 5,17,26,29,30,45,46,48,63,65,67,68,71,72,74,75,76,88,92,96,97,98

The classifier 11: 3,4,7,10,15,23,25,26,31,32,33,41,43,52,56,58,76,82,88,91,92,99

The classifier 12: 3,4,5,7,8,12,13,22,23,33,34,38,40,44,54,60,62,63,64,89,94,97

The classifier 13: 5,8,16,29,33,36,37,40,52,53,54,55,56,57,59,60,69,73,82,86,91,97

The classifier 14: 2,5,6,12,17,22,25,34,35,39,46,48,55,59,61,64,73,75,78,79,90,99

The classifier 15: 2,4,9,18,24,27,31,34,36,37,42,43,44,66,78,80,81,83,85,93,96,98

The classifier 16: 4,5,8,13,14,17,18,19,22,26,28,38,45,46,49,51,58,60,61,72,89,93

The classifier 17: 3,4,7,9,11,19,25,26,28,30,33,53,54,55,57,65,67,71,76,80,83,86

The classifier 18: 4,13,15,20,23,29,31,39,40,41,42,43,47,49,50,53,59,72,73,75,82,84

The classifier 19: 4,6,10,21,23,34,37,44,45,46,52,55,57,58,59,60,63,68,75,78,79,94

The classifier 20: 2,5,7,11,13,23,24,39,41,43,57,62,70,72,74,77,80,84,88,94,97,100

The classifier 21: 3,5,10,14,16,21,32,33,34,39,45,64,70,73,74,83,87,88,89,90,96,99

The classifier 22: 2,3,5,7,9,14,22,28,43,47,50,51,53,54,65,71,73,76,81,82,83,92

The classifier 23: 4,6,16,17,25,31,35,41,42,45,50,51,55,62,68,77,79,80,83,86,87,95

The classifier 24: 1,5,9,12,13,17,18,21,24,28,37,38,39,40,61,63,69,70,73,75,82,91

The classifier 25: 4,8,17,29,31,42,43,46,48,53,56,58,60,61,62,65,66,68,75,76,86,94

The classifier 26: 2,6,7,10,13,18,19,22,23,29,30,40,57,58,65,66,67,72,73,88,92,99

The classifier 27: 5,12,26,33,37,38,39,42,45,46,49,52,54,56,60,66,71,73,77,90,91,94

The classifier 28: 1,4,17,23,27,29,30,31,40,43,50,51,61,64,67,68,74,76,81,93,95,100

The classifier 29: 1,7,9,12,16,17,22,32,34,36,37,46,53,72,76,77,82,85,87,88,92,95

The classifier 30: 3,4,11,14,17,18,22,24,25,31,50,51,54,55,57,63,78,80,87,89,92,97

The classifier 31: 5,6,20,21,24,32,33,36,37,38,39,43,44,46,47,60,64,66,67,69,83,90

The classifier 32: 7,10,14,15,19,27,28,35,40,45,48,53,54,59,61,78,82,84,85,96,98,100

The classifier 33: 2,5,6,9,10,14,26,28,31,42,43,56,60,62,63,74,80,81,90,95,98,99

The classifier 34: 2,3,9,11,12,15,19,25,27,32,36,40,49,68,69,71,72,75,85,90,98,99

The classifier 35: 1,4,23,25,30,33,34,44,45,54,60,73,77,79,84,86,89,93,94,96,98,100

The classifier 36: 2,4,10,13,20,22,28,34,37,38,44,45,50,58,67,69,73,81,87,91,92,94

The classifier 37: 3,4,5,21,24,33,35,40,42,43,53,55,59,63,64,65,78,83,84,85,95,97

The classifier 38: 7,9,16,17,20,29,32,36,39,47,51,52,53,58,59,70,71,76,80,89,93,94

The classifier 39: 5,10,12,14,19,23,26,33,41,44,56,57,59,60,62,69,72,75,91,92,95,99

The classifier 40: 1,5,8,14,20,23,24,26,28,32,35,39,46,48,52,53,55,73,80,84,88,93

While this specification contains many specific implementation details, these should not be construed as limitations on the scope of any inventions or of what may be claimed, but rather as descriptions of features specific to particular embodiments of particular inventions. Certain features that are described in the context of separate embodiments of the specification can also be implemented in combination in a single embodiment. Conversely, various features that are described in the context of a single embodiment can also be implemented in multiple embodiments separately or in any suitable subcombination. Moreover, although features may be described above as acting in certain combinations and even initially claimed as such, one or more features from a claimed combination can in some cases be excised from the combination, and the claimed combination may be directed to a subcombination or variation of a subcombination.

Similarly, although operations are depicted in the drawings in a particular order, this should not be understood as requiring: such operations are performed in the particular order shown or in sequential order; or perform all illustrated operations to achieve a desired result. In some cases, multitasking parallel processing may be advantageous. Moreover, the separation of various system components in the embodiments described above should not be understood as requiring such separation in all embodiments, and it should be understood that the described program components and systems can generally be integrated together in a single software product or packaged into multiple software products.

Thus, particular embodiments of the subject matter have been described herein. Other embodiments are within the scope of the following claims. In some cases, the actions recited in the claims can be performed in a different order and still achieve desirable results. In addition, the processes depicted in the accompanying figures do not necessarily require the particular order shown, or sequential order, to achieve desirable results. In certain implementations, multitasking parallel processing may be advantageous.

Claims

1. A method of counterfeit detection, the method comprising:

capturing, by a user device having an image sensor, a plurality of images of a target;

determining whether the target includes at least one of facial structures and facial tissues based on at least one of: (i) a reflection of an audio signal emitted by the user device from the target and (ii) a photometric stereo effect across the plurality of images, wherein the photometric stereo effect is induced by a high frequency pattern and color of a screen of the user device encoded using illumination intensity, phase, and frequency;

identifying whether the target has a first pulse based on the image;

measuring a second pulse of the target by physical contact with the target;

determining whether the target is a counterfeiter based at least in part on (i) determining whether the target includes at least one of facial structures and facial tissue, (ii) identifying whether the target has a pulse, and (iii) measuring the second pulse; and

determining whether the second pulse correlates to the first pulse, wherein determining whether the target is a spoofer is further based at least in part on the correlation of the second pulse to the first pulse.

2. The method of claim 1, wherein the first pulse is identified using remote photoplethysmography.

3. The method of claim 1, wherein measuring the second pulse comprises: receiving information associated with the second pulse through physical contact with at least one of the user device, a different handheld device, and a wearable device.

4. The method of claim 3, wherein the information associated with the second pulse comprises a ballistocardiographic signal.

5. The method of claim 1, further comprising: determining whether the target includes a three-dimensional face structure based on the detected light reflections in the image.

6. The method of claim 1, further comprising:

transmitting the audio signal using an audio output component of the user device; and

receiving, using an audio input component of the user device, the reflection of the audio signal from the target.

7. The method of claim 6, wherein the audio signal comprises a short coded pulse tone source, a short term continuous frequency converted signal, or a CTFM tone source.

8. The method of claim 6, further comprising:

training a classifier to identify physical features of the target; and

providing information based on the reflection of the audio signal from the target as input to the classifier,

wherein the determining whether the target is a spoofer is further based at least in part on an output of the classifier received in response to the provided input.

9. The method of claim 6, further comprising: randomizing one or more characteristics of the audio signal.

10. The method of claim 1, wherein the user device is a mobile device comprising a smartphone, a tablet computer, or a laptop computer.

11. A system for counterfeit detection, the system comprising:

at least one memory for storing computer-executable instructions; and

at least one processor for executing the instructions stored on the at least one memory, wherein execution of the instructions programs the at least one processor to perform operations comprising:

identifying whether the target has a first pulse based on the image;

measuring a second pulse of the target by physical contact with the target;

12. The system of claim 11, wherein the first pulse is identified using remote photoplethysmography.

13. The system of claim 11, wherein measuring the second pulse comprises: receiving information associated with the second pulse through physical contact with at least one of the user device, a different handheld device, and a wearable device.

14. The system of claim 13, wherein the information associated with the second pulse comprises a ballistocardiographic signal.

15. The system of claim 11, wherein the operations further comprise: determining whether the target includes a three-dimensional face structure based on the detected light reflections in the image.

16. The system of claim 11, wherein the operations further comprise:

17. The system of claim 16, wherein the audio signal comprises a short-coded pulse source, a short-term continuous-conversion signal, or a CTFM source.

18. The system of claim 16, wherein the operations further comprise:

training a classifier to identify physical features of the target; and

19. The system of claim 16, wherein the operations further comprise: randomizing one or more characteristics of the audio signal.

20. The system of claim 11, wherein the user device is a mobile device comprising a smartphone, a tablet computer, or a laptop computer.

21. A method of counterfeit detection, the method comprising:

transmitting an audio signal using an audio output component of the user device, wherein one or more characteristics of the audio signal are randomized;

receiving, using an audio input component of the user device, a reflection of the audio signal from the target;

determining whether the target includes at least one of facial structures and facial tissues based on at least one of: (i) a reflection of an audio signal emitted by the user device from the target and (ii) a photometric stereo effect across the plurality of images;

identifying whether the target has a first pulse based on the image;

measuring a second pulse of the target by physical contact with the target;

22. A system for counterfeit detection, the system comprising:

at least one memory for storing computer-executable instructions; and

identifying whether the target has a first pulse based on the image;

measuring a second pulse of the target by physical contact with the target;