Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations

PDF of full paper: Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations
Full-size poster image: Vulnerability of deep learning-based gait biometric recognition to adversarial perturbations

[This paper was presented on July 21, 2017 at The First International Workshop on The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security (CV-COPS 2017), in conjunction with the 2017 IEEE Conference on Computer Vision and Pattern Recognition.]

Vinay Uday Prabhu and John Whaley, UnifyID, San Francisco, CA 94107

Abstract

In this paper, we would like to draw attention towards the vulnerability of the motion sensor-based gait biometric in deep learning-based implicit authentication solutions, when attacked with adversarial perturbations, obtained via the simple fast-gradient sign method. We also showcase the improvement expected by incorporating these synthetically-generated adversarial samples into the training data.

1. Introduction

In recent times, password-entry-based user-authentication methods have increasingly drawn the ire of the security community [1], especially given their prevalence in mobile telephony. Researchers [1] recently showed that creating passwords on mobile devices not only takes significantly more time but is also more error-prone and frustrating and, worst of all, yields inherently weaker passwords. One promising solution that has emerged is implicit authentication [2] of users based on behavioral patterns that are sensed without the active participation of the user. In this domain, measurement of gait-cycle [3] signatures, mined using the on-phone Inertial Measurement Unit – MicroElectroMechanical Systems (IMU-MEMS) sensors, such as accelerometers and gyroscopes, has emerged as an extremely promising passive biometric [4, 5, 6]. As stated in [7, 5], gait patterns can not only be collected passively, at a distance, and unobtrusively (unlike iris, face, fingerprint, or palm veins), they are also extremely difficult to replicate due to their dynamic nature.

Inspired by the immense success that Deep Learning (DL) has enjoyed in recent times across disparate domains, such as speech recognition, visual object recognition, and object detection [8], researchers in the field of gait-based implicit authentication are increasingly embracing DL-based machine-learning solutions [4, 5, 6, 9], thus replacing the more traditional, hand-crafted-feature-engineering-driven shallow machine-learning approaches [10]. Besides circumventing the oft-contentious process of hand-engineering the features, these DL-based approaches are also more robust to noise [8], which bodes well for implicit-authentication solutions that will be deployed on mainstream commercial hardware. As shown in [4, 5], these classifiers have already attained extremely high accuracy (∼96%) when trained under the k-class supervised classification framework (where k is the number of individuals).

While these impressive numbers give the impression that gait-based deep implicit authentication is ripe for immediate commercial implementation, we would like to draw the attention of the community towards a crucial shortcoming. In 2014, Szegedy et al. [11] discovered that, quite like shallow machine-learning models, state-of-the-art deep neural networks were vulnerable to adversarial examples. These can be synthetically generated by strategically introducing small perturbations, so that the resulting adversarial input is only slightly different from correctly classified examples drawn from the data distribution, yet results in a potentially controlled misclassification. To make things worse, a wide variety of models with disparate architectures, trained on different subsets of the training data, have been found to misclassify the same adversarial example, uncovering the presence of fundamental blind spots in our DL frameworks.

After this discovery, several works have emerged ([12, 13]), addressing both means of defence against adversarial examples and novel attacks. Recently, the cleverhans software library [13] was released. It provides standardized reference implementations of adversarial example-construction techniques and adversarial training, thereby facilitating rapid development of machine-learning models that are robust to adversarial attacks, as well as providing standardized benchmarks of model performance in the adversarial setting explained above. In this paper, we focus on harnessing the simplest of all adversarial attack methods, i.e. the fast gradient sign method (FGSM), to attack the IDNet deep convolutional neural network (DCNN)-based gait classifier introduced in [4]. Our main contributions are as follows:

  1. This is, to the best of our knowledge, the first paper that introduces deep adversarial attacks into this non-computer vision setting, specifically, the gait-driven implicit-authentication domain. In doing so, we hope to draw the attention of the community towards this crucial issue in the hope that further publications will incorporate adversarial training as a default part of their training pipelines.
  2. One of the enduring images that is widely circulated in adversarial training literature is that of the panda+nematode = gibbon adversarial-attack example on GoogLeNet in [14], which was instrumental in vividly showcasing the potency of the blind spot. In this paper, we do the same with accelerometric data to illustrate how a small and seemingly imperceptible perturbation to the original signal can cause the DCNN to make a completely wrong inference with high probability.
  3. We empirically characterize the degradation of classification accuracy when subjected to an FGSM attack, and also highlight the improvement in the same upon introducing adversarial training.
  4. Lastly, we have open-sourced the code here.

Figure 1. Variation in the probability of correct classification (37 classes) with and without adversarial training for varying ε.
Figure 2. The true accelerometer amplitude signal and its adversarial counterpart for ε = 0.4.

2. Methodology and Results

In this paper, we focus on the DCNN-based IDNet [4] framework, which uses low-pass-filtered tri-axial accelerometer and gyroscope readings (plus the sensor-specific magnitude signals) to first extract a gait template of dimension 8 × 200, which is then used to train a DCNN in a supervised-classification setting. In the original paper, the model identified users in real time by using the DCNN as a deep-feature extractor and further training an outlier detector (a one-class support vector machine, SVM), whose individual gait-wise outputs were finally combined in a framework based on Wald’s probability ratio test. Here, we focus on the trained IDNet-DCNN and characterize its performance in the adversarial-training regime. To this end, we harness the FGSM introduced in [14], where the adversarial example, $\tilde{x}$, for a given input sample, $x$, is generated by $\tilde{x} = x + \epsilon \, \mathrm{sign}(\nabla_x J(\theta, x))$, where $\theta$ represents the parameter vector of the DCNN, $J(\theta, x)$ is the cost function used to train the DCNN, and $\nabla_x$ denotes the gradient with respect to $x$.
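The FGSM update above can be sketched in a few lines of NumPy. This is only an illustration under loud assumptions: a randomly initialized linear softmax classifier stands in for the IDNet DCNN (which is not reproduced here), the input is a random vector with the flattened 8 × 200 template size, and the names `loss_and_input_grad` and `fgsm` are hypothetical. The true label `y` enters through the cross-entropy loss, which the text abbreviates as J(θ, x).

```python
import numpy as np

def softmax(z):
    """Numerically stable softmax over a 1-D score vector."""
    z = z - z.max()
    e = np.exp(z)
    return e / e.sum()

def loss_and_input_grad(W, b, x, y):
    """Cross-entropy loss J and its gradient with respect to the input x
    for a linear softmax model (a stand-in for the DCNN, not IDNet itself)."""
    p = softmax(W @ x + b)
    loss = -np.log(p[y])
    onehot = np.zeros_like(p)
    onehot[y] = 1.0
    grad_x = W.T @ (p - onehot)  # analytic dJ/dx for this toy model
    return loss, grad_x

def fgsm(W, b, x, y, eps):
    """x_adv = x + eps * sign(grad_x J): every coordinate moves by at most eps."""
    _, grad_x = loss_and_input_grad(W, b, x, y)
    return x + eps * np.sign(grad_x)

rng = np.random.default_rng(0)
k, d = 37, 8 * 200                       # 37 classes, flattened 8 x 200 template
W = rng.normal(scale=0.01, size=(k, d))  # assumed toy parameters (theta)
b = np.zeros(k)
x = rng.normal(size=d)                   # assumed toy input, not real gait data
y = int(np.argmax(W @ x + b))            # use the model's own prediction as the label

x_adv = fgsm(W, b, x, y, eps=0.4)
loss_clean, _ = loss_and_input_grad(W, b, x, y)
loss_adv, _ = loss_and_input_grad(W, b, x_adv, y)
```

For this toy model the loss is convex in the input, so the perturbed loss is never lower than the clean one; for a real DCNN the increase is only expected to first order.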

As seen, this method is parametrized by ε, which controls the magnitude of the inflicted perturbations. Fig. 2 showcases the true and adversarial gait-cycle signals for the accelerometer magnitude signal, given by $a_{mag}(t) = \sqrt{a_x^2(t) + a_y^2(t) + a_z^2(t)}$, for ε = 0.4. Fig. 1 captures the drop in the probability of correct classification (37 classes) with increasing ε. First, we see that in the absence of any adversarial example, we were able to get about 96% accuracy on a 37-class classification problem, which is very close to what is claimed in [4]. However, with even mild perturbations (ε = 0.4), we see a sharp decrease of nearly 40% in accuracy. Fig. 1 also captures the effect of including the synthetically generated adversarial examples in the training data. We see that, for ε = 0.4, we manage to achieve about 82% accuracy, which is a vast improvement of ∼25%.
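As a small aside, the accelerometer magnitude signal plotted in Fig. 2 is simply the per-sample Euclidean norm of the three axes. A minimal sketch, with made-up sample values and a hypothetical helper name `magnitude`:

```python
import numpy as np

def magnitude(ax, ay, az):
    """Per-sample magnitude sqrt(ax^2 + ay^2 + az^2) of a tri-axial signal."""
    ax, ay, az = (np.asarray(v, dtype=float) for v in (ax, ay, az))
    return np.sqrt(ax**2 + ay**2 + az**2)

# Made-up readings for illustration only, not data from the paper.
ax = [0.0, 3.0, 1.0]
ay = [0.0, 4.0, 2.0]
az = [9.81, 0.0, 2.0]
a_mag = magnitude(ax, ay, az)
```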

3. Future Work

This brief paper is part of an ongoing research endeavor. We are currently extending this work to other adversarial-attack approaches, such as the Jacobian-based Saliency-Map Approach (JSMA) and the Black-Box-Attack (BBA) approach [15]. We are also investigating the effect of these attacks within the deep-feature-extraction+SVM approach of [4], and we are comparing other architectures, such as [6] and [5].

References
[1] W. Melicher, D. Kurilova, S. M. Segreti, P. Kalvani, R. Shay, B. Ur, L. Bauer, N. Christin, L. F. Cranor, and M. L. Mazurek, “Usability and security of text passwords on mobile devices,” in Proceedings of the 2016 CHI Conference on Human Factors in Computing Systems, pp. 527–539, ACM, 2016.
[2] E. Shi, Y. Niu, M. Jakobsson, and R. Chow, “Implicit authentication through learning user behavior,” in International Conference on Information Security, pp. 99–113, Springer, 2010.
[3] J. Perry, J. R. Davids, et al., “Gait analysis: normal and pathological function,” Journal of Pediatric Orthopaedics, vol. 12, no. 6, p. 815, 1992.
[4] M. Gadaleta and M. Rossi, “Idnet: Smartphone-based gait recognition with convolutional neural networks,” arXiv preprint arXiv:1606.03238, 2016.
[5] Y. Zhao and S. Zhou, “Wearable device-based gait recognition using angle embedded gait dynamic images and a convolutional neural network,” Sensors, vol. 17, no. 3, p. 478, 2017.
[6] S. Yao, S. Hu, Y. Zhao, A. Zhang, and T. Abdelzaher, “Deepsense: A unified deep learning framework for time-series mobile sensing data processing,” arXiv preprint arXiv:1611.01942, 2016.
[7] S. Wang and J. Liu, Biometrics on mobile phone. INTECH Open Access Publisher, 2011.
[8] Y. LeCun, Y. Bengio, and G. Hinton, “Deep learning,” Nature, vol. 521, no. 7553, pp. 436–444, 2015.
[9] N. Neverova, C. Wolf, G. Lacey, L. Fridman, D. Chandra, B. Barbello, and G. Taylor, “Learning human identity from motion patterns,” IEEE Access, vol. 4, pp. 1810–1820, 2016.
[10] C. Nickel, C. Busch, S. Rangarajan, and M. Möbius, “Using hidden markov models for accelerometer-based biometric gait recognition,” in Signal Processing and its Applications (CSPA), 2011 IEEE 7th International Colloquium on, pp. 58–63, IEEE, 2011.
[11] C. Szegedy, W. Zaremba, I. Sutskever, J. Bruna, D. Erhan, I. Goodfellow, and R. Fergus, “Intriguing properties of neural networks,” arXiv preprint arXiv:1312.6199, 2013.
[12] C. Szegedy, W. Liu, Y. Jia, P. Sermanet, S. Reed, D. Anguelov, D. Erhan, V. Vanhoucke, and A. Rabinovich, “Going deeper with convolutions,” in Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, pp. 1–9, 2015.
[13] N. Papernot, I. Goodfellow, R. Sheatsley, R. Feinman, and P. McDaniel, “cleverhans v1.0.0: an adversarial machine learning library,” arXiv preprint arXiv:1610.00768, 2016.
[14] I. J. Goodfellow, J. Shlens, and C. Szegedy, “Explaining and harnessing adversarial examples,” arXiv preprint arXiv:1412.6572, 2014.
[15] N. Papernot, P. McDaniel, I. Goodfellow, S. Jha, Z. B. Celik, and A. Swami, “Practical black-box attacks against deep learning systems using adversarial examples,” arXiv preprint arXiv:1602.02697, 2016.

Smile in the face of adversity much? A print based spoofing attack

PDF of full paper: Smile in the face of adversity much? A print based spoofing attack
Full-size poster image: Smile in the face of adversity much? A print based spoofing attack

[This paper was presented on July 21, 2017 at The First International Workshop on The Bright and Dark Sides of Computer Vision: Challenges and Opportunities for Privacy and Security (CV-COPS 2017), in conjunction with the 2017 IEEE Conference on Computer Vision and Pattern Recognition.]

Vinay Uday Prabhu and John Whaley, UnifyID, San Francisco, CA 94107

Abstract

In this paper, we demonstrate a simple face spoof attack targeting the face recognition system of a widely available commercial smart-phone. The goal of this paper is not to proclaim a new spoof attack but rather to draw the attention of anti-spoofing researchers towards a very specific shortcoming shared by one-shot face recognition systems: enhanced vulnerability when a smiling reference image is used.

1. Introduction

One-shot face recognition (OSFR) or single sample per person (SSPP) face recognition is a well-studied research topic in computer vision (CV) [8]. Solutions such as Local Binary Pattern (LBP) based detectors [1], Deep Lambertian Networks (DLN) [9] and Deep Supervised Autoencoders (DSA) [4] have been proposed in recent times to make OSFR systems more robust to the changes in illumination, pose, facial expression and occlusion that they encounter when deployed in the wild. One very interesting application of face recognition that has gained traction lately is mobile device unlocking [6]. One of the highlights of Android 4.0 (Ice Cream Sandwich) was the Face Unlock screen-lock option that allowed users to unlock their devices with their faces. It is important to mention here that this option is always presented to the user with a cautionary note that typically reads: “Face recognition is less secure than pattern, PIN, or password.”

The reasoning behind this is that there exists a plethora of face spoof attacks, such as print attacks, malicious identical-twin attacks, sleeping-user attacks, replay attacks and 3D mask attacks. These attacks are all fairly successful against most commercial off-the-shelf face recognizers [7]. This ease of spoofing has also attracted the attention of CV researchers, leading to considerable effort in developing liveness-detection anti-spoofing frameworks such as Secure-face [6]. (See [3] for a survey.)

Recently, a large scale smart-phone manufacturer introduced a face recognition based phone unlocking feature. This announcement was promptly followed by media reports about users demonstrating several types of spoof attacks.

In this paper, we would like to explore a simple print attack on this smart-phone. The goal of this paper is not to proclaim a new spoof attack but rather to draw the attention of the anti-spoofing community towards a very specific shortcoming shared by face recognition systems that we uncovered in this investigation.

2. Methodology and Results

Figure 1. Example of two neutral expression faces that failed to spoof the smart-phone’s face recognition system.
Figure 2. Example of two faces that successfully spoofed the smart-phone’s face recognition system when the registering image was smiling.
The methodology we used entailed printing a low-quality image of the target user’s face on plain white US letter paper (8.5 by 11.0 inches) and then attempting to unlock the device by simply holding this printed paper in front of the camera. Given the poor quality of the printed images, we observed that this simple print attack was duly repulsed by the detector system as long as the face registered during the enrollment phase had a neutral expression. However, when we repeated the attack with an overtly smiling face registered at enrollment, we were able to break in successfully with high regularity.

In Figure 1, we see two examples of neutral expression faces that failed to spoof the smart-phone’s face recognition system when the registering image had a neutral facial expression. A video containing the failed spoofing attempt with a neutral facial expression can be viewed here.

In Figure 2, we see the same two subjects’ images that successfully spoofed the phone’s face recognition system when the registering (enrollment) image was overtly smiling. The face training demo videos are available here. The video of the successful spoof can be viewed here.

2.1. Motivation for the attack and discussion

It has been well known for a long time in the computer vision community that faces displaying expressions, especially smiles, result in stronger recall and discrimination power [10]. In fact, the authors in [2] termed this the happy-face advantage, and showcased the variation in detection performance for varying facial expressions. Through experimentation, we wanted to investigate the specific one-shot classification scenario in which the registering (enrollment) face had a strong smile, which led to the discovery of this attack. As for defense from this attack, there are two straightforward recommendations. The first recommendation would be to simply display a message goading the user to maintain a passport-type neutral facial expression. The second would entail having a smile detector such as [5] as a pre-filter that would only allow smile-free images as a reference image.
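The second recommendation could look roughly like the following enrollment gate. Everything here is hypothetical: `select_reference_images`, the `max_smile` threshold and the lookup-table "detector" are illustrative stand-ins, and a real system would plug in an actual smile classifier such as the one in [5].

```python
from typing import Callable, List, Sequence, TypeVar

T = TypeVar("T")

def select_reference_images(
    candidates: Sequence[T],
    smile_probability: Callable[[T], float],
    max_smile: float = 0.2,  # assumed threshold, would need tuning
) -> List[T]:
    """Keep only enrollment candidates whose smile score is at or below max_smile."""
    return [img for img in candidates if smile_probability(img) <= max_smile]

# Stand-in detector: a lookup table of made-up scores keyed by file name.
fake_scores = {"neutral.png": 0.05, "grin.png": 0.93}
accepted = select_reference_images(list(fake_scores), lambda name: fake_scores[name])
```

If no candidate passes the gate, the enrollment flow would fall back to the first recommendation and ask the user for a neutral expression.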

References
[1] T. Ahonen, A. Hadid, and M. Pietikainen. Face description with local binary patterns: Application to face recognition. IEEE Transactions on Pattern Analysis and Machine Intelligence, 28(12):2037–2041, 2006.
[2] W. Chen, K. Lander, and C. H. Liu. Matching faces with emotional expressions. Frontiers in Psychology, 2:206, 2011.
[3] J. Galbally, S. Marcel, and J. Fierrez. Biometric antispoofing methods: A survey in face recognition. IEEE Access, 2:1530–1552, 2014.
[4] S. Gao, Y. Zhang, K. Jia, J. Lu, and Y. Zhang. Single sample face recognition via learning deep supervised autoencoders. IEEE Transactions on Information Forensics and Security, 10(10):2108–2118, 2015.
[5] P. O. Glauner. Deep convolutional neural networks for smile recognition. arXiv preprint arXiv:1508.06535, 2015.
[6] K. Patel, H. Han, and A. K. Jain. Secure face unlock: Spoof detection on smartphones. IEEE Transactions on Information Forensics and Security, 11(10):2268–2283, 2016.
[7] D. F. Smith, A. Wiliem, and B. C. Lovell. Face recognition on consumer devices: Reflections on replay attacks. IEEE Transactions on Information Forensics and Security, 10(4):736–745, 2015.
[8] X. Tan, S. Chen, Z.-H. Zhou, and F. Zhang. Face recognition from a single image per person: A survey. Pattern Recognition, 39(9):1725–1745, 2006.
[9] Y. Tang, R. Salakhutdinov, and G. Hinton. Deep lambertian networks. arXiv preprint arXiv:1206.6445, 2012.
[10] Y. Yacoob and L. Davis. Smiling faces are better for face recognition. In Automatic Face and Gesture Recognition, 2002. Proceedings. Fifth IEEE International Conference on, pages 59–64. IEEE, 2002.

Credential Stuffing; How PRC almost hacked my Steam

Recently we’ve witnessed some pretty big password leaks. First 6.4m unsalted passwords leaked from LinkedIn, then 500m passwords leaked from Yahoo, a figure that today grew to 1 billion accounts. This is truly scary even if you haven’t been using your Yahoo account. To see why, let us go back a couple of months to when I almost fell victim to a credential stuffing attack from China.

First of all, “credential stuffing” is a fancy name for exploiting password reuse. All it takes is somebody with intermediate computer security knowledge looking up the password dumps from Yahoo or LinkedIn (widely available), then trying the exact same credentials on as many different sites as possible until there is a match. In my case, I logged into my Steam account and saw something like this:

Steam: what it looks like when you have been hacked

Unfortunately, Steam does not specify whether this was a credential stuffing attempt, but it happened only a week after the big LinkedIn leak. I may also have been reusing the same password for LinkedIn and Steam, so all the pieces fit. Steam was very helpful in telling me the following:

  1. Somebody had tried to access my account from the PRC.
  2. He had both my username and password.
  3. His attempt was blocked since I’ve never accessed Steam from the PRC.
  4. I needed to change my password to regain access to my account.

At that time I deeply appreciated all those otherwise annoying security features: Facebook asking me to identify my friends, Google sending me text messages, and now Steam using geolocation to see where my impersonator lives. I quickly updated my password on Steam and 5-6 other websites.

My new password was the same as the old one, with the last letter changed from a ‘d’ to an ‘e’, meaning this was the 5th time I had updated my Steam password for one reason or another. The rest of the password was pretty good in terms of entropy: caps, lower case, numbers, and symbols, randomly generated yet pronounceable, using pwgen, a great CLI tool for generating strong, memorable passwords.

pwgen producing secure, memorable passwords

But this is not great overall. Attackers need only realize how hard passwords are to remember, and that users therefore tend to append predictable components to their existing ones, such as an incrementing identifier. I’ve read posts about people using the same password everywhere and simply prefixing it with the site name. So if your main password is “d34db33f”, then for Amazon it will be “amazon_d34db33f”, for Chase “chase_d34db33f”, or something along those lines.

I believe I have a good understanding of the security concepts behind passwords, and I think I’m doing better in terms of passwords than 99% of the people out there, since my password is not “password” or “123456” (proof). On the other hand, here I found myself coming up with predictable password patterns. Then it came to me, the bigger issue exposed by credential stuffing attacks and password reuse:

Either we all do passwords right, or nobody does.

Either nobody gets hacked, or we may as well all be, as long as users can’t help but use the same passwords and predictable patterns over and over again.

So what does it mean for everyone to do passwords right? If you want to be really safe, you’ve got to be a bit paranoid and lean completely on the side of security versus convenience.

  1. A password should be completely unpredictable (should not include pet names, date of birth, middle names, children names, childhood heroes, favorite books, in fact, no English words at all).
  2. A password should have capital letters, lowercase letters, numbers, symbols and be at least 16 characters long (for 128-bit keys).
  3. A different such password for each website, changed every 3 months, with no logical correlations between them.
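To put a rough number on rule 2, here is a minimal sketch that draws a uniformly random 16-character password from the 94 printable ASCII symbols and computes its entropy. The helper names are hypothetical, and the sketch assumes every character is chosen independently and uniformly; it does not enforce that each character class actually appears, and it is not how pwgen works.

```python
import math
import secrets
import string

# 52 letters + 10 digits + 32 punctuation marks = 94 symbols
ALPHABET = string.ascii_letters + string.digits + string.punctuation

def random_password(length=16, alphabet=ALPHABET):
    """Draw each character independently and uniformly with a CSPRNG."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

def entropy_bits(length, alphabet_size):
    """Entropy of a uniformly random string: length * log2(alphabet size)."""
    return length * math.log2(alphabet_size)

password = random_password()
bits = entropy_bits(len(password), len(ALPHABET))

# Shortest length whose entropy reaches 128 bits over this alphabet
chars_for_128_bits = math.ceil(128 / math.log2(len(ALPHABET)))
```

Under these assumptions, 16 characters give a little under 105 bits, and reaching a full 128 bits takes 20 characters, so the 16-character figure is best read as 16 bytes of storage rather than 128 bits of entropy.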

It is indeed impossible to be truly secure using passwords. How about password managers then? Letting them handle the complexity of passwords. Not a bad idea on first thought. Just tie all passwords to the user’s machine. But then you get this:

The problem with password managers: you’re not your laptop

Password managers basically turn the problem of cyber security into a problem of the physical security of your devices. If I can get my hands on an open laptop, I can access pretty much any website, as long as cookies are enabled or a password manager has been used. And that’s pretty terrible.

In the end, there is no solution that takes care of every aspect of identity security today. So far it has been either what you know (password) or what you have (device), and now we’re finally moving into the age of what you are.

TechCrunch Disrupt 2016, we won runner-up in Disrupt Battlefield.

At UnifyID, we think of the human as the central point of identity management. Think about every bit of information that makes you, You. How large is your stride, do you walk fast or slow, how long are your arms, which floor is your house on, how fast do you drive to work? This is all information that we feed into our machine learning system as input. The output is binary. Either it is you, or it isn’t. Since we only require 1 bit of information at the time of authentication, we can log you in with one click.

One-click secure login with UnifyID

Our system works with existing password infrastructures. We generate a large, random password for every website you log in to, and secure it with You as the key. In fact, we don’t even need to know that password. Part of it stays with you and part of it lives on our servers. This way, even if your devices get stolen, even if we get hacked, you’re safe. There’s no single point of failure in the UnifyID system.
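The post does not say how the password is divided between the user and the servers. Purely as an illustration of the general idea, and not a description of UnifyID’s actual scheme, a textbook two-share XOR split looks like this: each share on its own is indistinguishable from random bytes, and both are needed to recover the secret. All names below are hypothetical.

```python
import secrets

def split_secret(secret: bytes):
    """Split a secret into two shares; either share alone reveals nothing about it."""
    share_a = secrets.token_bytes(len(secret))               # uniformly random pad
    share_b = bytes(s ^ a for s, a in zip(secret, share_a))  # secret XOR pad
    return share_a, share_b

def combine(share_a: bytes, share_b: bytes) -> bytes:
    """XOR the two shares back together to recover the original secret."""
    return bytes(a ^ b for a, b in zip(share_a, share_b))

site_password = secrets.token_bytes(32)                # large random per-site secret
device_share, server_share = split_secret(site_password)
recovered = combine(device_share, server_share)
```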

In addition, UnifyID works across devices. Your computer knows about your phone, and they share the same credentials. Remember that time when you left your laptop unattended for 5 minutes and your Facebook wall filled up with questionable posts? Not anymore. We can detect when you stand up and walk away. In fact, we can do that for every website: banks, e-shops, federal websites. Take your identity with you when you leave the room.

Here at UnifyID we take your security seriously. Passwords are an inconvenience and they will soon go the way of the floppy drive. Machine learning and implicit authentication can help you, and we know exactly how. Sign up for our private beta!