A Machine Learning Emotion Detection Platform To Support Affective Well Being
Michael Healy, Department of Computer Science, Cork Institute of Technology, Cork City, Ireland, Michael.healy2@mycit.ie
Ryan Donovan, Department of Computer Science, Cork Institute of Technology, Cork City, Ireland, Brendan.donovan@mycit.ie
Paul Walsh, Department of Computer Science, Cork Institute of Technology, Cork City, Ireland, paul.walsh@cit.ie
Huiru Zheng, Department of Computer Science, Ulster University, Antrim, Northern Ireland, h.zheng@ulster.ac.uk
Abstract- This paper describes a new emotion detection system based on a video feed in real-time. It demonstrates how a bespoke machine learning support vector machine (SVM) can be utilized to provide quick and reliable classification. The features used in the study are 68-point facial landmarks. In a lab setting, the application has been trained to detect six different emotions by monitoring changes in facial expressions. Its utility as a basis for evaluating the emotional condition of people in real-world situations using video and machine learning is discussed.

Keywords: Affective Computing; Machine Learning; Emotion Detection

I. INTRODUCTION

Emotions are an integral part of experiencing the world. Functioning emotions help us to perceive, think, and act correctly. The crucial role of emotions in general well-being becomes self-evident when they become dysfunctional. Consider the fact that one of the main aims of psychotherapy is to help people deal with difficult emotions [1]; that the likelihood of experiencing psychopathology has been linked to the tendency to experience extreme levels of emotions [2]; and that our ability to make seemingly innocuous, everyday choices, such as what clothes to wear, becomes impaired if the areas related to emotions in the brain are damaged [3]. This latter example of dysfunction is of particular concern as people increasingly live longer and as a result become more susceptible to neurodegenerative diseases, such as dementia.

The world's aging population is increasing. In 2017, 13% of the general population were aged 60 or over, and some estimates expect this percentage to double by 2050 [4]. As people get older, their likelihood of developing dementia sharply increases [5]. As dementia becomes more prevalent, the necessity to deal with its negative consequences becomes more pertinent. People with dementia (PwD) tend to suffer from a variety of affective problems, which can damage their cognition, relationships, and general well-being [6]. Examples of these affective problems are: difficulty in managing emotions, difficulty in articulating and expressing emotions, and increased levels of agitation and frustration. Furthermore, PwD are also at an increased risk of suffering from debilitating affective conditions such as depression, which could further damage their quality of life. In order to understand how to effectively manage these problems, researchers need to be able to accurately measure emotions [7].

To measure a phenomenon we first need to describe it [8]. Yet despite the fact that scholars from multiple perspectives as far back as Plato have sought to explain emotions, nobody has yet provided an agreed-upon definition. Even folk conceptions cannot be relied on, as people differentiate emotions based on their raw conscious experience of those emotions [9]. This method of subjective introspection is unsuited to objective scientific categorization. As one prominent affective neuroscientist wrote: "Unfortunately, one of the most significant things ever said about emotions may be that everyone knows what it is until they are asked to define it" [10]. Hence, the essence of emotions remains unclear.

This inability to define emotions has encouraged more systematic research. The component viewpoint, for example, aims to identify the physical patterns that coincide with or underlie the experience of an emotion, and what causes such responses [11]. These patterns can range from neuronal activity in the central nervous system, to facial expressions that coincide with emotions, to general changes in behavior (e.g., fist clenching and tone of voice rising while angry). Under this viewpoint, emotions are a set of sub-cortical goal-monitoring systems.

This fits neatly with the basic emotion approach, which separates emotions based on their ability to produce consistent yet particular forms of appraisals, action preparations, physiological patterns, and subjective experiences [12]. These emotions are considered basic in the sense that they are deeply rooted adaptations that helped our ancestors navigate their social environments. Currently, there are six proposed universal basic emotions: Joy, Fear, Disgust, Anger, Surprise, and Sadness. There also exists a 'higher' level of emotions that are more mediated by socio-cognitive factors (e.g. shame). One of the main characteristics that distinguish basic emotions from these latter forms is the presence of universal signals, such as facial expressions. Based on this view, facial expressions offer researchers a way to measure at least a subset of key emotions.

For this view to be correct, at least three strands of evidence are required. First, facial expressions signaling emotions must be universal. Second, facial expressions must be a valid or 'honest' signal of underlying emotions. Third, we must be able to reliably decipher emotional expressions. On the first point, although there is still some debate on this issue, there is independent research indicating that facial emotional expressions are consistent cross-culturally.
Much of the existing literature focuses on different feature extractions and methods of machine learning to achieve a high accuracy. There has been little or no development of software tools that utilise this research to assist people in the areas of health, well-being, and emotional functioning. Our aim is to develop a system that is capable of analysing emotional data in real-time from either a live stream from a camera or a pre-recorded source such as Youtube. Given the evolving and dynamic nature of machine learning, the system shall be designed in such a way that the models used could easily be replaced with new or updated models, without needing a code change to the system. Finally, we aimed to overcome the

To assist with the complex mathematics of creating a classification model we used a Support Vector Machine (SVM). The SVM has been used in supervised learning to assist with the generation of a model by using built-in algorithms to find the optimal hyperplane. The hyperplane is the largest separation of the classes from the training examples. New unseen examples are added to the same space and their class is predicted based on which side of the gap they fall. The SVM used is called LIBSVM [23]. LIBSVM is an integrated software library for support vector classification, regression, and distribution estimation. It also has support for multi-class classification, which enables the algorithm to compare the given data to multiple classes; this is useful when attempting to classify multiple emotional states. LIBSVM was originally written in C but now has support for a wide range of programming languages such as Java and Python. Details of the parameters used can be found in this paper [24].

The straight-line distance between two facial landmarks is computed as:

D12 = √((x2 − x1)² + (y2 − y1)²)    (1)

where (x1, y1) are the coordinates of the first landmark, (x2, y2) are the coordinates of the second landmark, and D12 is the straight-line distance between them.

2. Datasets Used: In order to create machine learning models for the Emotion Viewer application, training data was taken from the Cohn-Kanade (CK+) [25] and Multimedia Understanding Group (MUG) [26] databases. Both databases contain images of people in lab environments displaying Ekman's six basic emotions [12].
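The hyperplane decision rule described above can be illustrated with a toy linear decision function. This is only a sketch: the weights here are chosen by hand for illustration, whereas LIBSVM would learn them from the training examples.

```python
# Toy decision function for a linear SVM: the learned hyperplane is
# w . x + b = 0, and a new example's class is simply which side of the
# hyperplane it falls on.
def svm_predict(w, b, x):
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return +1 if score >= 0 else -1

# Hand-picked hyperplane x0 + x1 = 1 separating two clusters.
w, b = [1.0, 1.0], -1.0
print(svm_predict(w, b, [2.0, 2.0]))   # falls on the positive side -> 1
print(svm_predict(w, b, [0.0, 0.0]))   # falls on the negative side -> -1
```

Multi-class classification, as used for the six emotions, is typically reduced to several such binary decisions (e.g. one-vs-one voting, which is LIBSVM's strategy).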
Table 1 Datasets Under Study
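As a concrete illustration, equation (1) can be applied to every pair of the 68 facial landmarks to build a fixed-length feature vector per frame. The sketch below is a minimal pure-Python version, assuming the landmarks arrive as (x, y) tuples (e.g., from a 68-point shape predictor); the grid layout used here is dummy data, not real face data.

```python
import math
from itertools import combinations

def landmark_distances(landmarks):
    """Flatten 68 (x, y) landmarks into pairwise straight-line
    distances, following equation (1)."""
    return [
        math.hypot(x2 - x1, y2 - y1)
        for (x1, y1), (x2, y2) in combinations(landmarks, 2)
    ]

# Dummy example: 68 fake landmarks laid out on a grid.
landmarks = [(i % 10, i // 10) for i in range(68)]
features = landmark_distances(landmarks)
print(len(features))  # 68 * 67 / 2 = 2278 distances per frame
```

Distance vectors of this kind, computed per frame, are the inputs the SVM is trained on; the paper does not specify any additional normalization step.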
[Figure: testing accuracy of the evaluated models — values shown: 89.84%, 89.54%, 89.34%, 88.03%, 84.41%, 78.47%, 73.14%, 64.08%]
The "Track Face" option must be enabled before this feature can be used.

The next item on the GUI is the "Voting Count". This was implemented as a way to control the transitions between emotions. For example, while a person is talking, the detected emotion will change repeatedly as the expression of the face tends to change. To overcome this, a particular emotion must reach the user-defined number of consecutive votes before the analysis text changes to that emotion.

2. Run-Time Operations: For reference purposes, the operating system used was Windows 10 x64 and the hardware was as follows:
• Intel Core i7-8550U (Laptop)
• 8GB DDR4 RAM
• Nvidia GeForce MX150 (Mobile)
A Javascript file is executed which makes a web socket connection to the SEV server once the webpage is loaded. Immediately after the connection is established, the video begins streaming the image data to the server. The rate at which the images are streamed can be adjusted.

Testing on Youtube videos also allowed us to test the speed at which the system can classify frames from the video. The first examples, in Figures 8a and 8b, are taken from a speech by Donald Trump during a period when certain news outlets had made accusations of wrongdoing during his campaign. Throughout the speech Trump is visibly distressed, which the Emotion Viewer detects. During the clip the detector outputs an anger/disgust classification. This aligned closely with the overall narrative of the speech. The full analysis is available on Youtube at the following link:
https://www.youtube.com/watch?v=NaCe8bchs9I&index=2&list=PLwagddoyFHYZOCeOVoTnM2UFYKhyMuwEJ&t=0s
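The "Voting Count" mechanism described earlier can be sketched as a small debouncer over the per-frame predictions. This is a minimal illustration under our own naming, not the SEV source code.

```python
class EmotionVoter:
    """Only report a new emotion after it has been predicted for a
    user-defined number of consecutive frames (the "Voting Count")."""

    def __init__(self, votes_needed):
        self.votes_needed = votes_needed
        self.current = None      # emotion currently displayed
        self.candidate = None    # emotion accumulating votes
        self.votes = 0

    def update(self, prediction):
        if prediction == self.current:
            # Prediction matches what is displayed: discard any candidate.
            self.candidate, self.votes = None, 0
        elif prediction == self.candidate:
            self.votes += 1
            if self.votes >= self.votes_needed:
                self.current = prediction
                self.candidate, self.votes = None, 0
        else:
            # A new candidate emotion starts collecting votes.
            self.candidate, self.votes = prediction, 1
        return self.current

voter = EmotionVoter(votes_needed=3)
stream = ["joy", "anger", "anger", "anger", "joy", "anger"]
history = [voter.update(p) for p in stream]
print(history)  # [None, None, None, 'anger', 'anger', 'anger']
```

With a voting count of 3, the displayed emotion only flips to "anger" after three consecutive anger predictions, so momentary expression changes while a person is talking are ignored.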
The next analysis was one in a sombre setting. The video used was taken from former US president Barack Obama discussing issues around gun control and referencing school shootings, which he felt passionately about. The SEV detected two dominant emotions during this video, i.e. sadness and anger. This is consistent with Obama visibly shedding tears and stating in the speech that "every time I think of those kids it gets me mad" [0:54-0:58]. Some extracts from this analysis can be seen in Figures 9a and 9b. The full analysis video is available on Youtube:
https://www.youtube.com/watch?v=q1VyU02wgzs&list=PLwagddoyFHYZOCeOVoTnM2UFYKhyMuwEJ&index=2

The last analysis featured a montage of people smiling and laughing in a variety of different settings. Although most seem exaggerated, there is not much of a visual difference between a true expression of happiness and an exaggerated expression of happiness. A screenshot can be seen in Figure 10. The full analysis video is also available on Youtube using the following link:
https://www.youtube.com/watch?v=pvjK5LVvz2A&index=4&list=PLwagddoyFHYZOCeOVoTnM2UFYKhyMuwEJ
Figure 9b Screenshot from Obama analysis

V. CONCLUSION

Facial expressions are a gateway to detecting emotions. The ability to accurately make face-to-state classifications opens the potential for researchers to investigate emotions in new settings. In particular, this paper discussed the SEV platform, which uses machine learning support vector machines in the analysis of emotions on real-time video. The results suggest that the prototype has external validity, as the emotions detected were consistent with the emotions presented by the speakers. Using the laptop described in section 2 of Results & Experimentation, the application could classify frames at a speed of 8 frames per second. This could be improved by deploying the application to more powerful hardware, and we hope to achieve classification on 30fps video in the future through the use of mobile edge computing (MEC). The next step of this project will be to test and evaluate the system in real-time applications in a mobile ambulance use case in the SliceNet project. However, given the accuracy found in the results, the initial signs suggest that affective computing research is close to providing a powerful new tool to quickly and objectively determine fundamental aspects of human well-being.
VI. ACKNOWLEDGEMENT

The authors MH and PW are supported by the SliceNet project (Grant Number: 761913); HZ and RD are supported by the SenseCare project (Grant Number: 690862), funded by the European Commission Horizon 2020 Programme.

VII. REFERENCES

[1] J. C. N., M. J. V. and N. J. K. L. F. Campbell, "Recognition of psychotherapy effectiveness: The APA resolution," Psychotherapy, vol. 50, no. 1, p. 98, 2013.
[2] K. L. D. and J. Panksepp, The Emotional Foundations of Personality: A Neurobiological and Evolutionary Approach, WW Norton & Company, 2018.
[3] Y. L., P. V. and K. S. K. J. S. Lerner, "Emotion and decision making," Annual Review of Psychology, vol. 66, 2015.
[4] United Nations, "Ageing," 2017. [Online]. Available: http://www.un.org/en/sections/issues-depth/ageing/. [Accessed 29 August 2018].
[5] A. F. Jorm and D. Jolley, "The incidence of dementia: A meta-analysis," Neurology, vol. 51, no. 1, pp. 728-733, 1998.
[6] A. Burns and S. Iliffe, "Dementia," BMJ (Clinical Research), vol. 338, p. B75, 2009.
[7] M. Mulvenna, H. Zheng, R. Bond, P. McAlliser, H. Wang and R. Riestra, "Participatory design-based requirements elicitation involving people living with dementia towards a home-based platform to monitor emotional wellbeing," in 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Kansas City, 2017.
[8] R. B. Cattell, "The description of personality: Principles and findings in a factor analysis," The American Journal of Psychology, vol. 58, no. 1, pp. 69-90, 1945.
[9] J. LeDoux, "Rethinking the emotional brain," Neuron, vol. 73, no. 4, pp. 653-676, 2012.
[10] J. LeDoux, The Emotional Brain: The Mysterious Underpinnings of Emotional Life, Simon and Schuster, 1998.
[11] K. R. Scherer, "What are emotions? And how can they be measured?," Social Science Information, vol. 44, no. 4, pp. 695-729, 2005.
[12] P. Ekman, "Basic emotions," in Handbook of Cognition and Emotion, 1999, pp. 45-60.
[13] M. G. F. and H. S. H. D. Matsumoto, Nonverbal Communication: Science and Applications, Sage, 2013.
[14] R. W. Picard, Affective Computing, 1995.
[15] R. W. Picard, "Affective Computing for HCI," presented at HCI (1), pp. 829-833, 1999.
[16] A. K. and P. W. M. Healy, "Prototype proof of concept for a mobile agitation tracking system for use in elderly and dementia care use cases," in CERC, Cork, 2016.
[17] R. R. Bond, H. Zheng, H. Wang, M. D. Mulvenna, P. McAllister, K. Delaney, P. Walsh, A. Keary, R. Riestra and S. Guaylupo, "SenseCare: using affective computing to manage and care for the emotional wellbeing of older people," in eHealth 360°, vol. 181, K. Giokas, B. Laszlo and F. Hopfgartner, Eds., Springer, 2017, pp. 352-356.
[18] P. W. Michael Healy, "Detecting Demeanor for Healthcare with Machine Learning," in 2017 IEEE International Conference on Bioinformatics and Biomedicine (BIBM), Missouri, 2017.
[19] A. K., U. K. C. and A. C. A. Chakraborty, "Emotion Recognition From Facial Expressions and Its Control Using Fuzzy Logic," IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans, vol. 39, no. 4, pp. 726-743, 2009.
[20] P. and R. E. K. Michel, "Real time facial expression recognition in video using support vector machines," in Proceedings of the 5th International Conference on Multimodal Interfaces, 2003.
[21] R. N. and H. D. Bashir Mohammed Ghandi, "Real-Time System for Facial Emotion Detection," in 2010 IEEE Symposium on Industrial Electronics and Applications, Penang, 2010.
[22] T. M. Mitchell, Machine Learning, 1997.
[23] C.-C. Chang and C.-J. Lin, "LIBSVM -- A Library for Support Vector Machines," 23 July 2018. [Online]. Available: https://www.csie.ntu.edu.tw/~cjlin/libsvm/.
[24] A. K., P. W. Michael Healy, "Performing real-time emotion classification using an Intel RealSense camera, multiple facial expression databases and a Support Vector Machine," in CERC, Karlsruhe, 2017.
[25] J. F. C., T. K., J. S., Z. A. and I. M. P. Lucey, "The Extended Cohn-Kanade Dataset (CK+): A complete dataset for action unit and emotion-specified expression," in 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition - Workshops, San Francisco, CA, 2010.
[26] C. P. and A. D. N. Aifanti, "The MUG Facial Expression Database," in Proc. 11th Int. Workshop on Image Analysis for Multimedia Interactive Services (WIAMIS), Desenzano, 2010.
[27] R. M., I. C., J. K. T. and B. S. Gross, "Multi-PIE," Image and Vision Computing, vol. 28, no. 5, pp. 807-813, 2010.
[28] Y.-W. Chen and C.-J. Lin, "Combining SVMs with Various Feature Selection Strategies," in Feature Extraction. Studies in Fuzziness and Soft Computing, Heidelberg, Springer, 2006, pp. 315-324.
[29] D. E. King, "Dlib-ml: A Machine Learning Toolkit," Journal of Machine Learning Research, vol. 10, pp. 1755-1758, 2009.