IEEE Transactions on Affective Computing

Archived papers: 329
Identifying Emotions from Non-Contact Gaits Information Based on Microsoft Kinects
Baobin Li, Changye Zhu, Shun Li, Tingshao Zhu
Keywords: Feature extraction; Legged locomotion; Emotion recognition; Time-frequency analysis; Discrete Fourier transforms; Support vector machines; Three-dimensional displays; Emotion; gait; joints; Microsoft Kinect; time-frequency analysis; discrete Fourier transform
Abstract: This paper discusses automatic emotion recognition from gait information, a problem that has been widely investigated in human-machine interaction, psychology, psychiatry, behavioral science, and related fields. The gait data are collected without contact using Microsoft Kinect and contain the 3-dimensional coordinates of 25 joints per person. Because these joint coordinates vary over time, the discrete Fourier transform and statistical methods are used to extract time-frequency features related to neutral, happy, and angry emotion, which are then used to build a classification model for identifying these three emotions. Experimental results show that the model works well and that time-frequency features are effective in characterizing and recognizing emotions from this non-contact gait data. In particular, an optimization algorithm further improves recognition accuracy by about 13.7 percent on average.
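The feature pipeline the abstract outlines (per-joint time series, discrete Fourier transform, statistical descriptors, then a classifier) can be illustrated with a minimal sketch. The array shapes, choice of summary statistics, and SVM settings below are assumptions for illustration, not the authors' implementation.

```python
import numpy as np
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def gait_features(joints):
    """joints: (T, 25, 3) array of Kinect joint coordinates over T frames.
    Returns a flat vector of time- and frequency-domain statistics."""
    T = joints.shape[0]
    series = joints.reshape(T, -1)                   # (T, 75): one series per coordinate
    spectrum = np.abs(np.fft.rfft(series, axis=0))   # magnitude of the DFT of each series
    feats = [
        series.mean(axis=0), series.std(axis=0),     # time-domain statistics
        spectrum.max(axis=0),                        # dominant spectral magnitude
        spectrum.argmax(axis=0) / T,                 # normalised dominant frequency bin
    ]
    return np.concatenate(feats)

# Hypothetical usage with random stand-in data: 20 recordings of 120 frames each.
rng = np.random.default_rng(0)
walks = rng.normal(size=(20, 120, 25, 3))
labels = rng.integers(0, 3, size=20)                 # 0 = neutral, 1 = happy, 2 = angry
X = np.stack([gait_features(w) for w in walks])
clf = make_pipeline(StandardScaler(), SVC(kernel="rbf")).fit(X, labels)
```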
Automated Depression Diagnosis Based on Deep Networks to Encode Facial Appearance and Dynamics
Yu Zhu, Yuanyuan Shang, Zhuhong Shao, Guodong Guo
Keywords: Face recognition; Feature extraction; Histograms; Optical imaging; Databases; Optical computing; Automated depression diagnosis; nonverbal behavior; deep convolutional neural networks; flow dynamics
Abstract: As a severe psychiatric disorder, depression is a state of low mood and aversion to activity that prevents a person from functioning normally in both work and daily life. Automated mental health assessment has received increasing attention in recent years. In this paper, we study the problem of automatic diagnosis of depression. We propose a new approach for predicting Beck Depression Inventory II (BDI-II) values from video data based on deep networks. The proposed framework is designed in a two-stream manner, aiming at capturing both facial appearance and dynamics. Further, we employ joint tuning layers that can implicitly integrate the appearance and dynamic information. Experiments are conducted on two depression databases, AVEC2013 and AVEC2014. The experimental results show that our proposed approach significantly improves depression prediction performance compared to other visual-based approaches.
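A minimal PyTorch sketch of a two-stream regressor of the kind described (one stream for facial appearance, one for flow dynamics, joined before a BDI-II regression head). The layer widths, input sizes, and the simple concatenation-based joint layer are assumptions; this is not the authors' architecture.

```python
import torch
import torch.nn as nn

class Stream(nn.Module):
    """A small convolutional encoder; in_ch is 3 for RGB appearance, 2 for optical flow."""
    def __init__(self, in_ch):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(in_ch, 16, 3, stride=2, padding=1), nn.ReLU(),
            nn.Conv2d(16, 32, 3, stride=2, padding=1), nn.ReLU(),
            nn.AdaptiveAvgPool2d(1), nn.Flatten(),
        )

    def forward(self, x):
        return self.net(x)

class TwoStreamBDI(nn.Module):
    """Appearance and dynamics streams fused by a joint layer that regresses a BDI-II score."""
    def __init__(self):
        super().__init__()
        self.appearance, self.dynamics = Stream(3), Stream(2)
        self.joint = nn.Sequential(nn.Linear(64, 32), nn.ReLU(), nn.Linear(32, 1))

    def forward(self, rgb, flow):
        fused = torch.cat([self.appearance(rgb), self.dynamics(flow)], dim=1)
        return self.joint(fused)

# Hypothetical usage with a batch of 4 face crops and their optical-flow fields.
model = TwoStreamBDI()
score = model(torch.randn(4, 3, 64, 64), torch.randn(4, 2, 64, 64))  # (4, 1) predicted BDI-II values
```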
Towards Reading Hidden Emotions: A Comparative Study of Spontaneous Micro-Expression Spotting and Recognition Methods
Xiaobai Li, Xiaopeng Hong, Antti Moilanen, Xiaohua Huang, Tomas Pfister, Guoying Zhao, Matti Pietikäinen
Keywords: Videos; Cameras; Training; Face recognition; Emotion recognition; Micro-expression; facial expression recognition; affective computing; LBP; HOG
Abstract: Micro-expressions (MEs) are rapid, involuntary facial expressions which reveal emotions that people do not intend to show. Studying MEs is valuable as recognizing them has many important applications, particularly in forensic science and psychotherapy. However, analyzing spontaneous MEs is very challenging due to their short duration and low intensity. Automatic ME analysis includes two tasks: ME spotting and ME recognition. For ME spotting, previous studies have focused on posed rather than spontaneous videos. For ME recognition, the performance of previous studies is low. To address these challenges, we make the following contributions: (i) We propose the first method for spotting spontaneous MEs in long videos (by exploiting feature difference contrast). This method is training free and works on arbitrary unseen videos. (ii) We present an advanced ME recognition framework, which outperforms previous work by a large margin on two challenging spontaneous ME databases (SMIC and CASMEII). (iii) We propose the first automatic ME analysis system (MESR), which can spot and recognize MEs from spontaneous video data. Finally, we show our method outperforms humans in the ME recognition task by a large margin, and achieves comparable performance to humans at the very challenging task of spotting and then recognizing spontaneous MEs.
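The spotting idea in contribution (i) is to contrast each frame's appearance features against frames a fixed interval away, so that short-lived, low-intensity changes produce local peaks. The sketch below uses uniform-LBP histograms, a window size k, and Euclidean distance as assumed choices; it is not the authors' exact feature-difference-contrast formulation.

```python
import numpy as np
from skimage.feature import local_binary_pattern

def lbp_hist(gray, P=8, R=1):
    """Uniform-LBP histogram of a grayscale face frame."""
    codes = local_binary_pattern(gray, P, R, method="uniform")
    hist, _ = np.histogram(codes, bins=P + 2, range=(0, P + 2), density=True)
    return hist

def spot_scores(frames, k=6):
    """Per-frame feature-difference score: distance between the current frame's
    features and the mean features of the frames k steps before and after.
    Rapid, short-lived changes (candidate micro-expressions) give local peaks."""
    feats = np.stack([lbp_hist(f) for f in frames])
    scores = np.zeros(len(frames))
    for i in range(k, len(frames) - k):
        tail = (feats[i - k] + feats[i + k]) / 2.0
        scores[i] = np.linalg.norm(feats[i] - tail)
    return scores  # threshold the peaks of this curve to spot candidate ME intervals

# Hypothetical usage on 40 random 64x64 grayscale frames.
frames = np.random.default_rng(0).uniform(0, 255, size=(40, 64, 64)).astype(np.uint8)
scores = spot_scores(frames)
```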
Real-Time Movie-Induced Discrete Emotion Recognition from EEG Signals
Yong-Jin Liu, Minjing Yu, Guozhen Zhao, Jinjing Song, Yan Ge, Yuanchun Shi
Keywords: Emotion recognition; Real-time systems; Electroencephalography; Brain models; Films; Support vector machines; Affective computing; EEG; emotion recognition; movie
Abstract: Recognition of a human's continuous emotional states in real time plays an important role in machine emotional intelligence and human-machine interaction. Existing real-time emotion recognition systems use stimuli with low ecological validity (e.g., picture, sound) to elicit emotions and to recognise only valence and arousal. To overcome these limitations, in this paper, we construct a standardised database of 16 emotional film clips that were selected from over one thousand film excerpts. Based on emotional categories that are induced by these film clips, we propose a real-time movie-induced emotion recognition system for identifying an individual's emotional states through the analysis of brain waves. Thirty participants took part in this study and watched 16 standardised film clips that characterise real-life emotional experiences and target seven discrete emotions and neutrality. Our system uses a 2-s window and a 50 percent overlap between two consecutive windows to segment the EEG signals. Emotional states, including not only the valence and arousal dimensions but also similar discrete emotions in the valence-arousal coordinate space, are predicted in each window. Our real-time system achieves an overall accuracy of 92.26 percent in recognising high-arousal and valenced emotions from neutrality and 86.63 percent in recognising positive from negative emotions. Moreover, our system classifies three positive emotions (joy, amusement, tenderness) with an average of 86.43 percent accuracy and four negative emotions (anger, disgust, fear, sadness) with an average of 65.09 percent accuracy. These results demonstrate the advantage over the existing state-of-the-art real-time emotion recognition systems from EEG signals in terms of classification accuracy and the ability to recognise similar discrete emotions that are close in the valence-arousal coordinate space.
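The segmentation scheme is stated concretely (2-s windows with 50 percent overlap, one prediction per window), so the windowing step is easy to sketch; the sampling rate, channel count, and whatever classifier follows are assumptions, not details from the paper.

```python
import numpy as np

def segment_eeg(eeg, fs, win_sec=2.0, overlap=0.5):
    """Slice a (channels, samples) EEG recording into windows of win_sec seconds
    with the given fractional overlap; returns (n_windows, channels, win_samples)."""
    win = int(win_sec * fs)
    step = int(win * (1.0 - overlap))
    starts = range(0, eeg.shape[1] - win + 1, step)
    return np.stack([eeg[:, s:s + win] for s in starts])

# Hypothetical usage: a 60-second, 32-channel recording sampled at 128 Hz.
eeg = np.random.randn(32, 60 * 128)
windows = segment_eeg(eeg, fs=128)   # shape (59, 32, 256): one emotion prediction per window
```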
Predicting the Probability Density Function of Music Emotion Using Emotion Space Mapping
Yu-Hao Chin, Jia-Ching Wang, Ju-Chiang Wang, Yi-Hsuan Yang
Keywords: Probability density function; Feature extraction; Training; Computational modeling; Predictive models; Probability distribution; Kernel; Emotion in music; valence-arousal space; emotion recognition from audio; predictive model and algorithm
Abstract: Computationally modeling the affective content of music has been intensively studied in recent years because of its wide applications in music retrieval and recommendation. Although significant progress has been made, this task remains challenging due to the difficulty in properly characterizing the emotion of a music piece. Music emotion perceived by people is subjective by nature and thus complicates the process of collecting the emotion annotations as well as developing the predictive model. Instead of assuming people can reach a consensus on the emotion of music, in this work we propose a novel machine learning approach that characterizes the music emotion as a probability distribution in the valence-arousal (VA) emotion space, not only tackling the subjectivity but also precisely describing the emotions of a music piece. Specifically, we represent the emotion of a music piece as a probability density function (PDF) in the VA space via kernel density estimation from human annotations. To associate emotion with the audio features extracted from music pieces, we learn the combination coefficients by optimizing some objective functions of audio features, and then predict the emotion of an unseen piece by linearly combining the PDFs of the training pieces with the coefficients. Several algorithms for learning the coefficients are studied. Evaluations on the NTUMIR and MediaEval 2013 datasets validate the effectiveness of the proposed methods in predicting the probability distributions of emotion from audio features. We also demonstrate how to use the proposed approach in emotion-based music retrieval.
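A minimal sketch of the representation step: turning per-listener valence-arousal annotations of a piece into a PDF via kernel density estimation, and forming a new piece's PDF as a linear combination of the training PDFs. The bandwidth, evaluation grid, and the uniform placeholder weights (which the paper instead learns from audio features) are assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Each training piece: a (2, n_annotators) array of valence-arousal annotations in [-1, 1].
rng = np.random.default_rng(0)
train_annotations = [rng.normal(loc=m, scale=0.2, size=(2, 30)) for m in (-0.5, 0.0, 0.5)]
train_kdes = [gaussian_kde(a) for a in train_annotations]

# Evaluate each training PDF on a grid over the VA plane.
grid = np.mgrid[-1:1:50j, -1:1:50j].reshape(2, -1)   # (2, 2500) grid points
train_pdfs = np.stack([k(grid) for k in train_kdes])  # (3, 2500)

# For an unseen piece, the model would map its audio features to combination
# coefficients; uniform weights stand in for that learned mapping here.
weights = np.full(len(train_pdfs), 1.0 / len(train_pdfs))
predicted_pdf = weights @ train_pdfs                  # linear combination of training PDFs
```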
Predicting Personalized Image Emotion Perceptions in Social Networks
Sicheng Zhao, Hongxun Yao, Yue Gao, Guiguang Ding, Tat-Seng Chua
Keywords: Feature extraction; Social network services; Visualization; Bridges; Context; Meteorology; Clouds; Personalized image emotion; social context; temporal evolution; location influence; hypergraph learning
Abstract: Images can convey rich semantics and induce various emotions in viewers. Most existing work on affective image analysis has focused on predicting the dominant emotions for the majority of viewers. However, such a dominant emotion is often insufficient in real-world applications, as the emotions induced by an image are highly subjective and differ across viewers. In this paper, we propose to predict the personalized emotion perceptions of images for each individual viewer. Different types of factors that may affect personalized image emotion perceptions, including visual content, social context, temporal evolution, and location influence, are jointly investigated. Rolling multi-task hypergraph learning (RMTHG) is presented to consistently combine these factors, and a learning algorithm is designed for automatic optimization. For evaluation, we set up a large-scale image emotion dataset from Flickr, named Image-Emotion-Social-Net, covering both dimensional and categorical emotion representations, with over 1 million images and about 8,000 users. Experiments conducted on this dataset demonstrate that the proposed method achieves significant performance gains on personalized emotion classification compared to several state-of-the-art approaches.
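RMTHG itself is not reproduced here, but the hypergraph-learning machinery it builds on can be sketched: vertices connected by hyperedges (e.g., shared visual cluster, same user, same location), with labels propagated through the normalized hypergraph operator. The incidence matrix, hyperedge weights, and the propagation parameter alpha below are illustrative assumptions following the standard transductive hypergraph formulation, not the rolling multi-task variant of the paper.

```python
import numpy as np

def hypergraph_propagate(H, y, w=None, alpha=0.9):
    """Transductive label propagation on a hypergraph.
    H: (n_vertices, n_edges) binary incidence matrix; y: (n_vertices, n_classes)
    with zero rows for unlabeled vertices; w: optional per-hyperedge weights."""
    n, m = H.shape
    w = np.ones(m) if w is None else w
    Dv = (H * w).sum(axis=1)                     # weighted vertex degrees
    De = H.sum(axis=0)                           # hyperedge degrees
    Dv_inv_sqrt = np.diag(1.0 / np.sqrt(Dv))
    Theta = Dv_inv_sqrt @ H @ np.diag(w / De) @ H.T @ Dv_inv_sqrt
    # Closed-form solution of the regularized propagation objective.
    return np.linalg.solve(np.eye(n) - alpha * Theta, (1 - alpha) * y)

# Toy usage: 5 vertices, 2 hyperedges, one labeled vertex per emotion class.
H = np.array([[1, 0], [1, 0], [1, 1], [0, 1], [0, 1]], float)
y = np.zeros((5, 2)); y[0, 0] = 1; y[4, 1] = 1
scores = hypergraph_propagate(H, y)              # argmax over columns gives the predicted class
```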
On the Interrelation Between Listener Characteristics and the Perception of Emotions in Classical Orchestra Music
Markus Schedl, Emilia Gómez, Erika S. Trent, Marko Tkalčič, Hamid Eghbal-Zadeh, Agustín Martorell
Keywords: Music; Feature extraction; Multiple signal classification; Emotion recognition; Instruments; Complexity theory; Mood; Emotion perception in music; classical music; audio analysis; personality; user study; agreement and correlation in music perception
Abstract: This study deals with the strong relationship between emotions and music, investigating three main research questions: (RQ1) Are there differences in human music perception (e.g., emotions, tempo, instrumentation, and complexity) according to musical education, experience, demographics, and personality traits? (RQ2) Do certain perceived music characteristics correlate (e.g., tension and sadness), irrespective of a particular listener's background or personality? (RQ3) Does human perception of music characteristics, such as emotions and tempo, correlate with descriptors extracted from music audio signals? To investigate these questions, we conducted two user studies focusing on different groups of subjects. We used web-based surveys to collect information about demographics, listening habits, musical education, personality, and perceptual ratings with respect to perceived emotions, tempo, complexity, and instrumentation for 15 segments of Beethoven's 3rd symphony, "Eroica". Our experiments showed that all three research questions can be affirmed, at least in part. We found strong support for RQ2 and RQ3, while RQ1 could be confirmed only for some perceptual aspects and user groups.
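RQ3 amounts to correlating per-segment listener ratings with signal-level descriptors for the same 15 excerpts. A minimal sketch of that analysis step, with made-up rating and descriptor vectors standing in for the survey data and the extracted audio features:

```python
import numpy as np
from scipy.stats import spearmanr

# Hypothetical per-segment values for the 15 "Eroica" excerpts: a tempo descriptor
# extracted from audio, and the mean perceived-tempo rating for the same segment.
rng = np.random.default_rng(1)
tempo_descriptor = rng.uniform(60, 180, size=15)
perceived_tempo = tempo_descriptor + rng.normal(0, 15, size=15)   # ratings track the signal, noisily

rho, p = spearmanr(perceived_tempo, tempo_descriptor)
print(f"Spearman rho = {rho:.2f} (p = {p:.3f})")   # a significant rho would support RQ3
```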
Multimodal Stress Detection from Multiple Assessments
Jonathan Aigrain, Michel Spodenkiewicz, Séverine Dubuiss, Marcin Detyniecki, David Cohen, Mohamed Chetouani
Keywords: Stress; Physiology; Feature extraction; Electrocardiography; Heart rate variability; Observers; Stress; assessment; behaviour; physiology; multimodal; classification
Abstract: Stress is a complex phenomenon that impacts the body and the mind at several levels. It has been studied for more than a century from different perspectives, resulting in different definitions and different ways to assess the presence of stress. This paper introduces a methodology for analyzing multimodal stress detection results by taking into account the variety of stress assessments. As a first step, we collected video, depth, and physiological data from 25 subjects in a stressful situation: a socially evaluated mental arithmetic test. As a second step, we acquired three different assessments of stress: self-assessment, assessments from external observers, and an assessment from a physiology expert. Finally, we extract 101 behavioural and physiological features and evaluate their predictive power for the three collected assessments using a classification task. Using multimodal features, we obtain average F1 scores of up to 0.85. By investigating the composition of the best selected feature subsets and the individual feature classification performances, we show that several features provide valuable information for the classification of the three assessments: features related to body movement, blood volume pulse, and heart rate. From a methodological point of view, we argue that a multiple-assessment approach provides more robust results.
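The evaluation protocol (one multimodal feature matrix classified against each of the three stress assessments and reported as F1) can be sketched roughly as below. The feature matrix, labels, classifier, and cross-validation setup are placeholders, not the paper's pipeline.

```python
import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
X = rng.normal(size=(25, 101))                 # 25 subjects x 101 behavioural + physiological features
base = np.array([0] * 13 + [1] * 12)           # roughly balanced stressed / non-stressed labels
assessments = {name: rng.permutation(base)     # one (placeholder) label vector per assessment
               for name in ("self-report", "external observers", "physiology expert")}

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
for name, y in assessments.items():
    f1 = cross_val_score(clf, X, y, cv=5, scoring="f1").mean()
    print(f"{name}: mean F1 = {f1:.2f}")
```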
Multimodal Depression Detection: Fusion Analysis of Paralinguistic, Head Pose and Eye Gaze Behaviors
Sharifa Alghowinem, Roland Goecke, Michael Wagner, Julien Epps, Matthew Hyett, Gordon Parker, Michael Breakspear
Keywords: Feature extraction; Speech; Australia; Sensors; Magnetic heads; Mood; Depression detection; multimodal fusion; speaking behaviour; eye activity; head pose
Abstract: An estimated 350 million people worldwide are affected by depression. Using affective sensing technology, our long-term goal is to develop an objective multimodal system that augments clinical opinion during the diagnosis and monitoring of clinical depression. This paper steps towards developing a classification system-oriented approach, where feature selection, classification, and fusion-based experiments are conducted to infer which types of behaviour (verbal and nonverbal) and behaviour combinations can best discriminate between depression and non-depression. Using statistical features extracted from speaking behaviour, eye activity, and head pose, we characterise the behaviour associated with major depression and examine the classification performance of the individual modalities and of their fusion. Using a real-world, clinically validated dataset of 30 severely depressed patients and 30 healthy control subjects, a Support Vector Machine is used for classification with several feature selection techniques. Given the statistical nature of the extracted features, feature selection based on T-tests performed better than other methods. Individual modality classification results were considerably higher than chance level (83 percent for speech, 73 percent for eye, and 63 percent for head). Fusing all modalities shows a remarkable improvement compared to unimodal systems, which demonstrates the complementary nature of the modalities. Among the different fusion approaches used here, feature fusion performed best, with up to 88 percent average accuracy. We believe this is due to the compatible nature of the extracted statistical features.
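A rough sketch of the per-modality pipeline and the feature-fusion variant the abstract describes: t-test-based feature ranking followed by an SVM, with fusion done by concatenating the selected features from all modalities. The feature counts, the number of features kept, and the random stand-in data are illustrative assumptions.

```python
import numpy as np
from scipy.stats import ttest_ind
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

def ttest_select(X, y, k=10):
    """Rank features by the absolute two-sample t statistic (depressed vs. control)
    and return the indices of the top k features."""
    t, _ = ttest_ind(X[y == 1], X[y == 0], axis=0)
    return np.argsort(-np.abs(t))[:k]

rng = np.random.default_rng(0)
y = rng.permutation([1] * 30 + [0] * 30)              # 30 depressed, 30 controls (placeholder labels)
modalities = {"speech": rng.normal(size=(60, 40)),    # per-modality statistical feature matrices
              "eye": rng.normal(size=(60, 30)),
              "head": rng.normal(size=(60, 20))}

clf = make_pipeline(StandardScaler(), SVC(kernel="linear"))
selected = {}
for name, X in modalities.items():
    selected[name] = X[:, ttest_select(X, y)]
    print(f"{name}: accuracy = {cross_val_score(clf, selected[name], y, cv=5).mean():.2f}")

# Feature fusion: concatenate the selected features of all modalities into one vector per subject.
fused = np.hstack(list(selected.values()))
print("fused:", cross_val_score(clf, fused, y, cv=5).mean())
```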
Leveraging the Bayesian Filtering Paradigm for Vision-Based Facial Affective State Estimation
Meshia Cédric Oveneke, Isabel Gonzalez, Valentin Enescu, Dongmei Jiang, Hichem Sahli
Keywords: State estimation; Psychology; Bayes methods; Affective computing; Facial muscles; Kalman filters; Gaussian processes; Facial expressions; affective state estimation; probabilistic reasoning; multiple instance regression; Hausdorff distance; sparse Gaussian processes; variational inference; regularized least-squares; Kalman filtering
Abstract: Estimating a person's affective state from facial information is an essential capability for social interaction. Automating this capability has therefore increasingly driven multidisciplinary research over the past decades. At the heart of this issue are very challenging signal processing and artificial intelligence problems driven by the inherent complexity of human affect. We therefore propose a principled framework for designing automated systems capable of continuously estimating the human affective state from an incoming stream of images. First, we model human affect as a dynamical system and define the affective state in terms of valence, arousal, and their higher-order derivatives. We then pose the affective state estimation problem as a Bayesian filtering problem and provide a solution based on Kalman filtering (KF) for probabilistic reasoning over time, combined with multiple instance sparse Gaussian processes (MI-SGP) for inferring affect-related measurements from image sequences. We quantitatively and qualitatively evaluate our proposed framework on the AVEC 2012 and AVEC 2014 benchmark datasets and obtain state-of-the-art results using the baseline features as input to our MI-SGP-KF model. We therefore believe that leveraging the Bayesian filtering paradigm can pave the way for further enhancing the design of automated systems for affective state estimation.
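The state-space idea (valence and arousal plus their derivatives as the latent affective state, with noisy frame-wise measurements filtered over time) can be sketched with a constant-velocity Kalman filter. The transition and noise parameters below are assumptions, and the paper's MI-SGP measurement model is replaced by a generic noisy observation.

```python
import numpy as np

dt = 1.0 / 25.0                              # frame period for an assumed 25 fps image stream
# State: [valence, arousal, d(valence)/dt, d(arousal)/dt]
F = np.eye(4); F[0, 2] = F[1, 3] = dt        # constant-velocity transition model
H = np.eye(2, 4)                             # only valence and arousal are observed
Q = 1e-3 * np.eye(4)                         # process noise covariance (assumed)
R = 5e-2 * np.eye(2)                         # measurement noise covariance (assumed)

x, P = np.zeros(4), np.eye(4)                # initial state and covariance

def kalman_step(x, P, z):
    """One predict/update cycle given a noisy (valence, arousal) measurement z."""
    x, P = F @ x, F @ P @ F.T + Q                            # predict
    S = H @ P @ H.T + R
    K = P @ H.T @ np.linalg.inv(S)                           # Kalman gain
    x = x + K @ (z - H @ x)                                  # update with the innovation
    P = (np.eye(4) - K @ H) @ P
    return x, P

# Hypothetical usage: measurements would come from a per-frame regressor (MI-SGP in the paper).
for z in np.random.default_rng(0).normal(0.0, 0.3, size=(100, 2)):
    x, P = kalman_step(x, P, z)
print("filtered valence/arousal:", x[:2])
```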