Predicting latent narrative mood using audio and physiologic data

Tuka Al Hanai, Mohammad Mahdi Ghassemi

Research output: Contribution to conference › Paper › peer-review


Inferring the latent emotive content of a narrative requires consideration of para-linguistic cues (e.g. pitch), linguistic content (e.g. vocabulary), and the physiological state of the narrator (e.g. heart rate). In this study we used a combination of auditory, text, and physiological signals to predict the mood (happy or sad) of 31 narrations from subjects engaged in personal story-telling. We extracted 386 audio and 222 physiological features (using the Samsung Simband) from the data. A subset of 4 audio, 1 text, and 5 physiological features was identified using Sequential Forward Selection (SFS) for inclusion in a Neural Network (NN). These features included subject movement, cardiovascular activity, energy in speech, probability of voicing, and linguistic sentiment (i.e. negative or positive). We explored the effects of introducing our selected features at various layers of the NN and found that the location of these features in the network topology had a significant impact on model performance. To ensure the real-time utility of the model, classification was performed over 5-second intervals. We evaluated our model's performance using leave-one-subject-out cross-validation and compared it to 20 baseline models and to a NN with all features included in the input layer.
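The abstract names two generic techniques, Sequential Forward Selection and leave-one-subject-out cross-validation, without giving implementation details. The following is a minimal sketch of how the two can be combined, not the authors' pipeline: the nearest-centroid classifier stands in for the Neural Network, the data are synthetic, and every function name is hypothetical. Note that scoring candidate features with the same folds used for final evaluation gives an optimistic estimate; the abstract does not say how the study handled this.

```python
import random
from statistics import mean


def leave_one_subject_out(subjects):
    """Yield (held_out, train_idx, test_idx), holding out one subject per fold."""
    for held_out in sorted(set(subjects)):
        train = [i for i, s in enumerate(subjects) if s != held_out]
        test = [i for i, s in enumerate(subjects) if s == held_out]
        yield held_out, train, test


def centroid_predict(X, y, train, test, feats):
    """Nearest-centroid classifier restricted to the feature columns in feats."""
    centroids = {}
    for label in set(y[i] for i in train):
        rows = [X[i] for i in train if y[i] == label]
        centroids[label] = [mean(r[f] for r in rows) for f in feats]
    preds = []
    for i in test:
        dist = {lab: sum((X[i][f] - c[k]) ** 2 for k, f in enumerate(feats))
                for lab, c in centroids.items()}
        preds.append(min(sorted(dist), key=dist.get))
    return preds


def loso_accuracy(X, y, subjects, feats):
    """Pooled accuracy over all leave-one-subject-out folds."""
    correct = total = 0
    for _, train, test in leave_one_subject_out(subjects):
        preds = centroid_predict(X, y, train, test, feats)
        correct += sum(p == y[i] for p, i in zip(preds, test))
        total += len(test)
    return correct / total


def sequential_forward_selection(X, y, subjects, max_features):
    """Greedily add the feature that most improves accuracy; stop when none does."""
    selected, best_score = [], 0.0
    remaining = list(range(len(X[0])))
    while remaining and len(selected) < max_features:
        scored = [(loso_accuracy(X, y, subjects, selected + [f]), f)
                  for f in remaining]
        # Highest score wins; ties go to the lowest feature index.
        score, feat = max(scored, key=lambda t: (t[0], -t[1]))
        if score <= best_score:
            break
        selected.append(feat)
        remaining.remove(feat)
        best_score = score
    return selected, best_score


# Synthetic stand-in data (an assumption, not the study's data):
# 6 subjects, 10 windows each; only feature 1 carries the label.
rng = random.Random(0)
subjects, y, X = [], [], []
for subj in range(6):
    for w in range(10):
        label = w % 2  # 0 = sad, 1 = happy
        subjects.append(subj)
        y.append(label)
        X.append([rng.gauss(0, 1), rng.gauss(4.0 * label, 0.5), rng.gauss(0, 1)])

selected, score = sequential_forward_selection(X, y, subjects, max_features=2)
print("selected features:", selected, "accuracy:", round(score, 3))
```

Splitting by subject rather than by window keeps all intervals from one narrator on the same side of each fold, so the score reflects performance on people the model has not seen.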

Original language: English (US)
Number of pages: 7
State: Published - 2017
Event: 31st AAAI Conference on Artificial Intelligence, AAAI 2017 - San Francisco, United States
Duration: Feb 4 2017 to Feb 10 2017


Other: 31st AAAI Conference on Artificial Intelligence, AAAI 2017
Country/Territory: United States
City: San Francisco

ASJC Scopus subject areas

  • Artificial Intelligence

