Browse by author "Condesso, Sofia Fernandes"
- Emotion recognition in multimedia content
  Publication. Condesso, Sofia Fernandes; Ferreira, Artur Jorge; Leite, Nuno Miguel da Costa de Sousa

  Abstract: Emotion Recognition (ER) has become crucial in Human-Computer Interaction (HCI), with applications ranging from mental health support to adaptive learning. While many existing approaches rely on controlled environments or hardware-based sensors, this thesis explores non-contact unimodal methods—speech, facial expressions, and textual data—for a more naturalistic and practical analysis of emotions. First, we conduct a systematic evaluation of unimodal ER, comparing classical Machine Learning (ML) and Deep Learning (DL) approaches across multiple unimodal and multimodal datasets. For the speech modality (audio), we extract acoustic features using openSMILE (GeMAPS) and learn with models such as Support Vector Machines (SVM) and Random Forests. Results show that feature selection on acoustic features can improve Speech Emotion Recognition (SER). For Facial Emotion Recognition (FER), we experiment with DeepFace and a lightweight Convolutional Neural Network (CNN). For textual emotion recognition, we employ Word2Vec and GloVe with ML and DL models, and also experiment with zero-shot and few-shot learning using large language models. In multimodal experiments, fusion of the text and audio modalities improved accuracy to 0.45, confirming the benefit of combining complementary emotional cues. However, adding the visual modality led to a slight degradation in performance, attributed to suboptimal frame sampling. Overall, the results highlight the trade-offs between unimodal simplicity and multimodal robustness, demonstrating that lightweight, interpretable models can achieve practical performance for real-world emotion-aware applications.
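The abstract's SER finding — that feature selection on acoustic features can improve classification with classical models such as SVMs — can be sketched as follows. This is an illustrative example, not the thesis code: it uses synthetic data standing in for openSMILE GeMAPS functionals (88 per utterance) and scikit-learn's ANOVA-based feature selection; the feature counts, class set, and selector choice are all assumptions.

```python
# Illustrative sketch (not the author's implementation): feature selection
# followed by an SVM for speech emotion recognition, assuming acoustic
# features (e.g. the 88 GeMAPS functionals from openSMILE) have already
# been extracted into a matrix X of shape (n_utterances, n_features).
import numpy as np
from sklearn.feature_selection import SelectKBest, f_classif
from sklearn.model_selection import cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

rng = np.random.default_rng(0)
n, d = 200, 88                      # 200 utterances, 88 hypothetical GeMAPS features
X = rng.normal(size=(n, d))
y = rng.integers(0, 4, size=n)      # 4 emotion classes (e.g. angry/happy/sad/neutral)
X[:, :10] += 0.8 * y[:, None]      # synthetic signal: first 10 features are informative

# Keep only the k most discriminative features (ANOVA F-test), then classify.
clf = make_pipeline(
    StandardScaler(),               # SVMs are sensitive to feature scale
    SelectKBest(f_classif, k=20),   # feature selection step
    SVC(kernel="rbf"),
)
scores = cross_val_score(clf, X, y, cv=5)
print(f"mean CV accuracy: {scores.mean():.2f}")
```

In this sketch, `SelectKBest` discards the uninformative features before the SVM is fit, which is one plausible way the thesis-reported improvement from feature selection on acoustic features could be realized.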
