Synergistic Fusion of Clinical Interview EEG and Video for Depression Detection: A Cross-Modal Attention Approach

Authors

  • Janaswami Hymavathi, Department of CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India.
  • Chokka Anuradha, Department of CSE, Koneru Lakshmaiah Education Foundation, Vaddeswaram, AP, India.

DOI:

https://doi.org/10.56979/1002/2026/1222

Keywords:

Depression Detection, MODMA Dataset, Graph Convolutional Networks (GCN), Affective Computing, Electroencephalography (EEG)

Abstract

Objective quantification of Major Depressive Disorder (MDD) remains a substantial clinical challenge because traditional diagnostic interviews are inherently subjective. This paper presents a multimodal deep learning framework that integrates neurophysiological signals and behavioral cues for automated depression detection. Using the Multi-modal Open Dataset for Mental-disorder Analysis (MODMA), we analyze synchronized 128-channel EEG and video recordings obtained during professional clinical assessments. Our architecture employs a dual-stream approach: a Graph Convolutional Network (GCN) combined with a Long Short-Term Memory (LSTM) network captures the spatiotemporal dynamics of brain activity, and a 3D Convolutional Neural Network (3D-CNN) with a temporal attention mechanism extracts behavioral markers from facial expressions. A cross-modal attention module fuses the two modalities, allowing the model to learn the interdependencies between neural states and overt behavior. To support clinical generalizability and prevent data leakage, the framework was evaluated using a strict subject-independent 10-fold cross-validation scheme. Experimental results demonstrate state-of-the-art performance, with an accuracy of 92.1% and an F1-score of 92.5%. These findings suggest that the proposed multimodal integration offers an objective tool for mental health screening, improving diagnostic precision through the fusion of brain and behavioral biomarkers.
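
The subject-independent evaluation mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' code: the function name `subject_independent_folds`, the subject labels, and the segment counts are hypothetical, and the abstract does not state how subjects were assigned to folds. The sketch only shows the general idea that every recording segment from a given subject falls entirely in either the training set or the test set of a fold.

```python
import random
from collections import defaultdict


def subject_independent_folds(segment_subjects, n_folds=10, seed=0):
    """Yield (train_idx, test_idx) pairs in which no subject spans both sets.

    segment_subjects: one subject ID per data segment (hypothetical input).
    """
    # Group segment indices by the subject they belong to.
    by_subject = defaultdict(list)
    for idx, subject in enumerate(segment_subjects):
        by_subject[subject].append(idx)

    # Shuffle subjects (not segments), then deal them into n_folds groups.
    subjects = sorted(by_subject)
    random.Random(seed).shuffle(subjects)
    folds = [subjects[i::n_folds] for i in range(n_folds)]

    for held_out in folds:
        held = set(held_out)
        test_idx = [i for s in held_out for i in by_subject[s]]
        train_idx = [i for s in subjects if s not in held for i in by_subject[s]]
        yield train_idx, test_idx


# Assumed toy data: 20 subjects with 5 segments each.
segment_subjects = [f"subj{n:02d}" for n in range(20) for _ in range(5)]
splits = list(subject_independent_folds(segment_subjects))
```

Splitting at the subject level rather than the segment level is what prevents leakage: a segment-level shuffle would let the model see other segments from a test subject during training and inflate the reported scores.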

Published

2026-02-20

How to Cite

Janaswami Hymavathi, & Chokka Anuradha. (2026). Synergistic Fusion of Clinical Interview EEG and Video for Depression Detection: A Cross-Modal Attention Approach. Journal of Computing & Biomedical Informatics, 10(02). https://doi.org/10.56979/1002/2026/1222