|Title||Fusion of Valence and Arousal Annotations through Dynamic Subjective Ordinal Modelling|
|Publication Type||Conference Paper|
|Year of Publication||2017|
|Authors||Ruiz, A, Martinez, O, binefa, X|
|Conference Name||IEEE International Conference on Automatic Face and Gesture Recognition (FG)|
An essential issue when training and validating computer vision systems for affect analysis is how to obtain reliable ground-truth labels from a pool of subjective annotations. In this paper, we address this problem when labels are given in an ordinal scale and annotated items are structured as temporal sequences. This problem is of special importance in affective computing, where collected data is typically formed by videos of human interactions annotated according to the Valence and Arousal (V-A) dimensions. Moreover, recent works have shown that inter-observer agreement of V-A annotations can be considerably improved if these are given in a discrete ordinal scale. In this context, we propose a novel framework which explicitly introduces ordinal constraints to model the subjective perception of annotators. We also incorporate dynamic information to take into account temporal correlations between ground-truth labels. In our experiments over synthetic and real data with V-A annotations, we show that the proposed method outperforms alternative approaches which do not take into account either the ordinal structure of labels or their temporal correlation.