BAUM-2: A Multilingual Audio-Visual Affective Face Database


Date

2015

Publisher

Kluwer Academic Publishers

Green Open Access

Yes

Publicly Funded

No
Impulse
Average
Influence
Top 10%
Popularity
Top 10%

Abstract

Access to audio-visual databases that contain enough variety and are richly annotated is essential for assessing the performance of algorithms in affective computing applications that require emotion recognition from face and/or speech data. Most databases available today were recorded in tightly controlled environments, are mostly acted, and do not contain speech data. We first present a semi-automatic method that can extract audio-visual facial video clips from movies and TV programs in any language. The method is based on automatically detecting and tracking a face in a movie until the face is occluded or a scene cut occurs. We also created a video-based database, named BAUM-2, which consists of annotated audio-visual facial clips in several languages. The collected clips simulate real-world conditions by containing various head poses, illumination conditions, accessories, temporary occlusions, and subjects with a wide range of ages. The proposed semi-automatic affective clip extraction method can easily be used to extend the database with clips in other languages. We also created an image-based facial expression database, named BAUM-2i, from the peak frames of the video clips. Baseline image-based and video-based facial expression recognition results using state-of-the-art features and classifiers indicate that facial expression recognition under tough, close-to-natural conditions is quite challenging. © 2017 Elsevier B.V. All rights reserved.
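The clip-boundary rule described in the abstract (follow a face until it is lost or a scene cut occurs) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not name a face detector, tracker, or scene-cut detector, so the sketch assumes their per-frame outputs are already available as two boolean sequences. The function name `segment_clips` and the `min_len` parameter are hypothetical.

```python
from typing import List, Sequence, Tuple


def segment_clips(
    face_present: Sequence[bool],
    scene_cut: Sequence[bool],
    min_len: int = 1,
) -> List[Tuple[int, int]]:
    """Split a frame sequence into candidate facial clips.

    face_present[i] is True when a (hypothetical) tracker still sees the
    face in frame i; scene_cut[i] is True when frame i starts a new shot.
    Returns (start, end) frame ranges with `end` exclusive. Clips shorter
    than `min_len` frames are discarded.
    """
    if len(face_present) != len(scene_cut):
        raise ValueError("per-frame inputs must have the same length")

    clips: List[Tuple[int, int]] = []
    start = None  # index of the first frame of the currently open clip

    for i, (face, cut) in enumerate(zip(face_present, scene_cut)):
        # A scene cut at frame i closes any clip that was open before it.
        if cut and start is not None:
            if i - start >= min_len:
                clips.append((start, i))
            start = None

        if face:
            # Open a new clip if none is in progress.
            if start is None:
                start = i
        elif start is not None:
            # Face lost (e.g. occluded): close the open clip.
            if i - start >= min_len:
                clips.append((start, i))
            start = None

    # Close a clip that runs to the end of the video.
    if start is not None and len(face_present) - start >= min_len:
        clips.append((start, len(face_present)))
    return clips
```

In a semi-automatic pipeline of this kind, the resulting frame ranges would then be passed to human annotators, who decide which clips are affective and label them; that manual step is outside the scope of this sketch.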

Keywords

Affective Database, Audio-Visual Affective Database, Facial Expression Recognition, Computational Linguistics, Database Systems, Human Computer Interaction, Speech Recognition, Video Cameras, Visual Languages, Audio-Visual, Audio-Visual Database, Controlled Environment, Emotion Recognition, Illumination Conditions, Performance of Algorithm, Semiautomatic Methods, Face Recognition

Fields of Science

0202 electrical engineering, electronic engineering, information engineering, 02 engineering and technology

WoS Q

N/A

Scopus Q

Q1
OpenCitations Citation Count
37

Source

Multimedia Tools and Applications

Volume

74

Issue

18

Start Page

7429

End Page

7459
PlumX Metrics
Citations

CrossRef : 18

Scopus : 37

Captures

Mendeley Readers : 58

SCOPUS™ Citations

42

checked on Mar 04, 2026

Page Views

1

checked on Mar 04, 2026

Downloads

11

checked on Mar 04, 2026

OpenAlex FWCI
1.8666
