MULTIMODAL SIGNAL PROCESSING AND MULTIDIMENSIONAL INTEGRATION
This talk presents an overview of ideas, methods, and recent research
results in multimodal signal processing, with an emphasis on
audio-visual fusion and multimodal attention-based event detection. We
shall begin with a brief synopsis of important findings from
audio-visual perception. Then we shall outline multimodal signal
front-ends and computational models for sensor fusion in two
application fields: (i) audio-visual speech recognition and inversion,
and (ii) multimodal saliency-based video summarization. We envision the
multidimensionality of the underlying conceptual framework as a
space-time analogy: the horizontal plane is the space for
multi-sensory integration; along the vertical direction lies a
multilevel integration between low-level cues and high-level semantics;
and along the time direction we can track the evolving dynamics.
Professor Petros Maragos
National Technical University of Athens
School of Electrical and Computer Engineering
Athens, Greece