Description

Multimodal signal processing is an important research and development field that processes signals and combines information from a variety of modalities (speech, vision, language, text). Combining modalities in this way significantly enhances the understanding, modelling, and performance of human-computer interaction devices and of systems that support human-human communication. The overarching theme of this book is the application of signal processing and statistical machine learning techniques to problems arising in this multi-disciplinary field. It describes the capabilities and limitations of current technologies, and discusses the technical challenges that must be overcome to develop efficient and user-friendly multimodal interactive systems.

With contributions from leading experts in the field, this book should serve as a reference in multimodal signal processing for signal processing researchers, graduate students, R&D engineers, and computer engineers interested in this emerging field.

Key Features

  • Presents state-of-the-art methods for multimodal signal processing, analysis, and modelling
  • Contains numerous examples of systems that combine different modalities
  • Describes advanced applications in multimodal Human-Computer Interaction (HCI) as well as in computer-based analysis and modelling of multimodal human-human communication scenes

Readership

University and applied researchers in signal, acoustic, speech, image, and video processing; R&D engineers; computer engineers

Table of Contents

1. Introduction
Jean-Philippe Thiran, Ferran Marqués, and Hervé Bourlard

Part I -- Signal Processing, Modelling and Related Mathematical Tools

2. Statistical Machine Learning for HCI
Samy Bengio

2.1. Introduction
2.2. Introduction to Statistical Learning
2.3. Support Vector Machines for Binary Classification
2.4. Hidden Markov Models for Speech Recognition
2.5. Conclusion

3. Speech Processing
Thierry Dutoit and Stéphane Dupont

3.1. Introduction
3.2. Speech Recognition
3.3. Speaker Recognition
3.4. Text-to-Speech Synthesis
3.5. Conclusions

4. Natural Language and Dialogue Processing
Olivier Pietquin

4.1. Introduction
4.2. Natural Language Understanding
4.3. Natural Language Generation
4.4. Dialogue Processing
4.5. Conclusion

5. Image and Video Processing Tools for HCI
Montse Pardàs, Verónica Vilaplana and Cristian Canton-Ferrer

5.1. Introduction
5.2. Face Analysis
5.3. Hand-Gesture Analysis
5.4. Head Orientation Analysis and FoA Estimation
5.5. Body Gesture Analysis
5.6. Conclusions

6. Processing of Handwriting and Sketching Dynamics
Claus Vielhauer

6.1. Introduction
6.2. History of Handwriting Modality and the Acquisition of Online Handwriting Signals
6.3. Basics in Acquisition, Examples for Sensors
6.4. Analysis of Online Handwriting and Sketching Signals
6.5. Overview of Recognition Goals in HCI
6.6. Sketch Recognition for User Interface Design
6.7. Similarity Search in Digital Ink
6.8. Summary and Perspectives for Handwriting and Sketching in HCI

Part II -- Multimodal Signal Processing and Modelling

7. Basic Concepts of Multimodal Analysis
Mihai Gurban and Jea

Details

No. of pages: 352
Language: English
Copyright: © 2010
Published:
Imprint: Academic Press
Print ISBN: 9780123748256
Electronic ISBN: 9780080888699