List of contributors
Edited by Steve Renals, University of Edinburgh; Hervé Bourlard; Jean Carletta, University of Edinburgh; Andrei Popescu-Belis
Book: Multimodal Signal Processing
Published online: 05 July 2012
Print publication: 07 June 2012, pp xii-xii
- Chapter

Contents
Print publication: 07 June 2012, pp v-xi
- Chapter

Index
Print publication: 07 June 2012, pp 271-273
- Chapter

Multimodal Signal Processing

Human Interactions in Meetings
Edited by Steve Renals, Hervé Bourlard, Jean Carletta, Andrei Popescu-Belis
Published online: 05 July 2012
Print publication: 07 June 2012
- Book
Bringing together experts in multimodal signal processing, this book provides a detailed introduction to the area, with a focus on the analysis, recognition and interpretation of human communication. The technology described has powerful applications. For instance, automatic analysis of the outputs of cameras and microphones in a meeting can make sense of what is happening – who spoke, what they said, whether there was an active discussion and who was dominant in it. These analyses are layered to move from basic interpretations of the signals to richer semantic information. The book covers the necessary analyses in a tutorial manner, going from basic ideas to recent research results. It includes chapters on advanced speech processing and computer vision technologies, language understanding, interaction modeling and abstraction, as well as meeting support technology. This guide connects fundamental research with a wide range of prototype applications to support and analyze group interactions in meetings.

1 - Multimodal signal processing for meetings: an introduction
- By Andrei Popescu-Belis, Idiap Research Institute, Martigny, Switzerland, Jean Carletta, University of Edinburgh, UK
Print publication: 07 June 2012, pp 1-10
- Chapter
Summary

This book is an introduction to multimodal signal processing. In it, we use the goal of building applications that can understand meetings as a way to focus and motivate the processing we describe. Multimodal signal processing takes the outputs of capture devices running at the same time – primarily cameras and microphones, but also electronic whiteboards and pens – and automatically analyzes them to make sense of what is happening in the space being recorded. For instance, these analyses might indicate who spoke, what was said, whether there was an active discussion, and who was dominant in it. These analyses require the capture of multimodal data using a range of signals, followed by low-level automatic annotation of those signals, with further annotation layered on top until information that relates to user requirements is extracted.
Multimodal signal processing can be done in real time, that is, fast enough to build applications that influence the group while they are together, or offline, often (though not always) at higher quality, for later review of what went on. It can also be done for groups that are all together in one space, typically an instrumented meeting room, or for groups that are in different spaces but use technology such as videoconferencing to communicate. The book thus introduces automatic approaches to capturing, processing, and ultimately understanding human interaction in meetings, and describes the state of the art for all technologies involved.

2 - Data collection
- By Jean Carletta, University of Edinburgh, UK, Mike Lincoln, University of Edinburgh, UK
Print publication: 07 June 2012, pp 11-27
- Chapter
Summary

One of the largest and most important parts of the original AMI project was the collection of a multimodal corpus that could be used to underpin the project research. The AMI Meeting Corpus contains 100 hours of synchronized recordings collected using special instrumented meeting rooms. As well as the base recordings, the corpus has been transcribed orthographically, and large portions of it have been annotated for everything from named entities, dialogue acts, and summaries to simple gaze and head movement behaviors. The AMIDA Corpus adds around 10 hours of recordings in which one person uses desktop videoconferencing to participate from a separate, “remote” location.
Many researchers think of these corpora simply as providing the training and test material for speech recognition or for one of the many language, video, or multimodal behaviors that they have been used to model. However, providing material for machine learning was only one of our concerns. In designing the corpus, we wished to ensure that the data was coherent, realistic, useful for some actual end applications of commercial importance, and equipped with high-quality annotations. That is, we set out to provide a data resource that might bias the research towards the basic technologies that would result in useful software components. In addition, we set out to create a resource that would be used not just by computationally oriented researchers, but by other disciplines as well. For instance, corpus linguists need naturalistic data for studying many different aspects of human communication.

Frontmatter
Print publication: 07 June 2012, pp i-iv
- Chapter

References
Print publication: 07 June 2012, pp 238-270
- Chapter
