Speech Enhancement

1st Edition

A Signal Subspace Perspective

Authors: Jacob Benesty, Jesper Jensen, Mads Graesboll Christensen, Jingdong Chen
Paperback ISBN: 9780128001394
eBook ISBN: 9780128002537
Imprint: Academic Press
Published Date: 10th January 2014
Page Count: 138

Description

Speech enhancement is a classical problem in signal processing, yet still largely unsolved. Two of the conventional approaches for solving this problem are linear filtering, like the classical Wiener filter, and subspace methods. These approaches have traditionally been treated as different classes of methods and have been introduced in somewhat different contexts. Linear filtering methods originate in stochastic processes, while subspace methods have largely been based on developments in numerical linear algebra and matrix approximation theory.

This book bridges the gap between these two classes of methods by showing how the ideas behind subspace methods can be incorporated into traditional linear filtering. In the context of subspace methods, the enhancement problem can then be seen as a classical linear filter design problem. This means that various solutions can more easily be compared and their performance bounded and assessed in terms of noise reduction and speech distortion. The book shows how various filter designs can be obtained in this framework, including the maximum SNR, Wiener, LCMV, and MVDR filters, and how these can be applied in various contexts, like in single-channel and multichannel speech enhancement, and in both the time and frequency domains.

Key Features

  • First short book treating subspace approaches in a unified way for time and frequency domains, single-channel, multichannel, as well as binaural, speech enhancement
  • Bridges the gap between optimal filtering methods and subspace approaches
  • Includes original presentation of subspace methods from different perspectives

Readership

Signal processing researchers and R&D engineers in industry

Table of Contents

Chapter 1. Introduction

Abstract

1.1 History and Applications of Subspace Methods

1.2 Speech Enhancement from a Signal Subspace Perspective

1.3 Scope and Organization of the Work

References

Chapter 2. General Concept with the Diagonalization of the Speech Correlation Matrix

Abstract

2.1 Signal Model and Problem Formulation

2.2 Linear Filtering with a Rectangular Matrix

2.3 Performance Measures

2.4 Optimal Rectangular Filtering Matrices

References

Chapter 3. General Concept with the Joint Diagonalization of the Speech and Noise Correlation Matrices

Abstract

3.1 Signal Model and Problem Formulation

3.2 Linear Filtering with a Rectangular Matrix

3.3 Performance Measures

3.4 Optimal Rectangular Filtering Matrices

3.5 Another Signal Model

References

Chapter 4. Single-Channel Speech Enhancement in the Time Domain

Abstract

4.1 Signal Model and Problem Formulation

4.2 Linear Filtering with a Rectangular Matrix

4.3 Performance Measures

4.4 Optimal Rectangular Filtering Matrices

4.5 Single-Channel Noise Reduction Revisited

References

Chapter 5. Multichannel Speech Enhancement in the Time Domain

Abstract

5.1 Signal Model and Problem Formulation

5.2 Linear Filtering with a Rectangular Matrix

5.3 Performance Measures

5.4 Optimal Rectangular Filtering Matrices

References

Chapter 6. Multichannel Speech Enhancement in the Frequency Domain

Abstract

6.1 Signal Model and Problem Formulation

6.2 Linear Array Model

6.3 Performance Measures

6.4 Optimal Filters

References

Chapter 7. A Bayesian Approach to the Speech Subspace Estimation

Abstract

7.1 Signal Model and Problem Formulation

7.2 Estimation Based on the Minimum Mean

Details

No. of pages: 138
Language: English
Copyright: © Academic Press 2014
Published: 10th January 2014
Imprint: Academic Press
eBook ISBN: 9780128002537
Paperback ISBN: 9780128001394

About the Author

Jacob Benesty

Jacob Benesty received a Master's degree in microwaves from Pierre & Marie Curie University, France, in 1987, and a Ph.D. degree in control and signal processing from Orsay University, France, in April 1991. During his Ph.D. (from Nov. 1989 to Apr. 1991), he worked on adaptive filters and fast algorithms at the Centre National d’Études des Télécommunications (CNET), Paris, France. From January 1994 to July 1995, he worked at Telecom Paris University on multichannel adaptive filters and acoustic echo cancellation. From October 1995 to May 2003, he was first a Consultant and then a Member of the Technical Staff at Bell Laboratories, Murray Hill, NJ, USA. In May 2003, he joined the University of Quebec, INRS-EMT, in Montreal, Quebec, Canada, as a Professor. His research interests are in signal processing, acoustic signal processing, and multimedia communications. He is the inventor of many important technologies. In particular, he was the lead researcher at Bell Labs who conceived and designed the world's first real-time hands-free full-duplex stereophonic teleconferencing system. He also conceived and designed the world's first PC-based multi-party hands-free full-duplex stereo conferencing system over IP networks. He was the co-chair of the 1999 International Workshop on Acoustic Echo and Noise Control and the general co-chair of the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics. He is the recipient, with Morgan and Sondhi, of the IEEE Signal Processing Society 2001 Best Paper Award. He is the recipient, with Chen, Huang, and Doclo, of the IEEE Signal Processing Society 2008 Best Paper Award. He is also the co-author of a paper for which Huang received the IEEE Signal Processing Society 2002 Young Author Best Paper Award. In 2010, he received the “Gheorghe Cartianu Award” from the Romanian Academy. In 2011, he received the Best Paper Award from the IEEE WASPAA for a paper that he co-authored with Jingdong Chen.

Affiliations and Expertise

INRS-EMT, University of Quebec, Canada.

Jesper Jensen

Jesper R. Jensen received the M.Sc. degree cum laude from Aalborg University, Denmark, in 2009, having completed the elite candidate education. In 2012, he received the Ph.D. degree from Aalborg University. Currently, he is a Postdoctoral Researcher at the Department of Architecture, Design & Media Technology at Aalborg University in Denmark, where he is also a member of the Audio Analysis Lab. He has been a Visiting Researcher at the University of Quebec, INRS-EMT, in Montreal, Quebec, Canada. He has published several papers in peer-reviewed conference proceedings and journals. His research interests include digital signal processing and microphone array signal processing theory and methods, with application to speech and audio signals. In particular, he is interested in parametric analysis, modeling, and extraction of such signals.

Affiliations and Expertise

Aalborg University, Denmark.

Mads Graesboll Christensen

Mads G. Christensen received the M.Sc. and Ph.D. degrees in 2002 and 2005, respectively, from Aalborg University (AAU) in Denmark, where he is currently employed at the Dept. of Architecture, Design & Media Technology as Professor in Audio Processing. At AAU, he is head of the Audio Analysis Lab, which conducts research in audio signal processing. He was formerly with the Dept. of Electronic Systems, Aalborg University, and has been a Visiting Researcher at Philips Research Labs, ENST, UCSB, and Columbia University. He has published more than 100 papers in peer-reviewed conference proceedings and journals as well as one research monograph. His research interests include digital signal processing theory and methods with application to speech and audio, in particular parametric analysis, modeling, enhancement, separation, and coding. Prof. Christensen has received several awards, including an ICASSP Student Paper Award, the Spar Nord Foundation’s Research Prize for his Ph.D. thesis, a Danish Independent Research Council Young Researcher’s Award, and the Statoil Prize, as well as prestigious grants from the Danish Independent Research Council and the Villum Foundation’s Young Investigator Programme. He is an Associate Editor for IEEE Transactions on Audio, Speech, and Language Processing, and has previously served as an Associate Editor for IEEE Signal Processing Letters.

Affiliations and Expertise

Aalborg University, Denmark.

Jingdong Chen

Jingdong Chen received the Ph.D. degree in pattern recognition and intelligence control from the Chinese Academy of Sciences in 1998. From 1998 to 1999, he was with ATR Interpreting Telecommunications Research Laboratories, Kyoto, Japan, where he conducted research on speech synthesis, speech analysis, and objective measurements for evaluating speech synthesis. He then joined Griffith University, Brisbane, Australia, where he engaged in research on robust speech recognition and signal processing. From 2000 to 2001, he worked at ATR Spoken Language Translation Research Laboratories on robust speech recognition and speech enhancement. From 2001 to 2009, he was a Member of Technical Staff at Bell Laboratories, Murray Hill, New Jersey, working on acoustic signal processing for telecommunications. He subsequently joined WeVoice Inc. in New Jersey, serving as the Chief Scientist. He is currently a professor at the Northwestern Polytechnical University in Xi’an, China. His research interests include acoustic signal processing, adaptive signal processing, speech enhancement, adaptive noise/echo control, microphone array signal processing, signal separation, and speech communication. Dr. Chen is currently an Associate Editor of the IEEE Transactions on Audio, Speech, and Language Processing, a member of the IEEE Audio and Electroacoustics Technical Committee, and a member of the editorial advisory board of the Open Signal Processing Journal. He was the Technical Program Co-Chair of the 2009 IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) and the Technical Program Chair of IEEE TENCON 2013, and helped organize many other conferences. Dr. Chen received the 2008 Best Paper Award from the IEEE Signal Processing Society (with Benesty, Huang, and Doclo), the best paper award from the IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA) in 2011 (with Benesty), the Bell Labs Role Model Teamwork Award twice,

Affiliations and Expertise

Northwestern Polytechnical University, China.
