Recent Research Towards Advanced Man-Machine Interface Through Spoken Language
Edited by
- H. Fujisaki, Science University of Tokyo, Department of Applied Electronics, Tokyo, Japan
Spoken language is the most important means of human information transmission. As we enter the age of the information society, man-machine interfaces through spoken language therefore become increasingly important. Given the extent of the problems involved, however, full realization of such an interface calls for coordinated research efforts beyond the scope of any single group or institution.
A nationwide research project was therefore conceived and started in 1987 as one of the first Priority Research Areas supported by the Ministry of Education, Science and Culture of Japan. The project was carried out in collaboration with over 190 researchers in Japan.
The present volume begins with an overview of the project, followed by 41 papers presented at the symposia. It is expected to serve as an important source of information on each of the nine topics adopted for intensive study under the project, and as a guideline for further work in the important scientific and technological field of spoken language processing.
Published: October 1996
Contents:

Overview.
- Overview of Japanese efforts towards an advanced man-machine interface through spoken language (H. Fujisaki).

Speech Analysis.
- Composite cosine wave analysis and its application to speech signal (S. Saito, K. Tamaribuchi).
- Smoothed group delay analysis and its applications to isolated word recognition (H. Singer et al.).
- A new method of speech analysis - PSE (T. Nakajima, T. Suzuki).
- Estimation of voice source and vocal tract parameters based on ARMA analysis and a model for the glottal source waveform (H. Fujisaki, M. Ljungqvist).
- Estimation of sound pressure distribution characteristics in the vocal tract (N. Miki, K. Motoki).
- Speech production model involving the subglottal structure and oral-nasal coupling due to wall vibration (H. Suzuki et al.).
- On the analysis of predictive data such as speech by a class of single layer connectionist models (F. Fallside).

Feature Extraction.
- Phoneme recognition in continuous speech using feature selection based on mutual information (K. Shirai et al.).
- Dependency of vowel spectra on phoneme environment (T. Kobayashi).
- A preliminary study on a new acoustic feature model for speech recognition (M. Dantsuji, S. Kitazawa).
- A hybrid code for automatic speech recognition (R. DeMori).
- Complementary approaches to acoustic-phonetic decoding of continuous speech (J.-P. Haton).
- Is rule-based acoustic-phonetic speech recognition a dead end? (P.V.S. Rao).

Speech Recognition.
- Speaker-independent phoneme recognition using network units based on the a posteriori probability (J. Miwa).
- Unsupervised speaker adaptation in speech recognition (H. Matsumoto, Y. Yamashita).
- A Japanese text dictation system based on phoneme recognition and a dependency grammar (S. Makino et al.).
- Word recognition using synthesized templates (M. Blomberg et al.).
- A cache-based natural language model for speech recognition (R. DeMori, R. Kuhn).
- On the design of a voice-activated typewriter in French (J.-J. Mariani).
- Speech recognition using hidden Markov models: a CMU perspective (K.-F. Lee et al.).
- Phonetic features and lexical access (K.N. Stevens).

Speech Understanding.
- A large-vocabulary continuous speech recognition system with high prediction capability (M. Shigenaga, Y. Sekiguchi).
- Syntax/semantics-oriented spoken Japanese understanding system: SPOJUS-SYNO/SEMO (S. Nakagawa et al.).
- An application of discourse analysis to speech understanding (Y. Niimi, Y. Kobayashi).

Speech Synthesis.
- Studies on glottal source and formant trajectory models for the synthesis of high quality speech (S. Imaizumi, S. Kiritani).
- A system for synthesis of high-quality speech from Japanese text (H. Fujisaki et al.).
- A text-to-speech system having several prosody options: GK-SS5 (R. Teranishi).
- A Prolog-based automatic text-to-phoneme conversion system for British English (J. Laver et al.).
- Data-bank analysis of speech prosody (G. Fant et al.).

Dialogue Systems.
- Parsing grammatically ill-formed utterances (K. Uehara, J. Toyoda).
- A dialogue analyzing method using a dialogue model (A. Takano et al.).
- Discourse management system for communication through spoken language (Y. Yamashita et al.).
- Towards habitable systems: use of world knowledge to dynamically constrain speech recognition (S.R. Young, W.H. Ward).

Speech Enhancement.
- Noise elimination of speech by vector quantization and neural networks (K. Nakata, A. Sugiura).
- Speech/nonspeech discrimination under nonstationary noise environments (H. Kobatake, A. Ishida).
- Spatially selective multi-microphone system (H. Date, T. Watanabe).

Evaluation.
- Classification of Japanese syllables including speech sounds found in loanwords (S. Hiki).
- A study of the suitability of synthetic speech for proof-reading in relation to the voice quality (H. Kasuya).
- Improving synthetic speech quality by systematic evaluation (L.C.W. Pols).

Speech Database.
- Considerations on a common speech database (S. Itahashi).
- Transcription and alignment of the TIMIT database (V.W. Zue, S. Seneff).