Itakura Fumitada
Professor Emeritus, Nagoya University (Japan)
Dr. Fumitada Itakura began his pioneering research on efficient voice coding during his doctoral course at Nagoya University. Raising the level of research at the Electrical Communication Laboratory of NTT Public Corporation and at AT&T Bell Laboratories, he developed a succession of fundamental new methods. These methods were based on extracting frequency-spectrum parameters from speech signals using statistical approaches. His approach approximated vocal-tract characteristics with an all-pole digital-filter model, transmitted the filter coefficients, and reproduced the original voice with a speech synthesizer. These methods reduced the speech-signal data rate to between 1/10 and 1/20 of that required by pulse-code modulation (PCM), which transmits speech by directly digitizing the waveform.
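As a rough illustration of the all-pole idea described above, the following Python sketch estimates the coefficients of an all-pole filter from one frame of samples and rebuilds the frame from those coefficients plus an excitation signal. It uses the textbook autocorrelation method with the Levinson-Durbin recursion, which is an assumption chosen for brevity and is not presented as Dr. Itakura's own algorithms. All function names, the frame length, and the test filter (1.3, -0.6) are hypothetical.

```python
def autocorrelation(frame, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    n = len(frame)
    return [sum(frame[i] * frame[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the normal equations for predictor coefficients a[1..order],
    where x[n] is predicted as sum(a[k] * x[n-k]).
    Assumes r[0] > 0 (a silent frame would divide by zero)."""
    a = [0.0] * (order + 1)
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err                      # reflection coefficient
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= (1.0 - k * k)               # remaining prediction error
    return a[1:], err


def lpc_analyze(frame, order):
    """Estimate all-pole filter coefficients for one frame."""
    return levinson_durbin(autocorrelation(frame, order), order)


def residual(frame, coeffs):
    """Prediction error e[n] = x[n] - sum(a[k] * x[n-k])."""
    out = []
    for n, x in enumerate(frame):
        pred = sum(c * frame[n - k - 1]
                   for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        out.append(x - pred)
    return out


def synthesize(excitation, coeffs):
    """All-pole synthesis filter: y[n] = e[n] + sum(a[k] * y[n-k])."""
    y = []
    for n, e in enumerate(excitation):
        fb = sum(c * y[n - k - 1]
                 for k, c in enumerate(coeffs) if n - k - 1 >= 0)
        y.append(e + fb)
    return y


# Hypothetical test frame: impulse response of a known stable all-pole filter.
true_coeffs = [1.3, -0.6]
frame = synthesize([1.0] + [0.0] * 399, true_coeffs)

coeffs, err = lpc_analyze(frame, order=2)
rebuilt = synthesize(residual(frame, coeffs), coeffs)
```

Here `coeffs` comes out very close to the filter that generated the frame, and `rebuilt` matches `frame` because the exact residual is used as excitation. A practical coder would replace the residual with a compact excitation description, so that only a few parameters per frame are transmitted instead of every sample. For a sense of scale, if one assumes 64 kbit/s telephone PCM, a reduction to 1/10 to 1/20 corresponds to roughly 6.4 to 3.2 kbit/s.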
Joe Olive
Professor, DARPA Information Processing Techniques Office (IPTO) (U.S.A.)

Dr. Olive has over thirty years of experience in research and development at Bell Laboratories and 19 years of experience in management. He has been the world leader in text-to-speech synthesis research and has managed a world-class team in computer dialogue systems and human-computer communication. In his role as director of speech research and CTO of Lucent's business unit, Lucent Speech Solutions, he supervised the productization of Bell Labs' core speech technologies: Automatic Speech Recognition (ASR), Text-to-Speech Synthesis (TTS), and Speaker Verification (SV). He also led the dialogue research team in creating a "next-generation" dialogue system for e-mail reading and navigation.

Dr. Olive graduated from the University of Chicago with a Ph.D. in Physics. As a graduate student, he worked in computational atomic physics, which required intensive use of computers to compute electron distribution functions. He was also a member of the University of Chicago's computer center. Dr. Olive also earned an M.A. in music composition, a degree that he used to pursue a side career writing music for small chamber groups and orchestras, computer music, and an opera for a computer, soprano, and small ensemble. After leaving the University of Chicago, Dr. Olive combined his interests in computation and music and began research in acoustics and signal processing.

Dr. Olive received a National Endowment for the Arts grant in 1974 to write a computer opera. He also received the Bell Labs Distinguished Member of Technical Staff award in 1984.

Title: Machine Translation at DARPA


This talk will discuss the Chinese and Arabic machine translation work being carried out under DARPA's Global Autonomous Language Exploitation Program. Topics will include preparation for the program, the evaluation paradigm, the current status, and potential future research directions.

Mari Ostendorf
Professor, Department of Electrical Engineering, University of Washington (U.S.A.)

Mari Ostendorf joined the Speech Signal Processing Group at BBN Laboratories in 1985, where she worked on low-rate coding and acoustic modeling for continuous speech recognition. Two years later, she moved to the Department of Electrical and Computer Engineering at Boston University, where her research expanded to include language modeling, prosody modeling, and speech synthesis. She joined the University of Washington in 1999, where she is broadly interested in spoken language technology. She teaches courses in statistical language processing and undergraduate signal processing, and has recently introduced a class on the Digital World of Multimedia, which introduces new undergraduates to signal processing and communications. Her current research efforts are centered on rich speech transcription, particularly for purposes of automatic language processing of speech, with more fundamental interests in learning methods for language technology. She has published over 150 papers on various problems in speech and language processing. Dr. Ostendorf has served on the Speech Processing and the DSP Education Committees of the IEEE Signal Processing Society and on numerous workshop committees.

Title: Translatable Language Technology -- Beyond HMMs and N-grams


This talk looks at the challenge of applying recognition modeling frameworks developed in one language to another, particularly considering English and Mandarin Chinese. As many studies have shown, porting modeling techniques to a new language via retraining can often lead to a good initial baseline, but the results can be limited by language differences. However, many techniques can be "translated", or used in a somewhat modified form, to get good results in the new language, and lessons learned can impact the models used in the original language as well. Using broadcast news and talk shows as data, this presentation will look at a few examples of porting and translating models, considering recognition of high-level structure in speech for translation and information extraction, as well as lower-level issues such as modeling intonation.

Tatsuya Kawahara
Professor, Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University (Japan)
Tatsuya Kawahara received the B.E. degree in 1987, the M.E. degree in 1989, and the Ph.D. degree in 1995, all in information science, from Kyoto University, Kyoto, Japan.
In 1990, he became a Research Associate in the Department of Information Science, Kyoto University. From 1995 to 1996, he was a Visiting Researcher at Bell Laboratories, Murray Hill, NJ, USA. Currently, he is a Professor in the Academic Center for Computing and Media Studies and an Affiliated Professor in the School of Informatics, Kyoto University. He has also been an Invited Researcher at ATR, currently the National Institute of Information and Communications Technology (NICT).
He has published more than 200 technical papers on speech recognition, spoken language processing, and spoken dialogue systems. He has been managing several speech-related projects in Japan including a free large vocabulary continuous speech recognition software project (
Dr. Kawahara received the 1997 Awaya Memorial Award from the Acoustical Society of Japan and the 2000 Sakai Memorial Award from the Information Processing Society of Japan. From 2003 to 2006, he was a member of the IEEE SPS Speech Technical Committee. He was a general chair of the IEEE Automatic Speech Recognition & Understanding workshop (ASRU-2007). He is a senior member.
Title: Automatic Transcription of Parliamentary Meetings and Classroom Lectures -- A Sustainable Approach and Real System Evaluations --


Applications of automatic speech recognition (ASR) have been extended to a variety of tasks and domains, including spontaneous human-human speech. We have developed an ASR system for the Japanese Parliament (Diet), which is being deployed this year. By exploiting official records made by human stenographers, we have realized an efficient training scheme for acoustic and language models that does not require faithful transcripts and thus scales to very large amounts of data. Evaluation results and remaining issues are presented. We are also working on an ASR system for classroom lectures, intended to assist hearing-impaired students. Because university classroom lectures are highly technical, we investigate a number of methods for adapting the acoustic and language models. A trial of real-time captioning for a hearing-impaired student at our university is reported.

