Technical Program

Nov. 29: NCKU, Tainan
Nov. 30: NCKU, Tainan
Dec. 1: NCKU, Tainan
Dec. 2: Sun Moon Lake
Dec. 3: Sun Moon Lake

08:00-08:30  Registration, Registration, Registration (Shuttle buses leave for Sun Moon Lake at 8:30 AM), Technical Tour
Tutorial 1, Plenary 1, Plenary 2, Plenary 3
Refreshment, Refreshment, Refreshment
Tutorial 2, L1, L2, L4, L5, Plenary 4
SIG-CSLP Assembly
Lunch, Closing
Lunch, Lunch
Tutorial 3
P1, P2, Leave Sun Moon Lake
City Tour & Reception
L3, SPE1, L6, SPE2

Tutorials:
Location: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room
Tutorial 1 (Nov. 29: 09:00 - 10:00)
Masato Akagi
School of Information Science, Japan Advanced Institute of Science and Technology (Japan)
Tutorial 2 (Nov. 29: 10:30 - 12:00)
Chin-Hui Lee
School of Electrical and Computer Engineering, Georgia Institute of Technology (USA)
Tutorial 3 (Nov. 29: 13:00 - 15:00)
Junichi Yamagishi & Simon King
Centre for Speech Technology Research, University of Edinburgh (UK)
Plenary Talks:
Location: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room
Plenary 1 (Nov. 30: 09:00 - 10:00)
Fumitada Itakura
Professor Emeritus, Nagoya University (Japan)
Plenary 2 (Dec. 01: 09:00 - 10:00)
Joe Olive
DARPA Information Processing Techniques Office (IPTO) (USA)
Plenary 3 (Dec. 03: 09:00 - 10:00)
Mari Ostendorf
Department of Electrical Engineering, University of Washington (USA)
Plenary 4 (Dec. 03: 10:30 - 11:30)
Tatsuya Kawahara
Department of Intelligence Science and Technology, Graduate School of Informatics, Kyoto University (Japan)
Lecture Sessions:

L1: Speech Enhancement and Robust Speech Recognition

Date: Tuesday, Nov 30, 2010

Time: 10:30-12:30

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room

Chair(s): Chia-Ping CHEN, National Sun Yat-sen University

Zhijian OU, Tsinghua University


L1.1 Intelligibility Investigation of Single-Channel Noise Reduction Algorithms for Chinese and Japanese

Junfeng LI, Chau Duc THANH, Masato AKAGI, Lin YANG, Jianping ZHANG and Yonghong YAN


L1.2 DCT-based Processing of Dynamic Features for Robust Speech Recognition

Wen-Chi LIN, Hao-Teng FAN and Jeih-Weih HUNG


L1.3 Speech Enhancement as a Functional Approximation and Generalization

Xugang LU, Masashi UNOKI, Ryosuke ISOTANI, Hisashi KAWAI and Satoshi NAKAMURA


L1.4 Spectral Trajectory Estimation Using Nonnegative Matrix Factorization for Model-Based Monaural Speech Separation

Chun-Man MAK, Tan LEE and S.W. LEE


L1.5 An Environment Structuring Framework to Facilitating Suitable Prior Density Estimation for MAPLR on Robust Speech Recognition

Yu TSAO, Ryosuke ISOTANI, Hisashi KAWAI and Satoshi NAKAMURA


L1.6 Dual-microphone Noise Reduction Based on Semi-Blind DUET

Zhong-hua FU, Lei XIE and Dong-mei JIANG


L2: Speech Production and Perception

Date: Tuesday, Nov 30, 2010

Time: 10:30-12:30

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 2nd Lecture Room

Chair(s): Jianwu DANG, Japan Advanced Institute of Science and Technology & Tianjin University

Aijun LI, Chinese Academy of Social Sciences


L2.1 Acoustic and Articulatory Analysis on Mandarin Chinese Vowels in Emotional Speech

Aijun LI, Qiang FANG, Fang HU, Lu ZHENG, Hong WANG and Jianwu DANG


L2.2 Effect of Speech Rate on Inter-segmental Coarticulation in Standard Chinese

Yinghao LI and Jiangping KONG


L2.3 Discrimination between Natural and Unnatural Articulations based on Articulatory Structure

Akikazu NISHIKIDO, Shin-ichi KAWAMOTO and Jianwu DANG


L2.4 An Initial Investigation of L1 and L2 Discourse Speech Planning in English

Chiu-yu TSENG, Zhao-yu SU, Chi-Feng HUANG and Tanya VISCEGLIA


L2.5 Effects of Syllable Positions on Taiwanese Mandarin Sibilant Perception

Chenhao CHIU and Molly BABEL


L2.6 Toward a Comprehensive Vowel Space for Whispered Speech



L3: Applications of Spoken Language Processing Technology

Date: Tuesday, Nov 30, 2010

Time: 16:00-18:00

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room

Chair(s): Ea-Ee JAN, IBM T.J. Watson Research Center

Yu HU, iFLYTEK Research


L3.1 Detection of Intonation in L2 English Speech of Native Mandarin Learners

Kun LI, Shuang ZHANG, Mingxing LI, Wai-Kit LO and Helen MENG


L3.2 Improving the Informativeness of Verbose Queries Using Summarization Techniques for Spoken Document Retrieval

Shih-Hsiang LIN, Berlin CHEN and Ea-Ee JAN


L3.3 Forward Optimal Measures for Automatic Mispronunciation Detection

Changliang LIU, Fuping PAN, Fengpei GE, Bin DONG and Yonghong YAN


L3.4 Capturing L2 Mispronunciations with Joint-sequence Models in Computer-Aided Pronunciation Training (CAPT)

Xiaojun QIAN, Helen MENG and Frank SOONG


L3.5 A Novel Approach for Proper Name Transliteration Verification

Ea-Ee JAN, Niyu GE, Shih-Hsiang LIN, Jeffrey SORENSON and Salim ROUKOS


L3.6 Aligning Singing Voice with MIDI Melody Using Synthesized Audio Signal

Minghui DONG, Paul CHAN, Ling CEN and Haizhou LI


L4: Automatic Speech Recognition

Date: Wednesday, Dec 1, 2010

Time: 10:30-12:30

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room

Chair(s): Berlin CHEN, National Taiwan Normal University

Yu TSAO, National Institute of Information and Communications Technology

L4.1 Minimum Generation Error Training for HMM-based Prediction of Articulatory Movements

Tian-Yi ZHAO, Zhen-Hua LING, Ming LEI, Li-Rong DAI and Qing-Feng LIU


L4.2 Mandarin-English Bilingual Phone Modeling and Combining MPE Based Discriminative Training for Cross-Language Speech Recognition

Yanmin QIAN and Jia LIU


L4.3 Subvector-quantized High-density Discrete Hidden Markov Model and its Re-estimation

Guoli YE and Brian MAK


L4.4 Problems of Modeling Phone Deletion in Conversational Speech for Speech Recognition

Brian MAK and Tom KO


L4.5 Speaker Adaptation of Stochastic Segment Models Using Maximum Likelihood Linear Regression

Hao CHAO and Wenju LIU


L4.6 A Study of Large Vocabulary Speech Recognition Decoding Using Finite-State Graphs

Zhijian OU and Ji XIAO


L5: Speech Synthesis

Date: Wednesday, Dec 1, 2010

Time: 10:30-12:30

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 2nd Lecture Room

Chair(s): Chung-Hsien WU, National Cheng Kung University

Jianhua TAO, Chinese Academy of Sciences


L5.1 Rendering a Personalized Photo-Real Talking Head from Short Video Footage

Lijuan WANG, Frank SOONG, Wei HAN and Xiaojun QIAN


L5.2 Automatic Prosody Prediction and Detection with Conditional Random Field (CRF) Models

Yao QIAN and Frank SOONG


L5.3 Development of an Articulatory Visual-Speech Synthesizer to Support Language Learning

Ka-Ho WONG, Wai-Kim LEUNG, Wai-Kit LO and Helen MENG


L5.4 Statistical Modeling of Syllable-Level F0 Features for HMM-based Unit Selection Speech Synthesis

Zhen-Hua LING, Zhi-Guo WANG and Li-Rong DAI


L5.5 Modeling Prosody Patterns for Chinese Expressive Text-to-Speech Synthesis

Zhiyong WU, Lianhong CAI and Helen MENG


L5.6 A Method for Modeling and Generating Mandarin Tone Contour with Phrase Intonation Based on the Generation Process Model

Miaomiao WANG, Miaomiao WEN, Keikichi HIROSE and Nobuaki MINEMATSU


L6: Speaker and Language Recognition

Date: Wednesday, Dec 1, 2010

Time: 16:00-18:00

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 1st Lecture Room

Chair(s): Wei-Ho TSAI, National Taipei University of Technology

Bin MA, Institute for Infocomm Research


L6.1 The Description of iFlyTek Speech Lab System for NIST2009 Language Recognition Evaluation

Ying XU, Yan SONG, Yan-Hua LONG, Hai-Bing ZHONG and Li-Rong DAI


L6.2 UBM Data Selection for Effective Speaker Modeling

Chien-Lin HUANG and Haizhou LI


L6.3 Factor Analysis Based Spatial Correlation Modeling for Speaker Verification

Eryu WANG, Kong Aik LEE, Bin MA, Haizhou LI, Wu GUO and Lirong DAI


L6.4 Dialect-Based Speaker Classification Using Speaker-Invariant Dialect Features

Xuebin MA, Ruiyuan XU, Nobuaki MINEMATSU, Yu QIAO, Keikichi HIROSE and Aijun LI


L6.5 Using Cepstral and Prosodic Features for Chinese Accent Identification

Jue HOU, Yi LIU, Thomas Fang ZHENG, Jesper OLSEN and Jilei TIAN


L6.6 Speaker Verification Using Support Vector Machine with LLR-based Sequence Kernels

Yi-Hsiang CHAO, Wei-Ho TSAI and Hsin-Min WANG

Poster Sessions:

P1: Speech, Speaker, and Language Recognition

Date: Tuesday, Nov 30, 2010

Time: 14:00-15:30

Venue: TBA

Chair(s): Jeih-Weih HUNG, National Chi Nan University

Jui-Feng YEH, National Chiayi University


P1.1 Phonetic Clustering Based Confidence Measure for Embedded Speech Recognition

Zhi-Guo WANG, Cong LIU, Hai-Kun WANG, Yu HU and Li-Rong DAI


P1.2 Audio Visual Speech Recognition Based on Multi-Stream DBN Models with Articulatory Features

Dongmei JIANG, Zhonghua FU, Lei XIE, Hichem SAHLI and Werner VERHELST


P1.3 A Study on Functional Loads of Phonetic Contrasts under Context Based on Mutual Information of Chinese Text and Phonemes

Jinsong ZHANG, Wei LI, Yuxia HOU, Wen CAO and Ziyu XIONG


P1.4 A Study on Hakka and Mixed Hakka-Mandarin Speech Recognition

Tsai-Lu TSAI, Chen-Yu CHIANG, Hsiu-Min YU, Lieh-Shih LO, Yih-Ru WANG and Sin-Horng CHEN


P1.5 Auditory Front-ends for Noise-Robust Automatic Speech Recognition

Ja-Zang YEH and Chia-Ping CHEN


P1.6 Robust Speaker Localization in a Disturbance Noise Environment Using a Distributed Microphone System

Kook CHO, Takanobu NISHIURA and Yoichi YAMASHITA


P1.7 An Integrated Framework for Transcribing Mandarin-English Code-mixed Lectures with Improved Acoustic and Language Modeling

Ching-Feng YEH, Chao-Yu HUANG, Liang-Che SUN and Lin-Shan LEE


P1.8 Large Vocabulary Uyghur Continuous Speech Recognition based on Stems and Suffixes

Xin LI, Shang CAI, Yafei YANG, Jielin PAN and Yonghong YAN


P1.9 Topic-weak-correlated Latent Dirichlet Allocation

Yimin TAN and Zhijian OU


P1.10 Building Topic Mixture Language Models using the Document Soft Classification Notion of Topic Models

Shuanhu BAI, Cheung-Chi LEUNG, Chien-Lin HUANG and Bin MA


P1.11 Data-Driven Lexicon Refinement using Local and Web Resources for Chinese Speech Recognition

Hua ZHANG, Xuan ZHU, Tengrong SU, Kiwan EOM and Jaewon LEE


P1.12 The Psychoacoustic Approach towards Enhancing Speech Intelligibility in Noise

Paul Yaozhu CHAN, Minghui DONG, Ling CEN and Haizhou LI


P1.13 Improving Mandarin Chinese STT System with Random Forests Language Models

Ilya OPARIN, Lori LAMEL and Jean-Luc GAUVAIN


P1.14 Semantics-Based Language Modeling for Cantonese-English Code-mixing Speech Recognition

Houwei CAO, P. C. CHING, Tan LEE and Yu Ting YEUNG


P1.15 Web-Based Keyword Adapted Language Modeling for Keyword Spotting

Wenzhu SHEN, Ji WU and Wei LI


P1.16 Spontaneous Mandarin Speech Understanding Using Utterance Classification: A Case Study

Yun-Cheng JU and Jasha DROPPO


P1.17 Adaptive Segment Model for Spoken Document Retrieval

Chuang-Hua CHUEH and Jen-Tzung CHIEN


P1.18 Sentence Decomplexification Using Holistic Aspect-Based Clause Detection for Long Sentence Understanding

Chao-Hong LIU and Chung-Hsien WU


P1.19 SURE-MSE Speech Enhancement for Robust Speech Recognition

Nengheng ZHENG, Xia LI, Thierry BLU and Tan LEE


P1.20 A Novel Subspace Speech Enhancement Approach Based on Test of Hypothesis and Masking Properties

Wenju LIU, Ning CHENG and Chao LI


P1.21 A Speedup Method for the Separation of Speech Signals in Frequency Domain

Shih-Hsun CHEN and Hsiao-Chuan WANG


P1.22 A Novel Algorithm of Seeking FrFT Order for Speech Enhancement

Duojia MA, Xiang XIE and JingMing KUANG


P1.23 Non-Negative Matrix Factorization Based Discriminative Features for Speaker Verification

Yanhua LONG, Lirong DAI, Eryu WANG, Bin MA and Wu GUO


P1.24 Multidimensional Scaling for Fast Speaker Clustering

Chi-Chun HSIA, Yu-Hsien CHIU, Kuo-Yuan LEE and Chih-Chieh CHUANG


P1.25 An Enhanced Fishervoice Subspace Framework for Text Independent Speaker Verification

Weiwu JIANG, Helen MENG and Zhifeng LI


P1.26 Frame Selection of Interview Channel for NIST Speaker Recognition Evaluation

Hanwu SUN, Bin MA and Haizhou LI


P1.27 Speaker Verification against Synthetic Speech

Lian-Wu CHEN, Wu GUO and Li-Rong DAI


P1.28 Spectro-temporal Smoothed Auditory Spectra for Robust Speaker Identification

Ting-Han LIN, Chung-Chien HSU and Tai-Shih CHI


P1.29 Multi-Feature Combination for Speaker Recognition

Zhi Yi LI, Liang HE, Wei Qiang ZHANG and Jia LIU


P2: Speech Analysis, TTS, and Applications

Date: Wednesday, Dec 1, 2010

Time: 14:00-15:30

Venue: TBA

Chair(s): Yih-Ru WANG, National Chiao Tung University

Hung-Yan GU, National Taiwan University of Science and Technology


P2.1 Effects of F0 Dimensions in Perception of Mandarin Tones

Bin LI and Caicai ZHANG


P2.2 Investigation of the Relation between Acoustic Features and Articulation -- An Application to Emotional Speech Analysis

Yongxin WANG, Jianwu DANG and Lianhong CAI


P2.3 Investigation of Muscle Activation in Speech Production Based on an Articulatory Model

Xiyu WU, Qiang FANG and Jianwu DANG


P2.4 The Relation between Larynx Height and F0 during the Four Tones of Mandarin in X-ray Movie

Gaowu WANG and Jiangping KONG


P2.5 Does Semantic Stress Have Effect on Duration and Pitch Patterns of Prosodic Words in Presenters' Speech?

Yu ZOU, Wei HE, Min HOU and Yonglin TENG


P2.6 Downstep in High-Low Sequences in Chinese

Maolin WANG, Hua WU and Aijun LI


P2.7 Relation between Focus and Accent in Standard Chinese

Yuan JIA and Aijun LI


P2.8 Mandarin Prosodic Break Detection Based on Complementary Model

Chong-Jia NI, Wen-Ju LIU and Bo XU


P2.9 Acoustic Development of Vowels in Children's Speech

Wai-Sum LEE and Eric ZEE


P2.10 GMM-based Voice Conversion with Explicit Modelling on Feature Transform

Ling-Hui CHEN, Zhen-Hua LING, Wu GUO and Li-Rong DAI


P2.11 Study on Attenuated Tone for Mandarin Text-To-Speech

Xiaoyan LOU and Jian LI


P2.12 Automatic Phrase Boundary Labeling for Mandarin TTS Corpus Using Context-Dependent HMM

Chen-Yu YANG, Zhen-Hua LING, Heng LU, Wu GUO and Li-Rong DAI


P2.13 Hierarchical Pitch Target Model for Mandarin Speech

Zhiping ZHANG, Xinhao WANG, Yansuo YU and Xihong WU


P2.14 Generating Emotional Speech from Neutral Speech

Ling CEN, Paul CHAN, Minghui DONG and Haizhou LI


P2.15 Mandarin to Lanzhou Dialect Conversion based on Five Degree Tone Model

Hongwu YANG, Qingqing LIANG, Weitong GUO and Dong PEI


P2.16 Improving GMM-Based Spectral Conversion with Optimal Conversion Function Selection

Hsin-Te HWANG, Weng-Liang WU and Sin-Horng CHEN


P2.17 Prosody Phrase Boundary Prediction with Ensemble Learning

Lifu YI, Jian LI, Lei HE and Jie HAO


P2.18 Error Diagnosis Using Penalized Probabilistic FOIL for Chinese as a Second Language Learner

Ru-Yng CHANG, Chung-Hsien WU and Philips Kokoh PRASETYO


P2.19 Automatic Lexical Stress Detection for Chinese Learners' of English

Jinyu CHEN and Lan WANG


P2.20 Robust Pronunciation Evaluation in Adverse Environments

Si WEI, Qianyong GAO, Guoping HU and Yu HU


P2.21 A Distinctive Feature Based Method for Evaluating the Phonetic Transcription of a Non-Native Speech Database

Jinsong ZHANG, Dongning WANG, Wen CAO and Ziyu XIONG


P2.22 Multi-Modal Feature Integration for Story Boundary Detection in Broadcast News

Mimi LU, Lei XIE, Zhong-hua FU, Dong-mei JIANG and Yan-ning ZHANG


P2.23 Confidence Estimation for Spoken Language Translation based on Round Trip Translation

Dong YU, Wei WEI, Lei JIA and Bo XU


P2.24 High Performance Chinese Spoken Term Detection Based on Term Expansion

Wei LI, Ji WU and Ping LV
Special Sessions:

SPE1: Speech and Language Processing for Han Dialects

Date: Tuesday, Nov 30, 2010

Time: 16:00-18:00

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 2nd Lecture Room

Chair(s): Ming-Shing YU, National Chung-Hsing University


SPE1.1 Perception and Analysis of Linearly Approximated F0 Contours in Cantonese Speech

Yujia LI and Tan LEE


SPE1.2 The Duration Analysis of the Checked Tones in Cantonese Speech

Xiaoying XU, Jianhua TAO, Ling ZHANG and Yingchao LU


SPE1.3 Constructing Online Audio Dictionaries for Bilingual Mandarin-Taiwan Dialects Based on Web 2.0 Concept

Neng-Huang PAN, Feng-Long HUANG, Chun-Hsien HO, Xin-Wei LIN and Shu-Hau SHIU


SPE1.4 Combining HMM Spectrum Models and ANN Prosody Models for Speech Synthesis of Syllable Prominent Languages

Hung-Yan GU, Ming-Yen LAI and Sung-Feng TSAI


SPE1.5 A Combined Approach to the Polysemy Problems in a Chinese to Taiwanese TTS System

Yih-Jeng LIN, Ming-Shing YU and Chin-Yu LIN


SPE1.6 Language Identification in Code-Switching Speech Using Word-based Lexical Model

Dau-Cheng LYU, Ren-Yuan LYU, Cing-Lei ZHU and Ming-Tat KO


SPE2: New Paradigms in ASR

Date: Wednesday, Dec 1, 2010

Time: 16:00-18:00

Venue: NCKU Kuang-Fu Campus, International Conference Hall, 2nd Lecture Room

Chair(s): Hsiao-Chuan WANG, National Tsing Hua University

Yuan-Fu LIAO, National Taipei University of Technology


SPE2.1 A Survey on Recent Progress in the ASAT/SIRKUS Paradigm

Sabato Marco SINISCALCHI, Torbjørn SVENDSEN and Chin-Hui LEE


SPE2.2 Automatic Voice Onset Time Estimation of Stops in Continuous Speech

Chi-Yueh LIN and Hsiao-Chuan WANG


SPE2.3 Human Speech Model Based on Information Separation and Its Application to Speech Processing

Nobuaki MINEMATSU


SPE2.4 Robust Speaker Verification Using Phase Information of Speech

Ning WANG, P. C. CHING and Tan LEE


SPE2.5 Phone Boundary Refinement Using Ranking Methods

Hung-Yi LO and Hsin-Min WANG

TEL:+886 6 2096455
FAX:+886 6 2381422