Time-frequency speech presence probability estimation based on sequential hidden Markov model for speech enhancement

XU Chundong; XIA Risheng; YING Dongwen; LI Junfeng; Fundamental Science on Multiple Information Systems Laboratory; Beijing Institute of Technology; Faculty of Information Engineering; Jiangxi University of Science and Technology; Key Laboratory of Speech Acoustics and Content Understanding; Institute of Acoustics; Chinese Academy of Sciences

Pages: 647-654
Year: 2014 Issue: 5
Journal: Acta Acustica
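Keywords: speech enhancement; probability value; hidden Markov model; segmental SNR; real-time algorithm performance; model parameters; maximum likelihood; power spectrum; posterior probability; time series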
Abstract: Speech presence probability (SPP) estimation is a challenging problem in speech enhancement. Traditional SPP estimators are somewhat heuristic: they are not unified in a theoretical framework, so they cannot guarantee optimal estimation. We present a sequential hidden Markov model (SHMM) that describes the log-power sequence as a dynamic process that transitions between speech and noise states. The emission probability of each state is modeled by a Gaussian function, and the SPP is the posterior probability of the speech state given the observed log-power sequence. To meet real-time requirements, SHMM parameter estimation is simplified to a first-order recursive process in which the model parameter set is updated frame by frame on the basis of maximum likelihood. A comparison of several modeling methods shows the superiority of the SHMM in modeling temporal correlation. Speech enhancement experiments confirm that the constrained SHMM outperforms the conventional Improved Minima Controlled Recursive Averaging (IMCRA) method in terms of segmental SNR and log-spectral distortion.
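
Illustrative sketch (not the authors' implementation): the abstract gives no update equations, so the Python below only shows the general idea for a single frequency bin, under stated assumptions. A two-state HMM (noise, speech) with Gaussian emissions is filtered frame by frame, the speech-state posterior is reported as the SPP, and each state's mean and variance are refreshed by a first-order recursion weighted by that posterior. The function name `track_spp`, the parameters `stay`, `alpha`, `speech_offset`, `init_var` and `var_floor`, and all default values are hypothetical. The paper's "constrained" variant is not reproduced here.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a univariate Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


def track_spp(log_power, stay=0.9, alpha=0.95, speech_offset=10.0,
              init_var=4.0, var_floor=0.5):
    """Return one speech presence probability per frame for one frequency bin.

    All parameter names and defaults are assumptions made for illustration.
    State 0 is noise, state 1 is speech.
    """
    # Assumed initialisation: the first frame is noise, speech sits above it.
    means = [log_power[0], log_power[0] + speech_offset]
    variances = [init_var, init_var]
    posterior = [0.5, 0.5]
    spp = []

    for x in log_power:
        # Prediction step: propagate the previous posterior through the
        # symmetric transition matrix (probability `stay` of keeping a state).
        prior = [
            posterior[0] * stay + posterior[1] * (1.0 - stay),
            posterior[1] * stay + posterior[0] * (1.0 - stay),
        ]

        # Correction step: weight the prior by each state's Gaussian likelihood.
        joint = [prior[k] * gaussian_pdf(x, means[k], variances[k]) for k in (0, 1)]
        total = joint[0] + joint[1]
        # If both likelihoods underflow, fall back to the predicted prior.
        posterior = [j / total for j in joint] if total > 0.0 else prior
        spp.append(posterior[1])

        # First-order recursive update of each state's parameters, with a
        # step size proportional to that state's posterior probability.
        for k in (0, 1):
            step = (1.0 - alpha) * posterior[k]
            diff = x - means[k]
            means[k] += step * diff
            variances[k] += step * (diff * diff - variances[k])
            variances[k] = max(variances[k], var_floor)

    return spp


if __name__ == "__main__":
    # Synthetic log-power track: 20 noise-like frames, then 20 louder frames.
    noise = [0.0, 0.4, -0.4, 0.2, -0.2] * 4
    speech = [15.0, 14.5, 15.5, 14.8, 15.2] * 4
    for frame, p in enumerate(track_spp(noise + speech)):
        print(frame, round(p, 3))
```

On this synthetic track the returned probabilities stay close to 0 over the first 20 frames and close to 1 afterwards. The posterior-weighted step size means a state's parameters move only when that state is likely active, which is one simple way to keep the estimation online and frame by frame.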
