Abstract: To overcome the mode-mixing problem of the Hilbert-Huang Transform (HHT) in speech processing, this paper proposes a new time-frequency analysis method based on Wavelet Packet Decomposition (WPD). First, the noise-corrupted speech is decomposed by WPD, Empirical Mode Decomposition (EMD) is applied to each component separately, and the Intrinsic Mode Functions (IMFs) are selected using a correlation threshold criterion. Then, the Hilbert spectrum and the instantaneous energy spectrum of the speech signal are obtained. Finally, the WPD-based instantaneous energy spectrum is applied to endpoint detection in noise-corrupted speech. Experimental results indicate that the proposed method is more accurate, robust, and self-adaptive than the original generalized dimension (OGD) and spectral entropy (SE) algorithms. The proposed method effectively describes the time-frequency characteristics of non-linear, non-stationary speech signals and offers a new approach to speech signal research.
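
The last stages of the pipeline described above (IMF selection by correlation, instantaneous energy from the Hilbert transform, and energy-based endpoint detection) can be illustrated with a minimal NumPy sketch. This is not the paper's implementation: the WPD and EMD steps are omitted and the components are assumed to come from an external decomposition. The function names, the relative correlation threshold `ratio=0.1`, the energy threshold `threshold_ratio=0.2`, and the synthetic test signal are all illustrative assumptions, since the abstract does not state the actual criterion or parameter values.

```python
import numpy as np


def analytic_signal(x):
    """Analytic signal of a real sequence via the standard FFT construction."""
    n = len(x)
    spectrum = np.fft.fft(x)
    weights = np.zeros(n)
    weights[0] = 1.0                      # keep the DC term once
    if n % 2 == 0:
        weights[n // 2] = 1.0             # keep the Nyquist term once
        weights[1:n // 2] = 2.0           # double the positive frequencies
    else:
        weights[1:(n + 1) // 2] = 2.0
    return np.fft.ifft(spectrum * weights)


def select_by_correlation(signal, components, ratio=0.1):
    """Keep components whose |correlation| with the signal is at least
    ratio * (largest |correlation|). The ratio is an assumed value."""
    corr = np.array([abs(np.corrcoef(signal, c)[0, 1]) for c in components])
    keep = corr >= ratio * corr.max()
    return [c for c, k in zip(components, keep) if k]


def instantaneous_energy(components):
    """Sum of squared instantaneous amplitudes over the given components."""
    return sum(np.abs(analytic_signal(c)) ** 2 for c in components)


def detect_endpoints(energy, threshold_ratio=0.2):
    """First and last sample whose energy reaches threshold_ratio * max.
    A deliberately simple rule, assumed here for illustration only."""
    active = np.flatnonzero(energy >= threshold_ratio * energy.max())
    return int(active[0]), int(active[-1])


# Synthetic stand-in for decomposed speech: a 200 Hz burst occupying
# samples 3000..4999, plus a weak noise component (hypothetical data).
rng = np.random.default_rng(0)
fs = 8000
n = 8000
t = np.arange(n) / fs
burst = np.zeros(n)
burst[3000:5000] = np.sin(2 * np.pi * 200 * t[:2000])
noise = 0.01 * rng.standard_normal(n)
noisy = burst + noise

# In the method described above, these components would be the IMFs
# obtained by applying EMD to each WPD sub-band.
kept = select_by_correlation(noisy, [burst, noise])
energy = instantaneous_energy(kept)
start, end = detect_endpoints(energy)
print(len(kept), start, end)
```

In this toy case the noise component is only weakly correlated with the noisy signal, so it is discarded before the energy is computed, and the detected endpoints fall close to the true burst boundaries. A fixed relative threshold is the simplest possible choice; the self-adaptive behaviour claimed in the abstract would require a rule that follows the noise level.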