Abstract: Throat microphones are used in many high-noise environments because of their strong noise immunity. However, the speech they capture has excess energy in the mid frequencies and severe loss of high-frequency content, which degrades speech quality and intelligibility. To improve speech quality, a blind speech enhancement algorithm based on long short-term memory recurrent neural networks (LSTM-RNN) is proposed. In contrast to previous estimation algorithms based on low-dimensional spectral envelope features, the algorithm first directly models the mapping between the high-dimensional log-amplitude spectra of throat-microphone speech and air-conducted microphone speech; the LSTM-RNN captures contextual information effectively when reconstructing the signal. Second, the estimated amplitude spectrum is post-processed with non-negative matrix factorization (NMF), which suppresses over-smoothing and further improves speech quality. Simulation results measured by log-likelihood ratio (LLR), log-spectral distance (LSD), and perceptual evaluation of speech quality (PESQ) show that the algorithm effectively improves the quality of throat-microphone speech.
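Of the three evaluation measures named above, LSD is the simplest to state. The abstract does not give the exact definition used, so the following is only a minimal sketch of one common form: the root-mean-square difference between two log spectra (in dB) over frequency, averaged over frames. The function name `log_spectral_distance`, the `eps` floor, and the list-of-frames input layout are assumptions made for illustration, not details taken from the paper.

```python
import math


def log_spectral_distance(ref_mag, est_mag, eps=1e-10):
    """Mean log-spectral distance in dB between two magnitude spectrograms.

    Each argument is a sequence of frames, and each frame is a sequence of
    non-negative magnitudes (one per frequency bin). This layout is an
    assumption for this sketch.
    """
    if len(ref_mag) != len(est_mag) or len(ref_mag) == 0:
        raise ValueError("spectrograms must have the same, non-zero frame count")

    per_frame = []
    for ref_frame, est_frame in zip(ref_mag, est_mag):
        if len(ref_frame) != len(est_frame) or len(ref_frame) == 0:
            raise ValueError("frames must have the same, non-zero bin count")
        # Squared difference of the log spectra in dB; eps guards against log(0).
        squared = [
            (20.0 * math.log10(max(r, eps)) - 20.0 * math.log10(max(e, eps))) ** 2
            for r, e in zip(ref_frame, est_frame)
        ]
        # Root-mean-square over frequency bins for this frame.
        per_frame.append(math.sqrt(sum(squared) / len(squared)))

    # Average the per-frame distances over all frames.
    return sum(per_frame) / len(per_frame)
```

Under this definition, identical spectrograms give a distance of 0 dB, and lower values indicate an estimated spectrum closer to the air-conducted reference.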