Improved MELP Algorithm Based on Reconstructive Deep Neural Network

    Abstract:

    To improve the coding and reconstruction performance of deep models, a reconstruction-error constraint based on cross entropy is added to the conventional contrastive divergence (CD) algorithm. The improved algorithm is used to train a reconstructive deep auto-encoder (RDAE), which replaces the vector quantization of the line spectral frequency (LSF) parameters in the mixed excitation linear prediction (MELP) speech coder. Experimental results show that the improved CD algorithm gains reconstruction performance at the cost of some model likelihood. With the RDAE hidden layer set to 19 bits, the weighted LSF distance, the reconstructed speech quality, and the spectral distortion are all better than those of 25-bit vector quantization on both the training set and the test set. The MELP coder improved in this way therefore lowers the coding rate from 2.4 kb/s to 2.1 kb/s, a 12.5% reduction, without degrading speech quality.
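
The abstract does not give the exact objective or update rule, so the following is only a minimal sketch of one plausible reading: a CD-1 update for a single Bernoulli RBM layer, combined with a gradient-descent step on the cross-entropy between the input and its tied-weight reconstruction. The weight `lam`, the learning rate `lr`, the tied-weight reconstruction path, and all function names are assumptions made for illustration, not the authors' formulation.

```python
import numpy as np


def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))


def reconstruction_loss_and_grads(v, W, b, c):
    """Mean cross-entropy between v and its tied-weight reconstruction.

    v: (batch, n_visible) inputs scaled to [0, 1]
    W: (n_visible, n_hidden) weights, b: visible bias, c: hidden bias
    Returns the loss and its gradients with respect to W, b and c.
    """
    n = v.shape[0]
    eps = 1e-12                              # guards log(0)
    h = sigmoid(v @ W + c)                   # encoder pass
    r = sigmoid(h @ W.T + b)                 # decoder pass (tied weights)
    loss = -np.sum(v * np.log(r + eps) + (1 - v) * np.log(1 - r + eps)) / n
    delta_v = (r - v) / n                    # d loss / d decoder pre-activation
    delta_h = (delta_v @ W) * h * (1 - h)    # back-propagated to the encoder
    grad_W = delta_v.T @ h + v.T @ delta_h   # decoder path + encoder path
    return loss, grad_W, delta_v.sum(axis=0), delta_h.sum(axis=0)


def cd1_step_with_reconstruction(v0, W, b, c, rng, lr=0.05, lam=0.5):
    """One CD-1 update plus a cross-entropy reconstruction term (assumed form)."""
    n = v0.shape[0]
    # Positive phase: hidden probabilities and a binary sample.
    h0 = sigmoid(v0 @ W + c)
    h0_sample = (rng.random(h0.shape) < h0).astype(float)
    # Negative phase: one Gibbs step back to the visible layer and up again.
    v1 = sigmoid(h0_sample @ W.T + b)
    h1 = sigmoid(v1 @ W + c)
    # CD-1 estimate of the log-likelihood gradient (to be ascended).
    gW_cd = (v0.T @ h0 - v1.T @ h1) / n
    gb_cd = (v0 - v1).mean(axis=0)
    gc_cd = (h0 - h1).mean(axis=0)
    # Reconstruction term (to be descended), weighted by the assumed lam.
    loss, gW_ce, gb_ce, gc_ce = reconstruction_loss_and_grads(v0, W, b, c)
    W_new = W + lr * (gW_cd - lam * gW_ce)
    b_new = b + lr * (gb_cd - lam * gb_ce)
    c_new = c + lr * (gc_cd - lam * gc_ce)
    return W_new, b_new, c_new, loss
```

Setting `lam=0` in this sketch recovers plain CD-1. A larger `lam` favours reconstruction accuracy over the likelihood estimate, which is the trade-off the abstract reports.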

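The bit-rate figures in the abstract can be checked with a few lines of arithmetic. The sketch below assumes the standard 2.4 kb/s MELP framing of 54 bits per 22.5 ms frame, which the abstract does not state. Only the 25-bit and 19-bit LSF allocations and the 2.4 and 2.1 kb/s rates come from the abstract, and the function names are hypothetical.

```python
FRAME_MS = 22.5        # assumed MELP frame duration (not stated in the abstract)
FRAME_BITS_MELP = 54   # assumed bits per frame at 2.4 kb/s (not stated in the abstract)
LSF_BITS_VQ = 25       # LSF bits with vector quantization (from the abstract)
LSF_BITS_RDAE = 19     # LSF bits with the RDAE code layer (from the abstract)


def bitrate_bps(bits_per_frame, frame_ms=FRAME_MS):
    """Bit rate in bits per second for a fixed frame size and duration."""
    return bits_per_frame * 1000.0 / frame_ms


def relative_reduction(old, new):
    """Fractional reduction when going from old to new."""
    return (old - new) / old


baseline = bitrate_bps(FRAME_BITS_MELP)
proposed = bitrate_bps(FRAME_BITS_MELP - LSF_BITS_VQ + LSF_BITS_RDAE)

if __name__ == "__main__":
    print(f"baseline: {baseline:.0f} b/s, proposed: {proposed:.0f} b/s")
    print(f"reduction from rounded rates: {relative_reduction(2.4, 2.1):.1%}")
    print(f"reduction from frame sizes:   {relative_reduction(baseline, proposed):.1%}")
```

Under these assumptions, saving 6 bits per frame gives about 2.13 kb/s, consistent with the reported 2.1 kb/s. The 12.5% in the abstract follows from the rounded rates 2.4 and 2.1 kb/s; computed from the assumed frame sizes the reduction is 6/54, about 11.1%.
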
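Cite this article

Zhang Xiongwei, Wu Haijia, Zhang Liangliang, Zou Xia. Improved MELP Algorithm Based on Reconstructive Deep Neural Network[J]. Journal of Data Acquisition and Processing, 2015, 30(2): 307-318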

History
  • Published online: 2015-04-23