Interactive Dual-Branch Monaural Speech Enhancement Model Based on Critical Frequency Band

Affiliation:

1. Department of Electronic Engineering and Information Science, University of Science and Technology of China, Hefei 230022, China; 2. National Engineering Research Center of Speech and Language Information Processing, Hefei 230022, China

Fund Project:

National Natural Science Foundation of China (61671418).

    Abstract:

    Current mainstream dual-branch monaural speech enhancement methods attend only to full-band information and ignore sub-band information. To address this problem, an interactive dual-branch model based on the critical frequency bands of the human ear is proposed. The complex spectrum branch divides the signal into sub-bands with a partition that imitates the critical bands of the human ear and extracts sub-band information, while the amplitude compensation branch processes the full band of the signal directly and extracts full-band information. The complex spectrum branch is responsible for the initial recovery of the amplitude and phase of the clean speech. At the same time, the intermediate sub-band features learned on this branch are passed by dedicated modules to the amplitude compensation branch as compensation. The output of the amplitude compensation branch then further compensates the amplitude of the complex spectrum branch's output, so that the clean speech spectrum is recovered. Experimental results show that the proposed model outperforms other advanced monaural speech enhancement models in restoring speech quality and intelligibility.
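The two signal-level ideas in the abstract, a critical-band partition of the spectrum and an amplitude correction applied to a complex-spectrum estimate, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not give the band layout or the form of the compensation, so the sketch assumes Zwicker's standard Bark band edges, a 16 kHz sampling rate with a 512-point FFT, and a simple additive amplitude correction. The names `critical_band_slices` and `combine_branches` are hypothetical.

```python
import numpy as np

# Zwicker's critical-band (Bark) edges in Hz. Using this table is an
# assumption; the paper's exact band layout is not stated in the abstract.
BARK_EDGES_HZ = [0, 100, 200, 300, 400, 510, 630, 770, 920, 1080, 1270, 1480,
                 1720, 2000, 2320, 2700, 3150, 3700, 4400, 5300, 6400, 7700,
                 9500, 12000, 15500]


def critical_band_slices(n_fft=512, sample_rate=16000):
    """Return one slice of one-sided FFT bins per critical band (hypothetical helper)."""
    n_bins = n_fft // 2 + 1
    nyquist = sample_rate / 2
    # Keep the edges below Nyquist and close the last band at Nyquist.
    edges_hz = [e for e in BARK_EDGES_HZ if e < nyquist] + [nyquist]
    # Map each edge frequency to the nearest FFT bin index.
    edges_bin = [int(round(e * n_fft / sample_rate)) for e in edges_hz]
    edges_bin[-1] = n_bins  # make the last band include the Nyquist bin
    return [slice(lo, hi)
            for lo, hi in zip(edges_bin[:-1], edges_bin[1:]) if hi > lo]


def combine_branches(complex_spec, amp_correction):
    """Correct the amplitude of a complex estimate while keeping its phase.

    Additive correction and clamping at zero are assumptions made for
    illustration, not details taken from the paper.
    """
    amplitude = np.maximum(np.abs(complex_spec) + amp_correction, 0.0)
    phase = np.angle(complex_spec)
    return amplitude * np.exp(1j * phase)


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    # Stand-in for one noisy utterance: 10 frames of a 257-bin complex spectrum.
    spec = rng.standard_normal((10, 257)) + 1j * rng.standard_normal((10, 257))
    bands = critical_band_slices()
    sub_bands = [spec[:, band] for band in bands]  # sub-band inputs
    enhanced = combine_branches(spec, np.zeros(spec.shape))
    print(len(sub_bands), enhanced.shape)
```

Under these assumptions the partition yields 22 contiguous bands covering all 257 bins, narrow at low frequencies and wide at high frequencies, which is the property a critical-band split is meant to capture. In a trained model, the zero correction in the demo would be replaced by the output of the amplitude compensation branch.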

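Cite this article

YE Zhongfu, ZHAO Ziwei, YU Runxiang. Interactive dual-branch monaural speech enhancement model based on critical frequency band[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2023, 38(2): 262-273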

History
  • Received: 2022-10-24
  • Last revised: 2023-02-27
  • Accepted:
  • Published online: 2023-03-25