Multi-channel Speech Enhancement Based on Joint Graph Learning
Authors:

Zhang Pengcheng, Guo Haiyan, Wang Tingting, Yang Zhen

Affiliation:

1.College of Communication and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China;2.National Local Joint Engineering Research Center for Communications and Network Technology, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

Fund Project:

National Natural Science Foundation of China (62071242).

    Abstract:

    The spatial relationships among channels affect multi-channel noise reduction, and graph signal processing can capture these latent relationships. However, a graph built directly from the physical layout of the array cannot reflect the time-varying characteristics of the signals in real time. This paper therefore proposes a multi-channel speech enhancement method based on joint graph learning. First, a joint time-space graph learning method is proposed that optimizes the array spatial graph and the speech intra-frame graph by minimizing the sum of four terms: the smoothness of the multi-channel noisy speech signal on the spatial graph, the smoothness of the reference-channel noisy speech signal on the intra-frame graph, the sparsity of the spatial graph, and the sparsity of the intra-frame graph. The learned spatial graph and intra-frame graph are then used to construct the time-space joint graph of the multi-channel speech signal. On this basis, the multi-channel speech graph signal is transformed with the joint graph Fourier transform and then enhanced with the fixed beamforming (FBF) method. Experimental results show that, compared with the traditional FBF method, the proposed joint graph learning based FBF (JGL-FBF) method significantly improves the signal-to-noise ratio (SNR) and the perceptual evaluation of speech quality (PESQ) score of the enhanced speech. The results also show that the speech enhancement performance of JGL-FBF depends on the accuracy of delay compensation.
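The quantities named in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not confirm: the combinatorial Laplacian L = D - W, an L1 norm of the adjacency weights as the sparsity measure, hypothetical trade-off weights `alpha` and `beta`, and a Cartesian-product joint graph, for which the joint graph Fourier transform separates into a spatial and a temporal transform. The graphs in the demo are fixed placeholders rather than learned graphs, and the optimization step and the FBF stage are omitted.

```python
import numpy as np


def laplacian(W):
    """Combinatorial Laplacian L = D - W for a symmetric, non-negative adjacency W."""
    W = np.asarray(W, dtype=float)
    return np.diag(W.sum(axis=1)) - W


def smoothness(X, L):
    """Laplacian quadratic form tr(X^T L X); each column of the 2-D array X
    is one signal living on the nodes of L."""
    return float(np.trace(X.T @ L @ X))


def learning_objective(X, W_space, W_time, alpha=0.1, beta=0.1, ref=0):
    """Four-term cost: two smoothness terms plus two sparsity terms.

    X has shape (channels, samples_per_frame); `ref` indexes the reference
    channel. The L1 sparsity terms and the weights alpha/beta are assumptions.
    """
    x_ref = X[ref][:, None]  # reference channel as a column vector
    return (smoothness(X, laplacian(W_space))        # channels on the spatial graph
            + smoothness(x_ref, laplacian(W_time))   # samples on the intra-frame graph
            + alpha * np.abs(W_space).sum()          # sparsity of the spatial graph
            + beta * np.abs(W_time).sum())           # sparsity of the intra-frame graph


def joint_gft(X, W_space, W_time):
    """Separable joint GFT of X, assuming a Cartesian-product joint graph."""
    lam_s, U_s = np.linalg.eigh(laplacian(W_space))
    lam_t, U_t = np.linalg.eigh(laplacian(W_time))
    X_hat = U_s.T @ X @ U_t
    return X_hat, (lam_s, U_s), (lam_t, U_t)


def inverse_joint_gft(X_hat, U_s, U_t):
    """Inverse of joint_gft, given the two orthonormal eigenvector bases."""
    return U_s @ X_hat @ U_t.T


if __name__ == "__main__":
    M, N = 4, 8                                   # channels, samples per frame
    W_space = np.ones((M, M)) - np.eye(M)         # placeholder: complete graph
    W_time = np.eye(N, k=1) + np.eye(N, k=-1)     # placeholder: path graph
    X = np.random.default_rng(0).standard_normal((M, N))

    X_hat, (lam_s, U_s), (lam_t, U_t) = joint_gft(X, W_space, W_time)
    print("round trip ok:", np.allclose(inverse_joint_gft(X_hat, U_s, U_t), X))
    print("objective:", learning_objective(X, W_space, W_time))
```

Under the product-graph assumption, the total smoothness of `X` on both graphs equals the sum of `(lam_s[i] + lam_t[j]) * X_hat[i, j]**2` over all coefficients, which is why smooth signals concentrate their energy in the low-index joint spectral coefficients.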

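Cite this article

Zhang Pengcheng (张鹏程), Guo Haiyan (郭海燕), Wang Tingting (王婷婷), Yang Zhen (杨震). Multi-channel speech enhancement based on joint graph learning[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2023, 38(2): 283-292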

History
  • Received: 2022-07-18
  • Last revised: 2022-09-24
  • Published online: 2023-03-25