Speech Transmission System Based on Personalized Federated Learning and Semantic Communication
Author:
Affiliation:

School of Communications and Information Engineering, Nanjing University of Posts and Telecommunications, Nanjing 210003, China

Fund Project:

National Natural Science Foundation of China (No. 62071242).

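    Summary:

    For multi-user speech transmission scenarios, this paper proposes a deep learning based semantic communication system using federated learning based on hypernetworks (DeepSC-FedHN). The edge server uses a hypernetwork to assess the importance of each module in every local user's semantic encoder and generates a personalized aggregation weight matrix that updates the corresponding model parameters. In parallel, a federated learning (FL) algorithm aggregates the channel codec and semantic decoder parts of the model. Experimental results show that the proposed DeepSC-FedHN scheme overall outperforms the local training scheme, the federated averaging (FedAvg) scheme, the federated proximal (FedProx) scheme, and the deep learning based semantic communication system using layer-wise personalized federated learning (DeepSC-pFedLA).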

    Abstract:

    In multi-user speech transmission scenarios, statistical heterogeneity among users' data degrades transmission performance when all users share a single semantic communication based speech transmission model. To address this problem, this paper proposes a deep learning based semantic communication system using federated learning based on hypernetworks (DeepSC-FedHN), which lets each user obtain a personalized model adapted to its own data characteristics without compromising data privacy. Specifically, because different modules of the semantic encoder play different roles in extracting semantic information, the edge server employs a per-user hypernetwork that dynamically evaluates the importance of each module in the semantic encoder and generates a personalized aggregation weight matrix. This matrix is then used to update the corresponding model parameters, tailoring the global knowledge to each user's needs. Meanwhile, since the channel codec and semantic decoder are not involved in extracting the semantic features of each local user's data, the standard federated averaging (FedAvg) algorithm performs weighted aggregation and updates on the channel codecs and semantic decoders of all users. Experimental results on the TIMIT and Edinburgh DataShare datasets show that the proposed DeepSC-FedHN scheme significantly improves speech transmission performance: it outperforms conventional local training, standard FedAvg, the federated proximal (FedProx) method, and the layer-wise personalized FL scheme (DeepSC-pFedLA) in terms of perceptual evaluation of speech quality (PESQ), signal-to-distortion ratio (SDR), and short-time objective intelligibility (STOI), particularly in non-independent and identically distributed (non-IID) data settings. In addition, DeepSC-FedHN generalizes better to unseen speakers' data and incurs significantly lower computational overhead for model aggregation than DeepSC-pFedLA. We conclude that using a hypernetwork to generate personalized weights is an effective mechanism for handling data heterogeneity in federated semantic communication systems, yielding better and more adaptable speech transmission performance while preserving user data privacy.
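
The two aggregation paths described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not specify the hypernetwork architecture or how its outputs are normalized, so the importance scores are passed in as plain numbers and a softmax is assumed to turn them into aggregation weights. The function names, the flattened-list parameter format, and the module names used below (`conv`, `attn`, `dec`) are all hypothetical.

```python
import math
from typing import Dict, List

# Hypothetical parameter format: module name -> flattened parameter vector.
Params = Dict[str, List[float]]


def softmax(scores: List[float]) -> List[float]:
    """Numerically stable softmax, so the weights for one module sum to 1."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def weighted_sum(vectors: List[List[float]], weights: List[float]) -> List[float]:
    """Element-wise weighted sum of equally sized parameter vectors."""
    return [sum(w * v[i] for w, v in zip(weights, vectors))
            for i in range(len(vectors[0]))]


def fedavg(client_params: List[Params], num_samples: List[int]) -> Params:
    """FedAvg: average each module over clients, weighted by local sample count."""
    total = sum(num_samples)
    weights = [n / total for n in num_samples]
    return {name: weighted_sum([p[name] for p in client_params], weights)
            for name in client_params[0]}


def personalized_aggregate(encoder_params: List[Params],
                           module_scores: Dict[str, List[float]]) -> Params:
    """Build one target user's semantic encoder from all users' encoders.

    module_scores[name][j] is an assumed importance score of user j's copy of
    module `name` for the target user. In the paper such scores would be
    produced by that user's hypernetwork; here they are given as inputs.
    """
    return {name: weighted_sum([p[name] for p in encoder_params],
                               softmax(module_scores[name]))
            for name in encoder_params[0]}
```

In this sketch, a user whose scores for a module are all equal receives the plain mean of that module across users, while a user whose scores strongly favor its own copy keeps that module almost unchanged. The channel codec and semantic decoder would instead go through `fedavg`, so every user receives the same aggregated copy of those parts.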

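Cite this article

刘月照, 郭海燕, 王添顺, 陈飞飞. Speech Transmission System Based on Personalized Federated Learning and Semantic Communication[J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2026, (1): 117-131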

History
  • Received: 2024-09-29
  • Last revised: 2025-02-18
  • Accepted:
  • Published online: 2026-02-13