Content-Dependent x-vector for Text-Dependent Speaker Verification

doi:10.16337/j.1004-9037.2020.05.006


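Authors: Chen Yafeng (陈亚峰), Guo Wu (郭武)
Affiliation: National Engineering Laboratory for Speech and Language Information Processing, University of Science and Technology of China, Hefei 230027, China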

Abstract:

The x-vector system uses a neural network to map a variable-length utterance to a fixed-dimensional embedding that represents speaker information, and it performs well in text-independent speaker verification (SV). This paper applies it to text-dependent SV and extracts different x-vectors for the different contents within one sentence. For model selection, a deep residual network (DRN) is used to obtain more discriminative x-vectors. For a sentence containing multiple words, a separate DRN is trained for each word; at extraction time, each word yields its own x-vector, which is scored separately by its own backend classifier. Finally, the per-word scores are fused to give the final verification result. Experiments on Part III of the RSR2015 dataset show that the proposed method reduces the equal error rate (EER) by 15.34% and 19.7% on the male and female test sets, respectively.
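The per-word scoring and score fusion described in the abstract can be illustrated with a minimal sketch. The abstract does not specify the backend classifier or the fusion rule, so cosine scoring and an equal-weight mean are assumptions used here as stand-ins; the function names, the word keys, and the threshold value are all hypothetical.

```python
import math
from typing import Dict, Sequence

Vector = Sequence[float]


def cosine_score(enroll: Vector, test: Vector) -> float:
    """Cosine similarity between two embeddings of equal dimension."""
    if len(enroll) != len(test):
        raise ValueError("embeddings must have the same dimension")
    dot = sum(a * b for a, b in zip(enroll, test))
    norm = math.sqrt(sum(a * a for a in enroll)) * math.sqrt(sum(b * b for b in test))
    if norm == 0.0:
        raise ValueError("zero-norm embedding")
    return dot / norm


def fuse_word_scores(
    enroll_xvectors: Dict[str, Vector],
    test_xvectors: Dict[str, Vector],
) -> float:
    """Score each word separately, then fuse with an equal-weight mean.

    Assumption: cosine scoring and mean fusion stand in for the paper's
    backend classifiers and fusion rule, which the abstract does not detail.
    """
    shared_words = sorted(set(enroll_xvectors) & set(test_xvectors))
    if not shared_words:
        raise ValueError("no common words between enrollment and test")
    scores = [
        cosine_score(enroll_xvectors[word], test_xvectors[word])
        for word in shared_words
    ]
    return sum(scores) / len(scores)


def verify(
    enroll_xvectors: Dict[str, Vector],
    test_xvectors: Dict[str, Vector],
    threshold: float = 0.5,  # hypothetical operating point
) -> bool:
    """Accept the claimed speaker if the fused score reaches the threshold."""
    return fuse_word_scores(enroll_xvectors, test_xvectors) >= threshold
```

In this sketch each dictionary maps a word in the pass-phrase to the x-vector extracted by that word's own network, so a mismatch on one word lowers the fused score without relying on a single utterance-level embedding.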


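Cite this article

Chen Yafeng, Guo Wu. Content-Dependent x-vector for Text-Dependent Speaker Verification[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2020, 35(5): 850-857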

History

Received: 2019-12-05
Last revised: 2020-04-25
Published online: 2020-10-22
