Semi-supervised Automatic Speech Segmentation for TV-drama Speech Recognition

doi:10.16337/j.1004-9037.2019.02.010


Authors: LONG Yanhua, MAO Hongwei, YE Hong
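Affiliation: College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China
Funding: Shanghai Sailing Program for Young Scientific and Technological Talents (14YF1409300); National Natural Science Foundation of China (61701306).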

Abstract:

This paper proposes a semi-supervised automatic speech segmentation algorithm for TV-drama speech that comes with long, continuous text transcriptions but no time stamps. First, the original transcriptions are used to build a biased language model. The model is then applied to TV-drama speech recognition in a semi-supervised way. Finally, the resulting automatic speech recognition decoding hypotheses are used to improve traditional speech segmentation methods based on distance metrics, model classification, and phone recognition. Experimental results on a dataset from the British science-fiction TV drama "Doctor Who" show that the proposed algorithm clearly outperforms the traditional segmentation baselines. It not only addresses the automatic segmentation of long continuous audio in TV-drama speech recognition, but also splits the corresponding long transcription into matching segments, so that the time stamps of each speech segment and its associated text remain accurate.
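The idea of using decoding results to segment both the audio and its untimed transcript can be illustrated with a minimal Python sketch. This is not the authors' implementation: it assumes a hypothetical ASR decoder has already produced words with start and end times, aligns them to the reference transcript with the standard-library `difflib.SequenceMatcher`, and cuts wherever the silence between two matched words exceeds a threshold. The function name `segment_by_alignment`, the `min_pause` parameter, and the sample data are all illustrative assumptions, and the sketch leaves out the distance-metric, model-classification, and phone-recognition components the abstract mentions.

```python
from difflib import SequenceMatcher


def segment_by_alignment(ref_words, hyp, min_pause=0.5):
    """Split an untimed transcript into timed segments (illustrative only).

    ref_words: reference transcript as a list of words, no time stamps.
    hyp:       list of (word, start_sec, end_sec) from a hypothetical decoder.
    min_pause: silence in seconds between matched words that triggers a cut.
    Returns a list of (start_sec, end_sec, text) tuples.
    """
    hyp_words = [word for word, _, _ in hyp]
    matcher = SequenceMatcher(a=ref_words, b=hyp_words, autojunk=False)

    # Anchors: reference word index plus the time span of the matching
    # hypothesis word. Only exactly matching words become anchors.
    anchors = []
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = hyp[block.b + k]
            anchors.append((block.a + k, start, end))
    if not anchors:
        return []

    segments = []
    seg_first = 0              # first reference word of the current segment
    seg_start = anchors[0][1]  # start time of the current segment
    for prev, nxt in zip(anchors, anchors[1:]):
        _, _, prev_end = prev
        next_index, next_start, _ = nxt
        if next_start - prev_end >= min_pause:
            # Unmatched reference words before the next anchor stay with
            # the earlier segment.
            text = " ".join(ref_words[seg_first:next_index])
            segments.append((seg_start, prev_end, text))
            seg_first, seg_start = next_index, next_start
    segments.append((seg_start, anchors[-1][2], " ".join(ref_words[seg_first:])))
    return segments


# Made-up example: the decoder misrecognizes one word ("a" for "the")
# and there is a long pause before "run".
example_ref = "the doctor opened the door run said the doctor".split()
example_hyp = [
    ("the", 0.0, 0.2), ("doctor", 0.2, 0.7), ("opened", 0.7, 1.1),
    ("a", 1.1, 1.2), ("door", 1.2, 1.6), ("run", 2.8, 3.1),
    ("said", 3.1, 3.4), ("the", 3.4, 3.5), ("doctor", 3.5, 4.0),
]
for segment in segment_by_alignment(example_ref, example_hyp):
    print(segment)
```

Because the segment text is taken from the reference transcript rather than from the decoder output, recognition errors inside a segment do not corrupt the text attached to it; only the cut points depend on the decoder's timing.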


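Cite this article

LONG Yanhua, MAO Hongwei, YE Hong. Semi-supervised Automatic Speech Segmentation for TV-drama Speech Recognition[J]. Journal of Data Acquisition and Processing, 2019, 34(2): 281-287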

History

Received: 2017-06-25
Revised: 2017-12-29
Published online: 2019-04-22
