Semi-supervised Automatic Speech Segmentation for TV-drama Speech Recognition
Affiliation:

The College of Information, Mechanical and Electrical Engineering, Shanghai Normal University, Shanghai 200234, China

CLC Number:

TP918

    Abstract:

    To segment TV-drama speech, for which large, coherent text transcriptions are available but time-stamps are not, this paper proposes an automatic semi-supervised speech segmentation algorithm. First, the original text transcriptions are used to build a biased language model. The model is then applied to TV-drama speech recognition in a semi-supervised way. Finally, the resulting automatic decoding hypotheses are combined with traditional segmentation methods, which are usually based on distance metrics, model classification, or phone recognizers, to improve segmentation performance. Experimental results on the British TV-drama “Doctor Who” database show that the proposed approach achieves a significant performance improvement over the traditional baseline algorithms. The proposed approach also yields high-quality segmentation and the associated transcription alignments for large, coherent TV-drama speech recordings.
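
    The abstract gives no implementation details, so the following is only a minimal sketch of one idea it mentions: attaching time-stamps to an untimed transcript by aligning it with a time-stamped decoding hypothesis. The function name `transfer_timestamps`, the `(word, start, end)` tuple layout, and the use of Python's `difflib.SequenceMatcher` as the aligner are all assumptions made for illustration; they are not the authors' method.

```python
from difflib import SequenceMatcher


def transfer_timestamps(transcript_words, hypothesis):
    """Copy time-stamps from recognizer output onto matching transcript words.

    transcript_words: list of words from the untimed reference transcript.
    hypothesis: list of (word, start_sec, end_sec) tuples (assumed format).
    Returns a list of (word, start_sec, end_sec); unmatched words get None.
    """
    ref = [w.lower() for w in transcript_words]
    hyp = [w.lower() for w, _, _ in hypothesis]

    # Word-level alignment; autojunk is disabled so frequent words
    # are never discarded on long inputs.
    matcher = SequenceMatcher(a=ref, b=hyp, autojunk=False)

    aligned = [(w, None, None) for w in transcript_words]
    for block in matcher.get_matching_blocks():
        for k in range(block.size):
            _, start, end = hypothesis[block.b + k]
            aligned[block.a + k] = (transcript_words[block.a + k], start, end)
    return aligned


# Hypothetical example: the recognizer mis-hears "opened" as "open".
transcript = ["The", "Doctor", "opened", "the", "blue", "door"]
hypothesis = [
    ("the", 0.0, 0.2), ("doctor", 0.2, 0.7), ("open", 0.7, 1.0),
    ("the", 1.0, 1.1), ("blue", 1.1, 1.5), ("door", 1.5, 1.9),
]
aligned = transfer_timestamps(transcript, hypothesis)
```

    In this sketch, words on which transcript and hypothesis agree act as anchors, and a word the recognizer got wrong ("opened" here) is left without a time-stamp. A real system would have to bound such gaps by other means, for example with the traditional segmentation cues the abstract refers to.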

Get Citation

Long Yanhua, Mao Hongwei, Ye Hong. Semi-supervised Automatic Speech Segmentation for TV-drama Speech Recognition[J].,2019,34(2):281-287.

History
  • Received: June 25, 2017
  • Revised: December 29, 2017
  • Online: April 22, 2019