Abstract: To automatically annotate a special kind of video, namely lecture videos, a method is proposed that extracts caption information from the video. The extracted subtitle text is modeled with latent Dirichlet allocation (LDA) to obtain each document's probability distribution over topics. The distances between these probability distributions are then calculated and used to segment the video into semantic shots. Each semantic shot is treated as a sample for the safe semi-supervised support vector machine (S4VM) method: a small number of labeled semantic shots serve as training samples, and the unlabeled shots are annotated automatically. Experimental results show that the proposed method not only performs semantic shot segmentation effectively, but also annotates the video with keywords.
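
The segmentation step, which compares the topic distributions of consecutive subtitle documents, can be illustrated with a minimal sketch. The abstract does not name the distance measure or the boundary rule, so the Jensen-Shannon divergence, the fixed `threshold`, and the function names below are all assumptions chosen for illustration, and the topic distributions are made-up inputs rather than LDA output from the paper.

```python
import math

def kl_divergence(p, q):
    """KL(p || q) in bits; terms with p_i == 0 contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def js_divergence(p, q):
    """Jensen-Shannon divergence (an assumed distance; the abstract names none)."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl_divergence(p, m) + 0.5 * kl_divergence(q, m)

def segment_boundaries(topic_dists, threshold=0.3):
    """Return each index i at which document i starts a new semantic shot.

    A boundary is placed wherever the distance between consecutive
    topic distributions exceeds the (hypothetical) threshold.
    """
    return [
        i
        for i in range(1, len(topic_dists))
        if js_divergence(topic_dists[i - 1], topic_dists[i]) > threshold
    ]

# Illustrative per-document topic distributions (3 topics, 4 documents).
topic_dists = [
    [0.90, 0.05, 0.05],
    [0.85, 0.10, 0.05],
    [0.10, 0.80, 0.10],
    [0.05, 0.90, 0.05],
]
print(segment_boundaries(topic_dists))  # [2]
```

With base-2 logarithms the Jensen-Shannon divergence lies between 0 and 1, which makes a fixed threshold easy to interpret; any other distance between probability distributions could be substituted in `js_divergence` without changing the boundary logic.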