Microblog Topic Summarization Based on Topology Structures
    Abstract:

    Topic summarization is a natural language processing technique that condenses text into a summary of its topical content. Previous work has focused on long texts such as news articles, web pages, and blogs, and has seldom addressed short microblog text. This paper takes microblog retweets as its object and proposes a microblog topic summarization (MTS) algorithm based on topology structures. First, representative terms are selected from the retweet context, that is, the structural relationships between retweeting tweets. Second, topic areas in the retweets are identified, and topic nodes are merged in both the breadth and depth directions. Third, topic summaries with a topology structure are generated from the retweet relationships between adjacent topic nodes on the retweeting graph. Experiments on real-world microblog event datasets show that MTS is effective and feasible, and the resulting topic summaries are presented visually as summary trees that highlight how the topics evolve.
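The three steps named in the abstract can be illustrated with a minimal sketch. This is not the authors' MTS implementation: the abstract does not give the scoring or merging rules, so everything here is an assumption made for illustration, including the function names `representative_terms` and `summarize`, the `own_weight` and `threshold` parameters, the use of Jaccard overlap as the merge test, and the toy data.

```python
from collections import Counter, defaultdict
from itertools import combinations


def representative_terms(tweets, parent, k=2, own_weight=3):
    """Step 1 (assumed rule): score a term by its weighted count in the
    tweet plus its count in the parent tweet, and keep the top k terms."""
    terms = {}
    for node, tokens in tweets.items():
        scores = Counter()
        for tok in tokens:
            scores[tok] += own_weight
        if parent[node] is not None:
            for tok in tweets[parent[node]]:
                scores[tok] += 1
        # Sort by score (descending), then alphabetically for determinism.
        ranked = sorted(scores, key=lambda t: (-scores[t], t))
        terms[node] = set(ranked[:k])
    return terms


def jaccard(a, b):
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def summarize(tweets, parent, threshold=0.5):
    """tweets: id -> token list; parent: id -> parent id (None for the root).
    Ids are assumed to be comparable (e.g. integers)."""
    terms = representative_terms(tweets, parent)
    children = defaultdict(list)
    for node, par in parent.items():
        if par is not None:
            children[par].append(node)

    # Simple union-find: the smallest id in a group acts as its leader.
    leader = {node: node for node in tweets}

    def find(node):
        while leader[node] != node:
            node = leader[node]
        return node

    def union(a, b):
        ra, rb = find(a), find(b)
        if ra != rb:
            leader[max(ra, rb)] = min(ra, rb)

    # Step 2a (depth): merge a retweet into its parent when terms overlap.
    for node, par in parent.items():
        if par is not None and jaccard(terms[node], terms[par]) >= threshold:
            union(node, par)
    # Step 2b (breadth): merge sibling retweets when terms overlap.
    for siblings in children.values():
        for a, b in combinations(siblings, 2):
            if jaccard(terms[a], terms[b]) >= threshold:
                union(a, b)

    # Step 3: topic nodes plus the retweet edges that connect them.
    topics = defaultdict(set)
    for node in tweets:
        topics[find(node)] |= terms[node]
    edges = sorted({(find(par), find(node)) for node, par in parent.items()
                    if par is not None and find(par) != find(node)})
    return {root: sorted(ts) for root, ts in topics.items()}, edges


# Hypothetical toy retweet tree: tweet 1 is the root post.
tweets = {
    1: ["quake", "city", "quake"],
    2: ["quake", "city", "news"],
    3: ["rescue", "team"],
    4: ["donate", "fund"],
    5: ["fund", "donate"],
}
parent = {1: None, 2: 1, 3: 2, 4: 1, 5: 1}
topics, edges = summarize(tweets, parent)
```

On this toy tree the five tweets collapse into three topic nodes (the earthquake report, the rescue effort, and donations), with the last two attached to the first by retweet edges, which is the kind of tree-shaped summary the abstract describes.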

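Cite this article

Zhao Bin, Ji Genlin, Xu Wei, Gu Yanhui. Microblog Topic Summarization Based on Topology Structures[J]. Journal of Data Acquisition and Processing, 2014, 29(5): 720-729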

History
  • Published online: 2014-10-20