Clustering Analysis of Micro Blogs Based on Active Learning

    Abstract:

    The K-Means clustering algorithm cannot reliably determine good initial cluster centers, which tends to lower clustering accuracy; on micro blog data, the resulting clusters may fail to reflect the actual topics of interest. This paper proposes a clustering algorithm based on active learning. A Min-Max active learning strategy is applied while the initial cluster centers are being chosen, so that after only a small number of queries the algorithm presents data points for the user to confirm as initial centers. In addition, weights are set when K-Means recalculates the cluster centers, which reduces the number of iterations and improves the accuracy of the clustering results. Applying the algorithm to micro blog clustering analysis yields the hot topics.
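
The two ideas in the abstract, user-confirmed Min-Max seeding and a weighted center update, can be illustrated with a minimal Python sketch. It is not the authors' implementation: the abstract gives neither the query format nor the weighting scheme, so the `confirm` callback (a stand-in for the user), the choice of the first point as starting seed, the `seed_weight` parameter, and pinning each seed to its own cluster are all assumptions made for illustration. Min-Max is read here as farthest-first traversal: the next candidate is the point whose distance to its nearest confirmed seed is largest.

```python
import numpy as np


def min_max_seeds(X, k, confirm, max_queries=50):
    """Pick up to k seed indices by farthest-first (min-max) traversal.

    confirm(candidate_idx, seed_idxs) -> bool is a hypothetical oracle that
    stands in for the user confirming a candidate as a new initial center.
    """
    seeds = [0]  # assumption: the first point serves as the starting seed
    # Distance from every point to its nearest confirmed seed.
    nearest = np.linalg.norm(X - X[0], axis=1)
    asked = np.zeros(len(X), dtype=bool)
    asked[0] = True
    queries = 0
    while len(seeds) < k and queries < max_queries and not asked.all():
        # Candidate = not-yet-queried point farthest from all current seeds.
        cand = int(np.argmax(np.where(asked, -np.inf, nearest)))
        asked[cand] = True  # never query the same point twice
        queries += 1
        if confirm(cand, seeds):
            seeds.append(cand)
            nearest = np.minimum(nearest, np.linalg.norm(X - X[cand], axis=1))
    return seeds


def weighted_kmeans(X, seeds, seed_weight=5.0, max_iter=100):
    """K-Means whose center update is a weighted mean.

    Assumption: user-confirmed seeds get weight seed_weight, all other
    points weight 1, and each seed stays pinned to its own cluster.
    """
    seeds = np.asarray(seeds)
    weights = np.ones(len(X))
    weights[seeds] = seed_weight
    centers = X[seeds].astype(float)
    labels = None
    for _ in range(max_iter):
        # Assign every point to its nearest center.
        dists = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        new_labels = dists.argmin(axis=1)
        new_labels[seeds] = np.arange(len(seeds))
        if labels is not None and np.array_equal(new_labels, labels):
            break  # assignments stopped changing
        labels = new_labels
        # Weighted recalculation of each center; no cluster is empty
        # because every cluster contains its pinned seed.
        for j in range(len(seeds)):
            mask = labels == j
            centers[j] = np.average(X[mask], axis=0, weights=weights[mask])
    return labels, centers


# Usage on synthetic 2-D data; a real run would use micro blog feature vectors.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(c, 0.1, size=(20, 2)) for c in [(0, 0), (5, 5), (0, 5)]])
truth = np.repeat([0, 1, 2], 20)
# Simulated user: accepts a candidate only if its topic has no seed yet.
seeds = min_max_seeds(X, 3, lambda cand, chosen: truth[cand] not in truth[chosen])
labels, centers = weighted_kmeans(X, seeds)
```

Giving confirmed seeds a larger weight keeps each recalculated center close to a point the user has already vetted, which is one plausible way a weighting could cut down the number of iterations; the paper's actual weighting may differ.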

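Cite this article

Zhu Li; Lu Jianfeng. Clustering Analysis of Micro Blogs Based on Active Learning[J]. Journal of Data Acquisition and Processing, 2016, 31(3): 599-605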

History
  • Published online: 2016-06-24