Spectral Clustering Algorithm Based on Message Passing
Authors:

Wang Lijuan, Ding Shifei, Jia Hongjie

Affiliation:

1. School of Computer Science and Technology, China University of Mining and Technology, Xuzhou, 221116, China; 2. School of Information and Electrical Engineering, Xuzhou College of Industrial Technology, Xuzhou, 221400, China; 3. School of Computer Science and Communication Engineering, Jiangsu University, Zhenjiang, 212013, China

Fund Project:

National Natural Science Foundation of China (61676522, 61379101); Xuzhou Science and Technology Development Fund (KC17132).

    Abstract:

    Spectral clustering transforms the data clustering problem into a graph partitioning problem and clusters data points by finding the optimal subgraphs. The key to spectral clustering is constructing a suitable similarity matrix that faithfully describes the intrinsic structure of the dataset. Traditional spectral clustering algorithms construct the similarity matrix with a Gaussian kernel function, which makes them sensitive to the choice of the scale parameter. In addition, the initial cluster centers must be chosen randomly at the clustering stage, so the clustering performance is unstable. To address these problems, this paper proposes a spectral clustering algorithm based on message passing. The algorithm uses a density-adaptive similarity measure, which better describes the relations between data points, and then obtains high-quality cluster centers through the message-passing mechanism of affinity propagation (AP) clustering, which improves the performance of spectral clustering. Experiments show that the proposed algorithm can effectively handle the clustering of multi-scale datasets. Its clustering performance is very stable, and its clustering quality is better than that of the traditional spectral clustering algorithm and the k-means algorithm.
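
    The message-passing step mentioned in the abstract can be illustrated with the generic AP update rules (responsibility and availability messages). The sketch below is not the paper's algorithm: it omits the density-adaptive similarity and the spectral embedding, and it assumes negative squared Euclidean distance as the similarity, the median similarity as the preference, a damping factor of 0.5, and a fixed 200 iterations. The function names and the toy points are hypothetical.

```python
from statistics import median


def affinity_propagation(S, damping=0.5, iterations=200):
    """Run the standard AP message updates on a similarity matrix S.

    S[i][k] is the similarity of point i to candidate exemplar k, and the
    diagonal S[k][k] holds the preference of point k. Needs at least two
    points. Returns one exemplar index per point.
    """
    n = len(S)
    R = [[0.0] * n for _ in range(n)]  # responsibilities r(i, k)
    A = [[0.0] * n for _ in range(n)]  # availabilities a(i, k)

    for _ in range(iterations):
        # r(i,k) = s(i,k) - max over k' != k of (a(i,k') + s(i,k'))
        for i in range(n):
            for k in range(n):
                best_other = max(A[i][j] + S[i][j] for j in range(n) if j != k)
                R[i][k] = damping * R[i][k] + (1 - damping) * (S[i][k] - best_other)

        # a(i,k) = min(0, r(k,k) + sum of positive r(j,k) for j not in {i, k})
        # a(k,k) = sum of positive r(j,k) for j != k
        for k in range(n):
            positive = [max(0.0, R[j][k]) for j in range(n)]
            total = sum(positive) - positive[k]
            for i in range(n):
                if i == k:
                    new = total
                else:
                    new = min(0.0, R[k][k] + total - positive[i])
                A[i][k] = damping * A[i][k] + (1 - damping) * new

    # A point is an exemplar when its self-availability plus
    # self-responsibility is positive.
    exemplars = [k for k in range(n) if A[k][k] + R[k][k] > 0]
    if not exemplars:
        return list(range(n))  # fallback: every point is its own cluster
    return [i if i in exemplars else max(exemplars, key=lambda k: S[i][k])
            for i in range(n)]


def negative_squared_distance(p, q):
    # Assumed similarity; the paper uses a density-adaptive measure instead.
    return -sum((a - b) ** 2 for a, b in zip(p, q))


# Hypothetical toy data: two well-separated groups of three points each.
points = [(0, 0), (1, 0), (2, 0), (8, 0), (9, 0), (10, 0)]
n = len(points)
S = [[negative_squared_distance(points[i], points[k]) for k in range(n)]
     for i in range(n)]
preference = median(S[i][k] for i in range(n) for k in range(n) if i != k)
for k in range(n):
    S[k][k] = preference

labels = affinity_propagation(S)
print(labels)
```

    Unlike k-means, this procedure needs no randomly chosen initial centers: every point starts as a candidate exemplar, and the messages decide which candidates survive. On the toy data, the two groups should end up with different exemplars.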

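Cite this article

Wang Lijuan, Ding Shifei, Jia Hongjie. Spectral clustering algorithm based on message passing[J]. Journal of Data Acquisition and Processing, 2019, 34(3): 548-557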

History
  • Received: 2018-01-08
  • Last revised: 2019-04-09
  • Published online: 2019-06-12