Improved K-means Clustering Algorithm Based on Tukey Rule and Initial Center Point Optimization

Affiliation:

1. Department of Computer Science and Engineering, Shaoxing University, Shaoxing 312000, China; 2. School of Electrical and Information Engineering, Beijing University of Civil Engineering and Architecture, Beijing 100044, China

Fund Project:

National Natural Science Foundation of China (62002227); Shaoxing University Research Project (2021LG004).

    Abstract:

    This paper addresses two weaknesses of the K-means clustering algorithm: its sensitivity to the choice of initial center points, and the ease with which abnormal points and outliers distort the clustering results. It proposes an improved K-means algorithm based on the Tukey rule and an optimized selection of initial center points. The algorithm uses the Tukey rule to construct core and non-core subsets and divides the clustering process into two stages. On the core subset, it applies a strategy that adds center points one at a time to select the initial center points. Clustering results on 20 real-world datasets from UCI show that the proposed algorithm outperforms the popular K-means++ clustering algorithm and effectively improves clustering performance.
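The abstract gives only the outline of the method, so the following Python sketch is an illustration under stated assumptions rather than the authors' implementation. It assumes the Tukey rule is applied per feature with the conventional 1.5 × IQR fences, that the first center is the core point nearest the core mean, and that each further center is the core point farthest from the centers chosen so far. None of these details are confirmed by the abstract, and all function names (`tukey_core_mask`, `incremental_centers`, `two_stage_kmeans`) are hypothetical.

```python
import math
import statistics


def tukey_core_mask(points, fence=1.5):
    """Flag points that lie inside the Tukey fences on every feature.

    Assumption: fences are computed per feature as
    [Q1 - fence * IQR, Q3 + fence * IQR].
    """
    bounds = []
    for column in zip(*points):
        q1, _, q3 = statistics.quantiles(column, n=4)
        iqr = q3 - q1
        bounds.append((q1 - fence * iqr, q3 + fence * iqr))
    return [all(lo <= v <= hi for v, (lo, hi) in zip(p, bounds)) for p in points]


def mean_point(points):
    """Coordinate-wise mean of a non-empty list of points."""
    return tuple(sum(col) / len(points) for col in zip(*points))


def nearest(p, centers):
    """Index of the center closest to p (Euclidean distance)."""
    return min(range(len(centers)), key=lambda j: math.dist(p, centers[j]))


def incremental_centers(core, k):
    """Pick k centers one at a time from the core subset.

    Assumed rule (a stand-in, not taken from the paper): start with the
    core point nearest the core mean, then repeatedly add the core point
    farthest from all centers chosen so far.
    """
    mu = mean_point(core)
    centers = [min(core, key=lambda p: math.dist(p, mu))]
    while len(centers) < k:
        centers.append(
            max(core, key=lambda p: min(math.dist(p, c) for c in centers))
        )
    return centers


def two_stage_kmeans(points, k, max_iter=100):
    """Cluster the core subset first, then label every point."""
    points = [tuple(map(float, p)) for p in points]
    mask = tukey_core_mask(points)
    core = [p for p, keep in zip(points, mask) if keep]
    if len(core) < k:
        raise ValueError("core subset has fewer than k points")
    centers = incremental_centers(core, k)

    # Stage 1: standard K-means (Lloyd) iterations on the core subset only.
    for _ in range(max_iter):
        groups = [[] for _ in range(k)]
        for p in core:
            groups[nearest(p, centers)].append(p)
        updated = [mean_point(g) if g else c for g, c in zip(groups, centers)]
        if updated == centers:
            break
        centers = updated

    # Stage 2: assign all points, including non-core ones, to the nearest center.
    labels = [nearest(p, centers) for p in points]
    return labels, centers, mask
```

Because the centers are fitted on the core subset only, a far-away outlier in this sketch receives a label in the second stage but never pulls a center toward itself, which is the effect the two-stage split is meant to achieve.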

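Cite this article

Liu Jing, Qiu Ziying, Guo Maozu, Yu Donghua. Improved K-means Clustering Algorithm Based on Tukey Rule and Initial Center Point Optimization[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2023, 38(3): 643-651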

History
  • Received: 2022-03-24
  • Last revised: 2022-06-23
  • Published online: 2023-05-25