Fast Density Based Clustering Approach

    Abstract:

    Density-based spatial clustering of applications with noise (DBSCAN) scales poorly as the amount of data increases. This paper proposes an improved adaptive algorithm, fast density-based spatial clustering of applications with noise (FDBSCAN). Objects in the neighborhood of a core object are only marked and are no longer checked for expansion. Clusters are merged by determining whether the neighborhood of a core object contains already-marked objects, and whether a border object is noise is decided by checking whether its neighborhood contains a core object. The proposed algorithm avoids the repeated processing of overlapping areas in the original DBSCAN and, without building a spatial index, improves efficiency substantially, with time complexity approaching O(n log n). The clustering quality of FDBSCAN is validated on both artificial and real datasets, and its efficiency is also validated on two real datasets from different industries. The empirical results suggest that FDBSCAN achieves good clustering quality as well as better efficiency and scalability.
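
    The marking, merging, and noise-check steps described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' implementation: the function name `fdbscan_sketch`, the parameters `eps` and `min_pts`, the union-find bookkeeping, and the brute-force neighborhood search are all assumptions made for illustration. In particular, the brute-force search costs O(n^2) overall, so the sketch shows only the control flow and does not reproduce the reported complexity.

```python
import math

NOISE = -1


def fdbscan_sketch(points, eps, min_pts):
    """Return one label per point: 0, 1, ... for clusters, -1 for noise.

    Hypothetical illustration of the mark/merge idea in the abstract,
    not the published FDBSCAN algorithm.
    """
    n = len(points)
    label = [None] * n       # provisional cluster id; None means unmarked
    parent = list(range(n))  # union-find over provisional cluster ids

    def find(c):
        while parent[c] != c:
            parent[c] = parent[parent[c]]  # path halving
            c = parent[c]
        return c

    def neighborhood(i):
        # Brute-force eps-neighborhood; it includes i itself.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    pending = []  # unmarked non-core objects, resolved after the main pass
    for i in range(n):
        if label[i] is not None:
            continue  # already marked by a core object: no expansion check
        nbrs = neighborhood(i)
        if len(nbrs) < min_pts:
            pending.append((i, nbrs))
            continue
        label[i] = i  # i is a core object and opens provisional cluster i
        for j in nbrs:
            if label[j] is None:
                label[j] = i  # mark only, do not expand
            else:
                parent[find(label[j])] = find(i)  # marked object: merge

    for i, nbrs in pending:
        if label[i] is not None:
            continue  # marked later by a core object
        # Noise unless some marked neighbor is itself a core object.
        core = next((j for j in nbrs
                     if label[j] not in (None, NOISE)
                     and len(neighborhood(j)) >= min_pts), None)
        label[i] = NOISE if core is None else label[core]

    # Resolve merged ids and renumber clusters in order of first appearance.
    ids = {}
    return [NOISE if c == NOISE else ids.setdefault(find(c), len(ids))
            for c in label]
```

    Because marked objects are skipped in the main loop, a neighborhood query is issued only for objects that no earlier core object has covered, which is the source of the savings the abstract describes. Replacing `neighborhood` with a faster range query would be a separate design choice that the abstract does not specify.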

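Cite this article

张晓, 张媛媛, 高阳, 周新民. Fast Density Based Clustering Approach (一种基于密度的快速聚类方法) [J]. Journal of Data Acquisition and Processing (数据采集与处理), 2015, 30(4): 888-895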

History
  • Published online: 2015-10-12