Research on Data Stream Clustering Algorithms
Authors: ZHU Yingwen, CHEN Songcan
Affiliation:

1. College of Computer Science and Technology, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China; 2. College of Computer Science and Engineering, Sanjiang University, Nanjing 210012, China

Fund Project:

Key Program of the National Natural Science Foundation of China (61732006).

    Abstract:

    Many applications now generate massive volumes of streaming data, such as network traffic flows, web click streams, video streams, event streams, and semantic concept streams. Data stream mining, which aims to extract hidden knowledge and patterns from continuously arriving data, has therefore become an active research topic. Clustering, one of the most important problems in stream mining, has been studied extensively in recent years. Unlike traditional clustering of static data, data stream clustering must operate under constraints such as bounded memory, single-pass processing, real-time response, and concept drift. This paper surveys state-of-the-art data stream clustering algorithms. It first identifies the constraints of data stream mining, then presents a general model for stream clustering and describes its relationship to traditional data clustering, and finally outlines open research issues and directions in the field.

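Cite this article

ZHU Yingwen, CHEN Songcan. Research on data stream clustering algorithms[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2022, 37(4): 894-908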

History
  • Received: 2021-12-14
  • Last revised: 2022-02-09
  • Published online: 2022-08-11