Algorithm of Recurring Concept Drift Based on Main Feature Extraction

    Abstract:

    Recurring concept drift is a sub-type of concept drift. Detecting it raises two problems: how to represent concepts, and how to select the most appropriate classifier. To address them, we propose conceptual clustering and prediction through main feature extraction (MFCCP), an algorithm for classifying data streams that contain recurring concept drift. MFCCP recognizes recurring concepts by computing the differences between the main features, and their impact factors, of different batches of samples. It maintains and promptly updates one classifier for each concept, monitors classification accuracy, and uses the Hoeffding inequality to select the most appropriate classifier for the current batch of samples, which improves its ability to react to concept drift. Experiments on three datasets show that, on data streams with recurring concept drift, MFCCP clearly outperforms the four comparison algorithms in classification accuracy, in speed of adaptation to concept drift, and in accuracy of concept drift detection. MFCCP is also applicable to data streams without recurring concept drift.
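
    The abstract does not give the exact selection rule, so the following is only a minimal sketch of how a Hoeffding-bound test could decide between per-concept classifiers. The function names `hoeffding_bound` and `select_classifier`, the confidence parameter `delta`, and the choice to compare accuracies measured on the same `n` labelled samples are all assumptions made for illustration, not the authors' method. Because the per-sample difference between two correctness indicators lies in [-1, 1], the sketch uses a range of width 2.

```python
import math


def hoeffding_bound(value_range: float, delta: float, n: int) -> float:
    """One-sided Hoeffding bound for the mean of n bounded observations.

    With probability at least 1 - delta, the sample mean does not exceed
    its expectation by more than the returned epsilon.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 < delta < 1.0:
        raise ValueError("delta must lie strictly between 0 and 1")
    return math.sqrt(value_range ** 2 * math.log(1.0 / delta) / (2.0 * n))


def select_classifier(accuracies: dict, current: str, n: int,
                      delta: float = 0.05) -> str:
    """Hypothetical selection rule (an assumption, not taken from the paper).

    accuracies maps a classifier name to its accuracy on the same n most
    recent labelled samples. The current classifier is replaced only when
    the best candidate beats it by more than the Hoeffding bound.
    """
    # The per-sample accuracy difference lies in [-1, 1], so the range is 2.
    epsilon = hoeffding_bound(2.0, delta, n)
    best = max(accuracies, key=accuracies.get)
    if best != current and accuracies[best] - accuracies[current] > epsilon:
        return best
    return current
```

    With `n = 200` and `delta = 0.05` the bound is roughly 0.17, so a candidate must lead by more than about 17 accuracy points before this rule switches. Smaller gaps are treated as noise, and larger batches tighten the bound.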

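Cite this article

Feng Chao, Wen Yimin, Tang Lingbing. Algorithm of Recurring Concept Drift Based on Main Feature Extraction [J]. Journal of Data Acquisition and Processing, 2016, 31(2): 315-324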

History
  • Published online: 2018-04-09