Improved Rule Based Classification Algorithm with Multiple Covering Instances

    Abstract:

    Rule sets extracted by rule-based classification algorithms usually suffer from three problems. First, the extracted rule set contains too few short rules, so it contains few high-quality rules. Second, the rule set contains few rules overall, so almost every instance in the training data is covered by only one rule. Third, even when a large number of rules is extracted, some training instances from small classes are not covered by any rule. This paper proposes an improved rule-based classification algorithm with instances covered by multiple rules (RCIM). Its features are as follows: (1) to improve rule quality, the choice of the first item of a generated rule considers not only the quality of the attribute value but also the quality of its complement; (2) as many high-quality rules as possible are generated in each pass, and a training instance is removed only after it has been covered by at least two rules; (3) when a test instance is difficult to classify, a second round of learning extracts rules from the attribute values of that test instance. RCIM not only extracts a large number of rules effectively but also considerably improves rule quality. Experiments on a large number of data sets show that RCIM achieves higher classification accuracy than many other algorithms.
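
    The idea of deleting a training instance only after at least two rules cover it can be illustrated with a minimal sketch. This is not the RCIM algorithm itself: the rule generator `best_single_item_rule` (which scores single attribute-value items by precision, then coverage) and the names `learn_rules`, `MIN_COVER` and `toy_data` are assumptions made for illustration. The complement-based choice of a rule's first item and the relearning on hard test data are not modeled. Only the covering loop with a coverage threshold follows the abstract.

```python
from collections import Counter
from typing import Dict, List, Optional, Set, Tuple

Instance = Tuple[Dict[str, str], str]  # (attribute -> value, class label)
Item = Tuple[str, str]                 # (attribute, value)
Rule = Tuple[str, str, str]            # (attribute, value, predicted class)

MIN_COVER = 2  # assumed name; the abstract only says "covered at least twice"


def best_single_item_rule(data: List[Instance], used: Set[Item]) -> Optional[Rule]:
    """Hypothetical rule generator: best unused single-item rule, or None."""
    stats: Dict[Item, Counter] = {}
    for attrs, label in data:
        for item in attrs.items():
            if item not in used:
                stats.setdefault(item, Counter())[label] += 1
    if not stats:
        return None

    def score(entry: Tuple[Item, Counter]) -> Tuple[float, int]:
        counts = entry[1]
        total = sum(counts.values())
        # Precision of the majority class first, then number of covered instances.
        return (counts.most_common(1)[0][1] / total, total)

    (attr, value), counts = max(stats.items(), key=score)
    return (attr, value, counts.most_common(1)[0][0])


def learn_rules(data: List[Instance],
                min_cover: int = MIN_COVER) -> Tuple[List[Rule], List[int]]:
    """Covering loop: an instance stays until min_cover rules have covered it."""
    remaining = list(range(len(data)))
    cover_count = [0] * len(data)
    rules: List[Rule] = []
    used: Set[Item] = set()
    while remaining:
        rule = best_single_item_rule([data[i] for i in remaining], used)
        if rule is None:  # no unused attribute-value item is left
            break
        rules.append(rule)
        attr, value, _ = rule
        used.add((attr, value))
        for i in remaining:
            if data[i][0].get(attr) == value:
                cover_count[i] += 1
        # Remove only the instances that have now been covered often enough.
        remaining = [i for i in remaining if cover_count[i] < min_cover]
    return rules, cover_count


toy_data: List[Instance] = [
    ({"outlook": "sunny", "windy": "no"}, "play"),
    ({"outlook": "sunny", "windy": "yes"}, "play"),
    ({"outlook": "rainy", "windy": "yes"}, "stay"),
    ({"outlook": "rainy", "windy": "no"}, "stay"),
]

rules, cover_count = learn_rules(toy_data)
single_cover_rules, _ = learn_rules(toy_data, min_cover=1)
```

    On this toy data the two-cover threshold yields more rules than the usual remove-after-one-cover loop (`min_cover=1`), which is the effect the abstract aims for. The toy run says nothing about rule quality or classification accuracy.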

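Cite this article

周忠眉, 李莎莎. Improved Rule Based Classification Algorithm with Multiple Covering Instances[J]. 数据采集与处理 (Journal of Data Acquisition and Processing), 2017, 32(6): 1232-1238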

History
  • Published online: 2018-04-10