MSDL-IEW: Active Learning Algorithm for Text Classification Based on Density Perception
Affiliation:

1. School of Computer Science and Engineering, Nanjing University of Science and Technology, Nanjing 210094, China; 2. CETC Big Data Research Institute Co Ltd, Guiyang 550022, China; 3. Big Data Application on Improving Government Governance Capabilities National Engineering Laboratory, Guiyang 550022, China; 4. Nanjing Power Supply Company, Nanjing 210000, China; 5. China Electronics Technology Cyber Security Co Ltd, Chengdu 610041, China

CLC Number:

TP391


    Abstract:

    To address the problem that unlabeled data in text classification tasks cannot be labeled immediately and that labeling is too costly, this paper proposes an uncertainty-based active learning method for text classification. The MSDL (Measure sample density by LDA) algorithm is proposed to calculate the density of unlabeled samples, introducing a new measure of how samples aggregate. The initial training set is selected from densely populated regions of the sample space, which makes it more representative. The more uncertain unlabeled samples are then added to the training set, the samples are weighted based on information entropy, and the classifier model is iteratively updated until the expected termination condition is reached. Experimental results show that this method outperforms traditional active learning algorithms on text classification tasks.
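
    The uncertainty step described in the abstract can be illustrated with a minimal sketch. It is not the paper's implementation: the abstract does not give the exact weighting formula or the MSDL density computation, so the function names (`entropy`, `select_uncertain`, `entropy_weights`), the toy probabilities, and the normalized-entropy weighting below are all assumptions made for illustration. The sketch ranks unlabeled samples by the Shannon entropy of the classifier's predicted class distribution, picks the most uncertain ones, and assigns each a weight proportional to its entropy.

```python
import math


def entropy(probs):
    """Shannon entropy (natural log) of one predicted class distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def select_uncertain(pool_probs, k):
    """Return the indices of the k highest-entropy samples and all entropy scores."""
    scores = [entropy(p) for p in pool_probs]
    ranked = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return ranked[:k], scores


def entropy_weights(scores, indices):
    """Hypothetical weighting: each selected sample's entropy divided by their sum."""
    total = sum(scores[i] for i in indices)
    if total == 0:
        # All selected predictions are fully confident; fall back to equal weights.
        return [1.0 / len(indices)] * len(indices)
    return [scores[i] / total for i in indices]


# Toy predicted class distributions for four unlabeled documents (made-up values).
pool_probs = [
    [0.98, 0.01, 0.01],  # confident prediction, low entropy
    [0.40, 0.35, 0.25],  # fairly uncertain
    [0.70, 0.20, 0.10],  # moderately confident
    [0.34, 0.33, 0.33],  # almost uniform, highest entropy
]

picked, scores = select_uncertain(pool_probs, k=2)
weights = entropy_weights(scores, picked)
print(picked, weights)
```

    In a full active learning loop, the selected samples would be labeled, added to the training set with their weights, and the classifier retrained until the termination condition is met.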

Get Citation

TRAN Baphan, MA Feifei, MING Jingjing, YU Qinyong, YANG Hui, LI Quanbing, WANG Yongli. MSDL-IEW: Active Learning Algorithm for Text Classification Based on Density Perception[J].,2021,36(2):240-247.

History
  • Received: June 04, 2020
  • Revised: November 29, 2020
  • Online: March 25, 2021