Research and Application of Imbalanced Credit Data Classification Based on NaN-Bicluster SMOTE
Affiliation:

College of Economics and Management, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China

Clc Number:

TP181

    Abstract:

    To assess borrowers' credit risk from imbalanced data, we propose NaN-Bicluster SMOTE, an improved oversampling method that combines the synthetic minority oversampling technique (SMOTE) with natural neighbors (NaN) and biclustering. First, we use the parameter-free NaN search to define logical rules for selecting the samples to be oversampled, which avoids the instability caused by partitioning samples with r nearest neighbors. Second, based on the stable neighbor structure, we define logical rules that specify a safe range, which keeps samples from becoming noise samples. Third, we use biclustering to mine local rules; synthetic samples inherit these local rules, and the synthesis formula is improved accordingly. Finally, we combine several sampling methods with machine learning models, run experiments comparing NaN-Bicluster SMOTE with competing models on Prosper's credit data, and use statistical tests to verify the performance of NaN-Bicluster SMOTE.
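
For readers unfamiliar with the baseline that the abstract builds on, the following is a minimal sketch of plain SMOTE interpolation with a fixed number of nearest neighbors. It is not the NaN-Bicluster SMOTE proposed in the paper: the natural-neighbor selection rules, the safe range, and the bicluster-based synthesis formula are not reproduced here. The function name `smote_oversample`, the parameter `k`, and the toy data are assumptions made for illustration only.

```python
import numpy as np


def smote_oversample(X_min, n_synthetic, k=5, seed=0):
    """Baseline SMOTE (illustrative): interpolate between a minority sample
    and one of its k nearest minority neighbors.

    This is the classic fixed-k scheme, not the paper's NaN-Bicluster SMOTE.
    """
    X_min = np.asarray(X_min, dtype=float)
    n = X_min.shape[0]
    if n < 2:
        raise ValueError("need at least two minority samples")
    k = min(k, n - 1)  # cannot have more neighbors than other samples
    rng = np.random.default_rng(seed)

    # Pairwise Euclidean distances among minority samples.
    diff = X_min[:, None, :] - X_min[None, :, :]
    dist = np.sqrt((diff ** 2).sum(axis=2))
    np.fill_diagonal(dist, np.inf)  # a sample is not its own neighbor

    # Indices of the k nearest minority neighbors of each sample.
    neighbors = np.argsort(dist, axis=1)[:, :k]

    # Pick a base sample and one of its neighbors for each synthetic point.
    base = rng.integers(0, n, size=n_synthetic)
    picked = neighbors[base, rng.integers(0, k, size=n_synthetic)]

    # New point lies on the segment between the base and the neighbor.
    gap = rng.random((n_synthetic, 1))
    return X_min[base] + gap * (X_min[picked] - X_min[base])


# Toy minority class (hypothetical data, not from the Prosper dataset).
X_minority = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]])
synthetic = smote_oversample(X_minority, n_synthetic=6, k=2)
print(synthetic.shape)
```

The fixed `k` in this sketch is the kind of neighbor parameter that, according to the abstract, the parameter-free natural-neighbor search is meant to replace.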

Get Citation

HE Liang, XU Haiyan, CHEN Lu. Research and Application of Imbalanced Credit Data Classification Based on NaN-Bicluster SMOTE[J].,2023,38(6):1482-1494.

History
  • Received: July 13, 2022
  • Revised: October 5, 2022
  • Online: November 25, 2023