Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics

    Abstract:

    Interference such as seal overlap and invoice creases produces noise-adhesion regions in the number area of some invoices, and these regions prevent the invoice number from being segmented correctly. To address this problem, a noise-adhesion region repair algorithm is proposed, which effectively avoids the impact of such regions on digit segmentation. In addition, based on the font structure and characteristics of ordinary invoice numbers, an invoice number recognition algorithm based on digit structure features is proposed. First, the digit structure features are defined: four kinds of filled regions, two kinds of character crossing counts, and four kinds of hollow regions, which together form a 10-dimensional feature vector for the digit to be recognized. This vector is then matched against the template features of the digits in a standard template library by computing the Euclidean distance, and the digit with the minimum distance is taken as the recognition result. The proposed method is compared with a printed-digit recognition method based on improved left and right contour features. Experimental results indicate that the proposed algorithm achieves higher accuracy, faster recognition, and stronger robustness to noise.
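
The matching step described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the exact definitions of the filled regions, crossing counts, or hollow regions, so everything below is an assumption made for illustration: `crossing_count` is one plausible reading of a "character crossing count" (the number of background-to-stroke transitions along a scan line of a binarized digit), and `nearest_template` simply picks the template with the smallest Euclidean distance to a 10-dimensional feature vector. The function names and the template values are hypothetical and are not taken from the paper.

```python
import math


def crossing_count(scan_line):
    """Count background-to-stroke transitions along one scan line.

    `scan_line` is a sequence of 0 (background) and 1 (stroke) pixels.
    This is an assumed definition of a crossing count, not the paper's.
    """
    count = 0
    previous = 0
    for pixel in scan_line:
        if pixel == 1 and previous == 0:
            count += 1
        previous = pixel
    return count


def nearest_template(features, templates):
    """Return (digit, distance) for the template closest to `features`.

    `templates` maps each digit to its feature vector; closeness is
    measured with the Euclidean distance, as stated in the abstract.
    """
    best_digit = None
    best_distance = math.inf
    for digit, template in templates.items():
        distance = math.dist(features, template)
        if distance < best_distance:
            best_digit = digit
            best_distance = distance
    return best_digit, best_distance


# Hypothetical 10-dimensional templates, for illustration only.
example_templates = {
    0: [0.0] * 10,
    1: [1.0] * 10,
}
example_features = [0.9] * 10
digit, distance = nearest_template(example_features, example_templates)
```

With these placeholder values the feature vector lies closer to the template for `1` than to the template for `0`, so `digit` is `1`. A real template library would hold one measured 10-dimensional vector per digit from 0 to 9.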

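Cite this article

崔文成, 任磊, 刘阳, 邵虹. Invoice Number Recognition Algorithm Based on Numerical Structure Characteristics [J]. Journal of Data Acquisition and Processing (数据采集与处理), 2017, 32(1): 119-125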

History
  • Published online: 2018-04-09