Recognition of Vietnamese Text in Natural Scene Based on Modified DAN

Affiliation:

1. Guangxi Key Laboratory of Image and Graphic Intelligent Processing (Guilin University of Electronic Technology), Guilin 541004, China; 2. Guangxi Key Laboratory of Culture and Tourism Smart Technology (Guilin Tourism University), Guilin 541006, China

Fund Project:

Guangxi Key Research and Development Program (Guike AB21220023); National Natural Science Foundation of China (62366011); Project of the Guangxi Key Laboratory of Image and Graphic Intelligent Processing (GIIP2306).

Abstract:

Vietnamese characters are composed of Latin characters combined with diacritics, which makes Vietnamese text recognition challenging. On the one hand, diacritics readily cause attention drift. On the other hand, Vietnamese has many character classes with small differences between them; some characters differ only in their diacritics, which further increases recognition difficulty. Building on the decoupled attention network (DAN), this paper designs a visual feature and sequence feature fusion module (VSFM), which uses bidirectional gated recurrent units (Bi-GRU) to model sequences in the horizontal and vertical directions separately, further alleviating attention drift and strengthening the association between diacritics and Latin characters. An enhanced decoupled text decoder module (ETDM) is then designed, which combines more feature information during classification in the decoder so that similar characters can be recognized more effectively. A series of experiments validates the effectiveness of the proposed method.
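
The abstract's observation that some Vietnamese characters differ only in their diacritics can be illustrated with Unicode canonical decomposition. The following is a minimal sketch that is not taken from the paper; it uses only Python's standard `unicodedata` module, and the helper names `split_base_and_marks` and `differ_only_in_diacritics` are hypothetical names chosen for this illustration.

```python
import unicodedata
from typing import Tuple


def split_base_and_marks(ch: str) -> Tuple[str, Tuple[str, ...]]:
    """Split a character into its base letter(s) and the names of its combining marks.

    Hypothetical helper for illustration only; it is not part of the paper's method.
    """
    # NFD separates a precomposed letter into base letter + combining marks.
    decomposed = unicodedata.normalize("NFD", ch)
    base = "".join(c for c in decomposed if not unicodedata.combining(c))
    marks = tuple(unicodedata.name(c) for c in decomposed if unicodedata.combining(c))
    return base, marks


def differ_only_in_diacritics(a: str, b: str) -> bool:
    """True when two distinct characters share the same base letter."""
    # Compare in NFC so that the same character in different encodings counts as equal.
    same_char = unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)
    return (not same_char) and split_base_and_marks(a)[0] == split_base_and_marks(b)[0]


if __name__ == "__main__":
    # Several classes built on the same Latin base letter "a", plus "o" with a horn.
    # Note: "đ" (U+0111) has no canonical decomposition, so it stays a single base letter.
    for ch in ["a", "\u00e1", "\u1ea5", "\u1ead", "\u01a1", "\u0111"]:
        print(ch, split_base_and_marks(ch))
    print(differ_only_in_diacritics("\u00e1", "\u00e0"))
```

Characters that share a base letter form the visually similar classes the abstract refers to: a recognizer has to separate them using only the small diacritic regions above or below the base letter.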

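Cite this article

WANG Libing (王利兵), FENG Yate (俸亚特), WEN Yimin (文益民). Recognition of Vietnamese Text in Natural Scene Based on Modified DAN[J]. Journal of Data Acquisition and Processing, 2023, 38(5): 1058-1068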

History
  • Received: 2022-03-13
  • Last revised: 2023-04-19
  • Published online: 2023-09-25