Recent Advances in Cross Modal Image Text Retrieval

Affiliation:

1. School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China; 2. School of Traffic & Transportation Engineering, Central South University, Changsha 410083, China

Fund Project:

National Key Research and Development Program of China (2018AAA0102200); National Natural Science Foundation of China (62036012, 62002355, 62072455, 62102415, 62106262); Natural Science Foundation of Tianjin (22JCYBJC00030).

    Abstract:

    With the rapid development of Internet technology, the volume of data of various types, such as text and images, has grown explosively online. Extracting valuable information from such multi-source, heterogeneous, yet semantically related multimodal data has therefore become particularly important. Cross-modal retrieval breaks through modality boundaries and retrieves information across data of different modalities, meeting users' needs for information about events of interest. In recent years, it has become a hot research topic in both academia and industry. This paper focuses on the image-text cross-modal retrieval task. We first introduce the definition of the task and analyze the challenges it currently faces. Second, we summarize existing methods and divide them into three main categories: (1) traditional methods; (2) deep learning-based methods; and (3) hash representation-based methods. We then describe the commonly used datasets for image-text cross-modal retrieval in detail and analyze and compare existing algorithms on these datasets. Finally, we discuss future research directions for the task.

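Cite this article

Zhang Feifei, Ma Zewei, Zhou Ling, Meng Lingtao. Recent advances in cross modal image text retrieval (图文跨模态检索研究进展)[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2023, 38(3): 479-505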

History
  • Received: 2022-12-15
  • Last revised: 2023-02-20
  • Published online: 2023-05-25