Recent Advances in Cross Modal Image Text Retrieval
Affiliation:

1. School of Computer Science and Engineering, Tianjin University of Technology, Tianjin 300384, China; 2. School of Traffic & Transportation Engineering, Central South University, Changsha 410083, China

    Abstract:

    With the rapid development of Internet technology, the volume of data of different types, such as text and images, has grown tremendously. Extracting valuable information from such heterogeneous but semantically related multimodal data is therefore particularly important. Cross-modal retrieval, which handles multimodal data effectively, is an essential way to meet users’ needs for obtaining different kinds of information on the Internet. In recent years, cross-modal retrieval has become a hot topic in both academia and industry. In this paper, we present a comprehensive overview of the image-text cross-modal retrieval task, including definitions, challenges, and detailed discussions of existing methods. Specifically, we first divide the existing methods into three main categories: (1) traditional methods, (2) deep learning-based methods, and (3) hash-based representation methods. Then, we introduce the commonly used cross-modal retrieval benchmarks and discuss the existing methods on these benchmarks in detail. Finally, we discuss future directions for the image-text cross-modal retrieval task.

Zhang Feifei, Ma Zewei, Zhou Ling, Meng Lingtao. Recent Advances in Cross Modal Image Text Retrieval[J].,2023,38(3):479-505.

History
  • Received: December 15, 2022
  • Revised: February 20, 2023
  • Online: May 25, 2023