Image Caption Generation Model Based on Graph Neural Network and Guidance Vector
Author: TONG Guoxiang, LI Yueyang
Affiliation:

College of Optical-Electrical and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China

Clc Number:

TP3


Abstract:

    In recent years, deep learning has shown its advantages in image caption research. In deep learning models, the relationships between objects in an image play an important role in image representation. To better detect visual relationships in an image, an image caption generation model (YOLOv4-GCN-GRU, YGG) is constructed based on a graph neural network and a guidance vector. The model builds a graph from the spatial and semantic information of the objects detected in the image, and uses a graph convolutional network (GCN) as the encoder to represent each region of the graph. During decoding, an additional guidance neural network is trained to generate a guidance vector that assists the decoder in automatically generating sentences. Comparative experiments on the MSCOCO image dataset show that the YGG model performs better, improving the CIDEr-D score from 138.9% to 142.1%.
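The encoder/decoder idea in the abstract can be illustrated with a minimal numpy sketch. This is not the authors' implementation: the function names, dimensions, and the simplified GRU-style update with a concatenated guidance vector are all illustrative assumptions; a real YGG model would operate on YOLOv4 region features and a learned guidance network.

```python
import numpy as np

def gcn_layer(A, X, W):
    """One graph-convolution layer over detected image regions:
    ReLU(D^{-1/2} (A + I) D^{-1/2} X W).
    A: (n, n) region adjacency; X: (n, d) region features; W: (d, h) weights."""
    A_hat = A + np.eye(A.shape[0])                      # add self-loops
    D_inv_sqrt = np.diag(1.0 / np.sqrt(A_hat.sum(axis=1)))
    return np.maximum(D_inv_sqrt @ A_hat @ D_inv_sqrt @ X @ W, 0.0)

def decode_step(h_prev, region_ctx, guide, Wz, Uz, Wh):
    """Hypothetical GRU-style decoder step: the input is the region context
    concatenated with the guidance vector produced by the auxiliary network."""
    x = np.concatenate([region_ctx, guide])
    z = 1.0 / (1.0 + np.exp(-(Wz @ x + Uz @ h_prev)))   # update gate
    h_tilde = np.tanh(Wh @ x)                           # candidate state (simplified)
    return (1.0 - z) * h_prev + z * h_tilde

# Toy usage: three detected regions connected in a chain.
rng = np.random.default_rng(0)
A = np.array([[0, 1, 0], [1, 0, 1], [0, 1, 0]], dtype=float)
X = rng.normal(size=(3, 4))                             # per-region features
H = gcn_layer(A, X, rng.normal(size=(4, 5)))            # (3, 5) region encodings
h = decode_step(np.zeros(5), H.mean(axis=0), np.ones(2),
                rng.normal(size=(5, 7)), rng.normal(size=(5, 5)),
                rng.normal(size=(5, 7)))                # one decoder step
```

The symmetric normalization keeps each region's encoding a bounded average over its neighbors, which is the standard GCN formulation; the guidance vector here simply widens the decoder input, standing in for the trained guidance network.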

Citation:

TONG Guoxiang, LI Yueyang. Image Caption Generation Model Based on Graph Neural Network and Guidance Vector[J].,2023,38(1):209-219.

History
  • Received: January 03, 2022
  • Revised: April 18, 2022
  • Online: January 25, 2023