Person Re-identification Method Based on Improved Transformer Encoder and Feature Fusion

doi:10.16337/j.1004-9037.2023.02.013

Authors: Zhao Qian, Xue Chaochen, Zhao Yan
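Affiliation: School of Electronics and Information Engineering, Shanghai University of Electric Power, Shanghai 201306, China
Funding: National Natural Science Foundation of China (61802250).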


Abstract:

To address the low recognition accuracy of Transformer encoders in person re-identification, caused by the loss of image-patch information and the insufficient representation of local pedestrian features, this paper proposes a person re-identification algorithm based on an improved Transformer encoder and feature fusion. First, because the Transformer loses the relative position information of pedestrian image patches during the attention operation, relative position encoding is introduced so that the network focuses on the semantic feature information of the patches, which strengthens pedestrian feature extraction. Second, to highlight the salient features of regions that contain the pedestrian, a local patch attention module is embedded in the Transformer network to weight and reinforce key local feature information. Finally, global and local features are fused so that the two complement each other, improving the recognition ability of the model. During training, the network is jointly optimized with Softmax and triplet loss functions. The algorithm is evaluated on two mainstream datasets, Market1501 and DukeMTMC-reID, where Rank-1 accuracy reaches 97.5% and 93.5%, respectively, and mean average precision (mAP) reaches 92.3% and 83.1%, respectively. The experimental results show that the improved Transformer encoder and feature fusion algorithm effectively improves the accuracy of person re-identification.
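The abstract reports results as Rank-1 accuracy and mAP. As a rough illustration of what those two numbers measure, the following minimal sketch computes both from feature vectors using their standard definitions. It is not the authors' evaluation code: the function names, the squared Euclidean distance, and the toy data are all assumptions made for illustration, and the sketch omits the same-camera filtering that the Market1501 and DukeMTMC-reID protocols normally apply.

```python
from typing import List, Sequence, Tuple


def rank_gallery(query: Sequence[float], gallery: List[Sequence[float]]) -> List[int]:
    """Return gallery indices sorted by increasing squared Euclidean distance.

    The distance metric is an assumption; the abstract does not name one.
    """
    def dist(a: Sequence[float], b: Sequence[float]) -> float:
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return sorted(range(len(gallery)), key=lambda i: dist(query, gallery[i]))


def average_precision(ranked_ids: List[int], query_id: int) -> float:
    """Average precision of one ranked list of gallery identities."""
    hits = 0
    precision_sum = 0.0
    for rank, gallery_id in enumerate(ranked_ids, start=1):
        if gallery_id == query_id:
            hits += 1
            precision_sum += hits / rank  # precision at this correct match
    return precision_sum / hits if hits else 0.0


def evaluate(
    query_feats: List[Sequence[float]],
    query_ids: List[int],
    gallery_feats: List[Sequence[float]],
    gallery_ids: List[int],
) -> Tuple[float, float]:
    """Return (Rank-1 accuracy, mAP) over all queries."""
    rank1_hits = 0
    ap_sum = 0.0
    for feat, query_id in zip(query_feats, query_ids):
        order = rank_gallery(feat, gallery_feats)
        ranked_ids = [gallery_ids[i] for i in order]
        rank1_hits += int(ranked_ids[0] == query_id)  # top match is correct
        ap_sum += average_precision(ranked_ids, query_id)
    n = len(query_ids)
    return rank1_hits / n, ap_sum / n


if __name__ == "__main__":
    # Hypothetical 2-D features; real models produce much larger vectors.
    gallery_feats = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
    gallery_ids = [1, 2, 1, 2]
    query_feats = [[0.1, 0.0], [0.9, 0.8]]
    query_ids = [1, 1]
    rank1, mean_ap = evaluate(query_feats, query_ids, gallery_feats, gallery_ids)
    print(f"Rank-1: {rank1:.3f}, mAP: {mean_ap:.3f}")
```

Rank-1 only checks whether the closest gallery image shares the query identity, while mAP rewards placing every correct match near the top of the ranking. This is why a method can score noticeably higher on Rank-1 than on mAP, as in the figures reported above.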

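Cite this article

Zhao Qian, Xue Chaochen, Zhao Yan. Person re-identification method based on improved Transformer encoder and feature fusion[J]. Journal of Data Acquisition and Processing, 2023, 38(2): 375-385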

History

Received: 2022-04-05
Last revised: 2022-08-27
Published online: 2023-03-25
