Abstract: Although person re-identification has made significant progress, occlusion remains a challenge in practical application scenarios. To extract more effective features from occluded pedestrians, a learnable mask and position encoding (LMPE) method is proposed. First, a learnable dual attention mask generator (LDAMG) is introduced to adapt to different occlusion patterns, significantly improving re-identification accuracy for occluded pedestrians. The generated masks make the network more flexible across diverse occlusion situations, and the network also learns contextual information through the masks, further improving its understanding of the scene. In addition, an occlusion aware position encoding fusion (OAPEF) module is introduced to mitigate the loss of position information in the Transformer. This module fuses position encodings from different regions, giving the network stronger expressive ability; integrating position encodings from all directions enables the network to capture spatial correlations between pedestrians more accurately and improves its adaptability to occlusion. Finally, experiments demonstrate that LMPE performs well on the occluded datasets Occluded-Duke and Occluded-ReID as well as the unoccluded datasets Market-1501 and DukeMTMC-ReID, confirming the effectiveness and superiority of the proposed method.
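To make the masking idea concrete, below is a minimal PyTorch-style sketch, not the paper's implementation: it assumes "dual attention" combines a channel branch and a spatial branch into a soft mask applied to backbone features, and all names here (LearnableDualAttentionMask, reduction, feats) are hypothetical.

```python
import torch
import torch.nn as nn

class LearnableDualAttentionMask(nn.Module):
    """Hypothetical sketch of a dual-attention mask generator.

    Assumes the two attentions are a channel branch and a spatial
    branch whose product forms a soft occlusion mask; the paper's
    actual LDAMG architecture may differ.
    """
    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        # Channel attention: squeeze spatial dims, re-weight channels.
        self.channel = nn.Sequential(
            nn.AdaptiveAvgPool2d(1),
            nn.Conv2d(channels, channels // reduction, 1),
            nn.ReLU(inplace=True),
            nn.Conv2d(channels // reduction, channels, 1),
            nn.Sigmoid(),
        )
        # Spatial attention: score each spatial location in [0, 1].
        self.spatial = nn.Sequential(
            nn.Conv2d(channels, 1, kernel_size=7, padding=3),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Soft mask: high where features are reliable, low where
        # occlusion is likely; occluded regions are suppressed while
        # surrounding context is preserved.
        mask = self.channel(x) * self.spatial(x)
        return x * mask


feats = torch.randn(2, 256, 24, 8)   # (batch, C, H, W) backbone features
masked = LearnableDualAttentionMask(256)(feats)
print(masked.shape)  # torch.Size([2, 256, 24, 8])
```

Because the mask is produced by learnable layers rather than a fixed heuristic, it can adapt end-to-end to different occlusion patterns, which is the flexibility the abstract attributes to LDAMG.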
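Likewise, the following is a hedged sketch of directional position encoding fusion, under the assumption that "all directions" means learnable row-wise and column-wise encodings scanned in both orders and merged by a linear layer before being added to the Transformer's patch tokens; the actual OAPEF design may differ, and DirectionalPositionEncodingFusion and fuse are illustrative names only.

```python
import torch
import torch.nn as nn

class DirectionalPositionEncodingFusion(nn.Module):
    """Hypothetical sketch of fusing position encodings from
    multiple scan directions into one 2-D encoding."""
    def __init__(self, dim: int, h: int, w: int):
        super().__init__()
        self.h, self.w = h, w
        # One learnable 1-D table per axis; reversed copies give the
        # opposite scan directions.
        self.row = nn.Parameter(torch.randn(h, dim) * 0.02)  # top-to-bottom
        self.col = nn.Parameter(torch.randn(w, dim) * 0.02)  # left-to-right
        self.fuse = nn.Linear(4 * dim, dim)                  # merge 4 directions

    def forward(self, tokens: torch.Tensor) -> torch.Tensor:
        # tokens: (batch, h*w, dim) patch embeddings
        row = self.row.unsqueeze(1).expand(self.h, self.w, -1)
        row_r = torch.flip(self.row, dims=[0]).unsqueeze(1).expand(self.h, self.w, -1)
        col = self.col.unsqueeze(0).expand(self.h, self.w, -1)
        col_r = torch.flip(self.col, dims=[0]).unsqueeze(0).expand(self.h, self.w, -1)
        # Concatenate the four directional encodings and project back
        # to the token dimension, then add to every token.
        pe = self.fuse(torch.cat([row, row_r, col, col_r], dim=-1))  # (h, w, dim)
        return tokens + pe.reshape(1, self.h * self.w, -1)


tokens = torch.randn(2, 24 * 8, 768)  # ViT-style patch tokens
out = DirectionalPositionEncodingFusion(768, 24, 8)(tokens)
print(out.shape)  # torch.Size([2, 192, 768])
```

The point of fusing several scan directions is that each token's encoding then reflects its position relative to every image border, so spatial relations remain recoverable even when part of the pedestrian is occluded.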