With the rise of deep learning, person re-identification has become an active research topic in computer vision. Given a query image, the task is to retrieve images of the same identity captured by other cameras. However, because background, illumination, and other factors vary across cameras, the collected pedestrian datasets contain a large number of hard negative samples, and models trained on these samples perform poorly and lack robustness. To improve the model's ability to discriminate such negative samples, a novel method is designed that uses a confusion factor to synthesize images carrying information from hard negative samples. For each input batch of images, a similarity measure is used to find the hard negative sample corresponding to each image; new images containing cues from the negative samples are then synthesized through the confusion factor, and the model is prompted to mine the negative-sample information in a supervised manner, which improves its robustness. Extensive comparative experiments show that the proposed method achieves high performance on mainstream datasets, and the ablation study confirms the effectiveness of the proposed method.
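The batch-level procedure described above can be illustrated with a minimal, dependency-free sketch. It is written under stated assumptions and is not the implementation used in this work: cosine similarity stands in for the unspecified similarity measure, the confusion factor is assumed to act as a convex blending weight `lam` between an image and its hard negative, and the function names (`hardest_negative_indices`, `confuse`) are hypothetical. Images are represented as flat lists of pixel values for brevity.

```python
import math


def cosine_similarity(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def hardest_negative_indices(features, labels):
    """For each sample, return the index of the most similar sample in the
    batch that has a different identity label (None if there is none)."""
    result = []
    for i, (f_i, y_i) in enumerate(zip(features, labels)):
        best_j, best_sim = None, -math.inf
        for j, (f_j, y_j) in enumerate(zip(features, labels)):
            if y_j == y_i:
                continue  # same identity (or the sample itself): not a negative
            sim = cosine_similarity(f_i, f_j)
            if sim > best_sim:
                best_j, best_sim = j, sim
        result.append(best_j)
    return result


def confuse(images, neg_indices, lam=0.3):
    """Blend each image with its hard negative.

    ASSUMPTION: the confusion factor is modelled as a convex weight `lam`,
    i.e. mixed = (1 - lam) * image + lam * hard_negative.
    """
    mixed = []
    for img, j in zip(images, neg_indices):
        if j is None:
            mixed.append(list(img))  # no negative in the batch: keep as-is
            continue
        neg = images[j]
        mixed.append([(1.0 - lam) * p + lam * q for p, q in zip(img, neg)])
    return mixed


if __name__ == "__main__":
    # Toy batch: 2-D features, identity labels, and 2-pixel "images".
    features = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.8, 0.6]]
    labels = [0, 0, 1, 2]
    images = [[0.0, 0.0], [1.0, 1.0], [0.5, 0.5], [1.0, 0.0]]

    neg = hardest_negative_indices(features, labels)
    print(neg)
    print(confuse(images, neg, lam=0.5))
```

In a real training pipeline the features would come from the network's embedding of each image, and the blended images would be fed back to the model together with a supervision signal; how that supervision is defined is not specified here and is left out of the sketch.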
Table 3 Comparison of results on DukeMTMC-ReID
Table 2 Comparison of results on Market-1501
Table 4 Comparison of results on PRID
Fig. 1 Anchor image sets and hard negative sample sets
Fig. 2 Schematic diagram of the network structure
Fig. 3 Pedestrian samples from four different datasets
Fig. 5 Visualization of hard negative sample confusion
Table 7 Results with different measures