Real-Time Semantic Segmentation of Road Scene Based on Multi-level Attention Feature Optimization
Affiliation:

School of Electrical and Electronic Engineering, Chongqing University of Technology, Chongqing 400054, China

Fund Project:

National Natural Science Foundation of China (62371081); Natural Science Foundation of Chongqing (cstc2021jcyj-msxmX0411, CSTB2022NSCQ-MSX0873).

    Abstract:

    Overlapping targets in complex, changeable road scenes make image edges hard to segment and small-target features hard to extract. To address these problems, a real-time semantic segmentation method for road scenes based on multi-level attention feature optimization is proposed. First, a lightweight residual attention module is designed that accounts for the differences in feature weights across levels and optimizes local image features through a compressed attention mechanism, thereby improving the edge effect between pixels. Then, channel attention and deep aggregation pyramid pooling modules are designed to further strengthen the extraction of semantic context information, which alleviates the loss of small-target information. Finally, an attention fusion module is designed to fuse feature information at different scales in a top-down manner, enabling effective interaction of global feature information and enhancing the network's expression of important features. In experiments on the Cityscapes and CamVid road scene datasets, the method reaches segmentation accuracies of 74.4% and 67.7% at inference speeds of 138 frames/s and 148 frames/s, respectively. Compared with other strong methods from recent years, the method reduces the loss of image edge information and improves segmentation accuracy for small objects.
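
The abstract names channel attention but does not specify its form. As a rough illustration only, the sketch below assumes a squeeze-and-excitation style gate: each channel is averaged to one scalar, passed through a sigmoid, and used to rescale that channel. The function name `channel_attention`, the optional `weights` matrix standing in for learned parameters, and the nested-list feature layout are all hypothetical and not taken from the paper.

```python
import math


def channel_attention(feature_map, weights=None):
    """Rescale each channel of a C x H x W feature map (nested lists).

    Assumed squeeze-and-excitation style gate, not the paper's module.
    `weights` is an optional C x C matrix standing in for learned parameters;
    when omitted, an identity matrix is used so each gate depends only on
    its own channel's average.
    """
    num_channels = len(feature_map)

    # Squeeze: global average pooling gives one scalar per channel.
    pooled = [
        sum(sum(row) for row in channel) / (len(channel) * len(channel[0]))
        for channel in feature_map
    ]

    if weights is None:
        weights = [
            [1.0 if i == j else 0.0 for j in range(num_channels)]
            for i in range(num_channels)
        ]

    # Excite: linear mix of the pooled values followed by a sigmoid gate.
    gates = [
        1.0 / (1.0 + math.exp(-sum(w * p for w, p in zip(weights[i], pooled))))
        for i in range(num_channels)
    ]

    # Scale: multiply every value in a channel by that channel's gate.
    scaled = [
        [[value * gates[c] for value in row] for row in feature_map[c]]
        for c in range(num_channels)
    ]
    return scaled, gates


if __name__ == "__main__":
    # Two 2 x 2 channels: one silent, one strongly activated.
    features = [
        [[0.0, 0.0], [0.0, 0.0]],
        [[2.0, 2.0], [2.0, 2.0]],
    ]
    scaled, gates = channel_attention(features)
    print(gates)
```

With the identity weights used here, a channel with a larger average activation receives a gate closer to 1 and is suppressed less. A trained network would learn the mixing weights instead.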

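Cite this article

Zhang Peng, Peng Zongju, Zhang Wenrui, Luo Yingguo, Wei Wei, Wang Peirong. Real-Time Semantic Segmentation of Road Scene Based on Multi-level Attention Feature Optimization[J]. Journal of Data Acquisition and Processing, 2024, 39(6): 1505-1516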

History
  • Received: 2023-10-19
  • Last revised: 2024-03-28
  • Published online: 2024-12-12