Semantic Segmentation Method Integrating U-Net Improvement Model and Superpixel Optimization

Affiliation:

1. School of Optoelectronic Information and Computer Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China; 2. Institute of Technology, Shanghai Waigaoqiao Shipbuilding Co., Ltd., Shanghai 200120, China

Fund Project:

Supported by the National Natural Science Foundation of China (61703278) and the Scientific Research Program of the Science and Technology Commission of Shanghai Municipality (19511105103).

    Abstract:

    When faced with unrestricted open vocabularies and diverse, changing scenes, existing semantic segmentation methods suffer from insufficiently fine segmentation, inadequate extraction of semantic information, and long convergence times. To address these problems, this paper proposes a semantic segmentation method that integrates an improved U-Net model with superpixel optimization. The improved U-Net combines atrous spatial pyramid pooling (ASPP) with the Xception structure. Dilated convolutions (DC) are added to the branch networks of the ASPP module so that the module itself forms a serial-parallel structure, which strengthens image feature extraction. Attention channels are added to the Xception module, and the module is rebuilt with large convolution kernels, which reduces the number of parameters and speeds up convergence. On top of these improvements, the image is processed with superpixel segmentation. Finally, a conditional random field imposes global constraints on the segmentation result to further refine the semantic information of each pixel. The proposed method is evaluated on the PASCAL VOC 2012 test set and compared with mainstream networks such as DeepLab V3. The results show that accuracy improves by 2.4%, which demonstrates the effectiveness of the method in adapting to varied scenes and in fine-grained semantic segmentation.
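The abstract does not spell out how the superpixel result is combined with the network output. One common way to use superpixels for this purpose, shown here purely as an illustrative sketch and not as the authors' procedure, is majority voting: every pixel takes the most frequent predicted class inside its superpixel. The function name `refine_with_superpixels` and the toy arrays are hypothetical; in practice the superpixel map would come from an algorithm such as SLIC (the paper's figures refer to SLIC), and the conditional random field step is omitted.

```python
import numpy as np

def refine_with_superpixels(pred, superpixels, num_classes):
    """Assign every pixel the majority class of its superpixel.

    pred        : (H, W) integer class map predicted by the network
    superpixels : (H, W) integer superpixel ids, starting at 0
    num_classes : number of semantic classes
    """
    sp = superpixels.ravel()
    cls = pred.ravel()
    n_sp = int(sp.max()) + 1
    # votes[s, c] counts the pixels of superpixel s predicted as class c.
    votes = np.zeros((n_sp, num_classes), dtype=np.int64)
    np.add.at(votes, (sp, cls), 1)
    # Majority class per superpixel (ties go to the lowest class index).
    majority = votes.argmax(axis=1)
    # Broadcast the winning class back to every pixel of the superpixel.
    return majority[superpixels]

# Toy example: two superpixels (left half, right half) with noisy predictions.
pred = np.array([[0, 0, 1, 1],
                 [0, 1, 1, 1],
                 [0, 0, 2, 1],
                 [0, 0, 1, 1]])
superpixels = np.array([[0, 0, 1, 1]] * 4)
refined = refine_with_superpixels(pred, superpixels, num_classes=3)
print(refined)
```

In the toy example the isolated class-1 pixel in the left superpixel and the isolated class-2 pixel in the right superpixel are overwritten by the majority class of their region, which is the kind of boundary-aligned smoothing that superpixel post-processing is meant to provide.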

    Figures and tables:

    Table 1 Context network structure
    Table 2 Comparison of segmentation models with different settings
    Table 3 Comparison of parameters and convergence speed of different modules
    Table 4 Performances of different segmentation models
    Table 5 Comparison of different segmentation models under 30 000 iterations
    Fig.1 Structure graph of ASPP sparse feature extraction
    Fig.2 Semantic segmentation architecture combining U-Net model and SLIC
    Fig.3 Structure diagram of U-Net based on improved Xception structure
    Fig.4 Improved ASPP module structure diagram
    Fig.5 Structure diagram of improved Xception module
    Fig.6 Segmentation results of superpixel iteration
    Fig.7 Semantic segmentation effects produced by different segmentation models
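Cite this article

王振奇, 邵清, 张生, 杨振, 何国春. Semantic segmentation method integrating U-Net improvement model and superpixel optimization[J]. Journal of Data Acquisition and Processing (数据采集与处理), 2021, 36(6): 1263-1275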

History
  • Received: 2020-11-06
  • Last revised: 2021-01-12
  • Accepted:
  • Published online: 2021-11-25