Multi-view 3D reconstruction network based on dilated attention and depth optimal correction
DOI:
Author:
Affiliation:

1.School of Electronics and Information Engineering, Nanjing University of Information Science and Technology; 2.School of Automation, Hangzhou Dianzi University

Author biography:

Corresponding author:

Fund Project:

National Natural Science Foundation of China (General Program)
    Abstract:

    In deep-learning-based 3D reconstruction, CVP-MVSNet and CasMVSNet address the high memory consumption of MVSNet: they significantly reduce memory use when processing high-resolution images and lower the accuracy error of the reconstructed point clouds, but the completeness error of their point clouds remains large. To address this problem, this paper proposes a multi-view 3D reconstruction network based on dilated attention and depth optimal correction, named DA-MVSNet. DA-MVSNet takes CasMVSNet as its baseline network and adds a feature enhancement network built from parallel dilated convolutions, which incorporate the idea of depthwise separable convolution, and an attention module. This network is intended to strengthen the capture of global features from the input views and thereby improve the completeness of the reconstructed point cloud. In addition, to keep the feature enhancement network from extracting too much view-irrelevant background information, which would reduce the accuracy of the reconstructed point cloud, an optimization-based depth map correction founded on nonlinear least squares is introduced at the output of the network to improve the precision of the output depth map. Experimental results show that DA-MVSNet produces the best reconstructed point cloud quality and good overall performance on the indoor-scene dataset DTU, and that it shows good generalization ability on Tanks and Temples, a large dataset of real outdoor scenes.
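    The abstract does not state the objective used by the depth correction step, so the following is only a minimal sketch of what a nonlinear least squares refinement of a single depth value could look like. It assumes a toy residual in which each source view i observes a disparity close to focal * b_i / d for baseline b_i; this residual, the function name `refine_depth`, and all parameters are hypothetical and are not taken from DA-MVSNet. The solver is a plain Gauss-Newton iteration.

```python
def refine_depth(d0, baselines, disparities, focal, iters=10, tol=1e-9):
    """Gauss-Newton refinement of one scalar depth d (illustrative only).

    Minimizes sum_i (focal * b_i / d - disp_i)^2. This residual is an
    assumption made for the sketch, not the objective used in the paper.
    """
    d = float(d0)
    for _ in range(iters):
        # Residuals r_i(d) and their derivatives J_i = d r_i / d d.
        r = [focal * b / d - s for b, s in zip(baselines, disparities)]
        J = [-focal * b / (d * d) for b in baselines]
        JtJ = sum(j * j for j in J)
        if JtJ == 0.0:
            break  # degenerate case: no information about d
        # Gauss-Newton step for a single unknown: -(J^T r) / (J^T J).
        step = -sum(j * ri for j, ri in zip(J, r)) / JtJ
        d += step
        if abs(step) < tol:
            break
    return d


# Noise-free toy data generated from a true depth of 2.0.
focal = 500.0
baselines = [0.1, 0.2, 0.3]
disparities = [focal * b / 2.0 for b in baselines]
refined = refine_depth(1.8, baselines, disparities, focal)
```

    With noise-free observations the iteration recovers the depth used to generate them; with noisy observations it would return the least squares compromise among the views. In a real network the correction would operate on a full depth map, and the residual would come from the method's own formulation.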

History
  • Received: 2024-12-16
  • Last revised: 2025-03-24
  • Accepted: 2025-04-30
  • Published online: