Multi-branch Collaborative Segmentation Model for Multi-modal Cardiac Imaging
Author:
Affiliation:
College of Artificial Intelligence, Nanjing University of Aeronautics & Astronautics, Nanjing 211106, China
Fund Project:
Abstract:
Accurate segmentation of cardiac structures is important for the computer-aided diagnosis of cardiovascular disease and for precise preoperative assessment. Images of different modalities differ significantly in spatial distribution and semantic expression, yet most existing methods adopt single-branch network structures, which cannot fully integrate multi-modal information and generalize poorly across multi-modal tasks. To address this problem, this paper proposes a multi-branch collaborative segmentation network, the multi-modal collaborative network (MCNet), which combines the Mamba state space model with a convolutional model. The network consists of three main modules: a dual-branch feature extractor based on Mamba and convolutional neural networks (CNNs), a dynamic feature fusion module, and a Mamba decoder. The two branches of the feature extractor focus on extracting global semantic features and local detail features, respectively, while the dynamic feature fusion module adjusts the weights of multiple fusion paths according to the input image, thereby integrating features from the different branches dynamically. The proposed method is evaluated extensively on the cardiac MRI dataset ACDC and the echocardiography dataset CAMUS. Experimental results show that, through a dynamic feature fusion module based on the mixture-of-experts (MoE) mechanism, the proposed method dynamically adjusts the fusion weights of Mamba global features and CNN local features. On the ACDC dataset, where boundaries are clear, the average Dice and intersection over union (IoU) reach 0.845 and 0.779, respectively; on the CAMUS dataset, where boundaries are blurred, the average Dice and IoU reach 0.883 and 0.796, respectively, both outperforming current mainstream methods. Ablation experiments further validate the effectiveness of each module.
MCNet uses the MoE mechanism to adjust the fusion weights of global and local features in real time, improving the integrity of structural details while preserving global perception, and thus provides an efficient and robust solution for multi-modal cardiac image segmentation.
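To illustrate the idea behind the MoE-based dynamic fusion described above, the following is a minimal NumPy sketch of how a gating network might weight several fusion paths over a global (Mamba-style) branch and a local (CNN-style) branch. The three fusion paths, the function names, and the gating design are illustrative assumptions for this sketch, not the paper's actual implementation.

```python
import numpy as np

def softmax(x, axis=-1):
    # Numerically stable softmax over the given axis.
    e = np.exp(x - x.max(axis=axis, keepdims=True))
    return e / e.sum(axis=axis, keepdims=True)

def moe_fuse(f_global, f_local, gate_w, gate_b):
    """Hypothetical MoE-style dynamic fusion of two feature branches.

    f_global, f_local : (B, C) pooled feature descriptors from the
        global (Mamba) and local (CNN) branches (assumed shapes).
    gate_w, gate_b    : parameters of a linear gating network that
        scores three candidate fusion paths per sample.
    """
    # Three illustrative fusion "experts": global-only, local-only,
    # and element-wise sum of both branches.
    paths = np.stack([f_global, f_local, f_global + f_local], axis=1)  # (B, 3, C)
    # Gating input: concatenation of both branch descriptors.
    g_in = np.concatenate([f_global, f_local], axis=1)                 # (B, 2C)
    weights = softmax(g_in @ gate_w + gate_b)                          # (B, 3)
    # Output is a per-sample convex combination of the fusion paths.
    return np.einsum('bk,bkc->bc', weights, paths)
```

Because the softmax weights form a convex combination, a confident gate can route almost all weight to a single path, which is how such a module can emphasize global context or local detail depending on the input image.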