Abstract
To segment thyroid nodules more accurately, this paper proposes an improved fully convolutional network (FCN) segmentation model. Compared with the original FCN, the proposed method adds an atrous spatial pyramid pooling (ASPP) module and a multi-layer feature transfer (FT) module, adopts the Decoder module of LinkNet for upsampling, and uses a VGG16 backbone for downsampling feature extraction. The experiments use 17,413 ultrasound thyroid nodule images from the Stanford AIMI (Artificial Intelligence in Medicine and Imaging) shared dataset for training, validation, and testing. The results show that, compared with several other segmentation models, the proposed model reaches 79.7%, 87.6%, and 98.42% on mean intersection over union (mIoU), Dice similarity coefficient, and F1 score, respectively, achieving better segmentation and effectively improving the accuracy of thyroid nodule segmentation.
Keywords
The thyroid is a gland of the endocrine system, and a thyroid nodule is an abnormal mass that forms within the thyroid. Studies have shown that roughly 11% of patients with thyroid nodules develop thyroid cancer.
Traditional thyroid nodule segmentation methods fall broadly into two types: contour-based and region-based. Contour-based methods use contour information in the ultrasound image to segment thyroid nodules.
Convolutional neural networks have achieved great success in computer image recognition.
Inspired by the encoder-decoder structure, and to address the diversity in size and shape of thyroid tissue in ultrasound images as well as the complexity of the surrounding tissue, this paper proposes an improved FCN segmentation model.
The improved FCN consists of four parts: a VGG16 downsampling module, an ASPP module, a multi-layer feature transfer module, and a Decoder upsampling module. The model structure is shown in Fig. 1.

图1 改进的FCN模型结构图
Fig.1 Structure diagram of improved FCN model
| VGG16 module | Output resolution/pixel | FT module | Output resolution/pixel | Decoder module | Output resolution/pixel |
|---|---|---|---|---|---|
| Conv3,s1 | 224×224,64 | | | | |
| Conv3,s1 | | | | ConvT3,s2 | 224×224,2 |
| Maxpool,s2 | 112×112,64 | | | Conv3,s1 | 112×112,32 |
| Conv3,s1 | 112×112,128 | | | ConvT3,s2 | 112×112,32 |
| Conv3,s1 | | Conv3,s2 | 56×56,96 | | |
| Maxpool,s2 | 56×56,128 | Conv1,s1 | 56×56,96 | | |
| Conv3,s1 | 56×56,256 | | | ConvT3,s2 | 56×56,192/4 |
| Conv3,s1 | | | | Conv1,s1 | 28×28,192/4 |
| Conv3,s1 | | Conv3,s2 | 28×28,192 | | |
| Maxpool,s2 | 28×28,256 | Conv1,s1 | 28×28,192 | | |
| Conv3,s1 | 28×28,512 | | | ConvT3,s2 | 28×28,384/4 |
| Conv3,s1 | | | | Conv1,s1 | 14×14,384/4 |
| Conv3,s1 | | Conv3,s2 | 14×14,384 | | |
| Maxpool,s2 | 14×14,512 | Conv1,s1 | 14×14,384 | | |
| Conv3,s1 | | | | ConvT3,s2 | 14×14,768/4 |
| Conv3,s1 | | | | Conv1,s1 | 7×7,768/4 |
| Conv3,s1 | | | | | |
| Maxpool,s2 | 7×7,512 | | | | |
| ASPP | 7×7,768 | | | | |
The VGG16 network has a simple structure and is commonly used as a backbone for image feature extraction.
The multi-layer feature transfer comprises three FT blocks; the feature transfer module is shown in Fig. 2.

图2 特征传递模块
Fig.2 Feature transfer module
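Based on the Conv3,s2 + Conv1,s1 configuration listed for the FT columns of the network table, one FT block can be sketched as below. The normalization and activation layers (BatchNorm + ReLU) are assumptions for illustration, not details taken from the paper.

```python
import torch
from torch import nn

class FTBlock(nn.Module):
    """Sketch of one feature-transfer (FT) block: a stride-2 3x3 conv
    downsamples the incoming feature map to match the next encoder stage,
    then a 1x1 conv refines it at the same channel count."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        self.down = nn.Sequential(
            nn.Conv2d(in_ch, out_ch, kernel_size=3, stride=2, padding=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True))
        self.refine = nn.Sequential(
            nn.Conv2d(out_ch, out_ch, kernel_size=1),
            nn.BatchNorm2d(out_ch),
            nn.ReLU(inplace=True))
    def forward(self, x):
        return self.refine(self.down(x))
```

For example, the first FT block in the table maps a 112×112, 128-channel feature map to 56×56 with 96 channels.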
ASPP samples the input feature map in parallel with atrous (dilated) convolutions of different dilation rates; the different rates yield receptive fields at different scales, extracting multi-scale information.

图3 ASPP结构图
Fig.3 Structure diagram of ASPP
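A minimal sketch of the parallel atrous sampling described above, assuming the common DeepLab-style dilation rates (1, 6, 12, 18) and omitting the global-pooling branch; these choices are assumptions, not the paper's exact configuration.

```python
import torch
from torch import nn

class ASPP(nn.Module):
    """Sketch of atrous spatial pyramid pooling: parallel 3x3 atrous
    convolutions with different dilation rates, concatenated along the
    channel axis and fused by a 1x1 projection."""
    def __init__(self, in_ch, out_ch, rates=(1, 6, 12, 18)):
        super().__init__()
        # padding == dilation keeps the spatial size unchanged for a 3x3 kernel
        self.branches = nn.ModuleList(
            nn.Conv2d(in_ch, out_ch, 3, padding=r, dilation=r) for r in rates)
        self.project = nn.Conv2d(out_ch * len(rates), out_ch, 1)
    def forward(self, x):
        feats = torch.cat([branch(x) for branch in self.branches], dim=1)
        return self.project(feats)
```

Per the network table, ASPP sits at the bottom of the encoder, mapping the 7×7, 512-channel VGG16 output to 7×7 with 768 channels.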
This paper adopts the Decoder module of the LinkNet network to upsample the feature maps.

图4 1×1卷积块
Fig.4 1×1 convolution block

图5 3×3转置卷积块
Fig.5 3×3 transposed convolution block
First, after passing through three Decoder blocks, the feature map's height and width are increased 8-fold and its channel count is reduced to 1/8. It then passes through one 3×3 transposed convolution block, one 3×3 convolution block, and one 3×3 transposed convolution, after which the height and width are 32 times the input and the channel count is reduced to 2, producing the thyroid nodule segmentation. The Decoder module thus both upsamples the feature maps and, by fusing multi-level feature information, retains richer local detail and abstract semantic information.
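A LinkNet-style decoder block, matching the Conv1 → ConvT3,s2 → Conv1 pattern in the network table (the intermediate channel count is in_ch/4, the "…/4" entries), can be sketched as:

```python
import torch
from torch import nn

class DecoderBlock(nn.Module):
    """Sketch of a LinkNet-style decoder block: a 1x1 conv reduces channels
    to in_ch // 4, a stride-2 3x3 transposed conv doubles the spatial size,
    and a final 1x1 conv sets the output channel count."""
    def __init__(self, in_ch, out_ch):
        super().__init__()
        mid = in_ch // 4
        self.block = nn.Sequential(
            nn.Conv2d(in_ch, mid, kernel_size=1),
            nn.ConvTranspose2d(mid, mid, kernel_size=3, stride=2,
                               padding=1, output_padding=1),
            nn.Conv2d(mid, out_ch, kernel_size=1))
    def forward(self, x):
        return self.block(x)
```

For example, the deepest block maps the 7×7, 768-channel ASPP output (via a 768/4 = 192-channel bottleneck) to 14×14.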
The dataset used in this paper is the Stanford AIMI shared dataset, containing records of 167 patients with biopsy-confirmed thyroid nodules from Stanford University Medical Center, aged 19 to 84 (mean 56). The dataset consists of 17,412 ultrasound thyroid nodule images and 17,412 segmentation masks annotated by radiologists; 60%, 20%, and 20% of the images are used for training, validation, and testing, respectively.
The experiments were implemented with the PyTorch 1.9.0 framework, and all computation was accelerated on a single NVIDIA GeForce GTX 1080Ti GPU with 11 GB of memory. During training, the maximum learning rate was set to 0.0001, and a warmup learning-rate update strategy was adopted.
(4)
where batch_size is the number of images taken in one training step and epoch is the number of training iterations.
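A generic warmup schedule can be sketched as below: a linear ramp up to the maximum learning rate followed by a linear decay. This is an assumption for illustration; the paper's exact Eq. (4), a function of batch_size and epoch, may differ.

```python
def warmup_lr(step, max_lr=1e-4, warmup_steps=500, total_steps=10000):
    """Sketch of a linear-warmup learning-rate schedule: the rate climbs
    linearly to max_lr over warmup_steps, then decays linearly to zero.
    The shape is assumed, not taken from the paper's Eq. (4)."""
    if step < warmup_steps:
        return max_lr * step / warmup_steps
    return max_lr * (total_steps - step) / (total_steps - warmup_steps)
```

The schedule starts at zero, peaks at `max_lr` (the paper's 0.0001) when warmup ends, and returns to zero at the final step.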

图6 学习率变化曲线
Fig.6 Learning Rate curve
This paper uses mIoU, the Dice similarity coefficient, F1 score, FLOPs, and Params as evaluation metrics for the ultrasound thyroid nodule segmentation task.
$$\mathrm{mIoU}=\frac{1}{k+1}\sum_{i=0}^{k}\frac{p_{ii}}{\sum_{j=0}^{k}p_{ij}+\sum_{j=0}^{k}p_{ji}-p_{ii}}\tag{5}$$

where $i$ denotes the ground-truth class, $j$ denotes the predicted class, $p_{ij}$ denotes the number of pixels of class $i$ predicted as class $j$, $k$ denotes the number of classes, and $k+1$ includes the background class.

$$\mathrm{Dice}=\frac{2\,\mathrm{TP}}{2\,\mathrm{TP}+\mathrm{FP}+\mathrm{FN}}\tag{6}$$

$$\mathrm{Precision}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FP}}\tag{7}$$

$$\mathrm{Recall}=\frac{\mathrm{TP}}{\mathrm{TP}+\mathrm{FN}}\tag{8}$$

$$F1=\frac{2\times\mathrm{Precision}\times\mathrm{Recall}}{\mathrm{Precision}+\mathrm{Recall}}\tag{9}$$

where TP (true positive) is defined as the region correctly segmented as thyroid nodule, FP (false positive) as the non-nodule region wrongly segmented as nodule, and FN (false negative) as the nodule region wrongly missed by the segmentation.
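For the binary nodule-vs-background case, these metrics reduce to pixel counts of TP, FP, FN, and TN; a NumPy sketch (the two-class specialization, assumed for illustration):

```python
import numpy as np

def binary_seg_metrics(pred, target):
    """Compute mIoU, Dice, precision, recall, and F1 for binary 0/1 masks.
    mIoU averages the foreground and background IoU (the k+1 = 2 classes)."""
    tp = np.sum((pred == 1) & (target == 1))
    fp = np.sum((pred == 1) & (target == 0))
    fn = np.sum((pred == 0) & (target == 1))
    tn = np.sum((pred == 0) & (target == 0))
    miou = 0.5 * (tp / (tp + fp + fn) + tn / (tn + fp + fn))
    dice = 2 * tp / (2 * tp + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return miou, dice, precision, recall, f1
```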
To study the segmentation performance of the proposed model on ultrasound thyroid nodules, it is compared against ten other segmentation models on these metrics, with the results listed in the following table.
Network | mIoU/% | Dice/% | F1/% | FLOPs/GMAC | Params/MB |
---|---|---|---|---|---|
FCN_VGG16_8s | 77.2 | 85.8 | 98.23 | 19.58 | 19.17 |
FCN_ResNet50_8s | 73.3 | 82.6 | 97.94 | 26.55 | 32.95 |
FCN_ResNet101_8s | 70.4 | 80.1 | 97.64 | 41.46 | 51.94 |
U‑Net | 75.3 | 84.3 | 98.01 | 50.1 | 34.51 |
Swin‑Unet | 70.8 | 80.7 | 97.07 | 8.65 | 41.38 |
LinkNet | 71.7 | 80.7 | 97.51 | 1.33 | 12.95 |
Segnet | 74.2 | 83.3 | 98.12 | 30.73 | 29.44 |
PSPNet+MobileNetV2 | 67.8 | 78.1 | 96.30 | 1.88 | 2.38 |
PSPNet+ResNet50 | 71.6 | 81.2 | 97.63 | 35.37 | 46.71 |
DeepLabv3+MobileNetV2 | 66.7 | 76.7 | 97.01 | 5.05 | 5.81 |
Improved FCN | 79.7 | 87.6 | 98.42 | 17.84 | 32.17 |
The segmentation results of the different models on the thyroid nodule test images are shown in Fig. 7.

图7 不同模型分割结果
Fig.7 Segmentation results of different models
To better demonstrate the overall stability of the proposed model, the mIoU and Dice similarity coefficient curves on the validation images during validation are given; the mIoU curves of the different segmentation models are shown in Fig. 8 and their Dice curves in Fig. 9.

图8 不同分割模型在验证集图像上的mIoU曲线
Fig.8 mIoU curves of different segmentation models on validation set images

图9 不同分割模型在验证集图像上的Dice曲线
Fig.9 Dice curves of different segmentation models on validation set images
To further verify the segmentation performance of the proposed model on ultrasound thyroid nodules, it is compared with the methods of Refs. [ ], with the results listed in the following table.
Method | mIoU/% | Precision/% | Recall/% | F1/% |
---|---|---|---|---|
Ref. [ ] | 65.8 | 97.78 | 94.63 | 96.18 |
Ref. [ ] | 77.9 | 98.47 | 98.15 | 98.30 |
Ref. [ ] | 73.7 | 96.97 | 98.18 | 97.57 |
Ref. [ ] | 75.3 | 99.02 | 97.02 | 98.00 |
Ref. [ ] | 73.8 | 98.25 | 97.05 | 97.64 |
Improved FCN | 79.7 | 99.82 | 97.06 | 98.42 |
To study the effect of the backbone on segmentation performance, experiments were run with several backbone feature-extraction networks while keeping the rest of the network unchanged.
Backbone | mIoU/% | Dice/% | F1/% | FLOPs/GMAC | Params/MB |
---|---|---|---|---|---|
GoogLeNet | 76.4 | 85.2 | 97.98 | 4.32 | 42.49 |
MobileNetV2 | 69.2 | 79.0 | 97.38 | 2.67 | 35.34 |
ResNet50 | 74.4 | 83.6 | 97.99 | 9.25 | 77.49 |
ResNet101 | 68.0 | 78.1 | 96.85 | 12.98 | 96.49 |
EfficientNetB0 | 69.2 | 78.9 | 97.51 | 2.72 | 36.98 |
EfficientNetB7 | 70.9 | 80.6 | 97.60 | 9.01 | 126.01 |
Improved FCN | 79.7 | 87.6 | 98.42 | 17.84 | 32.17 |
The segmentation results on the thyroid nodule test images with different backbones are shown in Fig. 10.

图10 不同主干网络分割结果
Fig.10 Segmentation results of different backbone networks
To better demonstrate the stability of segmentation with VGG16 as the backbone, the mIoU and Dice similarity coefficient curves on the thyroid nodule validation images during validation are given; the mIoU curves for the different backbones are shown in Fig. 11 and their Dice curves in Fig. 12.

图11 不同主干网络在验证集图像上的mIoU曲线
Fig.11 mIoU curves of different backbone on validation set images

图12 不同主干网络在验证集图像上的Dice曲线
Fig.12 Dice curves of different backbone on validation set images
To further verify the model's performance, the ultrasound thyroid dataset was used to train four variants: the VGG+Decoder base model, the base model with the FT module, the base model with the ASPP module, and the base model with both modules added (the improved FCN); each was then evaluated on the test set.
Model | mIoU/% | Dice/% | F1/% | FLOPs/GMAC | Params/MB |
---|---|---|---|---|---|
VGG+Decoder | 75.9 | 84.8 | 97.86 | 16.06 | 15.19 |
VGG+Decoder+FT | 77.22 | 85.9 | 98.02 | 17.11 | 17.52 |
VGG+Decoder+ASPP | 76.9 | 85.6 | 98.11 | 16.79 | 29.84 |
Improved FCN | 79.7 | 87.6 | 98.42 | 17.84 | 32.17 |
Resolution/(pixel×pixel) | mIoU/% | Dice/% | F1/% | FLOPs/GMAC | Params/MB |
---|---|---|---|---|---|
224×224 | 79.7 | 87.6 | 98.42 | 17.84 | 32.17 |
384×384 | 78.7 | 86.9 | 98.32 | 52.42 | 32.17 |
448×448 | 77.9 | 86.3 | 98.23 | 71.34 | 32.17 |
This paper uses joint supervision of BCE Loss and Dice Loss to predict the segmentation image, with the two losses weighted by coefficients α and β; results for different weightings are given in the following table.
α | β | mIoU/% | Dice/% | F1/% |
---|---|---|---|---|
1.0 | 0.0 | 77.4 | 86.0 | 98.07 |
0.8 | 0.2 | 79.4 | 87.4 | 98.40 |
0.7 | 0.3 | 78.7 | 88.40 | 98.28 |
0.6 | 0.4 | 79.1 | 87.2 | 98.36 |
0.5 | 0.5 | 77.8 | 87.20 | 98.22 |
0.4 | 0.6 | 78.5 | 86.8 | 98.29 |
0.3 | 0.7 | 77.5 | 86.1 | 98.02 |
0.2 | 0.8 | 79.7 | 87.6 | 98.42 |
0.0 | 1.0 | 78.7 | 87.0 | 98.21 |
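The joint supervision above can be sketched as a weighted sum of the two losses; α = 0.2, β = 0.8 is the best-performing row in the table. The sigmoid activation and the smoothing term `eps` are implementation assumptions, not details stated in the paper.

```python
import torch
from torch import nn

def combined_loss(logits, target, alpha=0.2, beta=0.8, eps=1e-6):
    """Sketch of the joint BCE + Dice supervision:
    total = alpha * L_BCE + beta * L_Dice, on binary masks."""
    bce = nn.functional.binary_cross_entropy_with_logits(logits, target)
    prob = torch.sigmoid(logits)
    intersection = (prob * target).sum()
    # soft Dice loss: 1 - Dice coefficient, with eps for numerical stability
    dice = 1 - (2 * intersection + eps) / (prob.sum() + target.sum() + eps)
    return alpha * bce + beta * dice
```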
This paper proposed an improved FCN method for ultrasound thyroid nodule segmentation, mainly adding a multi-layer feature transfer module and an atrous spatial pyramid pooling module. The multi-layer feature transfer module further downsamples the feature maps from VGG16 and passes multi-level feature maps to aid feature fusion, while the ASPP module extracts receptive-field information at multiple scales. In addition, a Decoder module performs feature-map upsampling and fuses multi-level feature information. The experiments were conducted on the Stanford AIMI shared dataset and analyzed with five metrics: mIoU, Dice similarity coefficient, F1 score, FLOPs, and Params. The results show that, compared with the other models, the proposed model achieves the best values on mIoU, Dice, and F1. However, the model is relatively complex and has somewhat more trainable parameters; future work will therefore focus on building a lightweight, high-accuracy image segmentation network deployable on medical devices.
References
Paluskievicz C M, Chang D R, Blackburn K W, et al. Low-risk papillary thyroid cancer: Treatment de-escalation and cost implications[J]. Journal of Surgical Research, 2022, 275: 273-280.
CHEN Bo, FENG Mei, YAO Zhongyang, et al. Hypoxia promotes thyroid cancer progression through HIF1α/FGF11 feedback loop[J]. Experimental Cell Research, 2022, 416(1): 113159.
DONG Fen, ZHANG Biao, SHAN Guangliang. Distribution and risk factors of thyroid cancer in China[J]. China Oncology, 2016, 26(1): 47-52.
van Velsen E F S, Leung A M, Korevaar T I M. Diagnostic and treatment considerations for thyroid cancer in women of reproductive age and the perinatal period[J]. Endocrinology and Metabolism Clinics, 2022, 51(2): 403-416.
WANG Bo, LI Mengxiang, LIU Xia. Ultrasound image segmentation method of thyroid nodules based on the improved U-Net network[J]. Journal of Electronics & Information Technology, 2022, 44(2): 514-522.
Phuttharak W, Boonrod A, Klungboonkrong V, et al. Interrater reliability of various thyroid imaging reporting and data system (TIRADS) classifications for differentiating benign from malignant thyroid nodules[J]. Asian Pacific Journal of Cancer Prevention: APJCP, 2019, 20(4): 1283.
HU Yishan, QIN Pinle, ZENG Jianchao, et al. Ultrasound thyroid segmentation network based on feature fusion and dynamic multi-scale dilated convolution[J]. Journal of Computer Applications, 2021, 41(3): 891-897.
Maroulis D E, Savelonas M A, Iakovidis D K, et al. Variable background active contour model for computer-aided delineation of nodules in thyroid ultrasound images[J]. IEEE Transactions on Information Technology in Biomedicine, 2007, 11(5): 537-543.
Chan T F, Vese L A. Active contours without edges[J]. IEEE Transactions on Image Processing, 2001, 10(2): 266-277.
Savelonas M A, Iakovidis D K, Legakis I, et al. Active contours guided by echogenicity and texture for delineation of thyroid nodules in ultrasound images[J]. IEEE Transactions on Information Technology in Biomedicine, 2008, 13(4): 519-527.
ZHAO Jie, ZHENG Wei, ZHANG Li, et al. Segmentation of ultrasound images of thyroid nodule for assisting fine needle aspiration cytology[J]. Health Information Science and Systems, 2013, 1(1): 1-12.
Alrubaidi W M H, PENG Bo, YANG Yan, et al. An interactive segmentation algorithm for thyroid nodules in ultrasound images[C]//Proceedings of International Conference on Intelligent Computing. Cham: Springer, 2016: 107-115.
WANG Lei, YANG Shujian, YANG Shan, et al. Automatic thyroid nodule recognition and diagnosis in ultrasound imaging with the YOLOv2 neural network[J]. World Journal of Surgical Oncology, 2019, 17(1): 12.
Buda M, Wildman-Tobriner B, Castor K, et al. Deep learning-based segmentation of nodules in thyroid ultrasound: Improving performance by utilizing markers present in the images[J]. Ultrasound in Medicine & Biology, 2020, 46(2): 415-421.
LU Hongtao, LUO Mukun. Survey on new progresses of deep learning based computer vision[J]. Journal of Data Acquisition and Processing, 2022, 37(2): 247-278.
HU Jie, SHEN Li, SUN Gang. Squeeze-and-excitation networks[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE, 2018: 7132-7141.
LIU Anan, LI Tianbao, WANG Xiaowen, et al. Review of 3D model retrieval algorithms based on deep learning[J]. Journal of Data Acquisition and Processing, 2021, 36(1): 1-21.
MA Jinlian, WU Fa, JIANG Tian’an, et al. Ultrasound image-based thyroid nodule automatic segmentation using convolutional neural networks[J]. International Journal of Computer Assisted Radiology and Surgery, 2017, 12(11): 1895-1910.
LIU Mingkun, ZHANG Junhua, LI Zonggui. Improved mask R-CNN method for thyroid nodules segmentation in ultrasound images[J]. Computer Engineering and Applications, 2022, 58(16): 219-225.
LI Xuewei, WANG Shuaijie, WEI Xi, et al. Fully convolutional networks for ultrasound image segmentation of thyroid nodules[C]//Proceedings of 2018 IEEE 20th International Conference on High Performance Computing and Communications. [S.l.]: IEEE, 2018: 886-890.
ZHOU Shujun, WU Hong, GONG Jie, et al. Mark-guided segmentation of ultrasonic thyroid nodules using deep learning[C]//Proceedings of the 2nd International Symposium on Image Computing and Digital Medicine. New York: ACM, 2018: 21-26.
YING Xiang, YU Zhihui, YU Ruiguo, et al. Thyroid nodule segmentation in ultrasound images based on cascaded convolutional neural network[C]//Proceedings of International Conference on Neural Information Processing. Cham: Springer, 2018: 373-384.
CHI Jianning, YU Xiaosheng, ZHANG Yifei. Thyroid nodule malignant risk detection in ultrasound image by fusing deep and texture features[J]. Journal of Image and Graphics, 2018, 23(10): 1582-1593.
PEI Yun. Research on attention mechanism in medical image analysis[D]. Changchun: Jilin University, 2022.
SIMONYAN K, ZISSERMAN A. Very deep convolutional networks for large-scale image recognition[EB/OL]. (2014-09-10)[2022-06-13]. https://doi.org/10.48550/arXiv.1409.1556.
BA J L, KIROS J R, HINTON G E. Layer normalization[EB/OL]. (2016-07-21)[2022-06-13]. https://doi.org/10.48550/arXiv.1607.06450.
Maas A L, Hannun A Y, Ng A Y. Rectifier nonlinearities improve neural network acoustic models[J]. Journal of Machine Learning Research, 2013, 30(1): 3.
Chen L C, Papandreou G, Kokkinos I, et al. DeepLab: Semantic image segmentation with deep convolutional nets, atrous convolution, and fully connected CRFs[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 2017, 40(4): 834-848.
ZHOU Su, WU Di, JIN Jie. Lane instance segmentation algorithm based on convolutional neural network[J]. Laser & Optoelectronics Progress, 2021, 58(8): 381-388.
Chaurasia A, Culurciello E. LinkNet: Exploiting encoder representations for efficient semantic segmentation[C]//Proceedings of 2017 IEEE Visual Communications and Image Processing (VCIP). [S.l.]: IEEE, 2017: 1-4.
Jadon S. A survey of loss functions for semantic segmentation[C]//Proceedings of 2020 IEEE Conference on Computational Intelligence in Bioinformatics and Computational Biology. [S.l.]: IEEE, 2020: 1-7.
LI Xiaoya, SUN Xiaofei, MENG Yuxian, et al. Dice loss for data-imbalanced NLP tasks[EB/OL]. (2019-09-07)[2022-06-14]. https://doi.org/10.48550/arXiv.1911.02855.
XIONG Ruibin, YANG Yunchang, HE Di, et al. On layer normalization in the transformer architecture[C]//Proceedings of International Conference on Machine Learning. [S.l.]: PMLR, 2020: 10524-10533.
LOSHCHILOV I, HUTTER F. Decoupled weight decay regularization[EB/OL]. (2017-09-14)[2022-06-13]. https://doi.org/10.48550/arXiv.1711.05101.
YANG Lei, GU Yuge, HUO Benyan, et al. A shape-guided deep residual network for automated CT lung segmentation[J]. Knowledge-Based Systems, 2022, 250: 108981.
Perazzi F, Pont-Tuset J, McWilliams B, et al. A benchmark dataset and evaluation methodology for video object segmentation[C]//Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition. [S.l.]: IEEE, 2016: 724-732.
MOLCHANOV P, TYREE S, KARRAS T, et al. Pruning convolutional neural networks for resource efficient inference[EB/OL]. (2016-09-19)[2022-06-13]. https://doi.org/10.48550/arXiv.1611.06440.