LAC-UNet: semantic segmentation model based on capsules for representing part-whole hierarchical features

doi:10.6040/j.issn.1671-9352.0.2022.488

Abstract:

To address the limited ability of the original U-Net to represent spatial structure features, a capsule structure is introduced into the U-Net semantic segmentation model. The resulting model, LAC-UNet, uses capsule vectors to capture finer spatial structure. LAC-UNet encodes local pixel-level information into capsules and uses capsules as the basic feature unit of U-Net. First, convolution operations feed the local pixel-level information into primary capsules. Second, a local dynamic routing algorithm integrates the primary capsules into higher-level capsules in the digit capsule layer; this routing algorithm introduces spatial and channel weights, which strengthen the capsules' ability to capture local contextual cues when encoding and integrating local information. Finally, model performance is evaluated with several metrics, including accuracy and Dice. The experimental results show that LAC-UNet achieves the best segmentation results on all four datasets: DRIVE, CHASEDB1, CrackForest, and MSRC.

Key words: image semantic segmentation, capsule network, U-Net, routing algorithm, deep learning
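
The abstract describes integrating primary capsules into higher-level capsules by dynamic routing. As a rough illustration, the sketch below implements the standard routing-by-agreement procedure (squash nonlinearity plus iterative coupling updates) in plain Python. It is not the paper's local routing variant: the spatial and channel weights are not specified here and are omitted, and the names `squash`, `softmax`, `dynamic_routing`, and `u_hat` are hypothetical.

```python
import math


def squash(s, eps=1e-9):
    """Shrink vector s so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in s)
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Route input capsules to output capsules by agreement.

    u_hat[i][j] is the prediction vector that input capsule i makes for
    output capsule j (an assumed input layout for this sketch).
    Returns the output capsule vectors and the final coupling coefficients.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    v, c = [], []
    for _ in range(num_iters):
        # Coupling coefficients: each input capsule distributes itself over outputs.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            # Weighted sum of predictions for output capsule j, then squash.
            s_j = [sum(c[i][j] * u_hat[i][j][k] for i in range(n_in))
                   for k in range(dim)]
            v.append(squash(s_j))
        # Increase logits where a prediction agrees with the output (dot product).
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(a * w for a, w in zip(u_hat[i][j], v[j]))
    return v, c


# Two input capsules agree on output 0 and contradict each other on output 1.
u_hat = [[[1.0, 0.0], [1.0, 0.0]],
         [[1.0, 0.0], [-1.0, 0.0]]]
outputs, couplings = dynamic_routing(u_hat)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in outputs]
```

In this toy input the agreeing predictions reinforce output capsule 0, so its length grows across iterations, while the contradictory predictions for output capsule 1 cancel out.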

CLC number: TP391

Chengcheng ZHONG, Heng ZHOU, Zitong ZHANG, Chunlei ZHANG. LAC-UNet: semantic segmentation model based on capsules for representing part-whole hierarchical features[J]. Journal of Shandong University (Natural Science), 2023, 58(11): 116-126.




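Encoder layer	Conv input channels	Conv output channels	Conv kernel	Primary capsule input channels	Primary capsule length	Digit capsule input channels	Digit capsule length
1	3	3	3×3	4	4	2	8
2	32	32	3×3	8	4	4	8
3	64	64	3×3	16	4	8	8
4	128	128	3×3	32	4	16	8
5	256	256	3×3	64	4	32	8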

Model	A	S	S′	D
U-Net	0.9612	0.8114	0.9741	0.8038
U-Net++	0.9623	0.8079	0.9776	0.8080
SegNet	0.9616	0.8128	0.9764	0.8058
UperNet	0.9531	0.7807	0.9703	0.7625
SGLNet	0.9616	0.7323	0.9849	0.7886
Caps-Unet	0.9659	0.8200	0.9803	0.8111
TransUnet	0.9638	0.8097	0.9794	0.8156
LAC-UNet	0.9685	0.8236	0.9879	0.8217

Model	A	S	S′	D
U-Net	0.9689	0.8061	0.9816	0.8081
U-Net++	0.9693	0.7966	0.9828	0.8088
SegNet	0.9673	0.7880	0.9813	0.8003
UperNet	0.9663	0.8132	0.9783	0.7946
SGLNet	0.9536	0.6735	0.9752	0.6912
Caps-Unet	0.9646	0.8129	0.9838	0.8156
TransUnet	0.9653	0.7965	0.9492	0.7686
LAC-UNet	0.9720	0.8208	0.9876	0.8201

Model	A	S	S′	D
U-Net	0.9814	0.6346	0.9925	0.6140
U-Net++	0.9777	0.6742	0.9881	0.5710
SegNet	0.9795	0.6414	0.9905	0.5857
UperNet	0.9826	0.6568	0.9933	0.6431
SGLNet	0.9613	0.5444	0.9686	0.3245
Caps-Unet	0.9894	0.6550	0.9927	0.5978
TransUnet	0.9426	0.6390	0.9879	0.6174
LAC-UNet	0.9850	0.6751	0.9936	0.6327

Model	A	S	S′	D
U-Net	0.9802	0.9036	0.9920	0.9036
U-Net++	0.9783	0.8914	0.9909	0.8914
SegNet	0.9749	0.8694	0.9891	0.8694
UperNet	0.9773	0.8847	0.9904	0.8847
SGLNet	0.9380	0.5972	0.9664	0.5972
Caps-Unet	0.9803	0.8249	0.9897	0.8673
TransUnet	0.9792	0.8238	0.9856	0.6655
LAC-UNet	0.9826	0.9243	0.9927	0.9543
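
The comparison tables above report four scores per model. The column definitions are not given on this page, so the sketch below rests on an assumption: A is pixel accuracy, D is the Dice coefficient (both named in the abstract), and S and S′ are sensitivity and specificity. It shows how such scores could be computed for a binary mask from confusion-matrix counts; the function names and the sample masks are hypothetical, and this is not necessarily the evaluation code used by the authors.

```python
def confusion_counts(pred, target):
    """Count true/false positives and negatives over flattened binary masks."""
    tp = fp = tn = fn = 0
    for p, t in zip(pred, target):
        if p and t:
            tp += 1
        elif p and not t:
            fp += 1
        elif not p and t:
            fn += 1
        else:
            tn += 1
    return tp, fp, tn, fn


def segmentation_metrics(pred, target):
    """Return accuracy, sensitivity, specificity, and Dice for binary masks.

    Each ratio falls back to 0.0 when its denominator is zero.
    """
    tp, fp, tn, fn = confusion_counts(pred, target)
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total if total else 0.0,
        "sensitivity": tp / (tp + fn) if (tp + fn) else 0.0,
        "specificity": tn / (tn + fp) if (tn + fp) else 0.0,
        "dice": 2 * tp / (2 * tp + fp + fn) if (2 * tp + fp + fn) else 0.0,
    }


# Toy flattened masks (1 = foreground pixel, 0 = background pixel).
pred = [1, 1, 0, 0, 1, 0, 0, 0]
target = [1, 0, 0, 0, 1, 1, 0, 0]
scores = segmentation_metrics(pred, target)
```

For these toy masks there are 2 true positives, 1 false positive, 1 false negative, and 4 true negatives, giving an accuracy of 0.75 and a specificity of 0.8.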