Journal of Shandong University (Natural Science), 2023, Vol. 58, Issue (11): 116-126. doi: 10.6040/j.issn.1671-9352.0.2022.488

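  • About the first author: ZHONG Chengcheng (born 1995), female, master's student; research interests include machine learning and deep learning. E-mail: 1129109156@qq.com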

LAC-UNet: semantic segmentation model based on capsules for representing part-whole hierarchical features

Chengcheng ZHONG1, Heng ZHOU2,*, Zitong ZHANG3, Chunlei ZHANG4

  1. School of Science, China University of Geosciences, Beijing 100083, China
  2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
  3. School of Earth Sciences and Resources, China University of Geosciences, Beijing 100083, China
  4. Beijing Zhongdi Runde Petroleum Technology Co., Ltd., Beijing 100083, China
  • Received: 2022-09-09 Published: 2023-11-20 Online: 2023-11-07
  • Contact: Heng ZHOU E-mail:1129109156@qq.com;xys_zh@163.com

Abstract:

To address the limited ability of the original U-Net to express spatial structure features, a capsule structure is introduced into the U-Net semantic segmentation model, yielding LAC-UNet, which uses capsule vectors to capture finer spatial structure. LAC-UNet encodes local pixel-level information into capsules and uses capsules as the basic feature unit of U-Net, giving it a finer representation of spatial structure features. First, a convolution operation feeds the local pixel-level information into the primary capsules. Second, a local dynamic routing algorithm aggregates the primary capsules into higher-level capsules at the digit capsule layer; this routing algorithm introduces spatial and channel weights, so that capsules capture local contextual cues more effectively when encoding and integrating local information. Finally, several evaluation metrics (accuracy, Dice, etc.) are used to assess model performance. The experimental results show that LAC-UNet achieves the best segmentation results on all four datasets: DRIVE, CHASEDB1, CrackForest, and MSRC.

Key words: image semantic segmentation, capsule network, U-Net, routing algorithm, deep learning
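
The abstract outlines a two-step pipeline: convolutional features are grouped into primary capsules, which a routing algorithm then aggregates into digit capsules. As a rough illustration only, the sketch below uses the standard dynamic routing of Sabour et al. (routing by agreement with a squash nonlinearity) in NumPy. It does not reproduce the paper's local routing or its spatial and channel weights, which are not specified on this page; the shapes, the random transform `W`, and all function names are hypothetical.

```python
import numpy as np

def squash(s, axis=-1, eps=1e-8):
    """Squash nonlinearity: keeps the direction, maps the norm into [0, 1)."""
    sq_norm = np.sum(s ** 2, axis=axis, keepdims=True)
    return (sq_norm / (1.0 + sq_norm)) * s / np.sqrt(sq_norm + eps)

def to_primary_capsules(feature_map, caps_len):
    """Group the channels of an (H, W, C) feature map into capsules of length caps_len."""
    h, w, c = feature_map.shape
    assert c % caps_len == 0, "channels must be divisible by the capsule length"
    caps = feature_map.reshape(h * w * (c // caps_len), caps_len)
    return squash(caps)

def dynamic_routing(u_hat, n_iter=3):
    """Standard routing by agreement (not the paper's local variant).

    u_hat: prediction vectors with shape (n_in, n_out, out_len).
    Returns the output capsules with shape (n_out, out_len).
    """
    n_in, n_out, _ = u_hat.shape
    b = np.zeros((n_in, n_out))                      # routing logits
    for _ in range(n_iter):
        e = np.exp(b - b.max(axis=1, keepdims=True))
        c = e / e.sum(axis=1, keepdims=True)         # softmax over output capsules
        s = np.einsum("ij,ijk->jk", c, u_hat)        # weighted sum of predictions
        v = squash(s)
        b = b + np.einsum("ijk,jk->ij", u_hat, v)    # agreement update
    return v

rng = np.random.default_rng(0)
features = rng.normal(size=(4, 4, 16))               # hypothetical conv output
primary = to_primary_capsules(features, caps_len=4)  # 64 capsules of length 4
# Hypothetical learned transform: 2 output capsules of length 8.
W = rng.normal(scale=0.1, size=(primary.shape[0], 2, 4, 8))
u_hat = np.einsum("ik,ijkl->ijl", primary, W)        # shape (64, 2, 8)
digit_caps = dynamic_routing(u_hat)
print(digit_caps.shape)
```

In this sketch every primary capsule votes for every output capsule. A local scheme would presumably restrict the votes to a spatial neighborhood, but that detail is an assumption rather than something stated here.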

CLC number: TP391

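Fig. 1 U-Net network architecture

Fig. 2 Capsule network structure

Fig. 3 Primary capsule layer

Fig. 4 Aggregation of primary capsules into digit capsules

Fig. 5 LAC-UNet model structure

Fig. 6 The four semantic segmentation datasets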

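Table 1 LAC-UNet model hyperparameters

| Encoder level | Conv input channels | Conv output channels | Conv kernel | Primary capsule input channels | Primary capsule length | Digit capsule input channels | Digit capsule length |
|---|---|---|---|---|---|---|---|
| 1 | 3 | 3 | 3×3 | 4 | 4 | 2 | 8 |
| 2 | 32 | 32 | 3×3 | 8 | 4 | 4 | 8 |
| 3 | 64 | 64 | 3×3 | 16 | 4 | 8 | 8 |
| 4 | 128 | 128 | 3×3 | 32 | 4 | 16 | 8 |
| 5 | 256 | 256 | 3×3 | 64 | 4 | 32 | 8 |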
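
Table 1 can be carried around as a small configuration structure. The sketch below is one hypothetical way to encode it in Python; the class and field names are invented for illustration, and reading the capsule "input channels" columns as the number of capsule channels is an assumption. It also checks a pattern visible in the table: at every level the primary and digit capsule layers hold the same number of values (channels × length), and from level 2 onward that number equals the convolution's output channels.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class EncoderLevel:
    """One row of Table 1 (field names are hypothetical)."""
    level: int
    conv_in: int
    conv_out: int
    kernel: int            # side of the square kernel, e.g. 3 for 3x3
    primary_channels: int  # "input channels" of the primary capsule layer
    primary_len: int
    digit_channels: int    # "input channels" of the digit capsule layer
    digit_len: int

    @property
    def primary_size(self):
        # Total values per position, assuming channels x capsule length.
        return self.primary_channels * self.primary_len

    @property
    def digit_size(self):
        return self.digit_channels * self.digit_len

# Values copied from Table 1.
TABLE_1 = [
    EncoderLevel(1, 3, 3, 3, 4, 4, 2, 8),
    EncoderLevel(2, 32, 32, 3, 8, 4, 4, 8),
    EncoderLevel(3, 64, 64, 3, 16, 4, 8, 8),
    EncoderLevel(4, 128, 128, 3, 32, 4, 16, 8),
    EncoderLevel(5, 256, 256, 3, 64, 4, 32, 8),
]

for lv in TABLE_1:
    print(lv.level, lv.conv_out, lv.primary_size, lv.digit_size)
```

Level 1 is the exception to the second pattern: its capsule layers hold 16 values per position while the convolution outputs 3 channels.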

Table 2 Segmentation results on the DRIVE dataset

| Model | A | S | S | D |
|---|---|---|---|---|
| U-Net | 0.9612 | 0.8114 | 0.9741 | 0.8038 |
| U-Net++ | 0.9623 | 0.8079 | 0.9776 | 0.8080 |
| SegNet | 0.9616 | 0.8128 | 0.9764 | 0.8058 |
| UperNet | 0.9531 | 0.7807 | 0.9703 | 0.7625 |
| SGLNet | 0.9616 | 0.7323 | 0.9849 | 0.7886 |
| Caps-Unet | 0.9659 | 0.8200 | 0.9803 | 0.8111 |
| TransUnet | 0.9638 | 0.8097 | 0.9794 | 0.8156 |
| LAC-UNet | 0.9685 | 0.8236 | 0.9879 | 0.8217 |
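
For reference, pixel-wise metrics of the kind reported in these tables can be computed from a binary prediction and its ground-truth mask. The sketch below is a minimal NumPy version under stated assumptions: A and D are taken to be the accuracy and Dice named in the abstract, while the two S columns are not labeled on this page and are assumed here to be sensitivity and specificity. The function name and the toy masks are hypothetical.

```python
import numpy as np

def segmentation_metrics(pred, target):
    """Accuracy, sensitivity, specificity, and Dice for binary masks.

    Assumes both foreground and background pixels are present in `target`,
    so none of the denominators is zero.
    """
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    tp = int(np.sum(pred & target))    # foreground predicted as foreground
    tn = int(np.sum(~pred & ~target))  # background predicted as background
    fp = int(np.sum(pred & ~target))   # background predicted as foreground
    fn = int(np.sum(~pred & target))   # foreground predicted as background
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),
        "specificity": tn / (tn + fp),
        "dice": 2 * tp / (2 * tp + fp + fn),
    }

# Toy 2x4 masks: 2 true positives, 1 false positive, 1 false negative, 4 true negatives.
pred = np.array([[1, 1, 0, 0], [0, 1, 0, 0]])
target = np.array([[1, 0, 0, 0], [0, 1, 1, 0]])
metrics = segmentation_metrics(pred, target)
print(metrics)
```

On thin structures such as vessels or cracks, background pixels dominate, so accuracy and specificity stay high even when sensitivity and Dice are modest, which matches the general pattern of the numbers in these tables.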

Table 3 Segmentation results on the CHASEDB1 dataset

| Model | A | S | S | D |
|---|---|---|---|---|
| U-Net | 0.9689 | 0.8061 | 0.9816 | 0.8081 |
| U-Net++ | 0.9693 | 0.7966 | 0.9828 | 0.8088 |
| SegNet | 0.9673 | 0.7880 | 0.9813 | 0.8003 |
| UperNet | 0.9663 | 0.8132 | 0.9783 | 0.7946 |
| SGLNet | 0.9536 | 0.6735 | 0.9752 | 0.6912 |
| Caps-Unet | 0.9646 | 0.8129 | 0.9838 | 0.8156 |
| TransUnet | 0.9653 | 0.7965 | 0.9492 | 0.7686 |
| LAC-UNet | 0.9720 | 0.8208 | 0.9876 | 0.8201 |

Table 4 Segmentation results on the CrackForest dataset

| Model | A | S | S | D |
|---|---|---|---|---|
| U-Net | 0.9814 | 0.6346 | 0.9925 | 0.6140 |
| U-Net++ | 0.9777 | 0.6742 | 0.9881 | 0.5710 |
| SegNet | 0.9795 | 0.6414 | 0.9905 | 0.5857 |
| UperNet | 0.9826 | 0.6568 | 0.9933 | 0.6431 |
| SGLNet | 0.9613 | 0.5444 | 0.9686 | 0.3245 |
| Caps-Unet | 0.9894 | 0.6550 | 0.9927 | 0.5978 |
| TransUnet | 0.9426 | 0.6390 | 0.9879 | 0.6174 |
| LAC-UNet | 0.9850 | 0.6751 | 0.9936 | 0.6327 |
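
A per-metric comparison across models can be checked mechanically. The sketch below does this for the CrackForest rows of Table 4, with the values copied as they appear on this page. The helper name is hypothetical, and because the two S columns are unlabeled here, they are simply called "S (first)" and "S (second)".

```python
# Rows copied from Table 4 (CrackForest); column order as printed: A, S, S, D.
COLUMNS = ["A", "S (first)", "S (second)", "D"]
TABLE_4 = {
    "U-Net":     (0.9814, 0.6346, 0.9925, 0.6140),
    "U-Net++":   (0.9777, 0.6742, 0.9881, 0.5710),
    "SegNet":    (0.9795, 0.6414, 0.9905, 0.5857),
    "UperNet":   (0.9826, 0.6568, 0.9933, 0.6431),
    "SGLNet":    (0.9613, 0.5444, 0.9686, 0.3245),
    "Caps-Unet": (0.9894, 0.6550, 0.9927, 0.5978),
    "TransUnet": (0.9426, 0.6390, 0.9879, 0.6174),
    "LAC-UNet":  (0.9850, 0.6751, 0.9936, 0.6327),
}

def best_per_column(table, columns):
    """Return {column: (model, score)} for the highest score in each column."""
    best = {}
    for idx, name in enumerate(columns):
        model = max(table, key=lambda m: table[m][idx])
        best[name] = (model, table[model][idx])
    return best

best = best_per_column(TABLE_4, COLUMNS)
for name, (model, score) in best.items():
    print(f"{name}: {model} ({score:.4f})")
```

With the values as extracted here, LAC-UNet leads both S columns on CrackForest, while Caps-Unet has the highest A and UperNet the highest D.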

Table 5 Segmentation results on the MSRC dataset

| Model | A | S | S | D |
|---|---|---|---|---|
| U-Net | 0.9802 | 0.9036 | 0.9920 | 0.9036 |
| U-Net++ | 0.9783 | 0.8914 | 0.9909 | 0.8914 |
| SegNet | 0.9749 | 0.8694 | 0.9891 | 0.8694 |
| UperNet | 0.9773 | 0.8847 | 0.9904 | 0.8847 |
| SGLNet | 0.9380 | 0.5972 | 0.9664 | 0.5972 |
| Caps-Unet | 0.9803 | 0.8249 | 0.9897 | 0.8673 |
| TransUnet | 0.9792 | 0.8238 | 0.9856 | 0.6655 |
| LAC-UNet | 0.9826 | 0.9243 | 0.9927 | 0.9543 |

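Fig. 7 Segmentation results on the DRIVE dataset

Fig. 8 Segmentation results on the CHASEDB1 dataset

Fig. 9 Segmentation results on the CrackForest dataset

Fig. 10 Segmentation results on the MSRC dataset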
