Journal of Shandong University (Natural Science), 2023, Vol. 58, Issue 11: 116-126. doi: 10.6040/j.issn.1671-9352.0.2022.488


LAC-UNet: semantic segmentation model based on capsules for representing part-whole hierarchical features

Chengcheng ZHONG1, Heng ZHOU2,*, Zitong ZHANG3, Chunlei ZHANG4

  1. School of Science, China University of Geosciences, Beijing 100083, China
  2. College of Information and Electrical Engineering, China Agricultural University, Beijing 100083, China
  3. School of Earth Sciences and Resources, China University of Geosciences, Beijing 100083, China
  4. Beijing Zhongdi Runde Petroleum Technology Co., Ltd., Beijing 100083, China
  • Received: 2022-09-09  Online: 2023-11-20  Published: 2023-11-07
  • Contact: Heng ZHOU  E-mail: 1129109156@qq.com; xys_zh@163.com

Abstract:

To address the limited ability of the native U-Net to express spatial structure features, a capsule structure is introduced into the U-Net semantic segmentation model, and the LAC-UNet model is proposed to capture finer spatial structure through capsule vectors. LAC-UNet encodes local pixel-level information into capsules and uses capsules as the basic feature unit of U-Net, which gives the network a finer capability for expressing spatial structure features. First, local pixel-level information is fed into the primary capsules by a convolution operation. The primary capsules are then aggregated into higher-level capsules at the digital capsule layer by a local dynamic routing algorithm. This routing algorithm introduces spatial and channel weights, which strengthen the capsules' ability to capture local contextual cues while they encode and integrate local information. Finally, evaluation metrics such as accuracy and the Dice coefficient are used to assess model performance. The experimental results show that LAC-UNet achieves the best results on all four semantic segmentation datasets: DRIVE, CHASEDB1, CrackForest, and MSRC.
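The aggregation step described above can be illustrated with the standard routing-by-agreement scheme for capsule networks (Sabour, Frosst and Hinton, 2017). The sketch below is not the local dynamic routing algorithm of LAC-UNet: the abstract does not specify how the spatial and channel weights are computed, so they are left out. The function names, the nested-list layout of `u_hat`, and the default of three iterations are assumptions made for illustration only.

```python
import math


def squash(s, eps=1e-9):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    norm_sq = sum(x * x for x in s)
    scale = norm_sq / (1.0 + norm_sq) / math.sqrt(norm_sq + eps)
    return [scale * x for x in s]


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(u_hat, num_iters=3):
    """Generic routing-by-agreement (assumed layout, not the paper's local variant).

    u_hat[i][j] is the prediction vector that lower-level capsule i makes
    for higher-level capsule j. Returns (v, c): the higher-level capsule
    vectors and the coupling coefficients used to compute them.
    """
    n_in, n_out, dim = len(u_hat), len(u_hat[0]), len(u_hat[0][0])
    b = [[0.0] * n_out for _ in range(n_in)]  # routing logits, start uniform
    for _ in range(num_iters):
        # Each lower capsule distributes its vote across the higher capsules.
        c = [softmax(row) for row in b]
        v = []
        for j in range(n_out):
            s_j = [sum(c[i][j] * u_hat[i][j][d] for i in range(n_in))
                   for d in range(dim)]
            v.append(squash(s_j))
        # Raise the logit where a prediction agrees with the capsule output.
        for i in range(n_in):
            for j in range(n_out):
                b[i][j] += sum(u_hat[i][j][d] * v[j][d] for d in range(dim))
    return v, c


# Three primary capsules voting for two higher-level capsules (toy numbers).
u_hat = [
    [[1.0, 0.0], [0.0, 0.1]],
    [[0.9, 0.1], [0.1, 0.0]],
    [[1.0, 0.1], [0.0, 0.2]],
]
v, c = dynamic_routing(u_hat)
```

In this toy input the three primary capsules make strong, similar predictions for the first higher-level capsule, so routing shifts their coupling coefficients toward it. A local variant would restrict the sum over `i` to a spatial neighbourhood and reweight the votes, which is the part the abstract attributes to the spatial and channel weights.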

Key words: image semantic segmentation, capsule network, U-Net, routing algorithm, deep learning

CLC Number: TP391

Fig. 1 Network framework of U-Net

Fig. 2 Structure of capsule network

Fig. 3 Primary capsule

Fig. 4 Aggregation process from primary capsules to digital capsules

Fig. 5 Structure of LAC-UNet

Fig. 6 The four semantic segmentation datasets

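Table 1 LAC-UNet model hyperparameters

| Encoder layer | Conv. input channels | Conv. output channels | Conv. kernel | Primary capsule input channels | Primary capsule length | Digital capsule input channels | Digital capsule length |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | 3 | 3 | 3×3 | 4 | 4 | 2 | 8 |
| 2 | 32 | 32 | 3×3 | 8 | 4 | 4 | 8 |
| 3 | 64 | 64 | 3×3 | 16 | 4 | 8 | 8 |
| 4 | 128 | 128 | 3×3 | 32 | 4 | 16 | 8 |
| 5 | 256 | 256 | 3×3 | 64 | 4 | 32 | 8 |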

Table 2 Segmentation results on the DRIVE dataset

| Model | A | S | S | D |
| --- | --- | --- | --- | --- |
| U-Net | 0.9612 | 0.8114 | 0.9741 | 0.8038 |
| U-Net++ | 0.9623 | 0.8079 | 0.9776 | 0.8080 |
| SegNet | 0.9616 | 0.8128 | 0.9764 | 0.8058 |
| UperNet | 0.9531 | 0.7807 | 0.9703 | 0.7625 |
| SGLNet | 0.9616 | 0.7323 | 0.9849 | 0.7886 |
| Caps-Unet | 0.9659 | 0.8200 | 0.9803 | 0.8111 |
| TransUnet | 0.9638 | 0.8097 | 0.9794 | 0.8156 |
| LAC-UNet | 0.9685 | 0.8236 | 0.9879 | 0.8217 |

Table 3 Segmentation results on the CHASEDB1 dataset

| Model | A | S | S | D |
| --- | --- | --- | --- | --- |
| U-Net | 0.9689 | 0.8061 | 0.9816 | 0.8081 |
| U-Net++ | 0.9693 | 0.7966 | 0.9828 | 0.8088 |
| SegNet | 0.9673 | 0.7880 | 0.9813 | 0.8003 |
| UperNet | 0.9663 | 0.8132 | 0.9783 | 0.7946 |
| SGLNet | 0.9536 | 0.6735 | 0.9752 | 0.6912 |
| Caps-Unet | 0.9646 | 0.8129 | 0.9838 | 0.8156 |
| TransUnet | 0.9653 | 0.7965 | 0.9492 | 0.7686 |
| LAC-UNet | 0.9720 | 0.8208 | 0.9876 | 0.8201 |

Table 4 Segmentation results on the CrackForest dataset

| Model | A | S | S | D |
| --- | --- | --- | --- | --- |
| U-Net | 0.9814 | 0.6346 | 0.9925 | 0.6140 |
| U-Net++ | 0.9777 | 0.6742 | 0.9881 | 0.5710 |
| SegNet | 0.9795 | 0.6414 | 0.9905 | 0.5857 |
| UperNet | 0.9826 | 0.6568 | 0.9933 | 0.6431 |
| SGLNet | 0.9613 | 0.5444 | 0.9686 | 0.3245 |
| Caps-Unet | 0.9894 | 0.6550 | 0.9927 | 0.5978 |
| TransUnet | 0.9426 | 0.6390 | 0.9879 | 0.6174 |
| LAC-UNet | 0.9850 | 0.6751 | 0.9936 | 0.6327 |

Table 5 Segmentation results on the MSRC dataset

| Model | A | S | S | D |
| --- | --- | --- | --- | --- |
| U-Net | 0.9802 | 0.9036 | 0.9920 | 0.9036 |
| U-Net++ | 0.9783 | 0.8914 | 0.9909 | 0.8914 |
| SegNet | 0.9749 | 0.8694 | 0.9891 | 0.8694 |
| UperNet | 0.9773 | 0.8847 | 0.9904 | 0.8847 |
| SGLNet | 0.9380 | 0.5972 | 0.9664 | 0.5972 |
| Caps-Unet | 0.9803 | 0.8249 | 0.9897 | 0.8673 |
| TransUnet | 0.9792 | 0.8238 | 0.9856 | 0.6655 |
| LAC-UNet | 0.9826 | 0.9243 | 0.9927 | 0.9543 |
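The abstract names accuracy and Dice as evaluation metrics, which presumably correspond to the A and D columns of the tables above. As a rough illustration of how such scores are usually obtained from a predicted mask and a ground-truth mask, here is a minimal sketch. It assumes flat sequences of binary (0/1) pixel labels; the function names are hypothetical, and the paper's exact evaluation protocol (thresholding, field-of-view masking, multi-class handling) is not stated on this page.

```python
def confusion_counts(pred, target):
    """Count (TP, FP, TN, FN) for two equal-length sequences of 0/1 labels."""
    if len(pred) != len(target):
        raise ValueError("pred and target must have the same length")
    tp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 1)
    fp = sum(1 for p, t in zip(pred, target) if p == 1 and t == 0)
    tn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 0)
    fn = sum(1 for p, t in zip(pred, target) if p == 0 and t == 1)
    return tp, fp, tn, fn


def accuracy(pred, target):
    """Fraction of pixels whose predicted label matches the ground truth."""
    tp, fp, tn, fn = confusion_counts(pred, target)
    return (tp + tn) / (tp + fp + tn + fn)


def dice(pred, target):
    """Dice coefficient 2*TP / (2*TP + FP + FN); defined as 1.0 for two empty masks."""
    tp, fp, _, fn = confusion_counts(pred, target)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 1.0


# Toy example: six pixels, flattened.
pred = [1, 1, 0, 0, 1, 0]
target = [1, 0, 0, 0, 1, 1]
acc = accuracy(pred, target)
d = dice(pred, target)
```

Accuracy counts background pixels as successes, so it stays high on datasets where the foreground is thin (vessels, cracks). Dice ignores true negatives, which may be why the D column varies far more across models than the A column.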

Fig. 7 Segmentation results on the DRIVE dataset

Fig. 8 Segmentation results on the CHASEDB1 dataset

Fig. 9 Segmentation results on the CrackForest dataset

Fig. 10 Segmentation results on the MSRC dataset
