LGNet: Local and global representation learning for fast biomedical image segmentation

Guoping Xu; Xuan Zhang; Wentao Liao; Shangbin Chen; Xinglong Wu

doi:10.1142/S1793545822430015

Abstract

Medical image segmentation plays a crucial role in clinical diagnosis and therapy systems, yet still faces many challenges. Building on convolutional neural networks (CNNs), medical image segmentation has achieved tremendous progress. However, owing to the locality of convolution operations, CNNs have the inherent limitation in learning global context. To address the limitation in building global context relationship from CNNs, we proposeLGNet, a semantic segmentation network aiming to learn local and global features for fast and accurate medical image segmentation in this paper. Specifically, we employ a two-branch architecture consisting of convolution layers in one branch to learn local features and transformer layers in the other branch to learn global features. LGNet has two key insights: (1) We bridge two-branch to learn local and global features in an interactive way; (2) we present a novel multi-feature fusion model (MSFFM) to leverage the global contexture information from transformer and the local representational features from convolutions. Our method achieves state-of-the-art trade-off in terms of accuracy and efficiency on several medical image segmentation benchmarks including Synapse, ACDC and MOST. Specifically, LGNet achieves the state-of-the-art performance with Dice’s indexes of 80.15% on Synapse, of 91.70% on ACDC, and of 95.56% on MOST. Meanwhile, the inference speed attains at 172 frames per second with

224 \times 224

input resolution. The extensive experiments demonstrate the effectiveness of the proposed LGNet for fast and accurate for medical image segmentation.