• Acta Photonica Sinica
  • Vol. 52, Issue 1, 0110001 (2023)
Qinglin AI1,*, Junrui ZHANG1, and Feiqing WU2
Author Affiliations
  • 1Key Laboratory of Special Purpose Equipment and Advanced Manufacturing Technology, Ministry of Education & Zhejiang Province, Zhejiang University of Technology, Hangzhou 310023, China
  • 2School of Information Science and Engineering, Ningbotech University, Ningbo 315100, China
    DOI: 10.3788/gzxb20235201.0110001
    Qinglin AI, Junrui ZHANG, Feiqing WU. AF-ICNet Semantic Segmentation Method for Unstructured Scenes Based on Small Target Category Attention Mechanism and Feature Fusion[J]. Acta Photonica Sinica, 2023, 52(1): 0110001

    Abstract

    Unstructured road scenes are common in real-world driving, large-scale engineering operations, and robot field work. Compared with structured roads, unstructured roads show little color difference between the road and its surroundings, a wide variety of target categories, and complex scene information. Traditional methods for detecting unstructured road areas suffer from low detection accuracy, poor real-time performance, and poor detection of small targets. Small obstacles and pedestrians seriously interfere with the detection of drivable road areas. Small targets occupy few pixels and carry little information in the image, which weakens their feature representation. The dominance of large categories also easily causes small target categories to be neglected. To address these problems, we construct AF-ICNet, a lightweight real-time semantic segmentation network based on a small-target-category attention mechanism and feature fusion. First, the pyramid pooling module in ICNet is replaced by atrous spatial pyramid pooling (ASPP), which combines receptive fields of different scales, reduces the effect of pooling, and enhances the network's ability to perceive global image context. On this basis, we embed a coordinate attention (CA) mechanism in the improved ICNet model; it encodes both channel information and spatial location information to enhance the network's ability to extract semantic features of small target categories in unstructured roads. This way of fusing channel and spatial attention differs from both SE-Net and CBAM. Finally, to address the imbalanced category distribution in unstructured road scenes, we design a Weighted Cross-Entropy loss function that increases the network's attention to small target categories. The branch weights improve the network's attention to images of different resolutions.
The category weights improve the network's attention to small target categories. To validate the hyperparameters, we carry out a parameter sensitivity analysis and determine the value range of the parameters. Based on the above improvements, we design the AF-ICNet semantic segmentation network. To verify the improvement, we train the AF-ICNet model on the Cityscapes and IDD datasets. On Cityscapes, where 19 categories are segmented, AF-ICNet reaches a final MIoU of 71.5% and a final PixAcc of 81.3%. On IDD, where 26 categories are segmented, AF-ICNet reaches a final MIoU of 62.5% and a final PixAcc of 89.8%. To verify the effectiveness of each improvement, we perform ablation experiments in four groups: the original network, the network with Weighted Cross-Entropy, the network with ASPP and Weighted Cross-Entropy, and AF-ICNet with the CA attention mechanism. The results show that each improvement in AF-ICNet raises segmentation accuracy while preserving the real-time performance of the network. To further verify the method in practical application, we build an experimental test system for a field test and use the model trained on the IDD dataset. In the real-scene test, AF-ICNet effectively segments the road area and the surrounding objects, and for manually placed small target objects its segmentation edges are more accurate. In terms of speed, AF-ICNet achieves 17 FPS on a 1280×960 image and 31 FPS on a 1280×720 image, which meets the real-time requirements of road segmentation.
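The MIoU and PixAcc figures reported here can be read against a small worked example. The sketch below assumes the standard confusion-matrix definitions of pixel accuracy and mean intersection-over-union; the abstract does not spell out the exact formulas, and the function names and toy labels are illustrative only.

```python
# Minimal sketch of PixAcc and MIoU from a confusion matrix.
# Assumption: standard definitions; not taken from the paper's code.

def confusion_matrix(pred, truth, num_classes):
    """cm[t][p] counts pixels with true class t predicted as class p."""
    cm = [[0] * num_classes for _ in range(num_classes)]
    for p, t in zip(pred, truth):
        cm[t][p] += 1
    return cm

def pixel_accuracy(cm):
    """Fraction of all pixels whose predicted class is correct."""
    correct = sum(cm[i][i] for i in range(len(cm)))
    total = sum(sum(row) for row in cm)
    return correct / total

def per_class_iou(cm):
    """IoU = TP / (TP + FP + FN) per class; None if the class never occurs."""
    n = len(cm)
    ious = []
    for c in range(n):
        tp = cm[c][c]
        fn = sum(cm[c]) - tp
        fp = sum(cm[r][c] for r in range(n)) - tp
        denom = tp + fp + fn
        ious.append(tp / denom if denom else None)
    return ious

def mean_iou(cm):
    """Average IoU over the classes that occur in truth or prediction."""
    valid = [v for v in per_class_iou(cm) if v is not None]
    return sum(valid) / len(valid)

# Toy flattened image: class 0 = road (large), class 1 = pedestrian (small).
truth = [0, 0, 0, 0, 0, 0, 0, 0, 1, 1]
pred = [0, 0, 0, 0, 0, 0, 0, 0, 0, 1]  # one pedestrian pixel missed

cm = confusion_matrix(pred, truth, num_classes=2)
print("PixAcc:", pixel_accuracy(cm))
print("per-class IoU:", per_class_iou(cm))
print("MIoU:", mean_iou(cm))
```

In this toy case a single missed pixel of the small class leaves pixel accuracy at 0.9 while that class's IoU drops to 0.5, which illustrates why a per-class metric such as MIoU is more sensitive to small target categories than overall pixel accuracy.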
The test results show that AF-ICNet improves segmentation quality: the segmentation accuracy of small target categories increases while real-time performance is maintained.
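The Weighted Cross-Entropy idea in the abstract, with one weight per category and one weight per network branch, can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the abstract gives neither the exact formula nor the weight values, so the per-pixel averaging, the function names, and the numbers in `class_weights` and `branch_weights` are all hypothetical.

```python
# Minimal sketch of a class- and branch-weighted cross-entropy loss.
# Assumption: loss = sum over branches of
#   branch_weight * mean over pixels of (-class_weight[y] * log p[y]).
import math

def softmax(logits):
    """Numerically stable softmax over one pixel's class scores."""
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def weighted_cross_entropy(pixel_logits, labels, class_weights):
    """Mean over pixels of -w[y] * log(softmax(logits)[y])."""
    total = 0.0
    for logits, y in zip(pixel_logits, labels):
        total += -class_weights[y] * math.log(softmax(logits)[y])
    return total / len(labels)

def multi_branch_loss(branch_logits, branch_labels, class_weights, branch_weights):
    """Weighted sum of the per-branch losses (one branch per resolution)."""
    return sum(
        lam * weighted_cross_entropy(logits, labels, class_weights)
        for lam, logits, labels in zip(branch_weights, branch_logits, branch_labels)
    )

# Hypothetical setup: class 0 = road, class 1 = pedestrian (small target).
class_weights = [1.0, 4.0]        # illustrative: up-weight the rare class
branch_weights = [0.4, 0.4, 1.0]  # illustrative: low, medium, high resolution

# Two pixels per branch; the second pixel is a pedestrian scored as road.
logits = [[2.0, 0.0], [2.0, 0.0]]
labels = [0, 1]
branch_logits = [logits, logits, logits]
branch_labels = [labels, labels, labels]

weighted = multi_branch_loss(branch_logits, branch_labels, class_weights, branch_weights)
unweighted = multi_branch_loss(branch_logits, branch_labels, [1.0, 1.0], branch_weights)
print("weighted loss:", weighted)
print("unweighted loss:", unweighted)
```

With the larger weight on the small class, the misclassified pedestrian pixel contributes more to the total loss than it would under plain cross-entropy, which is the mechanism the abstract describes for drawing the network's attention to small target categories.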