• Chinese Journal of Lasers
  • Vol. 48, Issue 11, 1110003 (2021)
Aili Wang1, Yuxiao Zhang1, Haibin Wu1,*, Kaiyuan Jiang1, and Yuji Iwahori2
Author Affiliations
  • 1Heilongjiang Province Key Laboratory of Laser Spectroscopy Technology and Application, Harbin University of Science and Technology, Harbin, Heilongjiang 150080, China
  • 2Department of Computer Science, Chubu University, Aichi 487-8501, Japan
    DOI: 10.3788/CJL202148.1110003
    Aili Wang, Yuxiao Zhang, Haibin Wu, Kaiyuan Jiang, Yuji Iwahori. LiDAR Data Classification Based on Dilated Convolution Capsule Network[J]. Chinese Journal of Lasers, 2021, 48(11): 1110003

    Abstract

    Objective LiDAR is an essential means of acquiring the physical attributes of ground objects and is widely used in remote sensing image classification. Recently, deep learning algorithms, which extract features automatically through multilayer nonlinear transformations, have become mainstream in image processing. Traditional convolutional neural networks (CNNs) process data with individual neurons as the basic unit. Each neuron can recognize only one pattern and is insensitive to the orientation and position of objects in the image. To overcome this shortcoming, the capsule network (CapsNet) has been applied to image classification. However, the capsule network is a lightweight network, and its structure captures features less effectively than a deep network. To address these two problems, this study proposes a LiDAR data classification algorithm that combines the capsule network with dilated convolution. Features extracted by a residual network with dilated convolution serve as the input of the capsule network, which strengthens the feature capture capability of the lightweight capsule network. Dilated convolution expands the receptive field of a convolutional layer without increasing the number of network parameters. The capsule network uses capsules as its data processing unit and a dynamic routing algorithm to pass data between capsule layers, which gives it better feature expression capability than a CNN. The proposed method can therefore improve the classification performance of LiDAR data.
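
    The claim that dilated convolution enlarges the receptive field without adding parameters can be illustrated with a minimal sketch. It is a one-dimensional simplification written for this summary, not the authors' implementation; the function names `dilated_conv1d` and `receptive_field`, the toy signal, and the dilation rates are all assumptions.

```python
def dilated_conv1d(signal, kernel, dilation=1):
    """Valid (unpadded) 1-D cross-correlation with a dilated kernel."""
    # Number of input positions spanned by one output value.
    span = (len(kernel) - 1) * dilation + 1
    out = []
    for start in range(len(signal) - span + 1):
        acc = 0.0
        for k, weight in enumerate(kernel):
            # Kernel taps are spaced `dilation` samples apart.
            acc += weight * signal[start + k * dilation]
        out.append(acc)
    return out


def receptive_field(kernel_size, dilations):
    """Receptive field of stacked stride-1 convolutions with the given rates."""
    field = 1
    for rate in dilations:
        field += (kernel_size - 1) * rate
    return field


signal = [float(i) for i in range(10)]  # toy input, not LiDAR data
kernel = [1.0, 1.0, 1.0]                # three weights in both cases

dense = dilated_conv1d(signal, kernel, dilation=1)
dilated = dilated_conv1d(signal, kernel, dilation=2)
print(dense)
print(dilated)
print(receptive_field(3, [1, 1]), receptive_field(3, [1, 2]))
```

    Both calls use the same three kernel weights; only the spacing between the taps changes, so each dilated output draws on a wider stretch of the input.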

    Methods A LiDAR data classification algorithm that combines the capsule network and dilated convolution is proposed. First, the LiDAR data are fed into the dilated convolution capsule network (DCCN) model. The first three layers of the residual network perform coarse feature extraction on the LiDAR data, and the last two layers replace traditional convolution with dilated convolution to obtain detailed features of the LiDAR images at different scales. The dilation rates are chosen with an odd-even mixed scheme to avoid the gridding effect. Because the capsule network converts each neuron of a traditional neural network from a scalar to a vector, it models the internal structure of the image with semantic relationships while retaining the spatial information of the image. The features extracted by the residual network with dilated convolution are used as the input of the capsule network. The convolutional layer of the capsule network captures data features, and batch normalization is applied in this layer to mitigate poor initialization and ensure that gradients propagate to every layer. The output feature map is then reshaped into several primary capsules. The coupling coefficients are obtained through dynamic routing, which completes the conversion from the primary capsule layer to the digit capsule layer. Finally, the classification result for each ground object class is obtained from the length of each vector in the digit capsule layer.
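
    The routing step can be sketched as follows. This is a hedged illustration that assumes the standard routing-by-agreement procedure from the original capsule network literature; the abstract does not state which variant or how many iterations the authors use. The names `squash`, `softmax`, and `dynamic_routing`, the three iterations, and the toy prediction vectors are assumptions.

```python
import math


def squash(vec):
    """Shrink a vector so its length lies in [0, 1) while keeping its direction."""
    sq_norm = sum(x * x for x in vec)
    if sq_norm == 0.0:
        return [0.0 for _ in vec]
    scale = sq_norm / (1.0 + sq_norm) / math.sqrt(sq_norm)
    return [scale * x for x in vec]


def softmax(values):
    peak = max(values)
    exps = [math.exp(v - peak) for v in values]
    total = sum(exps)
    return [e / total for e in exps]


def dynamic_routing(predictions, iterations=3):
    """predictions[i][j] is the vector that input capsule i predicts for output capsule j."""
    n_in, n_out = len(predictions), len(predictions[0])
    dim = len(predictions[0][0])
    logits = [[0.0] * n_out for _ in range(n_in)]
    for _ in range(iterations):
        # Coupling coefficients: each input capsule spreads a total weight of 1 over the outputs.
        coupling = [softmax(row) for row in logits]
        outputs = []
        for j in range(n_out):
            weighted = [
                sum(coupling[i][j] * predictions[i][j][d] for i in range(n_in))
                for d in range(dim)
            ]
            outputs.append(squash(weighted))
        # Increase the logit where a prediction agrees (dot product) with the output capsule.
        for i in range(n_in):
            for j in range(n_out):
                logits[i][j] += sum(p * v for p, v in zip(predictions[i][j], outputs[j]))
    return outputs, coupling


# Toy example: three input capsules, two output classes, 2-D vectors (made-up values).
predictions = [
    [[2.0, 0.0], [0.1, 0.0]],
    [[1.8, 0.2], [0.0, 0.1]],
    [[2.1, -0.1], [-0.1, 0.0]],
]
outputs, coupling = dynamic_routing(predictions)
lengths = [math.sqrt(sum(x * x for x in vec)) for vec in outputs]
predicted_class = lengths.index(max(lengths))
print(lengths, predicted_class)
```

    The class whose output vector is longest is taken as the prediction, which mirrors the final step described above.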

    Results and Discussions The experiments use the Bayview Park and Recology public datasets from the 2012 IEEE International Remote Sensing Image Fusion Competition, both collected in San Francisco, USA. Classification results are evaluated with the overall classification accuracy (OA), average classification accuracy (AA), and Kappa coefficient, which are commonly used in remote sensing image classification. To assess the time efficiency of the capsule network, the training and testing times of the model are also used as evaluation indicators. First, the proposed DCCN model is compared with seven other classification algorithms using 400, 500, 600, and 700 training samples. The evaluation indicators of the proposed algorithm are higher than those of the other classification algorithms (Table 2, Table 3). With 700 training samples, the per-category classification accuracies on the two datasets are then compared. Comparing Dilated-ResNet with DCCN category by category shows that the accuracy for ground object categories with lower height improves significantly, which demonstrates the sensitivity of the capsule network to spatial feature information (Table 4, Table 5). When the capsule network alone is used to classify the LiDAR data, the overall classification accuracy is low, but the training times for the two datasets are only 99.58 s and 94.27 s, which saves computational cost. The proposed model feeds features extracted by the residual network with dilated convolution into the capsule network. Although this increases the training time, the cost is acceptable because the classification accuracy improves significantly (Table 6). Finally, the results of all classification algorithms are shown as pseudo-color maps. The DCCN classification boundaries are smoother, fewer labeled and background pixels are misclassified, and the maps are closer to the real distribution of surface objects (Fig. 10, Fig. 11).
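
    For readers unfamiliar with the three accuracy indicators, the sketch below computes OA, AA, and the Kappa coefficient from a confusion matrix using their usual definitions. The function name `classification_metrics` and the confusion matrix are illustrative assumptions; the counts are invented and are not results from the paper.

```python
def classification_metrics(confusion):
    """confusion[t][p] counts samples of true class t that were predicted as class p."""
    n_classes = len(confusion)
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(n_classes))

    # Overall accuracy: fraction of all samples classified correctly.
    oa = correct / total

    # Average accuracy: mean of the per-class accuracies (recalls).
    per_class = [confusion[k][k] / sum(confusion[k]) for k in range(n_classes)]
    aa = sum(per_class) / n_classes

    # Kappa: agreement corrected for the agreement expected by chance.
    col_sums = [sum(confusion[t][p] for t in range(n_classes)) for p in range(n_classes)]
    expected = sum(sum(confusion[k]) * col_sums[k] for k in range(n_classes)) / (total * total)
    kappa = (oa - expected) / (1.0 - expected)
    return oa, aa, kappa


# Hypothetical 3-class confusion matrix (100 samples in total).
confusion = [
    [50, 0, 0],
    [5, 20, 5],
    [0, 2, 18],
]
oa, aa, kappa = classification_metrics(confusion)
print(f"OA={oa:.4f} AA={aa:.4f} Kappa={kappa:.4f}")
```

    OA and AA differ here because the classes are unbalanced: the largest class is classified perfectly and pulls OA up, while AA weights every class equally.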

    Conclusions This study proposes the DCCN model for ground object classification of LiDAR data, which introduces dilated convolution into ResNet. The dilation rates are assigned with an odd-even mixed scheme to suppress the gridding effect, and the capsule network is used to extract more detailed spatial feature information. Experiments on two representative LiDAR datasets compare the model with seven typical classification algorithms. The results show that the proposed network achieves higher classification accuracy. With 700 training samples, the OA on the Bayview Park and Recology datasets reaches 97.07% and 96.98%, respectively.
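
    The gridding effect mentioned above can be made concrete with a small one-dimensional sketch. It lists which input positions contribute to one output after stacking dilated 3-tap kernels. The helper names `covered_offsets` and `has_gaps` and the rate sequences [2, 2] and [1, 2] are assumptions chosen for illustration; the abstract does not give the actual dilation rates used in DCCN.

```python
def covered_offsets(dilations, kernel_size=3):
    """Input offsets (relative to one output position) reached by stacked 1-D dilated kernels."""
    half = kernel_size // 2
    offsets = {0}
    for rate in dilations:
        taps = [rate * k for k in range(-half, half + 1)]
        offsets = {o + t for o in offsets for t in taps}
    return sorted(offsets)


def has_gaps(offsets):
    """True if some position inside the receptive field is never sampled."""
    return len(offsets) < offsets[-1] - offsets[0] + 1


same_rates = covered_offsets([2, 2])   # every layer uses the same even rate
mixed_rates = covered_offsets([1, 2])  # odd and even rates mixed (hypothetical choice)

print(same_rates, has_gaps(same_rates))
print(mixed_rates, has_gaps(mixed_rates))
```

    With identical even rates only every second input position is ever sampled, which is the checkerboard-like pattern known as gridding. Mixing an odd rate with an even one fills in the skipped positions in this toy case.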
