• Acta Photonica Sinica
  • Vol. 52, Issue 12, 1210003 (2023)
Mengyu SUN1, Peng WANG2, Junqi XU1,*, Xiaoyan LI2, Hui GAO2, and Ruohai DI2
Author Affiliations
  • 1School of Opto-electronical Engineering, Xi'an Technological University, Xi'an 710021, China
  • 2Electronic Information Engineering, Xi'an Technological University, Xi'an 710021, China
    DOI: 10.3788/gzxb20235212.1210003
    Mengyu SUN, Peng WANG, Junqi XU, Xiaoyan LI, Hui GAO, Ruohai DI. Adaptive Information Selection for Infrared Object Tracking with Variable Scale Correlation Filter[J]. Acta Photonica Sinica, 2023, 52(12): 1210003

    Abstract

    Infrared images carry relatively little information and also suffer from blur and noise, which makes infrared visual target tracking difficult. The discriminative correlation filter is a reliable tracking method that can be trained online on an embedded hardware platform, but it does not handle changes in object scale efficiently. To address this problem, we propose an infrared object tracking algorithm that combines adaptive information selection with a variable-scale correlation filter. The proposed algorithm has three parts: a feature extractor, a position filter, and a scale filter. The feature extractor produces three features to represent the object. The insensitive feature and the histogram of gradients are chosen as the base features. The insensitive feature represents the intensity of the object at its current position in the infrared image. The histogram of gradients, which carries more information about the target's shape, captures regions of abrupt change in the infrared image, such as the target's edges and corners. To enrich the object's representation, we compute a third feature, a new histogram of gradients derived from the insensitive feature, which gives a further representation of the object. The features of the sample frame are then passed to the position filter stage, where we train three position filters, one for each feature type. When training the position filters, we add temporal regularization, spatial regularization, and spatial information selection to the optimization. Temporal regularization keeps the correlation filter close to the previously learned coefficients, which helps the filter adapt to variations of the object. Spatial regularization constrains the correlation filter to concentrate on the object region during training.
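The abstract does not state the training objective itself. As a hedged illustration, one common way to combine a data term with spatial and temporal regularization is the form used by spatial-temporal regularized correlation filters. All symbols below are assumptions chosen for illustration, not the authors' notation, and the spatial information selection step is not included.

```latex
\min_{f}\;
\frac{1}{2}\Bigl\| \sum_{d=1}^{D} x_t^{d} * f^{d} - y \Bigr\|^{2}
+ \frac{1}{2}\sum_{d=1}^{D} \bigl\| w \odot f^{d} \bigr\|^{2}
+ \frac{\mu}{2}\,\bigl\| f - f_{t-1} \bigr\|^{2}
```

Here $x_t^{d}$ is channel $d$ of the feature extracted at frame $t$, $f^{d}$ is the matching filter channel, $y$ is the desired response, $w$ is a spatial weight map that penalizes coefficients far from the object, and $\mu$ controls how strongly the filter $f$ is held close to the previous filter $f_{t-1}$.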
To reduce redundancy in the object's information and speed up training, we set different thresholds for the filter coefficients: coefficients below the threshold are set to zero, which reduces the number of nonzero coefficients. The spatial coefficients of each channel are treated individually. Each position filter is then convolved with the corresponding feature of the current frame, giving three response maps. The response maps are weighted and summed, and the position of the maximum value is taken as the estimated object position. The feature weights are hyperparameters. The new feature is given a smaller weight than the two base features because of its larger receptive field. To better represent changes in the object's scale, we extract variable-scale samples at the estimated object position; the height-to-width ratio differs between samples. We flatten the features of all variable-scale samples and concatenate them along the spatial dimension to form the scale training features. Convolving with the learned scale filter coefficients yields the corresponding scale factor, which determines the boundary ratio and hence the object's boundary. We evaluate on the LSOTB-TIR and PTB-TIR datasets. The proposed algorithm runs at 34.85 frames per second on a central processing unit (CPU) platform. Its accuracy and success rate reach 71.3% and 59.4% on LSOTB-TIR, and 80.4% and 61.1% on PTB-TIR, respectively. An analysis of each component shows that spatial selection contributes more to precision than the other components. Visualized results show that the method tracks the object consistently and adapts to changes in the boundary ratio.
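The coefficient thresholding and response-map fusion steps described in the abstract can be sketched roughly as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: the function names, the toy 2x2 maps, the thresholds, and the weights are all hypothetical, and thresholding is assumed to act on coefficient magnitude.

```python
# Illustrative sketch only: names, thresholds, and weights are assumptions,
# not values or code from the paper.


def select_coefficients(channel, threshold):
    """Zero out coefficients whose magnitude is below the given threshold."""
    return [[c if abs(c) >= threshold else 0.0 for c in row] for row in channel]


def fuse_responses(responses, weights):
    """Return the element-wise weighted sum of equally sized 2-D response maps."""
    rows, cols = len(responses[0]), len(responses[0][0])
    fused = [[0.0] * cols for _ in range(rows)]
    for response, weight in zip(responses, weights):
        for r in range(rows):
            for c in range(cols):
                fused[r][c] += weight * response[r][c]
    return fused


def peak_position(response):
    """Return (row, col) of the maximum value in a 2-D response map."""
    return max(
        ((r, c) for r in range(len(response)) for c in range(len(response[0]))),
        key=lambda rc: response[rc[0]][rc[1]],
    )


# Each filter channel is treated individually, with its own (hypothetical) threshold.
filter_channels = [
    [[0.90, 0.05], [-0.02, 0.40]],
    [[0.01, -0.70], [0.30, 0.00]],
]
thresholds = [0.10, 0.20]
sparse_channels = [
    select_coefficients(ch, th) for ch, th in zip(filter_channels, thresholds)
]

# Three toy response maps, one per feature. The third (new) feature gets a
# smaller weight than the two base features.
responses = [
    [[0.1, 0.8], [0.2, 0.3]],
    [[0.2, 0.6], [0.1, 0.4]],
    [[0.9, 0.1], [0.1, 0.1]],
]
weights = [0.4, 0.4, 0.2]
estimated_position = peak_position(fuse_responses(responses, weights))
```

In this toy case the fused peak lands at row 0, column 1: the two base features agree on that location and outweigh the strong but down-weighted response of the third feature at row 0, column 0.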