• Opto-Electronic Engineering
  • Vol. 50, Issue 6, 230009 (2023)
Wen Cheng1,2,3, Zhongbi Chen2,*, Qingqing Li2, Meihui Li2, Jianlin Zhang2, and Yuxing Wei2
Author Affiliations
  • 1National Key Laboratory of Optical Field Manipulation Science and Technology, Chinese Academy of Sciences, Chengdu, Sichuan 610209, China
  • 2Institute of Optics and Electronics, Chinese Academy of Sciences, Chengdu, Sichuan 610209, China
  • 3School of Electronic, Electrical and Communication Engineering, University of Chinese Academy of Sciences, Beijing 100049, China
    DOI: 10.12086/oee.2023.230009
    Citation: Wen Cheng, Zhongbi Chen, Qingqing Li, Meihui Li, Jianlin Zhang, Yuxing Wei. Multiple object tracking with aligned spatial-temporal feature[J]. Opto-Electronic Engineering, 2023, 50(6): 230009

    Abstract

    Multiple object tracking (MOT) is an important task in computer vision. Most MOT methods focus on improving object detection and data association and usually ignore the correlation between different frames. Because they do not exploit the temporal information in the video, their tracking performance degrades significantly under motion blur, occlusion, and small-target scenes. To address these problems, this paper proposes a multiple object tracking method with aligned spatial-temporal features. First, a convolutional gated recurrent unit (ConvGRU) is introduced to encode the spatial-temporal information of objects in the video; by considering the whole sequence of historical frames, this structure effectively extracts spatial-temporal information to enhance the feature representation. Then, a feature alignment module is designed to ensure temporal consistency between the historical-frame information and the current-frame information, reducing the false detection rate. Finally, the method is evaluated on the MOT17 and MOT20 datasets, achieving multiple object tracking accuracy (MOTA) values of 74.2 and 67.4, improvements of 0.5 and 5.6 over the baseline FairMOT method, and identification F1 score (IDF1) values of 73.9 and 70.6, improvements of 1.6 and 3.3 over FairMOT. In addition, qualitative and quantitative experimental results show that the overall tracking performance of this method is better than that of most current state-of-the-art methods.
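    To illustrate the kind of spatial-temporal encoding the abstract describes, the following is a minimal ConvGRU cell sketch in PyTorch. It is only an assumed, generic implementation of a convolutional GRU folding per-frame feature maps into a temporally enhanced feature map; the class name, channel sizes, and kernel size are illustrative assumptions, not the authors' released code, and the paper's feature alignment module is not reproduced here.

    ```python
    # Sketch of a convolutional GRU cell (assumed generic form, not the paper's code).
    import torch
    import torch.nn as nn

    class ConvGRUCell(nn.Module):
        """GRU gates computed with 2D convolutions so the hidden state keeps
        its spatial layout, which suits per-frame detector feature maps."""
        def __init__(self, in_ch, hid_ch, kernel_size=3):
            super().__init__()
            pad = kernel_size // 2
            # update (z) and reset (r) gates computed jointly from [x, h]
            self.gates = nn.Conv2d(in_ch + hid_ch, 2 * hid_ch, kernel_size, padding=pad)
            # candidate hidden state computed from [x, r * h]
            self.cand = nn.Conv2d(in_ch + hid_ch, hid_ch, kernel_size, padding=pad)
            self.hid_ch = hid_ch

        def forward(self, x, h=None):
            # x: (B, C_in, H, W) feature map of the current frame
            if h is None:
                h = x.new_zeros(x.size(0), self.hid_ch, x.size(2), x.size(3))
            z, r = torch.sigmoid(self.gates(torch.cat([x, h], dim=1))).chunk(2, dim=1)
            h_tilde = torch.tanh(self.cand(torch.cat([x, r * h], dim=1)))
            return (1 - z) * h + z * h_tilde  # temporally fused feature map

    # Usage: fold feature maps of consecutive frames into one enhanced map.
    if __name__ == "__main__":
        cell = ConvGRUCell(in_ch=64, hid_ch=64)
        frames = [torch.randn(1, 64, 76, 136) for _ in range(4)]  # toy sequence
        h = None
        for f in frames:
            h = cell(f, h)
        print(h.shape)  # torch.Size([1, 64, 76, 136])
    ```

    In a tracker built on FairMOT-style detection, such a recurrent hidden state would carry information from historical frames forward, which is the role the abstract attributes to the ConvGRU before the aligned features are passed to detection and re-identification heads.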