• Acta Optica Sinica
  • Vol. 42, Issue 15, 1515001 (2022)
Zishuo Zhang1,2, Yong Song1,2,*, Xin Yang1,2, Yufei Zhao1,2, and Ya Zhou1,2
Author Affiliations
  • 1School of Optics and Photonics, Beijing Institute of Technology, Beijing 100081, China
  • 2Beijing Key Laboratory for Precision Optoelectronic Measurement Instrument and Technology, Beijing 100081, China
    DOI: 10.3788/AOS202242.1515001
    Zishuo Zhang, Yong Song, Xin Yang, Yufei Zhao, Ya Zhou. Triplet Network Based on Dynamic Feature Attention for Object Tracking[J]. Acta Optica Sinica, 2022, 42(15): 1515001
    References

    [1] Qiu Z L, Zha Y F, Zhu P et al. Visual tracking algorithm based on online feature discrimination with Siamese network[J]. Acta Optica Sinica, 39, 0915003(2019).

    [2] Li Y, Yang D D, Han Y J et al. Siamese neural network object tracking with distractor-aware model[J]. Acta Optica Sinica, 40, 0415002(2020).

    [3] Bolme D S, Beveridge J R, Draper B A et al. Visual object tracking using adaptive correlation filters[C], 2544-2550(2010).

    [4] Henriques J F, Caseiro R, Martins P et al. High-speed tracking with kernelized correlation filters[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 583-596(2015).

    [5] Meng L, Li C X. Brief review of object tracking algorithms in recent years: correlated filtering and deep learning[J]. Journal of Image and Graphics, 24, 1011-1016(2019).

    [6] Bertinetto L, Valmadre J, Henriques J F et al. Fully-convolutional Siamese networks for object tracking[M]. Hua G, Jégou H. Computer vision-ECCV 2016 workshops. Lecture notes in computer science, 9914, 850-865(2016).

    [7] Li B, Yan J J, Wu W et al. High performance visual tracking with Siamese region proposal network[C], 8971-8980(2018).

    [8] Ren S Q, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).

    [9] Zhu Z, Wang Q, Li B et al. Distractor-aware Siamese networks for visual object tracking[M]. Ferrari V, Hebert M, Sminchisescu C, et al. Computer Vision-ECCV 2018. Lecture notes in computer science, 11213, 103-119(2018).

    [10] Bhat G, Johnander J, Danelljan M et al. Unveiling the power of deep tracking[M]. Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science, 11206, 493-509(2018).

    [11] Zhang Z P, Peng H W. Deeper and wider Siamese networks for real-time visual tracking[C], 4586-4595(2019).

    [12] Li B, Wu W, Wang Q et al. SiamRPN++: evolution of Siamese visual tracking with very deep networks[C], 4277-4286(2019).

    [13] Zhang L C, Gonzalez-Garcia A, van de Weijer J et al. Learning the model update for Siamese trackers[C], 4009-4018(2019).

    [14] Yu Y C, Xiong Y L, Huang W L et al. Deformable Siamese attention networks for visual object tracking[C], 6727-6736(2020).

    [15] Chen Z D, Zhong B N, Li G R et al. Siamese box adaptive network for visual tracking[C], 6667-6676(2020).

    [16] Chen X L, He K M. Exploring simple Siamese representation learning[C], 15745-15753(2021).

    [17] Dong J F, Liu C, Cao F W et al. Online adaptive Siamese network tracking algorithm based on attention mechanism[J]. Laser & Optoelectronics Progress, 57, 021510(2020).

    [18] Tolstikhin I, Houlsby N, Kolesnikov A et al. MLP-mixer: an all-MLP architecture for vision[EB/OL]. https://arxiv.org/abs/2105.01601

    [19] Dosovitskiy A, Beyer L, Kolesnikov A et al. An image is worth 16×16 words: transformers for image recognition at scale[C](2021).

    [20] Danelljan M, Bhat G, Khan F S et al. ATOM: accurate tracking by overlap maximization[C], 4655-4664(2019).

    [21] Li C, Yang D D, Song P et al. Global-aware Siamese network for thermal infrared object tracking[J]. Acta Optica Sinica, 41, 0615002(2021).

    [22] Hu J, Shen L, Albanie S et al. Squeeze-and-excitation networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 42, 2011-2023(2020).

    [23] Zheng Z H, Wang P, Liu W et al. Distance-IoU loss: faster and better learning for bounding box regression[J]. Proceedings of the AAAI Conference on Artificial Intelligence, 34, 12993-13000(2020).

    [24] Yu J H, Jiang Y N, Wang Z Y et al. UnitBox: an advanced object detection network[C], 516-520(2016).

    [25] Russakovsky O, Deng J, Su H et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 115, 211-252(2015).

    [26] Fan H, Lin L T, Yang F et al. LaSOT: a high-quality benchmark for large-scale single object tracking[C], 5369-5378(2019).

    [27] Real E, Shlens J, Mazzocchi S et al. YouTube-BoundingBoxes: a large high-precision human-annotated data set for object detection in video[C], 7464-7473(2017).

    [28] Lin T Y, Maire M, Belongie S J et al. Microsoft COCO: common objects in context[M]. Fleet D, Pajdla T, Schiele B, et al. Computer vision-ECCV 2014. Lecture notes in computer science, 8693, 740-755(2014).

    [29] Zhao F, Zhang T, Song Y B et al. Siamese regression tracking with reinforced template updating[J]. IEEE Transactions on Image Processing, 30, 628-640(2021).

    [30] Valmadre J, Bertinetto L, Henriques J et al. End-to-end representation learning for correlation filter based tracking[C], 5000-5008(2017).

    [31] Danelljan M, Häger G, Khan F S et al. Convolutional features for correlation filter based visual tracking[C], 621-629(2015).
