• Acta Optica Sinica
  • Vol. 40, Issue 4, 0415002 (2020)
Yong Li, Dedong Yang*, Yajun Han, and Peng Song
Author Affiliations
  • School of Artificial Intelligence, Hebei University of Technology, Tianjin 300130, China
    DOI: 10.3788/AOS202040.0415002
    Yong Li, Dedong Yang, Yajun Han, Peng Song. Siamese Neural Network Object Tracking with Distractor-Aware Model[J]. Acta Optica Sinica, 2020, 40(4): 0415002
    Fig. 1. Schematic diagram of siamese neural network
    Fig. 2. Algorithmic framework diagram in this paper
    Fig. 3. Visualization of each layer's convolutional feature map
    Fig. 4. Precision plots and success plots of the eight trackers. (a) Success rate; (b) accuracy
    Fig. 5. Actual results of eight algorithms. (a) Faceocc1; (b) subway; (c) football; (d) freeman1; (e) dog1; (f) carScale; (g) mountainBike; (h) david2; (i) faceocc2; (j) basketball
    Fig. 6. Success rate and accuracy of various tracking algorithms in aerial video sequences. (a) Accuracy; (b) success rate
    Fig. 7. Actual effect of algorithms in aerial video sequence
    | Video sequence | Length /frame | Resolution /(pixel×pixel) | Characteristic |
    | --- | --- | --- | --- |
    | David2 | 537 | 320×240 | In-plane rotation, out-of-plane rotation |
    | Faceocc1 | 892 | 352×288 | Occlusion |
    | Faceocc2 | 812 | 320×240 | Illumination variation, occlusion, in-plane rotation, out-of-plane rotation |
    | Subway | 175 | 352×288 | Occlusion, deformation, background clutter |
    | Freeman1 | 326 | 360×240 | Scale variation, in-plane rotation, out-of-plane rotation |
    | MountainBike | 228 | 640×360 | Out-of-plane rotation, in-plane rotation, background clutter |
    | Dog1 | 1350 | 320×240 | Scale variation, in-plane rotation, out-of-plane rotation |
    | CarScale | 252 | 640×272 | Scale variation, occlusion, fast motion, in-plane rotation, out-of-plane rotation |
    | Football | 362 | 624×352 | Occlusion, in-plane rotation, out-of-plane rotation, background clutter |
    | Basketball | 725 | 576×432 | Illumination variation, out-of-plane rotation, occlusion, deformation, background clutter |
    Table 1. Ten sets of video attributes
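    | Sequence | Ours | Siamfc | DSiam | MASLA | TLD | MEEM | MUSTER | IVT |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | MountainBike | 5.6199 | 6.1406 | 5.7915 | 8.9727 | 213.3278 | 13.0037 | 8.12 | 7.416 |
    | Faceocc1 | 10.1931 | 11.9656 | 11.4831 | 77.8108 | 27.3678 | 16.9904 | 14.2932 | 17.8346 |
    | Freeman1 | 5.9435 | 6.6078 | 6.039 | 104.8774 | 39.6988 | 11.3029 | 8.6361 | 11.7283 |
    | Subway | 2.4955 | 3.254 | 2.9104 | 137.6901 | 159.0114 | 4.1169 | 2.2211 | 130.2318 |
    | Football | 5.2967 | 6.7392 | 5.0698 | 15.3724 | 14.2587 | 5.1423 | 14.7789 | 14.8367 |
    | CarScale | 15.7498 | 15.3181 | 8.4334 | 24.9002 | 50.3495 | 67.2993 | 18.6758 | 11.7225 |
    | Basketball | 10.4241 | 22.7174 | 10.6588 | 2.6266 | 268.7569 | 4.2104 | 4.8487 | 106.9015 |
    | Faceocc2 | 10.1346 | 10.7052 | 10.1493 | 19.5059 | 12.2779 | 10.5872 | 5.8895 | 7.1397 |
    | Dog1 | 5.0088 | 3.004 | 3.5086 | 5.8068 | 4.1903 | 6.1053 | 4.0696 | 4.0764 |
    | David2 | 3.7716 | 2.8061 | 3.007 | 1.5874 | 4.9788 | 1.863 | 1.9849 | 1.6066 |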
    Table 2. Tracking errors of tracking algorithms in ten video sequences
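This page does not say how the tracking errors in Table 2 were measured. A common convention in tracking benchmarks is the center location error: the Euclidean distance, in pixels, between the center of the predicted box and the center of the ground-truth box, averaged over the frames of a sequence. The sketch below follows that convention as an assumption; the function names and the (x, y, width, height) box layout are hypothetical and not taken from the paper.

```python
import math


def center_error(pred_box, gt_box):
    """Distance in pixels between the centers of two (x, y, w, h) boxes."""
    pred_cx = pred_box[0] + pred_box[2] / 2.0
    pred_cy = pred_box[1] + pred_box[3] / 2.0
    gt_cx = gt_box[0] + gt_box[2] / 2.0
    gt_cy = gt_box[1] + gt_box[3] / 2.0
    return math.hypot(pred_cx - gt_cx, pred_cy - gt_cy)


def mean_center_error(pred_boxes, gt_boxes):
    """Average center error over a sequence (assumed per-sequence summary)."""
    if len(pred_boxes) != len(gt_boxes) or not gt_boxes:
        raise ValueError("need two non-empty box lists of equal length")
    errors = [center_error(p, g) for p, g in zip(pred_boxes, gt_boxes)]
    return sum(errors) / len(errors)
```

Under this reading, a smaller value means the predicted box stays closer to the target over the sequence.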
    | Video sequence | Length /frame | Resolution /(pixel×pixel) | Characteristic |
    | --- | --- | --- | --- |
    | Wakeboard4 | 233 | 1280×720 | Scale variation, aspect ratio change, viewpoint change |
    | Wakeboard10 | 157 | 1280×720 | Scale variation, low resolution |
    | Boat1 | 301 | 1280×720 | Scale variation |
    | Boat2 | 267 | 1280×720 | Scale variation |
    | Boat6 | 269 | 1280×720 | Scale variation |
    | Boat9 | 467 | 1280×720 | Scale variation, aspect ratio change, low resolution, partial occlusion, viewpoint change |
    | Building1 | 157 | 1280×720 | |
    | Truck3 | 179 | 1280×720 | Low resolution, partial occlusion, background clutter |
    | Car4 | 449 | 1280×720 | Occlusion, aspect ratio change, low resolution, partial occlusion, camera motion, similar object |
    | Car5 | 249 | 1280×720 | Scale variation |
    Table 3. Ten sets of aerial video sequence attributes
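    | Sequence | Ours | MUSTER | DSiam | MASLA | Siamfc | TLD | MEEM | IVT |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Wakeboard4 | 0.751 | 0.549 | 0.588 | 0.004 | 0.305 | 0.077 | 0.597 | 0.004 |
    | Wakeboard10 | 1.000 | 1.000 | 1.000 | 0.917 | 1.000 | 1.000 | 1.000 | 0.248 |
    | Boat1 | 1.000 | 0.841 | 1.000 | 0.990 | 1.000 | 0.498 | 0.658 | 0.957 |
    | Boat2 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.397 | 1.000 | 1.000 |
    | Boat6 | 0.933 | 0.892 | 0.914 | 0.955 | 0.922 | 0.885 | 0.818 | 0.981 |
    | Boat9 | 0.953 | 0.914 | 0.522 | 0.829 | 0.972 | 0.473 | 0.469 | 0.203 |
    | Building1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
    | Truck3 | 1.000 | 1.000 | 1.000 | 1.000 | 0.207 | 1.000 | 1.000 | 1.000 |
    | Car4 | 0.989 | 0.998 | 0.998 | 0.457 | 0.296 | 0.998 | 0.296 | 0.450 |
    | Car5 | 1.000 | 0.908 | 1.000 | 1.000 | 1.000 | 0.996 | 0.265 | 0.743 |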
    Table 4. Accuracy of tracking algorithms in ten video sequences
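    | Sequence | Ours | MUSTER | DSiam | MASLA | Siamfc | TLD | MEEM | IVT |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Wakeboard4 | 0.434 | 0.312 | 0.363 | 0.009 | 0.185 | 0.029 | 0.348 | 0.010 |
    | Wakeboard10 | 0.567 | 0.396 | 0.631 | 0.365 | 0.552 | 0.437 | 0.333 | 0.146 |
    | Boat1 | 0.730 | 0.731 | 0.722 | 0.529 | 0.740 | 0.595 | 0.376 | 0.612 |
    | Boat2 | 0.748 | 0.745 | 0.754 | 0.773 | 0.745 | 0.624 | 0.618 | 0.817 |
    | Boat6 | 0.786 | 0.339 | 0.774 | 0.602 | 0.763 | 0.346 | 0.329 | 0.618 |
    | Boat9 | 0.526 | 0.332 | 0.325 | 0.273 | 0.534 | 0.201 | 0.071 | 0.116 |
    | Building1 | 0.816 | 0.803 | 0.830 | 0.764 | 0.742 | 0.737 | 0.781 | 0.793 |
    | Truck3 | 0.613 | 0.794 | 0.702 | 0.832 | 0.132 | 0.753 | 0.694 | 0.787 |
    | Car4 | 0.756 | 0.635 | 0.706 | 0.362 | 0.227 | 0.785 | 0.244 | 0.369 |
    | Car5 | 0.757 | 0.721 | 0.766 | 0.521 | 0.722 | 0.729 | 0.412 | 0.526 |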
    Table 5. Success rate of tracking algorithms in ten video sequences
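The page gives no definitions for the "accuracy" in Table 4 or the "success rate" in Table 5. In benchmark-style evaluations these usually mean, respectively, the fraction of frames whose center error is within a pixel threshold, and the fraction of frames whose box overlap (intersection over union) exceeds an overlap threshold. The sketch below illustrates those two conventions only. The thresholds (20 pixels, 0.5 overlap), the function names, and the (x, y, width, height) box layout are all assumptions, and the paper may report a different summary, such as the area under the success curve.

```python
import math


def iou(a, b):
    """Intersection over union of two (x, y, w, h) boxes."""
    inter_w = max(0.0, min(a[0] + a[2], b[0] + b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[1] + a[3], b[1] + b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    union = a[2] * a[3] + b[2] * b[3] - inter
    return inter / union if union > 0 else 0.0


def center_distance(a, b):
    """Distance in pixels between the centers of two (x, y, w, h) boxes."""
    return math.hypot(
        (a[0] + a[2] / 2.0) - (b[0] + b[2] / 2.0),
        (a[1] + a[3] / 2.0) - (b[1] + b[3] / 2.0),
    )


def precision(pred_boxes, gt_boxes, dist_threshold=20.0):
    """Fraction of frames with center error <= dist_threshold (assumed value)."""
    hits = sum(
        1 for p, g in zip(pred_boxes, gt_boxes)
        if center_distance(p, g) <= dist_threshold
    )
    return hits / len(gt_boxes)


def success_rate(pred_boxes, gt_boxes, iou_threshold=0.5):
    """Fraction of frames with IoU > iou_threshold (assumed value)."""
    hits = sum(
        1 for p, g in zip(pred_boxes, gt_boxes)
        if iou(p, g) > iou_threshold
    )
    return hits / len(gt_boxes)
```

Sweeping `dist_threshold` or `iou_threshold` over a range and plotting the resulting fractions gives curves of the kind named in the captions of Fig. 4 and Fig. 6.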
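    | Algorithm | Ours | Siamfc | MUSTER | DSiam | MASLA | TLD | MEEM | IVT |
    | --- | --- | --- | --- | --- | --- | --- | --- | --- |
    | Speed /(frame·s⁻¹) | 37.2 | 56.1 | 3.7 | 23.9 | 8.1 | 27.8 | 9.9 | 39.9 |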
    Table 6. Average speed comparison of the algorithms