Author Affiliations
1. School of Mechanical and Automotive Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2. Shanxi New Energy Technology Co., Ltd., Taiyuan 030024, Shanxi, China
3. Shanghai Compass Satellite Navigation Technology Co., Ltd., Shanghai 201801, China
Fig. 1. Overall framework of proposed algorithm
Fig. 2. Multitarget tracking algorithm flow
Fig. 3. Loss and AP curves
Fig. 4. Passenger detection results
Fig. 5. Performance indicator ranking on VOT2016. (a) Accuracy-robustness; (b) expected average overlap
Fig. 6. Performance indicator ranking on VOT2018. (a) Accuracy-robustness; (b) expected average overlap
Fig. 7. Comparison of tracking results in different scenarios. (a) Out-of-view; (b) background clutter; (c) partial occlusion; (d) scale variation; (e) deformation
| Convolution layer | Convolution kernel | Output and input channel | Stride | Template image | Search image | Channel |
|---|---|---|---|---|---|---|
| | | | | 127×127 | 255×255 | 3 |
| Conv1 | 3×3 | 64×3 | 1 | 125×125 | 253×253 | 64 |
| Res1 | 1×1, 3×3 | 64×64 | 1 | 123×123 | 251×251 | 32, 64 |
| Maxpool1 | 2×2 | | 2 | 61×61 | 125×125 | 64 |
| Res2 | 1×1, 3×3 | 128×64 | 1 | 59×59 | 123×123 | 64, 128 |
| Res3 | 1×1, 3×3 | 128×128 | 1 | 57×57 | 121×121 | 64, 128 |
| Maxpool2 | 2×2 | | 2 | 28×28 | 60×60 | 128 |
| Res4 | 1×1, 3×3 | 256×128 | 1 | 26×26 | 58×58 | 128, 256 |
| Res5 | 1×1, 3×3 | 256×256 | 1 | 24×24 | 56×56 | 128, 256 |
| Maxpool3 | 2×2 | | 2 | 12×12 | 28×28 | 256 |
| Res6 | 1×1, 3×3 | 256×256 | 1 | 10×10 | 26×26 | 128, 256 |
| Res7 | 1×1, 3×3 | 512×256 | 1 | 8×8 | 24×24 | 256, 512 |
| Conv2 | 1×1 | 256×512 | 1 | 8×8 | 24×24 | 256 |
| Conv3 | 3×3 | 256×256 | 1 | 6×6 | 22×22 | 256 |
Table 1. Siamese network structure incorporating residual connections
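The template and search sizes listed in Table 1 can be checked with simple arithmetic. The sketch below is not the authors' code. It assumes that every convolution and pooling layer is unpadded, and that each residual block shrinks the feature map like a single unpadded 3×3 convolution (the 1×1 kernel leaves the size unchanged). The helper names `out_size` and `trace` are hypothetical and used only for illustration.

```python
def out_size(n, kernel, stride=1):
    """Spatial size after an unpadded convolution or pooling window."""
    return (n - kernel) // stride + 1


# (layer name, effective kernel size, stride), read from Table 1.
# Assumption: each Res block behaves spatially like one unpadded 3x3 convolution.
LAYERS = [
    ("Conv1", 3, 1), ("Res1", 3, 1), ("Maxpool1", 2, 2),
    ("Res2", 3, 1), ("Res3", 3, 1), ("Maxpool2", 2, 2),
    ("Res4", 3, 1), ("Res5", 3, 1), ("Maxpool3", 2, 2),
    ("Res6", 3, 1), ("Res7", 3, 1), ("Conv2", 1, 1), ("Conv3", 3, 1),
]


def trace(n):
    """Return the feature-map side length after every layer for an n x n input."""
    sizes = {}
    for name, kernel, stride in LAYERS:
        n = out_size(n, kernel, stride)
        sizes[name] = n
    return sizes


template = trace(127)  # template image branch
search = trace(255)    # search image branch

for name, _, _ in LAYERS:
    t, s = template[name], search[name]
    print(f"{name:9s} template {t}x{t}  search {s}x{s}")
```

Under these assumptions the trace ends at 6×6 for the template branch and 22×22 for the search branch, which agrees with the last row of Table 1.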
| Algorithm | Acc | Rob | EAO |
|---|---|---|---|
| SiameseFC | 0.5342 | 0.4613 | 0.2356 |
| Staple | 0.5425 | 0.3784 | 0.2946 |
| SRDCF | 0.5356 | 0.4193 | 0.2459 |
| DeepSRDCF | 0.5271 | 0.3264 | 0.2758 |
| TADT | 0.5539 | 0.3327 | 0.2991 |
| SiameseRPN | 0.5617 | 0.2621 | 0.3442 |
| Proposed algorithm | 0.5808 | 0.1561 | 0.4095 |
Table 2. Comparison results of different algorithms on VOT2016 dataset
| Algorithm | Acc | Rob | EAO |
|---|---|---|---|
| SiameseFC | 0.5108 | 0.4836 | 0.2343 |
| Staple | 0.5246 | 0.6887 | 0.1694 |
| DSiamese | 0.5117 | 0.6458 | 0.1966 |
| SiameseDW | 0.5411 | 0.4032 | 0.2704 |
| CFNet | 0.5028 | 0.5853 | 0.1882 |
| SiameseRPN | 0.4945 | 0.4627 | 0.2441 |
| Proposed algorithm | 0.5347 | 0.3417 | 0.3091 |
Table 3. Comparison results of different algorithms on VOT2018 dataset
| Algorithm | AUC | Prec |
|---|---|---|
| SiameseRPN | 0.596 | 0.785 |
| SiameseRPN+RC | 0.638 | 0.822 |
| SiameseRPN+AB | 0.609 | 0.804 |
| SiameseRPN+RC+AB | 0.654 | 0.846 |
Table 4. Experimental results of ablation on OTB100 dataset
| Attribute | Number of videos | SiameseFC | Staple | SRDCF | ACFN | SiameseRPN | Proposed algorithm |
|---|---|---|---|---|---|---|---|
| SV | 63 | 0.765 | 0.721 | 0.739 | 0.758 | 0.806 | 0.841 |
| OCC | 48 | 0.738 | 0.729 | 0.730 | 0.752 | 0.781 | 0.837 |
| DEF | 43 | 0.779 | 0.742 | 0.725 | 0.764 | 0.793 | 0.843 |
| OV | 14 | 0.695 | 0.670 | 0.593 | 0.687 | 0.736 | 0.780 |
| BC | 31 | 0.703 | 0.747 | 0.767 | 0.766 | 0.803 | 0.824 |
Table 5. Comparison of Prec of different video attributes on OTB100 for each algorithm
| Algorithm | MOTP | MOTA | FPS |
|---|---|---|---|
| SORT | 0.814 | 0.815 | 112 |
| DeepSort | 0.817 | 0.869 | 48 |
| MHT | 0.822 | 0.871 | 1.5 |
| POI | 0.835 | 0.883 | 15 |
| SiameseCNN | 0.764 | 0.865 | 107 |
| Proposed algorithm | 0.891 | 0.937 | 39 |
Table 6. Performance comparison of each algorithm