Fig. 1. Schematic diagram of the Siamese neural network
Fig. 2. Framework of the proposed algorithm
Fig. 3. Visualization of each layer's convolutional feature map
Fig. 4. Precision plots and success plots of the eight trackers. (a) Success rate; (b) accuracy
Fig. 5. Tracking results of the eight algorithms. (a) Faceocc1; (b) Subway; (c) Football; (d) Freeman1; (e) Dog1; (f) CarScale; (g) MountainBike; (h) David2; (i) Faceocc2; (j) Basketball
Fig. 6. Success rate and accuracy of various tracking algorithms in aerial video sequences. (a) Accuracy; (b) success rate
Fig. 7. Tracking results of the algorithms on aerial video sequences
Video sequence | Length /frame | Resolution /(pixel×pixel) | Characteristic |
---|---|---|---|
David2 | 537 | 320×240 | In-plane rotation, out-of-plane rotation |
Faceocc1 | 892 | 352×288 | Occlusion |
Faceocc2 | 812 | 320×240 | Illumination variation, occlusion, in-plane rotation, out-of-plane rotation |
Subway | 175 | 352×288 | Occlusion, deformation, background clutter |
Freeman1 | 326 | 360×240 | Scale variation, in-plane rotation, out-of-plane rotation |
MountainBike | 228 | 640×360 | Out-of-plane rotation, in-plane rotation, background clutter |
Dog1 | 1350 | 320×240 | Scale variation, in-plane rotation, out-of-plane rotation |
CarScale | 252 | 640×272 | Scale variation, occlusion, fast motion, in-plane rotation, out-of-plane rotation |
Football | 362 | 624×352 | Occlusion, in-plane rotation, out-of-plane rotation, background clutter |
Basketball | 725 | 576×432 | Illumination variation, out-of-plane rotation, occlusion, deformation, background clutter |
Table 1. Attributes of the ten video sequences
Sequence | Ours | Siamfc | DSiamM | ASLA | TLD | MEEM | MUSTER | IVT |
---|---|---|---|---|---|---|---|---|
MountainBike | 5.6199 | 6.1406 | 5.7915 | 8.9727 | 213.3278 | 13.0037 | 8.12 | 7.416 |
Faceocc1 | 10.1931 | 11.9656 | 11.4831 | 77.8108 | 27.3678 | 16.9904 | 14.2932 | 17.8346 |
Freeman1 | 5.9435 | 6.6078 | 6.039 | 104.8774 | 39.6988 | 11.3029 | 8.6361 | 11.7283 |
Subway | 2.4955 | 3.254 | 2.9104 | 137.6901 | 159.0114 | 4.1169 | 2.2211 | 130.2318 |
Football | 5.2967 | 6.7392 | 5.0698 | 15.3724 | 14.2587 | 5.1423 | 14.7789 | 14.8367 |
CarScale | 15.7498 | 15.318 | 18.4334 | 24.9002 | 50.3495 | 67.2993 | 18.6758 | 11.7225 |
Basketball | 10.4241 | 22.7174 | 10.658 | 82.6266 | 268.7569 | 4.2104 | 4.8487 | 106.9015 |
Faceocc2 | 10.1346 | 10.7052 | 10.1493 | 19.5059 | 12.2779 | 10.5872 | 5.8895 | 7.1397 |
Dog1 | 5.0088 | 3.004 | 3.5086 | 5.8068 | 4.1903 | 6.1053 | 4.0696 | 4.0764 |
David2 | 3.7716 | 2.8061 | 3.007 | 1.5874 | 4.9788 | 1.863 | 1.9849 | 1.6066 |
Table 2. Tracking errors of tracking algorithms in ten video sequences
Video sequence | Length /frame | Resolution /(pixel×pixel) | Characteristic |
---|---|---|---|
Wakeboard4 | 233 | 1280×720 | Scale variation, aspect ratio change, viewpoint change |
Wakeboard10 | 157 | 1280×720 | Scale variation, low resolution |
Boat1 | 301 | 1280×720 | Scale variation |
Boat2 | 267 | 1280×720 | Scale variation |
Boat6 | 269 | 1280×720 | Scale variation |
Boat9 | 467 | 1280×720 | Scale variation, aspect ratio change, low resolution, partial occlusion, viewpoint change |
Building1 | 157 | 1280×720 | |
Truck3 | 179 | 1280×720 | Low resolution, partial occlusion, background clutter |
Car4 | 449 | 1280×720 | Occlusion, aspect ratio change, low resolution, partial occlusion, camera motion, similar object |
Car5 | 249 | 1280×720 | Scale variation |
Table 3. Attributes of the ten aerial video sequences
Sequence | Ours | MUSTER | DSiamM | ASLA | Siamfc | TLD | MEEM | IVT |
---|---|---|---|---|---|---|---|---|
Wakeboard4 | 0.751 | 0.549 | 0.588 | 0.004 | 0.305 | 0.077 | 0.597 | 0.004 |
Wakeboard10 | 1.000 | 1.000 | 1.000 | 0.917 | 1.000 | 1.000 | 1.000 | 0.248 |
Boat1 | 1.000 | 0.841 | 1.000 | 0.990 | 1.000 | 0.498 | 0.658 | 0.957 |
Boat2 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 0.397 | 1.000 | 1.000 |
Boat6 | 0.933 | 0.892 | 0.914 | 0.955 | 0.922 | 0.885 | 0.818 | 0.981 |
Boat9 | 0.953 | 0.914 | 0.522 | 0.829 | 0.972 | 0.473 | 0.469 | 0.203 |
Building1 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 | 1.000 |
Truck3 | 1.000 | 1.000 | 1.000 | 1.000 | 0.207 | 1.000 | 1.000 | 1.000 |
Car4 | 0.989 | 0.998 | 0.998 | 0.457 | 0.296 | 0.998 | 0.296 | 0.450 |
Car5 | 1.000 | 0.908 | 1.000 | 1.000 | 1.000 | 0.996 | 0.265 | 0.743 |
Table 4. Accuracy of tracking algorithms in ten aerial video sequences
Sequence | Ours | MUSTER | DSiamM | ASLA | Siamfc | TLD | MEEM | IVT |
---|---|---|---|---|---|---|---|---|
Wakeboard4 | 0.434 | 0.312 | 0.363 | 0.009 | 0.185 | 0.029 | 0.348 | 0.010 |
Wakeboard10 | 0.567 | 0.396 | 0.631 | 0.365 | 0.552 | 0.437 | 0.333 | 0.146 |
Boat1 | 0.730 | 0.731 | 0.722 | 0.529 | 0.740 | 0.595 | 0.376 | 0.612 |
Boat2 | 0.748 | 0.745 | 0.754 | 0.773 | 0.745 | 0.624 | 0.618 | 0.817 |
Boat6 | 0.786 | 0.339 | 0.774 | 0.602 | 0.763 | 0.346 | 0.329 | 0.618 |
Boat9 | 0.526 | 0.332 | 0.325 | 0.273 | 0.534 | 0.201 | 0.071 | 0.116 |
Building1 | 0.816 | 0.803 | 0.830 | 0.764 | 0.742 | 0.737 | 0.781 | 0.793 |
Truck3 | 0.613 | 0.794 | 0.702 | 0.832 | 0.132 | 0.753 | 0.694 | 0.787 |
Car4 | 0.756 | 0.635 | 0.706 | 0.362 | 0.227 | 0.785 | 0.244 | 0.369 |
Car5 | 0.757 | 0.721 | 0.766 | 0.521 | 0.722 | 0.729 | 0.412 | 0.526 |
Table 5. Success rate of tracking algorithms in ten aerial video sequences
Algorithm | Ours | Siamfc | MUSTER | DSiamM | ASLA | TLD | MEEM | IVT |
---|---|---|---|---|---|---|---|---|
Speed /(frame·s⁻¹) | 37.2 | 56.1 | 3.7 | 23.9 | 8.1 | 27.8 | 9.9 | 39.9 |
Table 6. Average speed comparison of the algorithms