Siamese Neural Network Object Tracking with Distractor-Aware Model

Yong Li; Dedong Yang; Yajun Han; Peng Song

doi:10.3788/AOS202040.0415002

Acta Optica Sinica
Vol. 40, Issue 4, 0415002 (2020)

Author Affiliations

School of Artificial Intelligence, Hebei University of Technology, Tianjin 300130, China

Yong Li, Dedong Yang, Yajun Han, Peng Song. Siamese Neural Network Object Tracking with Distractor-Aware Model[J]. Acta Optica Sinica, 2020, 40(4): 0415002

Fig. 1. Schematic diagram of the Siamese neural network

Fig. 2. Framework of the proposed algorithm

Fig. 3. Visualization of each layer's convolutional feature map

Fig. 4. Precision plots and success plots of the eight trackers. (a) Success rate; (b) accuracy

Fig. 5. Tracking results of the eight algorithms. (a) Faceocc1; (b) Subway; (c) Football; (d) Freeman1; (e) Dog1; (f) CarScale; (g) MountainBike; (h) David2; (i) Faceocc2; (j) Basketball

Fig. 6. Accuracy and success rate of the tracking algorithms on aerial video sequences. (a) Accuracy; (b) success rate

Fig. 7. Tracking results of the algorithms on aerial video sequences

Video sequence	Length /frame	Resolution /(pixel×pixel)	Characteristics
David2	537	320×240	In-plane rotation, out-of-plane rotation
Faceocc1	892	352×288	Occlusion
Faceocc2	812	320×240	Illumination variation, occlusion, in-plane rotation, out-of-plane rotation
Subway	175	352×288	Occlusion, deformation, background clutter
Freeman1	326	360×240	Scale variation, in-plane rotation, out-of-plane rotation
MountainBike	228	640×360	Out-of-plane rotation, in-plane rotation, background clutter
Dog1	1350	320×240	Scale variation, in-plane rotation, out-of-plane rotation
CarScale	252	640×272	Scale variation, occlusion, fast motion, in-plane rotation, out-of-plane rotation
Football	362	624×352	Occlusion, in-plane rotation, out-of-plane rotation, background clutter
Basketball	725	576×432	Illumination variation, out-of-plane rotation, occlusion, deformation, background clutter

Table 1. Attributes of the ten video sequences

Sequence	Ours	SiamFC	DSiamM	ASLA	TLD	MEEM	MUSTER	IVT
MountainBike	5.6199	6.1406	5.7915	8.9727	213.3278	13.0037	8.12	7.416
Faceocc1	10.1931	11.9656	11.4831	77.8108	27.3678	16.9904	14.2932	17.8346
Freeman1	5.9435	6.6078	6.039	104.8774	39.6988	11.3029	8.6361	11.7283
Subway	2.4955	3.254	2.9104	137.6901	159.0114	4.1169	2.2211	130.2318
Football	5.2967	6.7392	5.0698	15.3724	14.2587	5.1423	14.7789	14.8367
CarScale	15.7498	15.318	18.4334	24.9002	50.3495	67.2993	18.6758	11.7225
Basketball	10.4241	22.7174	10.658	82.6266	268.7569	4.2104	4.8487	106.9015
Faceocc2	10.1346	10.7052	10.1493	19.5059	12.2779	10.5872	5.8895	7.1397
Dog1	5.0088	3.004	3.5086	5.8068	4.1903	6.1053	4.0696	4.0764
David2	3.7716	2.8061	3.007	1.5874	4.9788	1.863	1.9849	1.6066

Table 2. Tracking errors of the tracking algorithms on the ten video sequences

Video sequence	Length /frame	Resolution /(pixel×pixel)	Characteristics
Wakeboard4	233	1280×720	Scale variation, aspect ratio change, viewpoint change
Wakeboard10	157	1280×720	Scale variation, low resolution
Boat1	301	1280×720	Scale variation
Boat2	267	1280×720	Scale variation
Boat6	269	1280×720	Scale variation
Boat9	467	1280×720	Scale variation, aspect ratio change, low resolution, partial occlusion, viewpoint change
Building1	157	1280×720
Truck3	179	1280×720	Low resolution, partial occlusion, background clutter
Car4	449	1280×720	Occlusion, aspect ratio change, low resolution, partial occlusion, camera motion, similar object
Car5	249	1280×720	Scale variation

Table 3. Attributes of the ten aerial video sequences

Sequence	Ours	MUSTER	DSiamM	ASLA	SiamFC	TLD	MEEM	IVT
Wakeboard4	0.751	0.549	0.588	0.004	0.305	0.077	0.597	0.004
Wakeboard10	1.000	1.000	1.000	0.917	1.000	1.000	1.000	0.248
Boat1	1.000	0.841	1.000	0.990	1.000	0.498	0.658	0.957
Boat2	1.000	1.000	1.000	1.000	1.000	0.397	1.000	1.000
Boat6	0.933	0.892	0.914	0.955	0.922	0.885	0.818	0.981
Boat9	0.953	0.914	0.522	0.829	0.972	0.473	0.469	0.203
Building1	1.000	1.000	1.000	1.000	1.000	1.000	1.000	1.000
Truck3	1.000	1.000	1.000	1.000	0.207	1.000	1.000	1.000
Car4	0.989	0.998	0.998	0.457	0.296	0.998	0.296	0.450
Car5	1.000	0.908	1.000	1.000	1.000	0.996	0.265	0.743

Table 4. Accuracy of the tracking algorithms on the ten aerial video sequences

Sequence	Ours	MUSTER	DSiamM	ASLA	SiamFC	TLD	MEEM	IVT
Wakeboard4	0.434	0.312	0.363	0.009	0.185	0.029	0.348	0.010
Wakeboard10	0.567	0.396	0.631	0.365	0.552	0.437	0.333	0.146
Boat1	0.730	0.731	0.722	0.529	0.740	0.595	0.376	0.612
Boat2	0.748	0.745	0.754	0.773	0.745	0.624	0.618	0.817
Boat6	0.786	0.339	0.774	0.602	0.763	0.346	0.329	0.618
Boat9	0.526	0.332	0.325	0.273	0.534	0.201	0.071	0.116
Building1	0.816	0.803	0.830	0.764	0.742	0.737	0.781	0.793
Truck3	0.613	0.794	0.702	0.832	0.132	0.753	0.694	0.787
Car4	0.756	0.635	0.706	0.362	0.227	0.785	0.244	0.369
Car5	0.757	0.721	0.766	0.521	0.722	0.729	0.412	0.526

Table 5. Success rate of the tracking algorithms on the ten aerial video sequences
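
This page does not define the accuracy and success-rate values reported in Tables 4 and 5. As a minimal sketch, assuming the widely used benchmark conventions (accuracy as the fraction of frames whose center location error is within a pixel threshold, success rate as the fraction of frames whose bounding-box overlap exceeds an overlap threshold), the per-sequence scores could be computed as follows. The function names, the (x, y, w, h) box layout, and the default thresholds of 20 pixels and 0.5 overlap are illustrative assumptions, not values taken from the paper.

```python
import math


def center_error(pred, gt):
    """Euclidean distance between the centers of two (x, y, w, h) boxes."""
    px, py = pred[0] + pred[2] / 2.0, pred[1] + pred[3] / 2.0
    gx, gy = gt[0] + gt[2] / 2.0, gt[1] + gt[3] / 2.0
    return math.hypot(px - gx, py - gy)


def overlap(pred, gt):
    """Intersection over union of two (x, y, w, h) boxes."""
    iw = max(0.0, min(pred[0] + pred[2], gt[0] + gt[2]) - max(pred[0], gt[0]))
    ih = max(0.0, min(pred[1] + pred[3], gt[1] + gt[3]) - max(pred[1], gt[1]))
    inter = iw * ih
    union = pred[2] * pred[3] + gt[2] * gt[3] - inter
    return inter / union if union > 0 else 0.0


def accuracy(preds, gts, max_error=20.0):
    """Fraction of frames with center error <= max_error (assumed threshold)."""
    hits = sum(center_error(p, g) <= max_error for p, g in zip(preds, gts))
    return hits / len(preds)


def success_rate(preds, gts, min_overlap=0.5):
    """Fraction of frames with overlap > min_overlap (assumed threshold)."""
    hits = sum(overlap(p, g) > min_overlap for p, g in zip(preds, gts))
    return hits / len(preds)


# Hypothetical two-frame sequence: a perfect frame and a frame shifted by 5 pixels.
predicted = [(0, 0, 10, 10), (0, 0, 10, 10)]
ground_truth = [(0, 0, 10, 10), (5, 0, 10, 10)]
print(accuracy(predicted, ground_truth), success_rate(predicted, ground_truth))
```

Benchmarks often sweep the thresholds and report a curve or its area rather than a single operating point, so the values in the tables may correspond to a different summary than the fixed thresholds used here.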

Algorithm	Ours	SiamFC	MUSTER	DSiamM	ASLA	TLD	MEEM	IVT
Speed /(frame·s^-1)	37.2	56.1	3.7	23.9	8.1	27.8	9.9	39.9

Table 6. Average speed comparison of the algorithms