Visual Simultaneous Localization and Mapping Algorithm Combining Mixed Attention Instance Segmentation

Haowei Jiang; Mengyuan Chen; Xuechao Yuan

doi:10.3788/LOP213265

Laser & Optoelectronics Progress
Vol. 60, Issue 10, 1028008 (2023)

Haowei Jiang¹, Mengyuan Chen¹,²,*, and Xuechao Yuan³

Author Affiliations

¹College of Electrical Engineering, Anhui Polytechnic University, Wuhu 241000, Anhui, China

²Key Laboratory of Advanced Perception and Intelligent Control of High-End Equipment, Wuhu 241000, Anhui, China

³Wuhu Googol Automation Technology Co., Ltd., Wuhu 241000, Anhui, China

Haowei Jiang, Mengyuan Chen, Xuechao Yuan. Visual Simultaneous Localization and Mapping Algorithm Combining Mixed Attention Instance Segmentation[J]. Laser & Optoelectronics Progress, 2023, 60(10): 1028008

Fig. 1. System framework diagram

Fig. 2. Framework diagram of mixed attention Mask-RCNN algorithm

Fig. 3. Proposed backbone network structure

Fig. 4. Spatial attention structure

Fig. 5. Channel attention structure

Fig. 6. Flow chart of mismatch removal

Fig. 7. Instance segmentation results on sequences 02 and 07. (a), (c) Algorithm before improvement; (b), (d) proposed algorithm

Fig. 8. Feature matching results on sequence 00. (a) SURF; (b) ORB; (c) proposed algorithm

Fig. 9. Trajectories on different KITTI sequences. (a) Sequence 10; (b) sequence 01; (c) sequence 06; (d) sequence 07; (e) sequence 09; (f) sequence 00

Fig. 10. Processing time per frame for the three algorithms. (a) ORB-SLAM2; (b) DS-SLAM; (c) proposed algorithm

Fig. 11. TurtleBot3 Burger

Fig. 12. Real experimental environment. (a) Real scene; (b) layout plan

Fig. 13. Image captured at the pentacle position on the first pass. (a) Instance segmentation result of the algorithm before improvement; (b) instance segmentation result of the proposed algorithm

Fig. 14. Image captured at the pentacle position on the second pass. (a) Instance segmentation result of the algorithm before improvement; (b) instance segmentation result of the proposed algorithm

Fig. 15. Trajectory in the real scene

Number	Parameter	Location	Value
0	Conv1_x output	ResNet-50, Conv1_x	(64, H/4, W/4)
1	Conv2_x output	ResNet-50, Conv2_x	(256, H/4, W/4)
2	Conv3_x output	ResNet-50, Conv3_x	(512, H/8, W/8)
3	Conv4_x output	ResNet-50, Conv4_x	(1024, H/16, W/16)
4	Conv5_x output	ResNet-50, Conv5_x	(2048, H/32, W/32)
5	Upsample stride	FPN	2
6	Convolution kernel size	Spatial attention	7×7
7	Activation function	Spatial attention	Sigmoid
8	Activation function	Channel attention, MLP	ReLU
9	Activation function	Channel attention	Sigmoid

Table 1. Main parameters of the mixed attention backbone network
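
The output shapes in Table 1 can be checked for a concrete input size. The following is a minimal sketch that only encodes the channel counts, downsampling factors, and FPN upsample stride listed in the table; the input size 384×1248 and the helper names `stage_shapes` and `upsampled_size` are illustrative assumptions, not values or code from the paper.

```python
# Channel count and downsampling factor of each ResNet-50 stage, as listed in Table 1.
STAGES = [
    ("Conv1_x", 64, 4),
    ("Conv2_x", 256, 4),
    ("Conv3_x", 512, 8),
    ("Conv4_x", 1024, 16),
    ("Conv5_x", 2048, 32),
]
FPN_UPSAMPLE_STRIDE = 2  # Table 1, row 5


def stage_shapes(height, width):
    """Return {stage: (C, H/factor, W/factor)} for an input of height x width."""
    if height % 32 or width % 32:
        # Simplifying assumption of this sketch: no padding or rounding logic.
        raise ValueError("height and width must be multiples of 32")
    return {name: (c, height // f, width // f) for name, c, f in STAGES}


def upsampled_size(shape, stride=FPN_UPSAMPLE_STRIDE):
    """Spatial size (H, W) of a feature map after one FPN top-down upsampling step."""
    _, h, w = shape
    return (h * stride, w * stride)


# Hypothetical input size, chosen only because both sides are divisible by 32.
shapes = stage_shapes(384, 1248)
for name, shape in shapes.items():
    print(name, shape)

# With stride 2, the upsampled Conv5_x map matches the Conv4_x spatial size,
# which is what lets the FPN merge adjacent levels element-wise.
print(upsampled_size(shapes["Conv5_x"]) == shapes["Conv4_x"][1:])
```

The same check holds between Conv4_x and Conv3_x, since each listed factor doubles from one stage to the next from Conv2_x onward.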

Algorithm	Backbone	AP /%	AP₅₀ /%	AP₇₅ /%	AP_S /%	AP_M /%	AP_L /%
Mask-RCNN	ResNet-50-FPN	33.4	54.9	35.3	14.7	35.2	50.1
Proposed algorithm	ResNet-50-MAM-FPN	34.9	57.5	36.9	15.3	36.9	52.5

Table 2. Comparison of AP test results of the algorithms

Sequence	SURF				ORB				Proposed algorithm
Sequence	Matching pairs	Effective matching pairs	Effective matching rate /%	Matching time /s	Matching pairs	Effective matching pairs	Effective matching rate /%	Matching time /s	Matching pairs	Effective matching pairs	Effective matching rate /%	Matching time /s
00	1235	1002	81.1	0.1156	512	395	77.1	0.0089	494	392	79.4	0.0115
01	1254	1030	82.1	0.1172	490	381	77.8	0.0084	489	396	81.0	0.0097
06	1560	1264	81.0	0.1405	607	457	75.3	0.0094	524	424	80.9	0.0122
07	1438	1196	83.2	0.1281	530	405	76.4	0.0090	507	426	84.0	0.0119
09	1320	1088	82.4	0.1261	507	390	76.9	0.0086	501	412	82.2	0.0102
10	1480	1210	81.8	0.1364	552	424	76.8	0.0092	514	424	82.5	0.0121
Average	1381	1132	81.9	0.1273	533	409	76.7	0.0089	505	412	81.7	0.0113
Variance	14376	9476	0.57	0.083×10^-3	4568	648	0.58	0.115×10^-6	140	190	2.1	0.937×10^-6

Table 3. Comparison of effective matching rate and matching time on KITTI
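
The derived columns of Table 3 can be reproduced from the raw pair counts. The sketch below assumes that the effective matching rate is the ratio of effective matching pairs to matching pairs, and that the Variance row is a population variance over the six sequences; both assumptions reproduce the values reported for the proposed algorithm. The names `PROPOSED` and `effective_rate` are illustrative.

```python
from statistics import mean, pvariance

# (matching pairs, effective matching pairs) of the proposed algorithm, from Table 3.
PROPOSED = {
    "00": (494, 392),
    "01": (489, 396),
    "06": (524, 424),
    "07": (507, 426),
    "09": (501, 412),
    "10": (514, 424),
}


def effective_rate(pairs, effective):
    """Effective matching rate in percent."""
    return 100.0 * effective / pairs


# Per-sequence rate, rounded to one decimal as in the table.
rates = {seq: round(effective_rate(p, e), 1) for seq, (p, e) in PROPOSED.items()}

pairs = [p for p, _ in PROPOSED.values()]
effective = [e for _, e in PROPOSED.values()]

# Average and Variance rows (population variance, i.e. divided by n, not n - 1).
summary = {
    "avg_pairs": round(mean(pairs)),
    "var_pairs": round(pvariance(pairs)),
    "avg_effective": round(mean(effective)),
    "var_effective": round(pvariance(effective)),
}

print(rates)
print(summary)
```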

Sequence	ORB-SLAM2			DS-SLAM			Proposed algorithm
Sequence	Average distance error /m	Average angle error /m	Precision rate of loop detection /%	Average distance error /m	Average angle error /m	Precision rate of loop detection /%	Average distance error /m	Average angle error /m	Precision rate of loop detection /%
10	3.15	1.55	-	2.62	0.94	-	2.01	0.82	-
01	3.26	1.39	-	3.01	0.88	-	2.32	0.79	-
06	2.99	1.57	77.9	2.51	0.79	82.3	2.38	0.73	86.4
07	3.05	1.30	-	2.72	0.61	-	2.53	0.50	-
09	3.11	1.43	-	2.87	0.85	-	2.14	0.72	-
00	3.64	1.24	76.6	2.94	0.97	80.4	2.54	0.87	84.7
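
To summarize the distance errors in the table above, one option is to average each algorithm's error over the six sequences and express the difference as a relative reduction. This is a sketch of that arithmetic only; the averages and percentages are derived here from the tabulated values and are not figures reported by the authors, and `relative_reduction` is an illustrative helper.

```python
from statistics import mean

# Average distance error /m per sequence (10, 01, 06, 07, 09, 00), from the table above.
DISTANCE_ERROR = {
    "ORB-SLAM2": [3.15, 3.26, 2.99, 3.05, 3.11, 3.64],
    "DS-SLAM": [2.62, 3.01, 2.51, 2.72, 2.87, 2.94],
    "Proposed": [2.01, 2.32, 2.38, 2.53, 2.14, 2.54],
}


def relative_reduction(baseline, value):
    """Percentage by which value is lower than baseline."""
    return 100.0 * (baseline - value) / baseline


# Mean distance error of each algorithm over the six sequences.
means = {name: mean(errors) for name, errors in DISTANCE_ERROR.items()}

# Reduction achieved by the proposed algorithm relative to each baseline.
reductions = {
    name: relative_reduction(means[name], means["Proposed"])
    for name in ("ORB-SLAM2", "DS-SLAM")
}

for name, value in means.items():
    print(f"{name}: mean distance error {value:.2f} m")
for name, value in reductions.items():
    print(f"Proposed vs {name}: {value:.1f}% lower")
```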