Review of Deep Learning-Based Semantic Segmentation

Xiangfu Zhang; Jian Liu; Zhangsong Shi; Zhonghong Wu; Zhi Wang

doi:10.3788/LOP56.150003

Laser & Optoelectronics Progress
Vol. 56, Issue 15, 150003 (2019)

Xiangfu Zhang, Jian Liu^*, Zhangsong Shi, Zhonghong Wu, and Zhi Wang

Author Affiliations

College of Weapons Engineering, Naval University of Engineering, Wuhan, Hubei 430032, China

Xiangfu Zhang, Jian Liu, Zhangsong Shi, Zhonghong Wu, Zhi Wang. Review of Deep Learning-Based Semantic Segmentation[J]. Laser & Optoelectronics Progress, 2019, 56(15): 150003

Fig. 1. Diagram of basic composition of standard convolutional neural network

Fig. 2. Diagram of max pooling process

Fig. 3. Structural diagram of VGGNet model

Fig. 4. Inception module in GoogLeNet network

Fig. 5. Residual module in ResNet network

Fig. 6. Partial dataset images and corresponding semantic segmentation effect diagrams. (a) PASCAL VOC 2012; (b) PASCAL-CONTEXT; (c) MICROSOFT COCO; (d) CITYSCAPES

Fig. 7. Classification of common deep learning semantic segmentation methods

Fig. 8. FCN network processing diagram

Fig. 9. Structural diagram of SegNet model

Fig. 10. Effect of the number of CRF refinement iterations in DeepLab. (a) Ground truth; (b) CNN output; (c) CRF, 1 iteration; (d) CRF, 2 iterations; (e) CRF, 10 iterations

Fig. 11. Comparison of three models: CRFasRNN, FCN-8s, and DeepLab

Fig. 12. Diagram of pyramid pooling module in PSPNet

Fig. 13. Diagram of multi-scale CNN network architecture proposed by Roy

Fig. 14. Structural diagram of ReSeg model

Fig. 15. Diagram of GRU calculation process

Item	LeNet5	AlexNet	VGGNet	GoogLeNet	ResNet
Year	1994	2012	2014	2014	2015
Layers	7	8	19	22	152
Convolutional layers	2	5	16	21	151
Kernel sizes	5	11, 5, 3	3	7, 1, 3, 5	7, 1, 3, 5
Fully connected layers	3	3	3	1	1
Fully connected layer sizes	120, 84, 10	4096, 4096, 1000	4096, 4096, 1000	1000	1000
Activation function	Sigmoid	ReLU	ReLU	ReLU	ReLU
Classifier	Multi-layer perceptron	Softmax	Softmax	Softmax	Softmax
Data augmentation	×	√	√	√	√
Batch normalization	×	×	×	×	√
Local response normalization	×	√	×	√	×
Graphics processing unit	×	√	√	√	√
Inception	×	×	×	√	×
Dropout	×	√	√	√	√
Top-5 error	N/A	16.4%	7.32%	6.67%	3.57%

Table 1. Information summary of common image classification networks

Dataset	Classes	Samples (training)	Samples (validation)	Samples (test)	Purpose	Year
PASCAL VOC 2012^[18]	21	1464	1449	1452	Generic	2012
PASCAL VOC 2012+^[19]	21	10582	1449	1452	Generic	2014
PASCAL-CONTEXT^[20]	540	4998	5105	-	Generic	2014
PASCAL-PERSON-PART^[20]	6	1716	-	1817	Person	2014
PASCAL-COW-PART^[21]	4	294	-	227	Cow	2015
SBD^[22]	21	8498	2857	-	Generic	2011
MICROSOFT COCO^[23]	80+	82783	40504	81434	Generic	2014
CITYSCAPES (fine)^[24]	19	2975	500	1525	Urban	2015
CITYSCAPES (coarse)^[24]	19	22973	500	-	Urban	2015
CAMVID^[25-26]	32	361	100	233	Driving	2009
KITTI-Ros^[27]	11	170	-	46	Driving	2015
KITTI-Zhang^[28]	10	140	-	112	Driving	2015

Table 2. Information summary of common semantic segmentation datasets

Model name	Year	Architecture	Accuracy	Efficiency	Training	Contribution
FCN^[32]	2015	VGG-16 (FCN)	C	C	C	Forerunner
SegNet^[33]	2017	VGG-16 + Decoder	A	B	C	Encoder-decoder
DeepLab^[34-37]	2017	VGG-16 + ResNet-101	A	C	C	Standalone CRF, atrous convolutions
CRFasRNN^[38]	2015	FCN-8s	C	B	A	CRF reformulated as RNN
ParseNet^[39]	2015	VGG-16	A	C	C	Global context feature fusion
SharpMask^[40]	2016	DeepMask	A	C	C	Top-down refinement module
PSPNet^[41]	2016	ResNet-101	A	B	C	Pyramid pooling module
Multi-scale-CNN-Raj^[42]	2015	VGG-16 (FCN)	A	C	C	Multi-scale architecture
Multi-scale-CNN-Eigen^[43]	2015	Custom	A	C	C	Multi-scale sequential refinement
Multi-scale-CNN-Roy^[44]	2016	Multi-scale-CNN-Eigen	A	C	C	Multi-scale coarse-to-fine refinement
Multi-scale-CNN-Bian^[45]	2016	FCN	B	C	B	Independently trained multi-scale FCNs
ReSeg^[46]	2016	VGG-16 + ReNet	B	C	C	Extension of ReNet to semantic segmentation
LSTM-CF^[47]	2016	Fast R-CNN + DeepMask	A	C	C	Fusion of contextual information from multiple sources
RCNN^[48]	2014	MDRNN	A	B	C	Different input sizes, image context
2D-LSTM^[49]	2015	MDRNN	B	B	C	Image context modelling
DAG-RNN^[50]	2015	Elman network	A	C	C	Graph image structure for context modelling
MINC-CNN^[51]	2015	GoogLeNet (FCN)	C	C	C	Patchwise CNN, standalone CRF
DeepMask^[52]	2015	VGG-A	A	C	C	Proposal generation for segmentation

Table 3. Information summary of common deep learning semantic segmentation methods
