Method based on enlarging receptive field | Method based on dilated convolution | DeepLab v1 | 2014 | Upsampling and structure prediction | CRF | PASCAL VOC 2012, Cityscapes | 71.6, 63.1 |
ENet | 2016 | Decomposition filter and dilated convolution | | Cityscapes, CamVid | 58.3, 51.3 |
DRN | 2017 | Dilated convolution | | | |
Method based on optimizing convolution structure | Deformable | 2017 | Deformable convolution | | PASCAL VOC 2012 | 75.3 |
MobileNet V1 | 2017 | Depthwise separable convolution | | COCO | 70.6 |
MobileNet V2 | 2018 | Improved depthwise separable convolution | | COCO | 71.7 |
TuSimple | 2018 | Dense upsampling convolution and hybrid dilated convolution | | PASCAL VOC 2012 | 83.1 |
Method based on probabilistic graphical model | DSM | 2016 | Modeling CRF through CNN | CRF | PASCAL VOC 2012 | 78.0 |
C&G | 2016 | Embedding CRF into CNN | CRF | PASCAL VOC 2012 | 78.1 |
DPN | 2015 | Integrating CNN with MRF | MRF | PASCAL VOC 2012 | 77.5 |
QO | 2016 | Quadratic optimization | G-CRF | PASCAL VOC 2012 | 80.2 |
HOCRF+ | 2016 | Embedding CRF into CNN | HOCRF | PASCAL VOC 2012 | 77.9 |
Method based on feature fusion | Method based on ASPP | DeepLab v3 | 2017 | Improved dilated convolution and improved ASPP | CRF | PASCAL VOC 2012 | 86.9 |
DeepLab v3+ | 2018 | ASPP module with separable convolution and skip-connection fusion of features from different levels | | PASCAL VOC 2012, Cityscapes | 89.0, 82.1 |
ICNet | 2017 | Cascaded model and feature fusion | | Cityscapes, CamVid | 70.6, 67.1 |
DenseASPP | 2018 | ASPP and densely connected networks to improve receptive field | | Cityscapes | 80.6 |
DMNet | 2019 | Dynamic convolution module and context-aware correlation filter | | PASCAL VOC 2012 | 84.4 |
APCNet | 2019 | Global-guided local affinity (GLA) and adaptive context module (ACM) | | PASCAL VOC 2012 | 84.2 |
Method based on attention mechanism | PSANet | 2018 | Attention mechanism | | PASCAL VOC 2012, Cityscapes | 85.7, 80.1 |
CCNet | 2018 | Dilated convolution and feature weighted fusion | | Cityscapes | 81.4 |
BiSeNet | 2018 | Spatial path and context path | | Cityscapes, CamVid | 78.9, 68.7 |
ACNet | 2019 | Three-parallel-branch architecture with an attention assistant module | | NYUDv2 | 48.3 |
DANet | 2019 | Dilated convolution, deconvolution and feature weighted fusion | | PASCAL VOC 2012, Cityscapes | 82.6, 81.5 |
Method | Model | Year | Key technology | PGM | Dataset | mIoU / % |
Method based on encoding and decoding | SegNet | 2015 | Deconvolution, upsampling and dropout layer | | CamVid | 55.6 |
DeconvNet | 2015 | Deconvolution and unpooling | | PASCAL VOC 2012 | 69.6 |
RefineNet | 2017 | Bilinear interpolation, skip connections and residual connections | | Cityscapes | 73.6 |
GCN+ | 2017 | Large kernel convolution and global convolution network | | PASCAL VOC 2012, Cityscapes | 82.2, 76.9 |
DFANet | 2019 | Deep feature aggregation network | | Cityscapes, CamVid | 70.3, 64.7 |
DUpsampling | 2019 | Fusion of different resolution features | | PASCAL VOC 2012 | 88.1 |
SDN | 2019 | Capturing multi-scale context information to ensure fine recovery of target location information | | PASCAL VOC 2012, CamVid | 86.6, 71.8 |
Method based on RNN | rCNN | 2014 | Multi-size input windows | | SIFT Flow | |
2D-LSTM | 2015 | RNNs scanning in four different directions | | SIFT Flow | |
ReSeg | 2016 | Extension of ReNet | | CamVid | |
Method based on GAN | | 2016 | GAN adversarial training | | PASCAL VOC 2012 | 54.3 |
| 2016 | GAN domain adaptation | | Cityscapes | 67.8 |
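
The mIoU column in the table above is the mean intersection over union: the per-class ratio TP / (TP + FP + FN), averaged over classes and reported in percent. A minimal pure-Python sketch of that computation follows. The function name `mean_iou`, the flat label-list inputs, and the choice to skip classes absent from both prediction and ground truth are assumptions made for illustration; individual benchmarks may handle ignored labels differently.

```python
def mean_iou(y_true, y_pred, num_classes):
    """Mean intersection over union, in percent, over flat label sequences.

    Hypothetical helper for illustration; y_true and y_pred are equal-length
    sequences of integer class labels in the range [0, num_classes).
    """
    # confusion[i][j] counts pixels of ground-truth class i predicted as class j.
    confusion = [[0] * num_classes for _ in range(num_classes)]
    for t, p in zip(y_true, y_pred):
        confusion[t][p] += 1

    ious = []
    for c in range(num_classes):
        tp = confusion[c][c]
        fn = sum(confusion[c]) - tp                    # class c pixels that were missed
        fp = sum(row[c] for row in confusion) - tp     # other pixels labelled as c
        union = tp + fp + fn
        if union > 0:  # assumption: skip classes absent from both label maps
            ious.append(tp / union)

    return 100.0 * sum(ious) / len(ious) if ious else 0.0


if __name__ == "__main__":
    truth = [0, 0, 1, 1]
    pred = [0, 1, 1, 1]
    print(f"mIoU = {mean_iou(truth, pred, num_classes=2):.2f} %")
```

In the toy example, class 0 has IoU 1/2 and class 1 has IoU 2/3, so the mean is about 58.33 %.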