• Chinese Journal of Lasers
  • Vol. 49, Issue 20, 2007205 (2022)
Yuan Yuan, Minghui Chen*, Shuting Ke, Teng Wang, Longxi He, Linjie Lü, Hao Sun, and Jiannan Liu
Author Affiliations
  • Shanghai Engineering Research Center of Interventional Medical, Ministry of Education of Medical Optical Engineering Center, School of Health Sciences and Engineering, University of Shanghai for Science and Technology, Shanghai 200093, China
    DOI: 10.3788/CJL202249.2007205
    Yuan Yuan, Minghui Chen, Shuting Ke, Teng Wang, Longxi He, Linjie Lü, Hao Sun, Jiannan Liu. Fundus Image Classification Research Based on Ensemble Convolutional Neural Network and Vision Transformer[J]. Chinese Journal of Lasers, 2022, 49(20): 2007205
    Fig. 1. Overview of Vit model
    Fig. 2. Structure of MBConv and Fused-MBConv. (a) MBConv; (b) Fused-MBConv
    Fig. 3. SimAM 3D attention weights
    Fig. 4. Confusion matrices of the Vit and EfficientNetV2-S models. (a) Vit; (b) EfficientNetV2-S
    Fig. 5. Heatmaps of abnormal fundus images in the dataset. (a) Abnormalities in the optic disc; (b) abnormalities in the macula
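    | Stage | Operator | Stride | Number of channels | Number of layers |
    |---|---|---|---|---|
    | 0 | Conv 3×3 | 2 | 24 | 1 |
    | 1 | Fused-MBConv1, k 3×3 | 1 | 24 | 2 |
    | 2 | Fused-MBConv4, k 3×3 | 2 | 48 | 4 |
    | 3 | Fused-MBConv4, k 3×3 | 2 | 64 | 4 |
    | 4 | MBConv4, k 3×3, SimAM | 2 | 128 | 6 |
    | 5 | MBConv6, k 3×3, SimAM | 1 | 160 | 9 |
    | 6 | MBConv6, k 3×3, SimAM | 2 | 272 | 15 |
    | 7 | Conv 1×1 & Pooling & FC | - | 1792 | 1 |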
    Table 1. EfficientNetV2-S architecture
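Stages 4 to 6 of Table 1 list SimAM as the attention module inside the MBConv blocks. The following is a minimal sketch of how SimAM-style weighting is commonly computed, written in plain Python for a single channel so that it runs without a deep learning framework. The function name `simam_channel` and the default `e_lambda = 1e-4` are assumptions for illustration, and this page does not show how the authors integrated the module into their network.

```python
import math


def simam_channel(feature_map, e_lambda=1e-4):
    """Apply SimAM-style attention to one channel (an H x W list of floats).

    Each activation is scaled by a sigmoid of its inverse energy, so values
    that deviate strongly from the channel mean receive larger weights.
    """
    values = [v for row in feature_map for v in row]
    if len(values) < 2:
        raise ValueError("need at least two activations per channel")
    n = len(values) - 1  # number of other activations in the channel
    mean = sum(values) / len(values)

    # Squared deviation of every activation from the channel mean.
    sq_dev = [[(v - mean) ** 2 for v in row] for row in feature_map]
    variance = sum(sum(row) for row in sq_dev) / n

    out = []
    for row, dev_row in zip(feature_map, sq_dev):
        new_row = []
        for v, d in zip(row, dev_row):
            inv_energy = d / (4.0 * (variance + e_lambda)) + 0.5
            weight = 1.0 / (1.0 + math.exp(-inv_energy))  # sigmoid
            new_row.append(v * weight)
        out.append(new_row)
    return out


# Toy 3 x 3 channel (hypothetical values): the centre stands out.
toy = [[1.0, 1.0, 1.0],
       [1.0, 5.0, 1.0],
       [1.0, 1.0, 1.0]]
weighted = simam_channel(toy)
centre_weight = weighted[1][1] / toy[1][1]
corner_weight = weighted[0][0] / toy[0][0]
```

In this toy channel the centre activation receives a larger weight than the corners, which is the behaviour the 3D attention weights in Fig. 3 are meant to illustrate. In a real network the same computation would be applied to every channel of a tensor at once.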
    | Category | Number of training images | Number of testing images | Total number of images |
    |---|---|---|---|
    | Normal | 2818 | 409 | 3227 |
    | DR | 1389 | 203 | 1592 |
    | ARMD | 147 | 49 | 196 |
    | Myopia | 234 | 35 | 269 |
    | Cataract | 262 | 43 | 305 |
    Table 2. Fundus dataset
    | Model | Accuracy /% | Precision /% | Specificity /% | Training time /h |
    |---|---|---|---|---|
    | Vit | 91.1 | 86.4 | 97.2 | 11.0 |
    | EfficientNetV2-S | 92.2 | 87.6 | 97.5 | 9.2 |
    | EfficientNet-Vit | 92.7 | 88.3 | 98.1 | |
    Table 3. Accuracy, precision, and specificity of Vit, EfficientNetV2-S, and EfficientNet-Vit models
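The three metrics in Table 3 can all be derived from a multi-class confusion matrix such as those in Fig. 4. The sketch below shows one common way to do this, assuming rows are true classes, columns are predicted classes, and per-class precision and specificity are macro-averaged. The averaging scheme is an assumption, since this page does not state which one the authors used, and the toy matrix is hypothetical rather than taken from the paper.

```python
def classification_metrics(cm):
    """Return (accuracy, macro precision, macro specificity).

    cm[i][j] counts samples of true class i predicted as class j.
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[k][k] for k in range(n)) / total

    precisions, specificities = [], []
    for k in range(n):
        tp = cm[k][k]
        fp = sum(cm[i][k] for i in range(n)) - tp  # predicted k, truly other
        fn = sum(cm[k]) - tp                       # truly k, predicted other
        tn = total - tp - fp - fn
        precisions.append(tp / (tp + fp) if tp + fp else 0.0)
        specificities.append(tn / (tn + fp) if tn + fp else 0.0)

    return accuracy, sum(precisions) / n, sum(specificities) / n


# Hypothetical 3-class confusion matrix for illustration only.
toy_cm = [[8, 1, 1],
          [2, 7, 1],
          [0, 1, 9]]
acc, prec, spec = classification_metrics(toy_cm)
```

Specificity is typically much higher than precision in multi-class problems because the true-negative count for each class includes every correctly handled sample of all other classes, which is consistent with the gap between those two columns in Table 3.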
    | Model | Accuracy /% |
    |---|---|
    | Resnet50 | 87.3 |
    | Densenet121 | 89.5 |
    | ResNeSt-101 | 90.7 |
    | EfficientNet-B0 | 91.3 |
    | TNT-B | 91.1 |
    | EfficientNet-Vit | 92.7 |
    Table 4. Comparison of accuracy indexes of different models
    | Weighted factor | Accuracy /% |
    |---|---|
    | 0.3, 0.7 | 92.0 |
    | 0.4, 0.6 | 92.7 |
    | 0.5, 0.5 | 91.6 |
    Table 5. Comparison of accuracy indexes of models with different weighted factors
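Table 5 compares pairs of weighting factors for combining the two models. A minimal sketch of such a weighted ensemble follows, assuming the fusion is a weighted sum of the per-class probabilities from each model. Both that assumption and the assignment of the first weight to Vit and the second to EfficientNetV2-S are illustrative, because the table does not say which factor belongs to which model. The probability vectors are hypothetical.

```python
CLASSES = ["Normal", "DR", "ARMD", "Myopia", "Cataract"]


def weighted_ensemble(probs_a, probs_b, weight_a, weight_b):
    """Blend two per-class probability vectors with fixed weights."""
    if len(probs_a) != len(probs_b):
        raise ValueError("both models must score the same classes")
    if abs(weight_a + weight_b - 1.0) > 1e-9:
        raise ValueError("weights are expected to sum to 1")
    return [weight_a * pa + weight_b * pb for pa, pb in zip(probs_a, probs_b)]


def predict(probs):
    """Index of the highest blended probability."""
    return max(range(len(probs)), key=probs.__getitem__)


# Hypothetical softmax outputs for one fundus image.
vit_probs = [0.50, 0.40, 0.05, 0.03, 0.02]
cnn_probs = [0.30, 0.60, 0.04, 0.03, 0.03]

# Weights 0.4 and 0.6 correspond to the best-scoring row of Table 5.
blended = weighted_ensemble(vit_probs, cnn_probs, 0.4, 0.6)
label = CLASSES[predict(blended)]
```

Here the two models disagree on the top class, and the blended scores follow the model with the larger weight. Sweeping the weight pair over a validation set, as Table 5 does for three settings, is the usual way to choose the factors.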