• Optics and Precision Engineering
  • Vol. 32, Issue 16, 2564 (2024)
Zhenping XIA1,3,*, Hao CHEN1, Yuning ZHANG2,4, Cheng CHENG1,3, and Fuyuan HU1,3
Author Affiliations
  • 1 School of Electronic & Information Engineering, Suzhou University of Science and Technology, Suzhou 215009, China
  • 2 Display R&D Centre, School of Electronic Science & Engineering, Southeast University, Nanjing 210096, China
  • 3 Jiangsu Industrial Intelligent and Low-carbon Technology Engineering Center, Suzhou 215009, China
  • 4 Shi-Cheng Laboratory for Information Display and Visualization, Nanjing 210013, China
    DOI: 10.37188/OPE.20243216.2564
    Zhenping XIA, Hao CHEN, Yuning ZHANG, Cheng CHENG, Fuyuan HU. Lightweight video super-resolution based on hybrid spatio-temporal convolution[J]. Optics and Precision Engineering, 2024, 32(16): 2564
    Fig. 1. Overall network structure
    Fig. 2. Motion compensation structure
    Fig. 3. Hybrid Spatial-Temporal Convolution
    Fig. 4. 2D Spatial Convolution
    Fig. 5. Similarity-based feature selection
    Fig. 6. Visual results of our network and its variants
    Fig. 7. Reconstruction visual comparisons of the state-of-the-art algorithms and proposed network on three datasets for ×4 SR
    Fig. 8. [in Chinese]
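    | Module | Function | Kernel size |
    |---|---|---|
    | Motion compensation | C_f(·) | 3×3×128 |
    | Motion compensation | C_g(·) | 3×3×128 |
    | Motion compensation | DConv | 3×3×128 |
    | Spatio-temporal feature extraction | C_a(·) | 3×3×128 |
    | Spatio-temporal feature extraction | C_SC(·) | 3×3×128 |
    | Spatio-temporal feature extraction | C_TC(·) | 3×3×3×128 |
    | Spatio-temporal feature extraction | C_fuse(·) | 3×3×128 |
    | Selective feature fusion | θ(·) | 3×3×128 |
    | Selective feature fusion | ϕ(·) | 1×1×128 |
    | Selective feature fusion | C_e(·) | 1×1×128 |
    | Up sampling | | 3×3×48 |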
    Table 1. Architecture of network
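To give a sense of scale for the kernel sizes in Table 1, the following sketch counts the learnable parameters of one 2D 3×3 layer and one 3D 3×3×3 layer with 128 output channels. It assumes 128 input channels and one bias per output channel, neither of which is stated in Table 1, and `conv_param_count` is a hypothetical helper name rather than anything from the paper.

```python
def conv_param_count(kernel_dims, in_channels, out_channels, bias=True):
    """Learnable parameters of one dense (non-grouped) convolution layer."""
    taps = 1
    for k in kernel_dims:
        taps *= k  # number of spatial (and temporal) positions in the kernel
    params = taps * in_channels * out_channels
    if bias:
        params += out_channels  # one bias term per output channel
    return params


# Assumption: 128 input channels, matching the 128 output channels in Table 1.
spatial_2d = conv_param_count((3, 3), 128, 128)      # 2D spatial kernel
temporal_3d = conv_param_count((3, 3, 3), 128, 128)  # 3D spatio-temporal kernel

print(spatial_2d, temporal_3d, temporal_3d / spatial_2d)
```

Under these assumptions a 3D layer holds roughly three times the weights of a 2D layer, which is one plausible reason to mix 2D and 3D convolutions in a lightweight design.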
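    | Model | 3D convolution | 2D spatial convolution | Feature selection module | PSNR | SSIM |
    |---|---|---|---|---|---|
    | TC-VSR | | | | 29.47 | 0.8699 |
    | Deep-TC-VSR | | | | 29.79 | 0.8748 |
    | S-TC-VSR | | | | 29.72 | 0.8734 |
    | S-SC-VSR | | | | 29.61 | 0.8713 |
    | HTSC-VSR | | | | 29.59 | 0.8735 |
    | S-HTSC-VSR (Ours) | | | | 30.51 | 0.8809 |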
    Table 2. Quantitative comparison of the network variants on the SPMCS-11 dataset
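    | Depth | Width | Parameters/M | PSNR/dB | SSIM |
    |---|---|---|---|---|
    | 8 | 64 | 3.9 | 30.21 | 0.8701 |
    | 8 | 128 | 5.2 | 30.27 | 0.8744 |
    | 10 | 64 | 6.8 | 30.43 | 0.8752 |
    | 10 | 128 | 9.7 | 30.51 | 0.8809 |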
    Table 3. Network performance of different widths and depths
    | | MSE loss | L1 loss | Charbonnier loss |
    |---|---|---|---|
    | PSNR | 27.28 | 27.36 | 27.43 |
    Table 4. Average PSNR (dB) over all video frames for different loss functions on the Vid4 dataset
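For readers unfamiliar with the three losses compared in Table 4, here is a minimal pure-Python sketch using their standard definitions. The smoothing constant `eps = 1e-3` is an assumed, commonly used value; the page does not state the value used in the paper, and the function names are illustrative only.

```python
import math


def mse_loss(pred, target):
    """Mean squared error over paired samples."""
    return sum((p - t) ** 2 for p, t in zip(pred, target)) / len(pred)


def l1_loss(pred, target):
    """Mean absolute error over paired samples."""
    return sum(abs(p - t) for p, t in zip(pred, target)) / len(pred)


def charbonnier_loss(pred, target, eps=1e-3):
    """Smooth approximation of L1: mean of sqrt(diff^2 + eps^2).

    eps is an assumed value, not one taken from the paper.
    """
    return sum(math.sqrt((p - t) ** 2 + eps ** 2)
               for p, t in zip(pred, target)) / len(pred)


# Toy intensities in [0, 1]; only the middle sample has an error.
pred = [0.0, 0.5, 1.0]
target = [0.0, 0.0, 1.0]
print(mse_loss(pred, target), l1_loss(pred, target), charbonnier_loss(pred, target))
```

The Charbonnier loss tracks L1 for large errors but stays differentiable at zero, so its value here sits just above the L1 value.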
    | Clip | Bicubic | RCAN [25] | DUF [14] | TDAN [21] | VSR-Transformer [26] | BasicVSR++ [27] | Ours |
    |---|---|---|---|---|---|---|---|
    | Calendar | 20.39/0.5720 | 22.31/0.7248 | 24.04/0.8110 | 23.20/0.7689 | 24.14/0.8157 | 24.23/0.8209 | 24.20/0.8212 |
    | City | 25.16/0.6028 | 26.07/0.6938 | 28.27/0.8313 | 27.18/0.7716 | 27.87/0.8114 | 28.01/0.8137 | 28.03/0.8141 |
    | Foliage | 23.47/0.5666 | 24.69/0.6628 | 26.41/0.7709 | 25.64/0.7284 | 26.29/0.7613 | 26.34/0.7654 | 26.39/0.7665 |
    | Walk | 26.10/0.7974 | 28.64/0.8718 | 30.30/0.9141 | 29.80/0.8940 | 30.91/0.9109 | 31.11/0.9154 | 31.09/0.9157 |
    | Average | 23.78/0.6347 | 25.43/0.7383 | 27.26/0.8318 | 26.46/0.7907 | 27.30/0.8248 | 27.42/0.8289 | 27.43/0.8294 |
    Table 5. Quantitative comparisons of different algorithms for scale factor ×4 on Vid4 dataset(PSNR(dB)/SSIM)
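The PSNR figures reported in these comparisons follow the usual definition, 10·log10(peak² / MSE). The sketch below illustrates it on flat lists of pixel values. It assumes an 8-bit peak of 255 and says nothing about which color channel or border cropping the authors used, since the page does not specify either.

```python
import math


def psnr(reference, restored, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized pixel lists."""
    if len(reference) != len(restored):
        raise ValueError("inputs must have the same number of pixels")
    mse = sum((r - s) ** 2 for r, s in zip(reference, restored)) / len(reference)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(peak ** 2 / mse)


# Toy example: every pixel is off by exactly one gray level, so MSE = 1.
reference = [10, 20, 30, 40]
restored = [11, 21, 31, 41]
print(psnr(reference, restored))
```

Because the scale is logarithmic, the gaps of a few hundredths of a dB between the strongest methods correspond to very small differences in mean squared error.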
    | Clip | Bicubic | RCAN [25] | DUF [14] | TDAN [21] | VSR-Transformer [26] | BasicVSR++ [27] | Ours |
    |---|---|---|---|---|---|---|---|
    | Car_05 | 27.75/0.7825 | 29.84/0.8483 | 30.77/0.8705 | 30.59/0.8653 | 32.13/0.9032 | 32.31/0.9054 | 32.42/0.9063 |
    | hdclub_003 | 19.42/0.4863 | 20.39/0.6100 | 22.06/0.7429 | 21.34/0.6879 | 22.11/0.7387 | 22.19/0.7443 | 22.17/0.7419 |
    | hitachi_isee5 | 19.61/0.5938 | 23.58/0.8371 | 25.75/0.8927 | 24.59/0.8567 | 26.50/0.9069 | 26.73/0.9097 | 26.74/0.9123 |
    | hk004_001 | 28.54/0.8003 | 31.72/0.8628 | 32.96/0.8984 | 32.27/0.8825 | 33.48/0.9046 | 33.59/0.9051 | 33.66/0.9045 |
    | HKVTG_004 | 27.46/0.6831 | 28.77/0.7650 | 29.15/0.7855 | 29.11/0.7788 | 29.57/0.7983 | 29.60/0.7987 | 29.55/0.8013 |
    | jvc_009 | 25.40/0.7558 | 28.29/0.8722 | 29.17/0.8959 | 28.90/0.8832 | 30.46/0.9195 | 30.74/0.9211 | 30.91/0.9216 |
    | NYVTG_006 | 28.45/0.8014 | 30.99/0.8860 | 32.32/0.9058 | 31.90/0.8996 | 33.32/0.9251 | 33.56/0.9269 | 34.11/0.9274 |
    | PRVTG_012 | 25.63/0.7136 | 26.63/0.7811 | 27.35/0.8164 | 27.16/0.8056 | 27.67/0.8253 | 27.79/0.8281 | 27.84/0.8274 |
    | RMVTG_011 | 23.96/0.6573 | 26.05/0.7574 | 27.53/0.8115 | 26.95/0.7924 | 27.71/0.8197 | 27.81/0.8234 | 27.94/0.8252 |
    | veni3_011 | 29.47/0.8979 | 34.54/0.9625 | 34.64/0.9676 | 34.68/0.9645 | 36.53/0.9745 | 36.57/0.9748 | 37.16/0.9752 |
    | veni5_015 | 27.41/0.8483 | 31.01/0.9262 | 31.89/0.9367 | 31.30/0.9275 | 32.77/0.9449 | 33.17/0.9473 | 33.12/0.9466 |
    | Average | 25.73/0.7391 | 28.35/0.8281 | 29.42/0.8659 | 28.98/0.8495 | 30.20/0.8782 | 30.37/0.8804 | 30.51/0.8809 |
    Table 6. Quantitative comparisons of different algorithms for scale factor ×4 on SPMCS-11 dataset(PSNR(dB)/SSIM)
    | Algorithm | Slow motion | Medium motion | Fast motion | Average |
    |---|---|---|---|---|
    | Bicubic | 29.34/0.8330 | 31.29/0.8708 | 34.07/0.9050 | 31.32/0.8684 |
    | RCAN [25] | 32.92/0.9028 | 35.33/0.9265 | 38.45/0.9453 | 35.32/0.9245 |
    | DUF [14] | 33.38/0.9107 | 36.69/0.9442 | 38.86/0.9508 | 36.35/0.9383 |
    | TDAN [21] | 33.17/0.9065 | 36.05/0.9369 | 38.70/0.9491 | 35.87/0.9325 |
    | VSR-Transformer [26] | 34.43/0.9232 | 37.69/0.9517 | 40.26/0.9613 | 37.42/0.9473 |
    | BasicVSR++ [27] | 34.58/0.9256 | 37.75/0.9527 | 40.49/0.9624 | 37.52/0.9486 |
    | Ours | 34.53/0.9246 | 37.81/0.9535 | 40.56/0.9633 | 37.56/0.9490 |
    | Number of clips | 1616 | 4983 | 1225 | 7824 |
    | Average flow magnitude | 0.6 | 2.5 | 8.3 | 3.0 |
    Table 7. Quantitative comparisons of different algorithms for scale factor ×4 on Vimeo-90K-T dataset(PSNR(dB)/SSIM)
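The Average column of Table 7 appears consistent with a mean weighted by the number of clips in each motion category. The short check below reproduces the PSNR averages for the Bicubic and Ours rows from the per-category values. It is a consistency check on the published numbers, not a statement of how the authors computed the column, and the dictionary keys are illustrative labels.

```python
# Clip counts per motion category, from the "Number of clips" row of Table 7.
counts = {"slow": 1616, "medium": 4983, "fast": 1225}

# Per-category PSNR (dB) for two rows of Table 7.
psnr_ours = {"slow": 34.53, "medium": 37.81, "fast": 40.56}
psnr_bicubic = {"slow": 29.34, "medium": 31.29, "fast": 34.07}


def weighted_mean(values, weights):
    """Mean of values weighted by the matching entries in weights."""
    total = sum(weights.values())
    return sum(values[k] * weights[k] for k in weights) / total


avg_ours = weighted_mean(psnr_ours, counts)
avg_bicubic = weighted_mean(psnr_bicubic, counts)
print(sum(counts.values()), round(avg_ours, 2), round(avg_bicubic, 2))
# prints: 7824 37.56 31.32
```

Both results agree with the tabulated averages of 37.56 dB and 31.32 dB, and the counts sum to the 7824 clips listed in the Average column.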
    | Metric | Bicubic | RCAN [25] | TDAN [21] | BasicVSR++ [27] | Ours |
    |---|---|---|---|---|---|
    | NIQE↓ | 7.58 | 6.29 | 6.56 | 6.11 | 6.05 |
    | SSEQ↓ | 54.40 | 46.32 | 44.26 | 41.17 | 40.59 |
    Table 8. Quantitative comparisons on the real-world dataset
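    | Algorithm | PSNR/dB | SSIM | Parameters/M | FLOPs/10^9 | Average running time/s |
    |---|---|---|---|---|---|
    | RCAN [25] | 28.35 | 0.8281 | 15.6 | 261.46 | 1.586 |
    | DUF [14] | 29.42 | 0.8659 | 5.8 | 92.97 | 0.573 |
    | 3DSRNet [13] | 28.98 | 0.8495 | 15.9 | 127.49 | 0.778 |
    | VSR-Transformer [26] | 30.20 | 0.8782 | 43.8 | 834.01 | 1.153 |
    | BasicVSR++ [27] | 30.37 | 0.8804 | 6.4 | 11.07 | 0.067 |
    | Ours | 30.51 | 0.8809 | 9.7 | 19.04 | 0.115 |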
    Table 9. Average running time on SPMCS-11 dataset for ×4 SR