• Laser & Optoelectronics Progress
  • Vol. 57, Issue 18, 181509 (2020)
Ze Zhu1, Qingbing Sang1,2,*, and Hao Zhang1
Author Affiliations
  • 1School of Internet of Things Engineering, Jiangnan University, Wuxi, Jiangsu 214122, China
  • 2Jiangsu Provincial Engineering Laboratory of Pattern Recognition and Computational Intelligence, Wuxi, Jiangsu 214122, China
    DOI: 10.3788/LOP57.181509
    Ze Zhu, Qingbing Sang, Hao Zhang. No Reference Video Quality Assessment Based on Spatio-Temporal Features and Attention Mechanism[J]. Laser & Optoelectronics Progress, 2020, 57(18): 181509
    Fig. 1. Network structure
    Fig. 2. Schematic of GRU network structure
    Fig. 3. Attention model
    Fig. 4. First frame of different distorted videos. (a) Riverbed; (b) sunflower; (c) station; (d) tractor
    Fig. 5. Flow chart of video data processing
    Fig. 6. Scatter plot of prediction results on LIVE video library
    Fig. 7. Evaluation results as a function of the proportion of data used for training
    Fig. 8. Scatter plot of prediction results on CSIQ video library
    Fig. 9. Scatter plot of prediction results on IVP video library
    | Layer name | Output size | Parameter |
    |---|---|---|
    | Conv1, Conv2 | 24000×48×64 | Size: 3×3; filters: 64 |
    | Max pooling1 | 12000×24×64 | Size: 2×2; stride: 2×2 |
    | Conv3, Conv4 | 12000×24×128 | Size: 3×3; filters: 128 |
    | Max pooling2 | 6000×12×128 | Size: 2×2; stride: 2×2 |
    | Conv5, Conv6, Conv7 | 6000×12×256 | Size: 3×3; filters: 256 |
    | Max pooling3 | 3000×6×256 | Size: 2×2; stride: 2×2 |
    | Conv8, Conv9, Conv10 | 3000×6×512 | Size: 3×3; filters: 512 |
    | Max pooling4 | 1500×3×512 | Size: 2×2; stride: 2×2 |
    | Conv11, Conv12, Conv13 | 1500×3×512 | Size: 3×3; filters: 512 |
    | Max pooling5 | 749×1×512 | Size: 2×2; stride: 3×3 |
    | GRU | 1×1×512 | 512 |
    | Attention | 1×512 | / |
    | FC | 1×1 | 1 |
    Table 1. Network parameter setting
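The output sizes in Table 1 through Max pooling4 follow a simple pattern: the channel count changes at each convolution block and the spatial dimensions halve at each pooling stage. The following is a minimal sketch that traces those shapes, assuming "same" padding with stride 1 for the 3×3 convolutions and a 3-channel input of size 24000×48 (neither is stated in the table). The function names are hypothetical. Conv11 to Conv13 and Max pooling5 are left out, because the listed 749×1×512 output depends on padding details the table does not give.

```python
def conv_same(shape, filters):
    """3x3 convolution, ASSUMING 'same' padding and stride 1:
    spatial size is kept, only the channel count changes."""
    height, width, _ = shape
    return (height, width, filters)


def max_pool_2x2(shape):
    """2x2 max pooling with stride 2x2: halves both spatial dimensions."""
    height, width, channels = shape
    return (height // 2, width // 2, channels)


def trace_shapes(input_shape):
    """Return (stage name, output shape) pairs for the first four blocks of Table 1."""
    # (number of conv layers, filters) per block, transcribed from Table 1.
    blocks = [(2, 64), (2, 128), (3, 256), (3, 512)]
    shape = input_shape
    trace = []
    for index, (n_convs, filters) in enumerate(blocks, start=1):
        for _ in range(n_convs):
            shape = conv_same(shape, filters)
        trace.append((f"conv block {index}", shape))
        shape = max_pool_2x2(shape)
        trace.append((f"max pooling{index}", shape))
    return trace


if __name__ == "__main__":
    # The 3 input channels are an assumption; Table 1 lists only layer outputs.
    for name, shape in trace_shapes((24000, 48, 3)):
        print(name, "x".join(str(dim) for dim in shape))
```

Under these assumptions the traced shapes match the table rows from Conv1 through Max pooling4.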
    | Algorithm | SROCC | PLCC |
    |---|---|---|
    | PSNR [23] | 0.5398 | 0.5645 |
    | SSIM [24] | 0.7364 | 0.7470 |
    | ST-MAD [6] | 0.8251 | 0.8332 |
    | STRRED [25] | 0.8007 | 0.8119 |
    | FS-MOVIE [7] | 0.8482 | 0.8636 |
    | V-BLIINDS [4] | 0.8377 | 0.8471 |
    | Ours without attention | 0.8557 | 0.8633 |
    | Ours with attention | 0.8798 | 0.8910 |
    Table 2. Performance comparison of different algorithms on LIVE video library
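SROCC (Spearman rank-order correlation coefficient) measures how well predicted scores preserve the ranking of subjective scores, and PLCC (Pearson linear correlation coefficient) measures their linear agreement. The following is a minimal standard-library sketch of both metrics. It assumes PLCC is computed directly on the raw predictions, without the nonlinear logistic mapping that quality-assessment studies often apply first; whether the paper applies such a mapping is not stated here. The function names and the toy scores are illustrative only.

```python
import math


def pearson(x, y):
    """Pearson linear correlation coefficient (PLCC) of two equal-length sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    norm_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    norm_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (norm_x * norm_y)


def average_ranks(values):
    """1-based ranks; tied values share the average of the ranks they span."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    start = 0
    while start < len(order):
        end = start
        # Extend the run while the next sorted value ties with the current one.
        while end + 1 < len(order) and values[order[end + 1]] == values[order[start]]:
            end += 1
        shared_rank = (start + end) / 2 + 1
        for k in range(start, end + 1):
            ranks[order[k]] = shared_rank
        start = end + 1
    return ranks


def spearman(x, y):
    """Spearman rank-order correlation (SROCC): Pearson correlation of the ranks."""
    return pearson(average_ranks(x), average_ranks(y))


if __name__ == "__main__":
    # Toy scores, not data from the paper.
    predicted = [1.0, 2.0, 3.0, 4.0, 5.0]
    subjective = [1.0, 4.0, 9.0, 16.0, 25.0]
    print("SROCC:", spearman(predicted, subjective))
    print("PLCC:", pearson(predicted, subjective))
```

In the toy example the relation is monotonic but not linear, so SROCC is essentially 1 while PLCC is lower, which illustrates why the tables report both.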
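    | Algorithm | Wireless | IP | H.264 | MPEG-2 |
    |---|---|---|---|---|
    | PSNR [23] | 0.6574 | 0.4167 | 0.4585 | 0.3862 |
    | SSIM [24] | 0.7289 | 0.6534 | 0.7313 | 0.6684 |
    | ST-MAD [6] | 0.8099 | 0.7758 | 0.9021 | 0.8461 |
    | STRRED [25] | 0.7857 | 0.7722 | 0.8193 | 0.7193 |
    | FS-MOVIE [7] | 0.8139 | 0.7722 | 0.8490 | 0.8609 |
    | V-BLIINDS [4] | 0.8455 | 0.7898 | 0.8587 | 0.8377 |
    | Ours without attention | 0.8487 | 0.8316 | 0.8468 | 0.8331 |
    | Ours with attention | 0.8617 | 0.8458 | 0.8585 | 0.8547 |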
    Table 3. Comparison of SROCC values of different algorithms for single distortion type
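    | Algorithm | Wireless | IP | H.264 | MPEG-2 |
    |---|---|---|---|---|
    | PSNR [23] | 0.7058 | 0.4767 | 0.5746 | 0.3986 |
    | SSIM [24] | 0.7184 | 0.7764 | 0.7420 | 0.6222 |
    | ST-MAD [6] | 0.8591 | 0.8065 | 0.8796 | 0.8560 |
    | STRRED [25] | 0.8053 | 0.8527 | 0.8141 | 0.7570 |
    | FS-MOVIE [7] | 0.8599 | 0.8009 | 0.8765 | 0.8721 |
    | V-BLIINDS [4] | 0.9134 | 0.9020 | 0.9038 | 0.8699 |
    | Ours without attention | 0.9069 | 0.9099 | 0.8766 | 0.8745 |
    | Ours with attention | 0.9203 | 0.9177 | 0.8962 | 0.8858 |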
    Table 4. Comparison of PLCC values of different algorithms for single distortion type
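    | Algorithm | LiveData1 SROCC | LiveData1 PLCC | LiveData2 SROCC | LiveData2 PLCC | LiveData3 SROCC | LiveData3 PLCC | LiveData4 SROCC | LiveData4 PLCC | Average SROCC | Average PLCC |
    |---|---|---|---|---|---|---|---|---|---|---|
    | Ours without attention | 0.8478 | 0.8752 | 0.8482 | 0.8672 | 0.8544 | 0.8461 | 0.8724 | 0.8648 | 0.8557 | 0.8633 |
    | Ours with attention | 0.8693 | 0.8908 | 0.8910 | 0.9004 | 0.8852 | 0.8758 | 0.8735 | 0.8969 | 0.8798 | 0.8910 |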
    Table 5. Comparison of final evaluation results on LIVE video library
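The Average columns in Table 5 appear to be the arithmetic mean of the four LiveData splits, and these averages are the values reported for the two variants in Table 2. The following short sketch checks that reading against the tabulated numbers. The dictionary layout and the function name are illustrative, and the comparison uses a tolerance of 1e-4 because the table is rounded to four decimals.

```python
from statistics import mean

# (SROCC, PLCC) for LiveData1..LiveData4, transcribed from Table 5.
TABLE5_SPLITS = {
    "without attention": [
        (0.8478, 0.8752), (0.8482, 0.8672), (0.8544, 0.8461), (0.8724, 0.8648),
    ],
    "with attention": [
        (0.8693, 0.8908), (0.8910, 0.9004), (0.8852, 0.8758), (0.8735, 0.8969),
    ],
}

# Average (SROCC, PLCC) as printed in the last two columns of Table 5.
TABLE5_AVERAGE = {
    "without attention": (0.8557, 0.8633),
    "with attention": (0.8798, 0.8910),
}


def split_average(rows):
    """Arithmetic mean of SROCC and PLCC over the given splits."""
    return mean(r[0] for r in rows), mean(r[1] for r in rows)


if __name__ == "__main__":
    for variant, rows in TABLE5_SPLITS.items():
        srocc, plcc = split_average(rows)
        reported = TABLE5_AVERAGE[variant]
        matches = abs(srocc - reported[0]) < 1e-4 and abs(plcc - reported[1]) < 1e-4
        print(variant, "matches reported average:", matches)
```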
    | Algorithm | Time /s |
    |---|---|
    | PSNR [23] | 3.09 |
    | SSIM [24] | 11.34 |
    | ST-MAD [6] | 335.90 |
    | STRRED [25] | 54.94 |
    | FS-MOVIE [7] | 4444.20 |
    | Ours with attention | 1291.20 |
    Table 6. Comparison of running time of different methods on “Tractor” video
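    | Algorithm | SROCC | PLCC |
    |---|---|---|
    | PSNR [23] | 0.7253 | 0.7932 |
    | SSIM [24] | 0.8661 | 0.8517 |
    | ST-MAD [6] | 0.8174 | 0.8266 |
    | STRRED [25] | 0.8822 | 0.8734 |
    | FS-MOVIE [7] | 0.8067 | 0.8053 |
    | V-BLIINDS [4] | 0.8351 | 0.8449 |
    | Ours with attention | 0.8909 | 0.8991 |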
    Table 7. Performance comparison of different algorithms on CSIQ video library
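    | Algorithm | SROCC | PLCC |
    |---|---|---|
    | PSNR [23] | 0.7064 | 0.7299 |
    | SSIM [24] | 0.7694 | 0.7667 |
    | ST-MAD [6] | 0.8235 | 0.8284 |
    | STRRED [25] | 0.8761 | 0.8853 |
    | FS-MOVIE [7] | 0.8177 | 0.8359 |
    | V-BLIINDS [4] | 0.8552 | 0.8441 |
    | Ours with attention | 0.9064 | 0.9135 |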
    Table 8. Performance comparison of different algorithms on IVP video library