• Laser & Optoelectronics Progress
  • Vol. 58, Issue 24, 2415002 (2021)
Bowen Zheng, Huawei Xia*, Ruidong Chen**, and Qiankun Han***
Author Affiliations
  • School of Electrical and Information Engineering, Tianjin University, Tianjin 300072, China
    DOI: 10.3788/LOP202158.2415002
    Bowen Zheng, Huawei Xia, Ruidong Chen, Qiankun Han. Exposing DeepFake Video Detection Based on Convolutional Long Short-Term Memory Network[J]. Laser & Optoelectronics Progress, 2021, 58(24): 2415002
    Fig. 1. Framework of convolutional LSTM network
    Fig. 2. Influence of number of frames on the classification accuracy on the FaceForensics++ dataset. (a) Gaussian blur frame number; (b) face data removal frame number; (c) total frame number
    | Backbone network | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | AlexNet[22] | 96.89 | 88.64 | 91.97 |
    | VGG16[11] | 90.85 | 92.35 | 94.67 |
    | ResNet[10] | 93.42 | 95.68 | 96.43 |
    | EfficientNet (Ours) | 96.51 | 97.89 | 99.57 |
    Table 1. Comparison of different backbone networks on the FaceForensics++ dataset
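    | Algorithm | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | CNN[21] | 90.00 | 91.45 | 93.40 |
    | SVM[25] | 70.10 | 73.64 | 75.43 |
    | RNN[6] | 93.46 | 95.04 | 95.98 |
    | GRU[24] | 94.48 | 96.18 | 97.54 |
    | LSTM[14] | 94.29 | 96.24 | 96.79 |
    | ConvLSTM[19] | 95.18 | 96.79 | 98.80 |
    | ConvLSTM (with attention) | 96.51 | 97.89 | 99.57 |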
    Table 2. Comparison of different algorithms on the FaceForensics++ dataset
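To make the comparison in Table 2 easier to read, the gap between the attention-based ConvLSTM and each baseline can be computed directly from the reported accuracies. The following is a minimal sketch, not part of the original article: the names `TABLE2` and `gains_over` are hypothetical, and the values are simply copied from the table in (LQ, HQ, RAW) order.

```python
# Accuracies (%) copied from Table 2, ordered as (LQ, HQ, RAW).
# TABLE2 and gains_over are illustrative names, not taken from the paper.
TABLE2 = {
    "CNN": (90.00, 91.45, 93.40),
    "SVM": (70.10, 73.64, 75.43),
    "RNN": (93.46, 95.04, 95.98),
    "GRU": (94.48, 96.18, 97.54),
    "LSTM": (94.29, 96.24, 96.79),
    "ConvLSTM": (95.18, 96.79, 98.80),
    "ConvLSTM (with attention)": (96.51, 97.89, 99.57),
}
QUALITIES = ("LQ", "HQ", "RAW")


def gains_over(reference, table=TABLE2):
    """Return, for every other algorithm, how many percentage points
    the reference algorithm leads by at each quality level."""
    ref = table[reference]
    return {
        name: tuple(round(r - a, 2) for r, a in zip(ref, acc))
        for name, acc in table.items()
        if name != reference
    }


if __name__ == "__main__":
    for name, gain in gains_over("ConvLSTM (with attention)").items():
        cells = ", ".join(f"{q}: {g:+.2f}" for q, g in zip(QUALITIES, gain))
        print(f"{name:<10} {cells}")
```

Read this way, the reported gain of adding attention to the plain ConvLSTM is 1.33, 1.10, and 0.77 percentage points on LQ, HQ, and RAW, respectively.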
    | Model | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | ConvLSTM | 95.18 | 96.79 | 98.80 |
    | ConvLSTM+hard-attention | 95.22 | 96.91 | 99.24 |
    | Self-attention | 95.96 | 97.89 | 98.91 |
    | ConvLSTM+soft-attention | 96.51 | 97.34 | 99.57 |
    Table 3. Experimental results of different attention models on the FaceForensics++ dataset
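    | Number of manipulated frames | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | 2 | 71.32 | 78.41 | 82.17 |
    | 3 | 85.66 | 87.63 | 89.68 |
    | 4 | 88.97 | 90.05 | 92.53 |
    | 5 | 92.65 | 94.32 | 97.43 |
    | 6 | 75.74 | 79.57 | 82.64 |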
    Table 4. Classification accuracy for different numbers of Gaussian-blurred frames on the FaceForensics++ dataset
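    | Number of manipulated frames | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | 3 | 95.22 | 96.27 | 97.48 |
    | 4 | 96.07 | 97.08 | 98.32 |
    | 5 | 96.50 | 97.55 | 99.15 |
    | 6 | 96.74 | 97.91 | 99.57 |
    | 7 | 96.12 | 96.94 | 98.51 |
    | 8 | 94.26 | 95.46 | 97.20 |
    | 9 | 92.37 | 93.84 | 94.78 |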
    Table 5. Experimental results of the facial data removal on the FaceForensics++ dataset
    | Total frame number | Accuracy /% (LQ) | Accuracy /% (HQ) | Accuracy /% (RAW) |
    |---|---|---|---|
    | 5 | 93.43 | 95.06 | 95.95 |
    | 10 | 96.51 | 97.32 | 98.22 |
    | 15 | 96.70 | 98.49 | 99.91 |
    | 20 | 96.74 | 98.08 | 99.05 |
    Table 6. Classification accuracy for different total frame numbers on the FaceForensics++ dataset
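The frame-number sweep in Table 6 can be summarized by picking, for each quality level, the total frame number with the highest reported accuracy. The snippet below is only an illustrative sketch; `TABLE6` and `best_frame_count` are hypothetical names, and the numbers are copied from the table rather than recomputed.

```python
# Accuracies (%) copied from Table 6: total frame number -> (LQ, HQ, RAW).
# TABLE6 and best_frame_count are illustrative names, not from the paper.
TABLE6 = {
    5: (93.43, 95.06, 95.95),
    10: (96.51, 97.32, 98.22),
    15: (96.70, 98.49, 99.91),
    20: (96.74, 98.08, 99.05),
}
QUALITIES = ("LQ", "HQ", "RAW")


def best_frame_count(table=TABLE6):
    """For each quality level, return (frame_number, accuracy) of the
    row with the highest reported accuracy in that column."""
    best = {}
    for col, quality in enumerate(QUALITIES):
        frames = max(table, key=lambda n: table[n][col])
        best[quality] = (frames, table[frames][col])
    return best


if __name__ == "__main__":
    for quality, (frames, acc) in best_frame_count().items():
        print(f"{quality}: best at {frames} frames ({acc:.2f} %)")
```

On the reported numbers, 15 frames gives the highest accuracy for HQ and RAW, while LQ peaks at 20 frames by a margin of only 0.04 percentage points over 15.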
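    | Reference | Method | Classifier | Classification accuracy /% | Dataset |
    |---|---|---|---|---|
    | Ref.[1] | Mesoscopic features | CNN | 83.2 | F2F |
    | Ref.[21] | Steganalysis features | CNN | 91.0 | F2F |
    | | | | 94.0 | DF |
    | | | | 93.0 | FS |
    | | | | 81.0 | NT |
    | Ref.[6] | Temporal features | RNN | 94.3 | F2F |
    | Ref.[27] | Temporal features | Optical Flow | 81.6 | F2F |
    | Ref.[28] | Deep learning features | 3DCNN | 95.1 | DF |
    | | | | 92.3 | FS |
    | This work | Interframe features | ConvLSTM | 96.5 | F2F |
    | | | | 96.7 | DF |
    | | | | 94.9 | FS |
    | | | | 92.7 | NT |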
    Table 7. Comparison of classification experimental results of different algorithms on the FaceForensics++ dataset