• Laser & Optoelectronics Progress
  • Vol. 56, Issue 13, 131003 (2019)
Lingyu Liang1,2,3,**, Tiantian Zhang1,3, and Wei He1,*
Author Affiliations
  • 1 Key Laboratory of Wireless Sensor Network and Communication, Shanghai Institute of Microsystem and Information Technology, Chinese Academy of Sciences, Shanghai 201800, China
  • 2 School of Information Science and Technology, ShanghaiTech University, Shanghai 200120, China
  • 3 University of Chinese Academy of Sciences, Beijing 100049, China
    DOI: 10.3788/LOP56.131003
    Lingyu Liang, Tiantian Zhang, Wei He. Head Pose Estimation Based on Multi-Scale Convolutional Neural Network[J]. Laser & Optoelectronics Progress, 2019, 56(13): 131003
    Fig. 1. Head pose estimation flow chart
    Fig. 2. Deep neural network structure for head pose estimation. (a) Multi-scale convolution structure; (b) process structure after feature combination
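The multi-scale convolution structure named in Fig. 2(a) can be illustrated with a minimal pure-Python sketch. It assumes three parallel branches with 3×3, 5×5, and 7×7 kernels (the sizes compared in Table 1) whose outputs are stacked as channels. The helper names, the averaging kernels, and the 8×8 toy image are hypothetical placeholders; the paper's actual layer counts, channel widths, and learned weights are not given here.

```python
def conv2d_same(image, kernel):
    """Slide a square, odd-sized kernel over a 2D list with zero padding,
    so the output has the same height and width as the input.
    (This is cross-correlation, the operation CNN layers usually compute.)"""
    h, w = len(image), len(image[0])
    k = len(kernel)
    r = k // 2
    out = [[0.0] * w for _ in range(h)]
    for y in range(h):
        for x in range(w):
            acc = 0.0
            for dy in range(k):
                for dx in range(k):
                    yy, xx = y + dy - r, x + dx - r
                    # Positions outside the image count as zeros.
                    if 0 <= yy < h and 0 <= xx < w:
                        acc += image[yy][xx] * kernel[dy][dx]
            out[y][x] = acc
    return out


def box_kernel(size):
    """Hypothetical placeholder weights: a simple averaging kernel.
    A trained network would learn these values instead."""
    return [[1.0 / (size * size)] * size for _ in range(size)]


def multi_scale_features(image, sizes=(3, 5, 7)):
    """Run one convolution branch per kernel size and stack the
    resulting feature maps as channels (the feature combination step)."""
    return [conv2d_same(image, box_kernel(s)) for s in sizes]


# Toy 8x8 input; each branch keeps the spatial size, so maps can be stacked.
image = [[float(x + y) for x in range(8)] for y in range(8)]
features = multi_scale_features(image)
print(len(features), len(features[0]), len(features[0][0]))  # 3 8 8
```

Because every branch preserves the input size, the feature maps from different kernel sizes line up pixel by pixel and can be combined before the later processing stage shown in Fig. 2(b).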
    Fig. 3. Sample images from the experimental head pose datasets. (a) CAS-PEAL-R1; (b) Pointing'04
    Fig. 4. Face images after cropping
    Fig. 5. Sample head pose images under different interference factors. (a) Standard; (b) with mask; (c) with glasses; (d) expression; (e) weak illumination; (f) strong illumination; (g) background
    Convolution | Recognition accuracy on test set /%
    Multi-scale convolution | 98.9
    3×3 single-scale convolution | 94.3
    5×5 single-scale convolution | 93.1
    7×7 single-scale convolution | 93.2
    Table 1. Multi-scale convolution vs. single-scale convolution
    Algorithm | Pointing'04 accuracy /% | CAS-PEAL-R1 accuracy /% | Number of poses
    Algorithm of this paper | 96.5 | 98.9 | 21
    Cluster-classification Bayesian network[23] | 94.8 | 96.2 | 12
    Based on facial feature points[24] | 92.7 | 93.5 | 6
    Table 2. Accuracy of different algorithms on Pointing'04 and CAS-PEAL-R1
    Pose after conversion | Pitch before conversion | Yaw before conversion /(°)
    Level | PM | 0, -15, +15
    Level left | PM | +30, +45
    Level right | PM | -30, -45
    Pitch down | PD | 0, -15, +15
    Pitch up | PU | 0, -15, +15
    Left up | PU | +30, +45
    Right up | PU | -30, -45
    Left down | PD | +30, +45
    Right down | PD | -30, -45
    Table 3. Pose conversion relationships
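The conversion in Table 3 amounts to a lookup from a pitch code and a yaw angle to one of nine coarse pose labels. The sketch below encodes the table directly. The function and dictionary names are hypothetical, and reading PM, PD, and PU as the level, downward, and upward pitch codes is an assumption based on the rows of the table.

```python
# Yaw angles (degrees) grouped by the horizontal direction they map to in Table 3.
YAW_GROUPS = {
    "center": {0, -15, 15},
    "left": {30, 45},
    "right": {-30, -45},
}

# (pitch code, horizontal direction) -> converted pose label, one entry per table row.
POSE_LABELS = {
    ("PM", "center"): "Level",
    ("PM", "left"): "Level left",
    ("PM", "right"): "Level right",
    ("PD", "center"): "Pitch down",
    ("PU", "center"): "Pitch up",
    ("PU", "left"): "Left up",
    ("PU", "right"): "Right up",
    ("PD", "left"): "Left down",
    ("PD", "right"): "Right down",
}


def convert_pose(pitch_code, yaw_deg):
    """Return the converted pose label for a pitch code and a yaw angle.
    Raises ValueError for combinations that Table 3 does not list."""
    if pitch_code not in ("PM", "PD", "PU"):
        raise ValueError(f"unknown pitch code: {pitch_code!r}")
    for direction, angles in YAW_GROUPS.items():
        if yaw_deg in angles:
            return POSE_LABELS[(pitch_code, direction)]
    raise ValueError(f"yaw angle not listed in Table 3: {yaw_deg!r}")


print(convert_pose("PU", 45))   # Left up
print(convert_pose("PD", -30))  # Right down
```

With three pitch codes and seven yaw angles, the table covers 21 original combinations and collapses them into nine labels.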
    Interference factor | Accuracy of this paper /% | Accuracy of method in Ref. [23] /%
    Standard | 98.5 | 96.3
    With mask | 92.9 | 86.4
    With glasses | 96.3 | 89.7
    Expression | 98.1 | 94.9
    Weak illumination | 95.5 | 91.8
    Strong illumination | 96.1 | 92.2
    Background | 97.7 | 95.1
    Table 4. Effect of different interference factors on recognition rate
    Resolution /(pixel×pixel) | Time of this paper /ms | Time of method in Ref. [23] /ms | Time of method in Ref. [24] /ms
    1920×1080 | 34.3 | 51.3 | 562.1
    1360×760 | 32.7 | 48.5 | 451.7
    800×600 | 30.8 | 45.2 | 349.8
    Table 5. Recognition time at different resolutions