• Laser & Optoelectronics Progress
  • Vol. 57, Issue 2, 21006 (2020)
Yan Fenting, Wang Peng*, Lü Zhigang, Ding Zhe, and Qiao Mengyu
Author Affiliations
  • School of Electronics and Information Engineering, Xi'an Technological University, Xi'an, Shaanxi 710021, China
    DOI: 10.3788/LOP57.021006
    Yan Fenting, Wang Peng, Lü Zhigang, Ding Zhe, Qiao Mengyu. Real-Time Multi-Person Video-Based Pose Estimation[J]. Laser & Optoelectronics Progress, 2020, 57(2): 21006

    Abstract

    Multi-person pose estimation in images and videos must address the inaccurate positioning of human bounding boxes and improve the detection accuracy of hard keypoints. This paper presents a real-time multi-person pose-estimation model based on a top-down framework. First, depthwise separable convolution is introduced into the target-detection algorithm to increase the running speed of the human detector. Then, the feature pyramid network is combined with contextual semantic information, and the online hard-example mining algorithm is applied to address the low detection accuracy at hard keypoints. Finally, a spatial transformer network and a pose-similarity calculation are combined to eliminate redundant poses and improve the accuracy of bounding-box positioning. The average precision of the proposed model on the 2017 MS COCO Test-dev dataset is 14.84% higher than that of the Mask R-CNN model and 2.43% higher than that of the RMPE model, and the model runs at 22 frame·s⁻¹.
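    The abstract's first step replaces standard convolutions in the human detector with depthwise separable ones to raise running speed. A minimal sketch of why this helps, counting parameters for the two convolution types (the channel counts and kernel size below are hypothetical, not taken from the paper):

    ```python
    def standard_conv_params(c_in, c_out, k):
        # A standard k x k convolution learns one k x k filter
        # per (input channel, output channel) pair.
        return c_in * c_out * k * k

    def depthwise_separable_params(c_in, c_out, k):
        # Depthwise stage: one k x k filter per input channel.
        depthwise = c_in * k * k
        # Pointwise stage: a 1 x 1 convolution mixing channels.
        pointwise = c_in * c_out
        return depthwise + pointwise

    # Hypothetical layer: 256 -> 256 channels, 3 x 3 kernel.
    c_in, c_out, k = 256, 256, 3
    std = standard_conv_params(c_in, c_out, k)          # 589824
    sep = depthwise_separable_params(c_in, c_out, k)    # 67840
    print(std, sep, round(std / sep, 1))                # roughly 8.7x fewer parameters
    ```

    The same factorization reduces multiply-accumulate operations by a similar factor, which is the speedup the detector exploits.
    
    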
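    The final step eliminates redundant poses via a pose-similarity calculation. The abstract does not specify the metric; a common choice in top-down pipelines such as RMPE-style pose NMS is object keypoint similarity (OKS), sketched below with hypothetical keypoint data:

    ```python
    import math

    def oks(pose_a, pose_b, sigmas, scale):
        """Average object keypoint similarity between two poses.

        pose_a, pose_b: lists of (x, y) keypoint coordinates.
        sigmas: per-keypoint falloff constants.
        scale: object scale (e.g., sqrt of the bounding-box area).
        """
        sims = []
        for (xa, ya), (xb, yb), s in zip(pose_a, pose_b, sigmas):
            d2 = (xa - xb) ** 2 + (ya - yb) ** 2
            sims.append(math.exp(-d2 / (2.0 * (scale ** 2) * (s ** 2))))
        return sum(sims) / len(sims)

    # Two near-duplicate detections of the same person (hypothetical coordinates):
    pose_1 = [(100.0, 50.0), (110.0, 80.0), (90.0, 80.0)]
    pose_2 = [(101.0, 51.0), (111.0, 81.0), (91.0, 81.0)]
    sigmas = [0.026, 0.079, 0.079]   # COCO-style per-keypoint constants
    similarity = oks(pose_1, pose_2, sigmas, scale=120.0)
    # A similarity above a chosen threshold would mark pose_2 as redundant.
    print(round(similarity, 3))
    ```

    In a full pipeline, poses whose similarity to a higher-scoring pose exceeds the threshold are suppressed, leaving one bounding box and pose per person.
    
    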