[1] Yan S, Xiong Y, Lin D. Spatial temporal graph convolutional networks for skeleton-based action recognition. [C]∥Thirty-Second AAAI Conference on Artificial Intelligence, February 2-7, 2018, Hilton New Orleans Riverside, New Orleans, Louisiana, USA. USA: AAAI, 7444-7452(2018).
[3] Toshev A, Szegedy C. DeepPose: human pose estimation via deep neural networks. [C]∥2014 IEEE Conference on Computer Vision and Pattern Recognition, June 23-28, 2014, Columbus, OH, USA. New York: IEEE, 1653-1660(2014).
[4] Fan X C, Zheng K, Lin Y W et al. Combining local appearance and holistic view: dual-source deep neural networks for human pose estimation. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 1347-1355(2015).
[5] Carreira J, Agrawal P, Fragkiadaki K et al. Human pose estimation with iterative error feedback. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 4733-4742(2016).
[6] Yang W, Li S, Ouyang W L et al. Learning feature pyramids for human pose estimation. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 1290-1299(2017).
[7] Newell A, Yang K Y, Deng J. Stacked hourglass networks for human pose estimation[M]. ∥Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science. Cham: Springer, 9912, 483-499(2016).
[8] Tompson J J, Jain A. LeCun Y, et al. Joint training of a convolutional network and a graphical model for human pose estimation. [C]∥Advances in Neural Information Processing Systems 27 (NIPS 2014), December 8-13, 2014, Montreal, Quebec, Canada. Canada: NIPS(2014).
[9] Yang W, Ouyang W L, Li H S et al. End-to-end learning of deformable mixture of parts and deep convolutional neural networks for human pose estimation. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 3073-3082(2016).
[10] Cao Z, Simon T, Wei S E et al. Realtime multi-person 2D pose estimation using part affinity fields. [C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 1302-1310(2017).
[11] He K M, Gkioxari G, Dollar P et al. Mask R-CNN. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 2980-2988(2017).
[12] Ren S Q, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 1137-1149(2017).
[14] Fang H S, Xie S Q, Tai Y W et al. RMPE: regional multi-person pose estimation. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 2353-2362(2017).
[15] Redmon J. -04-08)[2019-05-16]. https:∥arxiv., org/abs/1804, 02767(2018).
[17] Chen L C, Zhu Y K, Papandreou G et al. Encoder-decoder with atrous separable convolution for semantic image segmentation[M]. ∥Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science. Cham: Springer, 11211, 833-851(2018).
[18] Shrivastava A, Gupta A, Girshick R. Training region-based object detectors with online hard example mining. [C]∥2016 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 27-30, 2016, Las Vegas, NV, USA. New York: IEEE, 761-769(2016).