[2] Peng X J, Wang L M, Wang X X et al. Bag of visual words and fusion methods for action recognition: comprehensive study and good practice[J]. Computer Vision and Image Understanding, 150, 109-125(2016).
[3] Hinton G E, Salakhutdinov R R. Reducing the dimensionality of data with neural networks[J]. Science, 313, 504-507(2006).
[4] Russakovsky O, Deng J, Su H et al. ImageNet large scale visual recognition challenge[J]. International Journal of Computer Vision, 115, 211-252(2015).
[8] Ng J Y H, Hausknecht M, Vijayanarasimhan S et al. Beyond short snippets: deep networks for video classification. [C]∥2015 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), June 7-12, 2015, Boston, MA, USA. New York: IEEE, 4694-4702(2015).
[9] Gu D F. 3D densely connected convolutional network for the recognition of human shopping actions[D]. Ottawa, Ontario: University of Ottawa(2017).
[10] Tran D, Bourdev L, Fergus R et al. Learning spatiotemporal features with 3D convolutional networks. [C]∥2015 IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. New York: IEEE, 4489-4497(2016).
[11] Hara K, Kataoka H, Satoh Y. Learning spatio-temporal features with 3D residual networks for action recognition. [C]∥2017 IEEE International Conference on Computer Vision Workshops (ICCVW), October 22-29, 2017, Venice, Italy. New York: IEEE, 3154-3160(2018).
[12] Wang X L, Girshick R, Gupta A et al. Non-local neural networks. [C]∥2018 IEEE/CVF Conference on Computer Vision and Pattern Recognition, June 18-23, 2018, Salt Lake City, UT, USA. New York: IEEE, 7794-7803(2018).
[14] Shen Z Q, Liu Z, Li J G et al. DSOD: learning deeply supervised object detectors from scratch. [C]∥2017 IEEE International Conference on Computer Vision (ICCV), October 22-29, 2017, Venice, Italy. New York: IEEE, 1937-1945(2017).
[15] He K M, Zhang X Y, Ren S Q et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification. [C]∥2015 IEEE International Conference on Computer Vision (ICCV), December 7-13, 2015, Santiago, Chile. New York: IEEE, 1026-1034(2015).
[16] Glorot X, Bengio Y. Understanding the difficulty of training deep feedforward neural networks. [C]∥Proceedings of the 13th International Conference on Artificial Intelligence and Statistics, May 13-15, 2010, Chia Laguna Resort, Sardinia, Italy. Cambridge: PMLR, 9, 249-256(2010).