Improved Encoder-Decoder Temporal Action Detection Algorithm

Yue Wang; Hansong Su; Gaohua Liu

doi:10.3788/LOP202158.2020001
