• Laser & Optoelectronics Progress
  • Vol. 59, Issue 16, 1610004 (2022)
Zitong Ma and Guodong Wang*
Author Affiliations
  • College of Computer Science & Technology, Qingdao University, Qingdao 266071, Shandong, China
    DOI: 10.3788/LOP202259.1610004
    Zitong Ma, Guodong Wang. Human Instance Segmentation Based on Two-Stream Convolutional Neural Network[J]. Laser & Optoelectronics Progress, 2022, 59(16): 1610004
    References

    [1] He K M, Gkioxari G, Dollár P et al. Mask R-CNN[C], 2980-2988(2017).

    [2] Li Q Q, Hua X H, Zhao B F et al. New method for plane segmentation of indoor scene point cloud[J]. Chinese Journal of Lasers, 48, 1604002(2021).

    [3] Chen H, Sun K Y, Tian Z et al. BlendMask: top-down meets bottom-up for instance segmentation[C], 8570-8578(2020).

    [4] Zhang X Y, Cao J L. Contour-point refined mask prediction for single-stage instance segmentation[J]. Acta Optica Sinica, 40, 2115001(2020).

    [5] Girshick R. Fast R-CNN[C], 1440-1448(2015).

    [6] Cai Z W, Vasconcelos N. Cascade R-CNN: high quality object detection and instance segmentation[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 43, 1483-1498(2021).

    [7] Zhang Z Y, Fidler S, Urtasun R. Instance-level segmentation for autonomous driving with deep densely connected MRFs[C], 669-677(2016).

    [8] George A, Marcel S. Cross modal focal loss for RGBD face anti-spoofing[C], 7878-7887(2021).

    [9] Shi Y C, Yu X, Sohn K et al. Towards universal representation learning for deep face recognition[C], 6816-6825(2020).

    [10] Ma X, Zhang F D, Li Y L et al. Robust sparse representation based face recognition in an adaptive weighted spatial pyramid structure[J]. Science China Information Sciences, 61, 1-13(2017).

    [11] Cao Z, Simon T, Wei S H et al. Realtime multi-person 2D pose estimation using part affinity fields[C], 1302-1310(2017).

    [12] Jain A, Tompson J, Andriluka M et al. Learning human pose estimation features with convolutional networks[EB/OL]. https://arxiv.org/abs/1312.7302

    [13] Artacho B, Savakis A. UniPose: unified human pose estimation in single images and videos[C], 7033-7042(2020).

    [14] Lifkooee M Z, Liu C L, Liang Y Q et al. Real-time avatar pose transfer and motion generation using locally encoded Laplacian offsets[J]. Journal of Computer Science and Technology, 34, 256-271(2019).

    [15] Newell A, Huang Z A, Deng J. Associative embedding: end-to-end learning for joint detection and grouping[EB/OL]. https://arxiv.org/abs/1611.05424

    [16] Zhang S H, Li R L, Dong X et al. Pose2Seg: detection free human instance segmentation[C], 889-898(2019).

    [17] Tripathi S, Collins M, Brown M et al. Pose2Instance: harnessing keypoints for person instance segmentation[EB/OL]. https://arxiv.org/abs/1704.01152

    [18] Papandreou G, Zhu T, Chen L C et al. PersonLab: person pose estimation and instance segmentation with a bottom-up, part-based, geometric embedding model[M]. Ferrari V, Hebert M, Sminchisescu C, et al. Computer vision-ECCV 2018. Lecture notes in computer science, 11218, 282-299(2018).

    [19] Dalal N, Triggs B. Histograms of oriented gradients for human detection[C], 886-893(2005).

    [20] Belongie S, Malik J, Puzicha J. Shape matching and object recognition using shape contexts[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 24, 509-522(2002).

    [21] Lowe D G. Distinctive image features from scale-invariant keypoints[J]. International Journal of Computer Vision, 60, 91-110(2004).

    [22] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[J]. Communications of the ACM, 60, 84-90(2017).

    [23] Hu J, Shen L, Sun G. Squeeze-and-excitation networks[C], 7132-7141(2018).

    [24] Xu K, Ba J, Kiros R et al. Show, attend and tell: neural image caption generation with visual attention[EB/OL]. https://arxiv.org/abs/1502.03044

    [25] Cui H H, Lou H C, Tian W et al. High-precision visual positioning of hole-making datum for orbital crawling robot[J]. Acta Optica Sinica, 41, 0915002(2021).

    [26] He K M, Zhang X Y, Ren S Q et al. Deep residual learning for image recognition[C], 770-778(2016).

    [27] Lin T Y, Dollár P, Girshick R et al. Feature pyramid networks for object detection[C], 2117-2125(2017).

    [28] He K M, Zhang X Y, Ren S Q et al. Delving deep into rectifiers: surpassing human-level performance on ImageNet classification[C], 1026-1034(2015).

    [29] Hou Q B, Zhou D Q, Feng J S. Coordinate attention for efficient mobile network design[C], 13708-13717(2021).
