Deep transfer learning for fine-grained categorization on micro datasets

Wang Ronggui; Yao Xuchen; Yang Juan; Xue Lixia

doi:10.12086/oee.2019.180416

[1] Krizhevsky A, Sutskever I, Hinton G E. ImageNet classification with deep convolutional neural networks[C]//Proceedings of 2012 International Conference on Neural Information Processing Systems, Lake Tahoe, USA, 2012: 1097–1105.

[2] Zhang N, Donahue J, Girshick R, et al. Part-based R-CNNs for fine-grained category detection[C]//Proceedings of the 13th European Conference on Computer Vision, Zurich, Switzerland, 2014: 834–849.

[3] Branson S, Van Horn G, Belongie S, et al. Bird species categorization using pose normalized deep convolutional nets[OL]. arXiv preprint arXiv:1406.2952 [cs.CV].

[4] Simon M, Rodner E. Neural activation constellations: unsupervised part model discovery with convolutional networks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 1143–1151.

[5] Tan B, Song Y Q, Zhong E H, et al. Transitive transfer learning[C]//Proceedings of the 21st ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, Sydney, NSW, Australia, 2015: 1155–1164.

[6] Tzeng E, Hoffman J, Darrell T, et al. Simultaneous deep transfer across domains and tasks[C]//Proceedings of 2015 IEEE International Conference on Computer Vision, Santiago, Chile, 2015: 4068–4076.

[7] Ge W F, Yu Y Z. Borrowing treasures from the wealthy: deep transfer learning through selective joint fine-tuning[C]//Proceedings of 2017 IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, USA, 2017: 10–19.

[8] Yosinski J, Clune J, Bengio Y, et al. How transferable are features in deep neural networks?[C]//Proceedings of 2014 International Conference on Neural Information Processing Systems, Montreal, Canada, 2014: 3320–3328.

[9] Chopra S, Hadsell R, LeCun Y. Learning a similarity metric discriminatively, with application to face verification[C]//Proceedings of the 2005 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Diego, CA, USA, 2005, 1: 539–546.

[10] Iandola F N, Han S, Moskewicz M W, et al. SqueezeNet: AlexNet-level accuracy with 50x fewer parameters and <0.5MB model size[OL]. arXiv preprint arXiv:1602.07360 [cs.CV].

[11] Jia Y Q, Shelhamer E, Donahue J, et al. Caffe: convolutional architecture for fast feature embedding[C]//Proceedings of the 22nd ACM International Conference on Multimedia, Orlando, USA, 2014: 675–678.

[12] Xie S N, Yang T B, Wang X Y, et al. Hyper-class augmented and regularized deep learning for fine-grained image classification[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 2645–2654.

[13] Deng J, Dong W, Socher R, et al. ImageNet: A large-scale hierarchical image database[C]//Proceedings of 2009 IEEE Conference on Computer Vision and Pattern Recognition, Miami, FL, USA, 2009: 248–255.

[14] Wah C, Branson S, Welinder P, et al. The Caltech-UCSD birds-200-2011 dataset[R]. California: California Institute of Technology, 2011.

[15] Stark M, Krause J, Pepik B, et al. Fine-grained categorization for 3D scene understanding[J]. International Journal of Robotics Research, 2011, 30(13): 1543–1552.

[16] Krause J, Stark M, Deng J, et al. 3D object representations for fine-grained categorization[C]//Proceedings of 2013 IEEE International Conference on Computer Vision Workshops, Sydney, Australia, 2013: 554–561.

[17] Parkhi O M, Vedaldi A, Zisserman A, et al. Cats and dogs[C]//Proceedings of 2012 IEEE Conference on Computer Vision and Pattern Recognition, Providence, USA, 2012: 3498–3505.

[18] Lazebnik S, Schmid C, Ponce J. A maximum entropy framework for part-based texture and object recognition[C]//Proceedings of the 10th IEEE International Conference on Computer Vision, Beijing, China, 2005, 1: 832–838.

[19] Bo L F, Ren X F, Fox D. Kernel descriptors for visual recognition[C]//Proceedings of the 23rd International Conference on Neural Information Processing Systems, Vancouver, Canada, 2010: 244–252.

[20] Murray N, Perronnin F. Generalized max pooling[C]//Proceedings of 2014 IEEE Conference on Computer Vision and Pattern Recognition, Columbus, OH, USA, 2014: 2473–2480.

[21] Khamis S, Lampert C H. CoConut: co-classification with output space regularization[C]//Proceedings of 2014 British Machine Vision Conference, Nottingham, UK, 2014.

[22] Wang Y M, Choi J, Morariu V I, et al. Mining discriminative triplets of patches for fine-grained classification[C]//Proceedings of 2016 IEEE Conference on Computer Vision and Pattern Recognition, Las Vegas, USA, 2016: 1163–1172.

[23] Deng J, Krause J, Li F F. Fine-grained crowdsourcing for fine-grained recognition[C]//Proceedings of 2013 IEEE Conference on Computer Vision and Pattern Recognition, Portland, USA, 2013: 580–587.

[24] Wu X M, Mori M, Kashino K. Data-driven taxonomy forest for fine-grained image categorization[C]//Proceedings of 2015 IEEE International Conference on Multimedia and Expo, Turin, Italy, 2015: 1–6.

[25] Escalante H J, Ponce-López V, Escalera S, et al. Evolving weighting schemes for the bag of visual words[J]. Neural Computing and Applications, 2017, 28(5): 925–939.

[26] Iscen A, Tolias G, Gosselin P H, et al. A comparison of dense region detectors for image search and fine-grained classification[J]. IEEE Transactions on Image Processing, 2015, 24(8): 2369–2381.

[27] Kobayashi T. Low-rank bilinear classification: efficient convex optimization and extensions[J]. International Journal of Computer Vision, 2014, 110(3): 308–327.

[28] Hang S T, Aono M. Bi-linearly weighted fractional max pooling: an extension to conventional max pooling for deep convolutional neural network[J]. Multimedia Tools and Applications, 2017, 76(21): 22095–22117.

[29] Ionescu R T, Popescu M. Have a SNAK. Encoding spatial information with the spatial non-alignment kernel[C]//Proceedings of 18th International Conference on Image Analysis and Processing, Genoa, Italy, 2015: 97–108.

[30] Qian Q, Jin R, Zhu S H, et al. Fine-grained visual categorization via multi-stage metric learning[C]//Proceedings of 2015 IEEE Conference on Computer Vision and Pattern Recognition, Boston, USA, 2015: 3716–3724.

[31] Wang J J, Yang J C, Yu K, et al. Locality-constrained linear coding for image classification[C]//Proceedings of 2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, San Francisco, USA, 2010: 3360–3367.