Extraction Method of Interest Text in Image Based on Recurrent Neural Network

Hengjie Yang; Zheng Yan; Zongling Wu; Dingbang Fang; Fang Duan

doi:10.3788/LOP56.241501

[1] Oliveira D A B, Viana M P. Fast CNN-based document layout analysis. [C]∥2017 IEEE International Conference on Computer Vision Workshops (ICCVW), October 22-29, 2017, Venice, Italy. New York: IEEE, 1173-1180(2017).

[2] Le V P, Nayef N, Visani M et al. Text and non-text segmentation based on connected component features. [C]∥2015 13th International Conference on Document Analysis and Recognition (ICDAR), August 23-26, 2015, Tunis, Tunisia. New York: IEEE, 1096-1100(2015).

[3] Okun O, Doermann D, Pietikainen M. Page segmentation and zone classification: the state of the art[R]. Fort Belvoir: Defense Technical Information Center(1999).

[4] Moll M A, Baird H S. Segmentation-based retrieval of document images from diverse collections[J]. Proceedings of SPIE, 6815, 68150L(2008). http://spie.org/Publications/Proceedings/Paper/10.1117/12.767295

[5] Bukhari S S, Al Azawi M I A, Shafait F et al. Document image segmentation using discriminative learning over connected components. [C]∥Proceedings of the 8th IAPR International Workshop on Document Analysis Systems-DAS '10, June 9-11, 2010, Boston, Massachusetts, USA. New York: ACM, 183-190(2010).

[6] Ye Q X, Doermann D. Text detection and recognition in imagery: a survey[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 37, 1480-1500(2015).

[7] Liu X Y, Meng G F, Pan C H. Scene text detection and recognition with advances in deep learning: a survey[J]. International Journal on Document Analysis and Recognition (IJDAR), 22, 143-162(2019).

[8] Epshtein B, Ofek E, Wexler Y. Detecting text in natural scenes with stroke width transform. [C]∥2010 IEEE Computer Society Conference on Computer Vision and Pattern Recognition, June 13-18, 2010, San Francisco, CA, USA. New York: IEEE, 2963-2970(2010).

[9] Neumann L, Matas J. A method for text localization and recognition in real-world images[M]. ∥Kimmel R, Klette R, Sugimoto A. Computer vision-ACCV 2010. Lecture notes in computer science. Berlin, Heidelberg: Springer, 6494, 770-783(2011).

[10] Wang K, Babenko B, Belongie S. End-to-end scene text recognition. [C]∥2011 International Conference on Computer Vision, November 6-13, 2011, Barcelona, Spain. New York: IEEE, 1457-1464(2011).

[11] Huang W L, Qiao Y, Tang X O. Robust scene text detection with convolution neural network induced MSER trees[M]. ∥Fleet D, Pajdla T, Schiele B, et al. Computer vision-ECCV 2014. Lecture notes in computer science. Cham: Springer, 8692, 497-511(2014).

[12] Fang D B, Feng G, Cao H Y et al. Handwritten formula symbol recognition based on multi-feature convolutional neural network[J]. Laser & Optoelectronics Progress, 56, 072001(2019).

[13] Wang X, Liu Y, Li G Y. Moving object detection algorithm based on improved visual background extractor algorithm[J]. Laser & Optoelectronics Progress, 56, 011007(2019).

[14] Zhao H, An W S. Image salient object detection combined with deep learning[J]. Laser & Optoelectronics Progress, 55, 121003(2018).

[15] Jaderberg M, Vedaldi A, Zisserman A. Deep features for text spotting[M]. ∥Fleet D, Pajdla T, Schiele B, et al. Computer vision-ECCV 2014. Lecture notes in computer science. Cham: Springer, 8692, 512-528(2014).

[16] Liao M, Shi B, Bai X et al. TextBoxes: a fast text detector with a single deep neural network. [C]∥Thirty-First AAAI Conference on Artificial Intelligence (AAAI-17), February 4-10, 2017, San Francisco, California, USA. USA: AAAI Press, 4161-4167(2017).

[17] Zhou X Y, Yao C, Wen H et al. EAST: an efficient and accurate scene text detector. [C]∥2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), July 21-26, 2017, Honolulu, HI, USA. New York: IEEE, 2642-2651(2017).

[18] Tian Z, Huang W L, He T et al. Detecting text in natural image with connectionist text proposal network[M]. ∥Leibe B, Matas J, Sebe N, et al. Computer vision-ECCV 2016. Lecture notes in computer science. Cham: Springer, 9912, 56-72(2016).

[19] Jaderberg M, Simonyan K, Vedaldi A et al. Synthetic data and artificial neural networks for natural scene text recognition[EB/OL]. (2014-12-09)[2019-05-04]. https://arxiv.org/abs/1406.2227.

[20] Liao M, Zhang J, Wan Z et al. Scene text recognition from two-dimensional perspective. [C]∥Proceedings of the AAAI Conference on Artificial Intelligence, January 27-February 1, 2019, Hilton Hawaiian Village, Honolulu, Hawaii, USA, 30, 8714-8721(2019).

[21] Shi B G, Bai X, Yao C. An end-to-end trainable neural network for image-based sequence recognition and its application to scene text recognition[J]. IEEE Transactions on Pattern Analysis and Machine Intelligence, 39, 2298-2304(2017).

[22] Huang Z H, Xu W, Yu K. Bidirectional LSTM-CRF models for sequence tagging[EB/OL]. (2015-08-09)[2019-05-04]. https://arxiv.org/abs/1508.01991.

[23] Lafferty J D, McCallum A, Pereira F C N. Conditional random fields: probabilistic models for segmenting and labeling sequence data. [C]∥Proceedings of the Eighteenth International Conference on Machine Learning, June 28-July 1, 2001, Williams College, Williamstown, MA, USA. USA: ACM, 282-289(2001).

[24] Chiu J P C, Nichols E. Named entity recognition with bidirectional LSTM-CNNs[J]. Transactions of the Association for Computational Linguistics, 4, 357-370(2016).

[25] Hochreiter S, Schmidhuber J. Long short-term memory[J]. Neural Computation, 9, 1735-1780(1997).

[26] Graves A, Mohamed A R, Hinton G. Speech recognition with deep recurrent neural networks. [C]∥2013 IEEE International Conference on Acoustics, Speech and Signal Processing, May 26-31, 2013, Vancouver, BC, Canada. New York: IEEE, 6645-6649(2013).

[27] Ratnaparkhi A. A maximum entropy model for part-of-speech tagging. [C]∥Conference on Empirical Methods in Natural Language Processing, May 17-18, 1996, Philadelphia, PA, USA. [S.l.: s.n.](1996).

[28] McCallum A, Freitag D, Pereira F C N. Maximum entropy Markov models for information extraction and segmentation. [C]∥Proceedings of the Seventeenth International Conference on Machine Learning, June 29-July 2, 2000, Stanford, CA, USA. USA: ACM, 591-598(2000).

[29] Ren S, He K M, Girshick R et al. Faster R-CNN: towards real-time object detection with region proposal networks. [C]∥Neural Information Processing Systems (NIPS), December 7-12, 2015, Palais des Congrès de Montréal, Montréal, Canada. Canada: NIPS, 91-99(2015).

[30] Shi X F. CHINESE-OCR[EB/OL]. (2018-04-14)[2019-05-04]. https://github.com/xiaofengShi/CHINESE-OCR.