Journal of Semiconductors, Vol. 41, Issue 2, 021403 (2020)
Jin Song 1,2,3, Xuemeng Wang 3,4, Zhipeng Zhao 3,4, Wei Li 1, and Tian Zhi 1
Author Affiliations
  • 1SKL of Computer Architecture, Institute of Computing Technology, Chinese Academy of Sciences, Beijing 100190, China
  • 2University of Chinese Academy of Sciences, Beijing 100049, China
  • 3Cambricon Tech. Ltd, Beijing 100191, China
  • 4University of Science and Technology of China, Hefei 230026, China
    DOI: 10.1088/1674-4926/41/2/021403
    Jin Song, Xuemeng Wang, Zhipeng Zhao, Wei Li, Tian Zhi. A survey of neural network accelerator with software development environments[J]. Journal of Semiconductors, 2020, 41(2): 021403
    References

    [1] W Huang, Z Jing. Multi-focus image fusion using pulse coupled neural network. Pattern Recogn Lett, 28, 1123(2007).

    [2] J K Paik, A K Katsaggelos. Image restoration using a modified Hopfield network. IEEE Trans Image Process, 1, 49(1992).

    [3] X Li, L Zhao, L Wei et al. DeepSaliency: multi-task deep neural network model for salient object detection. IEEE Trans Image Process, 25, 3919(2016).

    [4] Y Zhu, R Urtasun, R Salakhutdinov et al. segDeepM: exploiting segmentation and context in deep neural networks for object detection. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 4703(2015).

    [5] A Graves, A R Mohamed, G Hinton. Speech recognition with deep recurrent neural networks. 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, 6645(2013).

    [6] O Abdel-Hamid, A R Mohamed, H Jiang et al. Convolutional neural networks for speech recognition. IEEE/ACM Trans Audio Speech Language Process, 22, 1533(2014).

    [7] R Collobert, J Weston. A unified architecture for natural language processing. International Conference on Machine Learning(2008).

    [8] R Sarikaya, G E Hinton, A Deoras. Application of deep belief networks for natural language understanding. IEEE/ACM Trans Audio Speech Language Process, 22, 778(2014).

    [9]

    [10] W S McCulloch, W Pitts. A logical calculus of the ideas immanent in nervous activity. Bull Math Biophys, 5, 115(1943).

    [11] F Rosenblatt. The perceptron: a probabilistic model for information storage and organization in the brain. Psychol Rev, 65, 386(1958).

    [12]

    [13] G E Hinton, S Osindero, Y Teh. A fast learning algorithm for deep belief nets. Neur Comput, 18, 1527(2006).

    [14]

    [15] K He, X Zhang, S Ren et al. Deep residual learning for image recognition. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 770(2016).

    [16] C Szegedy, W Liu, Y Jia et al. Going deeper with convolutions. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, 1(2015).

    [17] C Szegedy, S Ioffe, V Vanhoucke et al. Inception-v4, Inception-ResNet and the impact of residual connections on learning. National Conference on Artificial Intelligence, 4278(2016).

    [18]

    [19] F Mamalet, C Garcia. Simplifying ConvNets for fast learning. International Conference on Artificial Neural Networks, 58(2012).

    [20]

    [21]

    [22] S Hochreiter, J Schmidhuber. Long short-term memory. Neur Comput, 9, 1735(1997).

    [23] A Vaswani, N Shazeer, N Parmar et al. Attention is all you need. Advances in Neural Information Processing Systems, 5998(2017).

    [24]

    [25]

    [26]

    [27] D D Lin, S S Talathi, V S Annapureddy. Fixed point quantization of deep convolutional networks. International Conference on Machine Learning, 2849(2016).

    [28] J Xue, J Li, D Yu et al. Singular value decomposition based low-footprint speaker adaptation and personalization for deep neural network. IEEE International Conference on Acoustics, Speech and Signal Processing(2014).

    [29] E Park, J Ahn, S Yoo. Weighted-entropy-based quantization for deep neural networks. IEEE Conference on Computer Vision and Pattern Recognition(2017).

    [30] L Song, Y Wang, Y Han et al. C-Brain: A deep learning accelerator that tames the diversity of CNNs through adaptive data-level parallelization. Design Automation Conference(2016).

    [31] R J Kuo, Y L An, H S Wang et al. Integration of self-organizing feature maps neural network and genetic K-means algorithm for market segmentation. Expert Syst Appl, 30, 313(2006).

    [32] T Roska, G Bártfai, P Szolgay et al. A digital multiprocessor hardware accelerator board for cellular neural networks: CNN-HAC. Int J Circuit Theory Appl, 20, 589(1992).

    [33]

    [34] A Page, A Jafari, C Shea et al. SPARCNet: a hardware accelerator for efficient deployment of sparse convolutional networks. ACM J Emerg Technol Comput Syst, 13, 1(2017).

    [35] T Chen, Y Chen, M Duranton et al. BenchNN: On the broad potential application scope of hardware neural network accelerators. 2012 IEEE International Symposium on Workload Characterization (IISWC), 36(2012).

    [36] C Farabet, C Poulet, J Y Han et al. CNP: An FPGA-based processor for convolutional networks. International Conference on Field Programmable Logic and Applications(2009).

    [37] S Zhang, Z Du, L Zhang et al. Cambricon-X: An accelerator for sparse neural networks. The 49th Annual IEEE/ACM International Symposium on Microarchitecture, 20(2016).

    [38] Y Yu, T Zhi, X Zhou et al. BSHIFT: a low cost deep neural networks accelerator. Int J Parallel Program, 47, 360(2019).

    [39] A Shafiee, A Nag, N Muralimanohar et al. ISAAC: a convolutional neural network accelerator with in-situ analog arithmetic in crossbars. 2016 ACM/IEEE 43rd Annual International Symposium on Computer Architecture (ISCA)(2016).

    [40] Y H Chen, T Krishna, J S Emer et al. Eyeriss: an energy-efficient reconfigurable accelerator for deep convolutional neural networks. IEEE J Solid-State Circuits, 52, 127(2017).

    [41] N P Jouppi, C Young, N Patil et al. In-datacenter performance analysis of a tensor processing unit. 2017 ACM/IEEE 44th Annual International Symposium on Computer Architecture (ISCA), 1(2017).

    [42] Y Chen, T Chen, Z Xu et al. DianNao family: energy-efficient hardware accelerators for machine learning. Commun ACM, 59, 105(2016).

    [43] T Chen, Z Du, N Sun et al. DianNao: a small-footprint high-throughput accelerator for ubiquitous machine-learning. Proceedings of the 19th International Conference on Architectural Support for Programming Languages and Operating Systems(2014).

    [44] Y Chen, T Luo, S Liu et al. DaDianNao: a machine-learning supercomputer. Proceedings of the 47th Annual IEEE/ACM International Symposium on Microarchitecture, 609(2014).

    [45] Z Du, R Fasthuber, T Chen et al. ShiDianNao: shifting vision processing closer to the sensor. ACM/IEEE International Symposium on Computer Architecture(2015).

    [46] D Liu, T Chen, S Liu et al. PuDianNao: a polyvalent machine learning accelerator. ACM SIGARCH Comput Architect News, 43, 369(2015).

    [47] Z Du, K Palem, A Lingamneni et al. Leveraging the error resilience of machine-learning applications for designing highly energy efficient accelerators. 2014 19th Asia and South Pacific Design Automation Conference (ASP-DAC), 201(2014).

    [48] G Estrin. Organization of computer systems: the fixed plus variable structure computer. Western Joint IRE-AIEE-ACM Computer Conference, 33(1960).

    [49]

    [50] A Majumdar, S Cadambi, M Becchi et al. A massively parallel, energy efficient programmable accelerator for learning and classification. ACM Trans Architect Code Optim, 9, 1(2012).

    [51]

    [52] K Ando, K Ueyoshi, K Orimo et al. BRein memory: a single-chip binary/ternary reconfigurable in-memory deep neural network accelerator achieving 1.4 TOPS at 0.6 W. IEEE J Solid-State Circuits, 53, 983(2017).

    [53] J Lee, C Kim, S H Kang et al. UNPU: a 50.6 TOPS/W unified deep neural network accelerator with 1b-to-16b fully-variable weight bit-precision. International Solid-State Circuits Conference, 218(2018).

    [54] W You, C Wu. A reconfigurable accelerator for sparse convolutional neural networks. Proceedings of the 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 119(2019).

    [55] S Liu, Z Du, J Tao et al. Cambricon: An instruction set architecture for neural networks. ACM SIGARCH Comput Architect News, 44, 393(2016).

    [56] Y Zhao, Z Du, Q Guo et al. Cambricon-F: machine learning computers with fractal von Neumann architecture. International Symposium on Computer Architecture, 788(2019).

    [57] M Abadi, P Barham, J Chen et al. TensorFlow: a system for large-scale machine learning. 12th USENIX Symposium on Operating Systems Design and Implementation (OSDI 16), 265(2016).

    [58] Y Jia, E Shelhamer, J Donahue et al. Caffe: Convolutional architecture for fast feature embedding. Proceedings of the 22nd ACM International Conference on Multimedia, 675(2014).

    [59]

    [60]

    [61] H Lan, Z Du. DLIR: an intermediate representation for deep learning processors. IFIP International Conference on Network and Parallel Computing, 169(2018).

    [62] W Du, L Wu, X Chen et al. ZhuQue: a neural network programming model based on labeled data layout. International Symposium on Advanced Parallel Processing Technologies, 27(2019).

    [63]

    [64]

    [65] C Mendis, J Bosboom, K Wu et al. Helium: lifting high-performance stencil kernels from stripped x86 binaries to Halide DSL code. Program Language Des Implem, 50, 391(2015).

    [66] J Song, Y Zhuang, X Chen et al. Compiling optimization for neural network accelerators. International Symposium on Advanced Parallel Processing Technologies, 15(2019).
