[1] A Krizhevsky, I Sutskever, G E Hinton. ImageNet classification with deep convolutional neural networks. Neural Information Processing Systems (NIPS), 1097(2012).
[2] S Liang, S Yin, L Liu et al. FP-BNN: Binarized neural network on FPGA. Neurocomputing, 275, 1072(2017).
[3]
[4] C Zhang, P Li, G Sun et al. Optimizing FPGA-based accelerator design for deep convolutional neural networks. The 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 161(2015).
[5] J Qiu, J Wang, S Yao et al. Going deeper with embedded FPGA platform for convolutional neural network. The 2016 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 26(2016).
[6] S Yin, P Ouyang, S Tang et al. A high energy efficient reconfigurable hybrid neural network processor for deep learning applications. IEEE J Solid-State Circuits, 53, 968(2018).
[7] S Han, H Mao, W J Dally. Deep compression: compressing deep neural networks with pruning, trained quantization and huffman coding. ICLR(2016).
[8] P Gysel, M Motamedi, S Ghiasi. Hardware-oriented approximation of convolutional neural networks. ICLR(2016).
[9] S Han, X Liu, H Mao et al. EIE: efficient inference engine on compressed deep neural network. International Symposium on Computer Architecture (ISCA), 243(2016).
[10] A Zhou, A Yao, Y Guo et al. Incremental network quantization: towards lossless CNNs with low-precision weights. ICLR(2017).
[11]
[12] I Hubara, M Courbariaux, D Soudry. Binarized neural networks. Neural Information Processing Systems (NIPS), 1(2016).
[13] Y Umuroglu, N J Fraser, G Gambardella et al. FINN: A framework for fast, scalable binarized neural network inference. International Symposium on Field-Programmable Gate Arrays, 65(2017).
[14] A Boutros, S Yazdanshenas, V Betz. Embracing diversity: Enhanced DSP blocks for low precision deep learning on FPGAs. 28th International Conference on Field-Programmable Logic and Applications, 35(2018).
[15]
[16]
[17] A Boutros, M Eldafrawy, S Yazdanshenas et al. Math doesn’t have to be hard: logic block architectures to enhance low-precision multiply-accumulate on FPGAs. The 2019 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 94(2019).
[18] J H Kim, J Lee, J H Anderson. FPGA architecture enhancements for efficient BNN implementation. International Conference on Field-Programmable Technology (ICFPT), 217(2018).
[19]
[20]
[21]
[22]
[23] S Yazdanshenas, V Betz. Automatic circuit design and modelling for heterogeneous FPGAs. International Conference on Field-Programmable Technology (ICFPT), 9(2017).
[24] S Yazdanshenas, V Betz. COFFE 2: Automatic modelling and optimization of complex and heterogeneous FPGA architectures. ACM Trans Reconfig Technol Syst, 12, 3(2018).
[25]
[26]
[27] M Langhammer, B Pasca. Floating-point DSP block architecture for FPGAs. The 2015 ACM/SIGDA International Symposium on Field-Programmable Gate Arrays, 117(2015).
[28]