[1] Z Yun, L Jiang, S Wang et al. Design of reconfigurable array processor for multimedia application. Multimed Tools Appl, 77, 3639(2018).
[2] X Shi, X Luo, J Liang et al. Frog: Asynchronous graph processing on GPU with hybrid coloring model. IEEE Trans Knowl Data Eng, 30, 29(2017).
[3] Y Wang, A Davidson, Y Pan et al. Gunrock: A high-performance graph processing library on the GPU. ACM SIGPLAN Notices, 51, 11(2016).
[4]
[5]
[6] R J Tian, L Jiang, J Y Deng et al. Design and implementation of reconfigurable viewport transformation unit in embedded GPU. Mini-Micro Syst, 39, 1074(2018).
[7]
[8]
[9] C Yang, Y Wang, X Wang et al. WRA: A 2.2-to-6.3 TOPS highly unified dynamically reconfigurable accelerator using a novel Winograd decomposition algorithm for convolutional neural networks. IEEE Trans Circuits Syst I, 66, 3480(2019).
[10] L Liu, Z Li, C Yang et al. Hrea: An energy-efficient embedded dynamically reconfigurable fabric for 13-dwarfs processing. IEEE Trans Circuits Syst II, 65, 381(2017).
[11] S M A H Jafri, M Daneshtalab, N Abbas et al. Transmap: Transformation based remapping and parallelism for high utilization and energy efficiency in CGRAs. IEEE Trans Comput, 65, 3456(2016).
[12]
[13] Y Kim, H Joo, S Yoon. Inter-coarse-grained reconfigurable architecture reconfiguration technique for efficient pipelining of kernel-stream on coarse-grained reconfigurable architecture-based multi-core architecture. IET Circuits, Devices Syst, 10, 251(2016).
[14]
[15]
[16]
[17]
[18]
[19]
[20] Y S Wang, L B Liu, S Y Yin et al. Hierarchical representation of on-chip context to reduce reconfiguration time and implementation area for coarse-grained reconfigurable architecture. Sci Chin Inform Sci, 56, 1(2013).
[21] Y Kim, R N Mahapatra. Dynamic context compression for low-power coarse-grained reconfigurable architecture. IEEE Trans Very Large Scale Integr Syst, 18, 15(2009).
[22] A Venkat, D M Tullsen. Harnessing ISA diversity: Design of a heterogeneous-ISA chip multiprocessor. ACM SIGARCH Comput Architect News, 42, 121(2014).
[23]
[24] J Y Deng, T Li, L Jiang et al. Design and optimization for multiprocessor interactive GPU. The Journal of China Universities of Posts and Telecommunications, 21, 85(2014).
[25] J Y Deng, T Li, L Jiang et al. Design and implementation of the graphics accelerator oriented to OpenGL. Journal of Xidian University, 42, 124(2015).
[26] J Y Deng, T Li, L Jiang et al. The design of multiprocessor interactive GPU MIGPU-9. J Comput Aid Des Comput Graph, 26, 1468(2014).
[27] X B Shen, Z X Liu, R Wang et al. The unified model of computer architectures. Chin J Computs, 30, 729(2007).
[28]
[29]
[30] X T Zhang, L Jiang, J Y Deng et al. Design and Implementation of global controller in reconfigurable video array processor. Microelectron Comput, 34, 75(2017).