Photonics Research, Vol. 13, Issue 2, 497 (2025)
Ying Zhu1, Lu Xu1, Xin Hua1, Kailai Liu2, Yifan Liu1, Ming Luo2, Jia Liu1, Ziyue Dang1, Ye Liu1, Min Liu1, Hongguang Zhang1, Daigao Chen1, Lei Wang3, Xi Xiao1,3,* and Shaohua Yu3
Author Affiliations
  • 1National Information Optoelectronic Innovation Center, China Information and Communication Technologies Group Corporation (CICT), Wuhan 430074, China
  • 2State Key Laboratory of Optical Communication Technologies and Networks, China Information and Communication Technologies Group Corporation (CICT), Wuhan 430074, China
  • 3Peng Cheng Laboratory, Shenzhen 518055, China
DOI: 10.1364/PRJ.536939
Ying Zhu, Lu Xu, Xin Hua, Kailai Liu, Yifan Liu, Ming Luo, Jia Liu, Ziyue Dang, Ye Liu, Min Liu, Hongguang Zhang, Daigao Chen, Lei Wang, Xi Xiao, Shaohua Yu, "Silicon photonics convolution accelerator based on coherent chips with sub-1 pJ/MAC power consumption," Photonics Res. 13, 497 (2025)
    References

    [1] A. Krizhevsky, I. Sutskever, G. E. Hinton. ImageNet classification with deep convolutional neural networks. Commun. ACM, 60, 84-90(2017).

    [2] M. Tan, Q. Le. EfficientNet: rethinking model scaling for convolutional neural networks. International Conference on Machine Learning, 6105-6114(2019).

    [3] A. Vaswani, N. Shazeer, N. Parmar. Attention is all you need. Proceedings of the 31st International Conference on Neural Information Processing Systems, 6000-6010(2017).

    [4] T. Brown, B. Mann, N. Ryder. Language models are few-shot learners. Proceedings of the 34th International Conference on Neural Information Processing Systems, 33, 1877-1901(2020).

    [5] A. Baevski, W.-N. Hsu, A. Conneau. Unsupervised speech recognition. Proceedings of the 35th International Conference on Neural Information Processing Systems, 34, 27826-27839(2021).

    [6] S. Secinaro, D. Calandra, A. Secinaro. The role of artificial intelligence in healthcare: a structured literature review. BMC Med. Inform. Decis. Mak., 21, 125(2021).

    [7] C.-J. Wu, R. Raghavendra, U. Gupta. Sustainable AI: environmental implications, challenges and opportunities. Proc. Mach. Learn. Syst., 4, 795-813(2022).

    [8] M. Le Gallo, R. Khaddam-Aljameh, M. Stanisavljevic. A 64-core mixed-signal in-memory compute chip based on phase-change memory for deep neural network inference. Nat. Electron., 6, 680-693(2023).

    [9] D. B. Strukov, G. S. Snider, D. R. Stewart. The missing memristor found. Nature, 453, 80-83(2008).

    [10] M. Jerry, P.-Y. Chen, J. Zhang. Ferroelectric FET analog synapse for acceleration of deep neural network training. IEEE International Electron Devices Meeting (IEDM), 6.2.1-6.2.4(2017).

    [11] K. Ni, X. Yin, A. F. Laguna. Ferroelectric ternary content-addressable memory for one-shot learning. Nat. Electron., 2, 521-529(2019).

    [12] G. Singh, L. Chelini, S. Corda. Near-memory computing: past, present, and future. Microprocess. Microsyst., 71, 102868(2019).

    [13] K. Roy, I. Chakraborty, M. Ali. In-memory computing in emerging memory technologies for machine learning: an overview. 57th ACM/IEEE Design Automation Conference (DAC), 1-6(2020).

    [14] A. Sebastian, M. Le Gallo, R. Khaddam-Aljameh. Memory devices and applications for in-memory computing. Nat. Nanotechnol., 15, 529-544(2020).

    [15] M. V. DeBole, B. Taba, A. Amir. TrueNorth: accelerating from zero to 64 million neurons in 10 years. Computer, 52, 20-29(2019).

    [16] S. Schmitt, J. Klähn, G. Bellec. Neuromorphic hardware in the loop: training a deep spiking network on the BrainScaleS wafer-scale system. International Joint Conference on Neural Networks (IJCNN), 2227-2234(2017).

    [17] S. Han, H. Mao, W. J. Dally. Deep compression: compressing deep neural networks with pruning, trained quantization and Huffman coding. arXiv(2015).

    [18] S. Han. Efficient methods and hardware for deep learning. Ph.D. dissertation, Stanford University(2017).

    [19] Y. Zhu, G. L. Zhang, T. Wang. Statistical training for neuromorphic computing using memristor-based crossbars considering process variations and noise. Design, Automation & Test in Europe Conference & Exhibition (DATE), 1590-1593(2020).

    [20] P. Spilger, E. Müller, A. Emmel. hxtorch: PyTorch for BrainScaleS-2: perceptrons on analog neuromorphic hardware. IoT Streams for Data-Driven Predictive Maintenance and IoT, Edge, and Mobile for Embedded Machine Learning, 189-200(2020).

    [21] G. Wetzstein, A. Ozcan, S. Gigan. Inference in artificial intelligence with deep optics and photonics. Nature, 588, 39-47(2020).

    [22] H. Zhou, J. Dong, J. Cheng. Photonic matrix multiplication lights up photonic accelerator and beyond. Light Sci. Appl., 11(2022).

    [23] J. Cardenas, C. B. Poitras, J. T. Robinson. Low loss etchless silicon photonic waveguides. Opt. Express, 17, 4752-4757(2009).

    [24] L. Vivien, A. Polzer, D. Marris-Morini. Zero-bias 40 Gbit/s germanium waveguide photodetector on silicon. Opt. Express, 20, 1096-1101(2012).

    [25] L. Yang, L. Zhang, R. Ji. On-chip optical matrix-vector multiplier. Proc. SPIE, 8855(2013).

    [26] C. Huang, S. Fujisawa, T. F. de Lima. A silicon photonic–electronic neural network for fibre nonlinearity compensation. Nat. Electron., 4, 837-844(2021).

    [27] S. Ambrogio, P. Narayanan, H. Tsai. Equivalent-accuracy accelerated neural-network training using analogue memory. Nature, 558, 60-67(2018).

    [28] J. Gu, C. Feng, Z. Zhao. Efficient on-chip learning for optical neural networks through power-aware sparse zeroth-order optimization. Proceedings of the AAAI Conference on Artificial Intelligence, 35, 7583-7591(2021).

    [29] Y. Zhu, M. Liu, L. Xu. Multi-wavelength parallel training and quantization-aware tuning for WDM-based optical convolutional neural networks considering wavelength-relative deviations. Proceedings of the 28th Asia and South Pacific Design Automation Conference, 384-389(2023).

    [30] P. Dong, Y.-K. Chen, G.-H. Duan. Silicon photonic devices and integrated circuits. Nanophotonics, 3, 215-228(2014).

    [31] A. N. Tait, T. F. De Lima, E. Zhou. Neuromorphic photonic networks using silicon photonic weight banks. Sci. Rep., 7(2017).

    [32] G. Tanaka, T. Yamane, J. B. Héroux. Recent advances in physical reservoir computing: a review. Neural Netw., 115, 100-123(2019).

    [33] D. Brunner, M. C. Soriano, C. R. Mirasso. Parallel photonic information processing at gigabyte per second data rates using transient states. Nat. Commun., 4, 1364(2013).

    [34] K. Vandoorne, P. Mechet, T. Van Vaerenbergh. Experimental demonstration of reservoir computing on a silicon photonics chip. Nat. Commun., 5, 3541(2014).

    [35] K. Liu, T. Zhang, B. Dang. An optoelectronic synapse based on α-In₂Se₃ with controllable temporal dynamics for multimode and multiscale reservoir computing. Nat. Electron., 5, 761-773(2022).

    [36] Y.-W. Shen, R.-Q. Li, G.-T. Liu. Deep photonic reservoir computing recurrent network. Optica, 10, 1745-1751(2023).

    [37] Y. Shen, N. C. Harris, S. Skirlo. Deep learning with coherent nanophotonic circuits. Nat. Photonics, 11, 441-446(2017).

    [38] X. Xu, M. Tan, B. Corcoran. 11 TOPS photonic convolutional accelerator for optical neural networks. Nature, 589, 44-51(2021).

    [39] J. Feldmann, N. Youngblood, M. Karpov. Parallel convolutional processing using an integrated photonic tensor core. Nature, 589, 52-58(2021).

    [40] Y. Zhu, X. Zhang, X. Hua. Optoelectronic neuromorphic accelerator at 523.27 GOPS based on coherent optical devices. Optical Fiber Communication Conference, M2J-4(2023).

    [41] Y. Zhu, M. Luo, X. Hua. Silicon photonic neuromorphic accelerator using integrated coherent transmit-receive optical sub-assemblies. Optica, 11, 583-594(2024).

    [42] X. Meng, G. Zhang, N. Shi. Compact optical convolution processing unit based on multimode interference. Nat. Commun., 14, 3000(2023).

    [43] Y. Chen, M. Nazhamaiti, H. Xu. All-analog photoelectronic chip for high-speed vision tasks. Nature, 623, 48-57(2023).

    [44] J. Cheng, C. Huang, J. Zhang. Multimodal deep learning using on-chip diffractive optics with in situ training capability. Nat. Commun., 15, 6189(2024).

    [45] Z. Xu, T. Zhou, M. Ma. Large-scale photonic chiplet Taichi empowers 160-TOPS/W artificial general intelligence. Science, 384, 202-209(2024).

    [46] C. D. McGillem, G. R. Cooper. Continuous and Discrete Signal and System Analysis(1991).

    [47] X. Xiao, L. Wang, M. Luo. High baudrate silicon photonics for the next-generation optical communications. European Conference on Optical Communication (ECOC), 1-4(2022).

    [48] X. Hu, D. Wu, H. Zhang. Ultrahigh-speed silicon-based modulators/photodetectors for optical interconnects. Optical Fiber Communications Conference and Exhibition (OFC), 1-3(2023).

    [49] M. Xu, M. He, H. Zhang. High-performance coherent optical modulators based on thin-film lithium niobate platform. Nat. Commun., 11, 3911(2020).

    [50] X. Xie, Y. Dai, K. Xu. Broadband photonic RF channelization based on coherent optical frequency combs and I/Q demodulators. IEEE Photonics J., 4, 1196-1202(2012).

    [51] N. Picqué, T. W. Hänsch. Frequency comb spectroscopy. Nat. Photonics, 13, 146-157(2019).

    [52] Z. Tang, D. Zhu, S. Pan. Coherent optical RF channelizer with large instantaneous bandwidth and large in-band interference suppression. J. Lightwave Technol., 36, 4219-4226(2018).

    [53] M. J. Filipovich, Z. Guo, M. Al-Qadasi. Silicon photonic architecture for training deep neural networks with direct feedback alignment. Optica, 9, 1323-1332(2022).

    [54] F. Pedregosa, G. Varoquaux, A. Gramfort. Scikit-learn: machine learning in Python. J. Mach. Learn. Res., 12, 2825-2830(2011).

    [55] A. Paszke, S. Gross, F. Massa. PyTorch: an imperative style, high-performance deep learning library. Proceedings of the 33rd International Conference on Neural Information Processing Systems, 32, 8026-8037(2019).

    [56] https://www.nvidia.com/en-us/data-center/a100/

    [57] C. Yang, R. Hu, M. Luo. IM/DD-based 112-Gb/s/λ PAM-4 transmission using 18-Gbps DML. IEEE Photonics J., 8, 7903907(2016).

    [58] P. Caragiulo, O. E. Mattia, A. Arbabian. A 2× time-interleaved 28-GS/s 8-bit 0.03-mm² switched-capacitor DAC in 16-nm FinFET CMOS. IEEE J. Solid-State Circuits, 56, 2335-2346(2021).

    [59] S. Nakano, M. Nagatani, K. Tanaka. A 180-mW linear MZM driver in CMOS for single-carrier 400-Gb/s coherent optical transmitter. European Conference on Optical Communication (ECOC), 1-3(2017).

    [60] S. Daneshgar, H. Li, T. Kim. A 128 Gb/s, 11.2 mW single-ended PAM4 linear TIA with 2.7 μArms input noise in 22 nm FinFET CMOS. IEEE J. Solid-State Circuits, 57, 1397-1408(2022).

    [61] A. D. Güngördü, G. Dündar, M. B. Yelten. A high performance TIA design in 40 nm CMOS. IEEE International Symposium on Circuits and Systems (ISCAS), 1-5(2020).

    [62] L. Chang, S. Liu, J. E. Bowers. Integrated optical frequency comb technologies. Nat. Photonics, 16, 95-108(2022).
