• Advanced Photonics
  • Vol. 4, Issue 4, 044001 (2022)
Pengfei Xu1 and Zhiping Zhou1,2,*
Author Affiliations
  • 1Peking University, State Key Laboratory of Advanced Optical Communication Systems and Networks, School of Electronics, Beijing, China
  • 2Chinese Academy of Sciences, Shanghai Institute of Optics and Fine Mechanics, Shanghai, China
    DOI: 10.1117/1.AP.4.4.044001
    Pengfei Xu, Zhiping Zhou. Silicon-based optoelectronics for general-purpose matrix computation: a review[J]. Advanced Photonics, 2022, 4(4): 044001
    Fig. 1. Development of processors for matrix computation. (a) Moore’s law no longer seems applicable.16 (b) Exponential growth of energy consumption for more accurate ANN models.17 (c) Development trends of processors.18 (d) Temporal and spatial architectures for multicore parallelization.17 (e) Memristor crossbar arrays in post-Moore’s law era.19 (f) Integrated waveguide meshes for general-purpose matrix computation.20
    Fig. 2. Intuitive visualization of energy-efficient model processing in a convolutional neural network. The CONV between the filters and kernel can be mapped onto MVM to improve efficiency.
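The mapping of CONV onto MVM mentioned in the Fig. 2 caption can be illustrated with a minimal sketch. It assumes the common "im2col" unrolling, which the caption does not name, so it should be read as one possible realization rather than the authors' method. The helper names (`conv2d_direct`, `im2col`, `conv2d_as_mvm`) and the sample data are hypothetical.

```python
def conv2d_direct(image, kernel):
    """Valid-mode 2D CONV (cross-correlation, as used in CNNs) with nested loops."""
    kh, kw = len(kernel), len(kernel[0])
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[sum(image[i + a][j + b] * kernel[a][b]
                 for a in range(kh) for b in range(kw))
             for j in range(ow)]
            for i in range(oh)]


def im2col(image, kh, kw):
    """Unroll every kh x kw patch of the image into one row of a matrix."""
    oh, ow = len(image) - kh + 1, len(image[0]) - kw + 1
    return [[image[i + a][j + b] for a in range(kh) for b in range(kw)]
            for i in range(oh) for j in range(ow)]


def conv2d_as_mvm(image, kernel):
    """Same CONV, expressed as one matrix-vector multiplication (MVM)."""
    kh, kw = len(kernel), len(kernel[0])
    ow = len(image[0]) - kw + 1
    patches = im2col(image, kh, kw)              # matrix: one patch per row
    weights = [w for row in kernel for w in row]  # vector: flattened kernel
    flat = [sum(p * w for p, w in zip(patch, weights)) for patch in patches]
    # Fold the flat MVM result back into the 2D output map.
    return [flat[r:r + ow] for r in range(0, len(flat), ow)]


image = [[1, 2, 3, 4],
         [5, 6, 7, 8],
         [9, 10, 11, 12],
         [13, 14, 15, 16]]
kernel = [[1, 0],
          [0, -1]]
result = conv2d_as_mvm(image, kernel)
```

Both routes give the same output map. The unrolled form trades extra memory (overlapping patches are duplicated) for a single regular MVM, which is the operation a matrix processor accelerates.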
    Fig. 3. Integrated waveguide meshes: from QIP to MVM. (a) Bulk-optical CNOT gate in 2004.29 (b) On-chip photonic CNOT gates in 2007.30 (c) Programmable quantum processor in 2016.31 (d) Large-scale photonic processor for arbitrary two-qubit operations.32 (e) Large-scale photonic processor for multidimensional quantum entanglement.33 (f) Schematic of optical switch topologies in the data center.34 (g) Reconfigurable hexagonal mesh for programmable signal processing.35 (h) Photonic “FPGA” for programmable radiofrequency signal processing.36 (i) Self-configuring 4×4-port linear processor. (j) First optical computing processor for vowel recognition.20 (k) Large scale 64×64 MVM processor.37,38 (l) FFTNet architecture for better fault tolerance against imprecise components.39 (m) Redundant architecture to overcome fabrication errors.40
    Fig. 4. Multiple laser source MVM. (a) MVM based on microring modulators.54 (b) MVM based on on-chip photorefractive interaction.55,56 (c) Photonic tensor core constituted by dot-product engines.57 (d) Photonic crossbar arrays with phase-change material.58
    Fig. 5. FT-based CONV. (a) Flowchart of CONV using FT. (b) FT based on MMI coupler and compensating phase shifter arrays.59 (c) FT operation with 21×21-star coupler.60 (d) Compact surface plasmon polaritons device for FT.61 (e) CONV based on Cooley–Tukey method FT.62
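The flow of CONV using FT in Fig. 5(a) presumably rests on the convolution theorem: transform both inputs, multiply element-wise, then transform back. The sketch below assumes a plain discrete Fourier transform and circular (periodic) convolution. The function names and sample vectors are hypothetical, and the naive O(n²) DFT is chosen for readability only.

```python
import cmath


def dft(x, inverse=False):
    """Naive discrete Fourier transform (or its inverse) of a sequence."""
    n = len(x)
    sign = 1 if inverse else -1
    out = [sum(x[k] * cmath.exp(sign * 2j * cmath.pi * k * m / n)
               for k in range(n))
           for m in range(n)]
    return [v / n for v in out] if inverse else out


def circular_conv_ft(a, b):
    """Circular CONV via FT: FT both inputs, multiply element-wise, inverse FT."""
    fa, fb = dft(a), dft(b)
    product = [p * q for p, q in zip(fa, fb)]
    return [v.real for v in dft(product, inverse=True)]


def circular_conv_direct(a, b):
    """Reference circular CONV computed directly from the definition."""
    n = len(a)
    return [sum(a[k] * b[(m - k) % n] for k in range(n)) for m in range(n)]


a = [1, 2, 3, 4]
b = [1, 0, -1, 0]
via_ft = circular_conv_ft(a, b)
direct = circular_conv_direct(a, b)
```

The two results agree up to floating-point rounding. In an optical implementation the transforms would be carried out by the hardware (for example, the couplers listed in the caption), leaving only the element-wise product in between.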
    Fig. 6. Element-wise MAC operations. (a) Basic 3×3 CONV consisting of nine MAC operations. (b) MAC based on balanced homodyne detection.63 (c) MAC based on cascaded acousto-optical modulators.64 (d) MAC based on microring modulators.65
    Fig. 7. MAC based on dispersion. (a) A 118 GigaMAC/s matrix operation realized with a 1.1-km-long linear-dispersion fiber.66 (b) 11.9 GigaFLOPS MAC conducted with a 13-km spool of standard single-mode fiber.67 (c) Time-stretch method for MAC operations.68 (d) Temporal CONV (a series of MACs) by a spiral waveguide with linear group dispersion.69
    Fig. 8. Interconnections in processors, memory, and peripheral hardware. (a) The memory-processor interconnections are one of the major factors influencing the overall performance and the memory wall problem that has hindered high-performance computing.70 (b) Large-scale matrix multiplication is decomposed into small-scale matrix multiplications while processing.71 (c) On-chip optical transceivers are good alternatives for low-energy-budget interconnections and boosting the data movement among the computation hardware.72
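The decomposition in Fig. 8(b) can be sketched as a tiled (blocked) matrix multiplication, in which a fixed-size core only ever multiplies small tiles and the partial products are accumulated into the full result. This is a minimal illustration under assumed names (`matmul`, `blocked_matmul`, `tile`), not the scheme of the cited reference.

```python
def matmul(a, b):
    """Plain matrix multiplication, standing in for the small-scale core."""
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))]
            for i in range(len(a))]


def blocked_matmul(a, b, tile):
    """Multiply a (n x m) by b (m x p) using only tile-sized sub-multiplications."""
    n, m, p = len(a), len(b), len(b[0])
    c = [[0] * p for _ in range(n)]
    for i0 in range(0, n, tile):
        for j0 in range(0, p, tile):
            for k0 in range(0, m, tile):
                # Cut out one tile of each operand (edge tiles may be smaller).
                a_blk = [row[k0:k0 + tile] for row in a[i0:i0 + tile]]
                b_blk = [row[j0:j0 + tile] for row in b[k0:k0 + tile]]
                partial = matmul(a_blk, b_blk)
                # Accumulate the partial product into the output block.
                for di, prow in enumerate(partial):
                    for dj, value in enumerate(prow):
                        c[i0 + di][j0 + dj] += value
    return c


A = [[1, 2, 3, 4],
     [5, 6, 7, 8],
     [9, 10, 11, 12],
     [13, 14, 15, 16]]
B = [[1, 0, 2, 0],
     [0, 1, 0, 2],
     [3, 0, 1, 0],
     [0, 3, 0, 1]]
C = blocked_matmul(A, B, tile=2)
```

Every tile has to be moved between memory and the core, which is why the data movement highlighted in panels (a) and (c) weighs so heavily on overall performance.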
    Fig. 9. Optical computing from bulk-optics to photonic-electronic integration. (a) SLM-based MVM processor released by Enlight in 2003.82 (b) Bulk-optical 4f-system for convolutional neural network.83 (c) Diffractive deep neural network by 3D-printed multi-layer phase mask.84 (d) 3D copackaged module for enhancing the interaction between the photonic core and electronic ASIC.34
    Fig. 10. Energy efficiency of a silicon-based optoelectronic matrix computation processor (considering all the photonic, optoelectronic, and electronic devices and circuits). (a) The equivalent energy efficiency (energy consumption per MAC operation) linearly decreases as the side-length of the matrix increases.63 (b) Expectations of future compute density and energy efficiency in silicon-based optoelectronic matrix computation (the energy efficiency depends on the matrix configuration).
    Fig. 11. Photonic matrix computation can be used for solving some difficult problems and reducing their time complexity. (a) Heuristic recurrent algorithm for the annealing of Ising models. (b) Reconstruction of K-sparse signals in compressed sensing. (c) Very large-scale discrete Fourier transform.