Efficient stochastic parallel gradient descent training for on-chip optical processor

Yuanjian Wan; Xudong Liu; Guangze Wu; Min Yang; Guofeng Yan; Yu Zhang; Jian Wang

doi:10.29026/oea.2024.230182

Opto-Electronic Advances
Vol. 7, Issue 4, 230182 (2024)

Yuanjian Wan^1,2,†, Xudong Liu^1,2,†, Guangze Wu^1,2, Min Yang^1,2, Guofeng Yan^1,2, Yu Zhang^1,2 and Jian Wang^1,2,*

Author Affiliations

¹Wuhan National Laboratory for Optoelectronics and School of Optical and Electronic Information, Huazhong University of Science and Technology, Wuhan 430074, China

²Optics Valley Laboratory, Wuhan 430074, China

Cite this article: Yuanjian Wan, Xudong Liu, Guangze Wu, Min Yang, Guofeng Yan, Yu Zhang, Jian Wang. Efficient stochastic parallel gradient descent training for on-chip optical processor[J]. Opto-Electronic Advances, 2024, 7(4): 230182.

Fig. 1. (a) Conceptual diagram of the on-chip optical processor for optical switching and channel descrambling in MDM communication systems. (b) Schematic configuration of the integrated reconfigurable optical processor. θ and ϕ denote the phase shifts applied by the phase shifters. MDM: mode-division multiplexing; MUX: multiplexer; DEMUX: demultiplexer.

Fig. 2. Flow chart of the stochastic parallel gradient descent (SPGD) algorithm.
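
Since the flow chart itself is not reproduced here, the following is a minimal sketch of a generic two-sided SPGD loop, not the authors' implementation. It assumes the textbook form of the algorithm: all phases are perturbed simultaneously by random ±δ, the evaluation function is read at both perturbed settings, and every phase is moved along its own perturbation in proportion to the measured difference. The names `spgd_maximize`, `toy_evaluate`, `TARGET`, `gain`, and `delta` are hypothetical, and the cosine evaluation function is only a stand-in for a quantity measured on the chip.

```python
import math
import random


def spgd_maximize(evaluate, phases, gain=20.0, delta=0.1, iterations=500, seed=0):
    """Generic two-sided SPGD (assumed textbook form, not the paper's code)."""
    rng = random.Random(seed)
    phases = list(phases)
    history = []
    for _ in range(iterations):
        # Draw one random +/-delta perturbation per phase, applied in parallel.
        perturbation = [delta * rng.choice((-1.0, 1.0)) for _ in phases]
        j_plus = evaluate([p + d for p, d in zip(phases, perturbation)])
        j_minus = evaluate([p - d for p, d in zip(phases, perturbation)])
        # Move each phase along its own perturbation, scaled by the change in J.
        change = j_plus - j_minus
        phases = [p + gain * change * d for p, d in zip(phases, perturbation)]
        history.append(evaluate(phases))
    return phases, history


# Hypothetical target phases; on hardware J would come from photodetector readings.
TARGET = [0.5, -1.0, 2.0, 1.2]


def toy_evaluate(phases):
    """Toy evaluation function: equals 1.0 when every phase matches its target."""
    return sum(math.cos(p - t) for p, t in zip(phases, TARGET)) / len(TARGET)


final_phases, history = spgd_maximize(toy_evaluate, [0.0] * len(TARGET))
print(f"J after {len(history)} rounds: {history[-1]:.6f}")
```

Each round of this sketch sets the phases three times (plus perturbation, minus perturbation, then the update), which appears consistent with the 3×T entry for SPGD in the table at the end of this page, independent of the number of phase shifters.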

Fig. 3. Training results on an electronic computer for optical switching, optical channel descrambling, and combined optical channel descrambling and switching. (a) Emulated light power distributions and (b) normalized light intensity distributions after training when the switching state is I₁−O₂, I₂−O₁, I₃−O₅, I₄−O₆, I₅−O₃, I₆−O₄. (d, e) Normalized light intensity distributions (d) before and (e) after training when a set of phases is randomly generated in part (1) of the chip to emulate crosstalk. (g, h) Normalized light intensity distributions (g) before and (h) after training with crosstalk when the switching state is I₁−O₅, I₂−O₃, I₃−O₂, I₄−O₄, I₅−O₁, I₆−O₆. (c, f, i) The evaluation function versus iteration round.

Fig. 4. (a) Schematic of the experimental configuration. (b) Microscopy image of the optical processor. VSA: voltage source array; PD: photodetector array.

Fig. 5. Online training results for optical switching at a wavelength of 1550 nm. (a) The evaluation function versus iteration round when the switching state is I₁−O₃, I₂−O₁, I₃−O₄, I₄−O₆, I₅−O₂, I₆−O₅. The insets show the light power distributions at iteration rounds 50, 300, and 600, respectively. (b) The measured light power distributions after training. (c) The normalized light intensity distributions of the measured results. (d, e) The measured light power distributions and normalized light intensity distributions when the switching state is I₁−O₃, I₂−O₆, I₃−O₄, I₄−O₂, I₅−O₁, I₆−O₅.

Fig. 6. Online training results for optical channel descrambling at a wavelength of 1550 nm. (a) The evaluation function versus iteration round. The insets show the light power distributions at iteration rounds 1, 300, and 600, respectively. (b) The light power distributions before training. (c) The light power distributions after training. (d, e) The results of training when generating another matrix $\tilde{U}$.

Fig. 7. Online training results for optical channel descrambling and switching at a wavelength of 1550 nm. (a) The evaluation function versus iteration round when the switching state is I₁−O₄, I₂−O₁, I₃−O₅, I₄−O₆, I₅−O₃, I₆−O₂. The insets show the light power distributions at iteration rounds 1, 100, and 400, respectively. (b) The light power distributions before training. (c) The light power distributions after training. (d, e) The results of training when generating another matrix $\tilde{U}$ and the switching state is I₁−O₅, I₂−O₃, I₃−O₁, I₄−O₆, I₅−O₂, I₆−O₄.

Fig. 8. Experimental setup and measured results for optical channel descrambling. (a) Experimental setup for the 6×6 optical descrambling system. (b) The measured BER performance for four cases: back-to-back, optimization without crosstalk, before optimization with crosstalk, and after optimization with crosstalk. (c) The measured constellation diagram for the back-to-back case. (d) The measured constellation diagram without crosstalk. (e) The measured constellation diagram before optimization with crosstalk. (f) The measured constellation diagram after optimization with crosstalk. PC: polarization controller; AWG: arbitrary waveform generator; EDFA: erbium-doped fiber amplifier; VOA: variable optical attenuator; OSC: oscilloscope; DSP: digital signal processing.

Algorithm	Number of updates (expression)	N=6	N=10	N=16	N=32
GD	N(N−1)×T	690	3870	13200	93248
GA	M×T	1048	9046.67	39732	171200
PSO	M×T	1024	5912	31056	116145
SPGD	3×T	297.9	1092.6	4752.6	18053.1
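
The arithmetic behind the GD and SPGD rows can be checked with a short sketch. It assumes that each table entry is the total number of updates given by the expression in the second column, and that T is the number of training iterations (T is not defined in this excerpt). GA and PSO are left out because the value of M is not given here. The names `TOTAL_UPDATES`, `updates_per_iteration`, and `implied_iterations` are hypothetical.

```python
# Totals copied from the table above, keyed by matrix size N.
TOTAL_UPDATES = {
    "GD": {6: 690, 10: 3870, 16: 13200, 32: 93248},
    "SPGD": {6: 297.9, 10: 1092.6, 16: 4752.6, 32: 18053.1},
}


def updates_per_iteration(algorithm, n):
    """Per-iteration update count taken from the table's expression column."""
    if algorithm == "GD":
        return n * (n - 1)  # N(N-1) updates per iteration
    if algorithm == "SPGD":
        return 3  # constant, independent of N
    raise ValueError(f"no expression available for {algorithm!r}")


def implied_iterations(algorithm, n):
    """Iteration count T implied by total / per-iteration (assumed reading)."""
    return TOTAL_UPDATES[algorithm][n] / updates_per_iteration(algorithm, n)


for n in (6, 10, 16, 32):
    ratio = TOTAL_UPDATES["GD"][n] / TOTAL_UPDATES["SPGD"][n]
    print(
        f"N={n}: implied T(GD)={implied_iterations('GD', n):.1f}, "
        f"implied T(SPGD)={implied_iterations('SPGD', n):.1f}, "
        f"GD/SPGD total updates={ratio:.2f}"
    )
```

Under this reading, SPGD runs for more iterations than GD at every size, but its per-iteration cost stays at 3 while GD's grows as N(N−1), so the ratio of total updates increasingly favors SPGD as N increases.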