James Spall, Xianxin Guo, Alexander I. Lvovsky, "Training neural networks with end-to-end optical backpropagation," Adv. Photon. 7, 016004 (2025)


Fig. 1. Illustration of optical training. (a) Architecture of the ONN used in this work: two fully connected linear layers with a hidden layer between them. (b) Simplified experimental schematic of the ONN. Each linear layer performs an optical MVM with a cylindrical lens and an SLM that encodes the weight matrix. Hidden-layer activations are computed via saturable absorption (SA) in an atomic vapor cell. Light propagates in both directions during optical training. (c) Working principle of the SA activation. The forward beam (pump) is shown by solid red arrows and the backward beam (probe) by purple wavy arrows. The probe transmission depends on the pump strength and approximates the gradient of the SA function. For high forward intensity (top panel), a large fraction of the atoms is excited to the upper level, and the stimulated emission produced by these atoms largely compensates the absorption by ground-state atoms. For a weak pump (bottom panel), the excited-state population is small and the absorption is significant. (d) NN training procedure. (e) Optical training procedure. Signal and error propagation in both directions are implemented fully optically; the loss-function calculation and parameter update are left to electronics, without interrupting the optical information flow.
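The training loop sketched in panels (d) and (e) can be simulated numerically. The sketch below is a minimal illustration, not the authors' implementation: the SA activation `sa` is an assumed exponential-saturation form (the experimental curve is set by the vapor cell), and the optical probe readout of the activation gradient is stood in for by a numerical derivative.

```python
import numpy as np

rng = np.random.default_rng(0)

# Assumed toy model of the saturable-absorber (SA) activation;
# the real transmission curve is fixed by the atomic vapor cell.
def sa(x, a=2.0, s=1.0):
    return x * np.exp(-a / (1.0 + (x / s) ** 2))

def sa_grad(x, eps=1e-6):
    # Optically, this derivative is read out as the transmission of a weak
    # counter-propagating probe (Fig. 1(c)); here we differentiate numerically.
    return (sa(x + eps) - sa(x - eps)) / (2 * eps)

W1 = rng.normal(size=(4, 2)) * 0.5   # first linear layer (MVM-1)
W2 = rng.normal(size=(2, 4)) * 0.5   # second linear layer (MVM-2a/2b)
x = rng.normal(size=2)               # single toy input
target = np.array([1.0, 0.0])

lr, losses = 0.1, []
for _ in range(200):
    z1 = W1 @ x                  # forward MVM-1
    h = sa(z1)                   # SA activation in the vapor cell
    y = W2 @ h                   # forward MVM-2a
    err = y - target             # gradient of L = 0.5*||y - target||^2
    losses.append(0.5 * err @ err)
    dh = W2.T @ err              # backward MVM-2b
    dz1 = dh * sa_grad(z1)       # probe transmission supplies g'(z1)
    # Parameter updates stay electronic, as in Fig. 1(e).
    W2 -= lr * np.outer(err, h)
    W1 -= lr * np.outer(dz1, x)

print(f"loss: {losses[0]:.3f} -> {losses[-1]:.3f}")
```

The only quantities the electronics ever see are the detected forward signals and backward errors; the two outer-product updates use exactly those measurements, mirroring the division of labor in panel (e).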

Fig. 2. Multilayer ONN characterization. (a) Scatterplots of measured-versus-theoretical results for MVM-1 (first layer, forward), MVM-2a (second layer, forward), and MVM-2b (second layer, backward). All three MVM results are acquired simultaneously. Histograms of the signal and noise error for each MVM are shown underneath. (b) First-layer activations measured after the vapor cell, plotted against the theoretically expected linear MVM-1 output before the cell. The green line is a best fit of the theoretical SA nonlinear function. (c) Amplitude of a weak constant probe passed backward through the vapor cell as a function of the pump strength. Measurements for the forward and backward beams are taken simultaneously.

Fig. 3. Optical training performance. (a) Decision boundary charts of the ONN inference output for three different classification tasks, after the ONN has been trained optically (top) or in silico (bottom). (b) Learning curves of the ONN for classification of the “Rings” dataset, showing the mean and standard deviation of the validation loss and accuracy over five repeated training runs. Shown above are decision boundary charts of the ONN output for the test set after different epochs. (c) Evolution of output neuron values and output errors for training-set inputs of the two classes. (d) Comparison between optically measured and digitally calculated gradients, shown for each of the 10 weight-matrix elements.
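The gradient comparison in panel (d) can be mimicked in simulation: the "optical" gradient is the outer product of the backward-propagated error and the forward input that the scheme delivers, checked against brute-force finite differences of the loss. This is an illustrative sketch under the same assumed SA activation as above, not the paper's code; a 5×2 first layer gives 10 weight elements, matching the panel count.

```python
import numpy as np

rng = np.random.default_rng(1)

# Assumed SA-style activation; the experimental curve differs in detail.
def sa(x, a=2.0):
    return x * np.exp(-a / (1.0 + x ** 2))

W1 = rng.normal(size=(5, 2)) * 0.5    # 10 weight elements, as in Fig. 3(d)
W2 = rng.normal(size=(2, 5)) * 0.5
x = rng.normal(size=2)
t = np.array([1.0, 0.0])

def loss(W1_):
    y = W2 @ sa(W1_ @ x)
    return 0.5 * np.sum((y - t) ** 2)

# "Optical" gradient: the outer product of the backward error (after the
# probe picks up the activation derivative) with the forward input.
z1 = W1 @ x
eps = 1e-6
g = (sa(z1 + eps) - sa(z1 - eps)) / (2 * eps)  # stands in for probe readout
err = W2 @ sa(z1) - t
grad_optical = np.outer((W2.T @ err) * g, x)

# "Digital" gradient: finite differences of the loss, element by element.
grad_digital = np.zeros_like(W1)
for i in range(W1.shape[0]):
    for j in range(W1.shape[1]):
        dW = np.zeros_like(W1)
        dW[i, j] = eps
        grad_digital[i, j] = (loss(W1 + dW) - loss(W1 - dW)) / (2 * eps)

print(np.max(np.abs(grad_optical - grad_digital)))
```

In the experiment the analogous agreement is limited by optical noise rather than finite-difference error, which is what panel (d) quantifies.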
Table 1. Summary of network architecture and hyperparameters used in both optical and digital training.
Table 2. Generalization of the optical training scheme.
