Photonics Research, Vol. 9, Issue 3, 03000B71 (2021)
Xianxin Guo1,2,3,5,†,*, Thomas D. Barrett2,6,†,*, Zhiming M. Wang1,7,*, and A. I. Lvovsky2,4,8,*
Author Affiliations
  • 1Institute of Fundamental and Frontier Sciences, University of Electronic Science and Technology of China, Chengdu 610054, China
  • 2Clarendon Laboratory, University of Oxford, Oxford OX1 3PU, UK
  • 3Institute for Quantum Science and Technology, University of Calgary, Calgary, Alberta T2N 1N4, Canada
  • 4Russian Quantum Center, Skolkovo 143025, Moscow, Russia
  • 5e-mail: xianxin.guo@physics.ox.ac.uk
  • 6e-mail: thomas.barrett@physics.ox.ac.uk
  • 7e-mail: zhmwang@uestc.edu.cn
  • 8e-mail: alex.lvovsky@physics.ox.ac.uk

    Abstract

    We propose a practical scheme for end-to-end optical backpropagation in neural networks. Using saturable absorption for the nonlinear units, we find that the backward-propagating gradients required to train the network can be approximated with a surprisingly simple pump-probe scheme that requires only passive optical elements. Simulations show that, with readily obtainable optical depths, our approach can achieve performance equivalent to state-of-the-art computational networks on image classification benchmarks, even in deep networks with multiple sequential gradient approximations. With backpropagation through nonlinear units being an outstanding challenge to the field, this work provides a feasible path toward truly all-optical neural networks.

    1. INTRODUCTION

    Machine learning (ML) is changing the way in which we approach complex tasks, with applications ranging from natural language processing [1] and image recognition [2] to artificial intelligence [3] and fundamental science [4,5]. At the heart (or “brain”) of this revolution are artificial neural networks (ANNs), which are universal function approximators [6,7] capable, in principle, of representing an arbitrary mapping of inputs to outputs. Remarkably, their functioning requires only two basic operations: matrix multiplication to communicate information between layers, and a nonlinear transformation of individual neuron states (the activation function). The former accounts for most of the computational cost associated with ML. This operation can, however, be readily implemented by leveraging the coherence and superposition properties of linear optics [8]. Optics is therefore an attractive platform for realizing the next generation of neural networks, promising faster computation with low power consumption [9–13].
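    To make these two operations concrete, a single fully connected layer can be written, in generic notation that is a sketch rather than the specific formulation used later in this paper, as

    a^{(l+1)} = g(W^{(l)} a^{(l)} + b^{(l)}),

    where the matrix–vector product W^{(l)} a^{(l)} is the linear step connecting layer l to layer l+1, b^{(l)} is an optional bias, and g is the elementwise nonlinear activation function. Training then amounts to adjusting W^{(l)} and b^{(l)}, conventionally by backpropagating gradients of a loss function through each layer, including through the nonlinearity g.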