• Photonics Research
  • Vol. 9, Issue 4, B135 (2021)
Yihao Xu1, Xianzhe Zhang2, Yun Fu2,3, and Yongmin Liu1,2,*
Author Affiliations
  • 1Department of Mechanical and Industrial Engineering, Northeastern University, Boston, Massachusetts 02115, USA
  • 2Department of Electrical and Computer Engineering, Northeastern University, Boston, Massachusetts 02115, USA
  • 3Khoury College of Computer Science, Northeastern University, Boston, Massachusetts 02115, USA
    DOI: 10.1364/PRJ.417693
    Yihao Xu, Xianzhe Zhang, Yun Fu, Yongmin Liu. Interfacing photonics with artificial intelligence: an innovative design strategy for photonic structures and devices based on artificial neural networks[J]. Photonics Research, 2021, 9(4): B135

    Abstract

    Over the past decades, photonics has transformed many areas in both fundamental research and practical applications. In particular, we can manipulate light in a desired and prescribed manner by rationally designed subwavelength structures. However, constructing complex photonic structures and devices is still a time-consuming process, even for experienced researchers. As a subset of artificial intelligence, artificial neural networks serve as one potential solution to bypass the complicated design process, enabling us to directly predict the optical responses of photonic structures or perform the inverse design with high efficiency and accuracy. In this review, we will introduce several commonly used neural networks and highlight their applications in the design process of various optical structures and devices, particularly those in recent experimental works. We will also comment on the future directions to inspire researchers from different disciplines to collectively advance this emerging research field.

    1. INTRODUCTION

    Novel optical devices consisting of elaborately designed structures have become an extremely dynamic and fruitful research area because of their capability of manipulating light flow down to the nanoscale. Thanks to advanced numerical simulation, fabrication, and characterization techniques, researchers are able to design, fabricate, and demonstrate dielectric and metallic micro- and nano-structures with sophisticated geometries and arrangements. For instance, metamaterials and metasurfaces comprising subwavelength structures, called meta-atoms, can show extraordinary properties beyond those of natural materials [1]. Many metadevices have been reported that offer enormous opportunities for technology breakthroughs in a wide range of applications including light steering [2–5], holography [6–9], imaging [10–14], sensing [15–17], and polarization control [18–21].

    At present, we can handle most photonic design problems by accurately solving Maxwell’s equations using numerical algorithms such as the finite element method (FEM) and the finite-difference time-domain (FDTD) method. However, these methods often require substantial time and computational resources, especially when it comes to the inverse design problem, which aims to retrieve the optimal structure from target optical responses and functionalities. In the conventional procedure, we normally start with full-wave simulations of an initial design based on empirical knowledge and then adjust the geometric/material parameters iteratively to approach the customer-specific requirements. Such a trial-and-error process is time consuming, even for the most experienced researchers. The initial design strongly relies on our experience and cognition, and usually some basic structures are chosen, including split-ring resonators [22,23], helix [24], cross [25], bowtie [26], L-shape [2], and H-shape [27,28] structures. Although it is known that a specific type of structure can produce a certain optical response (e.g., strong magnetic resonance from split-ring resonators and chiroptical response from helical structures), this well-established knowledge may limit our aspiration to seek entirely new designs suited for the same applications, or for more complicated ones where the traditional approach is not applicable.

    Artificial neural networks (ANNs) provide a new and powerful approach for photonic designs [29–37]. ANNs can build an implicit relationship between the input (i.e., geometric/material parameters) and the output (i.e., optical responses), mimicking the nonlinear nerve conduction process in the human body. With the help of well-trained ANNs, we can bypass the complicated and time-consuming design process that heavily relies on numerical simulations and optimization. The functions of most ANN models for photonic designs are twofold: forward prediction and inverse design. The forward prediction network determines the optical responses from the geometric/material parameters, and it can serve as a substitute for full-wave simulations. The inverse design network aims to efficiently retrieve the optimal structure from given optical responses, which is usually more important and challenging in the design process. One main advantage of ANN models is their speed. For example, producing the spectrum of a meta-atom from a well-trained forward prediction model takes only a few milliseconds, orders of magnitude faster than typical full-wave simulations based on FEM or FDTD [38–40]. In the meantime, the accuracy of ANN models is comparable with rigorous simulations. For instance, the mean squared error of spectrum prediction is typically on the order of 10⁻³ to 10⁻⁵ [40,41]. Moreover, ANNs can unlock the nonintuitive and nonunique relationship between the physical structure and the optical response, and hence potentially enlighten researchers with entirely new classes of structures.

    Solving the photonic design problem by ANNs is a data-driven approach, which means that a large training set containing both geometric/material parameters and optical responses is needed. Once the ANN model works well on the training data set, it can be tested on a test set or a real problem. The test and training data sets should belong to the same design framework but contain completely different data. The general workflow for a forward prediction network includes four steps. First, a large number of input structures and output optical responses are generated from either simulations or experiments. In most published works, the amount of data is on the order of 10⁴. It is noted that the performance of the neural networks depends on both the size and quality of the data. To improve the quality of the training data, some researchers have applied rule-based optimization methods in the generation of the initial training data [42] or progressively expanded the training data set with new data produced by the trained model [43]. Then we design the ANNs with a certain network structure, such as fully connected layers (FCLs)-based neural networks or convolutional neural networks (CNNs). Next, the training data set is fed into the network, and we optimize the weight and bias of each node. Finally, the well-trained ANNs can be used to predict the response of other input structures that lie outside the training and test data sets. As for the inverse design problem, one can often simply reverse the input and output and use a similar network structure; however, some problems require more complex methods and algorithms.

    This review is devoted to the topic of designing photonic structures and devices with ANNs. After introducing the widely used ANNs, we will focus on very recent works on this topic, especially experimental demonstrations. The remaining part of the review is organized as follows. In Section 2, we will discuss the basic FCLs and their application in the prediction of design parameters. Then, in Section 3, we will focus on the CNNs that are used in the retrieval of much more complicated structures described by pixelated images. In Section 4, other useful and efficient hybrid algorithms that combine deep learning and conventional optimization methods for photonic design will be discussed. In the last section, we will conclude the review by discussing the achievements, current challenges, and future outlook.

    2. PHOTONIC DESIGN BY FULLY CONNECTED NEURAL NETWORK

    A. Introduction of FCLs


    Figure 1.(a) Illustration of a biological neuron. (b) FCLs-based neural network, in which all neurons in adjacent layers are connected. (c) Three widely used activation functions: Sigmoid, tanh, and ReLU.

    The training process of the fully connected neural network is quite straightforward. The training set contains an input vector X and an output vector Y (Y can be a vector of complex/real values for regression problems or a vector of discrete integers as labels for classification problems). The performance of the model is highly dependent on the quantity and quality of the training data set. During the training process, the network first takes the vector X as input and calculates the output $\hat{Y}$ through tensor operations and activations from left to right. Then a loss function (or cost function), which quantifies the performance of the neural network, is defined and needs to be minimized. For instance, we can use the mean squared error (MSE), $\mathrm{loss}(Y,\hat{Y})=\overline{(Y-\hat{Y})^{2}}$, for regression problems and the cross-entropy loss, $\mathrm{loss}(Y,\hat{Y})=-Y^{T}\cdot\log(\hat{Y})$, for classification problems. The next step, the backpropagation of error, is the most critical part of ANNs. In the ANN, there is a series of learnable parameters to be optimized, i.e., the weight and bias of each layer. We then derive the partial derivatives of the loss with respect to each parameter, $\partial\,\mathrm{loss}(Y,\hat{Y})/\partial\,\mathrm{weight}$ and $\partial\,\mathrm{loss}(Y,\hat{Y})/\partial\,\mathrm{bias}$. To calculate these values, we apply the chain rule layer by layer from the end of the ANN to the front, which is why the process is called “backpropagation.” Finally, all the parameters are optimized by the stochastic gradient descent method: $$\mathrm{weight}=\mathrm{weight}-lr\cdot\frac{\partial\,\mathrm{loss}(Y,\hat{Y})}{\partial\,\mathrm{weight}},\qquad \mathrm{bias}=\mathrm{bias}-lr\cdot\frac{\partial\,\mathrm{loss}(Y,\hat{Y})}{\partial\,\mathrm{bias}}.$$ Here the learning rate lr is a hyperparameter that is set by the user and is not learnable. The training process is iterated until the loss is minimized. Different learning rates lead to different behaviors: a learning rate that is too large can prevent the model from converging, while one that is too small increases the training time. Therefore, the general approach is to assign a large learning rate at the beginning of the training and, after the model has been trained for several epochs, reduce the learning rate to a smaller value.
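
    To make the training loop above concrete, the following is a minimal sketch (not taken from any of the cited works) of a fully connected forward-prediction network trained with the MSE loss, stochastic gradient descent, and a decaying learning rate; the data tensors, layer sizes, and schedule are all illustrative placeholders.

```python
# Minimal sketch of FCL training with MSE loss, SGD, and learning-rate decay.
# X (geometric parameters) and Y (sampled spectra) are hypothetical placeholders.
import torch
import torch.nn as nn

n_params, n_spectral_points = 5, 200          # assumed input/output sizes
model = nn.Sequential(                        # FCLs with ReLU activations
    nn.Linear(n_params, 256), nn.ReLU(),
    nn.Linear(256, 256), nn.ReLU(),
    nn.Linear(256, n_spectral_points),
)

X = torch.rand(10000, n_params)               # placeholder training data
Y = torch.rand(10000, n_spectral_points)

loss_fn = nn.MSELoss()                                       # mean((Y - Y_hat)^2)
optimizer = torch.optim.SGD(model.parameters(), lr=1e-2)     # lr = non-learnable learning rate
scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=50, gamma=0.1)  # shrink lr later

for epoch in range(150):
    Y_hat = model(X)                  # forward pass: tensor operations + activations
    loss = loss_fn(Y_hat, Y)
    optimizer.zero_grad()
    loss.backward()                   # backpropagation: chain rule from output layer to input layer
    optimizer.step()                  # weight = weight - lr * d(loss)/d(weight), same for bias
    scheduler.step()                  # reduce the learning rate as training progresses
```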

    B. Design Parameterized Structure by FCLs-Based ANNs


    Figure 2.(a) Top: Schematic of the tandem neural network and SiO2 and Si3N4 multilayers. Bottom: Two examples of target spectra (blue solid lines) and simulated spectra of retrieved structures (green dashed lines). The target spectra are in a Gaussian shape. (b) Left: Predicted (open circles) extinction cross section of the electric dipole (red) and magnetic dipole (black) of core-shell nanoparticles. The solid lines are target responses. Right: Simulated extinction spectra and the corresponding electric field distribution of core-shell nanoparticles. (c) Top: Simulation result and inverse design prediction of the scattering cross section of core-shell nanoparticles. Bottom: Runtime comparison between the conventional method and neural network. (d) Top: A multilayer structure composed of Si3N4 and graphene. Bottom: Optical response of the designed nanostructures (with either low/near-unity absorbance in graphene) under the excitation of s-polarized light. (a) is reproduced from Ref. [46] with permission; (b) is reproduced from Ref. [47] with permission; (c) is reproduced from Ref. [38] with permission; (d) is reproduced from Ref. [48] with permission.

    Subsequent works have further confirmed the good performance of the tandem network architecture. For instance, S. So et al. used a similar ANN structure to design core-shell structures (with three layers) that support strong electric and magnetic dipole resonances [47]. The ANN was built to learn the correlation between the extinction spectra and the core-shell nanoparticle designs, including the material information and shell thicknesses. In Fig. 2(b), the predicted (open circles) extinction cross sections of the electric dipole (red) and magnetic dipole (black) of core-shell nanoparticles are compared with the target responses (solid lines). It is clear that both the electric dipole and magnetic dipole spectra of the designed core-shell nanoparticles fit the expectations well. J. Peurifoy et al. also studied the inverse design with ANNs for multilayered particles (up to eight layers), with a focus on the scattering spectra [38]. FCLs were used both in the forward prediction of scattering cross-section spectra and in the inverse design from the spectra. Using a model trained with 50,000 training examples, they achieved a mean relative error of around 1%. One example is shown in the top panel of Fig. 2(c), in which the result from the neural network is compared with numerical nonlinear optimization as well as the desired spectrum. The comparison demonstrates that the neural network model performs better in this design problem. Moreover, the running time of the ANN-aided inverse design is shortened by more than 100 times in comparison with full-wave simulation, as demonstrated in the bottom panel of Fig. 2(c). This result clearly shows the advantage of ANNs in terms of efficiency.
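
    For readers unfamiliar with the tandem architecture referenced above, the sketch below illustrates the basic idea under simplifying assumptions (the layer sizes and data shapes are invented for illustration): an inverse network is cascaded with a pretrained, frozen forward network, so the training loss compares spectra rather than designs and thereby sidesteps the one-to-many mapping between responses and structures.

```python
# Sketch of a tandem architecture; all sizes and data are illustrative placeholders.
import torch
import torch.nn as nn

n_design, n_spectrum = 8, 200

forward_net = nn.Sequential(                     # design -> spectrum predictor (pretrained)
    nn.Linear(n_design, 256), nn.ReLU(),
    nn.Linear(256, n_spectrum),
)
inverse_net = nn.Sequential(                     # spectrum -> design network to be trained
    nn.Linear(n_spectrum, 256), nn.ReLU(),
    nn.Linear(256, n_design),
)

# Step 1 (omitted): train forward_net on (design, spectrum) pairs, then freeze it.
for p in forward_net.parameters():
    p.requires_grad = False

# Step 2: train the inverse network through the frozen forward model, so the loss
# compares the predicted spectrum of the retrieved design with the target spectrum.
optimizer = torch.optim.Adam(inverse_net.parameters(), lr=1e-3)
loss_fn = nn.MSELoss()
target_spectra = torch.rand(64, n_spectrum)      # placeholder batch of target spectra
for step in range(1000):
    designs = inverse_net(target_spectra)
    predicted = forward_net(designs)
    loss = loss_fn(predicted, target_spectra)
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```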

    Besides the tandem network, other approaches have been introduced to improve the performance of the FCLs-based neural network. In 2019, Y. Chen et al. employed an adaptive batch-normalized (BN) neural network, targeting the smart and quick design of graphene-based metamaterials as illustrated in the top panel of Fig. 2(d) [48]. Specifically, a layer using an adaptive BN algorithm is placed before each hidden layer to overcome the limitation of BN in small sampling spaces. The adaptive BN layer takes the activation h_i of each neuron in a minibatch B, the batch normalization parameters γ and δ, and the adaptive parameters α and β as inputs, and outputs a new activation ĥ_i for each neuron. The authors tested their method by deriving the thickness of each Si3N4 layer in the structures, achieving a prediction accuracy of over 95%. The bottom panel of Fig. 2(d) plots the optical responses of two different examples with varied absorbance in graphene, showing excellent agreement between the target and designed responses.
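
    The exact formulation of the adaptive BN layer is not reproduced here; the snippet below is only a schematic stand-in, assuming that the adaptive parameters α and β apply an additional learnable affine transform on top of a standard batch-normalization layer (which itself carries γ and δ). The actual algorithm in Ref. [48] may differ.

```python
# Schematic stand-in for an adaptive BN block (assumption: alpha and beta rescale
# and shift the standard batch-normalized activations; not the exact method of Ref. [48]).
import torch
import torch.nn as nn

class AdaptiveBN(nn.Module):
    def __init__(self, num_features):
        super().__init__()
        self.bn = nn.BatchNorm1d(num_features)                 # provides gamma (weight) and delta (bias)
        self.alpha = nn.Parameter(torch.ones(num_features))    # adaptive scale
        self.beta = nn.Parameter(torch.zeros(num_features))    # adaptive shift

    def forward(self, h):
        # h: activations of a mini-batch B, shape (batch, num_features)
        return self.alpha * self.bn(h) + self.beta             # new activations h_hat
```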


    Figure 3.(a) Left: Schematic illustration of the metasurface, the unit cell, and matrix encoding method. Right: Predicted S-parameter and absorptivity with the REACTIVE method. (b) Illustration of the neural network architecture consisting of BaseNet and TransferNet. (c) The trend of spectrum error when n layers are transferred to the TransferNet and the predicted transmission spectra for two examples. (a) is reproduced from Ref. [39] with permission; (b) and (c) are reproduced from Ref. [49] with permission.

    Due to the data-driven nature of deep learning, the performance of a well-trained ANN highly relies on the training set, and the prediction loss is likely to increase as the inputs deviate from the training set. Therefore, a challenge in the deep-learning-aided inverse design lies in extending the capability of ANNs to an altered data set that is very different from the training data. Usually, one needs to generate an entirely new training set for similar but different physical scenarios. In this context, reducing the demand for computational data is an efficient way to accelerate the training of deep learning models. Y. Qu et al. proposed a transfer learning method, which is schematically illustrated in Fig. 3(b), to migrate knowledge between different physical scenarios [49]. The prediction accuracy is significantly improved, even with a much smaller data set for new tasks. Two sets of ANNs are involved in this work. The first one, named BaseNet, is trained with the initial data. The second one, called TransferNet, copies the first n layers from the BaseNet, and the entire network is then fine-tuned simultaneously. The authors first transferred the spectra prediction task from a 10-layer film to an 8-layer film, where the source and target tasks were trained with 50,000 and 5000 examples, respectively. Compared to direct learning, the transferred model performs well, and the error drops as n increases, as shown in Fig. 3(c). The TransferNet is applicable to different structures, ranging from multilayer nanoparticles to multilayer films. Based on this model, a multitask learning scheme was also studied, which combines the learning of multiple tasks at the same time. It was shown that the neural network in conjunction with the transfer learning method can produce more accurate predictions.
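
    The transfer step itself can be summarized in a few lines. The sketch below is illustrative only (network depth, layer sizes, and the number of transferred layers are made up, and the two tasks are assumed to share the same input dimensionality for simplicity): the first n layers of a trained BaseNet are copied into a TransferNet, which is then fine-tuned on the smaller target data set.

```python
# Illustrative transfer of the first n layers from BaseNet to TransferNet.
import torch.nn as nn

def make_net(in_dim=10, out_dim=200, hidden=256, depth=4):
    layers, d = [], in_dim
    for _ in range(depth):
        layers += [nn.Linear(d, hidden), nn.ReLU()]
        d = hidden
    layers.append(nn.Linear(d, out_dim))
    return nn.Sequential(*layers)

base_net = make_net()        # BaseNet: trained on the large source data set (training omitted)
transfer_net = make_net()    # TransferNet: to be fine-tuned on the small target data set

n = 4                        # number of layers copied from BaseNet (illustrative)
for i in range(n):
    transfer_net[i].load_state_dict(base_net[i].state_dict())
# After copying, the entire TransferNet is fine-tuned on the new task's data,
# typically with fewer examples and a smaller learning rate than the original training.
```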

    The FCLs have also been utilized in reinforcement learning [50–53], another active area of machine learning, for the inverse design problem. Reinforcement learning has already achieved great performance in robotics, system control, and game playing (AlphaGo). Instead of directly predicting the optimized geometry, the ANNs in reinforcement learning act as an iterative optimization method: in each step, an action to adjust the geometry parameters is predicted, for instance, increasing or decreasing certain parameters by a given value. The advantage of this approach is that it can be adapted to specific problems, and it can provide guidance for conventional trial-and-error optimization methods.
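
    As a toy illustration of this idea (everything here is hypothetical: the action encoding, step size, and reward are not from Refs. [50–53]), a small policy network can look at the current geometry parameters and choose one discrete action that increases or decreases a single parameter by a fixed step; the reward from a solver or surrogate model would then be used to update the policy, e.g., with a policy-gradient method.

```python
# Toy sketch of reinforcement-learning-style inverse design with a discrete action space.
import torch
import torch.nn as nn

n_params, step = 4, 0.05
policy = nn.Sequential(nn.Linear(n_params, 64), nn.ReLU(),
                       nn.Linear(64, 2 * n_params))      # two actions (up/down) per parameter

def apply_action(params, action_idx):
    new_params = params.clone()
    i, direction = action_idx // 2, 1.0 if action_idx % 2 == 0 else -1.0
    new_params[i] += direction * step                    # increase or decrease one parameter
    return new_params

params = torch.rand(n_params)                            # initial geometry parameters
for _ in range(100):
    logits = policy(params)
    action = torch.distributions.Categorical(logits=logits).sample().item()
    params = apply_action(params, action)
    # reward = figure of merit evaluated by a simulator or surrogate model;
    # the policy weights would be updated from this reward (update step omitted).
```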


    Figure 4.(a) Left: Architecture of the proposed neural network for nonlinear layers. Right: Predicted, simulated, and measured transmission spectra of two gold nanostructures under different polarization conditions. (b) Left: Illustrations of MANN used for reconstruction of 3D vectorial field. Right: Experimental approach and characterizations of 3D vectorial holography based on a vectorial hologram. (c) Left: Schematic of a deep-learning-enabled self-adaptive metasurface cloak. Right: Demonstration of the self-adaptive cloak response subject to random backgrounds and incidence with varied angles and frequencies. (a) is reproduced from Ref. [54] with permission; (b) is reproduced from Ref. [63] with permission; (c) is reproduced from Ref. [64] with permission.

    In addition to spectrum prediction [55,56], the FCLs-based ANNs have also been used in the inverse design to realize other functionalities and benefit real-world applications [57–62]. Holographic images, for example, can be optimized by ANNs to achieve a wide viewing angle and a three-dimensional vectorial field, as recently demonstrated by H. Ren et al. [63]. They used a network named multilayer perceptron ANN (MANN), which was composed of an input layer fed with an arbitrary three-dimensional (3D) vectorial field, four hidden layers, and an output layer for the synthesis of a two-dimensional (2D) vector field. There are 1000 neurons within each hidden layer. The scheme of this ANN is shown in the top left panel of Fig. 4(b). The authors showed that an arbitrary 3D vectorial field can be achieved with a 2D vector field predicted by the well-trained model. A 2D Dirac comb function was then applied to sample the desired image. Subsequently, a digital hologram, calculated from the desired image, was combined with the 2D vector field. This process can be visualized in the right panel of Fig. 4(b). With a split-screen spatial light modulator that independently controls the amplitude and phase of orthogonal circularly polarized light, any desired 2D vector beam can be generated. As a result, the experimentally measured image from the hologram can show four different 3D vectorial fields in different regions, as presented in the bottom left panel of Fig. 4(b). The authors experimentally realized an ultrawide viewing angle of 94° and a high diffraction efficiency of 78%. The demonstrated 3D vectorial holography opens avenues to widespread applications such as holographic displays, multidimensional data storage, machine learning, microscopy, and imaging systems.

    Another exciting work enabled by ANNs is a self-adaptive cloak that can respond within milliseconds to ever-changing incident waves and surrounding environments without human intervention [64]. A pretrained ANN was adopted to achieve this function. As schematically illustrated in the left panel of Fig. 4(c), a single layer of active meta-atoms was applied at the surface of the cloak, and the reflection spectrum of each varactor diode was controlled independently by a DC bias voltage. To achieve the invisibility cloak function, the bias voltage was determined by the pretrained ANN, with the characteristics of the incident wave (such as the incident angle, frequency, and reflection amplitude) as the input. The temporal response of the cloak was simulated, and an extremely fast transient response of 16 ms was observed in the simulation. The authors then conducted the experiment, in which a p-polarized Gaussian beam illuminated, at an angle θ, the object covered by the chameleon-like cloak. Two detectors were used to extract the signals from the background and the incident wave to characterize the cloak. The right panel of Fig. 4(c) shows the experimental results at two incident angles (9° and 21°) and two frequencies (6.7 and 7.4 GHz). The magnetic field distribution in the case of a cloaked object is similar to that when only the background is present, while it is distinctly different from the bare object case. Differential radar cross-section (RCS) measurements further confirmed the performance of the cloak.

    3. RETRIEVE COMPLEX STRUCTURES BY CONVOLUTIONAL NEURAL NETWORKS

    A. Introduction of CNNs

    The desired designs and structures are oftentimes hard to parameterize, especially when the structure of interest contains many basic shapes [41,65] or is freeform [66,67]. In some cases, we also need to deal with complex optical responses as the input [68]. Therefore, converting the structure to a 2D or 3D image is usually a good approach in these studies. Moreover, it offers much larger degrees of freedom in the design process. However, preprocessing is required to handle the image input if we still want to use the FCLs-based model. Reshaping the image to a one-dimensional vector and applying feature extraction with linear embeddings, such as principal component analysis and random projection, are two effective ways to preprocess the image so that the input is compatible with the FCLs. However, the performance is usually not satisfactory. The reason is that these conversions either break the correlation of the nearest pixels in the vertical direction within an individual image or miss part of the information describing the integrity of the whole image. The extremely large dimension of the input is another big issue, which increases the number of connections between layers quadratically. For conventional parameter input, the input dimension is usually a few tens or hundreds, while for a vectorized image, even an image with 64×64 pixels results in a 4096-dimensional input vector. CNNs are very suitable for such circumstances. CNNs accept an image input without preprocessing, and several filters then move along the horizontal and vertical directions of the image to extract different features. Each filter has a set of weights and performs a convolutional operation at each subarea of the image, that is, the summation of the pointwise multiplication between the values of the subarea and the weights of the filter.
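
    The convolution operation described above can be written in a few lines. The toy example below (the image and the 3×3 filter are arbitrary; a real CNN learns many filters from data) slides one filter over a pixelated structure and, at every position, sums the element-wise product of the filter weights and the underlying subarea.

```python
# Toy illustration of the convolution operation: slide a filter over the image and
# sum the element-wise product of the filter weights and the covered subarea.
import numpy as np

def conv2d(image, kernel):
    kh, kw = kernel.shape
    H, W = image.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.random.rand(64, 64)          # pixelated structure (e.g., a binary meta-atom pattern)
edge_filter = np.array([[1., 0., -1.],  # one 3x3 filter; a CNN learns many such filters
                        [2., 0., -2.],
                        [1., 0., -1.]])
features = conv2d(image, edge_filter)   # 62x62 feature map
```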


    Figure 5.(a) Schematic of the convolution operation, in which the filters map the subarea in the input image to a single value in the output image. (b) Schematic of the pooling operation, in which the subarea in the input image is pooled into a single value in the output according to the maximum or mean value. (c) The workflow of a conventional CNN. The input images pass through several convolutional layers, and then the extracted features are passed into the FCLs to predict the response (e.g., transmission, reflection, and absorption spectra).

    B. Design Complex Photonic Structures by CNNs


    Figure 6.(a) Top: Examples of cDCGAN-suggested images and the simulation results. Bottom: Entirely new structures suggested by the cDCGAN for desired spectra. (b) Top: The proposed deep generative model for metamaterial design, which consists of the prediction, recognition, and generation models. Bottom: Evaluation of the proposed model. The desired spectra either generated with user-defined function or simulated from an existing structure are plotted in the first column. The reconstructed structures with the simulated spectra are plotted in the second and third columns. (c) Left: Flowchart of the VAE-ES framework. Right: Test results of designed photonic structures from the proposed model and the simulated spectra. (a) is reproduced from Ref. [69] with permission; (b) is reproduced from Ref. [41] with permission; (c) is reproduced from Ref. [79] with permission.

    W. Ma et al. also demonstrated a probabilistic approach for the inverse design of plasmonic structures in 2019 [41]. In this work, the structure of interest was a metal-insulator-metal (MIM) structure, with the geometries pixelated into 64×64 images as training data. The authors focused on the co- and cross-polarized reflection spectra in the mid-infrared region from 40 to 100 THz. The developed neural network is shown at the top of Fig. 6(b), and it comprises the prediction, recognition, and generation models. Again, the input geometry passes through the CNNs to extract the features from the image. Then the prediction model with FCLs can automatically predict the reflection spectra from the geometry features. For the inverse design part, the authors incorporated a variational auto-encoder (VAE) structure [76,77], which is a probabilistic approach, into the model. It works in the following way. First, the recognition network encodes both the structures and the corresponding spectra into a latent space with a standard Gaussian prior distribution. Then, in the generation model, the network takes the desired spectra together with a latent variable randomly sampled from the conditional latent distribution to reconstruct one geometry. The three models are trained together in an end-to-end manner. The well-trained model can not only predict the spectra from a given structure, serving as a powerful alternative to numerical simulation, but also reconstruct multiple structures from user-defined spectra. The bottom part of Fig. 6(b) shows the performance of the model trained with 30,000 data points for spectral prediction and the inverse design for both user-defined spectra (first row) and spectra from a test structure (second row). The first column in the figure shows the target spectra. In the case where a test structure is used to generate the spectra, the predicted spectrum from the prediction model is also plotted as a scatter plot, which shows excellent agreement with the spectra from full-wave simulation (solid lines). In the second and third columns, two examples of the geometry from the inverse design model and their simulated spectra are depicted. One can find that even though the structures are very different from each other and also from the ground truth, the spectra resemble the target ones. The authors further expanded the basic shapes by transfer learning to enable the reconstruction of a wide range of geometry groups. The generality of the model was exemplified by the design of double-layer chiral metamaterials. Very recently, W. Ma and Y. Liu developed a semi-supervised learning strategy to accelerate the training data generation process, the most time-consuming part of the deep-learning-aided inverse design [78]. In addition to the labeled data that have both the geometries of the structures and the simulated spectra, unlabeled data with only the geometry information are included. Unlike the labeled data, for which simulated spectra serve as the input to the inverse design model, the predicted spectra of the unlabeled data are used as the input to reconstruct the geometry. Without numerical simulation, the unlabeled data can be generated several orders of magnitude faster. They also help to dramatically lower the training loss by 10%–30% for a model trained with the same number of labeled data.
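
    A compact, self-contained sketch of such a conditional VAE is given below. It is not the network of Ref. [41]: the CNN feature extractor and the prediction branch are omitted, FCLs replace the convolutional encoder, and all dimensions and loss weights are illustrative. It only shows the mechanism described above, i.e., a recognition network encoding (structure, spectrum) pairs into a Gaussian latent space and a generation network reconstructing a structure from a spectrum plus a sampled latent variable.

```python
# Conditional-VAE sketch: recognition (encoder) and generation (decoder) networks.
# Structures x are assumed to be binary images flattened to [0, 1] vectors.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_dim, spec_dim, latent_dim = 64 * 64, 200, 20

class RecognitionNet(nn.Module):              # (structure, spectrum) -> latent distribution
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(img_dim + spec_dim, 512), nn.ReLU())
        self.mu = nn.Linear(512, latent_dim)
        self.logvar = nn.Linear(512, latent_dim)
    def forward(self, x, s):
        h = self.fc(torch.cat([x, s], dim=-1))
        return self.mu(h), self.logvar(h)

class GenerationNet(nn.Module):               # (spectrum, latent sample) -> structure
    def __init__(self):
        super().__init__()
        self.fc = nn.Sequential(nn.Linear(spec_dim + latent_dim, 512), nn.ReLU(),
                                nn.Linear(512, img_dim), nn.Sigmoid())
    def forward(self, s, z):
        return self.fc(torch.cat([s, z], dim=-1))

enc, dec = RecognitionNet(), GenerationNet()

def vae_loss(x, s):
    mu, logvar = enc(x, s)
    z = mu + torch.randn_like(mu) * torch.exp(0.5 * logvar)       # reparameterization trick
    x_rec = dec(s, z)
    rec_loss = F.binary_cross_entropy(x_rec, x)                   # reconstruction term
    kl = -0.5 * torch.mean(1 + logvar - mu.pow(2) - logvar.exp()) # pull latent toward standard Gaussian
    return rec_loss + kl
```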

    Z. Liu et al. introduced a hybrid approach by combining the VAE model and the evolution strategy (ES) [79]. The framework of the hybrid model is shown on the left of Fig. 6(c). In each iteration, a generation of latent vectors v is fed into the model, and a structure is reconstructed from each of them. Then a well-trained simulator is used to predict the transmittance spectra of the structures, and a fitness score is calculated. If the criteria are not yet satisfied, the ES performs reproduction and mutation with the mutation strength m to create a new generation of latent vectors. Such a process is repeated until the criteria are met. The details of the ES will be discussed in the genetic algorithm part in the next section. The right panel of Fig. 6(c) shows the performance of the inverse design model. The solid line and dashed line are the simulated spectra of the test pattern (orange) obtained by the finite element method and of the reconstructed pattern (black) from the hybrid model, respectively. All the works in Fig. 6 solve the one-to-many mapping issue with a probabilistic approach such as VAEs and GANs, where a randomly sampled parameter or vector is combined with the desired optical response as the input to reconstruct the structure. This enables the ANNs to explore the full physical possibilities of the design space and produce sophisticated structures for novel functions.
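
    The ES loop over the latent space can be sketched as follows (the population size, mutation strength, survivor fraction, and stopping criterion are arbitrary, and the decoder and simulator are replaced by placeholder functions):

```python
# Rough sketch of an evolution-strategy loop over a VAE latent space.
import numpy as np

latent_dim, pop_size, mutation_strength = 20, 50, 0.1

def decode(v):            # placeholder for the trained VAE decoder: latent -> structure
    return v
def fitness(structure):   # placeholder for the trained simulator + figure of merit
    return -np.sum(structure ** 2)

population = np.random.randn(pop_size, latent_dim)      # initial generation of latent vectors
for generation in range(200):
    scores = np.array([fitness(decode(v)) for v in population])
    if scores.max() > -1e-3:                             # stop once the criterion is met
        break
    survivors = population[np.argsort(scores)[-pop_size // 5:]]   # keep the best 20%
    # reproduction + mutation: resample children around the surviving parents
    parents = survivors[np.random.randint(len(survivors), size=pop_size)]
    population = parents + mutation_strength * np.random.randn(pop_size, latent_dim)
```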


    Figure 7.(a) Left: One example of 1-bit coding elements with regular phase differences. Right: Comparison of the simulated and measured results of the dual- and triple-beam coding metasurfaces. (b) Schematic of the proposed 3D CNN model to characterize the near-field and far-field properties of arbitrary dielectric and plasmonic nanostructures. (c) Left: Sketch of the nanostructure geometry and the 1D CNN-based ANNs. Right: Training convergence and readout accuracy of the ANNs. (d) Left: The workflow of designing the DMD pattern for light control through scattering media with ANNs. Right: The structures of the FCLs-based single-layer neural network and the CNNs, together with the simulated and measured results for the focusing effect. (a) is reproduced from Ref. [80] with permission; (b) is reproduced from Ref. [81] with permission; (c) is reproduced from Ref. [86] with permission; (d) is reproduced from Ref. [87] with permission.

    CNNs are widely applied in 2D image processing. The significance of CNNs is attributed to their ability to keep the local segments of the input intact, and they can in principle work in an arbitrary dimension. Taking advantage of this property, P. R. Wiecha and O. L. Muskens built a model with 3D CNNs to predict the near-field and far-field electric/magnetic response of arbitrary nanostructures [81]. They pixelated the dielectric or plasmonic nanostructure of interest into a 3D image and fed the image into several layers of 3D CNNs. An output 3D image with the same size as the input was then predicted, representing the electric field under a fixed wavelength and polarization in the same coordinate system, as shown in Fig. 7(b). The residual connections and shortcut connections in the network are known as residual learning [82] and U-Net [83] blocks, which help to stabilize the gradients of the network and make it deeper without compromising its performance [84,85]. From the predicted near-field response, other physical quantities, such as far-field scattering patterns, energy flux, and electromagnetic chirality, can then be deduced. The authors studied two cases: 2D gold nanostructures with random polygonal shapes and 3D silicon structures consisting of several pillars. Each scheme was trained with simulated data of 30,000 distinct geometries. With the well-trained model, the authors reproduced several nano-optical effects from the near-field prediction of the 3D CNNs, such as the antenna behavior of gold nanorods and Kerker-type scattering of Si nanoblocks. The model can potentially serve as an extremely fast tool to replace current full-wave simulation methods, with the trade-off of slightly decreased accuracy.
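
    The building block behind such a field-prediction network can be sketched as a 3D convolution with a residual shortcut (the channel counts, kernel sizes, and the meaning of the output channels below are illustrative and not taken from Ref. [81]):

```python
# Bare-bones sketch of a 3D convolutional block with a residual (shortcut) connection.
import torch
import torch.nn as nn

class Residual3DBlock(nn.Module):
    def __init__(self, channels):
        super().__init__()
        self.conv1 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.conv2 = nn.Conv3d(channels, channels, kernel_size=3, padding=1)
        self.act = nn.ReLU()
    def forward(self, x):
        h = self.act(self.conv1(x))
        return self.act(x + self.conv2(h))       # shortcut connection stabilizes gradients

net = nn.Sequential(
    nn.Conv3d(1, 16, kernel_size=3, padding=1),  # input: voxelized structure, 1 channel
    Residual3DBlock(16),
    Residual3DBlock(16),
    nn.Conv3d(16, 6, kernel_size=3, padding=1),  # output: e.g., field components per voxel
)

structure = torch.rand(1, 1, 32, 32, 32)         # one 32x32x32 voxel image
fields = net(structure)                          # predicted field image, same spatial size
```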

    In parallel, a one-dimensional (1D) CNN was introduced to analyze the scattering spectra of silicon nanostructures for optical information storage, as demonstrated by P. R. Wiecha et al. in 2019 [86]. The authors used Si nanostructures to store bit information with high density, as shown in the left panel of Fig. 7(c). The nanostructure was divided into N parts. If a certain part contained a silicon block, the particular bit was defined as "1;" otherwise it was "0." Therefore, an N-bit information storage unit was created. The readout of the information encoded in the nanostructure was through far-field measurement. Here, the dark-field spectra under x- and y-polarized light in the visible range were chosen as the measured information. The 1D CNNs together with FCLs were used to analyze the spectra, where the input of the classification problem was the scattering spectra and the output was the index of the class among the 2^N classes for N bits, representing the bit sequence. The network was trained with experimentally measured dark-field spectra of 625 fabricated nanostructures for each geometry. The model trained after 100 epochs showed quasi-error-free prediction with an accuracy higher than 99.97% for the 2-bit to 5-bit (or even 9-bit) geometries, as demonstrated in the right panel of Fig. 7(c). The authors further showed that the input information can be greatly reduced by feeding the network with only a small spectral window of around 100 nm or even several discrete data points on the spectra, while the effect on the accuracy was negligible. Finally, the authors managed to retrieve the stored information from the RGB values of the dark-field color image of the nanostructures. This new approach reduces the complexity and equipment cost of the readout process and at the same time promises massively parallel retrieval of information.
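
    A schematic version of such a spectrum-to-bit-sequence classifier is shown below (the spectral length, channel counts, and N are placeholders; the actual architecture in Ref. [86] differs in its details). The two input channels hold the x- and y-polarized dark-field spectra, and the output has one class per possible bit sequence, i.e., 2^N classes.

```python
# Schematic 1D-CNN classifier mapping scattering spectra to one of 2^N bit sequences.
import torch
import torch.nn as nn

N_bits, spec_len = 4, 128
classifier = nn.Sequential(
    nn.Conv1d(2, 16, kernel_size=5, padding=2), nn.ReLU(),   # 2 channels: x- and y-polarized spectra
    nn.MaxPool1d(2),
    nn.Conv1d(16, 32, kernel_size=5, padding=2), nn.ReLU(),
    nn.MaxPool1d(2),
    nn.Flatten(),
    nn.Linear(32 * (spec_len // 4), 2 ** N_bits),            # one class per possible bit sequence
)

spectra = torch.rand(8, 2, spec_len)          # batch of measured dark-field spectra
logits = classifier(spectra)                  # trained with cross-entropy against the class index
```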

    CNNs are not always the best choice for image inputs, as found by A. Turpin et al. in 2018 [87]. The scheme of this work is shown on the left of Fig. 7(d). They studied the speckle produced by an illuminated digital micromirror device (DMD) pattern after the light passed through a layer of scattering material, such as a glass diffuser or a multimode fiber. They intended to inversely design the DMD pattern required for the output speckle to form a certain image. The authors built two models, one with a single FCL and one with multilayer CNNs. The right panel of Fig. 7(d) presents the results of the inverse designs for the desired Gaussian beam outputs based on the two models. We can see that the measured results of the single FCL look better than those of the multilayer CNNs. Quantitatively, both models can achieve a signal-to-noise ratio larger than 10. However, the enhancement metric is η = 32 for the first model and only 3.6 for the second model, where η is defined as the intensity at the generated focal point divided by the mean intensity of the background speckle. Therefore, the authors concluded that in this particular application, CNNs can reduce the number of network parameters by almost 80% compared to the single FCL, but at the cost of worse performance when trained with a similar amount of data. The well-trained model can then be used to predict the required illumination pattern for varied output images. In this way, the authors achieved a dynamic scan of the focal point by manipulating the input illumination at a high frame rate of 22.7 kHz.

    4. OTHER INTELLIGENT ALGORITHMS FOR PHOTONIC DESIGNS


    Figure 8.(a) Left: Illustration of meta-molecules. Right: Fabricated samples and the measured and simulated results of polarization conversion. (b) Top: Schematic of a silicon metagrating that deflects light to a certain angle. Bottom: The proposed conditional GLOnet for metagrating optimization. (c) Top: Schematic of structure refinement and filtering for the high-efficiency thermal emitter. Bottom: The efficiency, emissivity, and normalized emission of the well-optimized thermal emitter. (d) Top: Illustration of the unit cell consisting of three metallic patches connected via PIN diodes and a photograph of the fabricated metasurface. Bottom: Experimental results for reconstructing human body imaging. (a) is reproduced from Ref. [95] with permission; (b) is reproduced from Ref. [100] with permission; (c) is reproduced from Ref. [42] with permission; (d) is reproduced from Ref. [104] with permission.

    Another widely used optimization algorithm for the inverse design is gradient-based topology optimization [21,96–103]. In the optimization process, the design space is discretized into pixels whose properties (i.e., refractive index) can be represented by a parameter set p. The parameter set is then optimized for a prescribed target response by maximizing (or minimizing) a user-defined objective function F. Starting from an initial parameter set, both a forward simulation and an adjoint simulation are performed to calculate the gradient of the objective function, ∂F/∂p_i, with respect to each parameter. Then the parameters are updated according to the gradient ascent (descent) method. This iterative process is continued until the objective function is well optimized. Taking advantage of topology optimization, J. Jiang et al. presented a global optimizer for highly efficient metasurfaces that can deflect light to desired angles [100]. As illustrated in the top panel of Fig. 8(b), the metagrating in one period is divided into 256 segments, and each segment can be filled with either air or Si. To optimize the metagrating, the authors used a global optimization method named GLOnet. The GLOnet is based on both a generative neural network (GNN) and topology optimization, as shown in the bottom panel of Fig. 8(b). The GNN takes the desired deflection angle θ and the working wavelength λ, together with a random noise vector z, as inputs. The inputs pass through FCLs and layers of deconvolutional blocks, and a metagrating design is then generated. The Gaussian filter at the last layer of the generator eliminates small features that are hard to fabricate. Next, topology optimization is applied: by performing both a forward simulation and an adjoint simulation, the gradient of the objective function (the efficiency) is calculated, and the weights of the ANNs are updated according to the gradient ascent method. To make the model capable of working for any deflection angle and wavelength, the initialization of the model is essential to span the full design space. Therefore, an identity shortcut is added to map the random noise directly to the output design, which enables all kinds of designs when the initial weights of the GNN are small. It should be noted that the GLOnet is different from conventional topology optimization. In conventional topology optimization, the structural parameters (such as the refractive index of individual segments) are updated for a single device with a fixed deflection angle and wavelength; when the goal (deflection angle θ) or the working wavelength is changed, the optimization needs to be performed again for the new device. In the GLOnet, however, the optimized parameters are the weights of the neural network during each iteration. Therefore, the GNN can inversely design devices for varied goals and working wavelengths without the need to retrain the model when the target changes. The performances of conventional topology optimization and the GLOnet optimization were compared in this work: 92% of the devices designed by the GLOnet have efficiencies higher than, or within 5% of, the efficiencies of the devices designed by the other method. In addition, the retrieved devices gradually converge to a high-efficiency region as the iteration number of the training process increases.
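
    The update mechanism of such a generator-based optimizer can be caricatured as follows. This is a deliberately simplified sketch, not the actual GLOnet implementation: the conditional generator, the efficiency figure of merit, and the adjoint-gradient routine are placeholders, and the efficiency-dependent reweighting used in Ref. [100] is omitted. The point is only that the adjoint gradient with respect to the generated pattern is back-propagated into the generator weights, which are updated by gradient ascent on the efficiency.

```python
# Simplified sketch of a generator-plus-adjoint-gradient update step.
import torch
import torch.nn as nn

n_segments, noise_dim = 256, 32
generator = nn.Sequential(                     # (theta, lambda, z) -> metagrating pattern
    nn.Linear(2 + noise_dim, 256), nn.ReLU(),
    nn.Linear(256, n_segments), nn.Tanh(),     # values in (-1, 1): air vs. Si
)
optimizer = torch.optim.Adam(generator.parameters(), lr=1e-3)

def adjoint_gradient(patterns):
    # Placeholder for the forward + adjoint simulations, which would return
    # d(efficiency)/d(pattern) for every pattern in the batch.
    return (-2.0 * patterns / patterns.shape[1]).detach()

for step in range(500):
    cond = torch.rand(64, 2)                   # sampled (deflection angle, wavelength) pairs
    z = torch.randn(64, noise_dim)             # random noise vectors
    patterns = generator(torch.cat([cond, z], dim=-1))
    grad = adjoint_gradient(patterns)
    # Descending this surrogate loss pushes each pattern along its adjoint gradient,
    # i.e., gradient ascent on efficiency, through the generator weights.
    loss = -(patterns * grad).sum()
    optimizer.zero_grad(); loss.backward(); optimizer.step()
```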

    Combining topology optimization and ANNs, Z. A. Kudyshev et al. studied the structure optimization of high-efficiency thermophotovoltaic (TPV) cells operating in the desired wavelength range (λ = 0.5–1.7 μm) [42]. The design is based on a gap plasmonic structure. As shown in the top panel of Fig. 8(c), the optimization can be divided into three main steps. First, the topology optimization method is applied to generate a group of appropriate structures for training. Then an adversarial autoencoder (AAE) network is trained. Similar to the VAE, the AAE consists of an encoder that maps the input designs to a latent space and a decoder that retrieves the structure from a latent vector sampled from the latent space. Both the VAE and AAE models try to make the latent distribution q(z) approach a predefined distribution p(z) (a 15-dimensional Gaussian distribution in Ref. [42]). In the VAE model, a Kullback–Leibler divergence that compares q(z) with p(z) is defined as one part of the loss function, whereas in the AAE, a discriminator is built to distinguish samples drawn from q(z) and p(z), and the encoder is trained to generate samples that can fool the discriminator. In the last step, the structure retrieved from the decoder is refined with topology optimization to remove the blurring of the generated designs. As a result, the hybrid method that combines the AAE and topology optimization shows great performance, providing a mean efficiency of 90% for the retrieved structures. In contrast, the efficiency is 82% for direct topology optimization. The comparison between these two methods is shown at the bottom of Fig. 8(c), together with the emissivity and emission plots for the best designs from either method. In a very recent work [105], the same group further developed a global optimization method in which a global optimization engine generates latent vectors and a VGG network (Visual Geometry Group network) rapidly assesses the performance of the design.
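
    The adversarial part can be condensed into the sketch below (all sizes are illustrative, FCLs stand in for the actual convolutional encoder/decoder, and the structures are assumed to be binary images in [0, 1]): a discriminator learns to tell prior samples p(z) from encoded samples q(z), and the encoder is trained both to reconstruct the designs and to fool the discriminator, which replaces the KL-divergence term of a VAE.

```python
# Condensed sketch of an adversarial autoencoder training step.
import torch
import torch.nn as nn
import torch.nn.functional as F

img_dim, latent_dim = 64 * 64, 15
encoder = nn.Sequential(nn.Linear(img_dim, 256), nn.ReLU(), nn.Linear(256, latent_dim))
decoder = nn.Sequential(nn.Linear(latent_dim, 256), nn.ReLU(), nn.Linear(256, img_dim), nn.Sigmoid())
discriminator = nn.Sequential(nn.Linear(latent_dim, 64), nn.ReLU(), nn.Linear(64, 1), nn.Sigmoid())

bce = nn.BCELoss()
opt_ae = torch.optim.Adam(list(encoder.parameters()) + list(decoder.parameters()), lr=1e-3)
opt_d = torch.optim.Adam(discriminator.parameters(), lr=1e-3)

def train_step(x):                                     # x: batch of flattened designs in [0, 1]
    # 1) reconstruction: encoder + decoder
    z = encoder(x)
    rec_loss = F.binary_cross_entropy(decoder(z), x)
    # 2) discriminator: distinguish prior samples p(z) from encoded samples q(z)
    z_prior = torch.randn_like(z)
    d_loss = bce(discriminator(z_prior), torch.ones(len(x), 1)) + \
             bce(discriminator(z.detach()), torch.zeros(len(x), 1))
    opt_d.zero_grad(); d_loss.backward(); opt_d.step()
    # 3) encoder as generator: fool the discriminator so q(z) approaches p(z)
    g_loss = bce(discriminator(z), torch.ones(len(x), 1))
    opt_ae.zero_grad(); (rec_loss + g_loss).backward(); opt_ae.step()
```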

    Conventional machine learning methods, such as Bayesian learning [106], clustering [107], and manifold learning [104], are also very helpful in solving photonic design problems. In 2019, L. Li et al. showcased a machine-learning-based imager that can efficiently record the microwave image of a moving object by a reprogrammable metasurface [104]. This work may pave the way for intelligent surveillance with both fast response time and high accuracy. The meta-atom has three metallic patches connected via PIN diodes to encode 2-bit information as schematically shown in the top panel of Fig. 8(d). The digital phase step is around 90° between adjacent states, and the state can be tuned by applying an external bias voltage. The authors recorded a moving person for less than 20 min to generate the training data for the model. With principal component analysis (or random projection), the main modes with significant contributions were calculated. Then all meta-atoms were tuned by a bias voltage to match the principal component analysis modes for each measurement. In this way, the measurement became more efficient because it always captured the information with a high contribution to reconstructing the microwave image. To test the well-trained model, another person was moving in front of the metasurface, and images of the movements were reconstructed as shown at the bottom of Fig. 8(d). With only 400 measurements, which were far fewer than the number of pixels, high-quality images could be produced even when the person was blocked by a 3-cm-thick paper wall. This method was further extended to the classification problem, in which the authors defined three different movements (i.e., standing, bending, and raising arms). With a simple nearest-neighbor algorithm, only 25 measurements led to good recognition of the movements.
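
    The measurement-mode idea can be emulated offline with standard PCA (the data shapes below are invented, and in the real system each retained component corresponds to one metasurface coding pattern and one microwave measurement rather than a direct pixel readout):

```python
# Illustrative offline emulation of PCA-based measurement-mode selection.
import numpy as np
from sklearn.decomposition import PCA

n_pixels, n_training_frames, n_measurements = 4096, 2000, 400
training_scenes = np.random.rand(n_training_frames, n_pixels)   # recorded training frames

pca = PCA(n_components=n_measurements)
pca.fit(training_scenes)                       # measurement modes = leading principal components

new_scene = np.random.rand(n_pixels)           # unknown scene to be imaged
coefficients = pca.transform(new_scene[None])  # each coefficient stands in for one measurement
reconstruction = pca.inverse_transform(coefficients)[0]   # image rebuilt from few measurements
```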

    5. CONCLUSION AND OUTLOOK

    In this review, we have introduced the basic idea of applying ANNs and other advanced algorithms to accelerate and optimize photonic designs, including plasmonic nanostructures and metamaterials. We have highlighted some representative works in this field and discussed the performance and applications of the proposed models. In the inverse design problem, the neural network is usually built upon FCLs and CNNs, integrated with other neural network units such as ResNets and RNNs. It is beneficial to combine ANNs with conventional optimization methods such as the genetic algorithm and topology optimization, because the conventional optimization methods can help to perform global optimization and provide feedback to further improve the ANNs. The emergence of all these methods offers a great opportunity to increase the structural complexity of devices, which can in turn realize much more complex and novel functionalities.


    Figure 9.(a) Top: Comparison between the all-optical D2NN and a conventional ANN. Bottom: Measured performance of the classifier for handwritten digits and fashion products. (b) Top: Sketch of the optical logic operations by a diffractive neural network. Bottom: Experiment setup and measured results of three basic logic operations on the fabricated metasurface. (a) is reproduced from Ref. [118] with permission; (b) is reproduced from Ref. [119] with permission.

    ANNs are typically considered a “black box,” since the relationship between inputs and outputs learned by the ANNs is usually implicit. In some published works, researchers visualize the output of each individual layer to provide some information on what feature is learned (or what function is performed) by each layer [40], which is a good attempt. However, if we could further extract this relationship explicitly from well-trained ANNs, it would be very helpful for finding new structure groups that lie outside the conventional geometry groups (such as H-shape, C-shape, and bowtie). At the same time, it would provide guidelines and insights for the design of optical devices.

    Another important direction is to extend the generality of ANN models. When applying ANNs to solve traditional tasks, such as image recognition and natural language processing, we want the neural networks to learn the information and distribution that lie inside the natural images or languages themselves and try to reconstruct or approximate these distributions. ANNs have been proven to work well in learning and summarizing the distributions from images or languages, and it is relatively easy to extend a model to deal with other kinds of images or languages. However, the inverse design tasks in photonics are more complicated, because the ANNs need to learn the implicit physical rules (such as Maxwell’s equations) between the structures and their optical responses, instead of the information and distribution associated with the structures themselves. Therefore, extending the capability of a well-trained neural network in the inverse design problems remains a challenge. Most of the ANNs described in this review are specified only for a certain design platform or application. It is true that a model can be fine-tuned to handle different tasks, but the model needs to be retrained and an additional training data set is required. When the original training set contains all kinds of training data for multiple tasks, multiple design rules are likely to be involved and learned by the ANNs. The performance of the model will then be less satisfactory for each individual task than that of a model trained with only a task-specific data set, because the rules for the other tasks act as perturbation or noise in this case. It is therefore very important to find the right trade-off.

    Over the past decades, photonics and artificial intelligence have been evolving largely as two separate research disciplines. The intersection and combination of these two topics in recent years have brought exciting achievements. On one hand, the innovative ANN models provide a powerful tool to accelerate the optical design and implementation process. Some nonintuitive structures and phenomena have been discovered by this new strategy. On the other hand, the developed optical designs are expected to produce a variety of real-world applications, such as optical imaging, holography, communications, and information encryption, with high efficiency, fidelity, and robustness. Toward this goal, we need to include the practical fabrication constraints and underlying material properties into the design space in order to globally optimize the devices and systems. We believe that the field of interfacing photonics and artificial intelligence will significantly move forward as more researchers from different backgrounds join this effort.

    References
