• Opto-Electronic Advances
  • Vol. 5, Issue 3, 210147 (2022)
Sergey Krasikov1,2, Aaron Tranter3, Andrey Bogdanov2, and Yuri Kivshar1,*
Author Affiliations
  • 1Nonlinear Physics Center, Research School of Physics, The Australian National University, Canberra ACT 2601, Australia
  • 2School of Physics and Engineering, ITMO University, St. Petersburg 197101, Russia
  • 3Centre for Quantum Computation and Communication Technology, Department of Quantum Science, Research School of Physics, The Australian National University, Canberra, ACT 2601, Australia
    DOI: 10.29026/oea.2022.210147
    Sergey Krasikov, Aaron Tranter, Andrey Bogdanov, Yuri Kivshar. Intelligent metaphotonics empowered by machine learning[J]. Opto-Electronic Advances, 2022, 5(3): 210147

    Abstract

    In recent years, a dramatic boost in research has been observed at the junction of photonics, machine learning and artificial intelligence. A new methodology can be applied to the description of a variety of photonic systems including optical waveguides, nanoantennas, and metasurfaces. These novel approaches underpin the fundamental principles of light-matter interaction developed for a smart design of intelligent photonic devices. Artificial intelligence and machine learning are rapidly penetrating the fundamental physics of light, and they provide effective tools for the study of metaphotonics driven by optically induced electric and magnetic resonances. Here we overview the evaluation of metaphotonics induced by artificial intelligence and present a summary of the concepts of machine learning with some specific examples developed and demonstrated for metasystems and metasurfaces.

    Introduction

    As demonstrated in recent years, artificial intelligence (AI) is quickly becoming a ubiquitous concept in various physical sciences1-6. Specifically, there is a variety of challenging tasks in optics and photonics that can be efficiently analysed and solved by leveraging these novel approaches instead of directly solving Maxwell’s equations. Merging AI with photonics is particularly advantageous because Maxwell’s equations describe a wide range of experimentally observed phenomena very precisely. At the same time, the direct solution of Maxwell's equations can be used to generate the vast amounts of training data required for implementing various AI algorithms. Also, by employing AI in optics and photonics, we may expect the creation of efficient photonic quantum machines that eventually may merge with much broader biological intelligence. Thus, a remarkably powerful AI methodology that complements many conventional analytical and numerical techniques finds many important applications in photonics. In particular, such approaches are used for inverse design, optimization, and big-data processing, underpinning the rapid development of precise photonic technologies.

    Metaphotonics is a new and rapidly developing direction in subwavelength photonics7. It is inspired by the physics of metamaterials where the electromagnetic response is associated with the magnetic dipole resonances and optical magnetism originating from resonant dielectric nanostructures with a high refractive index. The concept of all-dielectric resonant nanophotonics is driven by the idea of employing subwavelength dielectric Mie-resonant nanoparticles as “meta-atoms” for creating highly efficient optical metasurfaces and metadevices. These are defined as devices whose unique functionalities arise from smart structuring of the meta-atoms at the subwavelength scale combined with the use of functional and high-refractive-index materials8. In contrast to classical optics, where the electromagnetic response is completely defined by electric polarization, metaphotonics is often termed “meta-optics”, emphasizing the importance of the optically induced magnetic response of artificial subwavelength-patterned structures. High-refractive-index materials provide excellent confinement of electromagnetic fields, making even subwavelength particles resonant. The interference between these resonances results in a variety of scattering effects that do not exist in classical optics. Dielectric nanoscale structures are expected to complement or even replace different plasmonic components in a range of potential applications. Moreover, many concepts which had been developed for plasmonic structures, but fell short of their potential due to strong losses of metals at optical frequencies, can now be realized based on low-loss dielectric structures.

    High-index dielectric resonators can support both electric and magnetic Mie-type resonances which can be tailored by the nanoparticle geometry. Mie-resonant silicon nanoparticles have recently received considerable attention for applications in nanophotonics and metamaterials9 including optical nanoantennas, wavefront-shaping metasurfaces, and nonlinear frequency generation. Importantly, the simultaneous excitation of strong electric and magnetic Mie-type dipole and multipole resonances can result in constructive or destructive interferences with unusual spatial scattering characteristics, and it may also lead to the resonant enhancement of magnetic fields in dielectric structures that could bring many novel effects in both linear and nonlinear regimes.

    Here, we discuss the current progress in the recently emerged fields of machine learning and artificial intelligence focusing on their applications to metaphotonics driven by optically induced electric and magnetic resonances. More specifically, we describe how these novel ideas can be applied to study optical nanoantennas and metasurfaces and improve their performance and functionalities. We also present a summary of the basic concepts of machine learning with some specific examples developed and demonstrated for a variety of metasystems and metasurfaces.

    This Perspective is organized as follows. In Sec. Basic concepts of machine learning, we broadly discuss the basic concepts of machine learning irrespective of specific applications. We introduce the concepts of deep learning and machine learning and show how they are linked to artificial neural networks and artificial intelligence concepts. Next, in Sec. Metasystems and metasurfaces, we briefly summarize the main ideas of metaphotonics and describe the main parameters and properties that usually require optimisation. In Sec. Advanced nanoantennas, we consider several examples of advanced nanoantennas and their optimisation with the help of machine learning. Secs. Transformative metasurfaces, Chemical and biological sensing, and Self-adapting metasystems discuss various aspects of functional and self-adaptive metasurfaces and their applications to sensing. Finally, Sec. Perspective and outlook concludes the paper.

    Basic concepts of machine learning

    In this perspective article, we first provide a brief description of machine learning (ML) algorithms in order to outline the general ideas of those methods suitable for applications. We then shift the focus to the application of these powerful tools in metaphotonics without a thorough description of the technical details of each specific example.

    While AI is a broad field closely connected with the development of systems mimicking intelligence, ML provides the set of data-driven algorithms that give AI its ability to learn10. The goal of ML algorithms is to find a sequence of programmatic transformations connecting the input and output data. This could constitute a mathematical model that connects physical parameters to observed phenomena (collected data)11-13. In another case, it could be the use of a heuristic to group data, for example, clustering images using k-means. Considering an example from physics, these might be the transformations which have to be applied to the geometric parameters of a unit cell of a metagrating to obtain the corresponding reflection spectrum. Generally, provided a good source of data, the more input data the algorithm receives, the more accurate the result will be. It should be highlighted that ML algorithms can work with the data itself and find correlations within it without any understanding of its physical, mathematical, or other meaning. However, in some cases, it is possible to provide physical intuition or to constrain models to obey physical laws such as, for example, the Navier-Stokes equations14, 15 or Maxwell's equations16. Existing ML methods span a range of tasks, including but not limited to classification, regression, clustering, anomaly detection and structured prediction. While ML has demonstrable efficacy for several tasks, there remains significant overhead involved in the application of such techniques. This usually involves tuning and design of the models in question, with the largest task generally being the feature extraction process, which often requires domain-specific expertise.
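
    As a minimal sketch of the k-means clustering mentioned above, the following groups one-dimensional values around two centres without using any labels or physical meaning. The function name `kmeans_1d`, the data values and the initial centres are hypothetical and chosen only for illustration.

```python
def kmeans_1d(values, centers, iterations=10):
    """Plain k-means on scalar data: assign each value to its nearest centre,
    then move every centre to the mean of the values assigned to it."""
    centers = list(centers)
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            clusters[nearest].append(v)
        # An empty cluster keeps its previous centre.
        centers = [sum(c) / len(c) if c else centers[i] for i, c in enumerate(clusters)]
    return centers, clusters


# Made-up measurements forming two groups.
values = [0.8, 1.0, 1.2, 4.8, 5.0, 5.2]
centers, clusters = kmeans_1d(values, centers=[0.0, 6.0])
print(centers, clusters)
```

    The algorithm only sees numbers; any interpretation of the two groups is left to the user.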

    Deep learning (DL) is a subclass of ML (see Fig. 1) solely based on layered structures referred to as artificial neural networks (ANNs)17, 18. ANNs derive their name from the neural structures found in biological entities. This structure is emulated mathematically as a node, referred to as a neuron, which may contain many input and output connections with associated weightings. Each neuron has a non-linear activation function that maps its inputs to an output and provides the switching behaviour seen in biological neurons. These neurons are stacked into layers which are then connected to subsequent layers. The utility of this is to construct a sufficiently deep neural network such that any arbitrary function may be approximated19. At the same time, an increase in the ANN complexity requires larger datasets for good prediction accuracy20. ANNs may be considered as a mapping from some input space to some output space which can be arbitrarily defined. ANNs can also have complicated structures for different tasks. The inclusion of convolutional layers has demonstrated great success in image processing21, and auto-regressive models have done likewise in natural language processing22 and for sequentially organised data23. A key feature of these models and DL in general is that the design process does not require the same domain-specific knowledge as classical ML methods. Instead, the features of a data set are learnt automatically to best facilitate the desired mapping from input to output. To perform a given task, DL models must undergo a process known as training. Training a model requires the introduction of a so-called loss function, which provides feedback on the difference between the real output (or "ground truth") and the output predicted by the network for the same input data. Training the ANN aims to minimize the loss function by adjusting the weight values at each layer. This is generally achieved via some form of stochastic gradient descent, which may be done efficiently over the network with backward propagation24. The training is termed complete when the model can predict the output with some desired quality metric.
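
    The training loop described above can be sketched in a few lines of plain Python. This is an illustrative toy under stated assumptions, not a method from the cited works: a network with one hidden layer of tanh neurons, a squared-error loss, and per-sample stochastic gradient descent with hand-written backward propagation. The target function sin(x), the layer width, the learning rate and the name `train_tiny_network` are all hypothetical choices.

```python
import math
import random


def train_tiny_network(samples, hidden=8, epochs=300, lr=0.02, seed=0):
    """Train a 1-input, 1-output network with one tanh hidden layer by SGD.

    Returns (predict, history); history[0] is the mean squared error before
    training and history[k] the error after epoch k.
    """
    rng = random.Random(seed)
    w1 = [rng.uniform(-1, 1) for _ in range(hidden)]  # input -> hidden weights
    b1 = [0.0] * hidden                               # hidden biases
    w2 = [rng.uniform(-1, 1) for _ in range(hidden)]  # hidden -> output weights
    b2 = 0.0                                          # output bias

    def forward(x):
        h = [math.tanh(w1[j] * x + b1[j]) for j in range(hidden)]
        y = sum(w2[j] * h[j] for j in range(hidden)) + b2
        return h, y

    def mean_loss(data):
        return sum((forward(x)[1] - t) ** 2 for x, t in data) / len(data)

    data = list(samples)
    history = [mean_loss(data)]
    for _ in range(epochs):
        rng.shuffle(data)
        for x, target in data:
            h, y = forward(x)
            err = y - target  # derivative of 0.5 * (y - target)**2 w.r.t. y
            for j in range(hidden):
                # Error propagated back through the tanh activation.
                grad_h = err * w2[j] * (1.0 - h[j] ** 2)
                w2[j] -= lr * err * h[j]
                w1[j] -= lr * grad_h * x
                b1[j] -= lr * grad_h
            b2 -= lr * err
        history.append(mean_loss(data))
    return (lambda x: forward(x)[1]), history


# Toy "ground truth": samples of sin(x) on [-2, 2].
samples = [(-2.0 + 0.2 * i, math.sin(-2.0 + 0.2 * i)) for i in range(21)]
predict, history = train_tiny_network(samples)
print("loss before training:", history[0], "after training:", history[-1])
```

    In practice such loops are delegated to a DL framework with automatic differentiation; the explicit version only makes the roles of the loss function and the weight updates visible.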

    Figure 1. Links between different ML-based concepts. Artificial intelligence (AI) is a part of computer science dedicated to the development of ways to mimic general intelligence. Machine learning (ML) is a subset of AI comprising data-driven algorithms which learn from experience and have the capacity to improve their performance over time and adapt to new data. ML algorithms are varied in their approach; deep learning (DL) is a subset of ML solely based on layered structures referred to as artificial neural networks. The main feature of DL is the capability to efficiently process raw unstructured data and automatically determine its features, while classical ML involves the processing of data in a manner predefined by a human operator.

    After the training, the ANN can accurately map the input to the desired output, implying that it has "learned" the necessary mapping from the given data. It should be noted that this mapping is not necessarily unique, and differences in the training procedure can cause a failure to converge. It is generally considered good practice to divide a dataset into training and validation sets. Due to the expressivity of high-dimensional networks, over-fitting can occur and must be mitigated. Over-fitting can be monitored via the loss function applied to the validation set: a validation loss that rises while the training loss keeps falling signals over-fitting. Once the model is trained and over-fitting is mitigated, the model can be said to generalise, that is, to produce a reliable output for inputs that were not part of the training set. This of course assumes that the training data provided are representative of the problem that one is trying to learn. It is important to note that in general ML models are interpolative by construction, thus extrapolative intuition (not to be confused with generalisation) remains a difficult problem, as it is for human operators as well.
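
    A minimal sketch of the two practices mentioned above, namely splitting the data and monitoring the validation loss, could look as follows. The helper names `split_dataset` and `best_epoch`, the 20% validation fraction and the "patience" rule are illustrative assumptions, and the loss values are made up.

```python
import random


def split_dataset(pairs, validation_fraction=0.2, seed=0):
    """Shuffle (input, output) pairs and split them into training and validation sets."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)
    n_val = max(1, int(len(shuffled) * validation_fraction))
    return shuffled[n_val:], shuffled[:n_val]


def best_epoch(validation_losses, patience=3):
    """Index of the epoch to keep under a simple early-stopping rule.

    Monitoring stops once the validation loss has failed to improve for
    `patience` consecutive epochs; the best epoch seen so far is returned.
    """
    best, best_idx, waited = float("inf"), 0, 0
    for idx, loss in enumerate(validation_losses):
        if loss < best:
            best, best_idx, waited = loss, idx, 0
        else:
            waited += 1
            if waited >= patience:
                break
    return best_idx


pairs = [(i, i * i) for i in range(10)]
train_set, validation_set = split_dataset(pairs)

# Made-up validation losses: they fall, then rise as the model over-fits.
kept = best_epoch([1.0, 0.6, 0.4, 0.45, 0.5, 0.7, 0.3])
print(len(train_set), len(validation_set), kept)
```

    With these made-up losses the rule keeps epoch 2 and never sees the late value 0.3, which is the usual trade-off of early stopping.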

    One of the major drawbacks of ML is the amount of data needed for the training procedure, which is typically dictated by the complexity of the problem. For DL, additional training data also tends to mitigate over-fitting to a certain extent and thus can be even more important. Both the collection of the data and the training itself may take an enormous amount of time. Hence, the use of DL is usually justified when the same task has to be solved many times for slightly different sets of parameters, or when the task of building a feature extractor is intractable.

    Concerning photonics, ML is incorporated mainly as a tool for forward and inverse design procedures (see Fig. 2). Forward design is associated with the prediction of a physical response (scattering spectra, polarization, etc.) for a given structure. This task can be solved via a variety of tools, such as T-matrix calculations, full-wave simulations, and others, which basically aim to solve Maxwell’s equations. Though ML may become an alternative method for the same purpose, possibly its greatest benefit is associated with inverse design problems, that is, determination of the structure parameters necessary to provide a given response. In this case, DL can be used as a basis for so-called surrogate modelling, providing not a simulation but rather its data-driven approximation.

    Figure 2. Inverse and forward designs based on DL techniques. An artificial neural network is fed with the parameters of a structure, and the corresponding physical response is obtained with conventional numerical techniques or from an experiment. Structure parameters may come in the form of numerical arrays containing geometric or material parameters, or in the form of images or sets of pixels. During the training procedure the neural network determines the appropriate mapping to account for the relation between the parameters and the responses. Then, the network can predict the response of a structure which was not present in the training set of data; this is a forward design, which is a prediction of a physical response for a given structure. By swapping the input and output data and applying a similar training procedure, the network may provide the reverse function, predicting the parameters of a structure which achieve a given physical response; this is called an inverse design.

    To illustrate the above ideas, we consider an example of a metaphotonics system which can be designed with the help of an ANN. Let the aim be to predict a scattering spectrum of a uniform spherical particle with a given radius. In this case, the input is the radius, and the output is the value of a scattering cross-section (SCS) at some frequency. The training set consists of pairs (radius, SCS) and the network finds the mapping between these two values, meaning that it finds a sequence of mathematical transformations which converts a given radius to a known SCS value. In this case, the transformations are not constrained to have anything in common with any physical equation, theory, or model.
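
    To make the (radius, SCS) training set concrete, the sketch below generates such pairs and fits a forward surrogate to them. Two substitutions are made for brevity and should be read as assumptions: the Rayleigh small-particle formula stands in for a full Mie solver (so it is meaningful only for radii much smaller than the wavelength), and a least-squares line in log-log space stands in for the ANN. The wavelength, refractive index and function names are hypothetical.

```python
import math


def rayleigh_scs(radius_nm, wavelength_nm=600.0, refractive_index=1.5):
    """Scattering cross-section (nm^2) of a small sphere in the Rayleigh limit:
    sigma = (8*pi/3) * k^4 * a^6 * ((n^2 - 1) / (n^2 + 2))^2."""
    k = 2.0 * math.pi / wavelength_nm
    n2 = refractive_index ** 2
    return (8.0 * math.pi / 3.0) * k ** 4 * radius_nm ** 6 * ((n2 - 1.0) / (n2 + 2.0)) ** 2


def fit_log_log(pairs):
    """Least-squares fit of log(SCS) = slope * log(radius) + intercept."""
    xs = [math.log(r) for r, _ in pairs]
    ys = [math.log(s) for _, s in pairs]
    mean_x, mean_y = sum(xs) / len(xs), sum(ys) / len(ys)
    slope = (sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
             / sum((x - mean_x) ** 2 for x in xs))
    return slope, mean_y - slope * mean_x


# Training set of (radius, SCS) pairs produced by the stand-in solver.
training_set = [(float(r), rayleigh_scs(float(r))) for r in range(5, 21, 3)]
slope, intercept = fit_log_log(training_set)


def surrogate_scs(radius_nm):
    """Data-driven forward model: predicts the SCS without calling the solver."""
    return math.exp(slope * math.log(radius_nm) + intercept)


# Forward prediction for a radius that was not in the training set.
print(slope, surrogate_scs(12.0), rayleigh_scs(12.0))
```

    The fitted slope recovers the sixth-power law from the data alone, which mirrors the point made above: the learned transformation is not told anything about the underlying physics.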

    After the model has been trained, the network can predict the unknown value of SCS for a radius which was not given during the training. Importantly, the reverse process is possible: inputs can be swapped with outputs so that the network estimates the parameters required for a given SCS value. For a simple problem, the network architecture can simply be inverted and trained in the same way. Therefore, the network can be trained with the data acquired by conventional solvers and can solve both the forward and inverse tasks. If one does not wish to retrain the model, then the network can instead be used as a surrogate model. In this case the inputs required to produce a desired output are identified via a search algorithm (such as gradient descent or random forest). Many real-world tasks require specialised structures (such as convolutional or recurrent layers) to process the features in a data set efficiently, with model design often becoming an iterative process. Nevertheless, the accuracy of the predictions may approach unity when enough data are provided. Moreover, the speed of estimation can be suitable for many real-time tasks, making DL a promising tool for the design of photonic devices.
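
    The surrogate-based route to inverse design can be sketched as a search over the input of a fixed forward model. In the example below, `forward_model` is a hypothetical stand-in for a trained network (a simple power law in arbitrary units), and the search is plain gradient descent on the logarithm of the radius with a finite-difference gradient; the learning rate and step count are illustrative choices that suit this particular stand-in.

```python
import math


def forward_model(radius_nm):
    """Hypothetical trained forward surrogate: radius (nm) -> SCS (arbitrary units)."""
    return 1.0e-6 * radius_nm ** 6


def inverse_design(target_scs, initial_radius_nm=10.0, lr=0.01, steps=200, eps=1e-4):
    """Search for the radius whose predicted SCS matches target_scs.

    Gradient descent on u = log(radius), minimising
    (log f(exp(u)) - log(target))^2, with the gradient estimated by
    central finite differences so the model is treated as a black box.
    """
    def loss(u):
        return (math.log(forward_model(math.exp(u))) - math.log(target_scs)) ** 2

    u = math.log(initial_radius_nm)
    for _ in range(steps):
        grad = (loss(u + eps) - loss(u - eps)) / (2.0 * eps)
        u -= lr * grad
    return math.exp(u)


# Ask for the response of a 17 nm particle and recover its radius.
radius = inverse_design(target_scs=forward_model(17.0))
print(radius)
```

    No retraining is involved: the forward model stays fixed and only its input is optimised. For multi-parameter structures the same idea applies, but the search may end in one of several designs that produce the same response.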

    The same logic can be applied for other classes of problems, not necessarily related to scattering. However, in this Perspective, we describe not the algorithms themselves but some results which can be achieved with their help. Thus, below we refer to DL algorithms as black boxes with unspecified structures and functional mappings between the physical parameters.

    Metasystems and metasurfaces

    Electromagnetic metamaterials were suggested as artificial structures composed of subwavelength elements, and initially they were driven by curiosities such as negative refraction and the superlens. Some years later, metamaterials created a paradigm for engineering electromagnetic properties with the help of transformation optics25. More recently, research on metamaterials evolved into the study of metadevices8, and it is expected that future photonic technologies will involve a high level of integration achieved by embedding the data-processing and waveguiding functionalities at the material level26, 27. Thus, an important task is to study and optimize material building blocks such as nanoantennas and metasurfaces.

    Plasmonic and dielectric nanoantennas supporting resonances represent a novel type of building blocks of metamaterials for generating, manipulating, and modulating light7, 28. By combining both electric and magnetic modes, one can not only modify far-field radiation patterns but also localize the electromagnetic energy in open resonators by employing the physics of bound states in the continuum to achieve destructive interference of leaky modes29.

    Metasurfaces, being planar structures created by arrays of artificial subwavelength-scale building elements, provide novel capabilities to manipulate electromagnetic waves30. Well before the exploration of metasurfaces, tailoring of light scattering with planar optical structures had been pursued mainly with diffractive optical elements31. However, the concept of metasurfaces provides much broader and deeper insights and useful tools for complete control of light. Metasurfaces are characterized by reduced dimensionality, and usually they consist of arrays of optical resonators with spatially varying geometric parameters and subwavelength separation. In contrast to conventional optical components that achieve wavefront engineering by phase accumulation through light propagation in a medium, metasurfaces provide new degrees of freedom to control the phase, amplitude, and polarization of light waves with subwavelength resolution, as well as to accomplish wavefront shaping within a distance much less than the wavelength of light. The outstanding optical properties of dielectric metasurfaces drive the development of ultra-thin optical elements and devices, whether showing novel optical phenomena or new functionalities outperforming their traditional bulky counterparts9, 32, 33. The optical response (phase, amplitude, and polarisation) of a meta-atom changes with its parameters (height, width, material, etc.). Arrangements of meta-atoms provide specific variations of parameters, depending on the required functionalities. Meta-atoms can operate as subwavelength resonators supporting multipolar Mie resonances9, or they can contribute to averaged parameters as in metamaterials32.

    The concept of optical metasurfaces has been applied to demonstrate many exotic optical phenomena and various useful planar optical devices. Many of these metasurface-based applications are very promising alternatives to conventional optical elements and devices, as they benefit from being ultra-thin, lightweight, and ultracompact, provide the possibility of overcoming several limitations suffered by their traditional counterparts, and can demonstrate versatile novel functionalities. Metasurfaces were suggested for efficient control of light-matter interaction with subwavelength resonant structures, and they have been explored widely in recent years for creating transformational flat-optics devices25.

    Plasmonic metastructures suffer from significant losses and show low efficiency, but all-dielectric structures can readily combine electric and magnetic Mie resonances and efficiently control optical properties such as amplitude, phase, polarization, chirality, and anisotropy7. The control of all such parameters requires careful optimisation depending on the problems for which the metasurfaces are used. Many such properties are driven by local electromagnetic resonances such as Mie-type scattering, bound states in the continuum, Fano resonances, and anapole resonances. The recent research frontiers in dielectric metasurfaces include wavefront shaping, metalenses, and multifunctional and computational approaches, along with the main strategies for realizing dynamic tuning of dielectric metasurfaces.

    Importantly, recent advances in nanofabrication technologies bring low-cost, large-area and mass-production approaches and capabilities for the development of various types of metasystems and metasurfaces, and the methods are gradually becoming mature. It is expected that flat-optics components based on dielectric metasurfaces will appear in our daily life very soon, bringing more complex optical components and novel functionalities34.

    Advanced nanoantennas

    One of the first illustrative examples of the application of ML techniques to metaphotonics is the design of nanoantennas as elementary units of metasystems (meta-atoms). Engineering of a scattering response is often realized with core-shell structures via tuning of the thickness and dielectric permittivity of the layers. In this case, the design of nanoantennas may be time consuming due to the overwhelming number of possible combinations of the nanoantenna parameters. Even accounting for manufacturing limitations, the number of possible materials and layer thicknesses can go far beyond thousands of values. Without advanced optimization techniques, these combinations have to be iterated over to fit a desired scattering spectrum, which implies many calculations. In this case, the application of DL techniques may become very useful and productive.

    The general idea of a design procedure based on the DL approach is shown schematically in Fig. 3(a). One of the first results in this field was presented in ref.35, where the network was trained to determine the scattering spectrum of a layered spherical nanoparticle with fixed material parameters. The inputs of the network are the thicknesses of the layers (30−70 nm) and the outputs are samples of the scattering spectrum in the range of wavelengths between 400 nm and 800 nm. After the training, the network is able to generate a scattering spectrum for a given set of parameters within a fraction of a second and with high precision (the mean relative error is below 1.5%). Importantly, the re-trained network is used to provide an inverse design such that for a given spectrum it provides a suitable set of thicknesses. Moreover, the network is used as a tool for optimizing the spectral features in narrow and broadband regions of the spectrum. Similar tasks have also been considered in refs.36, 37, where more advanced DL techniques have been applied.
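
    One common way to compute a mean relative error between a predicted and a reference spectrum is sketched below. Whether ref.35 uses exactly this definition is not stated here, so the formula, the function name and the sample values (which are made up) should be read as assumptions.

```python
def mean_relative_error(predicted, reference):
    """Mean of |predicted - reference| / |reference| over all spectral samples."""
    if len(predicted) != len(reference):
        raise ValueError("spectra must have the same number of samples")
    return sum(abs(p - r) / abs(r) for p, r in zip(predicted, reference)) / len(reference)


reference_spectrum = [2.0, 4.0, 5.0, 4.0]      # made-up solver values
predicted_spectrum = [2.02, 3.96, 5.05, 4.0]   # made-up network outputs
error = mean_relative_error(predicted_spectrum, reference_spectrum)
print(f"mean relative error: {100 * error:.2f}%")
```

    Note that a relative measure becomes unstable where the reference spectrum approaches zero, so it is best suited to spectra that stay well away from zero over the sampled range.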

    Figure 3. ML-empowered designs of multilayer nanoantennas. (a) Schematic of the design procedure based on the DL approaches. A DL algorithm (a black box) connects the physical response with the parameters of a structure. For example, hand-drawn scattering spectra can be processed with the black box in order to define the materials and thicknesses of the layers needed to achieve the target spectra (based on the results from ref.38). (b) Demonstration of the invisibility-to-superscattering transition of a multilayer sphere made of phase-change materials. Materials and thicknesses for this case are found in a way similar to the scheme in (a)48. (c) Transfer learning process. Layers of one trained ANN can be merged with another ANN to provide a training procedure for another type of structure. This might be used for the design of multilayer films or multilayer spheres using the ANN approach (the concept originates from the results of refs.52, 53). Figure reproduced with permission from: (b) ref.48, Optica Publishing Group.

    The DL procedure was further generalized to adjust not only the thicknesses but also the materials of a three-layered sphere via the application of a neural network38. In this case, the inputs are electric and magnetic dipole responses while the output is the set of materials and thicknesses. To avoid arbitrary refractive indices which cannot be achieved in a real situation, the set of possible refractive indices is limited to 7 values corresponding to typical materials used for nanofabrication (such as Si, SiO2, Ag, etc.). On the one hand, such a strategy significantly limits the possible output parameters; on the other hand, the computational problem becomes more complicated. In this case, two tasks must be solved simultaneously: a regression problem, to estimate geometric parameters taking continuous values, and a classification problem, to determine material properties. Additionally, the network must also reconstruct the spectrum from the design parameters (i.e., material and thickness). The proposed solution is based on a purpose-built hybrid architecture. Two networks are used here: one to connect optical properties with the parameters of the structure, and the other to map design parameters to optical responses. Importantly, the training of both networks happens simultaneously with a combined loss function that forces the networks to learn both forward and inverse design. The proposed DL architecture allows finding and tuning electric and magnetic dipole resonances. For example, the algorithm is used to achieve spectral co-location of the resonances. Simultaneous tuning of electric and magnetic dipole resonances also allows implementing the first Kerker condition and designing even more complex media with a negative index of refraction.
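
    A combined loss of the kind described above can be sketched as a weighted sum of three terms: a regression term for the thicknesses, a classification term for the materials, and a reconstruction term for the spectrum. The exact loss used in ref.38 is not given here, so the choice of mean squared error and cross-entropy, the weights `w_reg`, `w_cls`, `w_rec`, and all sample values are hypothetical.

```python
import math


def mean_squared_error(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) / len(a)


def combined_loss(pred_thickness, true_thickness,
                  material_probs, true_material,
                  reconstructed_spectrum, target_spectrum,
                  w_reg=1.0, w_cls=1.0, w_rec=1.0):
    """Weighted sum of regression, classification and reconstruction losses.

    material_probs holds one list of class probabilities per layer and
    true_material one class index per layer.
    """
    regression = mean_squared_error(pred_thickness, true_thickness)
    # Cross-entropy: penalises low probability assigned to the correct material.
    classification = -sum(math.log(p[c]) for p, c in zip(material_probs, true_material))
    classification /= len(true_material)
    reconstruction = mean_squared_error(reconstructed_spectrum, target_spectrum)
    return w_reg * regression + w_cls * classification + w_rec * reconstruction


# Made-up example: one layer, two candidate materials, a two-sample spectrum.
loss = combined_loss(pred_thickness=[1.0], true_thickness=[0.0],
                     material_probs=[[0.5, 0.5]], true_material=[0],
                     reconstructed_spectrum=[0.2, 0.4], target_spectrum=[0.2, 0.4])
print(loss)
```

    Minimising a single scalar of this form is what lets the inverse network (spectrum to design) and the forward network (design to spectrum) be trained at the same time.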

    Implementation of ML-based numerical design techniques is not limited to spherical geometries and neural networks. A classical ML algorithm (so-called Bayesian optimization) was applied to the forward design of a cylindrical metal-dielectric nanoantenna in order to achieve unidirectional scattering satisfying the Kerker or anti-Kerker conditions39. Multipole engineering via DL techniques was also demonstrated in ref.40, where an ANN was used to determine the far- and near-field responses of arbitrary plasmonic and dielectric structures as well as the internal field distributions. DL algorithms can be employed to study a variety of effects in metaphotonics including electric and magnetic dipole resonances, Kerker scattering, and destructive interference leading to anapole states.
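
    The Kerker and anti-Kerker conditions can be illustrated in the dipole approximation, where only the electric (a1) and magnetic (b1) dipole Mie coefficients are retained. The forward scattered amplitude is then proportional to a1 + b1 and the backward one to a1 − b1. The sketch below omits the common prefactor, and the coefficient values are made up for illustration.

```python
def dipole_forward_backward(a1, b1):
    """Relative forward and backward scattered intensities for a particle
    described only by its electric (a1) and magnetic (b1) dipole coefficients."""
    forward = abs(a1 + b1) ** 2
    backward = abs(a1 - b1) ** 2
    return forward, backward


# Kerker condition: equal dipoles cancel the backward scattering.
kerker = dipole_forward_backward(0.3 + 0.4j, 0.3 + 0.4j)

# Anti-Kerker condition: opposite dipoles cancel the forward scattering
# within this approximation.
anti_kerker = dipole_forward_backward(0.3 + 0.4j, -(0.3 + 0.4j))

print(kerker, anti_kerker)
```

    A design objective for unidirectional scattering can therefore be phrased as minimising one of these two intensities over the geometric parameters of the nanoantenna.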

    Advantages of DL-based designs over FDTD simulations of core-shell nanoantennas have been discussed in ref.41. In general, DL-based forward designs are 100–1000 times faster than FDTD simulations. At the same time, the accuracy of prediction reaches values of about 95%. Together with the development of ML-supplemented methods for solving both forward and inverse scattering problems42-45, such studies suggest that AI technologies may complement conventional numerical methods, resulting in the development of novel computational tools. For instance, recent results46 suggest real-time web-based tools for designing the far fields of arbitrarily shaped structures.

    Other examples of scattering problems that can be tackled with DL algorithms include the design of invisible objects. In particular, a 5-layer particle (Ag and SiO2 layers) was inversely designed to achieve extremely low scattering efficiency within an optical frequency spectrum47. Namely, the scattering efficiency was below 10^-2 in the range from 400 nm to 700 nm, dropping below 10^-4 between 510 nm and 550 nm. Another demonstration utilizes phase-change materials for realizing invisibility-to-superscattering switching48. The developed DL approach allowed prediction of the materials and structural parameters required to simultaneously satisfy the conditions for super-scattering and near-zero scattering in the two phase states, see Fig. 3(b). The shape of a particle itself may also be a target of a DL-based design procedure, as for example in ref.49, where an all-dielectric shell was designed for optical cloaking.

    Design of nanoantennas can also be used as a basis for solving other problems. As was mentioned above, the application of DL methods requires a large amount of data to achieve reasonable accuracy. At the same time, ANNs are usually implemented for specific systems, and a change of the system requires a new training process even if the general task remains the same. For example, designing the scattering response of nanoparticles is completely different from the same procedure for metasurfaces, though the outputs (such as scattering spectra) are quite similar. This is reasonable, since the two systems are characterized by different sets of parameters, and hence there are different mappings between the input and output data. Nevertheless, it is possible to transfer knowledge gained by an ANN between completely different scenarios, an approach referred to as transfer learning50, 51. This was demonstrated for the transmission of 8-layer films utilizing the results obtained for scattering from 8-layer nanoparticles52. The general idea is presented in Fig. 3(c): the procedure merges two ANNs, meaning that layers of one network (for the scattering problem) are inserted into the other one (for the transmission problem). As a result, the learning error of the network for the multilayer films is reduced by almost 20% (to 5.7%). The proposed technique may be useful when data are insufficient, since it allows the use of ANNs already trained for other tasks. A similar idea was developed53 to provide inverse design that simultaneously finds the materials and continuous geometric parameters of core-shell particles and multilayer films.
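
    The layer-transfer step described above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the procedure of ref.52: the layer sizes, the 50-point spectra, and the helper names `init_mlp`, `forward`, and `transfer` are all hypothetical, and the source network is only randomly initialised here where a real workflow would use trained weights.

```python
import numpy as np

rng = np.random.default_rng(0)

def init_mlp(sizes, rng):
    """Return a list of (W, b) pairs for a fully connected network."""
    return [(rng.normal(0, 0.1, (m, n)), np.zeros(n))
            for m, n in zip(sizes[:-1], sizes[1:])]

def forward(params, x):
    """Forward pass with tanh hidden activations and a linear output layer."""
    for W, b in params[:-1]:
        x = np.tanh(x @ W + b)
    W, b = params[-1]
    return x @ W + b

# Hypothetical source network: 8 shell thicknesses -> 50-point scattering spectrum.
# (Assumed to be already trained; random weights stand in for that here.)
source = init_mlp([8, 64, 64, 50], rng)
# Hypothetical target network: 8 film thicknesses -> 50-point transmission spectrum.
target = init_mlp([8, 64, 64, 50], rng)

def transfer(source, target, n_layers):
    """Copy the first n_layers of the source network into the target network.

    Returns the merged parameter list and a per-layer 'trainable' mask in
    which the copied layers are marked as frozen.
    """
    merged = [(W.copy(), b.copy()) for W, b in source[:n_layers]]
    merged += list(target[n_layers:])
    trainable = [False] * n_layers + [True] * (len(target) - n_layers)
    return merged, trainable

merged, trainable = transfer(source, target, n_layers=2)
spectra = forward(merged, np.zeros((4, 8)))  # batch of 4 hypothetical designs
```

    Only the layers flagged as trainable would then be updated on the smaller target data set; how many layers to copy and whether to freeze them are design choices that depend on how similar the two problems are.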

    Both the shape and the size of single-material nanoparticles may also be a target for the inverse design procedure. In particular, DL techniques were implemented to achieve a desired spectral emission54 as well as far- and near-field properties55, 56 of plasmonic nanoantennas. At the same time, the design of nanoparticles is required in many fields apart from photonics, such as chemistry or biology. For instance, DL-based inverse design can be implemented to realise specific interactions57. However, these topics are beyond the scope of the current paper.

    Moving from isolated nanoantennas to their arrays, we wish to mention the results of the DL-based optimization of thin halide-perovskite solar cells combined with a layer of core-shell metallic nanoparticles58. This study employs the fact that core-shell nanoparticles may have two plasmon resonances located in different parts of the spectrum, and that the resonances can be tuned by adjusting structural parameters. As a result, the efficiency of such solar cells can be improved by applying the ANN approach, e.g., by studying configurations of core-shell nanoparticles to maximize optical absorption. In particular, the improvement in ref.58 reached about 30%. Similar ideas are applied to the design of metasurfaces with the optimization of structural parameters of single meta-atoms, as discussed in the next section.

    Transformative metasurfaces

    The fundamental concepts and underlying optical physics of metasurfaces have been extensively explored, and the field itself now has a comprehensive framework. Currently, more advanced engineering with optimization tools is required for moving this field to specific practical applications, including bending of light, metalenses, and metaholograms. A rich variety of ML methods and DL approaches has already been explored for metasurfaces, aiming to optimize their required functionalities. Below, we present only a few recent examples of such efforts, concentrating on applications and specific optical devices and functionalities.

    Inverse design of metasurfaces results in selecting the shapes and materials of isolated meta-atoms as specific elements of a metasurface supercell. In this case, the representation of parameters plays an important role, often significantly affecting the result of the subsequent training procedure. As in the case of nanoantennas, the design procedure here may result in a set of geometric parameters of meta-atoms with fixed shapes. For example, such an approach was used for a metasurface with supercells consisting of gold cross-shaped resonators (up to 25 resonators per supercell)59. Inputs to the DL algorithm were the corresponding sets of lengths, and the outputs were the absorption spectra. As a result, the algorithm allowed the design of narrow-band, broadband, and multiresonant absorbers [see Fig. 4(a)]. However, even after adjustment, these constrained parameters may fail to provide a desired spectral response. As an alternative, a free-form design approach can be used. In this case, the DL algorithm is based not on sets of parameters but on pixelated images of the unit cells, as for instance in ref.60 [see example in Fig. 4(a)]. Such techniques increase the expressivity of the design approach, significantly extending the range of possible cell geometries. Conversely, the DL approach can also be used to restore a unit cell geometry from a given spectrum61. The design procedure may also incorporate transfer learning between meta-atoms of various shapes62.
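
    The difference between the two representations can be made concrete with a short sketch. The function `rasterize_cross`, the 32×32 grid, and the normalised dimensions are illustrative assumptions and are not taken from refs.59, 60; the point is only that the same meta-atom can be passed to a network either as two numbers or as a binary image.

```python
import numpy as np

def rasterize_cross(arm_length, arm_width, n_pixels=32):
    """Binary image of a centred cross on a square unit cell of side 1.

    arm_length and arm_width are given as fractions of the unit-cell side.
    """
    # Pixel-centre coordinates in the interval (-0.5, 0.5).
    c = (np.arange(n_pixels) + 0.5) / n_pixels - 0.5
    x, y = np.meshgrid(c, c, indexing="xy")
    horizontal = (np.abs(x) <= arm_length / 2) & (np.abs(y) <= arm_width / 2)
    vertical = (np.abs(y) <= arm_length / 2) & (np.abs(x) <= arm_width / 2)
    return (horizontal | vertical).astype(np.uint8)

# Parametric representation: two numbers describe the whole meta-atom.
params = np.array([0.8, 0.2])
# Free-form (pixelated) representation: 32 x 32 = 1024 binary values.
image = rasterize_cross(*params)
```

    The parametric input is compact but confined to crosses, whereas any of the 1024 pixels of the image can be flipped independently, which is what allows a free-form search to leave the original shape family.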

    Figure 4. ML-empowered designs of transformative metasurfaces. (a) General scheme of the inverse design implemented for metasurfaces. Metasurface elements can be represented in several ways. For example, DL may process parameters of unit cells with a fixed geometry. Alternatively, free-form design can be implemented, in which unit cells are represented as sets of pixelated images. DL-supplemented design of metasurfaces enables a great variety of devices, such as multi-resonant and broadband absorbers, metalenses with dual independent focal points, switchable reflectors, and structures with desired circular dichroism. (b) Example of a LIDAR device consisting of two ITO electrodes containing the metastructures filled with a liquid crystal. Voltages applied to the electrodes are controlled via an FPGA processor. The metasurface deflects the transmitted beam at the desired angle via tuning of the refractive index. (c) Configuration of a binocular near-eye display utilizing beam splitters as in-coupling gratings. The optimised grating design is shown in the inset, together with the RGB model of the beam splitter and the electric intensity above the grating. Figure reproduced with permission from: (a) refs.59, 60, under the Creative Commons Attribution 4.0 International License, refs.84, 108, Optica Publishing Group, ref.95, IEEE; (b) ref.109, under a Creative Commons Attribution 4.0 International License; (c) ref.110, Optica Publishing Group.

    One of the critical tasks in the development of metadevices is to engineer specific properties of the light-matter interaction. As a consequence, many studies are devoted to the design and optimization of specific absorptive, scattering, and diffracting properties63-66. In contrast to works that focus on design methods rather than specific applications, we would like to highlight several examples of devices benefiting from the AI approach.

    Starting from absorbers, there are a variety of realised design procedures resulting in the development of perfect67, multi-band68, and ultrathin69 absorbers. Such studies cover both plasmonic70, 71 and dielectric72 metastructures. DL also opens the route for the design of biology-inspired devices such as moth-eye structures73 with the designed average absorption reaching 90% in the range from 400 nm to 1600 nm. Scattering properties are a subject of many design procedures61, 74-78, including those devoted to the development of anisotropic79 and bianisotropic80 metasurfaces. DL was also exploited for achieving electromagnetically-induced transparency81-83. Chiral metasurfaces are also among the typical applications of the DL design procedures84-89 [see Fig. 4(a)]. Phase-amplitude engineering supported by DL algorithms helps to overcome chromatism90, achieve tunable beam-steering91, and multiplex an aperture92 of metasurfaces. AI-assisted design enables development and improvement of broadband achromatic93, 94, bifocal95 [see Fig. 4(a)], thermally tunable96, and RGB (red, green, blue)97 metalenses.

    Light-matter interaction plays an important role in our daily life, providing information about an object such as its colour. In general, coloration of objects may be achieved by using special pigments characterized by different absorption properties. Another approach is to employ structural colours generated by engineering the diffraction and reflection properties of an object98. Apart from surface decoration, structural colours may be used for digital displays, sensing, storage of information, and many other applications. Metasurfaces are able to absorb or scatter light selectively at particular wavelengths, making them promising candidates for structural colour engineering99. Methods to design metasurfaces for this specific purpose also include the application of DL algorithms to plasmonic100, 101 and, more extensively, to dielectric metastructures102-107.

    Metasurfaces may also enhance the performance of solar cells. To further improve the capabilities of such metasystems, a novel design of a silicon solar cell with an active nano-pixel metasurface has been suggested111. The shape was optimized by using DL to solve the forward design problem, over which an optimisation search was performed using a genetic algorithm to find the maximum short-circuit current. As a result, the optimized solar cell exhibited a short-circuit current 2.5 times larger than that of a solar cell without a metasurface but with the same amount of crystalline silicon. It was demonstrated that the short-circuit current remained above 12 mA/cm² for all other polarizations and angles of incidence. Additionally, if the absorptive metasurfaces can be made of various materials, the design procedure can exploit transfer learning methods in order to account for the effect of different materials on the spectra112.
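
    The combination of a fast forward model with a genetic search can be sketched as follows. This is a toy illustration and not the method of ref.111: the function `surrogate` is a hypothetical stand-in for a trained forward network (it simply rewards agreement with a hidden pixel pattern), and the population size, mutation rate, and 16-pixel encoding are arbitrary assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)
N_PIXELS = 16
_hidden = rng.integers(0, 2, N_PIXELS)  # toy optimum, unknown to the search

def surrogate(pattern):
    """Stand-in for a trained forward network: toy figure of merit in [0, 1]."""
    return np.mean(pattern == _hidden)

def genetic_search(fitness, n_bits, pop_size=40, n_gen=60, p_mut=0.05, rng=rng):
    """Maximise fitness over binary pixel patterns with a simple elitist GA."""
    pop = rng.integers(0, 2, (pop_size, n_bits))
    for _ in range(n_gen):
        scores = np.array([fitness(p) for p in pop])
        order = np.argsort(scores)[::-1]
        elite = pop[order[: pop_size // 2]]           # keep the better half
        n_children = pop_size - len(elite)
        # Uniform crossover between randomly paired elite parents.
        pa = elite[rng.integers(0, len(elite), n_children)]
        pb = elite[rng.integers(0, len(elite), n_children)]
        mask = rng.integers(0, 2, pa.shape).astype(bool)
        children = np.where(mask, pa, pb)
        # Bit-flip mutation.
        flip = rng.random(children.shape) < p_mut
        children = np.where(flip, 1 - children, children)
        pop = np.vstack([elite, children])
    scores = np.array([fitness(p) for p in pop])
    return pop[np.argmax(scores)], scores.max()

best_pattern, best_score = genetic_search(surrogate, N_PIXELS)
```

    Because every fitness evaluation is a cheap function call rather than a full-wave simulation, the search can afford thousands of evaluations; in practice the best candidates would still be verified with a rigorous solver.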

    One more application is laser imaging detection and ranging (LIDAR) systems, extensively used in many areas such as biology, geology, atmospheric studies, astronomy, smartphones, and augmented reality113. With the rise of augmented reality technologies and autonomous systems, such as self-driving vehicles, LIDAR systems entered a new stage of development as a tool for object recognition and 3D modeling of an environment. In ref.109, the authors suggested a device consisting of a metasurface covered by a layer of liquid crystals with high birefringence. The structure is sandwiched between two coverslips with a transparent, patterned, and conductive layer (ITO) serving as electrodes and connected to a field-programmable gate array (FPGA) control processor. The deflection is performed by the metasurface as well as by a change of the surrounding medium. An applied external voltage changes the orientation of the liquid-crystal molecules, resulting in a change of the refractive index that enables beam steering in the desired angle range. The metastructures consist of a patterned MoS₂ layer with a thickness of 30 nm. The ANN approach is employed to generate ensembles of topology-optimized metasurfaces for achieving the highest deflection efficiency for a desired deflection at the telecommunication wavelengths. The demonstrated deflection angles were 45°, 55°, and 65° with an absolute efficiency of 0.7 in all cases [see Fig. 4(b)].

    AI methods can also empower technologies such as near-eye displays, required for virtual and augmented reality systems as well as for vision correction114, 115. One of the promising platforms for realizing such displays is based on metasurfaces116-120, employing their ability to control the wavefront, phase, amplitude, and polarization of light. The design of metasurfaces for such applications can be realized efficiently with DL methods. In ref.110, an example of a metasurface acting as a beam deflector is presented. It consists of a stack of L-layer gratings made of TiO₂ with glass nanoridges, where each grating layer is 300 nm thick. The display system transfers a designated scene toward the human eye via a grating-covered waveguide. The system consists of image projectors, in-coupling and out-coupling gratings, and a waveguide. Light propagates along the waveguide towards the eye, and the in-coupling grating is illuminated by single or multiple projectors [see Fig. 4(c)]. The aim of the DL-based metasurface design is to maximize the deflection efficiency for given operating wavelengths, here chosen to be 720 nm (red), 540 nm (green), and 432 nm (blue). For the cases of integrated and separated RGB projectors, the achieved deflection efficiency was about 91% and 95%, respectively, and the overall efficiency was about 80%. The authors also developed a pupil-expanded out-coupling grating consisting of three sections. For all three designated diffraction efficiencies (33%, 50%, and 100%), the design procedure led to a deflection efficiency above 85%.

    Other applications of metasurfaces include the development of so-called light sails, structures for a spacecraft propulsion system relying on the force exerted by the radiation pressure coming from the Sun or a strong laser beam121. Recent studies demonstrated that all-dielectric metasurfaces can maximize the acceleration while meeting thermal management requirements and providing passive stabilization mechanisms122-124. However, such designs require precise engineering of the structures. Along with some conventional techniques125, 126, DL algorithms were considered as candidates to solve this task127. Aiming to develop a meta-sail for an ultralight spacecraft that can reach Proxima Centauri b in approximately 20 years, the authors of ref.127 designed free-form silicon metastructures to satisfy optical and weight constraints. They demonstrated that the optimization algorithms converged to one-dimensional gratings, which allow an acceleration distance D = 1.9×10⁹ m (the distance required to reach the target velocity of 0.2 of the speed of light) and a mean reflectivity around 0.81.

    Some other practically important applications include thermal emitters128, optical memory elements129, smart windows130, programmable switches131, switchable reflectors108, and transmission cloaking devices132. Developing advanced DL-based design approaches also paves the way to tackling the fabrication robustness of metasurfaces, as demonstrated in ref.133, where designs of broadband flexible light polarizers were experimentally validated. Metasurfaces also find applications in biosensing, which may likewise benefit from AI methods, as we discuss below.

    Chemical and biological sensing

    Metasurfaces can be employed as a reliable and robust platform for various chemical and biological sensors134-136. Being one of the most promising and rapidly developing fields of modern photonics, biosensing has also utilised AI-aided techniques137. In this case, ML techniques may play two major roles. First, they can be employed to optimize metasurface-based sensors138-141. Second, metasurfaces empowered by ML techniques provide a powerful tool for classification tasks, which can be exploited to analyse the output data of a sensor142-145. These concepts are schematically shown in Fig. 5(a).

    Figure 5. DL-empowered metasensors. (a) Schematic of sensing applications of metasurfaces empowered by DL methods. DL serves as a tool for designing metasurfaces used as sensing platforms or for analysing measured response spectra and classifying them for the presence of specific molecules, etc. (b) Example of a colorimetric sensor inversely designed with the DL approach. The design procedure aims at a target dual-resonant spectrum for a double-bar unit cell. As a result, the sensor is capable of distinguishing refractive indices that differ by less than 0.01. (c) DL-supported classification of SARS-CoV-2. Here, the unit cell and Raman shift spectra are shown, where P.A. stands for the primary aptamer and S.A. for the secondary aptamer. The confusion matrix is composed for DL-based classification of clinical samples. (d) Example of a plasmonic sensor for monitoring biomolecule dynamics. Real-time analysis of absorbance spectra via ML methods makes it possible to distinguish between dynamically changing biological samples. In particular, the regression signal allows tracking the dynamics of liposome nanoparticles loaded with sucrose and nucleotides. Introduction of melittin results in perforation of the lipid membranes and cargo release, followed by formation of a lipid bilayer. This process can be tracked with the help of ML. Figure reproduced with permission from: (b) ref.138, under a Creative Commons Attribution 4.0 International License; (c) ref.142, the authors; (d) ref.143, under a Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 International License.

    To demonstrate the benefits of ML-assisted metasensor design, we consider the example of a colorimetric sensor based on all-dielectric metasurfaces. Colorimetry techniques rely on the detection of colour variations associated with an environmental change that leads to a change of refractive index. Hence, optimization of colorimetric sensors implies maximization of the colour difference. In ref.138, the authors propose a metasurface consisting of double-bar elements. They demonstrated that a dual-resonant type of spectrum can achieve a better colour difference than a single resonance. Therefore, the design process aims to find target dual-resonance spectra with the highest sensitivity of colour change with respect to minute spectral shifts. These spectra were used as inputs to a DL algorithm to find the geometric parameters of the unit cell. The optimized structure was used to sense the concentration of a glucose solution. It was demonstrated that the sensor was able to resolve a change in refractive index of 0.009 with a mean square error within 0.005 [see Fig. 5(b)].

    Another demonstration of meta-sensor design includes the optimization of double-negative plasmonic metasurfaces for enhanced detection of DNA oligomers141. The overall device consisted of a layered structure comprising a glass substrate, a gold layer, and a negative-index effective medium. Immobilized and hybridized DNA oligonucleotides were placed on top of the metasurface and covered by a buffer ambience layer (distilled water). A standard multilayer perceptron was used to predict the angular reflectance for a given thickness of the gold layer, effective permittivity, and permeability. The obtained spectral characteristics were then clustered to ensure that the resonance characteristics are appropriate for an efficient metasensor. The consecutive application of DL and classical ML algorithms resulted in a sensitivity improvement of up to 13 times in comparison with conventional plasmonic sensors.

    To illustrate how ML can be employed for classification of biomolecules, we mention the recently suggested platform for surface-enhanced infrared absorption spectroscopy empowered by DL algorithms143. The authors use a nanoplasmonic metasurface supporting three resonances in a broad mid-IR spectrum (1000–3000 cm⁻¹) covering the absorption bands of biomolecules. The sample consists of liposome nanoparticles loaded with sucrose and nucleotides. In addition, melittin is introduced to perforate the lipid membranes of the liposomes, leading to dual cargo release and formation of a supported lipid bilayer on the surface of the sensor. Real-time measurements of reflectance spectra are used to calculate the absorbance, which is analysed via a DL technique. As a result, the algorithm is able to distinguish between the analytes at each point of time [see Fig. 5(d)]. Therefore, the developed sensing platform enables simultaneous label-free monitoring of major biomolecule classes in water, opening the route to studying real-time biomolecular interactions such as vesicle capture, perforation with dual cargo release, and partial transition to planar lipid bilayers.

    Advances in metasensors have also been utilised to address global challenges such as the COVID-19 pandemic. AI technologies were already exploited for diagnostics of coronavirus146, 147, and recently they were proposed to be combined with meta-sensors. The SARS-CoV-2 saliva sensor [see Fig. 5(c)] proposed in ref.142 demonstrated sensitivity and specificity both reaching 95.2% during clinical trials. The sensing platform comprises a plasmonic metasurface functionalized with thiol-modified primary DNA aptamers. The meta-atoms were optimized for maximization of the Raman cross-section using a genetic algorithm. Testing samples consisted of unprocessed saliva mixed with Cy5.5-modified fluorescent secondary DNA aptamers. After 15 minutes allowed for binding, the solution was placed on the sensor to drain. The surface was then rinsed with double-distilled water and phosphate-buffered saline. Identification of the virus was performed using an ML algorithm to process Raman shift spectra. Out of 69 clinical samples, only 1 false positive and 1 false negative result were obtained, indicating a high potential worth further development. Moreover, the sensor was capable of variant detection among inactivated Alpha (B.1.1.7), Beta (B.1.351), and wild-type variants, with cumulative variance 99.7%.
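
    Sensitivity and specificity follow directly from the entries of a confusion matrix. The sketch below uses the standard definitions; the counts passed to the function are hypothetical and are not the clinical data of ref.142. They are chosen only to show how a single error in a class of 21 samples gives a value of about 95%.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)   # fraction of true positives that are detected
    specificity = tn / (tn + fp)   # fraction of true negatives that are rejected
    return sensitivity, specificity

# Hypothetical counts (NOT taken from ref.142): one false negative among
# 21 positive samples and one false positive among 21 negative samples.
sens, spec = sensitivity_specificity(tp=20, fn=1, tn=20, fp=1)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints: sensitivity = 95.2%, specificity = 95.2%
```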

    We note that classification of a metasensor output is not necessarily done via DL algorithms. Classical ML methods, such as k-nearest neighbours or principal component analysis with latent Dirichlet allocation, may provide results that are no worse. For instance, gaseous and liquid chemicals may be recognised using plasmonic mid-infrared spectral filters with a photodetector array145 or plasmonic metasurfaces integrated with a microfluidic channel144.
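
    As an illustration of how little machinery a classical classifier needs, the following sketch implements k-nearest-neighbours voting on synthetic detector readings. The four-channel readings, the two analyte classes, and the function `knn_predict` are assumptions made for the example and do not reproduce the data or pipelines of refs.144, 145.

```python
import numpy as np

def knn_predict(train_x, train_y, query_x, k=3):
    """Classify each query by majority vote among its k nearest training points."""
    # Pairwise Euclidean distances, shape (n_query, n_train).
    d = np.linalg.norm(query_x[:, None, :] - train_x[None, :, :], axis=2)
    nearest = np.argsort(d, axis=1)[:, :k]
    votes = train_y[nearest]
    return np.array([np.bincount(v).argmax() for v in votes])

rng = np.random.default_rng(2)
# Synthetic 4-channel detector readings for two hypothetical analytes.
class0 = rng.normal(0.2, 0.05, (30, 4))
class1 = rng.normal(0.8, 0.05, (30, 4))
train_x = np.vstack([class0, class1])
train_y = np.array([0] * 30 + [1] * 30)

queries = np.array([[0.2, 0.2, 0.2, 0.2], [0.8, 0.8, 0.8, 0.8]])
labels = knn_predict(train_x, train_y, queries)
```

    There is no training phase at all: the measured spectra themselves are the model, which is convenient when only a small labelled data set is available.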

    We also note that ML may be used not only to design and supplement metasensors, but also to sense the presence of metamaterials per se. As an example, a DL algorithm could process electromagnetic response signals obtained with THz-band time-domain spectroscopy in order to identify a metamaterial in lactose mixtures148. The accuracy reached 100%, while a classical ML algorithm (support vector machine) achieved only 87.9%, and the ability of humans to recognize the spectrum of a metamaterial was below 57%.

    The examples described above, together with other achievements of AI-supplemented sensing, suggest the beginning of a revolutionary era of intelligent biosensors, extensively discussed in literature (see refs.137, 149-152 for instance). Seemingly, metastructures and metasurfaces will play an important role in the development of those ideas. Apart from photonics, other sensing technologies have also experienced a rapid onset of AI resulting in the development of next-level intelligent systems based on principles of self-adjustment, which we discuss in the next section.

    Self-adapting metasystems

    Progress in computer science enabled the development of such intriguing concepts as coding and digital (or programmable) metasurfaces153-160, which fill a gap between IT and electrodynamics. Though ML concepts can also be used just for a straightforward design of such structures161-167, they open a range of additional possibilities. AI technologies that can adjust the response of a system to specific inputs are extremely useful for the development of self-adapting and re-programmable systems. In this case, ML can be used as a feedback mechanism, redefining the programming sequences of a digital metasurface to adjust to environmental changes. A schematic illustration of this concept is shown in Fig. 6(a). In recent years, this concept has developed into a new branch of intelligent metasurfaces, which rapidly gained a lot of attention for wireless communications at radio frequencies, especially in 6G and internet-of-things technologies168-175. Below we present only a few examples illustrating those ideas.

    Figure 6. Self-adapting metasurfaces. (a) General schematic of DL-assisted self-adaptive metadevices. Programmable metasurfaces are controlled via a DL algorithm processing the incoming signal. A change of environment leads to a change of the coding sequences required to adjust the response of a metasurface. (b) Example of a self-adapting cloaking device. An object is covered by a metacloak made of varactor diodes. Data from sensors perceiving changes in the background are processed by a DL algorithm determining the voltages that must be applied to the diodes to adjust the cloak. (c) DL-assisted microwave imager based on a metasurface. Microwave data arriving at a metasurface are processed via a DL algorithm to reconstruct the image of a human. Another DL algorithm can be used for recognition of specific regions within the reconstructed image, such as hands (gestures). Additionally, the collected data may be used for identification of human breath. (d) Metasurface-based optical ANN with re-programmable functions. Digital metasurfaces are used as a physical layer of the network, with programming sequences controlled by an FPGA processor. The network can dynamically change its functions via a re-training procedure. Figure reproduced with permission from: (b) ref.177, Springer Nature; (c) ref.180, (d) ref.182, under a Creative Commons Attribution 4.0 International License.

    Metamaterial cloaks, or meta-cloaks176, previously mentioned in the framework of inverse design, may be supplemented by ML algorithms enabling self-adapting functions. The cloak presented in ref.177 is made of a reconfigurable metasurface loaded with varactor diodes. Incoming waves sensed by detectors in real time are processed by a DL algorithm calculating the voltages that must be applied across the diodes to adjust the scattering spectrum. As a result, the engineered scattered field is like that of the bare surroundings without an intruding object [see Fig. 6(b)]. The use of DL allows fast determination of the necessary parameters of the metasurface to provide cloaking for a wide range of changing parameters, such as the angle and frequency of the incident wave and background variations. As discussed in the literature178, AI technologies may become a key factor in the future development of meta-cloaks and even more fascinating devices.

    A self-adapting approach was implemented to realize a real-time metasurface imager179. The proposed solution is based on a 2-bit coding metasurface, a structure made of independently controlled meta-atoms supporting four different digital responses, 00, 01, 10, and 11, which correspond to the physical phases 0, π/2, π, and 3π/2, respectively. Thus, for a given incident field, different coding sequences will produce different scattering patterns. The principle behind this imaging technique is the recognition of an object from the measured scattered fields. In the proposed design, the coding metasurface is controlled via a field-programmable gate array. First, the ML algorithms are trained for the desired radiation patterns, and then the coding patterns of the metasurface are determined from the obtained radiation patterns.
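
    The link between a coding sequence and its scattering pattern can be illustrated with a simple array-factor calculation. This is a textbook-style sketch and not the model of ref.179: it assumes a one-dimensional row of identical isotropic elements under normal incidence, a half-wavelength spacing, and unit reflection amplitude, and the function name `array_factor` is introduced only for this example.

```python
import numpy as np

# 2-bit code -> reflection phase, as listed in the text.
CODE_TO_PHASE = {"00": 0.0, "01": np.pi / 2, "10": np.pi, "11": 3 * np.pi / 2}

def array_factor(codes, spacing_in_wavelengths=0.5, n_angles=721):
    """Normalised far-field magnitude of a 1D row of coded meta-atoms.

    Assumes normal incidence, identical isotropic elements and unit amplitude.
    Returns the observation angles in degrees and the normalised magnitude.
    """
    phases = np.array([CODE_TO_PHASE[c] for c in codes])
    theta = np.linspace(-np.pi / 2, np.pi / 2, n_angles)
    n = np.arange(len(codes))
    # Propagation phase of element n towards the observation angle theta.
    k_d = 2 * np.pi * spacing_in_wavelengths
    steering = np.exp(1j * k_d * np.outer(np.sin(theta), n))
    af = np.abs(steering @ np.exp(1j * phases))
    return np.degrees(theta), af / af.max()

uniform = ["00"] * 16                      # all elements in the same state
gradient = ["00", "01", "10", "11"] * 4    # linear phase ramp, period of 4 elements

angles_u, af_u = array_factor(uniform)
angles_g, af_g = array_factor(gradient)
peak_uniform = angles_u[np.argmax(af_u)]
peak_gradient = angles_g[np.argmax(af_g)]
```

    With these assumptions the uniform sequence scatters towards the surface normal, while the phase ramp of π/2 per element tilts the main lobe to about 30° from the normal (the sign depends on the chosen phase convention), which is the kind of code-dependent pattern diversity that the imager exploits.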

    Another imager was developed in ref.180. In this case, microwave data collected by a metasurface are used as an input for an ANN reconstructing the image of a human body. A second ANN is used to recognize a particular region within the recorded image, such as a hand or a head. The coding sequences needed to control the programmable metasurface are defined via the Gerchberg-Saxton algorithm. Such a sequence is implemented to focus the radiated waves onto the desired regions of a human body and to read the necessary data from the reflected echoes, as was demonstrated experimentally [see Fig. 6(c)]. Similar ideas were recently implemented for the development of a nano-optics metadevice performing high-quality optical imaging181. In this case, the differentiable model of image formation allows joint optimization of all parameters of the imaging pipeline. Future development of similar systems may lead to next-level human-device interfaces used for smart environments, health monitoring, and sign and speech recognition. Such systems can be developed further by adding a feedback mechanism with another ANN.
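
    The Gerchberg-Saxton algorithm mentioned above alternates between the aperture plane and the far field, keeping the phase and replacing the amplitude with the known constraint in each plane. The sketch below is a generic version under simplifying assumptions and is not the implementation of ref.180: the far field is modelled by a 2D FFT, the aperture is uniformly illuminated, and the quantization into four phase states in `quantize_2bit` is an assumed discretization.

```python
import numpy as np

def gerchberg_saxton(target_amplitude, n_iter=50, seed=0):
    """Find an aperture phase whose Fourier transform approximates a target amplitude.

    Assumes a uniformly illuminated aperture and a far field given by a 2D FFT.
    """
    rng = np.random.default_rng(seed)
    phase = rng.uniform(0, 2 * np.pi, target_amplitude.shape)
    for _ in range(n_iter):
        far = np.fft.fft2(np.exp(1j * phase))                 # unit amplitude in the aperture
        far = target_amplitude * np.exp(1j * np.angle(far))   # impose the target amplitude
        phase = np.angle(np.fft.ifft2(far))                   # keep only the aperture phase
    return phase

def quantize_2bit(phase):
    """Round a continuous phase to the states 0, pi/2, pi, 3pi/2 (codes 0..3)."""
    return np.round(np.mod(phase, 2 * np.pi) / (np.pi / 2)).astype(int) % 4

target = np.zeros((16, 16))
target[3, 5] = 1.0                       # single focal spot on the far-field grid
phase = gerchberg_saxton(target)
codes = quantize_2bit(phase)             # hypothetical coding sequence for the elements
pattern = np.abs(np.fft.fft2(np.exp(1j * phase)))
peak = np.unravel_index(np.argmax(pattern), pattern.shape)
```

    For a single focal spot the continuous-phase solution is a plain linear phase ramp, so the loop converges immediately; the iteration becomes useful for multi-spot or shaped targets, and quantizing the result to a few phase states generally costs some efficiency.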

    Further development of intelligent imagers and sensors led to the integration of programmable meta-atoms as trainable physical weights into a learned integrated sensing pipeline183, 184. This concept implies joint learning of measurement strategies with a matching data processing algorithm, which results in accuracy improvements for object recognition tasks. A perspective on the development of intelligent meta-imagers can be found, for instance, in ref.185.

    Here, we also mention that the relation between AI and photonics is not one-sided. Photonic devices find broad applications in realizing optical ANNs186, 187 owing to their ability to provide parallel computing at the speed of light. Metasurfaces are among a variety of building blocks of such networks188, 189. Indeed, individual meta-atoms of digital metasurfaces may represent single artificial neurons controlled via predefined algorithms. The functionality of optical ANNs may be extended with feedback mechanisms that provide re-training of the meta-ANN. For instance, ref.182 demonstrates that coding sequences can be redefined with real-time field-programmable gate arrays. Feedback signals made it possible to realize self-learning functions using data from the interaction with the environment without prior knowledge [see Fig. 6(d)]. As a result, the authors developed a wave-based intelligent machine able to perform DL tasks such as image recognition and feature detection, as well as multi-channel coding and decoding and dynamic multi-beam focusing. More discussion on the application of metasurfaces to optical neural networks is provided in Sec. Perspective and outlook.

    Moving away from metadevices, it should be mentioned that AI tools are becoming part of control systems for lasers190. In this case, ML algorithms are integrated into feedback mechanisms that automatically adjust the position and orientation of laser elements, such as waveplates and polarizers. The emergence of similar intelligent systems may change how the experimental characterization of metasystems and their manufacturing are performed. Below, we briefly discuss applications of AI technologies to other related fields and provide some perspective on how AI technologies may reshape the development and applications of metadevices.

    Perspective and outlook

    Recent advancements in ML and AI methods are expected to reshape some major areas of metaphotonics where metastructures and metasurfaces play important roles. As we mentioned above, to develop metasurfaces with specific properties and functionalities, novel design strategies and approaches in advanced computational techniques are required. It is expected that ML and DL will be useful for developing sophisticated smartphones, robotic systems, and self-driving cars employing the concepts of flat optics. Importantly, ML can help to discover unconventional optical designs thus advancing imaging, sensing, and other functionalities of metaphotonics devices. Many recent studies are devoted to photonic design approaches and emerging material platforms showcasing ML-assisted optimization for intelligent metasurface designs. Among emerging and future developments, we wish to mention a few examples of AI-supplemented systems, beyond the major scope of this paper but still closely related to metaphotonics.

    First, it is worth mentioning topological photonics, one of the most actively developing branches of photonics and physics in general. Realization of topological phases in physical systems opens a road towards novel robust structures protected from scattering losses and structural disorder191, 192. The current trends of AI-supplemented studies do not leave aside this important area of research. The first demonstrations of inverse design of photonic topological insulators include the DL-based estimation of geometric parameters of a 1D photonic crystal to obtain protected edge states at target frequencies193-195, see Fig. 7(a). Later, this approach was also extended to 1D PT-symmetric chains and cylindrical photonic crystal fibers196. In addition, ANN methods were used to predict topological transitions in photonic crystals197. Recent works successfully utilizing non-DL methods for inverse design of 2D topological insulators and waveguides198, 199 indicate that ANNs may soon be applied to design topological structures beyond 1D geometry. At the same time, modern research is turning towards size reduction and the realization of topological metadevices200, meaning that AI technologies can find their applications in metaphotonics.

    Figure 7. Examples of DL-empowered metasystems. (a) Forward and inverse design procedures for topological properties of a 1D photonic crystal. Labels 0 and 1 indicate the geometric Zak phase of the bands (0 or π, respectively). (Adapted from refs.193, 197). (b) Bound states in the continuum designed via a DL algorithm, which made it possible to predict reflection spectra with automatically labelled modes and to find suitable geometric parameters of a unit cell. The results of the design procedure were confirmed experimentally by angle-resolved measurements. (Adapted from ref.202). (c) Example of a biology-inspired system incorporating DL algorithms. Here, a spider-eye-like system is presented, where an antenna is used to perceive incoming waves and process them to retrieve information about the environment. The ANN in this case explicitly imitates the work of a biological neural network, realizing a system similar to the vision system of a spider. (Adapted from ref.206). Figure reproduced with permission from: (a) ref.193, Optica Publishing Group, ref.197, American Physical Society; (b) ref.202, the authors; (c) ref.206, under a Creative Commons Attribution 4.0 International License.

    Metasurfaces supporting high-Q optical resonances are especially promising for applications in nanophotonics, including enhancement of light-matter interaction, high-harmonic generation, biosensing and nonlinear effects201. Realizing these opportunities requires achieving the desired resonance properties. DL-based inverse design of photonic structures supporting bound states in the continuum (BICs) was presented in refs.202, 203. In particular, in ref.202 the authors consider a suspended photonic crystal slab made of Si3N4 [see Fig. 7(b)]. Using DL, they find the geometric parameters of the structure (radius and height of circular holes) for which a symmetry-protected BIC may occur. Importantly, they support the results with experimental measurements demonstrating the presence of two BICs at the Γ-point at wavelengths of 700 and 750 nm. The widespread attention that BICs receive in photonics204, 205 suggests that more studies supported by AI algorithms will appear in the near future.

    While the previous sections of this review have discussed the application of DL techniques to photonics, here we briefly discuss an emerging field, namely that of photonic devices as platforms for physical neural networks. Recently there has been increasing interest in approaches termed neuromorphic computing, with photonics providing a potentially viable platform186, 207. Neuromorphic computing takes an alternative approach to information processing by creating hardware structures that mimic nature's analogue computing, i.e., neural structures. This approach departs from standard computing models, which rely on a centralized processing architecture, and opts instead for a distributed model. It has several key advantages. First, this model maps well to information tasks that are distributive, such as neural network models. This match between the hardware and the algorithm allows an energy-efficient approach to information processing. Second, for photonics-based systems the computation is inherently parallel due to the nature of the platform. Using photonics, one can implement different types of neuron schemes, with physical implementations having been demonstrated using Mach-Zehnder interferometers208, 209, phase change materials210, diffractive elements211 and optical nanoresonators212. Recently the scope of such applications was broadened by a physics-aware training scheme that allows a wider class of systems to be trained for DL applications213. As metaphotonics develops, it is likely that many of the mechanisms and nonlinearities possible with metaphotonics will be identified as potential candidates for DL platforms. A metaphotonics hybrid approach to computational sensors that operates similarly to conventional convolutional neural networks has already been demonstrated214.
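
    As a minimal illustration of how a Mach-Zehnder interferometer can act as a tunable linear weight, the sketch below builds its 2x2 transfer matrix from two 50:50 splitters and two phase shifters. The matrix convention (symmetric splitter with a factor i on the cross terms) and the names mzi and output_powers are assumptions made for this example; they are not taken from refs.208, 209, and the nonlinear activation of a full neuron is not modelled.

```python
import cmath
import math


def matmul2(a, b):
    """Product of two 2x2 complex matrices given as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(2)) for j in range(2)] for i in range(2)]


def mzi(theta, phi=0.0):
    """Transfer matrix: phase phi, 50:50 splitter, phase theta, 50:50 splitter."""
    s = 1 / math.sqrt(2)
    splitter = [[s, 1j * s], [1j * s, s]]

    def shifter(angle):
        # Phase shift applied to the upper arm only.
        return [[cmath.exp(1j * angle), 0], [0, 1]]

    u = shifter(phi)
    for element in (splitter, shifter(theta), splitter):
        u = matmul2(element, u)  # each later element multiplies from the left
    return u


def output_powers(u, amplitudes):
    """Optical power at the two output ports for given input amplitudes."""
    return [abs(sum(u[i][j] * amplitudes[j] for j in range(2))) ** 2 for i in range(2)]


# Light enters the upper port only; theta sets the splitting ratio.
powers = output_powers(mzi(math.pi / 3), [1, 0])
```

    In this convention, the power in the two outputs is sin²(θ/2) and cos²(θ/2), so the internal phase θ continuously tunes the weight while the total power is conserved; meshes of such elements can compose larger unitary matrices.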

    Similar to neuromorphic structures, biology-inspired systems complemented with AI technologies have also been developed. As an example, the spider-eye-like antenna demonstrated in ref.206 is supplemented by a generalised regression neural network215 to analyse the incoming waves [see Fig. 7(c)]. This algorithm predicts the direction of arrival and the polarization state, emulating the operation of the spider's neural system. Similar ideas were implemented to develop an AI photonic synapse216. The structure is based on a field-effect transistor with a floating gate of self-assembled perovskite nanocones embedded into a self-assembled block co-polymer. Arrays of such synapses imitate the human retina, with position-dependent photosynaptic performance defined by the spatial distribution of the nanocones. Such a retina represents a single-layer neural network capable of image recognition with up to 90% accuracy. We believe that intelligent metasurfaces may be integrated into such systems to improve their performance and extend their functionality, providing additional re-programmable functions or data processing with optical ANNs.
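
    A generalised regression neural network amounts to a Gaussian-kernel weighted average of the training labels. The sketch below shows this on toy data and is not the pipeline of ref.206: the single feature (the sine of the arrival angle), the 10° training grid and the kernel width sigma are hypothetical choices made only for illustration.

```python
import math


def grnn_predict(x, train_x, train_y, sigma=0.05):
    """Kernel-weighted average of training labels (Gaussian kernel of width sigma)."""
    weights = [
        math.exp(-sum((a - b) ** 2 for a, b in zip(x, xi)) / (2 * sigma ** 2))
        for xi in train_x
    ]
    total = sum(weights)
    if total == 0.0:
        # Query is far from every sample: fall back to the nearest neighbour.
        distances = [sum((a - b) ** 2 for a, b in zip(x, xi)) for xi in train_x]
        return train_y[distances.index(min(distances))]
    return sum(w * y for w, y in zip(weights, train_y)) / total


# Toy training set: feature = sin(angle), label = angle in degrees.
train_angles = list(range(-60, 61, 10))
train_x = [[math.sin(math.radians(a))] for a in train_angles]
train_y = [float(a) for a in train_angles]

estimate = grnn_predict([math.sin(math.radians(25))], train_x, train_y)
```

    No iterative training is involved: the stored samples are the network, and sigma is the only parameter to tune. A query for an unseen 25° wave returns an estimate between the neighbouring 20° and 30° samples.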

    The use of DL methods is not limited to the design of nanoantennas and their optical response. It may also become an extremely useful tool for assisting experimental measurements. Demonstrations in this field include the characterization of the orientation217 or size218 of metallic nanoparticles using measured spectral data. ML finds applications in a variety of microscopy and imaging techniques219, 220 as well as in tracking221, localization222 and analysis of single molecules223. The proposed algorithms may become a basis for fast automated retrieval of the parameters of nanoantenna samples.

    An additional application for DL methods lies within the realm of optimization, specifically that of high-dimensional systems. Global optimization is a generally challenging endeavour, with many problems becoming intractable as the number of parameters grows in the absence of prior knowledge of the parameter landscape. One approach that leverages the power of DL to generalize over high-dimensional systems is to perform online optimization. In this case, an ML agent is placed in a feedback loop: the agent predicts the best parameters and updates its belief based on feedback from the system. This approach has been used to optimize the laser cooling sequences of cold atom traps with up to 63 individually controllable parameters224, 225. The use of DL allows one to consider optimization problems with much higher dimensions than conventional approaches such as Gaussian processes, which are generally limited to smaller parameter spaces226, 227. Metaphotonics and metasurfaces provide a range of different parameters and exhibit a rich range of phenomena owing to their subwavelength structuring. This provides an opportunity for direct optimization both in a design setting and in real time, as in the case of programmable metasurfaces. Such approaches are flexible, as the objective function may be arbitrarily defined to suit any given purpose.
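
    The structure of such a feedback loop can be sketched as follows. To keep the example short, the DL agent described above is replaced by a much simpler stand-in learner (accept-if-better random search), and the experiment is replaced by the hypothetical function measure_cost; only the propose, measure and update cycle is meant to carry over.

```python
import random


def measure_cost(params):
    """Hypothetical experiment: returns a cost to minimize (lower is better)."""
    target = [0.3, -0.7, 0.5, 0.1]  # unknown to the agent
    return sum((p - t) ** 2 for p, t in zip(params, target))


def online_optimise(cost_fn, n_params, rounds=200, step=0.2, seed=0):
    """Closed-loop search: propose parameters, measure, update the belief."""
    rng = random.Random(seed)
    best = [0.0] * n_params  # current belief about the best setting
    best_cost = cost_fn(best)
    history = [best_cost]
    for _ in range(rounds):
        trial = [p + rng.gauss(0.0, step) for p in best]  # agent proposes
        cost = cost_fn(trial)                             # system feeds back
        if cost < best_cost:                              # agent updates its belief
            best, best_cost = trial, cost
        history.append(best_cost)
    return best, history


best, history = online_optimise(measure_cost, n_params=4)
```

    Because the loop only needs a scalar cost, cost_fn can wrap any figure of merit; a learned surrogate model would replace the random proposal step with predictions that make better use of the measurements already taken.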

    While metaphotonic structures have been intensively studied in the past decade, many challenges still exist for practical and application-specific use. Fabrication becomes difficult for many structures that require high accuracy, and modulating metasurfaces also becomes harder. With emerging materials, new physics, and advanced nanofabrication techniques, these challenges may be bypassed. We anticipate new insights from merging concepts from several fields, exploring interdisciplinary concepts of metaphotonics, and creating new types of hybrid systems governed by the concepts of flat optics and metamaterial-inspired subwavelength engineering. We expect that future research in intelligent metaphotonics will significantly broaden the horizons of photonics and offer new perspectives for novel applications beyond our current imagination.

    References

    [10] Mohri M, Rostamizadeh A, Talwalkar A. Foundations of Machine Learning (MIT Press, Cambridge, 2018).

    [13] Tanaka A, Tomiya A, Hashimoto K. Deep Learning and Physics (Springer, Singapore, 2021); https://doi.org/10.1007/978-981-33-6108-9.

    [14] Miyanawala TP, Jaiman RK. An efficient deep learning technique for the Navier-Stokes equations: application to unsteady wake flow dynamics. ArXiv: 1710.09099 (2018).

    [15] Lucor D, Agrawal A, Sergent A. Physics-aware deep neural networks for surrogate modeling of turbulent natural convection. ArXiv: 2103.03565 (2021).

    [16] Lim J, Psaltis D. MaxwellNet: physics-driven deep neural network training based on Maxwell’s equations. ArXiv: 2107.06164 (2021).

    [18] Goodfellow I, Bengio Y, Courville A. Deep Learning (MIT Press, Cambridge, 2016).

    [21] Simonyan K, Zisserman A. Very deep convolutional networks for large-scale image recognition. In Proceedings of the 3rd International Conference on Learning Representations (ICLR, 2015).

    [22] Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J et al. Language models are few-shot learners. In Proceedings of the 34th Conference on Neural Information Processing Systems (NeurIPS, 2020).

    [23] van den Oord A, Kalchbrenner N, Kavukcuoglu K. Pixel recurrent neural networks. In Proceedings of the 33rd International Conference on Machine Learning 1747–1756 (PMLR, 2016).

    [37] Qiu CK, Luo Z, Wu X, Yang HD, Huang B. Inverse design of multilayer nanoparticles using artificial neural networks and genetic algorithm. ArXiv: 2003.08356 (2020).

    [44] Guo R, Lin ZC, Shan T, Song XQ, Li MK et al. Physics embedded deep neural network for solving full-wave inverse scattering problems. IEEE Trans Antennas Propag (2021); https://doi.org/10.1109/TAP.2021.3102135.

    [54] Elzouka M, Yang C, Albert A, Lubner S, Prasher RS. Interpretable inverse design of particle spectral emissivity using machine learning. ArXiv: 2002.04223 (2020).

    [75] Ghorbani F, Shabanpour J, Beyraghi S, Soleimani H, Oraizi H et al. A deep learning approach for inverse design of the metasurface for dual-polarized waves. ArXiv: 2105.08508 (2021).

    [78] Zandehshahvar M, Kiarashi Y, Zhu ML, Maleki H, Brown T et al. Manifold learning for knowledge discovery and intelligent inverse design of photonic nanostructures: breaking the geometric complexity. ArXiv: 2102.04454 (2021).

    [80] Naseri P, Pearson S, Wang ZZ, Hum SV. A combined machine-learning/optimization-based approach for inverse design of nonuniform bianisotropic metasurfaces. ArXiv: 2105.14133 (2021).

    [113] McManamon PF. LiDAR Technologies and Systems (SPIE Press, Bellingham, 2019).

    [121] Vulpetti G, Johnson L, Matloff GL. Solar Sails: A Novel Approach to Interplanetary Travel (Springer, New York, 2015); https://doi.org/10.1007/978-1-4939-0941-4.

    [127] Kudyshev ZA, Kildishev AV, Shalaev VM, Boltasseva A. Optimizing Startshot lightsail design: a generative network-based approach. ArXiv: 2108.12999 (2021).

    [131] Abdollahramezani S, Hemmatyar O, Taghinejad M, Taghinejad H, Krasnok A et al. Electrically driven programmable phase-change meta-switch reaching 80% efficiency. ArXiv: 2104.10381 (2021).

    [138] Son H, Kim SJ, Hong J, Sung J, Lee B. Design of highly perceptible dual-resonance all-dielectric metasurface colorimetric sensor via deep neural networks. (2021); https://doi.org/10.21203/rs.3.rs-801301/v1.

    [142] Torun H, Bilgin B, Ilgu M, Yanik C, Batur N et al. Machine learning detects SARS-CoV-2 and variants rapidly on DNA aptamer metasurfaces. medRxiv (2021); https://doi.org/10.1101/2021.08.07.21261749.

    [144] Ren ZH, Zhang ZX, Wei JX, Dong BW, Lee C. Mid-infrared nanoantennas as ultrasensitive vibrational probes assisted by machine learning and hyperspectral imaging. (2021); https://doi.org/10.21203/rs.3.rs-209363/v1.

    [147] Kwekha-Rashid AS, Abduljabbar HN, Alhayani B. Coronavirus disease (COVID-19) cases analysis using machine-learning applications. Appl Nanosci (2021); https://doi.org/10.1007/s13204-021-01868-7.

    [162] Banerji S, Majumder A, Hamrick A, Menon R, Sensale-Rodriguez B. Machine learning enables ultra-compact integrated photonics through silicon-nanopattern digital metamaterials. ArXiv: 2011.11754 (2020).

    [164] Abdullah M, Koziel S. Supervised-learning-based development of multibit RCS-reduced coding metasurfaces. IEEE Trans Microw Theory Tech (2021); https://doi.org/10.1109/TMTT.2021.3105677.

    [171] Alexandropoulos GC, Shlezinger N, Alamzadeh I, Imani MF, Zhang HY et al. Hybrid reconfigurable intelligent metasurfaces: enabling simultaneous tunable reflections and sensing for 6G wireless communications. ArXiv: 2104.04690 (2021).

    [182] Cui TJ, Liu C, Ma Q, Luo ZJ, Hong QR et al. Programmable artificial intelligence machine for wave sensing and communications. (2020); https://doi.org/10.21203/rs.3.rs-90701/v1.

    [185] Saigre-Tardif C, Faqiri R, Zhao HT, Li LL, del Hougne P. Intelligent meta-imagers: from compressed to learned sensing. ArXiv: 2110.14022 (2021).

    [189] Luo XH, Hu YQ, Li X, Ou XN, Lai JJ et al. Metasurface-enabled on-chip multiplexed diffractive neural networks in the visible. ArXiv: 2107.07873 (2021).

    [198] Chen M, Zandehshahvar M, Kiarashinejad Y, Hemmatyar O, Umapathy D et al. Inverse design of nanophotonic structures using a hybrid dimensionality reduction technique. In Proceedings of Frontiers in Optics/Laser Science FM2A.1 (Optical Society of America, 2020); https://doi.org/10.1364/FIO.2020.FM2A.1.

    [202] Ma XZ, Ma Y, Cunha P, Liu QS, Kudtarkar K et al. A universal deep learning strategy for designing high-quality-factor photonic resonances. ArXiv: 2105.03001 (2021).

    [207] Schuman CD, Potok TE, Patton RM, Birdwell JD, Dean ME et al. A survey of neuromorphic computing and neural networks in hardware. ArXiv: 1705.06963 (2017).

    [213] Wright LG, Onodera T, Stein MM, Wang TY, Schachter DT et al. Deep nonlinear optical neural networks using physics-aware training. In Proceedings of Conference on Lasers and Electro-Optics FF1A.4 (Optical Society of America, 2021); https://doi.org/10.1364/CLEO_QELS.2021.FF1A.4.

    [214] Majumdar A. Metaphotonic computational image sensors. In Proceedings of Imaging and Applied Optics Congress IW1D.3 (Optical Society of America, 2020); https://doi.org/10.1364/ISA.2020.IW1D.3.

    [225] Gupta RK, Everett JL, Tranter AD, Henke R, Gokhroo V et al. Machine learner optimization of optical nanofiber-based dipole traps for cold 87Rb atoms. ArXiv: 2110.03931 (2021).
