Metamaterials and metasurfaces have inspired worldwide interest in the recent two decades due to their extraordinary performance in controlling material parameters and electromagnetic properties. However, most studies on metamaterials and metasurfaces are focused on manipulations of electromagnetic fields and waves, because of their analog natures. The concepts of digital coding and programmable metasurfaces proposed in 2014 have opened a new perspective to characterize and design metasurfaces in a digital way, and made it possible to control electromagnetic fields/waves and process digital information simultaneously, yielding the birth of a new direction of information metasurfaces. On the other hand, artificial intelligence (AI) has become more important in automatic designs of metasurfaces. In this review paper, we first show the intrinsic natures and advantages of information metasurfaces, including information operations, programmable and real-time control capabilities, and space–time-coding strategies. Then we introduce the recent advances in designing metasurfaces using AI technologies, and particularly discuss the close combinations of information metasurfaces and AI to generate intelligent metasurfaces. We present self-adaptively smart metasurfaces, AI-based intelligent imagers, microwave cameras, and programmable AI machines based on optical neural networks. Finally, we indicate the challenges, applications, and future directions of information and intelligent metasurfaces.
In the past two decades, metamaterials have attracted widespread attention all over the world and have been studied extensively in various fields due to their unparalleled capabilities in manipulating material parameters[1–3]. In the early stage, metamaterial research was focused mainly on the control of effective medium parameters based on bulk structures in three-dimensional (3D) versions, driven by enthusiasm for negative refraction[4–7], invisibility cloaks[8–15], and perfect/super lenses[16–20]. However, 3D metamaterials usually have high loss and fabrication complexity, which limit their further development and applications. Therefore, the idea of a planarized metamaterial design was proposed and gradually formed the concept of metasurfaces[21–24], which can be regarded as two-dimensional (2D) versions of metamaterials. Metasurfaces not only possess the powerful control abilities of 3D metamaterials but also have the advantages of an ultra-low profile, easy fabrication, and low loss. In 2011 and 2012, the generalized Snell’s law was proposed based on a metasurface[22,25], showing a new method to delicately tailor the reflection and transmission of electromagnetic (EM) waves. This work inspired researchers to design metasurfaces using their phase and amplitude distributions and also stimulated abundant applications such as ultrathin cloaks[22,24,26], holograms[27,28], planar optical lenses, polarization converters[22,30,31], absorbers[32–34], and vortex-beam generators[27,35]. Because of the above-mentioned advantages, metasurfaces have significantly expanded their range of applications, including wireless communications[36–40], EM imaging[41–43], satellite antennas[44–46], cloaking[26,47–49], and so on. Building on passive metasurfaces, tunable[50–55] and reconfigurable[56–64] metasurfaces add dynamic control to the aforementioned application scenarios.
Traditional metasurface studies design EM properties on continuous scales, and the resulting structures can therefore be classified as analog metasurfaces. With the establishment and wide application of the von Neumann computer system, the representation of modern information is inseparable from digital binary coding. To explore the possible connection between metasurfaces and digital information, the concept of digital metasurfaces was proposed in 2014[65,66] by two groups independently. Giovampaola and Engheta presented a discrete structural design method for the digital design of metasurfaces, but this concept is still limited to the digitization of the equivalent medium parameters and is hard to connect with the coding streams of digital information. In contrast, Cui et al. proposed to characterize metasurfaces using the digital codes “0” and “1” (with opposite phase responses) instead of medium parameters, and to control the EM fields and waves using different coding sequences, producing digital coding metasurfaces. The digital coding sequences are directly connected with the coding streams of digital information. More importantly, an active meta-atom was designed to switch between the digital states “0” and “1” in real time. Once all possible coding sequences and their EM functions are pre-computed and stored in a field-programmable gate array (FPGA), the digital coding metasurface becomes a programmable metasurface, in which many different EM functions can be performed on the same platform and switched in real time through the FPGA. The coding sequence is, on one hand, the controlling command to perform the specific EM function and, on the other hand, a digital stream that is modulated onto the EM function. Hence the programmable metasurface can control EM fields and waves in real time and modulate digital information simultaneously.
This unique feature has directly evolved into a new branch of metasurfaces—information metasurfaces, which was first proposed in 2017 and developed in 2021.
Digital coding, programmable, and information metasurfaces have successfully bridged the EM physical world and the digital information world[69,70]. Based on their unique features, various functions, devices, and systems of information metasurfaces have been achieved[71–75], such as orbital angular momentum (OAM) generators[67,76,77], spatial modulators[33,78–82], nonreciprocal devices[83,84], smart and self-adaptive beam scanners[85,86], intelligent imagers[87–90], and microwave cameras[87,88]. In the early stage, the form of coding was limited to encoding of the reflection phase, but it was rapidly extended to amplitude coding[33,78,91], polarization coding[67,79,80], OAM coding, and frequency coding. The working frequency of the digital coding metasurface has also been increased from the microwave band to the terahertz frequency[93–95]. In the meantime, a wealth of theories[96–98] and applications[99–101] have emerged. One important direction is the combination of information metasurfaces with traditional information theory. The convolution theorem of the digital coding metasurface was proposed in 2016, and implements a fast design and calculation method for arbitrary spatial beams and provides a reference for information computing. In the same year, information entropy theory was presented for digital coding metasurfaces, which provides an effective analysis method for information entropy calculation for both digital coding patterns and scattering patterns of EM waves. As an extension, a general EM information theory was developed[102,103], giving the constraint of the digital information and EM information, as well as information capacities. Another important dimension of information metasurfaces is temporal control, from which time-domain digital coding metasurfaces[82,104,105] have emerged. Time-domain coding makes it possible to freely control the frequency spectra of scattering waves in a programmable way[82,100,104–109]. 
Combining space-domain and time-domain coding, space–time-coding digital metasurfaces have been presented[82,100,104–109], which can manipulate both spatial beams and frequency spectra simultaneously in real time. One important application of information metasurfaces is to build new architectures for wireless communication systems. In 2018, Cui, Liu, and Bai proposed a direct information transmission system based on a programmable metasurface, which can realize real-time image transmission based on 1-bit programmable units. Since then, many studies have been conducted on new-architecture wireless communications based on time-coding and space–time-coding digital metasurfaces[39,40,81,84,108,109], opening a direction for developing new wireless communication systems. On the other hand, the real-time reprogrammable feature of information metasurfaces can be used to control and tailor the wireless channel and EM environment. Such metasurfaces are also known as reconfigurable intelligent surfaces[38,110–115] in the wireless communication community, and have emerged as a promising path to optimize spatial energy efficiency in a disorganized EM environment.
Parallel to the developments of metasurfaces, as the ultimate direction of information and digitization, artificial intelligence (AI) has also received extensive attention in recent years. Since AlphaGo, the Go-playing AI program developed by Google DeepMind, beat Lee Sedol, one of the best players in the world, in 2016, AI has attracted rapidly growing attention and has been applied to an ever-increasing variety of fields. AI technology aims to make machines imitate the actions and decision-making processes of human beings to solve intelligent problems. The basic problem of AI is how to let a machine learn from collected data or from interaction with the environment, and therefore, a variety of machine-learning and deep-learning algorithms have been developed. Artificial neural networks (ANNs) have proved to be able to handle various intelligent tasks, such as speech recognition[118–120], image recognition[121–123], automatic translation[124–126], image editing[127–130], and robot control[131–133]. Owing to these capabilities, AI has been integrated into metasurface structure and function designs. In 2017, Zhang et al. exhibited a method to design a metasurface unit using a machine-learning algorithm, in which the pixelated metallic structure of the metasurface element can be automatically designed for arbitrary phase responses. The idea was advanced by Qiu et al. in 2019, in which an efficient method based on deep learning was reported. In addition to these studies on building unit structures with pixel blocks, Ghorbani et al. proposed to construct unit structures based on eight basic patterns, in which the deep-learning method was used to rapidly design the EM wave regulations in the case of dual polarizations. In 2018, Inampudi and Mosallaei developed a meta-element design method using a neural network for surface metal structures based on polygonal patterns.
The structural intelligent design method was also extended to acoustic metamaterials, where the cylindrical structure is divided into five layers with different radii to obtain desired transmission coefficients, which are analyzed by a probability-density-based neural network. In addition to being used for the automatic design of metasurface elements, the machine-learning algorithm was also integrated into information metasurfaces to perform more intelligent tasks. In 2019, Li et al. presented a reprogrammable metasurface imager using principal component analysis (PCA). High-accuracy EM imaging was demonstrated, including reconstructions of handwritten digits and through-wall body gestures. Based on a programmable metasurface and PCA algorithm, the imaging system was apparently simplified with low cost[89,90]. Along this line of research, an intelligent imager and recognizer, also called a microwave camera, was further developed to perform more precise and customized imaging. By applying a series of CNN algorithms, the system is able to recognize the hand signs and vital signs of multiple people in experiments with good performance.
In the AI community, besides machine-learning and deep-learning algorithms, physics-informed neural networks driven by partial differential equations have been rapidly developed, showing great potential in solving classical problems such as fluid mechanics and quantum mechanics. Also, the graph neural network, benefiting from its highly extensible connecting structure, has become an excellent framework to absorb physical mechanisms and yield state-of-the-art performance in particle-based simulations. With the advancement of AI, artificial neural networks are becoming ever larger, placing higher demands on computing power and promoting the development of computing hardware. Nowadays, the speed of executing AI frameworks has become an important performance index for graphics processing units (GPUs). Although GPUs are very suitable for general AI calculations, they are expensive, bulky, and power consuming for edge deployment. Since 2015, different dedicated AI chips with low power consumption and high performance have been developed[142,143] for executing AI computing workloads with energy-efficient approaches, but they have fixed functionalities. Besides electronic AI chips, neuromorphic computing based on nanophotonics has gained more and more attention, inspired by its natural characteristics of parallel and light-speed computing, and consistent efforts have been made to bring neuromorphic photonics towards the realization of fully functional neuromorphic networks. All-optical diffractive deep neural networks (D²NNs) as well as related theoretical methods have been developed for higher parallelism and lower energy consumption[147–153]. Following this path, some programmable methods to establish a more general computing machine have been investigated[154–156].
To better show the mutual developments of metamaterials and AI and their integration, we summarize a development timeline for the above two areas, as shown in Fig. 1. In the early days, artificial materials and AI were two independent research directions. Although the concept of metamaterials was proposed as early as 1967, its extensive study started in 1996. The research of machine learning emerged around 1980, and it received worldwide attention after the deep-learning algorithm was proposed around 2010. Metamaterial and AI technologies such as machine learning have been continuously integrated during this period and formed some new sub-directions. With the development and fusion of information metasurfaces in recent years, intelligent metasurfaces have emerged, which include the integration of machine-learning algorithms into information metasurfaces and the use of multi-layer information metasurfaces to build the hardware of neural networks, resulting in all-optical and programmable D²NNs, as illustrated in Fig. 1. To clearly indicate the timeline of the typical works in information, we add a timeline with the detailed references and pictures. Please note that the timeline is not strictly linear on the time scale.
Figure 1. Development of metamaterials, artificial intelligence, and their integration to result in intelligent metamaterials.
In this paper, different from previous review papers on metasurfaces[157,158], we focus on the recent advances in information metasurfaces and their integration with AI, the intelligent metasurfaces. We first introduce the developments of digital coding, programmable, and information metasurfaces, including their concepts, information operation theories, real-time and reprogrammable controls of EM fields and waves, space–time-coding modulations, and applications in wireless communications. Then we discuss the relationship between metasurfaces and AI technologies, which helps automatic designs of metasurface elements and metasurface patterns. Especially, we investigate intelligent metasurfaces to achieve close combinations of information metasurfaces and AI technologies, including self-adaptively smart metasurfaces with self-decision ability, AI-based intelligent imagers and microwave cameras, and programmable AI machines based on wave diffraction for neuromorphic computing with highly parallel features and efficiency. Finally, we give the challenges, potential applications, and future directions of information metasurfaces and intelligent metasurfaces.
2 Information Metasurfaces
2.1 Concept and Theories of Information Metasurfaces
The concept of information metasurfaces originated from digital coding and programmable metasurfaces, and combines digital information and physical meta-structures. Most previous research on metamaterials and metasurfaces has focused on the material characteristics and function realization, but is rarely discussed from the perspective of information science. The digital coding metasurface offers us a new angle to characterize the metasurface in a digital way and control the EM function by a spatial coding sequence[66,70,97,159,160]. Therefore, abundant coding theories, digital meta-atom designs, and programmable EM manipulations have been proposed, which gradually construct the academic and application systems of information metasurfaces. Among these theories and applications, the scattering field expression of the digital coding pattern, convolution theorem, and information entropy of information metasurfaces established the core methods for other expansive applications, which will be discussed in detail in this subsection. For more information, please refer to two in-depth review papers on information metasurfaces[68,70].
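For the phase coding metasurface with certain digital bit states, the phase distribution on the metasurface can be expressed as a digital matrix, as shown in Fig. 2. When the metasurface is illuminated by a plane wave, the scattered field can be expressed as
$$f(\theta,\varphi)=\sum_{m=1}^{M}\sum_{n=1}^{N}\exp\left\{j\left[\phi_{mn}+k_0 d_x\left(m-\tfrac{1}{2}\right)\sin\theta\cos\varphi+k_0 d_y\left(n-\tfrac{1}{2}\right)\sin\theta\sin\varphi\right]\right\},$$
where $d_x$ and $d_y$ are element dimensions along the $x$ and $y$ axes, and $M$ and $N$ are element numbers along the $x$ and $y$ axes, respectively; $\phi_{mn}$ is the reflective phase response of the element at the specific location $(m,n)$; $k_0$ is the free-space wavenumber; and $\theta$ and $\varphi$ are the elevation and azimuth angles of the observation direction.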
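As a rough numerical illustration of the double sum over elements described above, the following minimal sketch evaluates the scattered field of a 1-bit coding pattern in pure Python. It assumes isotropic elements, normal plane-wave incidence, and an array-factor form with a half-element position offset. The function name `scattered_field`, the 8 × 8 array size, and the element size of one wavelength are illustrative choices, not values taken from the cited works.

```python
import cmath
import math


def scattered_field(phases, dx, dy, wavelength, theta, phi):
    """Array-factor sum over all elements for one observation direction.

    phases[i][k] is the reflection phase (rad) of element (m, n) = (i+1, k+1);
    m runs along x and n along y. The element pattern is taken as isotropic.
    """
    k0 = 2.0 * math.pi / wavelength
    u = math.sin(theta) * math.cos(phi)
    v = math.sin(theta) * math.sin(phi)
    total = 0j
    for m, row in enumerate(phases, start=1):
        for n, phase in enumerate(row, start=1):
            total += cmath.exp(1j * (phase
                                     + k0 * dx * (m - 0.5) * u
                                     + k0 * dy * (n - 0.5) * v))
    return total


# Hypothetical example: 8 x 8 elements with the 1-bit code "0101..." along x.
M = N = 8
wavelength = 1.0        # arbitrary length unit
dx = dy = wavelength    # assumed element (super-cell) size
codes = [[m % 2 for _ in range(N)] for m in range(M)]
phases = [[math.pi * c for c in row] for row in codes]   # "0" -> 0, "1" -> pi

# Scan the x-z plane; a negative theta stands for the phi = 180 deg side.
scan = {deg: abs(scattered_field(phases, dx, dy, wavelength,
                                 math.radians(deg), 0.0))
        for deg in range(-90, 91)}
peak_deg = max(scan, key=scan.get)
broadside = scan[0]
print("peak |theta| (deg):", abs(peak_deg), " broadside level:", broadside)
```

Under these assumptions the specular (broadside) reflection cancels and the energy splits into two symmetric beams, whose angle follows from the coding period (here two elements) through sin θ = λ/(2 d_x).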
Figure 2. Theories of information metasurfaces. (a), (b) Characterization of metasurface by digital coding and its scattering features. (c)–(e) Convolution operation of the digital coding metasurface, from the coding-pattern domain to the scattering-pattern domain. (f), (g) Information entropy of the digital coding metasurface, offering the information measurement from the coding pattern to the scattering pattern. (a), (b) Adapted from Ref. , Copyright 2014, with permission from Springer Nature, licensed under CC-BY-NC-SA 3.0. (c)–(e) Adapted from Ref. , Copyright 2016, with permission from Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. (f), (g) Adapted from Ref. , Copyright 2016, with permission from Springer Nature, licensed under CC-BY-NC-ND 4.0.
The convolution theorem of the information metasurface, inspired by signal processing theory, presents flexible scattering-field transformations and manipulations using the coding pattern superposition. As shown in Fig. 2, the scattered fields [Fig. 2(d)] of the cross-coding pattern and gradient-phase coding pattern [Fig. 2(c)] are distinct: five beams in the upward direction and a deflected single beam, respectively. The superposed coding pattern of the cross and gradient phases is generated in Fig. 2(c), whose scattered field then combines the characteristics of both patterns, steering the five beams toward the deflection direction of the gradient phase pattern. This is similar to the convolution theorem for two signals, in which multiplication by a phase ramp in one domain produces a shift in the transformed domain. More interestingly, based on the above method, a more flexible coding pattern calculation is derived according to the equations
$$\theta=\arcsin\sqrt{\sin^2\theta_1+\sin^2\theta_2},\qquad \varphi=\arctan\frac{\sin\theta_2}{\sin\theta_1},$$
where $\theta$ and $\varphi$ are synthetic scattering angles, and $\theta_1$ and $\theta_2$ are scattering angles of two original coding patterns, taken here as gradient patterns varying along orthogonal directions. These equations mean that arbitrary deflection angles can be synthesized by using the convolution theorem of digital coding metasurfaces. This method enriches the coding pattern operations and establishes the connection between the phase pattern on the metasurface and far-field scattered fields, promoting more research on digital operations in EM fields. Also, the convolution operation facilitates on-site solutions for beam-scanning and multi-beam manipulations.
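The pattern superposition described above can be sketched as an element-wise addition of coding digits modulo the number of coding levels, which corresponds to adding the reflection phases. The minimal example below assumes two 2-bit gradient patterns along orthogonal directions, isotropic elements of size λ/2, and an array-factor model in which a descending code sequence deflects toward positive x or y. The names `add_codes`, `synthesized_angles`, and `field` are hypothetical helpers, and the angle formula is the orthogonal-gradient form assumed here.

```python
import cmath
import math


def add_codes(pattern_a, pattern_b, bits):
    """Element-wise sum of two coding patterns modulo 2**bits."""
    levels = 2 ** bits
    return [[(a + b) % levels for a, b in zip(row_a, row_b)]
            for row_a, row_b in zip(pattern_a, pattern_b)]


def synthesized_angles(theta1, theta2):
    """Elevation and azimuth of the beam from two orthogonal gradients."""
    s1, s2 = math.sin(theta1), math.sin(theta2)
    theta = math.asin(math.sqrt(s1 * s1 + s2 * s2))
    phi = math.atan2(s2, s1)
    return theta, phi


def field(codes, bits, d, wavelength, theta, phi):
    """Array-factor sum for a coding pattern (isotropic elements assumed)."""
    k0 = 2.0 * math.pi / wavelength
    step = 2.0 * math.pi / 2 ** bits          # phase per coding digit
    u = math.sin(theta) * math.cos(phi)
    v = math.sin(theta) * math.sin(phi)
    total = 0j
    for m, row in enumerate(codes, start=1):
        for n, c in enumerate(row, start=1):
            total += cmath.exp(1j * (c * step + k0 * d * ((m - 0.5) * u
                                                          + (n - 0.5) * v)))
    return total


BITS, SIZE = 2, 8
wavelength = 1.0
d = wavelength / 2                             # assumed element size
# Descending 2-bit gradients "0 3 2 1 ..." along x and along y.
grad_x = [[(-m) % 4 for _ in range(SIZE)] for m in range(SIZE)]
grad_y = [[(-n) % 4 for n in range(SIZE)] for _ in range(SIZE)]
mixed = add_codes(grad_x, grad_y, BITS)

# Each gradient alone has period 4*d, so sin(theta_i) = wavelength / (4*d).
theta1 = theta2 = math.asin(wavelength / (4 * d))
theta, phi = synthesized_angles(theta1, theta2)
peak = abs(field(mixed, BITS, d, wavelength, theta, phi))
print(math.degrees(theta), math.degrees(phi), peak)
```

In this idealized model all elements of the superposed pattern add in phase at the predicted direction, so the field magnitude there equals the number of elements.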
To theoretically define the information property of information metasurfaces, the concept of information entropy of the digital coding metasurface was reported. The digital coding pattern is first analyzed from the perspective of geometrical information entropy (for a digital stream), as shown in Fig. 2(f), since the coding pattern can be regarded as a pixelated image and corresponds to the digital stream. After a fast Fourier transform (FFT), the digital coding pattern is transformed into the related scattering pattern, which is related to the physical entropy of the scattering field, as depicted in Fig. 2(g). The relationship between the geometrical information entropy and physical information entropy is investigated. For example, a periodic coding pattern such as “01010101…” reflects mainly the scattering energy in two symmetrical directions, whose physical entropy is relatively low. When the coding pattern gradually becomes randomly distributed with high geometrical entropy, the physical entropy of the scattering pattern is correspondingly increased. This information definition for a digital coding metasurface paves an important road for investigating the information capacity and information bounds of information metasurfaces[102,103].
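The trend described above (a regular coding pattern gives low entropy in both domains, an irregular one gives higher entropy) can be illustrated with a simplified one-dimensional sketch. The definitions used here are assumptions for illustration only and are not the exact measures of the cited work: the geometrical entropy is taken as the Shannon entropy of adjacent code pairs, and the physical entropy as the Shannon entropy of the normalized discrete Fourier power spectrum of exp(jπ·code). The irregular sequence is an arbitrary fixed example.

```python
import cmath
import math
from collections import Counter


def shannon_entropy(weights):
    """Shannon entropy (bits) of a list of non-negative weights."""
    total = sum(weights)
    probs = [w / total for w in weights]
    return -sum(p * math.log2(p) for p in probs if p > 0)


def geometric_entropy(codes):
    """Entropy of adjacent code pairs (cyclic): a simple pattern measure."""
    pairs = Counter(zip(codes, codes[1:] + codes[:1]))
    return shannon_entropy(list(pairs.values()))


def physical_entropy(codes):
    """Entropy of the normalized DFT power of the 1-bit reflection phases."""
    n = len(codes)
    aperture = [cmath.exp(1j * math.pi * c) for c in codes]
    power = []
    for k in range(n):
        s = sum(a * cmath.exp(-2j * math.pi * k * i / n)
                for i, a in enumerate(aperture))
        power.append(abs(s) ** 2)
    return shannon_entropy(power)


periodic = [0, 1] * 16
# Arbitrary fixed irregular 1-bit sequence (illustrative only).
irregular = [1, 0, 0, 0, 0, 1, 0, 0, 1, 0, 1, 1, 0, 0, 1, 1,
             1, 1, 1, 0, 0, 0, 1, 1, 0, 1, 1, 1, 0, 1, 0]

g_per, p_per = geometric_entropy(periodic), physical_entropy(periodic)
g_irr, p_irr = geometric_entropy(irregular), physical_entropy(irregular)
print("periodic :", g_per, p_per)
print("irregular:", g_irr, p_irr)
```

Note that in this sampled model the two symmetric beams of the “0101…” pattern fall into a single spatial-frequency bin, so its spectral entropy is essentially zero; the qualitative ordering between the regular and irregular patterns is the point of the sketch.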
2.2 Programmable Metasurfaces
The first benefit of the digital coding representation is that it makes EM control real-time and EM functions reprogrammable, because EM fields and waves are manipulated by the coding sequences or coding patterns on the metasurface. Most previous works on coding metasurfaces are based on passive structures, whose functions are fixed after fabrication. In contrast, programmable metasurfaces can reconfigure the EM structure by integrating active devices such as diodes and varactors, thereby achieving various functions. Tunable materials such as liquid crystal and graphene, or mechanical methods, can also be used to achieve programmable metasurfaces. In fact, various programmable metasurfaces have been presented in the past few years[68,70], and have realized reprogrammable holograms, scattering control, time-domain modulation, and so on. Recently, with the emergence of novel physical effects in topology research, programmable topology devices have attracted much attention. Here, we detail a recent work: a reprogrammable plasmonic topological insulator.
At present, most photonic topological insulators can achieve only specific static topological EM functions, and their reconfigurability is limited[161–163]. In addition, the existing static photonic topological crystals can form waveguide paths only on the topological boundary surface[164,165], which wastes the huge internal space of photonic topological crystals, limiting the compactness and miniaturization of topological optoelectronic devices. In future practical applications, to improve the integration and reduce design and manufacturing costs, topological optoelectronic devices will inevitably be developed in the direction of multi-functional monolithic integration. For this reason, researchers continue to explore reconfigurable photonic topological insulators. Recently, many reconfigurable photonic topological insulators have been realized by mechanically or thermally changing their geometrical or material parameters[166–171]. However, in practical engineering applications, once most of the photonic topological crystals are processed, their geometric and material parameters are not easy to change. Recently, You et al. reported a field-programmable topological EM metasurface based on surface plasmons, as shown in Fig. 3(a). Based on the programmable topology platform, dynamically regulated topological protection waveguide paths are realized, and the switching time for different topological waveguide paths can reach 10 ns. Compared with the previous mechanical control method, the switching speed increases by times.
Figure 3. Principle of the reprogrammable plasmonic topological insulator and the experimental demonstration. (a) Schematic of the reprogrammable topological insulator, where each unit can be programmed by FPGA to establish distinct topological routes. (b) Detailed structure of a 2-bit unit cell, in which six PIN diodes are integrated on the six branches. (c) Four typical states when different on–off states are applied, encoded as units 0, 1, 2, and 3. (d) Band diagrams of a crystal with the designed 2-bit unit cell. (e) First Brillouin zones of units 1 and 2. (f) Topological phase transition and valley–chirality properties of units 1 and 2. (g)–(i) Measured near-field distributions of three typical topological routes. Adapted from Ref. , Copyright 2021, under a Creative Commons Attribution 4.0 International License.
To realize electrical programmability, the honeycomb-arranged unit structure contains six symmetrically distributed positive–intrinsic–negative (PIN) diodes, as depicted in Fig. 3(b), and each unit has four coding states (0, 1, 2, 3), as presented in Figs. 3(c)–3(f). By controlling the switching state of the diodes using FPGA, the spatial symmetry of the cell structure can be regulated. At the same time, by configuring the coding regions with different shapes, various types of topological region boundary lines are constructed. In addition, each unit of the reprogrammable EM metasurface has a dynamic encoding function; thus, it can dynamically switch the topological waveguide paths with different shapes at high speeds, so as to realize arbitrary customization of the EM topological paths and high-speed control functions. Compared with the existing reconfigurable photonic topological insulators, each unit of the reprogrammable topology metasurface has an independent electronic-control coding function, and hence the control accuracy and speed are beyond the reach of traditional reconfigurable photonic topological insulators.
To verify the high-speed dynamic control characteristics of the programmable topology waveguide, a multi-channel digital-to-analog converter is designed on the field-programmable topology metasurface. As shown in Figs. 3(g)–3(i), the multi-channel digital-to-analog converter has four waveguide ports, and port 1 is excited by a sine-signal wave with a frequency of 7.2 GHz. By dynamically encoding different topological waveguide paths, high-speed switching of output ports 2, 3, and 4 can be realized, thereby directly discretizing the input analog–signal wave into different digital signals at the output ports. Experimental results show that the switching time of the programmable topology waveguide can reach 10 ns. Compared with the existing mechanical regulation method, the regulation speed is increased by times. In addition, since only one signal channel is opened in each time period, the interference between different channels is negligible. This low cross talk feature plays a vital role in actual high-fidelity digital communications.
2.3 Space–Time-Coding Digital Metasurfaces
Earlier digital coding and programmable metasurfaces concentrated mainly on spatial coding. Recently, time-domain coding and space–time-coding digital metasurfaces have been presented to explore new degrees of freedom for controlling the frequency spectra and increasing information capacity. EM wavefronts can be tailored more flexibly in both space and frequency dimensions by using space–time-coding metasurfaces. A conceptual diagram of the space–time-coding metasurface with programmable coding elements integrated with PIN diodes is shown in Fig. 4(a). The EM responses of the programmable coding elements are tailored via FPGA according to an optimized 3D space–time-coding matrix, in which each element is not only space modulated but also time modulated. Hence, the harmonic distributions and propagation directions of EM waves, which correspond to spectral and spatial properties, respectively, can be simultaneously manipulated. As shown in Fig. 4(a), a monochromatic beam of frequency $f_c$ can be reflected into different beams at different harmonic frequencies and specific propagation directions, which are realized by the designed space–time-coding metasurface with temporal modulation frequency $f_0$. For an array of $M\times N$ elements, the far-field scattering pattern at the generic $m$th harmonic frequency $f_c+mf_0$ can be expressed as
$$F_m(\theta,\varphi)=\sum_{q=1}^{N}\sum_{p=1}^{M}E_{pq}(\theta,\varphi)\,a_{pq}^{m}\exp\left\{j\frac{2\pi}{\lambda_c}\left[(p-1)d_x\sin\theta\cos\varphi+(q-1)d_y\sin\theta\sin\varphi\right]\right\},$$
where $d_x$ and $d_y$ are the element periods along the $x$ and $y$ directions, respectively; $\lambda_c$ is the central operation wavelength; $E_{pq}(\theta,\varphi)$ is the scattering pattern of the generic $(p,q)$th element; $\theta$ and $\varphi$ are elevation and azimuth angles, respectively; and $a_{pq}^{m}$ are the equivalent complex amplitudes, i.e., the Fourier-series coefficients of $\Gamma_{pq}(t)$. Here, $\Gamma_{pq}(t)$ is the space–time distribution of the local reflection coefficients with period $T_0=1/f_0$. The space–time-coding matrix [Figs. 4(b) and 4(c)] can be defined and optimized so as to deflect each harmonic frequency toward a predesigned direction by using the above theory.
Figures 4(d)–4(i) illustrate the distributions of the equivalent complex amplitudes (magnitudes and phases) and the corresponding harmonic scattering patterns, in which both amplitude-modulation (AM) and phase-modulation (PM) schemes are adopted. We observe that the AM scheme provides more uniform magnitude distribution but lower efficiency, while the PM scheme based on space–time gradients provides higher efficiency but rather unbalanced magnitude distribution among the harmonics. It is worth pointing out that, working with the 1-bit coding scheme, beam steering is possible only at harmonic frequencies, but not the central one. This restriction can be removed by using higher-bit coding schemes, such as 2-bit coding.
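To make the equivalent complex amplitudes more concrete, the sketch below computes the harmonic coefficients of one element whose reflection coefficient is piecewise constant over L equal time slots per modulation period, which is the Fourier-series coefficient of such a waveform. It is a minimal illustration under stated assumptions: ideal phase-modulation states +1 and −1 for codes “0” and “1”, and the function name `harmonic_amplitude` is hypothetical.

```python
import cmath
import math


def harmonic_amplitude(gammas, m):
    """Fourier coefficient at harmonic m of a periodic, piecewise-constant
    reflection coefficient with len(gammas) equal time slots per period."""
    L = len(gammas)
    x = math.pi * m / L
    sinc = 1.0 if m == 0 else math.sin(x) / x
    return sum(g / L * sinc * cmath.exp(-1j * math.pi * m * (2 * n - 1) / L)
               for n, g in enumerate(gammas, start=1))


# 1-bit phase modulation with the time sequence "01": the carrier (m = 0)
# cancels and the energy moves to the odd harmonics.
a0 = harmonic_amplitude([1, -1], 0)
a1 = harmonic_amplitude([1, -1], 1)

# Delaying a sequence by one time slot shifts the phase of harmonic m by
# -2*pi*m/L, which is how a spatial phase gradient is created at a harmonic.
base = [1, 1, 1, 1, -1, -1, -1, -1]          # time code "00001111"
delayed = base[-1:] + base[:-1]              # cyclic delay by one slot
ratio = harmonic_amplitude(delayed, 1) / harmonic_amplitude(base, 1)

print(abs(a0), abs(a1), math.degrees(cmath.phase(ratio)))
```

Assigning progressively delayed time sequences to neighboring elements therefore imposes a linear phase gradient at a chosen harmonic, which is consistent with the remark above that a 1-bit scheme steers the harmonics but not the central frequency.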
Figure 4. Space–time-coding digital metasurface. (a) Conceptual illustration. (b), (c) Examples of space–time coding matrices for harmonic beam steering. (d)–(g) Distributions of the equivalent magnitudes and phases based on amplitude and phase modulations. (h), (i) Corresponding far-field scattering patterns at harmonic frequencies for AM and PM. Adapted from Ref. , Copyright 2018, under a Creative Commons Attribution 4.0 International License.
Based on the fact that digital programmable metasurfaces can transmit digital messages without using complicated radio-frequency (RF) chains (e.g. mixers, antennas, and filters) in wireless communications, a new wireless communication scheme of space- and frequency-division multiplexing was further presented by using space–time-coding metasurfaces, as shown in Fig. 5(a). Different digital data streams are simultaneously transmitted to multiple designated users at different locations using different harmonic frequencies produced by the space–time-coding digital metasurfaces, hence implementing space- and frequency-division multiplexing wireless communications. More importantly, it is difficult to recover the correct information for unspecified users even if sufficiently sensitive receivers are used to receive all transmitted information. This feature ensures the security of wireless communications using the simple platform. A dual-channel wireless communication system based on the space–time-coding metasurface is fabricated to verify the feasibility of the space- and frequency-division multiplexing scheme, as illustrated in Fig. 5(b). The transmitter of the wireless communication system is composed of a control platform, a microwave signal generator, and a space–time-coding metasurface fed by a horn antenna; the receiver is composed of two horn antennas at and (serving as two users), downconverters, a software-defined radio (SDR) receiver, and a post-processing computer. Two transmitted color pictures are first translated into two different bitstreams by an on–off keying (OOK) modulation scheme and then mapped to the corresponding space–time-coding matrices. The transmitted data are reflected towards two directions ( and ) via different harmonic frequencies () simultaneously when the transmitted data are modulated by the space–time-coding metasurface with the corresponding space–time-coding matrices. 
The modulated waves can be received independently by the two horn antennas, and the two transmitted pictures can be accurately recovered via the SDR receiver. However, the transmitted pictures cannot be recovered even if the two horn antennas (user 1 and user 2) can receive EM waves with high-transmission powers if they are located at undesired positions. The proposed space- and frequency-division multiplexing wireless communication system based on the space–time-coding metasurface has great potential in future 6G applications.
Figure 5. Space- and frequency-multiplexing wireless communication system based on the space–time-coding metasurface. (a) Conceptual illustration. (b) Experimental scenario. Adapted from Ref. , Copyright 2021, under a Creative Commons Attribution 4.0 International License.
How to design a metasurface is one of the most important topics in this community, and it includes designing both the meta-atoms and the whole metasurface. A complete metasurface consists of many subwavelength meta-atoms that together exhibit various novel physical phenomena[172–174]. The basic problem in metasurface design is to optimize the meta-atom structure parameters to achieve the desired reflection and/or transmission properties. However, it is difficult to directly analyze the EM responses of meta-atoms, especially digital meta-atoms, because of the EM coupling effect and the active devices inside the meta-atoms[175–177]. The canonical design process of meta-atoms depends on full-wave simulation software using numerical algorithms to derive their EM characteristics and on the intuition of experts to adjust the structure parameters, which consumes a large amount of time and relies heavily on experience. Moreover, when a similar meta-atom with different EM responses is re-designed, the above optimization process has to be repeated from scratch, which raises the question of whether an auto-learning technology could extract past experience to accelerate the design procedure. A rudimentary approach involves using computer algorithms such as a genetic algorithm (GA) together with simulation software to automatically design the structure of meta-atoms[134,178,179]. This is basically a trial-and-error procedure that depends on a large number of simulations, but its fully automated execution by a computer saves researchers a lot of time. The main defect of this process is that the preceding simulation samples are not fully used, and the whole procedure has to be restarted if the optimization target is changed. From here, it is natural to think about exploiting the powerful learning capacity of deep-learning methods to make full use of the existing data and accelerate the whole design process[180–185].
Existing data could be gathered from past simulation work, but most researchers choose to build a dedicated training dataset by simulations and/or experiments for the learning process. Preparing the training data does consume considerable computing resources, but it is a one-time cost once the deep-learning networks are well trained. After that, the design process for meta-atoms of similar types can be greatly accelerated. The simulation process used to gain training data is usually a fully automated procedure executed by computers, and can be sped up many times by multithreading or distributed computing.
In this section, we discuss the intelligent designs of metasurfaces using machine-learning technologies. Before the metasurface is designed to fulfill a specific function, its meta-atom should first be designed to satisfy the required EM responses. Since the scale of the problem for the overall design of a metasurface is usually much bigger than that for a meta-atom, the intelligent designs of meta-atoms have gained more attention and have developed faster than the intelligent overall design of metasurfaces. Even so, various recent works on intelligent designs of metasurface arrays have shown higher accuracy and efficiency than canonical methods[48,88]. For this reason, the intelligent designs of both meta-atoms and metasurfaces are discussed below.
3.2 Meta-Atom Design Using Artificial Intelligence
One of the most significant matters for machine learning or deep learning is to tell the computer the form of the problem, in other words, to make the computer comprehend the problem. For the intelligent design of meta-atoms, the first step in the optimization process is inputting the geometry of the meta-atom into a computer. Directly inputting the CAD model of a meta-atom is highly discouraged because the CAD model is not intuitive to deal with and carries a great deal of redundant structure information. The recommended way is to parameterize the structure of the meta-atom and use the structural parameters as optimization variables. It is equally important to define the target of the optimization process in the form of a target function or error function. The process of predicting the value of the target function for a given meta-atom structure is called the forward process. Accordingly, the process of automatically generating the meta-atom structure for a given design objective is called the inverse process.
The most common optimization targets for meta-atom design are the desired S-parameters or complex reflection and transmission coefficients in specific frequency ranges. The optimization targets can be easily represented by discrete sampling points organized in vectors, whose number of samples and frequency coverage differ according to practical use. In the conventional design process, the wideband S-parameters or complex reflection/transmission coefficients of a meta-atom are obtained by running full-wave simulations on commercial simulation software such as CST Microwave Studio, and the simulation process takes the vast majority of time in an optimization procedure. For this reason, it is a worthwhile attempt to accelerate the simulation process through deep-learning methods. These deep-learning methods use various ANNs to learn the relationship between the structural parameters of a meta-atom and its discrete EM responses. These ANNs are called forwarding neural networks or prediction neural networks (PNNs). The learning process of a PNN is a black-box fitting process; hence, the network structure of a PNN has a high degree of freedom and may be built from diverse convolutional neural networks (CNNs) or recurrent neural networks (RNNs). Empirically speaking, the quality and quantity of the training dataset determine the upper limit of the prediction accuracy, and the network structure of the PNN determines how closely that upper limit can be approached. Therefore, it is encouraged to try diverse kinds of PNN structures with different numbers of layers/nodes, activation functions, and connection types among layers, until the desired prediction accuracy is obtained.
Since the appropriate structure of PNN relates to specific problems and training datasets, our discussion of PNN will not pay much attention to the network-structure design but will concentrate on the parametrization methods of the meta-atom’s geometry and its function combined with the inverse design methods in the whole intelligent design process.
We start from the intelligent designs of digital coding metasurfaces by means of classical machine-learning methods. Zhang et al. combined binary particle swarm optimization (BPSO) with commercial EM software to automatically find the paired or tetrad meta-atoms with constant phase differences for reflected waves, as shown in Fig. 6(a). The main issue of this fully automated method is how to represent the meta-atom geometry. Inspired by the idea of digital coding, the meta-atom made of four-fold symmetric square sub-blocks is represented by a binary matrix, where 1 or 0 indicates that the corresponding sub-block is or is not covered with metal, respectively. This coding representation makes the use of BPSO possible. Through the application programming interface (API) of the commercial software, CST Microwave Studio, the BPSO method executed in MATLAB could obtain the broadband reflection phases of the current meta-atoms. Then the fitness values and velocities of the particles could be calculated to guide the update of meta-atoms. As a result, a pair of meta-atoms whose reflection phase difference stays between 170° and 190° in the frequency band from 9.5 GHz to 10.3 GHz was finally obtained, as illustrated in Fig. 6(b). The reliability of this optimization result was experimentally verified by using the digital coding metasurface composed of these paired meta-atoms in beamforming applications. Zhang et al. also showed that this method could be extended to the automatic design of 16 ganged meta-atoms with 22.5° phase difference to achieve circularly or elliptically shaped radiation beams. In another work on machine-learning optimization, Samad et al. used the adaptive GA (AGA) to automatically design meta-atoms consisting of binary gold patterns with four-fold symmetry to generate specific reflection coefficients.
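The binary-matrix representation and the particle update described above can be sketched as follows. This is only an illustration under stated assumptions: the fitness function is a toy placeholder (number of bits matching a hidden target) standing in for the full-wave simulation, the swarm settings are arbitrary, and the helper names `quadrant_to_pattern`, `toy_fitness`, and `bpso` are hypothetical. The update rule is the standard sigmoid-based binary PSO, which may differ in detail from the cited implementation.

```python
import numpy as np

def quadrant_to_pattern(free_bits, q):
    """Expand the free bits (upper triangle of one q x q quadrant) into a
    2q x 2q binary pattern with mirror and 90-degree rotational symmetry."""
    quad = np.zeros((q, q), dtype=int)
    quad[np.triu_indices(q)] = free_bits
    quad = np.maximum(quad, quad.T)           # make the quadrant diagonal-symmetric
    top = np.hstack([quad, np.fliplr(quad)])  # mirror left to right
    return np.vstack([top, np.flipud(top)])   # mirror top to bottom

def toy_fitness(bits, target):
    # Placeholder for the EM solver: counts the bits that match a hidden target.
    return int(np.sum(bits == target))

def bpso(n_bits, fitness, n_particles=20, n_iter=60, w=0.7, c1=1.5, c2=1.5, seed=0):
    rng = np.random.default_rng(seed)
    x = rng.integers(0, 2, size=(n_particles, n_bits))  # binary positions
    v = np.zeros((n_particles, n_bits))                 # real-valued velocities
    pbest = x.copy()
    pbest_fit = np.array([fitness(p) for p in x])
    history = [pbest_fit.max()]
    for _ in range(n_iter):
        gbest = pbest[np.argmax(pbest_fit)]
        r1, r2 = rng.random(x.shape), rng.random(x.shape)
        v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
        v = np.clip(v, -6, 6)
        # a bit becomes 1 with probability sigmoid(velocity)
        x = (rng.random(x.shape) < 1 / (1 + np.exp(-v))).astype(int)
        fit = np.array([fitness(p) for p in x])
        improved = fit > pbest_fit
        pbest[improved] = x[improved]
        pbest_fit[improved] = fit[improved]
        history.append(pbest_fit.max())
    return pbest[np.argmax(pbest_fit)], history

q = 4
n_bits = q * (q + 1) // 2
target = np.random.default_rng(1).integers(0, 2, size=n_bits)
best_bits, history = bpso(n_bits, lambda b: toy_fitness(b, target))
pattern = quadrant_to_pattern(best_bits, q)
print(pattern)
print("best fitness per iteration:", history[0], "->", history[-1])
```

Encoding only the symmetry-free bits keeps the search space small and guarantees that every candidate is a valid symmetric pattern. In a real workflow the call to `toy_fitness` would be replaced by a solver run that returns, for example, the deviation of the reflection phase difference from its target over the band.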
It is also worth noting that research on the multiple mode coupling effect between adjacent meta-atoms designed by AI is quite important to the accurate manipulation of EM response. The efficiency of metasurfaces may be further improved when meta-atoms are automatically designed and arranged by AI in considering the coupling effect[190,191]. For example, different Bloch waves and propagating waves are carefully manipulated by means of the coupling effect between meta-atoms, hence realizing the design of perfect anomalous reflectors. The related research of the coupling effect between adjacent meta-atoms may further help the AI designs of meta-atoms and high-efficiency metasurfaces.
Figure 6. Intelligent designs of meta-atoms. (a) Flowchart of the BPSO algorithm together with the CST Microwave Studio. The BPSO algorithm controls the update of meta-atoms, and CST provides the reflection phases of the current meta-atoms. (b) Models and reflection phases and amplitudes of the paired meta-atoms with 90° phase difference. (c) Flowchart of DDQN method used to optimize the meta-atoms. The DDQN model predicted the optimal current update actions of meta-atom structure and material parameters by learning from the interactions with numerical simulations. (a), (b) Adapted from Ref. , Copyright 2017, with permission from Springer Nature, under a Creative Commons Attribution 4.0 International License. (c) Adapted from Ref. , Copyright 2019, with permission from Springer Nature, under a Creative Commons Attribution 4.0 International License.
As a classical iterative machine-learning method, reinforcement learning shows its powerful utility by learning the current update direction from interaction with the environment. With the development of deep learning, a novel reinforcement-learning method called a double deep Q-learning network (DDQN) was developed to accelerate the learning process; it addresses how to efficiently explore the given optimization space so as to reach the design target in the least time. Recently, Sajedian et al. proposed to use DDQN in meta-atom design, aiming to find the unit structures covering all transmission phases simultaneously with the highest efficiency, as shown in Fig. 6(c). They designed 16 different update actions, including changes of materials and adjustments of unit size. By interacting with numerical simulations, this DDQN can learn an optimal update strategy and find the optimal results in only 2169 update steps among nearly 5.7 billion possible candidates, which significantly reduces the number of simulations compared with the BPSO or AGA methods.
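The core of double Q-learning is the way the learning target is formed: one network chooses the next action and a second, slowly updated network evaluates it. The sketch below assumes the standard Double DQN target; the exact formulation used in the cited work is not given in this review. It uses random numbers in place of real network outputs, with 16 actions to mirror the 16 update actions mentioned above.

```python
import numpy as np

def double_dqn_targets(rewards, next_q_online, next_q_target, done, gamma=0.99):
    """Double-DQN learning targets for a batch of transitions.
    The online network selects the next action; the target network evaluates it."""
    best_actions = np.argmax(next_q_online, axis=1)
    evaluated = next_q_target[np.arange(len(rewards)), best_actions]
    # terminal transitions (done = 1) receive no bootstrapped future value
    return rewards + gamma * (1.0 - done) * evaluated

rng = np.random.default_rng(0)
n_actions = 16   # e.g., material changes and size adjustments
batch = 4
rewards = rng.normal(size=batch)                      # stand-in for simulated rewards
next_q_online = rng.normal(size=(batch, n_actions))   # stand-in for online-network outputs
next_q_target = rng.normal(size=(batch, n_actions))   # stand-in for target-network outputs
done = np.array([0.0, 0.0, 1.0, 0.0])
targets = double_dqn_targets(rewards, next_q_online, next_q_target, done)
print(targets)
```

Decoupling action selection from action evaluation reduces the overestimation of action values that plain Q-learning suffers from, which matters when every environment step costs a numerical simulation.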
These machine-learning methods (BPSO, AGA, and DDQN) can be classified as heuristic algorithms, which share the advantages of simplicity and effectiveness. Their obvious drawback is the large expenditure of time and computational resources. Even so, it is still worth applying them in metasurface optimization because they have a low barrier to use, and their automatic running procedure needs no human supervision, which frees researchers to do other work in the meantime and thus saves much time and energy.
The efficiency of the above-mentioned machine-learning methods is limited by the time-consuming simulation process executed in commercial EM software, especially when the scale of the problem is large. Luckily, as by-products of these methods, large datasets are generated and can be used as ingredients for deep-learning methods to train ANNs. Further work conducted by Zhang et al. trained a CNN to predict the reflection phases of two-fold anisotropic meta-atoms at specific frequencies when the coding metasurface is illuminated by transverse-electric (TE) and transverse-magnetic (TM) polarized waves. The zero-one representation of digital meta-atoms provides a natural parameterization of meta-structures and can be directly used as the input of CNNs. After the training process, the CNN could predict the reflection phases in several milliseconds with high accuracy and replace the commercial software in the optimization process, as shown in Fig. 7(a), which accelerates the whole BPSO procedure by at least three orders of magnitude. In fact, the prediction process of the CNN is so fast that global random search methods could be used to search the whole parameter space to find solutions. As an example, Christian et al. trained a PNN to predict the S-parameters of an all-dielectric metasurface consisting of a square array of cylindrical resonators, in which each unit-cell cylinder is parameterized by its radius and height. Then a fast forward dictionary search (FFDS) method was developed to find the appropriate unit-cell structure corresponding to the desired S-parameters in several hours with the aid of the PNN.
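Once a fast forward predictor is available, an exhaustive dictionary search over the parameter grid becomes practical. The following sketch illustrates the idea only: `toy_surrogate` is a hypothetical analytic stand-in for a trained PNN (a single transmission dip whose position and width depend on radius and height in arbitrary units), and the grid ranges are invented for the example.

```python
import numpy as np

def toy_surrogate(radius, height, freqs):
    """Hypothetical stand-in for a trained PNN: one transmission dip whose
    centre and width depend on the cylinder geometry (arbitrary units)."""
    f0 = 1.0 / (radius + 0.5 * height)
    width = 0.05 + 0.1 * radius
    return 1.0 - 1.0 / (1.0 + ((freqs - f0) / width) ** 2)

def dictionary_search(target, freqs, radii, heights):
    """Evaluate the surrogate on the full (radius, height) grid and return
    the geometry whose predicted spectrum is closest to the target."""
    R, H = np.meshgrid(radii, heights, indexing="ij")
    spectra = toy_surrogate(R[..., None], H[..., None], freqs)  # (n_r, n_h, n_f)
    err = np.mean((spectra - target) ** 2, axis=-1)
    i, j = np.unravel_index(np.argmin(err), err.shape)
    return radii[i], heights[j], err[i, j]

freqs = np.linspace(0.5, 2.0, 151)
radii = np.linspace(0.2, 0.6, 41)
heights = np.linspace(0.2, 1.0, 81)
target = toy_surrogate(radii[10], heights[30], freqs)  # spectrum to be matched
r_best, h_best, e_best = dictionary_search(target, freqs, radii, heights)
print(f"radius = {r_best:.3f}, height = {h_best:.3f}, mse = {e_best:.2e}")
```

Because every grid point is evaluated, the search cannot be trapped in a local minimum of the surrogate. Its cost grows with the grid size, which is acceptable only because each surrogate evaluation is far cheaper than a full-wave simulation.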
Figure 7. Illustration of design flowcharts for CNN. (a) Flowchart of the BPSO algorithm together with the prediction CNN to design the anisotropic coding meta-atom. The nearly real-time reflection-phase prediction of CNN accelerates immensely the whole procedure and makes it possible for simultaneous optimizations of TE and TM responses. (b) Contrastive flowchart of the design process of the REACTIVE method and the conventional metasurface design method. As a non-iterative method, REACTIVE could generate the probable digital meta-atom structures in seconds when given the design target of the reflection coefficients. (a) Adapted from Ref. , Copyright 2018, with permission from Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim. (b) Reprinted from Ref. , Copyright 2019, with permission from Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
The intelligent design processes described above are all based on iterative methods, which have reduced the optimization time from days to hours or minutes compared with canonical approaches. Taking full advantage of the data-fitting ability of ANNs, multiple works have been presented to construct a direct mapping from the optimization target to the meta-atom structural parameters using so-called inverse ANNs. As non-iterative methods, well-trained inverse ANNs can directly output the corresponding optimization results in seconds or even in real time when given self-defined optimization targets. As a typical case of an inverse ANN, Qiu et al. designed an ANN structure named REACTIVE to learn the mapping from the reflection coefficients to the corresponding four-fold coding meta-atoms represented by a binary matrix. In other words, the meta-atom structure can be efficiently predicted, as shown in Fig. 7(b). One of the difficulties in coding meta-atom prediction is that ANNs cannot directly output binary data. As an alternative, Qiu et al. let REACTIVE output vectors with element values between zero and one by adding a sigmoid activation function at the end of the neural network. These vectors could then be discretized into binary versions and reshaped into matrices of the required shape. Although the REACTIVE method has shown 90% prediction accuracy for binary elements in the top 30% of testing samples, it was not robust enough because, for one S-parameter, it could generate only one predicted structure of the corresponding meta-atom and had no remedial measure if this meta-atom did not satisfy the design target after verification by simulations or experiments. Also, two similar S-parameter curves in the training dataset may correspond to distinct meta-atom structures, which causes the one-to-many problem and worsens the performance of the training process.
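The output stage described above (a sigmoid followed by discretization and reshaping) can be sketched in a few lines. The random logits stand in for the final layer of a trained network, the 0.5 threshold is an assumption, and the 8 × 8 matrix size is a hypothetical choice, since the actual size is not stated here.

```python
import numpy as np

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

def binarize_output(logits, side, threshold=0.5):
    """Map raw network outputs to a binary coding pattern.
    Returns the 0/1 pattern and the underlying sigmoid values."""
    probs = sigmoid(np.asarray(logits, dtype=float))  # values in (0, 1)
    bits = (probs >= threshold).astype(int)           # assumed 0.5 threshold
    return bits.reshape(side, side), probs.reshape(side, side)

rng = np.random.default_rng(0)
side = 8  # hypothetical pattern size
logits = rng.normal(scale=3.0, size=side * side)      # stand-in for network outputs
pattern, probs = binarize_output(logits, side)
print(pattern)
print("least confident element:", np.abs(probs - 0.5).min())
```

Elements whose sigmoid value lies close to 0.5 are the ones most likely to be wrong after thresholding, so the distance from 0.5 gives a rough per-element confidence that could be used to flag doubtful predictions.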
To solve the robustness and one-to-many problems of inverse ANNs, Luo et al. developed a special inverse ANN structure called a probability-density-based network (PDN). Instead of directly outputting meta-structure parameters, PDN generates a mixture of Gaussian distributions, represented by the mixing coefficient, mean, and standard deviation of each output Gaussian, which indicates the likelihood of each structure parameter. By extracting the local maxima of the Gaussian mixture, several candidate structure-parameter samples can be obtained and then verified by simulations or experiments to pick out the optimal one. The multiple alternatives among the PDN output results reduce the risk of failure. Another approach to solving the one-to-many problem is retrieving the structure parameters from the well-trained PNN by concatenating an inverse ANN with the PNN to construct a variational auto-encoder (VAE) structure[184,185,195], as illustrated in Fig. 8(a). The network parameters of the PNN are fixed in the VAE training process, while the parameters of the inverse ANN are updated. After proper training, the inverse ANN can independently retrieve the relationship from the design target to the meta-atom structure. However, the above-mentioned VAE structure cannot generate multiple candidates for one design target, which lowers the success rate.
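The candidate-extraction step can be illustrated for a single structure parameter. In the sketch below the mixture weights, means, and standard deviations are invented values standing in for the network output, and the local maxima are located on a dense grid. This is one simple way to extract them and is not necessarily the procedure used in the cited work.

```python
import numpy as np

def mixture_pdf(x, weights, means, stds):
    """Probability density of a 1D Gaussian mixture evaluated at points x."""
    x = np.asarray(x)[:, None]
    comps = np.exp(-0.5 * ((x - means) / stds) ** 2) / (stds * np.sqrt(2 * np.pi))
    return comps @ weights

def candidate_parameters(weights, means, stds, lo, hi, n_grid=2001):
    """Return the local maxima of the mixture density, most likely first."""
    grid = np.linspace(lo, hi, n_grid)
    pdf = mixture_pdf(grid, weights, means, stds)
    # an interior grid point is a peak if it exceeds both neighbours
    interior = (pdf[1:-1] > pdf[:-2]) & (pdf[1:-1] > pdf[2:])
    peaks = np.where(interior)[0] + 1
    order = np.argsort(pdf[peaks])[::-1]
    return grid[peaks][order], pdf[peaks][order]

# Hypothetical network output for one design target
weights = np.array([0.6, 0.3, 0.1])
means = np.array([2.0, 5.0, 8.0])
stds = np.array([0.3, 0.4, 0.2])
cands, dens = candidate_parameters(weights, means, stds, 0.0, 10.0)
for c, d in zip(cands, dens):
    print(f"candidate parameter {c:.2f} with density {d:.3f}")
```

Each candidate would then be checked by simulation, starting from the most likely one, so that a failed first guess still leaves alternatives.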
Figure 8. Schematic diagrams of inverse ANNs. (a) Schematic diagram of an inverse ANN retrieving the relationship between spectrum response and meta-atom structure. After being well trained, the inverse ANN could directly output meta-atom structures with corresponding input spectrum responses. (b) Flowchart of GAN for inverse design of 2D meta-atoms with arbitrary patterns. The pre-trained PNN acted as a simulator that could form a VAE when concatenated to the generator. (c) Schematic diagram of an inverse-design GAN with latent space. A well-designed sampling strategy was used to sample data from this latent space as parts of the inverse generator’s input to guarantee the diversity of design results. (a) Adapted from Ref. , Copyright 2019, with permission from American Chemical Society. (b) Reprinted from Ref. , Copyright 2018, with permission from American Chemical Society. (c) Reprinted from Ref. , Copyright 2019, with permission from Wiley-VCH Verlag GmbH & Co. KGaA, Weinheim.
In the above discussions, meta-atoms or metasurfaces are represented by structure parameters, which restricts the freedom of design. Sajedian et al. found that using the 2D image of the meta-atom structure as the input of a PNN could successfully predict the corresponding optical properties, which means that the meta-atom could be represented by pixelated patterns. Inspired by this idea, generative adversarial networks (GANs), used widely in image generation, were introduced into the inverse design of meta-atoms with arbitrary 2D structures. As an intuitive case, Jiang et al. trained a generator that accepts the quantified design targets and normally distributed random numbers to output the desired meta-grating patterns. The discriminator is trained to distinguish the actual patterns sampled from the training set from the fake patterns produced by the generator, so as to guide the generator’s learning process. After being well trained, the generator could output multiple patterns corresponding to one design target by inputting different random numbers. These output patterns could then be screened by full-wave simulations to pick out the best result. Liu et al. and An et al. developed another kind of generator training strategy by joining a pre-trained PNN to the generator, forming a VAE structure, as shown in Fig. 8(b). As a result, the PNN made the spectrum of the generator’s output patterns conform to the design target, and the discriminator made these patterns fit the topological features. The generators in the above-mentioned works added random noise to the input to increase the diversity of the output patterns. Further, in the training process of GAN, Ma et al. encoded meta-atom patterns together with their corresponding optical responses (design targets) into a latent space, as illustrated in Fig. 8(c).
Then a dedicated sampling strategy was used to sample data from this latent space as part of the inverse generator’s input to guarantee the diversity of the design results. An interesting phenomenon was observed: the features extracted from meta-atoms with similar geometrical characteristics cluster together in the latent space, which is analogous to the clustering of word vectors in natural language processing. This phenomenon provides some interpretability for the feature-extraction behavior of prediction and inverse ANNs.
To summarize, great progress has been made in meta-atom designs using AI, where customized machine-learning algorithms are adopted to realize both forward design and inverse design, and high consistency is witnessed between prediction and simulation results. The algorithms show intrinsic superiority over manual tuning, especially in multi-dimensional space optimization and multi-sample output circumstances. However, there are still some limitations that impede their wide adoption. So far, the proposed algorithms, though representing the state of the art in each field, can tackle only the forward or inverse design problem for one specific structure with limited design freedom. Though methods including binary pixel patterns[196–198] alleviate the scarcity of freedom to some extent, difficulty is still encountered when new dimensions are explored in the design, which requires the network to be trained again with extra simulation time. Constraints on parameter selection and on the scope of automatic design are other defects. Discrete (selectable material dielectric, pixel binarization) and continuous (structural parameter) values sometimes need to be simultaneously fed into or extracted from the networks. In the inverse design, the predicted structure may be physically unrealizable due to out-of-scope structural parameters. Such circumstances compromise network performance and lower the matching success rate. In addition, iterative algorithms could converge to a local minimum, which may be attributed to an insufficient dataset, and strong local resonant points may be missed during the optimization process.
We envision that the trend for automatic meta-atom design lies in data sharing, increased task complexity, and transfer learning. First, a large-scale dataset should be built and shared, covering versatile meta-atom structures ranging from microwave to optical frequencies. The collective dataset would not only speed up algorithm development, but also provide a test benchmark for future algorithms. Second, more complicated and customized optimization targets for meta-atom responses could be defined. So far, the automatic designs for meta-atoms focus mostly on phase/amplitude manipulations at a single frequency. New meta-atom structures with extreme performance or innovative features could be explored using AI algorithms. For instance, multi-bit meta-atoms exhibiting uniform performance over an ultra-wide band or at multiple frequency points could benefit from fast AI algorithms. Active and non-linear performance could also be taken into account during the training process. Furthermore, new methods should be proposed to integrate human expertise into neural networks or other algorithms in automatic designs. Currently, human guidance is limited to key parameter selection and parameter bound setting. This means that once the design specification is given, the user has to choose a meta-atom structure prototype, determine the key parameters together with their permissible ranges, and then apply the aforementioned AI algorithms. In the future, the algorithm could replace humans in performing such selection tasks. Like the discriminator in a GAN, human expertise could intervene in the design process directly through a loss function or indirectly to prevent the network from converging to local minima or skipping strong resonant points.
3.3 Metasurface Pattern Design Using Artificial Intelligence
As discussed in the introduction section, the design of a whole metasurface is more difficult than that of a meta-atom by using machine-learning algorithms, and very limited research has been conducted on this topic. Here, we present some examples to design digital coding metasurfaces using AI technology. The coding metasurfaces consist of digital meta-atom arrays that exhibit various unique macroscopic phenomena by manipulating the EM responses of meta-atoms. Taking the 1-bit coding metasurface as an example, the states of meta-atoms are usually designed by iterative optimization algorithms, such as GA, particle swarm optimization, and the Gerchberg–Saxton (GS) algorithm. These iterative optimization algorithms can reach the design targets but are not efficient enough to meet real-time requirements in some specific applications such as real-time holography. Inspired by the real-time inverse design ability of the deep-learning method, inverse ANNs have gained more and more attention for designs of coding metasurfaces. Shan et al. trained a supervised CNN by learning the relationship between radiation pattern properties and coding states of a 1-bit coding metasurface. The training dataset was made of coupled samples consisting of parameters of single and dual beam patterns and their corresponding coding sequences or patterns obtained by executing backpropagation or GA. As a result, the supervised CNN can provide coding sequences or patterns that generate the desired beam patterns at the speed of milliseconds. Qian et al. recently showed a novel self-adaptive microwave cloak realized by a coding metasurface controlled by a pre-trained ANN, as shown in Fig. 9(a). This coding metasurface should have a rapid response to the incident wave and generate well-designed reflection EM responses to conceal the inner object, which demands a real-time inverse design ability from the desired EM responses to the bias voltages of the metasurface’s units. 
In addition, the inverse design method needs to be executed on miniaturized edge devices. These two restrictions (real-time response and limited computing hardware) make the iterative design method a nearly impossible choice. To solve the problem, Qian et al. trained a small fully connected ANN to learn the mapping from the needed reflection spectra together with the features of incident waves to the corresponding bias voltages of meta-atoms. Experimental results showed that the metasurface cloak controlled by the simple ANN was effective and could react to the ever-changing incident waves in a millisecond.
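The relationship between a coding sequence and its beam pattern, which supervised networks such as the CNN of Shan et al. learn to invert, can be computed directly in the forward direction. The sketch below is a simplified 1D model under stated assumptions: idealized elements with reflection phases of 0 and π, isotropic element patterns, normal incidence, and a hypothetical half-wavelength element spacing.

```python
import numpy as np

def array_factor(code_bits, d_over_lambda, theta_deg):
    """Normalized far-field magnitude of a 1D 1-bit coding sequence under
    normal incidence. Digit 0 -> phase 0, digit 1 -> phase pi (idealized)."""
    phases = np.pi * np.asarray(code_bits)
    n = np.arange(len(code_bits))
    theta = np.deg2rad(np.asarray(theta_deg))[:, None]
    # path-length phase of element n toward the observation angle theta
    steering = 2 * np.pi * d_over_lambda * n * np.sin(theta)
    field = np.sum(np.exp(1j * (phases + steering)), axis=1)
    return np.abs(field) / len(code_bits)

theta = np.linspace(-90, 90, 1801)
code = np.tile([0, 0, 1, 1], 8)         # sequence 0011 0011 ... (32 elements)
af = array_factor(code, 0.5, theta)     # assumed spacing: half a wavelength
peak_angles = theta[af > 0.99 * af.max()]
print("main beams near:", peak_angles.min(), "and", peak_angles.max(), "degrees")
```

With a coding period of four elements (two wavelengths), the two main beams appear near ±30°, where the sine of the angle equals the wavelength divided by the coding period, and the broadside reflection is cancelled. An inverse network has to learn the opposite mapping, from a desired beam pattern back to the coding sequence, for which no such closed-form expression is available in general.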
Figure 9. Intelligent design of the metasurface pattern. (a) Experimental setup of metasurface cloak controlled in real time by a pretrained ANN, which can learn the mapping from the needed reflection spectra together with the features of the incident wave to the corresponding bias voltages of the meta-atoms. (b) Schematic diagram of the physics-assisted unsupervised GAN for real-time holography. The generator together with the EM propagation process makes up the VAE structure. A discriminator was used to improve the imaging quality of the generator. (a) Adapted from Ref. , Copyright 2020, with permission from Springer Nature. (b) Reprinted from Ref. , Copyright 2021, with permission from Optica Publishing Group, under a Creative Commons Attribution 4.0 International License.
From the above discussions, we note that the deep-learning methods for designing digital coding metasurfaces are all supervised methods, which means that the training of ANNs depends on a large number of coupled samples. Such training samples are obtained from either full-wave simulations or experimental measurements, which requires much time and effort, posing a high obstacle to researchers. To solve the problem, Liu et al. recently developed a physics-assisted unsupervised GAN method to design the corresponding hologram of a 1-bit coding metasurface in real time when the target holographic image was given, as illustrated in Fig. 9(b). By using EM propagation formulas to construct forward mapping from the metasurface hologram to the corresponding holographic image, the generator responsible for the inverse design could be trained in an unsupervised VAE structure. The training target of this VAE was chosen to make the output image as similar as possible to the input one, so that arbitrary images could be used to train the generator without preparing their corresponding holograms. A discriminator was added in the training process to make the output images of VAE more topologically similar to the input. Experimental results show that the metasurface holograms designed by the trained generator have better imaging quality than those from the conventional GS algorithm. It is foreseeable that more physics-assisted deep-learning methods will be developed for intelligent designs of the whole metasurface, enhancing the interpretability and generalization ability of PNN and inverse ANNs.
4 Intelligent Metasurfaces
4.1 Information Metasurface Integrated with Machine-Learning Algorithm
In the above section, we discussed the intelligent designs of metasurfaces by using machine-learning and deep-learning algorithms, including designs for both meta-atoms and meta-atom arrays. However, AI may have a closer connection to information metasurfaces to make them smarter, which yields intelligent metasurfaces. Here, we first investigate information metasurfaces integrated with a machine-learning algorithm and their application in microwave imaging. Microwave imagers have been widely deployed in various scenarios, while the limited imaging rate, complicated reconstruction algorithms, and high-cost hardware remain bottlenecks for performance improvement and commercial usage. Also, the huge data stream makes microwave imagers ineffective for complicated in situ sensing and monitoring. To address these problems, Li et al. proposed a real-time digital-metasurface imager that utilizes principal component analysis (PCA) to guide the optimized measurement modes.
As recent research in machine learning has demonstrated, image data can be highly compressed and then reconstructed virtually flawlessly by means of feature extraction. It is therefore inferred that, instead of acquiring every pixel in each image-acquisition round, a limited number of feature patterns is sufficient for image reconstruction and analysis, which reduces the burden of both measurement and data transmission. As shown in Fig. 10(a), the original raw images (arranged as vectors) are linearly transformed (compressed) into lower-dimensional data. Here, the transformation matrix is generated through the PCA algorithm, which minimizes the information lost from the raw data in the transformation, and hence solid reconstruction performance is reached from the collected data.
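The linear compression and reconstruction described above can be sketched numerically. The example uses synthetic low-rank data in place of real microwave images, and all sizes are hypothetical. It shows only the algebra of PCA, whereas in the imager the rows of the transformation matrix are realized physically as measurement modes of the metasurface.

```python
import numpy as np

def fit_pca(images, n_components):
    """Return the mean image and the leading principal directions (rows)."""
    mean = images.mean(axis=0)
    _, _, vt = np.linalg.svd(images - mean, full_matrices=False)
    return mean, vt[:n_components]

def compress(images, mean, components):
    # project each image onto the principal directions
    return (images - mean) @ components.T

def reconstruct(codes, mean, components):
    # linear reconstruction from the compressed coefficients
    return codes @ components + mean

# Synthetic training set: images that really live in a 10-dimensional subspace
rng = np.random.default_rng(0)
n_samples, n_pixels, rank = 200, 400, 10
latent = rng.normal(size=(n_samples, rank))
mixing = rng.normal(size=(rank, n_pixels))
images = latent @ mixing + 0.01 * rng.normal(size=(n_samples, n_pixels))

mean, comps = fit_pca(images, n_components=rank)
codes = compress(images, mean, comps)
recon = reconstruct(codes, mean, comps)
rel_err = np.linalg.norm(recon - images) / np.linalg.norm(images)
print(f"{n_pixels} pixels -> {codes.shape[1]} coefficients, relative error {rel_err:.4f}")
```

For a given number of components, PCA gives the linear projection with the smallest mean-square reconstruction error on the training set, which is why far fewer measurements than pixels can suffice when the scenes share common structure.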
Figure 10. Reprogrammable metasurface imager integrated with machine-learning algorithm. (a) Schematic of the machine-learning algorithm. (b) Meta-atom structure and the metasurface. (c) Real-time imaging through a wall. (d), (e) Experimental measurements of different body gestures and the related imaging results. (f) Classification rate of different algorithms. Adapted from Ref. , Copyright 2019, under a Creative Commons Attribution 4.0 International License.
To realize machine-learning-based measurement modes in the EM domain, a 2-bit digital coding metasurface is designed and fabricated to dynamically manipulate incident waves, as illustrated in Fig. 10(b). In each meta-atom, four PIN diodes are individually controlled, so that the meta-atom exhibits four digital states relating to its phase responses for the incident wave at the center frequency, i.e., state 0 (0°), state 1 (90°), state 2 (180°), and state 3 (270°). Through proper mappings between the source field and near field, the coding patterns are determined for each measurement mode and loaded to the metasurface periodically at a maximum clock rate of 64 Hz, hence building up the real-time metasurface imager, as demonstrated in Fig. 10(c).
Figures 10(d) and 10(e) demonstrate the imaging capabilities of the proposed imager. From the reconstructed results shown in Fig. 10(e), the subject’s body gestures can be clearly discerned, even when the subject is blocked by a paper wall. A pair of red plastic scissors, simulating a dangerous target, is tied to the subject and is successfully detected by the imager. Note that each image is collected under 400 measurement modes, far fewer than the 8000 pixels of the image, which is equivalent to a compression rate of 95%. Experiments are also conducted to verify the recognition ability of the imager. Three categories of actions are chosen for classification, i.e., standing, bending, and raising arms. In the experiments, 60 measurement modes are designed based on the specific image sets, and Fig. 10(f) depicts the relationship between the accuracy rate and the number of measurements. Based on the curves, the PCA-based modes classify much better than random projection and quickly approach the ideal result with only a small number of measurements. It is worth pointing out that the compressed data are used directly as raw data for the classification task, and the classification accuracy reaches its upper limit once the first 25 principal components are collected. With the help of the CNN algorithm, the electronically controlled metasurface imagers would hopefully extend the venues for fast data acquisition and processing to reach intelligent surveillance.
4.2 Information Metasurface Integrated with Multiple Convolutional Neural Networks
Remarkable progress has been made in recent years using CNNs to achieve state-of-the-art results on computer vision tasks, such as object detection and image processing. The great success of the CNN relies on its ability to explore spatial relationships using so-called kernels, which scan the full image to extract feature information. This gives the CNN the capacity to develop an internal representation of a high-dimensional image with parameter sharing. Therefore, the CNN architecture significantly reduces the number of parameters to be trained in comparison with a fully connected neural network, contributing to faster convergence and a more compact model.
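The effect of parameter sharing can be made concrete with a short count. The sketch below compares one convolutional layer with a fully connected layer that produces an output of the same size; the layer sizes (a 32 × 32 single-channel input, eight 3 × 3 kernels, output kept at the input resolution) are hypothetical values chosen for illustration and do not come from the reviewed works.

```python
def conv_params(kernel, c_in, c_out):
    """Trainable values in a 2-D convolution: shared kernels plus one bias per output channel."""
    return kernel * kernel * c_in * c_out + c_out

def dense_params(n_in, n_out):
    """Trainable values in a fully connected layer: one weight per input-output pair plus biases."""
    return n_in * n_out + n_out

height = width = 32                      # hypothetical image size
conv = conv_params(kernel=3, c_in=1, c_out=8)
dense = dense_params(height * width * 1, height * width * 8)
ratio = dense // conv
```

With these assumed sizes the convolutional layer needs 80 trainable values, whereas the equivalent fully connected layer needs several million, which is the reduction the paragraph above refers to.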
Figure 10 shows that the information metasurface integrated with a CNN algorithm has successfully realized an intelligent microwave imager. If the information metasurface is combined with a series of CNN algorithms, even more intelligent devices can be produced. In fact, intelligent sensing has proved to be of great benefit to human beings without intruding into people’s normal lives and privacy, or burdening people with any active devices or identification tags, as seen in Fig. 11(a). However, the conventional smart sensing devices designed for the Internet of Things (IoT) and cyber physical systems (CPSs) are either designed for a fixed function without adaptive ability or too complicated and expensive, with excessive peripherals. Recently, Li et al. presented a smart microwave metasurface imager and recognizer, called a microwave camera, which is empowered by a series of ANNs for adaptively controlling data flow and automatically identifying targets. Experiments have demonstrated the metasurface’s multi-functionalities of imaging, gesture recognition, and respiration monitoring.
Figure 11. Intelligent microwave imager and recognizer. (a) Application scenario of the intelligent metasurface. (b) System architecture of the intelligent metasurface. (c) Meta-atom responses at different digital states. (d) Programmable manipulations of EM focusing for different functions including hand gestures and vital inspection. Adapted from Ref. , Copyright 2019, with permission from Springer Nature, under a Creative Commons Attribution 4.0 International License.
Figure 11(b) illustrates the overall structure of the intelligent metasurface integrated with deep-learning techniques. The reflective digital metasurface, whose dynamic frequency response is shown in Fig. 11(c), manipulates the source signal and functions as an intelligent sensing probe. The intelligent metasurface has two operational modes: active and passive. Compared with the traditional methods of modeling and analyzing the characteristics of EM environments, the ANN-based method is more efficient in terms of computational cost, insensitive to background or surroundings, and trainable for various scenes, and hence is more easily deployed.
Based on the intelligent metasurface, two specific tasks are performed with different CNN modules, i.e., gesture recognition and respiration monitoring. The collected microwave data first go through the IM-CNN-1 imaging network, which converts the raw data into a real-time body image, as shown in the right two insets in Fig. 11(d). Next, the faster R-CNN is applied to find the region of interest (ROI) in the whole image, as marked by the red rectangle in the two insets. An adapted Gerchberg–Saxton (GS) algorithm then calculates the optimized digital coding patterns of the metasurface to focus the beam on the target area, for instance, the hand for sign-language recognition or the chest for respiration monitoring. Finally, the collected microwave data are passed to a dedicated network for analysis in each function mode. Specifically, human breathing is identified by a time–frequency domain analyzer, and another CNN, IM-CNN-2, processes the data to recognize the hand sign.
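The beam-focusing step can be illustrated with a much simpler stand-in for the adapted GS algorithm: direct phase conjugation followed by quantization to four phase states. The sketch below assumes unit-amplitude meta-atoms under uniform illumination and a 2-bit phase code; the wavelength, array size, and ROI position are hypothetical, and the function names are introduced only for this example.

```python
import cmath
import math

def focusing_codes(atoms, focus, wavelength):
    """2-bit code per meta-atom that conjugates the path phase toward the focus."""
    k = 2 * math.pi / wavelength
    codes = []
    for atom in atoms:
        ideal = (k * math.dist(atom, focus)) % (2 * math.pi)
        codes.append(round(ideal / (math.pi / 2)) % 4)   # nearest of 0, 90, 180, 270 degrees
    return codes

def field_at(atoms, codes, point, wavelength):
    """Sum of unit-amplitude contributions of all meta-atoms at `point`."""
    k = 2 * math.pi / wavelength
    return sum(cmath.exp(1j * (c * math.pi / 2 - k * math.dist(a, point)))
               for a, c in zip(atoms, codes))

wavelength = 0.125                      # metres, hypothetical
pitch = wavelength / 2
atoms = [(ix * pitch, iy * pitch, 0.0) for ix in range(16) for iy in range(16)]
roi_center = (0.6, 0.3, 1.0)            # hypothetical ROI centre, e.g. the hand

codes = focusing_codes(atoms, roi_center, wavelength)
focused = abs(field_at(atoms, codes, roi_center, wavelength))
unfocused = abs(field_at(atoms, [0] * len(atoms), roi_center, wavelength))
```

Since 2-bit quantization leaves a phase error of at most 45° per meta-atom, the focused field magnitude is at least about 71% of the ideal coherent sum. Comparing `focused` with `unfocused` gives a rough sense of the focusing gain for the assumed geometry.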
From the plotted results, the subject’s respiration rate is around 0.28 Hz. The deviant state is also clearly distinguished when the subject is asked to hold his breath. The accuracy rate of hand gesture recognition reaches 96% under the 10 categories, six of which are illustrated in the insets. The sign-language rate of the human hand and the respiration rate are of the order of 10–30 bit/s, which is significantly slower than the switching speed of the digital coding patterns. Therefore, the intelligent metasurface has the potential to fulfill more sophisticated tasks in the future, including lip reading and human-mood recognition.
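How a respiration rate such as 0.28 Hz can be read from a reflected signal is sketched below. The reviewed work uses a time–frequency domain analyzer; this example substitutes a plain discrete Fourier transform peak search on a synthetic signal, so the sampling rate, duration, amplitudes, and noise level are all assumptions made for illustration.

```python
import cmath
import math
import random

def dominant_frequency(samples, fs, f_min=0.05):
    """Frequency in Hz of the largest DFT magnitude above f_min."""
    n = len(samples)
    mean = sum(samples) / n
    x = [s - mean for s in samples]               # remove the static (DC) reflection
    best_bin, best_mag = 0, -1.0
    for b in range(1, n // 2):
        if b * fs / n < f_min:
            continue
        coeff = sum(v * cmath.exp(-2j * math.pi * b * i / n) for i, v in enumerate(x))
        if abs(coeff) > best_mag:
            best_bin, best_mag = b, abs(coeff)
    return best_bin * fs / n

fs, duration, breath_hz = 10.0, 50.0, 0.28        # hypothetical acquisition settings
rng = random.Random(0)
n = int(fs * duration)
signal = [1.0 + 0.5 * math.sin(2 * math.pi * breath_hz * i / fs) + rng.gauss(0, 0.05)
          for i in range(n)]
rate = dominant_frequency(signal, fs)
```

A breath-holding interval would appear in such an analysis as a time window in which the peak near the breathing frequency disappears, which is why a time-resolved analysis is used in practice.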
4.3 Information Metasurface Integrated with Sensor and Onsite Algorithm
As discussed above, most programmable metasurfaces focus mainly on the manipulation of EM waves and rely on human beings to issue instructions. To attain smarter functionalities and make decisions by itself, a metasurface should be able to sense and collect the essential information on which a decision is based. Here, we review two smart metasurfaces with intelligent sensing functions. Figures 12(a) and 12(b) schematically demonstrate the metasurfaces with self-adaptively reprogrammable functionalities proposed by Ma et al., in which multiple sensors, a microcontroller unit (MCU), and an FPGA are integrated to construct a closed-loop sensing feedback system for automatic decision making. In the satellite-communication example considered by the authors, the position of the satellite is assumed to be relatively fixed, since the satellite is very far from the aircraft carrying the metasurface. When the spatial posture of the aircraft changes, the gyroscope sensor sends posture data to the MCU, which drives the metasurface to steer the EM beam toward the satellite. When other features of the environment around the metasurface change, such as light intensity, humidity, and temperature, the integrated sensors promptly detect the variations and send the corresponding information to the MCU. Subsequently, the MCU, running a feedback algorithm, automatically processes all sensing information and outputs instructions that drive the FPGA to adaptively update the digital coding sequences or patterns in real time, so that the metasurface can perform different functions without human instructions.
Figure 12. Smart metasurfaces with self-adaptive capabilities. (a), (b) Conceptual illustration of self-adaptively smart metasurfaces. (c) Smart sensing metasurface. (a), (b) Adapted from Ref. , Copyright 2019, with permission from Springer Nature, under a Creative Commons Attribution 4.0 International License. (c) Reprinted from Ref. , Copyright 2020, under a Creative Commons Attribution 4.0 International License.
Figures 12(a) and 12(b) conceptually demonstrate an example of a satellite-communication scenario with an airplane equipped with a smart metasurface, in which a gyroscope sensor is integrated. The embedded gyroscope sensor can instantaneously acquire the changing spatial orientation of the metasurface and send the corresponding information when the flying airplane changes its attitude in the sky. The required beam deflection direction changes accordingly, and the digital coding pattern for the original deflection is replaced by the coding pattern for the new deflection, ensuring that the radiation beam always points to the satellite. The fast inverse design algorithm loaded in the MCU can quickly calculate the desired coding pattern. To obtain higher beam-scanning accuracy, the coding sequence for the target beam deflection angle is further decomposed in the proposed algorithm into two coding subsequences with different beam deflection angles.
In addition, error analysis functions are designed to determine the best coding length and to acquire the final coding patterns. Owing to the high-accuracy and high-speed algorithm in the MCU, the metasurface can precisely radiate the EM beam in the direction of the satellite and maintain good communications between the airplane and the satellite in real time. Beyond the satellite communication scenario, the application scenarios can be greatly broadened by integrating heat sensors, humidity sensors, light sensors, height sensors, and other sensors into the smart metasurface. For instance, a light sensor can obtain the intensity percentage by detecting the visible light, and the smart metasurface reacts to different light intensities by generating a dual-beam scattering pattern in the bright condition and reducing the radar cross section (RCS) in the dark condition. The smart metasurface with self-adaptively reprogrammable functions will further promote intelligent devices and systems.
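The closed loop from gyroscope reading to coding sequence can be sketched as follows. This is a simplified one-dimensional illustration, not the inverse-design algorithm of the reviewed work: it assumes normal incidence, a linear phase gradient quantized to 2 bits, a sign convention in which positive angles correspond to increasing phase along the array, and a tilt that simply subtracts from the satellite direction. The pitch and wavelength are hypothetical, and the subsequence decomposition and error analysis described above are omitted.

```python
import math

def compensated_angle(satellite_angle_deg, platform_tilt_deg):
    """Deflection angle needed in the metasurface frame after the platform tilts."""
    return satellite_angle_deg - platform_tilt_deg

def coding_sequence(angle_deg, n_atoms, pitch, wavelength, bits=2):
    """Quantized linear phase gradient that deflects a normally incident beam."""
    levels = 2 ** bits
    step = 2 * math.pi / levels
    k = 2 * math.pi / wavelength
    codes = []
    for n in range(n_atoms):
        phase = (k * n * pitch * math.sin(math.radians(angle_deg))) % (2 * math.pi)
        codes.append(round(phase / step) % levels)
    return codes

def on_gyroscope_update(satellite_angle_deg, tilt_deg):
    """One closed-loop step: sensor reading in, new coding sequence out."""
    angle = compensated_angle(satellite_angle_deg, tilt_deg)
    return coding_sequence(angle, n_atoms=8, pitch=0.5, wavelength=1.0)

level_flight = on_gyroscope_update(30.0, 0.0)
after_tilt = on_gyroscope_update(30.0, 30.0)
```

In this toy setting a 30° tilt toward the satellite turns the periodic sequence for a 30° deflection into an all-zero sequence, because the beam no longer needs to be deflected in the metasurface frame.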
The above-mentioned smart metasurfaces with multiple sensors sense one target and manipulate a different one; they cannot sense and manipulate the same object. Further, a smart sensing metasurface with self-defined functions was presented that can simultaneously achieve sensing and manipulation of the same object. As schematized in Fig. 12(c), the smart sensing metasurface is integrated with sensing units and executing units, in which the sensing units detect the power levels of incident waves and the executing units perform the wave manipulations under different polarizations. The incident-wave energy received by a sensing unit is coupled from the top metal patch to the detecting circuit on the bottom through a via-hole along the x or y axis, and is then converted to a DC voltage (0.15–0.3 V) by an RF power detector (LTC5530). The DC voltages are read by the MCU, and the data are simultaneously transferred to the FPGA. After the sensing data are processed by the FPGA with a pre-loaded algorithm, various digital coding patterns for incident waves with different polarizations are implemented by controlling the executing units. Thus, a smart sensing metasurface with self-defined functions in dual-polarization modes can sense and manipulate scattering fields simultaneously.
5 Programmable Artificial Intelligence Machine
5.1 Neural Network Hardware by 3D-Printed Metamaterials
In the above section, AI collaborates with information metasurfaces as software. Here, we introduce hardware collaborations. Recently, to explore new architectures for AI computing hardware, various optical neural network hardware platforms have been reported, yielding faster computing speeds and lower energy consumption. In 2018, Lin et al. proposed an all-optical diffractive deep neural network (D2NN) for machine learning, as depicted in Figs. 13(a) and 13(b). The multi-layer neural network is established based on five layers of optical metamaterials, fabricated using 3D printing technology. When coherent light passes through the input layer and irradiates the learning layers, the diffraction between the pixel blocks of adjacent layers, which follows the Huygens–Fresnel principle, connects the layers in a manner similar to a fully connected neural network, as shown in Fig. 13(a). The transmission coefficient of each pixel can be designed, and is considered as a multiplicative bias term in an equivalent neural network. Therefore, the forward propagation of light in the presented network is intrinsically similar to the forward-propagation computing of a fully connected neural network. More importantly, forward propagation at the speed of light enables ultra-fast network computing capability and ultra-low power loss due to its passive structure. Compared with traditional electronic chips, its computing power consumption and energy efficiency have significant advantages. Such a learnable network is trained theoretically in computers by adjusting the transmission coefficients of pixels, using the error backpropagation method. Input and output optical fields are demonstrated in Figs. 13(c) and 13(d), where the detector regions are marked with red dashed lines. To prove the performance of the D2NN, the authors trained the device for image recognition based on the MNIST dataset, in which 55,000 images are used for training and 10,000 for testing. Classification accuracy of 91.75% has been achieved, as shown in Figs. 13(e) and 13(f).
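The forward pass of such a diffractive network can be sketched numerically. The example below is a toy one-dimensional model under several assumptions: each layer first multiplies the field by its pixel transmission coefficients and then propagates it to the next plane, the propagation kernel is a simplified spherical wave exp(jkr)/r without an obliquity factor, the transmission phases are random rather than trained, and all dimensions are hypothetical.

```python
import cmath
import math
import random

def propagation_matrix(n, pitch, distance, wavelength):
    """Simplified Huygens-Fresnel kernel between two parallel rows of n pixels."""
    k = 2 * math.pi / wavelength
    W = []
    for i in range(n):
        row = []
        for j in range(n):
            r = math.hypot(distance, (i - j) * pitch)
            row.append(cmath.exp(1j * k * r) / r)
        W.append(row)
    return W

def layer(E, t, W):
    """Modulate the field by the pixel transmissions, then diffract to the next plane."""
    modulated = [tj * ej for tj, ej in zip(t, E)]
    return [sum(w * m for w, m in zip(row, modulated)) for row in W]

def d2nn_forward(E, transmissions, W):
    """Cascade all layers; the result is the complex field on the output plane."""
    for t in transmissions:
        E = layer(E, t, W)
    return E

def detector_energies(E):
    """Detectors measure intensity, which is the only nonlinear step in this model."""
    return [abs(e) ** 2 for e in E]

rng = random.Random(0)
n = 5
W = propagation_matrix(n, pitch=1.0, distance=10.0, wavelength=0.75)
layers = [[cmath.exp(2j * math.pi * rng.random()) for _ in range(n)] for _ in range(2)]

E1 = [1, 0, 0, 0, 0]
E2 = [0, 1, 1, 0, 0]
energies = detector_energies(d2nn_forward(E1, layers, W))
predicted = max(range(n), key=energies.__getitem__)   # detector with the most energy
```

In this model the complex field at the output is a linear function of the input field, so the output for `E1 + E2` equals the sum of the individual outputs. Training, which is not shown here, would adjust the entries of `layers` so that the brightest detector matches the class label.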
Figure 13. Diffractive deep neural networks. (a), (b) All-optical diffractive deep neural networks (D2NNs) based on 3D-printed materials. (c), (d) Input and output optical fields, where the detector regions are marked with red dashed lines. (e), (f) Confusion matrix and energy distribution percentage for 10,000 different handwritten digits. (g) Nanophotonic media for artificial neural inference. (h) Schematic of DPU. (i) Experiment of DPU. (a)–(f) Adapted from Ref. , Copyright 2018, under a Creative Commons Attribution 4.0 International License. (g) Reprinted from Ref. , Copyright 2021, with permission from Optica Publishing Group, under a Creative Commons Attribution 4.0 International License. (h), (i) Adapted from Ref. , Copyright 2020, with permission from Springer Nature.
In addition to the optical neural networks that propagate in 3D form, diffractive neural networks in planar forms have also been proposed. Based on some control structures for 2D light fields[206–210], such optical neural networks can achieve good performance. In 2019, Xu et al. proposed a planar-form diffractive neural network, as illustrated in Fig. 13(g), in which light waves are diffracted in a layer of a micro-nano-fabricated silica photonic medium. The wavefront propagation behaviors of the light waves are similar to the signal transmissions in neural networks. The researchers conducted training and validation using the handwritten image dataset and achieved a prediction accuracy of over 79%.
5.2 Programmable Artificial Intelligence Machine
Although all-optical diffractive neural networks have been demonstrated using 3D and planar structures with excellent performance, they have fixed functions once they are fabricated, in which the training process is still performed on conventional computers. That is to say, these diffractive neural networks are unprogrammable. In 2021, Zhou et al. further presented a reconfigurable optical diffraction processing unit (DPU). They employed a series of optical devices such as digital micromirror devices (DMDs), spatial light modulators (SLMs), and complementary metal–oxide–semiconductor (CMOS) sensors to form a one-layer programmable optical network, as shown in Figs. 13(h) and 13(i). Using the programmability of the large-scale light reflection switch array of digital optical micromirrors, the light source can be precisely controlled. A multi-layer optical neural network was then simulated by feeding the output of this one-layer programmable optical network into its input port with the aid of an electronic circuit, which simulates an RNN structure. The multi-layer optical neural network was trained and tested using the MNIST dataset and achieved a blind-testing accuracy of 97.6%. This scheme opens a new path to realize the programmability of large-scale diffractive neural networks.
However, the above work was based on a one-layer programmable optical network: the multi-layer optical neural network had to be realized with the help of electronic chips, and the training process was conducted in a computer, which weakens the advantage of light-speed calculation in diffractive neural networks. To solve the problem completely, Liu et al. proposed a fully programmable AI machine (PAIM) using multiple layers of information metasurfaces, which can directly receive EM waves in free space, and achieve direct calculations in wave space by adjusting the transmission gains of meta-atoms with a wide dynamic control range. Moreover, the characteristics of light-speed calculation are maintained in the fully programmable diffractive architecture, which significantly expands its application potential. For experimental demonstrations, a prototype with five programmable layers is designed and fabricated with various functions including programmable image recognition, automatic beam focusing, and wireless communications, as illustrated in Fig. 14.
Figure 14. Programmable artificial neural network and its image recognition. (a) PAIM structure composed of multi-layer information metasurfaces. (b) Diffractive illustration of PAIM. (c), (d) Image recognition of oil paintings for landscape and portraiture. Adapted from Ref. , Copyright 2022, with permission from Springer Nature.
PAIM imitates the inhibition and amplification of neurotransmitters by human brain neurons through meta-atoms, and makes use of multi-layer cascaded information metasurfaces to simulate the neural network, in which the propagation of EM waves in space is used to simulate the connection between neurons, as illustrated in Figs. 14(a) and 14(b). The fully connected characteristic of the network depends on how the energy emitted by each meta-atom of one layer is distributed over the surface of the next layer. This distribution is the result of meta-atom energy transmission in free space, including the path loss and energy differences in different radiation directions. Each node (i.e., meta-atom) shown in Fig. 14(a), integrated with amplifiers, can be programmed by controlling the bias voltage. Therefore, the transmitted EM waves of each meta-atom can be controlled independently to achieve the particular weight distribution. The following equation gives the numerical relationship between the electric fields at adjacent layers:

$\mathbf{E}^{(l+1)} = \mathbf{W}^{(l)}\left(\mathbf{T}^{(l)} \circ \mathbf{E}^{(l)}\right), \quad l = 1, 2, \ldots, L, \qquad (7)$

in which $\mathbf{E}^{(l)}$ and $\mathbf{E}^{(l+1)}$ are the complex fields on the $l$th and $(l+1)$th planes in the form of vectors, and $L$ is the total layer number; $\mathbf{W}^{(l)}$ is the fixed propagation matrix between nodes at the $l$th and $(l+1)$th planes that can be derived from the Huygens–Fresnel principle; $\circ$ is the Hadamard product; and the vector $\mathbf{T}^{(l)}$ represents the complex transmission coefficients of the $l$th layer, which can be tuned through the external bias voltage. Equation (7) closely resembles the forward propagation in a fully connected network, in which the output of each layer is a weighted sum of its inputs.
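The similarity to a fully connected layer can be made explicit with a short worked identity. It is written under the assumption that each layer first applies its transmission coefficients and then propagates the field to the next plane; the symbols E (field vector), W (fixed propagation matrix), and T (transmission vector) are chosen here for illustration, and the second line is the generic textbook form of a fully connected layer with weight matrix Ω, bias b, and activation σ.

```latex
\begin{aligned}
\mathbf{E}^{(l+1)} &= \mathbf{W}^{(l)}\bigl(\mathbf{T}^{(l)} \circ \mathbf{E}^{(l)}\bigr)
  = \underbrace{\mathbf{W}^{(l)}\,\mathrm{diag}\bigl(\mathbf{T}^{(l)}\bigr)}_{\text{effective weight matrix}}\;\mathbf{E}^{(l)}, \\
\mathbf{a}^{(l+1)} &= \sigma\bigl(\boldsymbol{\Omega}^{(l)}\,\mathbf{a}^{(l)} + \mathbf{b}^{(l)}\bigr).
\end{aligned}
```

Read this way, the field vector plays the role of the activations and the product of the propagation matrix with the diagonal transmission matrix plays the role of the weight matrix. The difference is that a layer of N meta-atoms offers only N tunable complex values, which scale the columns of a fixed matrix, whereas a general fully connected layer has N × N free weights.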
In addition, the multi-layer abstract processing mechanism of information makes the PAIM not only a neural network simulator, but also a direct processing device for the microwave signal. Owing to its fast on-site programmability, the PAIM demonstrates a number of functions, including image recognition, beam focusing based on reinforcement learning, and wave-space communication codec with a denoising function. The examples shown in Figs. 14(c) and 14(d) present the image recognition between landscape and portraiture. The input image is first pixelated and gray-scaled to form a mask. Then the input field is formed by the radiation of incident EM waves, and specific energy distribution is constructed during EM propagation inside PAIM. Finally, it is determined whether the image is a landscape or a portrait through the energy distribution of the output layer.
In addition to intelligent image recognition, the proposed PAIM can also realize information transmissions based on code-division multiple access (CDMA) schemes, as depicted in Fig. 15(a). In this scheme, four user codes are designed and input by the first programmable layer, which is regarded as an encoder. The other four layers, as decoders, will guide the energy of the user codes into corresponding directions, which are sensed by the four horn antennas on the receiving plane. The weight distribution of the decoding network is trained to recognize whether each user code is sent or not. It should be noted that the decoding target is to distinguish four close energy inputs, which may cause severe inter-symbol interference. Therefore, based on this decoder, multiple user codes are transmitted with low interference in a very small space. Based on the CDMA scheme, a wireless communication prototype is further established, as shown in Fig. 15(b). A receiving antenna array, integrated with analog-to-digital conversion (ADC) and FPGA, is placed on the receiving plane to detect the electric field energy value at the receiving antenna position corresponding to each user code. AM is used for communication signal modulation. Specifically, in a certain clock interval, when a high level is detected at the receiving antenna position corresponding to a certain user code, the binary information transmitted by this user in the current clock interval is “1,” and otherwise “0.” Since the four user codes can be transmitted independently in PAIM, we can transmit four signals simultaneously in the same channel. If different user codes transmit different parts of the same picture, then the transmission rate will be increased by four times. A badge image of the State Key Laboratory of Millimeter Waves at Southeast University with binary pixels was transmitted and tested, achieving a transmission error rate of 0.52%. 
As a comparison, the image transmission experiment was also conducted by removing the decoding part of PAIM, in which 49.02% of the pixels were not successfully received, indicating that the user inter-symbol interference becomes very significant after removing the PAIM.
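The amplitude-modulated readout and the fourfold rate increase can be sketched with an idealized channel model. In the example below the receiving antennas see either a high or a low energy level with no interference or noise, so the decoded bits match the transmitted ones exactly; the energy levels, threshold, and pixel stream are hypothetical and the function names are introduced only for this illustration.

```python
def split_across_users(bits, n_users=4):
    """Group a bit stream into clock intervals, one bit per user code."""
    return [bits[i:i + n_users] for i in range(0, len(bits), n_users)]

def transmit(frames, high=1.0, low=0.1):
    """Idealized AM channel: each receiving antenna sees a high or a low energy."""
    return [[high if b else low for b in frame] for frame in frames]

def demodulate(energies, threshold=0.5):
    """Threshold detection per antenna: high level gives 1, otherwise 0."""
    return [[1 if e > threshold else 0 for e in frame] for frame in energies]

def bit_error_rate(sent, received):
    """Fraction of bits that differ between the sent and received streams."""
    errors = sum(s != r for s, r in zip(sent, received))
    return errors / len(sent)

pixels = [1, 0, 1, 1, 0, 0, 1, 0, 1, 1, 1, 0]      # hypothetical binary image pixels
frames = split_across_users(pixels)
decoded = [b for frame in demodulate(transmit(frames)) for b in frame]
error_rate = bit_error_rate(pixels, decoded)
```

Twelve pixels are sent in three clock intervals instead of twelve, which is the fourfold rate increase mentioned above. Without a decoder that separates the user codes, the energies at the four antennas would overlap and the threshold decision would fail for a large fraction of the bits.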
Figure 15. Encoder and decoder in the CDMA scheme and its communication experiment using PAIM. (a) Schematic of CDMA scheme using PAIM, where the first layer and the last four layers are assigned as encoder and decoders, respectively. (b) Experiment of image transmission using the presented CDMA scheme and PAIM. Adapted from Ref. , Copyright 2022, with permission from Springer Nature.
We remark that the proposed PAIM has realized the first system for programmable processing, light-speed calculation of AI, wireless sensing, and communications in microwave space, and can find wide applications in intelligent radar and new-generation communication systems after further miniaturization and intensification. For instance, the deployed PAIM in radar systems could perform the subject recognition task directly in the microwave domain and substitute conventional Tx/Rx modules together with the digital processing unit. Owing to the instantaneous speed of EM wave propagation, high-dimension matrix multiplication is performed at the speed of light, building up a real-time processing system with negligible delay. Based on the proposed programmable platform, we also envisage that the training process of the neural network could be implemented directly on our PAIM architecture. The conventional backpropagation algorithm, which optimizes the trainable parameters inside the neural network iteratively, usually costs huge calculation resources and time in the design process. We hope that the algorithm could be migrated directly into our platform since the proposed PAIM has weight-programmable and sample injection capacity. However, further research is needed to explore a feasible method to extract electrical fields in the intermediate layers of PAIM architecture, which are crucial data in backpropagation algorithms.
The digital coding and programmable metasurfaces proposed in 2014 not only combine EM physics with digital information, but also incorporate the idea of encoding into the design of EM functions and the representation of information, thus forming a new direction of information metasurfaces. During this period, the wide application of AI has given birth to related research on the intelligent design of metasurfaces, which is also closely related to the future development of information metasurfaces. In this paper, we first reviewed the recent advances in information metasurfaces from information theories and operations to programmable designs and space–time-coding strategies. We also exhibited the intelligent designs of metasurfaces using machine-learning algorithms. Then we introduced the research on intelligent metasurfaces based on a combination of information metamaterials and AI. Finally, we presented recent developments of all-optical diffractive neural networks in both static and reprogrammable forms.
We envision that future research on information metasurfaces will be focused on the reconstruction of traditional Shannon information theories by deeply combining digital information with EM fields. In particular, exploring new modulation forms that can combine multiple modulations for next-generation communication systems in the time domain, frequency domain, spatial domain, and polarization domain may be an important direction. Since most current work on information metasurfaces is in the microwave band, the higher working frequency band (even in the optical frequency band) will be one of the important directions for future research. Low-frequency passive structures can be used for reference in high-frequency designs, but the tunable devices widely used in the microwave band are difficult to implement in the terahertz and optical frequency bands. Therefore, to achieve a programmable design in the optical frequency band, some new optical devices are required. For the intelligent design of meta-atoms, in addition to the current mainstream pixelation methods, more general and accurate intelligent design methods are urgently needed. We believe that in-depth analyses and deconstructions of the EM properties of specific structures may be the key to improve the performance of future algorithms, and may also help reduce the database dependence of the algorithms. Moreover, more precise and intelligent design methods for metasurface patterns are also worth looking forward to. For new types of artificial intelligent machines, the current programmable solutions are relatively preliminary, and more powerful nonlinear programmable forms will inject new vitality into the research on this hardware of deep neural networks. In addition, intelligent metasurfaces as neural networks still demand programmable capacity with a larger computing scale.
 M. Z. Chen et al. Accurate and broadband manipulations of harmonic amplitudes and phases to reach 256QAM millimeter-wave wireless communications by time-domain digital coding metasurface. Natl. Sci. Rev., 9, nwab134(2021).
 A. Graves, A. R. Mohamed, G. Hinton. Speech recognition with deep recurrent neural networks. IEEE International Conference on Acoustics, Speech and Signal Processing, 6645(2013).
 K. Lu. Intelligent recognition system for high precision image significant features in large data background. Advances in Intelligent Systems and Computing, 1056(2020).
 T. Tong et al. MBVCNN: joint convolutional neural networks method for image recognition. AIP Conf. Proc., 1839, 020091(2017).
 H. Sun et al. MBVCNN: joint convolutional neural networks method for image recognition. 7th International Conference on System of Systems Engineering, 24(2012).
 K. Cho et al. Learning phrase representations using RNN encoder-decoder for statistical machine translation(2014).
 C. Escolano, M. R. Costa-Jussa, J. A. R. Fonollosa. From bilingual to multilingual neural-based machine translation by incremental training. 57th Annual Meeting of the Association for Computational Linguistics: Student Research Workshop, 236(2019).