
- Chinese Optics Letters
- Vol. 22, Issue 12, 120002 (2024)
Abstract
1. Introduction
Technological advances have enabled humans to explore polar regions, space, and deep oceans[1-3]. Among them, the ocean covers 71% of the Earth’s surface and is a strategically important space for global ecological, resource, economic, and security development. Optical means play an important role in deep ocean detection and research[4,5]. There are many extreme environments in the ocean, including high pressure and seawater corrosion, which place high requirements on the robustness, speed, and energy consumption of the elements or chips used. Optical chips, as an emerging technology, have received wide attention because of their light-speed-computing and ultra-low-energy-consumption features[6,7]. Compared with electronic chips, optical chips have higher interference resistance. This is because the optical properties (like refractive index[8]) of chip materials are generally less sensitive than their electrical properties (like carrier mobility[9]).
In recent years, diffractive neural networks (DNNs), as a three-dimensional (3D) optical network, have been widely investigated[10,11]. Applications, including image recognition[12-14], optical computing[15], phase imaging[16,17], and scattered image reconstruction[18], have been demonstrated based on the DNN framework, showing its potential to be used in various fields. In DNNs, neural connections are constructed based on light propagation and diffraction in free space, eliminating the need for optical waveguides, thus making the chip structure simple, which helps improve the robustness. For deep-ocean research, DNNs enable direct optical image processing at the time of reception. This feature can increase the speed of image information acquisition. Furthermore, DNNs can deal with invisible light information such as phase[19] and polarization[20], which may provide more comprehensive details of the detection target.
Unfortunately, the materials, architectures, and designs of existing DNNs do not yet support their use in the ocean or other water environments. Currently, DNNs are mainly fabricated by 3D printing of organic materials, which yields structures of low mechanical strength and robustness[21,22]. Besides, the diffractive layers of DNNs are usually spatially separated[23], resulting in insufficient compactness, which may cause functional failures due to structural changes during long-term operation. In addition, reported DNNs are only designed to work in air and have not been optimized for use in water[24-28].
Here, we experimentally demonstrate a compact DNN chip for water-immersed optical inference. The DNN consists of two cascaded diffractive layers, which are integrated on the two surfaces of a quartz plate. We used double-sided photolithography followed by dry etching to fabricate the chip. This integration keeps the spacing and relative positions of the diffractive layers stable, thus enhancing the robustness. When optics work underwater, unforeseen circumstances such as water ingress from leaking system seals can render the device inoperable. To address this problem, we designed the DNN chip to work directly in both water and air. Through initial training value optimization and multi-objective training, the DNN chip can realize high-accuracy inference in the two media. Handwritten digit recognition and fashion product recognition were performed with our chip. Recognition accuracies of four-type handwritten digits (0–3) in air and water are 91.5% and 91.4%, respectively, while the accuracies of four-type fashion products (T-shirts, trousers, bags, and shoes) in air and water are 94.6% and 92.6%, respectively. Our strategy provides a route to underwater applications of smart photonic devices, which can also be used for applications in other extreme environments.
2. Experiments and Methods
2.1. Principle
Figure 1(a) is the schematic diagram of optical inference with our bilayer DNN chip, showing that it can operate in both water and air. As depicted in Fig. 1(b), optical images generated by coherent light propagate from the input layer through diffraction in the medium to the DNN chip, undergo optical processing by the two layers, and finally propagate to the output layer. Therefore, a feedforward optical neural network can be constructed. The inference results of the chip are displayed on the output layer. For each task, there are 4 circled regions on the output layer corresponding to the four different types of inputs [Fig. 1(a)]. The light intensity distributions represent the recognition results. The weight information carried by neurons is encoded in the pixels with different phase modulations by controlling the pixel depth. Thus, through the diffraction and interference of incident light, the matrix multiplication of inputs and weights can be realized [Fig. 1(c)]. Figure 1(d) shows the digital image of our chip. The substrate we used is a fused quartz plate. The high stability of quartz ensures that the chip can cope with corrosion in various underwater environments. On the chip, we fabricated four DNNs with 2 different functions, including handwritten digit recognition and fashion product recognition. Each diffractive layer has
Figure 1. Bilayer DNN chip integrated on a quartz substrate. (a) A schematic diagram of the chip capable of operating in both air and water. Optical images enter the DNN, and the final recognition results are reflected on the output layer through the distribution of light intensity. (b) The schematic diagram illustrating the propagation of light in the chip. (c) A network description of the physical computation process of the DNN chip. Dataset images are generated at the input layer and then propagate through two diffractive layers with optical operation based on coherent superposition. In our experiment, the diffractive layer is designed with binary phase modulation. (d) The digital image of the DNN chip. Scale bar, 5 mm.
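The readout described above, in which the brightest of the four circled regions gives the recognition result, can be sketched as follows. This is a minimal illustration, not the authors' code: the function names, the region centers, and the region radius are hypothetical, and the output plane is assumed to be available as a 2D intensity array.

```python
import numpy as np

def region_intensities(intensity, centers, radius):
    """Return the normalized optical power collected in each circular region.

    intensity: 2D array of light intensity on the output layer.
    centers:   list of (row, col) region centers (hypothetical layout).
    radius:    region radius in pixels (hypothetical value).
    """
    ny, nx = intensity.shape
    yy, xx = np.mgrid[0:ny, 0:nx]
    sums = []
    for cy, cx in centers:
        mask = (yy - cy) ** 2 + (xx - cx) ** 2 <= radius ** 2
        sums.append(intensity[mask].sum())
    sums = np.array(sums, dtype=float)
    return sums / sums.sum()  # assumes at least some light reaches the regions

def classify(intensity, centers, radius):
    """Predicted class = index of the region collecting the most power."""
    return int(np.argmax(region_intensities(intensity, centers, radius)))
```

Normalizing by the total collected power makes the readout independent of the absolute laser power, which matches the normalized light intensities reported in the figures.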
2.2. TensorFlow-Based DNN Training
Artificial neural networks achieve the mimicry of the synaptic transmission of signals through the multiplication of weight matrix and inputs[29]. In DNNs, matrix multiplication is realized through the transmission and coherent superposition of incident coherent waves. Therefore, when designing DNNs, it is necessary to construct a light propagation model between diffractive layers.
Here, we used angular spectrum diffraction to simulate the propagation of incident light[30],
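The propagation equation itself is not reproduced in this copy. As a rough illustration, the textbook angular spectrum method can be sketched as below, using NumPy rather than the TensorFlow model used in this work. The function name, the 4 µm pixel pitch in the usage note, and the treatment of the medium through its refractive index `n_medium` are assumptions made for the example.

```python
import numpy as np

def angular_spectrum_propagate(field, wavelength, pixel_pitch, distance, n_medium=1.0):
    """Propagate a complex field over `distance` in a homogeneous medium.

    wavelength is the vacuum wavelength; inside the medium it becomes
    wavelength / n_medium. All lengths are in meters.
    """
    ny, nx = field.shape
    lam = wavelength / n_medium
    fx = np.fft.fftfreq(nx, d=pixel_pitch)  # spatial frequencies along x
    fy = np.fft.fftfreq(ny, d=pixel_pitch)  # spatial frequencies along y
    FX, FY = np.meshgrid(fx, fy)
    arg = 1.0 - (lam * FX) ** 2 - (lam * FY) ** 2
    propagating = arg > 0  # evanescent components are discarded
    kz = 2 * np.pi / lam * np.sqrt(np.where(propagating, arg, 0.0))
    transfer = np.where(propagating, np.exp(1j * kz * distance), 0.0)
    return np.fft.ifft2(np.fft.fft2(field) * transfer)
```

For example, `angular_spectrum_propagate(u0, 532e-9, 4e-6, 0.10, n_medium=1.33)` would model the 10 cm path from the input layer to the chip in water. The sketch applies no zero padding, so the FFT's periodic boundary conditions can cause wrap-around for strongly diverging fields.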
Figure 2(a) illustrates the forward propagation model and error backpropagation model used in training. The training process is implemented using the TensorFlow 2.0 framework (Google Inc.) with a learning rate of 0.03 over 400 epochs. We trained the DNNs using modified versions of the Modified National Institute of Standards and Technology (MNIST)[31] and Fashion-Modified National Institute of Standards and Technology (Fashion-MNIST) databases[32]. Each training set has 20,000 images, with 5000 images for each type. The distances from the input layer to the DNN chip and from the DNN chip to the output layer are 10 and 25 cm, respectively.
Figure 2. The simulation results of the DNN chip. (a) The training process diagram of the DNN chip. (b)–(e) Simulated outputs of the DNNs and corresponding normalized light intensities in the 4 circled regions. (b) Task: handwritten digit recognition. Medium: air. (c) Task: fashion product recognition. Medium: air. (d) Task: handwritten digit recognition. Medium: water. (e) Task: fashion product recognition. Medium: water.
To ensure accurate recognition in both air and water, we performed multi-objective training, which enables simultaneous optimization of multiple losses[33]. The process involves inputting the images in datasets into the DNN network and simulating their propagation in water and air, respectively. Then, by comparing the outputs obtained in water and air (
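The idea of optimizing for both media at once can be illustrated with a weighted sum of per-medium losses. The sketch below is only an assumption-laden example: the loss actually used in this work is not given in this copy, and the mean-squared-error form, the equal weights, and the function names are all hypothetical.

```python
import numpy as np

def detector_loss(region_power, label, n_classes=4):
    """MSE between normalized detector-region power and a one-hot target."""
    p = np.asarray(region_power, dtype=float)
    p = p / p.sum()
    target = np.eye(n_classes)[label]
    return float(np.mean((p - target) ** 2))

def dual_medium_loss(power_air, power_water, label, w_air=0.5, w_water=0.5):
    """Weighted sum of the losses simulated in air and in water.

    Minimizing this single scalar pushes one set of phase values to
    classify correctly in both media (weights are assumed, not sourced).
    """
    return (w_air * detector_loss(power_air, label)
            + w_water * detector_loss(power_water, label))
```

In a gradient-based framework such as TensorFlow, the same scalar would be computed from differentiable tensors so that both simulated outputs contribute to each update of the shared phase values.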
To simulate the light propagation in water and air, it is necessary to first determine the phase modulation difference in the two environments. We achieve transmission-type phase modulation by fabricating pixels with different depths. The phase modulation difference
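The dependence of the phase modulation on the surrounding medium can be illustrated with the standard relation for a transmissive relief, in which a pixel of depth d delays the wavefront by 2π(n_substrate − n_medium)d/λ relative to an unetched pixel. The sketch below is a hedged example: the refractive indices are assumed nominal values for fused quartz, air, and water near 532 nm, not values taken from this work, and the function name is hypothetical.

```python
import math

WAVELENGTH = 532e-9  # laser wavelength stated in the experimental setup (m)
N_QUARTZ = 1.46      # assumed nominal index of fused quartz near 532 nm
N_AIR = 1.00         # assumed
N_WATER = 1.33       # assumed

def phase_delay(depth, n_medium, n_substrate=N_QUARTZ, wavelength=WAVELENGTH):
    """Phase delay (rad) of a pixel etched to `depth` (m), relative to an
    unetched pixel, when the relief is immersed in `n_medium`."""
    return 2 * math.pi * (n_substrate - n_medium) * depth / wavelength

depth = 1e-6  # hypothetical etch depth of 1 micrometer
ratio = phase_delay(depth, N_AIR) / phase_delay(depth, N_WATER)
print(f"air/water phase ratio: {ratio:.2f}")
```

Under these assumed indices the same etch depth produces roughly 3.5 times more phase delay in air than in water, which is why a single depth map has to be trained against both media rather than designed for one.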
In order to facilitate chip fabrication, we perform a binarization process on the phase values, discretizing the range of 0–6π into 0 and
Additionally, the amplitude fields of the four regions on the output layer representing different digits or fashion products were optimized to follow a Gaussian distribution. Compared to a typical uniform distribution, regions with a Gaussian distribution tend to exhibit more concentrated intensity. In this way, the maximum light intensity density in these regions can be increased. Therefore, the camera can more easily capture effective signals, enabling it to operate with shorter exposure times and/or lower laser power to reduce noise and save energy.
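The effect of a Gaussian target on peak intensity can be checked numerically by comparing a uniform disk and a Gaussian spot that carry the same total power. This is an illustrative sketch only: the grid size, the disk radius, and the choice of a Gaussian width equal to half the disk radius are arbitrary assumptions, and the function names are hypothetical.

```python
import numpy as np

def disk_target(size, radius):
    """Uniform circular target normalized to unit total power."""
    yy, xx = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2
    t = ((yy - c) ** 2 + (xx - c) ** 2 <= radius ** 2).astype(float)
    return t / t.sum()

def gaussian_target(size, sigma):
    """Gaussian target normalized to unit total power."""
    yy, xx = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2
    t = np.exp(-((yy - c) ** 2 + (xx - c) ** 2) / (2 * sigma ** 2))
    return t / t.sum()

disk = disk_target(33, radius=8)
gauss = gaussian_target(33, sigma=4.0)
print(f"peak ratio (Gaussian / uniform): {gauss.max() / disk.max():.2f}")
```

With these assumed sizes the Gaussian target has a higher peak value for the same total power, consistent with the argument that a concentrated spot is easier for the camera to detect. The advantage depends on the chosen width, so a very broad Gaussian would not show it.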
2.3. Fabrication of the DNN Chip
Optical elements based on quartz materials generally have an extremely long lifetime because of the high chemical stability and mechanical strength of silica[34]. Therefore, our chips can theoretically work with high stability in various environments for a long time. Figure S1 (Supporting Information) shows the fabrication process. We used a SUSS MA6 UV photolithography machine, which can achieve pattern alignment on both sides of the substrate. The plasma dry etching is carried out with a SENTECH inductively coupled plasma etching system using
3. Results
3.1. Analysis of Training Results
Figures 2(b)–2(e) display the simulation results for the two datasets (MNIST and Fashion-MNIST) in air and water, respectively. The target regions corresponding to the input image exhibit the highest light intensity, indicating the successful recognition of the input image by the DNN. Due to the binary phase distribution, the incident light cannot be fully modulated, resulting in a decrease in diffraction efficiency. As a result, a residual, unmodulated copy of the input image remains visible in the output images.
We analyze the effect on the DNN performance with different phase discretization levels (Table 1). It can be seen that the DNN can be trained to realize high accuracies (
Phase discretization | Accuracy in air (test set) | Accuracy in water (test set) |
---|---|---|
256-level | 98.7% | 98.5% |
8-level | 98.1% | 98.1% |
4-level | 97.4% | 97.5% |
2-level | 96.3% | 96.8% |
Table 1. The Simulated Accuracy of Fashion Product Recognition with Different Phase Discretization Levels
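Phase discretization of the kind compared in Table 1 can be sketched as rounding a continuous phase map to a fixed number of levels. The exact level placement used in this work is not given in this copy, so the sketch assumes uniformly spaced levels over one 2π period; the function name and the `phase_max` parameter are hypothetical.

```python
import numpy as np

def quantize_phase(phase, levels, phase_max=2 * np.pi):
    """Round a phase map (rad) to `levels` uniformly spaced values in
    [0, phase_max), wrapping values near phase_max back to 0."""
    step = phase_max / levels
    idx = np.round(np.mod(phase, phase_max) / step) % levels
    return idx * step

continuous = np.array([0.1, 3.0, 6.2])
print(quantize_phase(continuous, levels=2))  # two-level (binary) map
print(quantize_phase(continuous, levels=8))  # eight-level map
```

Fewer levels make the relief easier to fabricate, since a binary map needs only one etch depth per surface, at the cost of the modest accuracy drop reported in the table.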
The distance at which a target can be recognized is an important metric, so we analyzed how the accuracy of the DNNs changes with the recognition distance, defined here as the distance from the input layer to the DNN chip. The chip was designed for a recognition distance of 10 cm (100 mm), and the results (Fig. S2, Supporting Information) accordingly show the highest accuracy at that distance. As the distance deviates from the design value, the accuracy gradually decreases. When the recognition distance is larger than 500 mm, the accuracy in water is less than 90%. Therefore, if we define effective recognition as an accuracy greater than 90%, the detection distance range of our chip is approximately 20–500 mm.
3.2. Robustness Analysis
Figure 3(a) shows the obtained binary phase distributions of the DNN for fashion product recognition. The binary phase distributions of the DNN for handwritten digit recognition are shown in Fig. S3 (Supporting Information). We utilized double-sided photolithography followed by plasma dry etching to engrave the DNN on the two surfaces of a quartz plate. This process is compatible with current complementary metal-oxide-semiconductor (CMOS) manufacturing processes, which means that DNN chips can be fabricated in large batches.
Figure 3. Characterization and fabrication error analysis of the chip for fashion product recognition. (a) The obtained phase map of the DNN after training. (b) The optical images of the two surfaces of the DNN chip captured by a 4f optical system. Scale bars, 1 mm. (c) The simulated impact of phase modulation errors caused by etching on the accuracy of the DNN. (d) and (e) The simulated impact of alignment errors caused by double-sided photolithography on the accuracy of the DNN working in (d) air and (e) water, respectively.
The optical images shown in Fig. 3(b) indicate that the morphology of the sample matches well with the ideal phase map. Besides, the scanning electron microscope (SEM) images (Fig. S4, Supporting Information) of the sample demonstrate that the fabrication process has a high patterning accuracy. To achieve a
3.3. Performance of the DNN Chip
Figure 4(a) depicts the experimental optical setup. A laser emits light at 532 nm, and its power is adjusted by a half-wave plate (HWP) and a polarizing beam splitter (PBS). We used lenses L1 and L2 to form a 4f system for beam expansion. A pinhole is placed at the focal plane of L1 to filter the beam, allowing only the Gaussian beam to pass through. The beam then enters the digital micromirror device (DMD) to generate the input optical images at a distance of 10 cm in front of the DNN chip. Finally, the output results are displayed in a plane 25 cm behind the chip and captured by a charge-coupled device (CCD) camera. For testing in water, the DNN chip was held in place by a clamp and immersed in a quartz container filled with water [Fig. 4(b)].
Figure 4. Experimental results of the DNN chip. (a) Experimental optical setup. HWP: half-wave plate, PBS: polarizing beam splitter, BS: beam splitter, CCD: charge-coupled device, DMD: digital micromirror device. (b) The digital image of the DNN chip working in water. (c) Confusion matrices for handwritten digit recognition in air and water. (d) Confusion matrices for fashion product recognition in air and water. (e)–(h) Recorded outputs of the DNNs and corresponding normalized light intensities in the 4 circled regions. Scale bars, 2 mm. (e) Task: handwritten digit recognition. Medium: air. (f) Task: fashion product recognition. Medium: air. (g) Task: handwritten digit recognition. Medium: water. (h) Task: fashion product recognition. Medium: water.
We tested the DNN chip using 1000 images (250 images for each type) from the test sets. The results in Fig. 4(c) show that the DNN reaches accuracies of 91.5% in air and 91.4% in water for handwritten digit recognition, while the accuracies for fashion product recognition are 94.6% in air and 92.6% in water [Fig. 4(d)]. These values are slightly lower than those of the simulation in Table 1, which may be due to fabrication and measurement errors, as well as the different numbers of test images. These experimental results show that our DNN chip can work in both water and air with high accuracies (
4. Discussion
We have realized an integrated DNN chip with its two diffractive layers fabricated on the two surfaces of a quartz plate. Because it is based on CMOS-compatible double-sided photolithography, this approach is suitable for large-scale fabrication of DNN chips with various functions. To cope with unexpected situations during underwater operation, we trained the DNNs to work in both water and air through multi-objective optimization. The integrated chip architecture reduces the training constraints, allowing the chip to maintain high accuracy while operating in the two media. The chip can be integrated with a camera, thereby enabling direct light-speed analog processing of optical images. Its ability to work directly in water may enable direct recognition of, or information extraction from, target objects at close range underwater. Besides, it may be used for underwater scattering imaging, for instance, to improve the quality of images captured in turbid waters. Finally, the DNN can process invisible optical information such as phase[22] and polarization[20], so it can be used to study invisible targets underwater or in air, like turbulence. The high chemical stability of quartz ensures that the chip can handle different extreme environments. We note that existing DNNs are mainly designed for working in air. Therefore, the design strategy shown in this work may promote the direct application of DNNs in other media, especially in some extreme environments. In the future, the DNN chip need not be limited to recognition tasks and can also be designed to implement other tasks, for instance, image feature extraction and underwater beam shaping, which will further expand its applications.
References
[4] F. Hanson, S. Radic. High bandwidth underwater optical communication. Appl. Opt., 47, 277 (2008).
[8] B. J. Frey, D. B. Leviton, T. J. Madison. Temperature-dependent refractive index of silicon and germanium. Optomechanical Technologies for Astronomy, 790 (2006).
[9] D. A. Neamen. Semiconductor Physics and Devices: Basic Principles (2003).
[17] A. Ozcan, Y. Rivenson, Y. Wu et al. Method and system for phase recovery and holographic image reconstruction using a neural network (2022).
[21] E. Goi, M. Gu. Perspective on photonic neuromorphic computing. Neuromorphic Photonic Devices and Applications, 353 (2024).
[29] X. Sui, Q. Wu, J. Liu et al. A review of optical neural networks. IEEE Access, 8, 70773 (2020).
[32] H. Xiao, K. Rasul, R. Vollgraf. Fashion-MNIST: a novel image dataset for benchmarking machine learning algorithms (2017).
[35] K. Iizuka. Engineering Optics, 35 (2008).
