
- Advanced Photonics
- Vol. 5, Issue 6, 066003 (2023)
1 Introduction
Holographic imaging is an interdisciplinary field that combines optics, computer science, and applied mathematics to generate holographic images using numerical algorithms. Although the concept of using computers to generate holograms dates back to the 1960s, computational holography became a viable technology only with the emergence of digital imaging and processing techniques in the 1990s,1,2 when digital holography began to attract broader attention thanks to advances in computing and digital image processing.3 In recent years, holographic imaging has continued to advance, and researchers have developed increasingly sophisticated numerical algorithms for it, such as compressive sensing, sparse coding, and deep-learning techniques.4
Spatial coherence (SC) is a critical factor that determines the quantity and quality of high-frequency information carried by the light beam in holographic imaging. High-frequency information is crucial for achieving high resolution and capturing fine details in an image. When the SC of the light source is low, the phase relationship across the beam becomes chaotic, the interference pattern is washed out, and high-frequency information is insufficiently transmitted; the reconstructed image therefore loses resolution and fine detail. High-SC light is consequently preferred for holographic imaging, ensuring that sufficient high-frequency information is present in the interference pattern and the hologram and yielding high-resolution, detailed reconstructions. In complex scenes, however, the SC of light sources is often very low, which leads to image degradation and loss of detail. How to restore images under low-SC illumination is therefore a challenging issue.11
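For reference, the SC between two points $\mathbf{r}_1$ and $\mathbf{r}_2$ of a field $E$ is quantified by the complex degree of coherence, in the standard definition from coherence theory:

$$
\mu(\mathbf{r}_1,\mathbf{r}_2) = \frac{\left\langle E^{*}(\mathbf{r}_1)\,E(\mathbf{r}_2)\right\rangle}{\sqrt{\left\langle |E(\mathbf{r}_1)|^{2}\right\rangle\left\langle |E(\mathbf{r}_2)|^{2}\right\rangle}}, \qquad 0 \le |\mu| \le 1,
$$

with $|\mu| = 1$ for fully coherent and $|\mu| = 0$ for fully incoherent light; the fringe visibility measured in Sec. 2.1 provides an estimate of $|\mu|$.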
Oceanic and atmospheric turbulence can profoundly affect optical imaging, distorting and degrading images acquired by cameras and other optical detectors. Oceanic turbulence distorts images because turbulent motion in the water column produces variations in the refractive index of the water, which in turn perturb the path of light traveling through it. Atmospheric turbulence arises because Earth's atmosphere is not uniform and contains regions of varying temperature and density, which likewise cause variations in the refractive index of the air. In either case, as the beam passes through these regions of varying refractive index, the phase correlation changes and the SC is degraded, causing the image to become blurred and distorted, or even completely lost. Substantial efforts have been devoted to imaging through various kinds of turbulence.16
Artificial intelligence offers unparalleled advantages for optics, especially in the field of holography. For example, deep learning can address challenging inverse problems in holographic imaging, where the objective is to recover the original scene or object properties from observed images or measurements, and it can enhance the resolution of optical imaging systems beyond the traditional diffraction limit.24
We summarize the innovations of this paper as follows.
Figure 1. Principle and performance of the TWC-Swin method. (a) LPR. SC modulation can adjust the SC by changing the distance between lens L1 and the rotating diffuser (RD).
2 Materials and Methods
2.1 Scheme of the LPR
Figure 1(a) shows the LPR. The high-coherence beam generated by a solid-state laser (CNI, MLL-FN, 532 nm) is polarized horizontally after passing through a half-wave plate and a polarization beam splitter, which both matches the modulation mode of the SLM and allows the beam intensity to be adjusted. A rotating diffuser (RD; DHC, GCL-201) is used to reduce the SC of the light source; the degree of reduction depends on the radius of the incident beam on the RD: the larger the radius, the lower the SC of the output light (see Note 2 in the Supplementary Material). In the experiment, we control the incident beam radius by adjusting the distance between lens 1 (L1, 100 mm) and the RD. After being collimated by lens 2 (L2, 100 mm), the beam is incident on SLM1 (HDSLM80R), which is loaded with a turbulent phase refreshed continuously at 20 Hz. After passing through the turbulence, the beam is split into two parts by a beam splitter. The first part employs Michelson interference to capture interference fringes and measure the SC of the light. The second part is used for holographic imaging, with the phase hologram of the image loaded onto SLM2 (PLUTO). A high-pass filter removes the unmodulated zero-order diffraction pattern, and the final imaging result is captured by a complementary metal-oxide-semiconductor (CMOS) camera (Sony, E3ISPM). In summary, we control the SC of the light source by adjusting the distance between L1 and the RD and simulate a turbulent environment using SLM1, with the turbulence intensity determined by the loaded turbulent phase. If turbulence is not required, SLM1 can be switched off, in which case it simply acts as a mirror.
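Although the analysis code for the Michelson arm is not shown, the SC readout reduces to measuring fringe visibility, which for balanced arm intensities approximates the modulus of the degree of coherence. A minimal sketch under those assumptions (the function name and the percentile-based robustification are ours, not the authors' code):

```python
import numpy as np

def fringe_visibility(fringe_image: np.ndarray) -> float:
    """Estimate V = (I_max - I_min) / (I_max + I_min) from a fringe image.

    Averages along the fringe direction (assumed vertical here) and uses
    percentiles rather than raw extrema to suppress hot pixels and noise.
    """
    profile = fringe_image.astype(float).mean(axis=0)
    i_max, i_min = np.percentile(profile, [99.0, 1.0])
    return float((i_max - i_min) / (i_max + i_min))
```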
2.2 Oceanic Turbulence and Atmospheric Turbulence
The turbulence intensity in the experiment is determined by the spatial power spectrum of the turbulence. The spatial power spectrum of refractive-index fluctuations used in this paper assumes homogeneous and isotropic turbulence. We use the Nikishov power spectrum to describe oceanic turbulence.33
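In its commonly cited form (we quote the standard expression here; see Ref. 33 for the full derivation and parameter ranges),

$$
\Phi_n(\kappa) = 0.388 \times 10^{-8}\,\varepsilon^{-1/3}\,\kappa^{-11/3}\left[1 + 2.35(\kappa\eta)^{2/3}\right]\frac{\chi_T}{w^2}\left(w^2 e^{-A_T\delta} + e^{-A_S\delta} - 2w\,e^{-A_{TS}\delta}\right),
$$

where $\delta = 8.284(\kappa\eta)^{4/3} + 12.978(\kappa\eta)^2$, $A_T = 1.863\times10^{-2}$, $A_S = 1.9\times10^{-4}$, and $A_{TS} = 9.41\times10^{-3}$; $\varepsilon$ is the dissipation rate of turbulent kinetic energy per unit mass, $\chi_T$ is the dissipation rate of mean-squared temperature, $w$ is the relative strength of temperature and salinity fluctuations, and $\eta$ is the Kolmogorov inner scale.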
For atmospheric turbulence, we use the non-Kolmogorov power spectrum.34
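A commonly used form of this spectrum (again quoting the standard expression; see Ref. 34 for details) is

$$
\Phi_n(\kappa,\alpha) = A(\alpha)\,\tilde{C}_n^2\,\frac{\exp\left(-\kappa^2/\kappa_m^2\right)}{\left(\kappa^2+\kappa_0^2\right)^{\alpha/2}}, \qquad 3 < \alpha < 4,
$$

where $A(\alpha) = \Gamma(\alpha-1)\cos(\alpha\pi/2)/(4\pi^2)$, $\tilde{C}_n^2$ is the generalized structure parameter, $\kappa_0 = 2\pi/L_0$, $\kappa_m = c(\alpha)/l_0$ with $c(\alpha) = \left[\Gamma\left(\tfrac{5-\alpha}{2}\right)A(\alpha)\tfrac{2\pi}{3}\right]^{1/(\alpha-5)}$, and $L_0$ and $l_0$ are the outer and inner scales of turbulence; for $\alpha = 11/3$ the spectrum reduces to the von Kármán form.

In practice, such a spectrum is converted into SLM phase screens by spectrally filtering complex white noise. The following is a minimal sketch of the common FFT phase-screen method; the function name, parameter values, and normalization convention are illustrative choices of ours, not the authors' code:

```python
import numpy as np
from scipy.special import gamma

def non_kolmogorov_phase_screen(n=512, dx=1.0e-4, alpha=3.5, cn2=1.0e-14,
                                l0=1.0e-3, big_l0=10.0, wl=532e-9, dz=100.0):
    """Random phase screen: complex white noise spectrally filtered by the
    non-Kolmogorov spectrum above; returns a phase map in radians suitable
    for display on an SLM. All parameter values are placeholders."""
    k = 2.0 * np.pi / wl                              # optical wavenumber
    dkappa = 2.0 * np.pi / (n * dx)                   # spectral grid spacing
    kvec = np.fft.fftfreq(n, d=dx) * 2.0 * np.pi
    kxx, kyy = np.meshgrid(kvec, kvec)
    kappa2 = kxx**2 + kyy**2
    a = gamma(alpha - 1.0) * np.cos(alpha * np.pi / 2.0) / (4.0 * np.pi**2)
    c = (gamma((5.0 - alpha) / 2.0) * a * 2.0 * np.pi / 3.0) ** (1.0 / (alpha - 5.0))
    km, k0 = c / l0, 2.0 * np.pi / big_l0
    phi_n = a * cn2 * np.exp(-kappa2 / km**2) / (kappa2 + k0**2) ** (alpha / 2.0)
    phi_phase = 2.0 * np.pi * k**2 * dz * phi_n       # index PSD -> phase PSD
    noise = (np.random.randn(n, n) + 1j * np.random.randn(n, n)) / np.sqrt(2.0)
    # Basic FFT synthesis; low spatial frequencies are undersampled, and
    # subharmonic compensation is often added in practice.
    return np.real(np.fft.fft2(noise * np.sqrt(phi_phase))) * dkappa
```

Drawing a new screen at every refresh reproduces the 20 Hz dynamic turbulence described in Sec. 2.1.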
2.3 Data Acquisition
Low SC and turbulence are different physical scenarios, but the influence of both on holographic imaging can be described through SC. Accordingly, we use only the data obtained under different SCs for model training, and all other data are used for testing [Fig. 1(g)]. The process of data acquisition is as follows.
Our original images come from public data sets: the Berkeley segmentation data set (BSD),36 the CelebFaces attributes high-quality data set (CelebA),37 the Flickr data set (Flickr),38 the WebVision data set (WED),39 and the DIV2K data set (DIV).40 The training set is composed only of images captured by CMOS1 in steps 2 and 3.
In the training phase, we divide the training data into 11 groups based on SC and send them to the network for training in turn, which yields a model space containing swin models with different weights. In the testing phase, the swin adapter receives the measured SC of the light source and selects the optimal model from the model space to perform image restoration. Here we use distance-priority mode, in which the swin adapter selects the weights whose training SC is closest to the measured SC. The test set comes from the images generated in steps 4 and 5. Note that none of the test sets have been trained on; they are unseen by the network. Our model was implemented in PyTorch; the detailed architecture can be found in Note 1 in the Supplementary Material. We use adaptive moment estimation with weight decay (AdamW) as the optimizer,41 with an initial learning rate of 0.0005 that drops by 50% every 10 epochs, for a total of 100 epochs. Mean-squared error (MSE) is the loss function of the network. All training and testing were run on an NVIDIA RTX 3080 Ti graphics card, and a full training period takes about 12 h. To verify the performance of our method, a series of credible image quality assessment measures was applied. The full-reference measures include peak signal-to-noise ratio (PSNR), structural similarity index (SSIM), and Pearson correlation coefficient (PCC), which assess a single image in relation to perceived visual quality. See Note 4 in the Supplementary Material for descriptions of the evaluation indices.
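For concreteness, the training recipe above and the adapter's distance-priority selection can be sketched as follows; the model class, data loader, and layout of the model space are our own placeholder assumptions, and only the hyperparameters come from the text:

```python
import torch
from torch import nn

def train_swin_model(model: nn.Module, train_loader, epochs: int = 100,
                     device: str = "cuda") -> nn.Module:
    """AdamW, initial LR 5e-4 halved every 10 epochs, MSE loss, 100 epochs."""
    model = model.to(device)
    optimizer = torch.optim.AdamW(model.parameters(), lr=5e-4)
    scheduler = torch.optim.lr_scheduler.StepLR(optimizer, step_size=10, gamma=0.5)
    criterion = nn.MSELoss()
    for _ in range(epochs):
        for degraded, ground_truth in train_loader:
            degraded, ground_truth = degraded.to(device), ground_truth.to(device)
            optimizer.zero_grad()
            loss = criterion(model(degraded), ground_truth)
            loss.backward()
            optimizer.step()
        scheduler.step()  # 50% learning-rate drop every 10 epochs
    return model

def swin_adapter(measured_sc: float, model_space: dict):
    """Distance-priority mode: return the weights trained at the SC value
    closest to the measured SC (keys of model_space are SC values)."""
    return model_space[min(model_space, key=lambda sc: abs(sc - measured_sc))]
```

Training one such model per SC group populates the model space that the adapter queries at test time.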
3 Results and Discussion
This section primarily showcases the performance of our method under various SCs and turbulent scenes. We simulated different strengths of oceanic and atmospheric turbulence, enhancing the diversity of turbulence intensities and types. Additionally, we conducted comparative analyses with traditional convolutional-residual networks and performed ablation studies to reinforce the validity and efficiency of our proposed method. It is important to emphasize that our training data exclusively consisted of holographic imaging results obtained under different SC conditions, with none of the test data used during the training phase.
3.1 Performance on Low SC
Figures 2 and 1(e) show the original images captured by CMOS1 and the restored images processed by the TWC-Swin method under different SCs. We present 11 groups of test results, each representing a different SC level and containing samples from five distinct data sets. As described in Sec. 2, the SC of the light source can be altered by adjusting the distance between the RD and L1. As the SC decreases, the quality of holographic imaging deteriorates significantly, exhibiting high levels of noise and blurriness. Simultaneously, the decrease in SC reduces light efficiency, resulting in darker images that ultimately become indiscernible. After processing by the trained network, these degraded images become smoother, with improved sharpness, enhanced details, and reduced noise. Remarkably, even in low-SC conditions where the original images captured by CMOS1 lack any discernible details, our network successfully reconstructs a significant portion of the elements. To evaluate the effectiveness of image restoration accurately, we report the evaluation indices (SSIM and PCC), comparing the original and reconstructed images with respect to the ground truth for different SCs [Fig. 1(f) and Table 1]. Other indices are provided in Table S3 in the Supplementary Material. The quantitative results further confirm that the reconstructed images improve significantly over the originals across all indicators, approaching the ground truth. Figure 3 illustrates the average evaluation indices for each test set; only partial results are shown here, with more detailed results in Fig. S2 in the Supplementary Material. Every evaluation index rises markedly after processing by the TWC-Swin method, indicating a substantial improvement in image quality. Moreover, the network demonstrates robust generalization by restoring images from multiple test sets beyond the scope of the training set. This implies that our method has effectively learned the underlying patterns in the data during training and can apply them to unseen data.
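For readers reproducing the evaluation, PSNR and PCC reduce to a few lines each (standard definitions; SSIM is available as `skimage.metrics.structural_similarity`). The paper's exact implementations are described in Note 4 in the Supplementary Material; this is only a sketch:

```python
import numpy as np

def psnr(ref: np.ndarray, img: np.ndarray, peak: float = 1.0) -> float:
    """Peak signal-to-noise ratio in dB for images scaled to [0, peak]."""
    mse = np.mean((ref.astype(float) - img.astype(float)) ** 2)
    return float(10.0 * np.log10(peak ** 2 / mse))

def pcc(ref: np.ndarray, img: np.ndarray) -> float:
    """Pearson correlation coefficient between flattened images."""
    return float(np.corrcoef(ref.ravel(), img.ravel())[0, 1])
```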
Figure 2. Qualitative analysis of our method's performance at different SCs. Input, raw image captured by CMOS1. Output, image processed by the network. (a)–(k) Different SCs: (a)
| SC | SSIM |  |  |  |  | PCC |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | BSD | CelebA | Flickr | WED | DIV | BSD | CelebA | Flickr | WED | DIV |
| Input_ | 0.5893 | 0.5943 | 0.4296 | 0.6155 | 0.4625 | 0.9368 | 0.9575 | 0.9210 | 0.9146 | 0.8753 |
| Output_ |  |  |  |  |  |  |  |  |  |  |
| Input_ | 0.5775 | 0.5415 | 0.3917 | 0.6245 | 0.4184 | 0.8953 | 0.9303 | 0.8588 | 0.9149 | 0.8043 |
| Output_ |  |  |  |  |  |  |  |  |  |  |
| Input_ | 0.6178 | 0.5394 | 0.2777 | 0.5677 | 0.3892 | 0.8957 | 0.9211 | 0.8396 | 0.8961 | 0.8144 |
| Output_ |  |  |  |  |  |  |  |  |  |  |
| Input_ | 0.6040 | 0.5017 | 0.3183 | 0.5510 | 0.4136 | 0.8303 | 0.9035 | 0.8511 | 0.8568 | 0.7979 |
| Output_ |  |  |  |  |  |  |  |  |  |  |
| Input_ | 0.4881 | 0.4469 | 0.3073 | 0.5271 | 0.3643 | 0.8072 | 0.8817 | 0.7557 | 0.8326 | 0.7196 |
| Output_ |  |  |  |  |  |  |  |  |  |  |
| Ground truth | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 1. Quantitative analysis of evaluation indices (SSIM and PCC) at different SCs and test samples
Figure 3. Average results of the evaluation indices for each test data set. The coherence is 0.368. Results for other coherences are provided in Fig. S2 in the Supplementary Material. All evaluation indices demonstrate that our method possesses strong image restoration ability under low SC.
3.2 Performance on Oceanic Turbulence and Atmospheric Turbulence
Owing to the stochastic variations of the refractive index within oceanic and atmospheric turbulence, the phase information of light beams becomes distorted, reducing SC and degrading the quality of computational holography images. This issue can be effectively addressed by the TWC-Swin method. It should be emphasized that none of the images captured under turbulent scenes were used in training. Figure 4 demonstrates the remarkable image restoration capability of the TWC-Swin method under varying intensities of oceanic and atmospheric turbulence. As discussed in Sec. 2, the turbulence intensity depends on certain parameters of the power spectrum function, with stronger turbulence corresponding to more complex simulated turbulence phases, as shown in Figs. 4(A5) and 4(O5). We carried out experiments under five distinct intensities of both oceanic and atmospheric turbulence and simultaneously measured the SC of the light source to select the optimal model. Note that the turbulence phase loaded on the SLM is continuously refreshed (20 Hz). To provide stronger evidence, we present the evaluation indices (SSIM and PCC) for oceanic and atmospheric turbulence in Tables 2 and 3 and Fig. 1(h); additional indices (MSE and PSNR) can be found in Tables S4 and S5 in the Supplementary Material. As the turbulence intensity increases, the SC decreases, which in turn degrades image quality. Nevertheless, our proposed method overcomes these adverse effects and effectively improves image quality regardless of the turbulence intensity. Our model learns universal, SC-dependent features of image degradation and restoration. This further demonstrates the strong generalization capability of a network trained with SC as physical prior information and its ability to apply knowledge learned from the training set to new, unseen scenes. Such versatility is a desirable trait in a neural network, suggesting the method's potential for broad application.
Figure 4. Qualitative analysis of our method's performance across varying intensities of (a) oceanic and (b) atmospheric turbulence. The network trained with coherence as physical prior information can effectively overcome the impact of turbulence on imaging and improve image quality. (O1)–(O5) denote oceanic turbulence phases and (A1)–(A5) denote atmospheric turbulence phases. (O1)
| Oceanic turbulence | SSIM |  |  |  |  | PCC |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | BSD | CelebA | Flickr | WED | DIV | BSD | CelebA | Flickr | WED | DIV |
| Input (O1) | 0.5331 | 0.6773 | 0.6810 | 0.6016 | 0.7018 | 0.8978 | 0.9404 | 0.8876 | 0.9096 | 0.8718 |
| Output (O1) |  |  |  |  |  |  |  |  |  |  |
| Input (O2) | 0.5098 | 0.6566 | 0.6690 | 0.5716 | 0.5371 | 0.8855 | 0.9329 | 0.8786 | 0.8970 | 0.8494 |
| Output (O2) |  |  |  |  |  |  |  |  |  |  |
| Input (O3) | 0.4950 | 0.6538 | 0.6575 | 0.5455 | 0.5281 | 0.8764 | 0.9313 | 0.8585 | 0.8916 | 0.8371 |
| Output (O3) |  |  |  |  |  |  |  |  |  |  |
| Input (O4) | 0.4796 | 0.6408 | 0.6474 | 0.5034 | 0.5074 | 0.8774 | 0.9245 | 0.8576 | 0.8664 | 0.8130 |
| Output (O4) |  |  |  |  |  |  |  |  |  |  |
| Input (O5) | 0.4519 | 0.6041 | 0.6202 | 0.4446 | 0.4945 | 0.8456 | 0.9075 | 0.8287 | 0.8281 | 0.7631 |
| Output (O5) |  |  |  |  |  |  |  |  |  |  |
| Ground truth | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 2. Quantitative analysis of evaluation indices (SSIM and PCC) at different oceanic turbulence intensities
| Atmospheric turbulence | SSIM |  |  |  |  | PCC |  |  |  |  |
| --- | --- | --- | --- | --- | --- | --- | --- | --- | --- | --- |
|  | BSD | CelebA | Flickr | WED | DIV | BSD | CelebA | Flickr | WED | DIV |
| Input (A1) | 0.5738 | 0.6821 | 0.6988 | 0.6495 | 0.6338 | 0.9014 | 0.9404 | 0.8929 | 0.9160 | 0.9766 |
| Output (A1) |  |  |  |  |  |  |  |  |  |  |
| Input (A2) | 0.5311 | 0.6513 | 0.6727 | 0.5743 | 0.5701 | 0.8797 | 0.9264 | 0.8676 | 0.8896 | 0.8279 |
| Output (A2) |  |  |  |  |  |  |  |  |  |  |
| Input (A3) | 0.5083 | 0.6383 | 0.6785 | 0.5348 | 0.5720 | 0.8688 | 0.9202 | 0.8493 | 0.8747 | 0.8081 |
| Output (A3) |  |  |  |  |  |  |  |  |  |  |
| Input (A4) | 0.4965 | 0.6264 | 0.6635 | 0.5202 | 0.5575 | 0.8590 | 0.9161 | 0.8364 | 0.8673 | 0.8040 |
| Output (A4) |  |  |  |  |  |  |  |  |  |  |
| Input (A5) | 0.4959 | 0.6153 | 0.6595 | 0.4840 | 0.5407 | 0.8524 | 0.9080 | 0.8263 | 0.8493 | 0.7862 |
| Output (A5) |  |  |  |  |  |  |  |  |  |  |
| Ground truth | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 | 1 |
Table 3. Quantitative analysis of evaluation indices (SSIM and PCC) at different atmospheric turbulence intensities
3.3 Comparison between Different Methods and Ablation Study
In this section, we conduct a comprehensive comparative study of different methodologies, assessing their performance in restoring images under the challenging conditions of low SC and turbulent scenes. The traditional convolution-fusion frameworks U-Net42 and U-RDN13 were compared against ours to demonstrate the power of the proposed swin model.
In our network architecture, the swin transformer serves as a robust backbone module responsible for extracting high-level features from the input. Its shifted-window attention mechanism provides powerful hierarchical representation and global perception capabilities. However, the direct output of the swin transformer often exhibits artifacts and high noise levels in image restoration tasks. It is therefore necessary to add lightweight convolutional layers as a postprocessing block. Convolutional layers capture local features of the image through local receptive fields, aiding a better understanding of image details and textures while mapping from high-dimensional to low-dimensional spaces, which yields high-quality output. To validate the effectiveness of the postprocessing block in the swin model, we conducted an ablation study with a control group named pure swin, obtained by removing the postprocessing block from the swin model. The training procedures and data sets of all methods are identical. Figure 5 shows detailed comparisons of images processed by the various methods, and Fig. 6 gives the quantitative results on the various data sets; more qualitative results are provided in Figs. S3 and S4 in the Supplementary Material. Comparing the visual outputs of pure swin and the swin model, we find that pure swin produces black spots and a blurred appearance; its SSIM is 0.8396, a 7% reduction. This is because the swin transformer alone lacks the ability to capture local features and perform dimensional mapping. Convolutional layers fill this gap by refining and enhancing local features after the swin transformer blocks. The ablation study (comparison with pure swin) validates that the postprocessing module is indispensable for the swin model.
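The composition can be sketched as follows; the class names, channel widths, and depth are illustrative assumptions of ours, and the paper's exact architecture is given in Note 1 in the Supplementary Material:

```python
import torch
from torch import nn

class ConvPostProcess(nn.Module):
    """Lightweight convolutional postprocessing: refines local features and
    maps backbone features back to image space (illustrative widths)."""
    def __init__(self, in_ch: int = 96, out_ch: int = 1):
        super().__init__()
        self.refine = nn.Sequential(
            nn.Conv2d(in_ch, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, 64, kernel_size=3, padding=1), nn.ReLU(inplace=True),
            nn.Conv2d(64, out_ch, kernel_size=3, padding=1),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.refine(x)

class SwinRestorer(nn.Module):
    """Backbone + postprocessing composition; "pure swin" in the ablation
    corresponds to dropping `post`. Any module returning (B, C, H, W)
    feature maps, e.g., a swin transformer, can serve as the backbone."""
    def __init__(self, backbone: nn.Module, post: nn.Module):
        super().__init__()
        self.backbone, self.post = backbone, post

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.post(self.backbone(x))
```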
Figure 5. Visualization of the performance of different methods. The SSIM is shown in the bottom left corner. Our method presents the best performance, shown by smoother images with lower noise. (a) Sample selected from the WED data set and magnified insets of the red bounding region. (b) Sample selected from the Flickr data set and magnified insets of the red bounding region. The pure swin model can be obtained by removing the postprocessing block of the swin model (
Figure 6. Performance comparison among different methods on various data sets with an SC of 0.494. Our model outperforms other methods across the various data sets and indices.
We tested the performance of the other networks under the same conditions. Our proposed network outperforms the other methods, presenting the lowest noise and the best evaluation indices. Tables S6 and S7 in the Supplementary Material provide a detailed quantitative comparison across different models and SCs. In the task of image restoration under low SC, our methodology exhibits superior performance across all evaluation indices compared with the alternative approaches. Figure 7 shows the comparative performance of the various methods when faced with image degradation due to different turbulence types and intensities. We observed that all networks trained with SC as prior information, not just the swin model, significantly improve image quality under turbulent scenes. This is an exciting result, as it signifies the successful integration of physical prior information into network training, enabling the networks to be applied to multiple tasks and scenarios.
Figure 7. (a), (b) Performance comparison between different methods at various turbulent scenes. (A1)
4 Conclusions
By leveraging SC as physical prior information and harnessing advanced deep-learning algorithms, we proposed a methodology, TWC-Swin, which demonstrates exceptional capability in restoring images in both low-SC and random turbulent scenes. Our multicoherence and multiturbulence holographic imaging data sets, consisting of natural images, are created by the LPR, which can simulate different SCs and turbulence scenes (see Sec. 2). Although the swin model used in the tests was trained solely on the multicoherence data set, it achieves promising results in low-SC, oceanic turbulence, and atmospheric turbulence scenes. The key is that we capture the physical property common to these scenes, SC, and use it as physical prior information to generate the training set, so the TWC-Swin method exhibits remarkable generalization, effectively restoring images from unseen scenes beyond the training set. Furthermore, through a series of rigorous experiments and comparisons, we established the superiority of the swin model over traditional convolutional frameworks and alternative methods in both qualitative and quantitative analyses of image restoration (see Sec. 3). The integration of SC as a fundamental guiding principle in network training has proven to be a powerful strategy for aligning downstream tasks with pretrained models.
Our research findings offer guidance not only for the domain of optical imaging but also for integration with the segment anything model (SAM),43 extending its applicability to multiphysics scenarios. For instance, in turbulent scenes, our methodology can serve as preliminary image processing, allowing previously unusable images to support precise image recognition and segmentation tasks performed by SAM. Moreover, our experimental scheme suggests a simple approach to turbulence detection. Our research contributes valuable insights into the use of deep-learning algorithms for addressing image degradation in multiple scenes and highlights the importance of incorporating physical principles into network training. We expect that this work can serve as a successful case study for the combination of deep learning and holographic imaging, facilitating the synergistic advancement of optics and computer science.
Xin Tong is a PhD student at the School of Physics, Zhejiang University, Hangzhou, China. He received his BS degree in physics from Zhejiang University of Science and Technology, Hangzhou, China. His current research interests include holographic imaging, deep learning, computational imaging, and partial coherence theory.
Renjun Xu received his PhD from the University of California, Davis, California, United States. He is a ZJU100 Young Professor and a PhD supervisor at the Center for Data Science, Zhejiang University, Hangzhou, China. He was the senior director of data and artificial intelligence at VISA Inc. His research interests include machine learning, alignment techniques for large-scale pretrained models, transfer learning, space editing, transformation, generation, and the interdisciplinarity of physics and mathematics.
Pengfei Xu is a PhD student at the School of Physics, Zhejiang University, Hangzhou, China. He received his BS degree in physics from Zhejiang University, Hangzhou, China, in 2017. His current research interests include computational holographic imaging, partially coherent structured light field, and vortex beam manipulation techniques.
Zishuai Zeng is a PhD student at the School of Physics, Zhejiang University, Hangzhou, China. He received his BS degree in 2019 from the School of Information Optoelectronic Science and Engineering at South China Normal University. His current research interests include computer-generated holography, as well as beam propagation transformation and computational imaging.
Shuxi Liu is a PhD student at the School of Physics, Zhejiang University, China. He received his BS degree in physics from Zhejiang University in 2022. His current research interests include catastrophe optics, optical vortex, and computational imaging.
Daomu Zhao received his PhD from Zhejiang University, Hangzhou, China. Since 2003, he has been a professor in the School of Physics at Zhejiang University, where he is currently the director of the Institute of Optoelectronic Physics. He has broad research interests in beam transmission, coherence and polarization theory, diffraction optics, holographic imaging, and deep learning.
References
[4] R. Horisaki et al. Compressive propagation with coherence. Opt. Lett., 47, 613-616 (2022).
[9] X. Guo et al. Stokes meta-hologram toward optical cryptography. Nat. Commun., 13, 6687 (2022).
[23] J. Bertolotti, O. Katz. Imaging in complex media. Nat. Phys., 18, 1008-1017 (2022).
[31] K. He et al. Deep residual learning for image recognition. Proc. IEEE Conf. Comput. Vis. Pattern Recognit. (CVPR), 770-778 (2016).
[35] R. W. Gerchberg. A practical algorithm for the determination of phase from image and diffraction plane pictures. Optik, 35, 237-246 (1972).
[37] Z. Liu et al. Deep learning face attributes in the wild. Proc. IEEE Int. Conf. Comput. Vis. (ICCV), 3730-3738 (2015).
[39] W. Li et al. WebVision database: visual learning and understanding from web data. arXiv preprint (2017).
[41] I. Loshchilov, F. Hutter. Decoupled weight decay regularization. arXiv preprint (2017).
[43] A. Kirillov et al. Segment anything. arXiv preprint (2023).
