• Photonics Insights
  • Vol. 4, Issue 2, R03 (2025)
Xiyuan Luo1,†, Sen Wang1, Jinpeng Liu1,2, Xue Dong1, Piao He1, Qingyu Yang1, Xi Chen1, Feiyan Zhou1, Tong Zhang1, Shijie Feng2, Pingli Han1,3, Zhiming Zhou1, Meng Xiang1,3, Jiaming Qian2, Haigang Ma2, Shun Zhou2, Linpeng Lu2, Chao Zuo2,*, Zihan Geng4,*, Yi Wei5,* and Fei Liu1,3,*
Author Affiliations
  • 1School of Optoelectronic Engineering, Xidian University, Xi’an, China
  • 2School of Electronic and Optical Engineering, Nanjing University of Science and Technology, Nanjing, China
  • 3Xi’an Key Laboratory of Computational Imaging, Xi’an, China
  • 4Institute of Data and Information, Tsinghua Shenzhen International Graduate School, Tsinghua University, Shenzhen, China
  • 5Department of Mechanical Engineering, Massachusetts Institute of Technology, Cambridge, USA
    DOI: 10.3788/PI.2025.R03
    Xiyuan Luo, Sen Wang, Jinpeng Liu, Xue Dong, Piao He, Qingyu Yang, Xi Chen, Feiyan Zhou, Tong Zhang, Shijie Feng, Pingli Han, Zhiming Zhou, Meng Xiang, Jiaming Qian, Haigang Ma, Shun Zhou, Linpeng Lu, Chao Zuo, Zihan Geng, Yi Wei, Fei Liu, "Revolutionizing optical imaging: computational imaging via deep learning," Photon. Insights 4, R03 (2025)
    Fig. 1. Data-driven computational imaging link.
    Fig. 2. Applications of computational optical system design.
    Fig. 3. Deep-learning-based wavefront reconstruction method for adaptive systems. (a) A simplified network diagram for CARMEN[36]. (b) System controller design based on a BP artificial neural network[37]. (c) An architecture with 900 hidden-layer neurons. (d) The original image, the network output, and the spot centers found by the spot-center algorithm[38].
    Fig. 4. Deep-learning-based wavefront phase recovery method. (a) ISNet architecture[40]. (b) Adapted Inception v3 architecture[41]. (c) Application procedure of the feature-based phase retrieval wavefront sensing approach using machine learning[42].
    Fig. 5. Image-based wavefront aberration estimation method. (a) Experimental optical system setup for the learning-based Shack–Hartmann wavefront sensor[44]. (b) Sketch map of the object-independent wavefront sensing approach using deep LSTM networks[45]. (c) Schematic and experimental diagrams of the deep learning wavefront sensor[46].
    Fig. 6. Wavefront aberration estimation method based on phase and intensity information. (a) Working procedure of the De-VGG wavefront sensing approach[47]. (b) The architecture of the phase diversity-convolutional neural network (PD-CNN)[48].
    Fig. 7. Design framework for automatic generation of free-form initial structures[51].
    Fig. 8. Deep-learning-based method for automatic initial structure generation. (a) Overview of the deep learning framework used by Geoffroi et al.[53]. (b) Upon receiving a given set of specifications and lens sequence, the model outputs K = 8 different lenses that share the same sequence but differ in structures[55].
    Fig. 9. Deep-learning-based optical design platform. (a) Illustration of the generalized design framework[58]. (b) Flowchart of the proposed deep learning optical design (DLOD) framework. (c) Eyepiece lens before and after training[59].
    Fig. 10. Large field-of-view computational thin plate lens imaging method. (a) The RRG-GAN network architecture. (b) RRG-GAN restoration results on the manual dataset[61].
    Fig. 11. U-Net (Res-U-net)++ network structure and imaging effects at different defocus distances. (a) Schematic of the network structure. The raw image is used as the input for the model, and the reconstructed image is the output of the model. (b) Image quality of the Cooke triplet lens and the doublet lens systems with the defocus amount varying within the range of −0.2 to 0.2 mm[62].
    Fig. 12. Deep-learning-based aberration compensation method. (a) End-to-end deep learning-based image reconstruction pipeline for single and dual MDLs. (b) Examples of images with reconstruction artifacts[64].
    Fig. 13. Architecture of deep-learning-based super-resolution method for shadow imaging. (a) Overall architecture of CNNSR processing[65]. (b) CSRnet structure model[66]. (c) Schematic of SRGAN-enabled contact microscopy[68].
    Fig. 14. Architecture of the customized AlexNet deep learning model. (a) Input cell images of size 30 pixel × 30 pixel were resized to 50 pixel × 50 pixel. (b) The resized images then pass through eight convolutional layers[71].
    Fig. 15. Schematic of the signal enhancement network and enhancement effect used by Rajkumar et al. (a) Schematic of the proposed denoising and classification architecture. (b) Reconstructed results from the optimized CNN[72].
    Fig. 16. Framework and results of the hologram information reconstruction network. (a) Network flowchart by Yair et al. (b) Red blood cell volume estimation using deep-neural-network-based phase retrieval[25]. (c) Architecture of Y-Net[75]. (d) Process of dataset generation (orange), network training (gray), and network testing (blue). (e) Reconstruction results of the example pit and pollen for single and synthetic wavelengths[76].
    Fig. 17. Architecture of the RedCap model and results. (a) Architecture of proposed RedCap model for holographic reconstruction. (b) Reconstructed image comparison[78].
    Fig. 18. Deep-learning-based super-resolution holographic imaging. (a) Overview of the deep-learning-based pixel super-resolution approach[79]. (b) Schematic diagram of the CNN architecture. (c) The main components of GAN[80].
    Fig. 19. WavResNet network flow[81].
    Fig. 20. PhaseStain workflow and virtual staining results. (a) PhaseStain workflow. (b) Virtual H&E staining of label-free skin tissue using the PhaseStain framework[83].
    Fig. 21. Deep-learning-based PSF estimation method[84].
    Fig. 22. Workflow diagram of deep learning methods for computational acceleration. (a) Overview of the imaging pipeline by Kristina et al.[85]. (b) The overall architecture of MultiFlatNet. (c) Reconstruction results of real captures using various approaches[86].
    Fig. 23. Deep neural network based on mask properties. (a) Overview of imaging pipeline by Zhou et al.[89]. (b) Diagram of end-to-end DPDAN.
    Fig. 24. Schematic of the network framework of DNN-FZA. (a) Image acquisition pipeline and reconstruction for the DNN-FZA camera. (b) Architecture of the U-Net. (c) Up- and down-projection units in DBPN[92].
    Fig. 25. MMCN network configuration framework[93].
    Fig. 26. Schematic of the network of the method proposed by Pan et al. (a) Lens-free imaging transformer proposed by Pan et al. (b) Results of image-on-screen experiment and object-in-wild experiment[94].
    Fig. 27. GI incorporating CNN. (a) Experimental setup of GI for detecting scattered light. (b) Diagram of CNN architecture for GI. (c) Experimental results of GI image without CNN. (d) Experimental results of GI image reconstructed using CNN[102].
    Fig. 28. Network architecture of the PGI system and result diagrams[102]. (a) CNN architecture in the PGI system. (b) Comparison of the DL reconstruction and CS reconstruction results for different sampling rates and turbidity scenarios.
    Fig. 29. (a) Diagram of DNN network architecture. (b) Training process of CGIDL. (c) Plot of experimental results of different methods[106].
    Fig. 30. (a) Structure of recurrent neural network combining convolutional layers. (b) Comparison of results[110].
    Fig. 31. (a) Structure of DNN network. (b) Comparison of the original image and the effects of different methods[112].
    Fig. 32. Network architecture of the proposed DL-FSPI system and experimental results[113]. (a) DCAN network architecture used. (b) Experimental results of the DL-FSPI system.
    Fig. 33. (a) Architecture of SPCI-Net layer[115]. (b) Diagram of the experimental setup. (c) Visualization of different methods under high-SNR conditions. (d) Visualization of different methods under low-SNR conditions.
    Fig. 34. Comparison of OGTM network architecture and experimental results[118]. (a) OGTM network architecture and transfer learning network. (b) Comparison of OGTM experimental results.
    Fig. 35. Applications of light-field high-dimensional information acquisition.
    Fig. 36. (a) Experimental setup. (b) CNN architecture. (c) Testing results of "seen objects" through unseen diffusers[20].
    Fig. 37. (a) Experimental setup uses a DMD as the object. (b) Test results of the complex object dataset with two characters, three characters, and four characters[140].
    Fig. 38. Facial speckle image reconstruction by SpT UNet network[137]. (a) SpT UNet network architecture and (b) reconstructed image results.
    Fig. 39. The architecture of the proposed denoiser network. (a)–(e) Single image super-resolution performance comparison for Butterfly image[144].
    Fig. 40. (a) CNN’s architecture. (b) Visual comparison of deblurring results on images “Boat” and “Couple” in the presence of AWGN with unknown strength[143].
    Fig. 41. (a), (b) Schematic illustration for imaging through a scattering medium. (c), (d) Schematic of the PSE-deep method. (e) The comparison of the reconstruction results from different methods[150].
    Fig. 42. (a) Experimental setup. (b) The structure of GAN, (b1) the structure of the generative network, and (b2) the structure of the discriminative network. (c) The reconstruction results of multiple continuous shots captured with the same object under the same dynamic scattering media[154].
    Fig. 43. Active and passive NLOS imaging[161–167].
    Fig. 44. (a) Non-line-of-sight (NLOS) physics-based 3D human pose estimation. (b) Isogawa et al.'s DeepRL-based photons-to-3D human pose estimation framework under the laws of physics[203].
    Fig. 45. Overview of the proposed dynamic-excitation-based steady-state NLOS imaging framework[207].
    Fig. 46. (a) The structure of the two-step DNN strategy. (b) The corresponding cropped speckle patterns, their autocorrelation, and the reconstructed images with the proposed two-step method[214].
    Fig. 47. Flowchart of PCIN algorithm for NLOS imaging reconstruction[219]. The speckle image captured by the camera is put into CNN, and PCIN iteratively updates the parameters in CNN using the loss function constructed by the speckle image and forward physical model. The optimized parameters are utilized to obtain a high-quality reconstructed image.
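For readers unfamiliar with this physics-constrained style of reconstruction, a minimal PyTorch sketch is given below. It is not the authors' PCIN implementation; the small network, the convolutional forward model, and all hyperparameters are illustrative assumptions. It only demonstrates the loop stated in the caption: the CNN parameters are updated iteratively so that the forward physical model applied to the CNN output reproduces the measured speckle image.

```python
# Minimal sketch of a physics-constrained, untrained-network reconstruction loop
# (in the spirit of Fig. 47). Names and settings are illustrative assumptions,
# not the authors' implementation.
import torch
import torch.nn as nn
import torch.nn.functional as F

class SmallCNN(nn.Module):
    """Tiny CNN that maps the speckle image to an object estimate."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Conv2d(1, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 16, 3, padding=1), nn.ReLU(),
            nn.Conv2d(16, 1, 3, padding=1), nn.Sigmoid(),
        )
    def forward(self, x):
        return self.net(x)

def forward_model(obj, psf):
    """Assumed forward model: speckle ~ object convolved with a PSF."""
    pad = psf.shape[-1] // 2
    return F.conv2d(obj, psf, padding=pad)

# Hypothetical measurement and PSF; in practice these come from the experiment
# and a calibrated physical model.
speckle = torch.rand(1, 1, 64, 64)
psf = torch.rand(1, 1, 9, 9)
psf = psf / psf.sum()

net = SmallCNN()
opt = torch.optim.Adam(net.parameters(), lr=1e-3)

for it in range(200):                     # iterative parameter update
    obj_est = net(speckle)                # CNN output = current object estimate
    loss = F.mse_loss(forward_model(obj_est, psf), speckle)  # physics-based loss
    opt.zero_grad()
    loss.backward()
    opt.step()

reconstruction = net(speckle).detach()    # final reconstructed image
```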
    Fig. 48. (a) Network architecture and result of Ref. [235]. (b) Network architecture and result of Ref. [237].
    Fig. 49. Network architecture and result of PDN[247].
    Fig. 50. (a) Network architecture and result of PRDN[245]. (b) Network architecture and result of MTM-Net[252]. (c) Network architecture and result of underwater descattering network[5].
    Fig. 51. Network architecture and experimental results of Lin et al.[253]. (a) Proposed network architecture. (b) Results of image deblurring experiments.
    Fig. 52. Comparison of Gated2Depth network architecture and experimental results[255–257]. (a) Gated2Depth network architecture. (b) Comparison of experimental results of different methods.
    Fig. 53. Underlying principle of TR-PAM and theoretical prediction of PA response[269]. (a) Laser-induced thermoelastic displacement (u) and subsequent PA response based on the optical-absorption-induced thermoelastic expansion for vascular tissues. (b) The principle of TR-PAM: elasticity and calcification estimations from vascular PA time response characteristics. (c) Experimental setup of the TR-PAM.
    Fig. 54. The process of using the 4D spectral–spatial computational PAD combined with experiments for dataset acquisition and system optimization for deep learning[271]. (a) Relevant parameters can be set before data acquisition, and the distribution of the model optical field and detector acoustic field under a collimated Gaussian beam in the model is shown. (b) Feedback on relevant performance optimization parameters is provided to the experimental system after simulating calculations. (c) Experimental system. (d) The dataset is used for training the spread-spectrum network model. (e) The dataset is used for training the depth-enhanced network model. (f) The low-center-frequency detector skin imaging results obtained in the experiment are input into the trained spread-spectrum model to obtain the output image. (g) The skin imaging results under conventional scattering obtained in the experiment are input into the trained depth-enhanced model to obtain the output image.
    Fig. 55. PtyNet-S network architecture and experimental prediction results[273]. (a) Structure of PtyNet-S. (b) Phase reconstruction results of PtyNet-S on a tungsten test pattern.
    Fig. 56. Using networks to reduce color artifacts and improve image quality for different slices[282]. (a1), (a2) FPM raw data. (b1), (b2) Network input. (c1), (c2) Network output. (d1), (d2) FPM color image. (e1), (e2) Ground truth. (f) Train two generator-discriminator pairs using two mismatched image sets. (g) Generator A2B accepts FPM input and outputs virtually stained images.
    Fig. 57. Network architecture and results comparison[285]. (a) Network architecture combining two models. (b) Amplitude contrast. From left to right, the original image and its restorations using the GS algorithm, the MFP algorithm, and the algorithm proposed in Ref. [285]. (c) Phase contrast. From left to right, the original image and its restorations using the GS algorithm, the MFP algorithm, and the algorithm proposed in Ref. [285].
    Fig. 58. Reconstruction methods and results under different overlapping rates[287]. (a) Image reconstruction process with low overlap rate. (b) Image reconstruction process with high overlap rate. (c) Phase recovery results of different methods with low overlap rate; from top to bottom are the alternating projection (AP) phase recovery algorithm, PtychNet, cGAN-FP, and the ground truth. (d) Phase recovery results of different methods with high overlap rate; from top to bottom are the AP phase recovery algorithm, PtychNet, cGAN-FP, and the ground truth.
    Fig. 59. Network architecture and result analysis[31]. (a) Workflow of deep-learning-based Fourier ptychographic dynamic imaging reconstruction. (b) Temporal dynamic information reconstructed by the proposed CNN, compared with the ground truth. (c) Network architecture of the conditional generative adversarial network (cGAN) for FPM dynamic image reconstruction.
    Fig. 60. The network architecture and simulation results[292]. (a) The network architecture. (b1), (b2) High-resolution amplitude and phase images for simulation. (c1)–(c3) The output of the CNN based on (a) and different wave vectors.
    Fig. 61. Network architecture and result analysis. (a) Deep-SLAM procedure[298]. (b)–(d) Wide-FOV and isotropic Deep-SLAM imaging. x-z maximum intensity projections (MIPs) of sub-diffraction fluorescence beads imaged by thick SLAM mode (b, 0.02 illumination NA), thin SLAM mode (c, 0.06 illumination NA), and Deep-SLAM mode (d, 0.02 illumination NA + iso-CARE). Scale bars: 50 µm. The magnified views of the PSFs indicate the resolving power by each mode. Scale bars: 20 µm for insets.
    Fig. 62. (a) DR-Storm network architecture. (b) Comparison of experimental STORM data of Deep-STORM and DRL-STORM[300]. (b1) The sum of 500 frames of original images. (b2) Intensity distribution along the dotted white line. (b3), (b4) Images reconstructed using Deep-STORM and DR-Storm, respectively.
    Fig. 63. UNet-RCAN architecture and result analysis[304]. (a) UNet-RCAN network. (b) Restoration results of UNet-RCAN, 2D-RCAN, CARE, pix2pix, and deconvolution on noisy 2D-STED images for β-tubulin in U2OS cells in comparison to the ground-truth STED data.
    Fig. 64. Network training process and phase unwrapping results[311]. (a) Training and testing of the network. (b) CNN results for samples from the test image set: wrapped, true, and unwrapped phase images of the CNN output. (c) Comparison of the phase heights at both ends of the centerline for the true and unwrapped phases of the CNN output.
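As a reminder of the task the network in Fig. 64 learns, a wrapped phase is the true phase folded into (−π, π]. The short NumPy sketch below (illustrative values only, not the authors' data or model) shows how a wrapped map arises from a smooth phase and how a classical 1D unwrap recovers it; the CNN is trained to perform the analogous mapping in 2D.

```python
# Illustrative NumPy sketch of phase wrapping/unwrapping, the mapping the CNN
# in Fig. 64 learns in 2D. Values and shapes are arbitrary.
import numpy as np

x = np.linspace(0, 4 * np.pi, 256)
true_phase = 3.0 * x                          # smooth "true" phase ramp (radians)

wrapped = np.angle(np.exp(1j * true_phase))   # fold into (-pi, pi]

unwrapped = np.unwrap(wrapped)                # classical 1D unwrapping for comparison

# Up to a constant 2*pi*k offset, the unwrapped result matches the true phase.
offset = true_phase[0] - unwrapped[0]
print(np.allclose(unwrapped + offset, true_phase, atol=1e-6))
```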
    Fig. 65. Experimental setup, PhaseNet architecture, and results[314]. (a) The true values. (b) Reconstruction results of PhaseNet. (c) Three-dimensional images of the reconstructed results. (d) Error mapping between the true value and PhaseNet reconstruction results. (e) Deep-learning-based holographic microscope. (f) Detailed schematic of the PhaseNet architecture.
    Fig. 66. (a) Schematic diagram of NFTPM[318]. (b) The physics prior (forward image formation model) of NFTPM. (c1), (c2) Phases of HeLa cells retrieved by AI-TIE. (d1), (d2) Phases of HeLa cells retrieved by NFTPM.
    Fig. 67. Flowchart of deep-learning-based phase retrieval method and the 3D reconstruction results of different approaches[331]. (a) The principle of deep-learning-based phase retrieval method: first, the background map A is predicted from the single-frame stripe image I by CNN1; then, the mapping of the stripe pattern I and the predicted background map A to the numerator term M and denominator term D of the inverse tangent function is realized by CNN2; finally, a high-precision wrapped phase map can be obtained by the arctangent function. (b) Comparison of the 3D reconstructions of different fringe analysis approaches (FT, WFT, the deep-learning-based method, and 12-step phase-shifting profilometry).
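The final arctangent step described in Fig. 67 can be written in a few lines. The sketch below assumes two placeholder callables, cnn1 and cnn2, standing in for the trained networks (they are not the authors' models; here they are replaced by ideal analytic stand-ins), and shows only how I, A, M, and D combine into the wrapped phase.

```python
# Sketch of the inference path described in Fig. 67; cnn1 and cnn2 are
# hypothetical stand-ins for the trained networks, not the authors' code.
import numpy as np

def predict_wrapped_phase(I, cnn1, cnn2):
    """I: single-frame fringe image (H, W) -> wrapped phase map."""
    A = cnn1(I)              # CNN1: predict the background map A from I
    M, D = cnn2(I, A)        # CNN2: map (I, A) to numerator M and denominator D
    return np.arctan2(M, D)  # arctangent yields the wrapped phase

# Tiny synthetic check with "oracle" stand-ins for the two networks:
H, W = 64, 64
yy, xx = np.mgrid[0:H, 0:W]
phi = 0.05 * (xx - W / 2) ** 2 / W + 2 * np.pi * xx / 8   # total fringe phase
A_true, B = 0.5, 0.4                                      # background and modulation
fringe = A_true + B * np.cos(phi)

cnn1 = lambda I: np.full_like(I, A_true)          # ideal background predictor
cnn2 = lambda I, A: (B * np.sin(phi),             # ideal numerator term M
                     B * np.cos(phi))             # ideal denominator term D

wrapped = predict_wrapped_phase(fringe, cnn1, cnn2)
```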
    Fig. 68. (a) Network architecture and result of global guided network path with multiscale feature fusion[338]. (b) Network architecture and result of depth measurement based on convolutional neural networks[342].
    Fig. 69. Flowchart of DLMFPP: the projector sequentially projects the stripe pattern onto the dynamic scene so that the corresponding modulated stripe images encode the scene at different times[345]. The camera then captures the multiplexed images with a longer exposure time and obtains the spatial spectrum by Fourier transform. A synthesized scene consisting of the letters "MULTIPLEX" is used to illustrate the principle.
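To make the Fourier step in Fig. 69 concrete, the toy NumPy sketch below sums fringe patterns with different carrier frequencies into one long-exposure image and takes its 2D FFT; each carrier shows up as a separate pair of side lobes in the spatial spectrum, which is what allows the temporally multiplexed frames to be separated. The scene, carrier periods, and exposure model are assumptions for illustration, not the DLMFPP configuration.

```python
# Toy illustration of the multiplexing idea behind DLMFPP (Fig. 69): fringe
# patterns at different carrier frequencies sum in one long exposure, and their
# carriers appear as distinct peaks in the spatial spectrum.
# Scene, carrier periods, and exposure model are illustrative assumptions.
import numpy as np

H, W = 128, 128
yy, xx = np.mgrid[0:H, 0:W]

def fringe_frame(scene_phase, carrier_period):
    """One modulated fringe image: carrier plus scene-dependent phase."""
    return 0.5 + 0.5 * np.cos(2 * np.pi * xx / carrier_period + scene_phase)

# Three "time instants" of a dynamic scene, each encoded with its own carrier.
phases = [0.02 * (xx - t * 10) ** 2 / W for t in range(3)]
periods = [6, 10, 16]
multiplexed = sum(fringe_frame(p, T) for p, T in zip(phases, periods))

# Long-exposure capture ~ sum of the modulated frames; its 2D spectrum contains
# one pair of side lobes per carrier, which downstream processing can isolate.
spectrum = np.fft.fftshift(np.fft.fft2(multiplexed))
magnitude = np.log1p(np.abs(spectrum))
print(magnitude.shape)
```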
    Fig. 70. (a) Results obtained using different lidar devices in 360°[346]. (b) Previous framework versus proposed approach for 3D detection[347]. (c) Result of BirdNet+[348].
    Fig. 71. (a) The network design of MVSNet[351]. (b) The pipeline of reconstructing a 3D model using sparse RGB-D images.
    Fig. 72. (a) The structure of ACV-Net[353]. (b) The structure of proposed Sparse PatchMatch[354].
    Fig. 73. Applications of data-driven polarimetric imaging.
    Fig. 74. (a) Passive 3D polarization face reconstruction method[373]. (b) Overview of Huang’s approach[376].
    Fig. 75. Network model of cotton foreign fiber detection[394].
    Fig. 76. Flowchart of the algorithm NSAE-WFCM[399]. NSAE represents the pixel neighborhood-based SAE with several hidden layers, and WFCM denotes the feature-WFCM under various land cover types.
    Fig. 77. Pang’s network architecture and result[389].
    Fig. 78. (a) Network architecture and result of PFNet. (b) Network architecture and result of Sun’s network[410].
    Fig. 79. (a) Mueller matrix microscope and the schematic of Li’s system[434,435]. (b) Results of classification experiments on algal samples of Ref. [434]. (c) Results of classification experiments on algal samples of Ref. [401]. (d) Network architecture and results of classification experiments on algal samples of Ref. [436].
    Fig. 80. Polarization-imaging-based dual-mode machine learning framework for quantitative diagnosis of cervical precancerous lesions[443].
    Fig. 81. (a) CNN training network and the maximum testing precision of each sample under four different algorithms[447]. (b) Classification results of the training process (blue arrows) and testing process (red arrows) of 3D-CCNN for spectral data from the University of Pavia[449].
    Fig. 82. Overall structure of the autoencoder incorporating graph-attention convolution, and the estimated abundance maps for the Samson dataset[464].
    Fig. 83. (a) λ-net network structure and "bird" reconstruction data[469]. (b) Generic framework constructed by connecting each stage sequentially, and reconstruction results on the real "wheel" data[471].
    Fig. 84. (a) Joint optimization framework for coded aperture and HSI reconstruction[472]. (b) The reconstruction results with PSNR and spectral angle mapper (SAM) values[473].
    Fig. 85. (a) Schematic diagram of the optical system[478]. (b) Simplified block diagram of the experiment. (c) Reconstructed images obtained by DD1-Net and DD2-Net.
    Fig. 86. (a) Example results for end-to-end network architecture and CAVE database[482]. (b) Schematic of the PCSED framework and example results from the CAVE database[483].
    Fig. 87. Main application areas of computational processing.
    Fig. 88. Schematic diagram of image fusion scene[488].
    Fig. 89. MMF-Net[495]. (a) Model framework. (b) Source image A. (c) Source image B. (d) Fused image.
    Fig. 90. Multi-focus image fusion network[498]. (a) Model framework. (b) The “Model Girl” source image pair and their fused images obtained with different fusion methods.
    Fig. 91. Effects of different multi-focus image fusion networks[500–503].
    Fig. 92. EFCNN[509]. (a) Network structure. (b) Source images. (c) Fusion image.
    Fig. 93. Unsupervised multi-exposure image fusion network DeepFuse[512]. (a) Network architecture. (b) Underexposed image. (c) Overexposed image. (d) Fusion result.
    Fig. 94. GANFuse image fusion network[515]. (a) Network architecture. (b) Overexposed image. (c) Underexposed image. (d) Fusion result.
    Fig. 95. Image fusion network based on multi-scale feature cascades and non-local attention[519]. (a) Network architecture. (b) Infrared image. (c) Visible image. (d) Fusion result.
    Fig. 96. Unsupervised infrared and visible image fusion network based on DenseNet[521]. (a) Network structure. (b) Infrared image. (c) Visible image. (d) Fusion image.
    Fig. 97. Medical images of different modes[526].
    Fig. 98. Medical fusion method[531]. (a) Network architecture. (b) Fusion results for CT and MRI images. (c) Fusion results for MRI and PET images. (d) Fusion results for MRI and SPET images.
    Fig. 99. TIEF[532]. (a) Network architecture. (b) Fusion image.
    Fig. 100. Basic CNN architecture for PNN sharpening[535].
    Fig. 101. RMFF-UPGAN[540]. (a) Network architecture. (b) EXP. (c) D_P. (d) Fusion result. (e) Ground truth.
    Fig. 102. FFDNet network[547]. (a) Network structure. (b) Noisy image. (c) Denoised image.
    Fig. 103. ERDF network[549]. (a) Architecture of the proposed lightweight zero-shot network. (b) Qualitative comparison of denoising for different methods along with the corresponding PSNR.
    Fig. 104. DRANet network[550]. (a) Network structure. (b) Ground truth A. (c) Noisy image A. (d) Denoised image A. (e) Ground truth B. (f) Noisy image B. (g) Denoised image B.
    Fig. 105. RCA-GAN[552]. (a) Network structure. (b) The ground truth. (c) Noisy images. (d) Denoised images.
    Fig. 106. ALDIP-SSTV network[555]. (a) Network structure. (b) Noisy image A. (c) Denoised image A. (d) Noisy image B. (e) Denoised image B.
    Fig. 107. Four-branch image noise reduction network[557]. (a) Network structure. (b) Noisy image A. (c) Denoised image A. (d) Noisy image B. (e) Denoised image B.
    Fig. 108. Low-light image enhancement network[561]. (a) The network structure. (b) Input image and enhancement results.
    Fig. 109. Improved UM-GAN network[564]. (a) Network structure. (b) Low-light inputs and enhancement images.
    Fig. 110. MBLLEN network[566]. (a) Network structure. (b) Input image. (c) Enhancement result.
    Fig. 111. UWGAN[568]. (a) Network structure. (b) Input image A. (c) Enhancement image A. (d) Input image B. (e) Enhancement image B.
    Fig. 112. UWGAN. (a) Network structure[570]. (b) Input image A. (c) Enhancement image A. (d) Input image B. (e) Enhancement image B.
    Fig. 113. Semi-supervised image de-fogging network[574]. (a) Network architecture. (b) Image containing fog. (c) Fog removal image.
    Fig. 114. SWCGAN. (a) Network structure[579]. (b) Low-resolution image. (c) Super-resolution recovered image.
    Fig. 115. Image compression network[584]. (a) Network framework. (b) Original image 1. (c) Compressed image 1. (d) Original image 2. (e) Compressed image 2.