• Photonics Research
  • Vol. 9, Issue 12, 2464 (2021)
Xianglei Liu1,†, João Monteiro1,†, Isabela Albuquerque1, Yingming Lai1, Cheng Jiang1, Shian Zhang2, Tiago H. Falk1,3,*, and Jinyang Liang1,4,*
Author Affiliations
  • 1Centre Énergie Matériaux Télécommunications, Institut National de la Recherche Scientifique, Varennes, Québec J3X1S2, Canada
  • 2State Key Laboratory of Precision Spectroscopy, East China Normal University, Shanghai 200062, China
  • 3e-mail: falk@emt.inrs.ca
  • 4e-mail: jinyang.liang@emt.inrs.ca
    DOI: 10.1364/PRJ.422179
    Xianglei Liu, João Monteiro, Isabela Albuquerque, Yingming Lai, Cheng Jiang, Shian Zhang, Tiago H. Falk, Jinyang Liang. Single-shot real-time compressed ultrahigh-speed imaging enabled by a snapshot-to-video autoencoder[J]. Photonics Research, 2021, 9(12): 2464
    Fig. 1. Single-shot machine-learning assisted real-time (SMART) compressed optical-streaking ultrahigh-speed photography (COSUP). (a) System schematic. (b) Operating principle. S2V-AE, snapshot-to-video autoencoder.
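The operating principle in Fig. 1(b) — spatial encoding of each frame by a pseudo-random mask, optical streaking, and temporal integration onto a single 2D snapshot — can be illustrated with a toy forward model. The sketch below is not the authors' implementation; the frame count, mask, and one-pixel-per-frame streak shift are all assumed values for illustration:

```python
import numpy as np

def cosup_snapshot(video, mask, shift=1):
    """Toy COSUP forward model: mask-encode each frame, shift it along y
    by the streak displacement, and sum everything into one 2D snapshot."""
    n_frames, h, w = video.shape
    snapshot = np.zeros((h + shift * (n_frames - 1), w))
    for t in range(n_frames):
        y0 = t * shift                        # streak offset of frame t
        snapshot[y0:y0 + h, :] += video[t] * mask
    return snapshot

rng = np.random.default_rng(0)
video = rng.random((8, 16, 16))               # (frames, H, W) toy scene
mask = (rng.random((16, 16)) > 0.5).astype(float)
snap = cosup_snapshot(video, mask)
print(snap.shape)                             # streaked snapshot, taller than one frame
```

Because the operation is a sum of masked, shifted frames, the snapshot conserves the total encoded intensity — the reconstruction task is to invert this many-to-one mapping.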
    Fig. 2. Snapshot-to-video autoencoder (S2V-AE). (a) General architecture. FI, frame index. (b) Architecture of encoder showing the generation of latent vectors from a compressively recorded snapshot. Bi-LSTM, bidirectional long short-term memory; BN, batch normalization; ReLU, rectified linear unit; W, H, and N, output dimensions; Win, Hin, and Nin, input dimensions. (c) Architecture of the generator showing the reconstruction of a single frame from one latent vector. (d) Generative adversarial networks (GANs) with multiple discriminators {Dk}. LDk, the loss function of each discriminator; LG, the loss function of the generator; and {pk}, random projection with a kernel size of [8,8] and a stride of [2,2]. (e) Architecture of each discriminator.
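The only concretely specified piece of the GAN front end in Fig. 2(d) is the random projection {pk} with an 8×8 kernel and stride 2, which gives each discriminator its own fixed low-dimensional view of the input. A minimal numpy sketch of that projection step (the discriminator networks themselves, and all shapes below, are assumptions for illustration):

```python
import numpy as np

def random_projection(x, k, stride=2):
    """Fixed random strided 'convolution' (valid padding): each GAN
    discriminator D_k sees the input only through its own kernel p_k."""
    kh, kw = k.shape
    h_out = (x.shape[0] - kh) // stride + 1
    w_out = (x.shape[1] - kw) // stride + 1
    out = np.zeros((h_out, w_out))
    for i in range(h_out):
        for j in range(w_out):
            patch = x[i*stride:i*stride+kh, j*stride:j*stride+kw]
            out[i, j] = np.sum(patch * k)
    return out

rng = np.random.default_rng(1)
x = rng.random((64, 64))                                   # toy input frame
kernels = [rng.standard_normal((8, 8)) for _ in range(3)]  # one p_k per discriminator
views = [random_projection(x, k) for k in kernels]
print(views[0].shape)
```

Keeping the kernels fixed (not trained) is what makes the projections cheap, diverse views: each discriminator judges a different random slice of image statistics.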
    Fig. 3. Simulation of video reconstruction using the S2V-AE. (a) Six representative frames of the ground truth (GT, top row) and the reconstructed result (bottom row) of the handwritten digit “3.” The snapshot is shown in the far right column. (b), (c) As (a), but showing handwritten digits 5 and 7. (d), (e) Peak SNR and the structural similarity index measure (SSIM) of each reconstructed frame for the three handwritten digits.
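The per-frame metrics in Figs. 3(d) and 3(e) — peak SNR and SSIM — are standard and easy to reproduce. Below is a numpy sketch; note the SSIM here is the simplified single-window (global-statistics) form, whereas the standard metric averages it over sliding windows:

```python
import numpy as np

def psnr(gt, rec, peak=1.0):
    """Peak signal-to-noise ratio in dB."""
    mse = np.mean((gt - rec) ** 2)
    return 10.0 * np.log10(peak ** 2 / mse)

def ssim_global(gt, rec, peak=1.0):
    """Simplified SSIM computed from whole-frame statistics."""
    c1, c2 = (0.01 * peak) ** 2, (0.03 * peak) ** 2
    mu_x, mu_y = gt.mean(), rec.mean()
    var_x, var_y = gt.var(), rec.var()
    cov = ((gt - mu_x) * (rec - mu_y)).mean()
    return ((2 * mu_x * mu_y + c1) * (2 * cov + c2)) / \
           ((mu_x**2 + mu_y**2 + c1) * (var_x + var_y + c2))

rng = np.random.default_rng(2)
gt = rng.random((32, 32))
rec = np.clip(gt + 0.05 * rng.standard_normal((32, 32)), 0, 1)
print(round(psnr(gt, rec), 1), round(ssim_global(gt, rec), 3))
```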
    Fig. 4. SMART-COSUP of animation of bouncing balls at 5 kfps. (a) Experimental setup. DMD, digital micromirror device. Inset: an experimentally acquired snapshot. (b) Five representative frames with 4 ms intervals in the ground truth (GT) and the videos reconstructed by TwIST, PnP-ADMM, and S2V-AE, respectively. Centroids of the three balls are used as vertices to build a triangle (delineated by cyan dashed lines), whose geometric center is marked with a green asterisk. (c), (d) PSNR and SSIM at each reconstructed frame. (e) Comparison of the positions of the geometric center between the GT and the reconstructed results in the x direction. (f) As (e), but showing the results in the y direction.
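The analysis in Figs. 4(e) and 4(f) tracks the geometric center of the triangle whose vertices are the three balls' centroids. A toy numpy version of that bookkeeping (the segmentation into per-ball images and the pixel positions are assumed, for illustration only):

```python
import numpy as np

def centroid(img):
    """Intensity-weighted centroid (x, y) of one segmented ball."""
    ys, xs = np.mgrid[0:img.shape[0], 0:img.shape[1]]
    s = img.sum()
    return (xs * img).sum() / s, (ys * img).sum() / s

# toy frame: each ball segmented into its own binary image
balls = []
for x, y in [(2, 3), (10, 3), (6, 11)]:
    b = np.zeros((16, 16))
    b[y, x] = 1.0
    balls.append(b)

vertices = np.array([centroid(b) for b in balls])   # triangle vertices
center = vertices.mean(axis=0)                      # geometric center
print(center)
```

Tracking the geometric center rather than individual balls averages out per-ball localization noise, which is why it is a useful summary trajectory for comparing reconstruction algorithms.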
    Fig. 5. SMART-COSUP of multiple-particle tracking at 20 kfps. (a) Experimental setup. (b) Static image of three microspheres (labeled as M1−M3) and the radii (labeled as rM1 and rM3). (c) Time-integrated image of the rotating microspheres imaged at the intrinsic frame rate of the CMOS camera (20 fps). (d) Color-coded overlay (top image) of five reconstructed frames (bottom row) with a 1 ms interval. (e) Time histories of the microspheres’ centroids. (f) Measured velocities of microspheres with fitting.
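For rotating microspheres as in Fig. 5(f), a velocity can be recovered by fitting the tracked centroid trajectory. One common approach — fit the unwrapped angular phase linearly to get the angular velocity ω, then take the linear speed v = ωr — is sketched below on a synthetic circular track (the frame rate matches the caption; the radius and rotation rate are assumed values):

```python
import numpy as np

# toy trajectory: a microsphere on a circle of radius r, sampled at 20 kfps
fps, r, omega = 20_000.0, 50e-6, 2 * np.pi * 10   # 10 rev/s, assumed
t = np.arange(40) / fps
x, y = r * np.cos(omega * t), r * np.sin(omega * t)

# recover angular velocity by a linear fit of the unwrapped phase
theta = np.unwrap(np.arctan2(y, x))
omega_fit = np.polyfit(t, theta, 1)[0]
v = omega_fit * r                                  # linear speed v = ω r
print(round(v * 1e3, 3), "mm/s")
```

Fitting the whole phase history, rather than differencing adjacent frames, suppresses per-frame centroid noise in the velocity estimate.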
    Algorithm     Ball 1 (x / y)    Ball 2 (x / y)    Ball 3 (x / y)    Mean
    TwIST         37.5 / 36.3       39.4 / 35.7       43.2 / 34.9       37.8
    PnP-ADMM      27.6 / 26.2       25.6 / 25.3       28.6 / 30.5       27.3
    S2V-AE        15.0 / 12.3       11.0 / 12.6       15.3 / 16.0       13.7
    Table 1. Standard Deviations of Reconstructed Centroids of Each Ball Averaged over Time (Unit: μm)
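Table 1's statistic — the per-axis standard deviation of each ball's reconstructed centroid over time, plus the mean over all balls and axes — is straightforward to compute from tracked trajectories. A sketch on synthetic tracks (the jitter level and frame count are toy numbers, not the paper's data):

```python
import numpy as np

# toy reconstructed centroid tracks: (frames, balls, xy), in micrometers,
# simulated as a static truth plus Gaussian localization jitter
rng = np.random.default_rng(3)
true = np.zeros((100, 3, 2))
jitter = 15.0                                  # assumed per-axis noise, μm
tracks = true + rng.normal(0.0, jitter, true.shape)

stds = tracks.std(axis=0)                      # per-ball, per-axis std over time
print(stds.round(1))                           # one row per ball: [x, y]
print(round(stds.mean(), 1))                   # the table's "Mean" column
```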