• Advanced Imaging
  • Vol. 2, Issue 1, 011001 (2025)
Shukai Wu1,2, Zheng Huang1,2, Caihua Zhang1,2, Conghe Wang1,2, and Hongwei Chen1,2,*
Author Affiliations
  • 1Department of Electronic Engineering, Tsinghua University, Beijing, China
  • 2Beijing National Research Center for Information Science and Technology, Beijing, China
    DOI: 10.3788/AI.2025.10018
    Shukai Wu, Zheng Huang, Caihua Zhang, Conghe Wang, Hongwei Chen, "Privacy-preserving face recognition with a mask-encoded microlens array," Adv. Imaging 2, 011001 (2025)
    Fig. 1. Schematic diagram of the experimental setup. (a) Our MEM-FR prototype for face recognition. (b) Schematic diagram of the convolution calculation correction in the experiment. (c) Concept of the point spread function (PSF) formation. (d) Optical path comparison between our MEM-FR system (right) and LOEN system (left), where the MEM-FR system increases the spatial resolution from Δ to Δ/14.2 and enlarges the aperture size of the mask from 30 to 300 µm, enhancing light throughput.
    Fig. 2. Impact of the optical convolution kernel size. (a) Recognition accuracy of the MEM-FR system alongside peak signal-to-noise ratio (PSNR) and structural similarity index measure (SSIM) values calculated between the encrypted images and original images, reflecting the privacy protection effectiveness of the system. (b) Principle of the optical standard convolution (top) and dilated convolution (bottom). (c) Privacy protection effectiveness of dilated convolution under different dilation rates.
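
To make the distinction in panel (b) concrete, the following is a minimal digital sketch of standard versus dilated 2D convolution, written as cross-correlation, which is the common convention in deep-learning frameworks. It only illustrates the arithmetic and is not the optical implementation used in the MEM-FR system; the function name `dilated_conv2d` and the toy image and kernel are hypothetical.

```python
def dilated_conv2d(image, kernel, dilation=1):
    """Valid-mode 2D cross-correlation with a dilated kernel (pure Python).

    Illustrative only: `image` and `kernel` are lists of lists of floats.
    """
    kh, kw = len(kernel), len(kernel[0])
    # Footprint of the kernel once (dilation - 1) gaps separate adjacent taps.
    eff_h = (kh - 1) * dilation + 1
    eff_w = (kw - 1) * dilation + 1
    out_h = len(image) - eff_h + 1
    out_w = len(image[0]) - eff_w + 1
    if out_h <= 0 or out_w <= 0:
        raise ValueError("image is smaller than the dilated kernel footprint")
    out = [[0.0] * out_w for _ in range(out_h)]
    for y in range(out_h):
        for x in range(out_w):
            acc = 0.0
            for i in range(kh):
                for j in range(kw):
                    # Taps are spaced `dilation` pixels apart in the input.
                    acc += kernel[i][j] * image[y + i * dilation][x + j * dilation]
            out[y][x] = acc
    return out


# Toy 5x5 ramp image and a 2x2 all-ones kernel (assumed values).
image = [[float(r * 5 + c) for c in range(5)] for r in range(5)]
kernel = [[1.0, 1.0], [1.0, 1.0]]
standard = dilated_conv2d(image, kernel, dilation=1)  # 4x4 output
dilated = dilated_conv2d(image, kernel, dilation=2)   # 3x3 output
```

With the same number of kernel weights, a dilation rate of 2 widens the footprint from 2x2 to 3x3 pixels, so each output value mixes pixels that lie farther apart in the input.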
    Fig. 3. Face recognition framework of the MEM-FR system. The network consists of an optical dilated convolution layer and an electronic neural network, with optimized parameters deployed to the physical component for fine-tuning and inference after training.
    Fig. 4. Visualization of captured images in the natural scene and experimental setup. Each row represents different poses (including center, left, right, up, and down) from one identity, and each column represents the same pose from different identities.
    Fig. 5. Visualization of privacy protection effectiveness and quantitative analysis of image similarity. (a) Original, simulated encrypted, and captured images from the MEM-FR system. (b) Comparative image metrics (PSNR, SSIM, and LPIPS) for captured images between the same and different identities.
    Fig. 6. Reconstructed images under a blind deconvolution attack with U-Net. With 200 training pairs, the reconstructed images reach 19.26 dB PSNR and 0.55 SSIM; with 600 training pairs, they reach 21.84 dB PSNR and 0.65 SSIM. Some key facial features of the reconstructed and original images are compared.
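
For reference, PSNR values such as those quoted in the caption follow from the standard definition PSNR = 10 log10(MAX^2 / MSE). The sketch below assumes 8-bit grayscale images (MAX = 255) stored as lists of lists; the page does not state the bit depth or color handling used by the authors, so treat `max_value` and the helper name `psnr` as assumptions.

```python
import math


def psnr(reference, test, max_value=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized images."""
    flat_ref = [p for row in reference for p in row]
    flat_test = [p for row in test for p in row]
    if len(flat_ref) != len(flat_test):
        raise ValueError("images must contain the same number of pixels")
    # Mean squared error over all pixels.
    mse = sum((a - b) ** 2 for a, b in zip(flat_ref, flat_test)) / len(flat_ref)
    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


# Toy 2x2 images (assumed values): two pixels differ by 10 gray levels.
original = [[100, 100], [100, 100]]
reconstructed = [[110, 90], [100, 100]]
value_db = psnr(original, reconstructed)
```

A lower PSNR between reconstructed and original images means the attack recovers less of the original content.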
    Method | Simulation Accuracy | Physical Accuracy | FOV (°) | Light Throughput (relative) | Spatial Resolution (mm)
    LOEN   | 0.9092 | 0.8707 | 28.02 | 1.00 | 10.6
    MEM-FR | 0.9497 | 0.9233 | 34.51 | 4.37 | 0.9
    Table 1. Experimental Results and System Performance.
    Cosine Similarity | ID1_1 versus ID2_1 | ID1_2 versus ID2_2 | ID1_1 versus ID1_2 | ID2_1 versus ID2_2
    MEM-FR system     | 0.13 | 0.04 | 0.52 | 0.47
    Standard system   | 0.88 | 0.93 | 0.87 | 0.87
    Table 2. Cosine Similarity of Different and Same Identities Using the MEM-FR System and Standard System.
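
The cosine similarity reported in Table 2 is the dot product of two vectors divided by the product of their norms. The following is a minimal sketch, assuming each captured image has already been turned into a numeric vector; this page does not say whether the authors compared raw pixels or learned features, so the vectors below are hypothetical placeholders.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length numeric vectors."""
    if len(u) != len(v):
        raise ValueError("vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot / (norm_u * norm_v)


# Hypothetical vectors: parallel vectors score 1, orthogonal vectors score 0.
same_direction = cosine_similarity([1.0, 2.0, 3.0], [2.0, 4.0, 6.0])
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])
```

Read this way, the low MEM-FR values for different identities (0.13 and 0.04) indicate nearly orthogonal vectors, whereas the standard system scores all pairs between 0.87 and 0.93 regardless of identity.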