Ten-mega-pixel snapshot compressive imaging with a hybrid coded aperture

Zhihong Zhang; Chao Deng; Yang Liu; Xin Yuan; Jinli Suo; Qionghai Dai

doi:10.1364/PRJ.435256

Photonics Research
Vol. 9, Issue 11, 2277 (2021)

Zhihong Zhang^{1,2,†}, Chao Deng^{1,2,†}, Yang Liu^{3}, Xin Yuan^{4,6}, Jinli Suo^{1,2,*}, and Qionghai Dai^{1,2,5}

Author Affiliations

¹Department of Automation, Tsinghua University, Beijing 100084, China

²Institute for Brain and Cognitive Science, Tsinghua University, Beijing 100084, China

³Computer Science and Artificial Intelligence Laboratory, Massachusetts Institute of Technology, Cambridge, Massachusetts 02139, USA

⁴Westlake University, Hangzhou 310024, China

⁵Beijing National Research Center for Information Science and Technology, Tsinghua University, Beijing 100084, China

⁶e-mail: xyuan@westlake.edu.cn

Zhihong Zhang, Chao Deng, Yang Liu, Xin Yuan, Jinli Suo, Qionghai Dai. Ten-mega-pixel snapshot compressive imaging with a hybrid coded aperture. Photonics Research, 2021, 9(11): 2277

Fig. 1. Our 10-mega-pixel video SCI system (a) and its schematic (b). Ten high-speed (200 fps), high-resolution (3200×3200 pixels) video frames (c) are reconstructed from a single snapshot measurement (d); (e) shows the motion detail for the small region in the blue box of (d). Unlike existing solutions that use only an LCoS or only a mask (and thus have limited spatial resolution), our 10-mega-pixel spatio-temporal coding is generated jointly by an LCoS at the aperture plane and a static mask close to the image plane.

Fig. 2. Pipeline of the proposed large-scale HCA-SCI system (left) and the PnP reconstruction algorithm (right). Left: in the encoded photography stage, a dynamic low-resolution mask at the aperture plane and a static high-resolution mask close to the sensor plane jointly generate a sequence of high-resolution codes that encode the large-scale video into a snapshot. Right: in decoding, the video is reconstructed under a PnP framework that incorporates a deep denoising prior and a TV prior into a convex optimization (GAP), leveraging the good convergence of GAP and the high efficiency of the deep network.

Fig. 3. Illustration of the multiplexed mask generation. The images of the same scene point formed through different sub-apertures (marked in blue, yellow, and red) intersect the mask plane at different regions and are thus encoded with correspondingly shifted random masks before being summed at the sensor. Multiplexing raises the light flux for high-SNR recording at the cost of only a slight performance degradation.
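The multiplexing described in Fig. 3 can be illustrated with a minimal NumPy sketch. It assumes that each open sub-aperture contributes a copy of the static mask shifted by an integer number of pixels, that the copies simply add, and that shifts wrap around at the borders. The function names, the shift values, and the toy sizes are hypothetical and are not taken from the paper.

```python
import numpy as np

def hybrid_codes(static_mask, aperture_patterns, shifts):
    """Build one high-resolution code per frame (illustrative model only).

    static_mask:       (H, W) binary mask placed close to the image plane
    aperture_patterns: (T, S) 0/1 array; entry [t, s] is 1 if sub-aperture s is open in frame t
    shifts:            S integer (dy, dx) shifts, one per sub-aperture (hypothetical values)
    """
    # One shifted copy of the static mask per sub-aperture; np.roll wraps around (assumption).
    shifted = np.stack([np.roll(static_mask, s, axis=(0, 1)) for s in shifts])  # (S, H, W)
    # Each frame's code is the sum of the shifted masks of its open sub-apertures.
    return np.tensordot(aperture_patterns, shifted, axes=(1, 0))  # (T, H, W)

def snapshot(video, codes):
    """Encode T frames into one measurement: y = sum_t codes[t] * video[t]."""
    return np.sum(codes * video, axis=0)

# Toy example: 3 frames, 4 sub-apertures, 8x8 pixels.
rng = np.random.default_rng(0)
static_mask = (rng.random((8, 8)) > 0.5).astype(float)
shifts = [(0, 0), (0, 1), (1, 0), (1, 1)]
patterns = np.array([[1, 1, 0, 0],
                     [0, 1, 1, 0],
                     [0, 0, 1, 1]], dtype=float)
codes = hybrid_codes(static_mask, patterns, shifts)
video = rng.random((3, 8, 8))
y = snapshot(video, codes)
```

With two sub-apertures open per frame, each code entry takes values in {0, 1, 2}, which is one way to see why multiplexing increases the collected light compared with a single open sub-aperture.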

Fig. 4. Multiplexing pattern schemes used in our experiments (taking Cr=6 as an example). Top row: multiplexing patterns for the simulation experiments. Each pattern contains 50% open sub-apertures, and each sub-aperture is a 512×512 binned macro pixel on the LCoS. Bottom row: multiplexing patterns for the real experiments. Each pattern contains an open circle with a radius of about 400 pixels, and the circles in adjacent patterns are rotated by 360/Cr degrees relative to each other.
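The two pattern schemes in Fig. 4 could be generated along the following lines. This is a sketch under stated assumptions: the LCoS size, the distance of the circle center from the pattern center (`orbit_radius`), and the choice of exactly half of the sub-apertures at random are all hypothetical, since the caption does not specify them. The demo uses scaled-down sizes.

```python
import numpy as np

def rotating_circle_patterns(cr, size, circle_radius, orbit_radius):
    """One open circle per pattern; adjacent patterns are rotated by 360/Cr degrees.

    orbit_radius (distance of the circle center from the pattern center)
    is a hypothetical parameter, not given in the caption.
    """
    yy, xx = np.mgrid[0:size, 0:size]
    c = (size - 1) / 2.0
    patterns = np.zeros((cr, size, size), dtype=bool)
    for t in range(cr):
        theta = 2.0 * np.pi * t / cr  # rotation step of 360/Cr degrees
        cy = c + orbit_radius * np.sin(theta)
        cx = c + orbit_radius * np.cos(theta)
        patterns[t] = (yy - cy) ** 2 + (xx - cx) ** 2 <= circle_radius ** 2
    return patterns

def random_half_open_patterns(cr, grid, seed=0):
    """Open exactly half of the grid*grid sub-apertures per pattern, chosen at random (assumption)."""
    rng = np.random.default_rng(seed)
    n = grid * grid
    patterns = np.zeros((cr, n), dtype=bool)
    for t in range(cr):
        patterns[t, rng.choice(n, size=n // 2, replace=False)] = True
    return patterns.reshape(cr, grid, grid)

# Scaled-down demo for Cr = 6.
circles = rotating_circle_patterns(cr=6, size=64, circle_radius=10, orbit_radius=15)
half_open = random_half_open_patterns(cr=6, grid=4)
```

On the actual device, each cell of `half_open` would correspond to one 512×512 binned macro pixel, and the circle radius would be about 400 LCoS pixels.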

Fig. 5. Reconstruction results and comparison with state-of-the-art algorithms on simulated data at different resolutions (left: 256×256, middle: 512×512, right: 1024×1024) and with different compression ratios (top: Cr=10, bottom: Cr=20). BIRNAT results are not available for 512×512 and 1024×1024 because model training runs out of memory at these resolutions. See Visualization 1, Visualization 2, Visualization 3, Visualization 4, Visualization 5, and Visualization 6 for the reconstructed videos.

Fig. 6. Noise robustness comparison between multiplexed and non-multiplexed masks.

Fig. 7. Reconstruction results of PnP–TV–FastDVDNet on real data captured by our HCA-SCI system (Cr=6, 10, 20, and 30). The full frames are 3200×3200 pixels; we show small regions of about 400×400 pixels to demonstrate the high-speed motion.

Fig. 8. Reconstruction comparison between GAP–TV, PnP–FFDNet, and PnP–TV–FastDVDNet on real data captured by our HCA-SCI system (Cr=6, 10, 20, and 30). The full frames are 3200×3200 pixels; we show small regions of 512×512 pixels to demonstrate the high-speed motion. See Visualization 7 for the reconstructed videos.

Scales	Algorithms	Football	Hummingbird	ReadySteadyGo	Jockey	YachtRide	Average
$256 \times 256$	GAP–TV	27.82, 0.8280	29.24, 0.7918	23.73, 0.7499	31.63, 0.8712	26.65, 0.8056	27.81, 0.8093
	PnP–FFDNet	27.06, 0.8264	25.52, 0.6912	21.68, 0.6859	31.14, 0.8493	23.69, 0.7035	25.82, 0.7513
	PnP–TV–FastDVDNet	31.31, 0.9123	31.19, 0.8264	26.18, 0.8276	31.36, 0.8817	28.90, 0.8841	29.79, 0.8664
	BIRNAT	34.67, 0.9719	34.33, 0.9546	29.50, 0.9389	36.24, 0.9711	31.02, 0.9431	33.15, 0.9559
$512 \times 512$	GAP–TV	29.19, 0.8854	28.32, 0.7887	25.94, 0.7918	31.30, 0.8718	26.59, 0.7939	28.27, 0.8263
	PnP–FFDNet	28.57, 0.8952	28.02, 0.8363	24.32, 0.7457	29.81, 0.8248	23.45, 0.6793	26.83, 0.7963
	PnP–TV–FastDVDNet	30.92, 0.9333	32.24, 0.8834	27.04, 0.8246	32.11, 0.8839	27.87, 0.8487	30.04, 0.8748
$1024 \times 1024$	GAP–TV	30.63, 0.9022	29.16, 0.8459	28.92, 0.8698	31.59, 0.8953	29.03, 0.8470	29.87, 0.8720
	PnP–FFDNet	29.87, 0.9023	27.70, 0.7869	27.70, 0.8483	29.88, 0.8412	25.55, 0.7211	28.14, 0.8200
	PnP–TV–FastDVDNet	30.35, 0.9265	31.71, 0.8909	29.42, 0.8913	31.59, 0.9014	30.44, 0.8713	30.70, 0.8963

Table 1. Average Results of PSNR in dB (left entry in each cell) and SSIM (right entry in each cell) by Different Algorithms (Cr=10)^a
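For readers unfamiliar with the metric, the PSNR values in the table follow the standard definition PSNR = 10·log10(peak² / MSE). The sketch below shows that definition and a per-frame average. It assumes images scaled to [0, 1] and a plain mean over frames; the exact normalization and averaging used for the table are not stated here, so treat both as assumptions.

```python
import numpy as np

def psnr(reference, estimate, peak=1.0):
    """Peak signal-to-noise ratio in dB between two images of equal shape."""
    diff = np.asarray(reference, dtype=np.float64) - np.asarray(estimate, dtype=np.float64)
    mse = np.mean(diff ** 2)
    if mse == 0:
        return float("inf")  # identical images
    return 10.0 * np.log10(peak ** 2 / mse)

def average_psnr(ref_frames, est_frames, peak=1.0):
    """Mean PSNR over corresponding frames (assumed averaging scheme)."""
    return float(np.mean([psnr(r, e, peak) for r, e in zip(ref_frames, est_frames)]))

# A constant error of 0.1 on a [0, 1] image gives MSE = 0.01, i.e. 20 dB.
ref = np.zeros((4, 4))
est = np.full((4, 4), 0.1)
value = psnr(ref, est)
```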

Scales	Algorithms	Football	Hummingbird	ReadySteadyGo	Jockey	YachtRide	Average
$256 \times 256$	GAP–TV	25.01, 0.7544	26.33, 0.6893	20.48, 0.6326	28.13, 0.8318	23.56, 0.7129	24.70, 0.7242
	PnP–FFDNet	21.67, 0.6657	22.13, 0.5835	17.27, 0.5340	27.78, 0.7994	20.39, 0.6024	21.85, 0.6370
	PnP–TV–FastDVDNet	27.83, 0.8459	28.65, 0.7520	23.28, 0.7381	29.51, 0.8597	26.34, 0.8235	27.12, 0.8038
	BIRNAT	27.91, 0.9021	28.58, 0.8800	23.79, 0.8279	31.35, 0.9467	26.14, 0.8585	27.55, 0.8830
$512 \times 512$	GAP–TV	23.97, 0.8179	24.50, 0.6719	22.12, 0.6975	26.99, 0.8297	23.13, 0.6930	24.14, 0.7420
	PnP–FFDNet	22.00, 0.7661	23.62, 0.7245	19.35, 0.6133	25.32, 0.7924	19.48, 0.5418	21.95, 0.6876
	PnP–TV–FastDVDNet	25.63, 0.8852	28.36, 0.7778	23.80, 0.7499	28.79, 0.8553	25.36, 0.7784	26.39, 0.8093
$1024 \times 1024$	GAP–TV	24.82, 0.8353	25.53, 0.7296	24.98, 0.8128	26.63, 0.8388	25.80, 0.7759	25.55, 0.7985
	PnP–FFDNet	23.55, 0.8098	23.02, 0.6039	22.48, 0.7702	24.48, 0.7968	21.67, 0.6414	23.04, 0.7244
	PnP–TV–FastDVDNet	26.26, 0.8729	28.68, 0.8076	26.31, 0.8399	29.18, 0.8773	28.07, 0.8194	27.70, 0.8434

Table 2. Average Results of PSNR in dB (left entry in each cell) and SSIM (right entry in each cell) by Different Algorithms (Cr=20)^a

Require: H, y
1: Initialize: v^{(0)}, λ_{0}, ξ < 1, k = 1, K_{1}, K_{Max}
2: while not converged and k \leq K_{Max}
3:     Update x by Eq. (7).
4:     Update v:
5:     if k \leq K_{1} then
6:         v^{(k)} = D_{TV}(x^{(k)})
7:     else
8:         v^{'} = D_{TV}(x^{(k)})
9:         v^{(k)} = D_{FastDVDNet}(v^{'})

Table 3. PnP–TV–FastDVDNet for HCA-SCI
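The structure of the algorithm above can be sketched in NumPy as follows. Several parts are assumptions rather than the authors' implementation: Eq. (7) is not reproduced in this listing, so the sketch uses the standard GAP projection x = v + Hᵀ(HHᵀ)⁻¹(y − Hv), where HHᵀ is diagonal for SCI; the initialization of v^{(0)} is a guess; λ_{0} and ξ are omitted because their role is not described here; and the two denoisers are simple smoothing stand-ins, not a real TV denoiser or FastDVDNet.

```python
import numpy as np

def forward(x, codes):
    """Apply H: sum the coded frames into one snapshot."""
    return np.sum(codes * x, axis=0)

def gap_project(v, y, codes, eps=1e-8):
    """Standard GAP projection (assumed form of the x update)."""
    hht = np.sum(codes ** 2, axis=0) + eps  # diagonal of H H^T
    return v + codes * ((y - forward(v, codes)) / hht)

def spatial_smooth(x):
    """Stand-in for D_TV: 5-point spatial average (NOT a TV denoiser)."""
    return (x + np.roll(x, 1, 1) + np.roll(x, -1, 1)
            + np.roll(x, 1, 2) + np.roll(x, -1, 2)) / 5.0

def temporal_smooth(x):
    """Stand-in for D_FastDVDNet: light temporal average (NOT the network)."""
    return (np.roll(x, 1, 0) + 2.0 * x + np.roll(x, -1, 0)) / 4.0

def pnp_gap(y, codes, k1, k_max, d_tv=spatial_smooth, d_deep=temporal_smooth, tol=1e-6):
    """TV-style denoising for the first k1 iterations, then TV followed by the deep denoiser."""
    v = codes * (y / (np.sum(codes ** 2, axis=0) + 1e-8))  # assumed v^(0)
    for k in range(1, k_max + 1):
        x = gap_project(v, y, codes)      # step 3: update x
        v_new = d_tv(x)                   # steps 6 and 8
        if k > k1:
            v_new = d_deep(v_new)         # step 9
        converged = np.linalg.norm(v_new - v) <= tol * (np.linalg.norm(v) + 1e-12)
        v = v_new
        if converged:
            break
    return v

# Toy example: 4 frames of 16x16 pixels with random binary codes.
rng = np.random.default_rng(0)
codes = (rng.random((4, 16, 16)) > 0.5).astype(float)
truth = rng.random((4, 16, 16))
y = forward(truth, codes)
rec = pnp_gap(y, codes, k1=5, k_max=20)
```

Passing the denoisers as arguments keeps the loop independent of the prior, which is the point of a plug-and-play scheme: a real TV denoiser and a pretrained video denoiser could be substituted for the stand-ins without changing the iteration.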
