Snapshot spectral compressive imaging reconstruction using convolution and contextual Transformer

Lishun Wang; Zongliang Wu; Yong Zhong; Xin Yuan

doi:10.1364/PRJ.458231

Photonics Research
Vol. 10, Issue 8, 1848 (2022)

Lishun Wang¹,², Zongliang Wu³, Yong Zhong¹,²,⁴,*, and Xin Yuan³,⁵,*

Author Affiliations

¹Chengdu Institute of Computer Application, Chinese Academy of Sciences, Chengdu 610041, China

²University of Chinese Academy of Sciences, Beijing 100049, China

³Research Center for Industries of the Future and School of Engineering, Westlake University, Hangzhou 310030, China

⁴e-mail: zhongyong@casit.com.cn

⁵e-mail: xyuan@westlake.edu.cn

Lishun Wang, Zongliang Wu, Yong Zhong, Xin Yuan, "Snapshot spectral compressive imaging reconstruction using convolution and contextual Transformer," Photonics Res. 10, 1848 (2022)

Fig. 1. Reconstructed real data of Legoman, captured by snapshot SCI systems in Ref. [20]. We show reconstruction results of 12 spectral channels, and compare our proposed method with the latest self-supervised method (PnP-DIP-HSI [23]) and the method based on maximum a posteriori (MAP) estimation (DGSMP algorithm [24]). As can be seen from the purple and green areas in the plot, our method reconstructs a clearer image, the PnP-DIP-HSI method produces some artifacts, and the DGSMP method loses some details.

Fig. 2. Schematic diagrams of the coded aperture snapshot spectral imaging (CASSI) system.

Fig. 3. Architecture of the proposed GAP-CCoT. (a) GAP-net with N stages; G(·) represents the operation of Eq. (6), D(·) represents a denoiser, and $v^{(0)} = H^{\top} g$. (b) CCoT-net, the proposed denoising network plugged into the GAP algorithm. (c) Convolution branch and Transformer branch; their outputs are joined by concatenation. (d) Convolution block with channel attention; c represents the number of output channels of the convolution. (e) Contextual Transformer block. (f) PixelShuffle algorithm for fast upsampling.
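The loop described in panel (a) can be sketched in a few lines. This is only a minimal illustration under stated assumptions, not the authors' implementation: it assumes G(·) is the standard GAP Euclidean projection onto {x : Hx = g}, it uses a simplified per-band mask-and-sum forward model (the spectral dispersion shift of a real CASSI system is omitted), and the names `forward`, `adjoint`, `gap_projection`, `gap_reconstruct`, and `eps` are hypothetical. The `denoiser` argument stands in for D(·); any callable can be plugged in.

```python
import numpy as np

def forward(x, masks):
    """Simplified SCI forward model Hx: modulate each band and sum over bands.
    x and masks both have shape (B, rows, cols); the result is (rows, cols)."""
    return np.sum(masks * x, axis=0)

def adjoint(g, masks):
    """H^T g: spread the 2D measurement back over all B bands."""
    return masks * g[None, :, :]

def gap_projection(v, g, masks, eps=1e-8):
    """Assumed form of G(.): x = v + H^T (H H^T)^{-1} (g - H v).
    For this forward model H H^T is diagonal (per-pixel sum of squared masks)."""
    hht = np.sum(masks ** 2, axis=0)
    residual = (g - forward(v, masks)) / np.maximum(hht, eps)
    return v + adjoint(residual, masks)

def gap_reconstruct(g, masks, denoiser, num_stages=9):
    """Alternate projection G(.) and denoising D(.) for a fixed number of stages."""
    v = adjoint(g, masks)  # v^(0) = H^T g
    for _ in range(num_stages):
        x = gap_projection(v, g, masks)
        v = denoiser(x)  # placeholder for the learned denoiser
    return v
```

For example, `gap_reconstruct(g, masks, lambda x: np.clip(x, 0.0, 1.0))` runs the loop with a trivial clipping function in place of a trained network; the projection step alone already guarantees that the intermediate estimate reproduces the measurement.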

Fig. 4. Reconstruction results of GAP-CCoT and other spectral reconstruction algorithms (λ-net, HSSP, TSA-net, GAP-net, DGSMP, PnP-DIP-HSI) in scene 3 and scene 9. Zoom in for better view.

Fig. 5. Architecture of the proposed Stacked CCoT. The input of the network is $H^{\top} g$, and CCoT-net is the same as in Fig. 3(b).

Fig. 6. Effect of stage number on SCI reconstruction quality.

Fig. 7. Reconstruction results of GAP-CCoT and other spectral reconstruction algorithms (λ-net, TSA-net, GAP-net, DGSMP, PnP-DIP-HSI) in two real scenes (scene 1 and scene 2).

Fig. 8. Reconstructed frames from our method and other algorithms (GAP-TV, DeSCI, PnP-FFDNet, U-net, BIRNAT, RevSCI) on six benchmark datasets.

Algorithms	Metric	Scene 1	Scene 2	Scene 3	Scene 4	Scene 5	Scene 6	Scene 7	Scene 8	Scene 9	Scene 10	Average
TwIST [6]	PSNR (dB)	24.81	19.99	21.14	30.30	21.68	22.16	17.71	22.39	21.43	22.87	$22.44 \pm 3.32$
TwIST [6]	SSIM	0.730	0.632	0.764	0.874	0.688	0.660	0.694	0.682	0.729	0.595	$0.704 \pm 0.077$
GAP-TV [7]	PSNR (dB)	25.13	20.67	23.19	35.13	22.31	22.90	17.98	23.00	23.36	23.70	$23.73 \pm 4.45$
GAP-TV [7]	SSIM	0.724	0.630	0.757	0.870	0.674	0.635	0.670	0.624	0.717	0.551	$0.685 \pm 0.088$
DeSCI [8]	PSNR (dB)	27.15	22.26	26.56	39.00	24.80	23.55	20.03	20.29	23.98	25.94	$25.35 \pm 5.38$
DeSCI [8]	SSIM	0.794	0.694	0.877	0.965	0.778	0.753	0.772	0.740	0.818	0.666	$0.785 \pm 0.087$
HSSP [19]	PSNR (dB)	31.48	31.09	28.96	34.56	28.53	30.83	28.71	30.09	30.43	28.78	$30.35 \pm 3.79$
HSSP [19]	SSIM	0.858	0.842	0.832	0.902	0.808	0.877	0.824	0.881	0.868	0.842	$0.852 \pm 0.049$
λ-net [9]	PSNR (dB)	30.82	26.30	29.42	36.27	27.84	30.69	24.20	28.86	29.32	27.66	$29.14 \pm 3.20$
λ-net [9]	SSIM	0.880	0.846	0.916	0.962	0.866	0.886	0.875	0.880	0.902	0.843	$0.886 \pm 0.035$
TSA-net [71]	PSNR (dB)	31.26	26.88	30.03	39.90	28.89	31.30	25.16	29.69	30.03	28.32	$30.15 \pm 3.92$
TSA-net [71]	SSIM	0.887	0.855	0.921	0.964	0.878	0.895	0.887	0.887	0.903	0.848	$0.893 \pm 0.033$
PnP-DIP-HSI [23]	PSNR (dB)	32.70	27.27	31.32	40.79	29.81	30.41	28.18	29.45	34.55	28.52	$31.30 \pm 3.98$
PnP-DIP-HSI [23]	SSIM	0.898	0.832	0.920	0.970	0.903	0.890	0.913	0.885	0.932	0.863	$0.901 \pm 0.038$
GAP-net [20]	PSNR (dB)	33.03	29.52	33.04	41.59	30.95	32.88	27.60	30.17	32.74	29.73	$32.13 \pm 3.81$
GAP-net [20]	SSIM	0.921	0.903	0.940	0.972	0.924	0.927	0.921	0.904	0.927	0.901	$0.924 \pm 0.021$
DGSMP [24]	PSNR (dB)	33.26	32.09	33.06	40.54	28.86	33.08	30.74	31.55	31.66	31.44	$32.63 \pm 3.07$
DGSMP [24]	SSIM	0.915	0.898	0.925	0.964	0.882	0.937	0.886	0.923	0.911	0.925	$0.917 \pm 0.024$
SSI-ResU-Net (v1) [10]	PSNR (dB)	34.06	30.85	33.14	40.79	31.57	34.99	27.93	33.24	33.58	31.55	$33.17 \pm 3.34$
SSI-ResU-Net (v1) [10]	SSIM	0.926	0.902	0.924	0.970	0.939	0.955	0.861	0.949	0.931	0.934	$0.929 \pm 0.030$
Ours	PSNR (dB)	35.17	35.90	36.91	42.25	32.61	34.95	33.46	33.13	35.75	32.43	$35.26 \pm 2.89$
Ours	SSIM	0.938	0.948	0.958	0.977	0.948	0.957	0.923	0.952	0.954	0.941	$0.950 \pm 0.014$

Table 1. PSNR in dB and SSIM (see the Metric column) of Different Algorithms on 10 Synthetic Datasets
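As a reading aid for Table 1, the sketch below shows the usual PSNR definition and how an Average entry of the form mean ± deviation can be recomputed from the per-scene values. It is an illustration rather than the authors' evaluation code: the helper names `psnr` and `summarize` are hypothetical, pixel values are assumed to lie in [0, 1] (`peak=1.0`), and the ± term is assumed to be the sample standard deviation (n − 1 denominator), which reproduces the reported 35.26 ± 2.89 for the "Ours" row.

```python
import math
import statistics

def psnr(reference, estimate, peak=1.0):
    """PSNR in dB between two equally sized flat sequences of pixel values."""
    if len(reference) != len(estimate):
        raise ValueError("inputs must have the same length")
    mse = sum((r - e) ** 2 for r, e in zip(reference, estimate)) / len(reference)
    return math.inf if mse == 0 else 10.0 * math.log10(peak ** 2 / mse)

def summarize(values):
    """Return (mean, sample standard deviation), both rounded to two decimals."""
    return round(statistics.mean(values), 2), round(statistics.stdev(values), 2)

# Per-scene PSNR values (dB) of the "Ours" row in Table 1.
ours_psnr = [35.17, 35.90, 36.91, 42.25, 32.61, 34.95, 33.46, 33.13, 35.75, 32.43]
mean_psnr, std_psnr = summarize(ours_psnr)
print(f"{mean_psnr:.2f} +/- {std_psnr:.2f}")
```

Using the population standard deviation instead would give a smaller spread (about 2.74 dB for the same row), so the choice of denominator matters when comparing against the table.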

Algorithm	Params ( $10^{6}$ )	FLOPs ( $10^{9}$ )	PSNR (dB)	SSIM
λ-net [9]	66.16	514.33	29.25	0.886
TSA-net [71]	44.25	135.03	30.15	0.893
GAP-net [20]	2.89	54.16	32.13	0.924
DGSMP [24]	3.76	647.28	32.63	0.917
SSI-ResU-Net (v1) [10]	1.25	81.98	33.17	0.929
GAP-CCoT-S3	2.68	31.84	33.89	0.934
GAP-CCoT-S9	8.04	95.52	35.26	0.950

Table 2. Computational Complexity and Average Reconstruction Quality of Several SOTA Algorithms on 10 Synthetic Datasets

Mask	PSNR (dB)	SSIM
Mask used in training	$35.26 \pm 2.89$	$0.950 \pm 0.014$
New mask 1	$35.10 \pm 2.92$	$0.949 \pm 0.015$
New mask 2	$35.06 \pm 2.91$	$0.948 \pm 0.015$
New mask 3	$35.06 \pm 2.91$	$0.949 \pm 0.015$
New mask 4	$35.02 \pm 2.92$	$0.948 \pm 0.014$
New mask 5	$34.99 \pm 2.90$	$0.948 \pm 0.014$

Table 3. Average PSNR and SSIM Results on 10 Synthetic Datasets with Different Masks

Algorithms	PSNR (dB)	SSIM
Stacked CCoT w/o CoT	$32.86 \pm 3.01$	$0.924 \pm 0.021$
GAP-CCoT w/o CoT	$34.13 \pm 2.95$	$0.933 \pm 0.019$
Stacked CCoT	$34.27 \pm 2.94$	$0.936 \pm 0.018$
GAP-CCoT	$35.26 \pm 2.89$	$0.950 \pm 0.014$

Table 4. Ablation Study: Average PSNR and SSIM Values of Different Algorithms on 10 Synthetic Datasets

Stage Number	Params ( $10^{6}$ )	FLOPs ( $10^{9}$ )	PSNR (dB)	SSIM
3	2.68	31.84	33.89	0.934
5	4.47	53.06	34.30	0.936
7	6.25	74.29	34.86	0.940
9	8.04	95.52	35.26	0.950
12	10.72	127.35	35.43	0.951
15	13.41	159.19	35.54	0.952

Table 5. Computational Complexity and Average Reconstruction Quality of GAP-CCoT on 10 Synthetic Datasets with Different Stages
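The trade-off in Table 5 is easier to see after a little arithmetic on its rows. The short sketch below (variable names are illustrative only) divides parameters and FLOPs by the stage number and computes the PSNR gain per added stage between consecutive rows. On the tabulated values, each stage costs roughly 0.89 × 10⁶ parameters and 10.6 × 10⁹ FLOPs, while the gain per extra stage falls from about 0.2 dB up to nine stages to below 0.06 dB beyond that.

```python
# Rows of Table 5: (stages, params in 1e6, FLOPs in 1e9, PSNR in dB).
table5 = [
    (3, 2.68, 31.84, 33.89),
    (5, 4.47, 53.06, 34.30),
    (7, 6.25, 74.29, 34.86),
    (9, 8.04, 95.52, 35.26),
    (12, 10.72, 127.35, 35.43),
    (15, 13.41, 159.19, 35.54),
]

# Cost per stage: parameters and FLOPs divided by the number of stages.
per_stage = [(n, params / n, flops / n) for n, params, flops, _ in table5]

# PSNR gain per added stage between consecutive rows of the table.
gain_per_stage = [
    (prev[0], cur[0], (cur[3] - prev[3]) / (cur[0] - prev[0]))
    for prev, cur in zip(table5, table5[1:])
]

for n, p, f in per_stage:
    print(f"{n:2d} stages: {p:.3f}e6 params/stage, {f:.3f}e9 FLOPs/stage")
for lo, hi, gain in gain_per_stage:
    print(f"{lo:2d} -> {hi:2d} stages: {gain:.3f} dB per added stage")
```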

Loss Function	PSNR (dB)	SSIM
LAD	35.48	0.952
MSE	35.26	0.950

Table 6. Average PSNR and SSIM Results on 10 Synthetic Datasets with Different Loss Functions

Algorithm	PSNR (dB)	SSIM	Running Time (s)
GAP-TV [7]	$26.73 \pm 4.33$	$0.858 \pm 0.082$	4.201 (CPU)
PnP-FFDNet [74]	$29.70 \pm 6.75$	$0.892 \pm 0.071$	3.010 (GPU)
DeSCI [8]	$32.65 \pm 7.07$	$0.935 \pm 0.047$	6180 (CPU)
BIRNAT [75]	$33.31 \pm 5.90$	$0.951 \pm 0.027$	0.165 (GPU)
U-net [76]	$29.45 \pm 4.75$	$0.882 \pm 0.057$	0.031 (GPU)
GAP-net-U-net-S12 [20]	$32.86 \pm 5.92$	$0.947 \pm 0.030$	0.03 (GPU)
MetaSCI [77]	$31.72 \pm 5.72$	$0.926 \pm 0.040$	0.025 (GPU)
RevSCI [78]	$33.92 \pm 6.02$	$0.956 \pm 0.025$	0.190 (GPU)
Ours	$33.53 \pm 5.90$	$0.954 \pm 0.026$	0.064 (GPU)

Table 7. Extending Our Method for Video Compressive Sensing: Average PSNR, SSIM, and Running Time per Measurement of Different Algorithms on Six Benchmark Datasets

Algorithm	Params ( $10^{6}$ )	FLOPs ( $10^{9}$ )	PSNR (dB)	SSIM
BIRNAT [75]	4.13	390.56	33.31	0.951
U-net [76]	0.82	53.63	29.45	0.882
GAP-net-U-net-S12 [20]	5.62	87.58	32.86	0.947
MetaSCI [77]	2.89	54.16	31.72	0.926
RevSCI [78]	5.66	766.95	33.92	0.956
Ours	10.51	113.75	33.53	0.954

Table 8. Computational Complexity and Average Reconstruction Quality of Several SOTA Algorithms on Six Grayscale Benchmark Datasets
