NIR-VIS face image translation method with dual contrastive learning framework

Rui Sun; Xiaoquan Shan; Qijing Sun; Chunjun Han; Xudong Zhang

doi:10.12086/oee.2022.210317

Opto-Electronic Engineering
Vol. 49, Issue 4, 210317 (2022)

Rui Sun¹,², Xiaoquan Shan¹,²,*, Qijing Sun¹,², Chunjun Han³, and Xudong Zhang¹

Author Affiliations

¹School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China

²Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei, Anhui 230009, China

³Science and Technology Information Section of Bengbu Public Security Bureau, Bengbu, Anhui 233040, China

Rui Sun, Xiaoquan Shan, Qijing Sun, Chunjun Han, Xudong Zhang. NIR-VIS face image translation method with dual contrastive learning framework[J]. Opto-Electronic Engineering, 2022, 49(4): 210317

Fig. 1. Comparison of VIS images generated from the NIR domain by several algorithms (first row) with the real VIS images (last row)

Fig. 2. The structure diagram of the proposed method. To simplify the network structure, the identity loss is not shown in the figure; see Section 2.4.4 for details

Fig. 3. The structure diagram of generator in the proposed method

Fig. 4. Facial regions cropped and edges extracted from face images under NIR and VIS conditions, respectively

Fig. 5. Comparison of experimental results on two datasets. From left to right: input NIR face image, CycleGAN, CSGAN, CDGAN, UNIT, Pix2pixHD, the proposed method, and real VIS face image. Rows Ⅰ~Ⅲ are from the NIR-VIS Sx1 dataset, and rows Ⅳ~Ⅶ are from the NIR-VIS Sx2 dataset

Fig. 6. Results of the ablation experiments on two datasets. From left to right: input NIR face image, the baseline method, the proposed method without StyleGAN2, L_{GAN}, L_{IDT}, L_{PMC}, and L_{FEE}, respectively, the proposed method, and real VIS face image. Rows Ⅰ~Ⅱ are from the NIR-VIS Sx1 dataset, and rows Ⅲ~Ⅳ are from the NIR-VIS Sx2 dataset

Fig. 7. Comparison of edge images obtained by using each edge extraction method separately. From left to right: real face image, Roberts operator, Prewitt operator, Sobel operator, Laplacian operator, Canny operator

Fig. 8. The effect of different values of λ_{FEE} on the performance of our method on the NIR-VIS Sx1 dataset

Method	Mean SSIM	Mean PSNR/dB
CycleGAN	0.7433	29.0987
CSGAN	0.7964	29.9471
CDGAN	0.7636	29.4922
UNIT	0.7935	29.8568
Pix2pixHD	0.8023	31.6584
Ours	0.8096	31.0976

Table 1. Performance comparison of image translation networks on the NIR-VIS Sx1 dataset
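
For readers unfamiliar with the PSNR column, the following is a minimal sketch of how a mean PSNR in dB can be computed over pairs of real and generated images. It is not the authors' evaluation code: the function names `psnr` and `mean_psnr` are hypothetical, and the sketch assumes 8-bit single-channel images stored as nested lists, whereas the page does not state the color space or value range used in the paper.

```python
import math


def psnr(reference, generated, max_value=255.0):
    """Peak signal-to-noise ratio in dB for two equally sized grayscale images.

    Images are nested lists of pixel values (an assumption for this sketch).
    """
    if not reference or len(reference) != len(generated):
        raise ValueError("images must be non-empty and have the same height")
    if any(len(r) != len(g) for r, g in zip(reference, generated)):
        raise ValueError("images must have the same width in every row")

    # Mean squared error over all pixels.
    squared_errors = [
        (r - g) ** 2
        for ref_row, gen_row in zip(reference, generated)
        for r, g in zip(ref_row, gen_row)
    ]
    mse = sum(squared_errors) / len(squared_errors)

    if mse == 0:
        return math.inf  # identical images
    return 10.0 * math.log10(max_value ** 2 / mse)


def mean_psnr(pairs, max_value=255.0):
    """Average PSNR over an iterable of (reference, generated) image pairs."""
    scores = [psnr(ref, gen, max_value) for ref, gen in pairs]
    return sum(scores) / len(scores)
```

A higher value means the generated image is closer to the real VIS image in a pixel-wise sense, which is why PSNR is usually reported together with a structural measure such as SSIM.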

Method	Mean SSIM	Mean PSNR/dB
CycleGAN	0.6317	28.7974
CSGAN	0.6891	28.8176
CDGAN	0.5283	28.1679
UNIT	0.6986	29.0634
Pix2pixHD	0.7894	30.5449
Ours	0.8135	31.2393

Table 2. Performance comparison of image translation networks on the NIR-VIS Sx2 dataset

Method	FID (NIR-VIS Sx1)	FID (NIR-VIS Sx2)	Time/s
CycleGAN	142.2574	171.3596	0.181
CSGAN	70.2146	102.6718	0.344
CDGAN	123.7183	212.4299	0.098
UNIT	74.8315	95.7638	0.358
Pix2pixHD	67.1044	106.3615	0.079
Ours	58.5286	46.9364	0.337

Table 3. Comparison of FID performance and average single test time of each image translation network on different datasets

Method	Mean SSIM	Mean PSNR/dB
Baseline	0.5279	28.3419
Ours w/o StyleGAN2	0.5293	28.4381
Ours w/o GAN	0.3617	11.5007
Ours w/o IDT	0.6864	29.2308
Ours w/o PMC	0.6359	28.6156
Ours w/o FEE	0.7982	30.2057
Ours	0.8096	31.0976

Table 4. Performance comparison of ablation methods on the NIR-VIS Sx1 dataset

Method	Mean SSIM	Mean PSNR/dB
Ours (Prewitt)	0.7924	30.2815
Ours (Sobel)	0.8096	31.0976

Table 5. Performance comparison of applying the Prewitt operator and Sobel operator respectively on the NIR-VIS Sx1 dataset
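
To illustrate the difference between the two operators compared in Table 5, here is a minimal sketch of gradient-magnitude edge extraction with the standard 3×3 Sobel and Prewitt kernels. It is an illustration under stated assumptions, not the paper's pipeline: the function name `edge_magnitude` is hypothetical, the image is assumed to be a grayscale nested list, and border pixels are simply left at zero.

```python
import math

# Standard 3x3 horizontal-gradient kernels; the vertical kernels are their transposes.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
PREWITT_X = [[-1, 0, 1], [-1, 0, 1], [-1, 0, 1]]


def transpose(kernel):
    """Return the transpose of a square kernel as a list of lists."""
    return [list(row) for row in zip(*kernel)]


def edge_magnitude(image, kernel_x):
    """Gradient magnitude of a grayscale image (nested lists).

    The kernel is applied as a correlation; the sign of the response differs
    from a true convolution, but the magnitude is the same. Border pixels are
    left at 0.0 (an assumption of this sketch).
    """
    kernel_y = transpose(kernel_x)
    height, width = len(image), len(image[0])
    edges = [[0.0] * width for _ in range(height)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            gx = gy = 0.0
            for dy in range(3):
                for dx in range(3):
                    pixel = image[y + dy - 1][x + dx - 1]
                    gx += kernel_x[dy][dx] * pixel
                    gy += kernel_y[dy][dx] * pixel
            edges[y][x] = math.hypot(gx, gy)
    return edges
```

The only difference between the two operators is the center weight of 2 in the Sobel kernel, which gives more influence to the pixels nearest the center and therefore a stronger response on the same edge.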
