• Opto-Electronic Engineering
  • Vol. 49, Issue 4, 210317 (2022)
Rui Sun1,2, Xiaoquan Shan1,2,*, Qijing Sun1,2, Chunjun Han3, and Xudong Zhang1
Author Affiliations
  • 1School of Computer and Information, Hefei University of Technology, Hefei, Anhui 230009, China
  • 2Anhui Province Key Laboratory of Industry Safety and Emergency Technology, Hefei, Anhui 230009, China
  • 3Science and Technology Information Section of Bengbu Public Security Bureau, Bengbu, Anhui 233040, China
    DOI: 10.12086/oee.2022.210317
    Rui Sun, Xiaoquan Shan, Qijing Sun, Chunjun Han, Xudong Zhang. NIR-VIS face image translation method with dual contrastive learning framework[J]. Opto-Electronic Engineering, 2022, 49(4): 210317

    Abstract

    Overview: Near-infrared (NIR) image sensors are widely used because they overcome the effects of natural light and work under various illumination conditions. In criminal investigation and security, however, NIR face images usually cannot be used directly for face retrieval and recognition, because the single-channel images acquired by NIR sensors lack the natural colors of the original scene. Translating NIR face images into visible (VIS) face images and restoring their color information can therefore improve both the subjective visual quality and the cross-modal recognition performance of face images, and provide technical support for building 24/7 video surveillance systems. NIR face images, however, differ from other NIR images: if facial contours or facial skin tones are distorted during colorization, the visual quality of the generated face images degrades markedly. Algorithms must therefore be designed to preserve detailed information during the colorization of NIR face images.

    We propose an NIR-VIS face image translation method under a dual contrastive learning framework. The method builds on the dual contrastive learning network and uses patch-level contrastive learning to improve the quality of the generated images locally. Meanwhile, since the StyleGAN2 network can extract deeper features of face images than a ResNet, we construct a generator based on the StyleGAN2 structure and embed it into the dual contrastive learning network, replacing the original ResNet generator to further improve the quality of the generated face images. In addition, to address the blurred outer contours and missing edges that characterize figures in NIR-domain images, a facial edge enhancement loss is designed to further enhance the facial details of the generated face images using the facial edge information extracted from the source-domain images.
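The facial edge enhancement loss is described above only at a high level. One plausible formulation, sketched minimally in NumPy, penalizes the L1 distance between gradient-magnitude edge maps of the source NIR image and the grayscale version of the generated VIS image. The choice of the Sobel operator, the L1 distance, and all function names here are illustrative assumptions, not the paper's exact formulation.

```python
import numpy as np

# 3x3 Sobel kernels for horizontal and vertical gradients (an assumed edge extractor)
SOBEL_X = np.array([[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]], dtype=np.float64)
SOBEL_Y = SOBEL_X.T

def _conv2d(img, kernel):
    """Valid-mode 2-D correlation of a grayscale image with a 3x3 kernel."""
    h, w = img.shape
    out = np.zeros((h - 2, w - 2))
    for i in range(h - 2):
        for j in range(w - 2):
            out[i, j] = np.sum(img[i:i + 3, j:j + 3] * kernel)
    return out

def edge_map(img):
    """Gradient-magnitude edge map of a grayscale image."""
    gx = _conv2d(img, SOBEL_X)
    gy = _conv2d(img, SOBEL_Y)
    return np.hypot(gx, gy)

def edge_enhancement_loss(source_nir, generated_vis_gray):
    """Mean L1 distance between the edge maps of source and generated images."""
    return np.mean(np.abs(edge_map(source_nir) - edge_map(generated_vis_gray)))
```

In a training loop this term would be added, with a weighting coefficient, to the adversarial and contrastive objectives, encouraging the generator to reproduce the facial contours present in the NIR input.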
Experiments show that the results generated by our method on two public datasets are significantly better than those of recent mainstream methods. The VIS face images generated by our method are closer to the real images and possess more facial edge details and skin tone information.

With the wide application of visible-infrared dual-mode cameras in video surveillance, cross-modal face recognition has become a research hotspot in computer vision. Translating NIR-domain face images into VIS-domain face images is a key problem in cross-modal face recognition, with important research value in criminal investigation and security. To address the problems that facial contours are easily distorted and skin color restoration is unrealistic during the colorization of NIR face images, this paper proposes an NIR-VIS face image translation method under a dual contrastive learning framework. The method constructs a generator network based on the StyleGAN2 structure and embeds it into the dual contrastive learning framework, exploiting the fine-grained characteristics of face images through bidirectional contrastive learning. Meanwhile, a facial edge enhancement loss is designed to further enhance the facial details in the generated face images and improve their visual quality, using the facial edge information extracted from the source-domain images. Finally, experiments on the NIR-VIS Sx1 and NIR-VIS Sx2 datasets show that, compared with recent mainstream methods, the VIS face images generated by this method are closer to the real images and possess more facial edge details and skin color information.
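The bidirectional contrastive learning mentioned above is typically built on a patch-wise InfoNCE objective: a feature of a patch in the generated image (the query) is pulled toward the feature at the same spatial location in the source image (the positive) and pushed away from features of other patches (the negatives). A minimal NumPy sketch for a single query follows; the temperature value and function names are illustrative assumptions, not the paper's exact design.

```python
import numpy as np

def patch_nce_loss(query, positive, negatives, tau=0.07):
    """InfoNCE loss for one query patch feature.

    query, positive: (d,) feature vectors from corresponding patch locations.
    negatives: (n, d) features from other patch locations.
    tau: temperature scaling the cosine-similarity logits (assumed value).
    """
    q = query / np.linalg.norm(query)
    pos = positive / np.linalg.norm(positive)
    negs = negatives / np.linalg.norm(negatives, axis=1, keepdims=True)
    # Cosine-similarity logits: positive pair first, then all negative pairs.
    logits = np.concatenate([[q @ pos], negs @ q]) / tau
    logits -= logits.max()  # numerical stability before the softmax
    # Cross-entropy with the positive pair as the target class (index 0).
    return -np.log(np.exp(logits[0]) / np.exp(logits).sum())
```

In the dual (bidirectional) setting this loss is applied in both translation directions, NIR→VIS and VIS→NIR, summed over many sampled patch locations per image.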