Infrared Image Colorization using Generative Adversarial Network (GAN)

Manan Tiwari
Nov 26, 2022

My internship work at the Space Applications Centre, ISRO, Ahmedabad (October-December, 2023)

Introduction

Transforming a thermal infrared image into a realistic RGB image is a challenging task. Over the last decade, automatic image colorization has attracted significant interest in several application areas, including the restoration of aged or degraded images. The problem is highly ill-posed because there are many degrees of freedom in assigning color information. Many recent developments in automatic colorization involve images that share a common theme, or require highly processed data such as semantic maps as input. I propose learning the transformation mapping using a coarse-to-fine generator that preserves details. Since the standard mean squared loss cannot penalize the distance between colorized and ground-truth images well, I propose a composite loss function that combines content, adversarial, perceptual, and total variation (TV) losses. The content loss is used to recover global image information, while the latter three losses synthesize realistic local textures.

I addressed the thermal infrared colorization problem using a conditional generative adversarial network (TIC-CGAN). I trained the model on the KAIST dataset, which contains a large number of thermal infrared and color image pairs, and afterward on our own small dataset as well. RGB images generated by earlier models have blurry details and lack textures. To address this issue, I used a conditional GAN-based model and defined a new objective function. The content loss aims to generate global colors, and the adversarial loss helps generate local details. Both the perceptual and TV losses make the local details finer.
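The composite objective described above can be sketched as a weighted sum of the four terms. This is a minimal illustration in plain Python, not the training code used in the project: the L1 form of the content loss, the anisotropic form of the TV loss, the function names, and the weight values are all assumptions made for the example, and the adversarial and perceptual terms are passed in as precomputed numbers.

```python
def l1_content_loss(pred, target):
    """Mean absolute difference between two images stored as nested lists [row][col].
    Using L1 for the content term is an assumption of this sketch."""
    total, count = 0.0, 0
    for row_p, row_t in zip(pred, target):
        for p, t in zip(row_p, row_t):
            total += abs(p - t)
            count += 1
    return total / count


def total_variation_loss(img):
    """Sum of absolute differences between vertically and horizontally adjacent pixels.
    Smooth images score low; noisy images score high."""
    height, width = len(img), len(img[0])
    tv = 0.0
    for r in range(height):
        for c in range(width):
            if r + 1 < height:
                tv += abs(img[r + 1][c] - img[r][c])
            if c + 1 < width:
                tv += abs(img[r][c + 1] - img[r][c])
    return tv


def composite_loss(content, adversarial, perceptual, tv, weights):
    """Weighted sum of the four loss terms; the weights are hypothetical hyperparameters."""
    return (weights["content"] * content
            + weights["adversarial"] * adversarial
            + weights["perceptual"] * perceptual
            + weights["tv"] * tv)


# Toy 2x2 single-channel "images".
pred = [[0.0, 1.0], [1.0, 0.0]]
target = [[0.0, 1.0], [1.0, 1.0]]
weights = {"content": 1.0, "adversarial": 1.0, "perceptual": 1.0, "tv": 1.0}

content = l1_content_loss(pred, target)
tv = total_variation_loss(pred)
# 0.5 and 0.1 stand in for adversarial and perceptual values computed elsewhere.
total = composite_loss(content, 0.5, 0.1, tv, weights)
```

In a real model these terms would be computed on tensors with a deep learning framework, and the perceptual term would compare feature maps from a pretrained network rather than being a fixed number.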

(a) Thermal infrared traffic images. (b) Colorized results by our method. (c) True RGB images.

Related Work

Colorization has already been studied in various domains. Earlier colorization approaches predict the chrominance from the luminance, and therefore cannot be applied directly to thermal infrared images, where the luminance also needs to be predicted. Thermal infrared is emitted energy that is sensed digitally, whereas near-infrared is reflected energy. An earlier thermal infrared colorization method is robust to image-pair misalignments. However, the details in its colorized results are blurred and the sky has artifacts.
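The luminance and chrominance split mentioned above can be illustrated with Python's standard `colorsys` module, which converts between RGB and YIQ (Y is luminance, I and Q carry chrominance). This is only a sketch of the idea; the helper name `colorize_grayscale` is hypothetical and YIQ is chosen here purely for convenience.

```python
import colorsys


def colorize_grayscale(luma, predicted_chroma):
    """Grayscale colorization keeps the known luminance Y and only
    attaches the predicted chrominance (I, Q) to it."""
    i, q = predicted_chroma
    return colorsys.yiq_to_rgb(luma, i, q)


# Split one color pixel into luminance and chrominance.
original = (0.8, 0.4, 0.2)
y, i, q = colorsys.rgb_to_yiq(*original)

# With the true luminance available, only (I, Q) has to be predicted.
rebuilt = colorize_grayscale(y, (i, q))

# A neutral gray pixel has luminance but (almost) no chrominance.
gray_y, gray_i, gray_q = colorsys.rgb_to_yiq(0.5, 0.5, 0.5)
```

For a thermal infrared input there is no visible-light luminance to reuse, so a model has to predict all three channels instead of only the two chrominance channels.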

Generative Adversarial Networks

Generative Adversarial Networks (GANs) are an approach to generative modeling that uses deep learning methods, such as convolutional neural networks.

The GAN model architecture involves two sub-models:

Generator. The model is used to generate new plausible examples from the problem domain.

Discriminator. The model is used to classify examples as real (from the domain) or fake (generated).
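The interplay between the two sub-models can be sketched through their losses. The example below uses binary cross-entropy on the discriminator's output probabilities, which is one common formulation; the function names are hypothetical, the generator term is the widely used non-saturating variant, and the probabilities are made-up numbers rather than outputs of a trained network.

```python
import math


def bce(prob, label):
    """Binary cross-entropy for one predicted probability and a 0/1 label."""
    eps = 1e-12
    prob = min(max(prob, eps), 1.0 - eps)  # avoid log(0)
    return -(label * math.log(prob) + (1.0 - label) * math.log(1.0 - prob))


def discriminator_loss(d_real, d_fake):
    """d_real / d_fake: discriminator probabilities for real and generated samples.
    The discriminator wants real -> 1 and fake -> 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Non-saturating generator loss: the generator is rewarded when the
    discriminator scores its samples as real (close to 1)."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)


# A confident, correct discriminator has a low loss ...
confident = discriminator_loss(d_real=[0.9], d_fake=[0.1])
# ... while an undecided one (0.5 everywhere) has a loss of 2 * ln(2).
undecided = discriminator_loss(d_real=[0.5], d_fake=[0.5])
# The generator's loss drops as its samples fool the discriminator more often.
fooled = generator_loss([0.9])
caught = generator_loss([0.1])
```

Training alternates between lowering the discriminator loss and lowering the generator loss, so each model improves against the other.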

Conditional GAN (CGAN) and pix2pix

Generative adversarial nets can be extended to a conditional model if both the generator and discriminator are conditioned on some extra information y. y could be any kind of auxiliary information, such as class labels or data from other modalities.

In the generator, the prior input noise p_z(z) and y are combined in a joint hidden representation, and the adversarial training framework allows for considerable flexibility in how this hidden representation is composed.

In the discriminator, x and y are presented as inputs to a discriminative function (embodied again by an MLP in this case).
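One simple way to condition both networks is to concatenate y with their inputs. The sketch below shows only that wiring on plain Python lists; the helper names are hypothetical, the networks themselves are omitted, and input-level concatenation is an assumption of this example, since the framework leaves open how the joint representation is composed.

```python
def one_hot(index, num_classes):
    """Encode a class label as a one-hot vector, a typical choice for y."""
    return [1.0 if i == index else 0.0 for i in range(num_classes)]


def generator_input(z, y):
    """Join the noise vector z and the condition y into a single input vector."""
    return list(z) + list(y)


def discriminator_input(x, y):
    """The discriminator sees the sample x together with the same condition y."""
    return list(x) + list(y)


y = one_hot(1, num_classes=3)   # condition: class 1 of 3
z = [0.1, 0.2]                  # toy noise vector
x = [0.7, 0.3, 0.9, 0.5]        # toy flattened sample

g_in = generator_input(z, y)
d_in = discriminator_input(x, y)
```

When the condition is an image rather than a label, the same idea is usually applied by stacking the condition and the sample along the channel dimension.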

Pix2pix performs paired image-to-image translation. The generator learns to generate a fake sample with a specific condition or characteristic rather than a generic sample from an unknown noise distribution. Its combined loss function adds an L1 reconstruction term to the conditional adversarial loss.
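For reference, the combined objective defined in the pix2pix paper by Isola et al. can be written out as below. The notation follows that paper and differs from the section above: here x is the input image (the condition), y is the target image, z is noise, and λ weights the L1 term.

```latex
\mathcal{L}_{cGAN}(G, D) = \mathbb{E}_{x,y}\left[\log D(x, y)\right]
                         + \mathbb{E}_{x,z}\left[\log\left(1 - D(x, G(x, z))\right)\right]

\mathcal{L}_{L1}(G) = \mathbb{E}_{x,y,z}\left[\lVert y - G(x, z) \rVert_1\right]

G^{*} = \arg\min_{G} \max_{D} \; \mathcal{L}_{cGAN}(G, D) + \lambda \, \mathcal{L}_{L1}(G)
```

The adversarial term pushes the output toward realistic local detail, while the L1 term keeps it close to the ground-truth image overall.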

TIC-CGAN architecture

Results

Colorized results using three different methods. (a) Thermal infrared images. (b) Naïve. (c) TIR2Lab. (d) TIC-CGAN. (e) True RGB images.

Here, TIC-CGAN is able to locate and generate realistic local textures and fine details, and achieves more plausible RGB images.

Although the proposed approach achieves impressive results in most cases, some defects remain. They might be caused by the incomplete training dataset, in which few training images were captured while the car was turning. This phenomenon also occurs when colorizing images taken at the campus entrance. In addition, since colorization does not have a unique solution, the same object may receive different colors in two adjacent frames of a thermal infrared video. Therefore, although the visual effect on a single frame is satisfactory, colorized images in a video look incoherent.

From top to bottom: thermal infrared images, results from TIC-CGAN, true RGB images.

Conclusion

This work used TIC-CGAN, a conditional generative adversarial network, to generate RGB images from thermal infrared images. The method uses a coarse-to-fine generator and a composite objective function that combines content, adversarial, perceptual, and TV losses to produce results with realistic colors and fine details. Future work is to study a wider range of GAN techniques and their hybrid models to further improve image quality, and to collect more paired thermal infrared and RGB images for training the model.

References

[1]. Gade R, Moeslund T B. Thermal cameras and applications: a survey[J]. Machine vision and applications, 2014, 25(1): 245–262

[2]. Deshpande A, Lu J, Yeh M C, et al. Learning Diverse Image Colorization[C]//CVPR. 2017: 2877–2885

[3]. A. Antoniou, A. Storkey, and H. Edwards. Data augmentation generative adversarial networks. arXiv preprint arXiv:1711.04340, 2017

[4]. I. Goodfellow, J. Pouget-Abadie, M. Mirza, B. Xu, D. Warde-Farley, S. Ozair, A. Courville, and Y. Bengio. Generative adversarial nets. In Advances in neural information processing systems, pages 2672–2680, 2014.

[5]. P. Isola, J.-Y. Zhu, T. Zhou, and A. A. Efros. Image-to-image translation with conditional adversarial networks. arXiv preprint arXiv:1611.07004, 2016.

My experience

My internship experience was filled with highs and lows. There were days when I was very happy with my results, and some days when I felt I had done nothing. But overall it was a lovely experience for me to get exposure to working with ISRO scientists. My mentor helped me throughout the internship by explaining each and every aspect whenever I faced difficulty.

During my internship, I gained a lot of knowledge of both basic and advanced Deep Learning. The first month was hectic for me, as I needed to work hard to understand how a GAN works. Gradually, I understood what happens in the GAN architecture by running a pre-trained Conditional GAN model on the universal dataset of RGB and IR images. I went through the TensorFlow page describing GANs, which helped me understand how a GAN works and where it is used. Eventually, I learned about Conditional GAN (pix2pix) and CycleGAN.

Guidance from Bennett University

Bennett University was very helpful in providing me with a Letter of Recommendation to work in such a prestigious organization. The courses I was taught helped me a lot. The professors have been amazing throughout and helped me whenever required.
