Integrated Multilevel Technique for Image Fusion of Coloured & Infrared Images for Human Recognition using AI

Mr. More Rahul Tanaji Computer systems dept, MCEME, Secunderabad, 500015.

Abstract- This paper proposes a method for fusing infrared (IR) and visible images. The author designed, reviewed and developed code for image fusion using the guided filter fusion method, which is based on a two-scale decomposition of an image into a base layer containing large-scale variations in intensity and a detail layer capturing small-scale details. A guided-filtering-based weighted average technique is proposed to make full use of spatial consistency when fusing the base and detail layers.

The advantages of guided filtering are that it is straightforward, computationally efficient and suitable for real-time applications. The source images are decomposed into base parts and detail content by guided filtering, and the resulting outputs are fused using the weighted averaging method used by the author in his thesis.

Keywords – Image fusion, Guided filtering, weighted average, YOLO3, VGG19, multi-contour transform, smoothing, weighted averaging strategy, detail content, Optimization.


As is well known, the coloured camera is the main sensor in most real-world applications; it provides high information density at low cost. Its main limitation is degraded performance in low-light scenarios. Thermal cameras are increasingly being used to complement coloured cameras in dark conditions such as night time.

To draw the best from both kinds of image, the author proposes a new image fusion technique which combines the useful features from thermal and coloured images into one image. The common issue in image fusion is how to extract salient features from the source images and how to combine them to generate the fused image. This feature extraction step must be performed accurately in order to obtain a fused image which contains the prevailing features of both input images.

Image fusion is an enhancement procedure that aims to combine images obtained by different kinds of sensors to generate a robust or informative image that can facilitate subsequent processing or help in decision making. The keys to an excellent fusion method are effective extraction of image information and appropriate fusion principles, which allow useful information to be extracted from the source images and incorporated in the fused image without introducing any artifacts in the process.

Many applications require images to express more information than is usually available from a single image. Images/sensors of the same type obtain information from only one aspect of the scene and are thus unable to provide all the required information. Therefore, fusion techniques play an increasingly important role in modern applications and computer vision. Experiments performed on openly available datasets show that the proposed approach has better fusion performance than the other methods compared. This claim is justified through both subjective and objective evaluation. The code of this fusion method has been made accessible.

Each sample in the author's dataset consists of one visible image captured using a visible/coloured camera and one infrared image captured using an infrared camera / HHTI. Both images are captured with a very small delay between them in order to ensure high similarity between them. The dataset is obtained by storing pairs of images; each pair consists of a coloured image and a thermal image of nearly the same scene.


Fig 1- Thesis- steps performed


To develop an integrated multilevel technique for image fusion of coloured and thermal images for improved human / object detection.


The scope includes the following:-

  1. Analysis of sensor fusion techniques for coloured & IR images.

  2. To develop an algorithm by using AI/ML for integrating coloured and thermal images.

  3. Selection and development of suitable software filters for image fusion.

  4. Practical implementation of the proposed algorithm for human detection.

  5. Validation of developed algorithm under varying conditions (distance, light, group, shape, size etc)


The author started with an analysis and comparison of existing image fusion techniques, with structured planning to review the levels used for feature extraction and outline detection, and selected suitable source images for the fusion technique. He then implemented the proposed algorithm in practice, aiming for the best possible accuracy on images taken at varying distances and at different hours of the night.

The method of image processing consists of the following five phases.

1. Phase-I Experimentation using publicly available datasets. [2]

2. Phase-II Generation of own dataset of nearly identical coloured and IR images.

3. Phase-III Fusion of coloured and thermal images from the collected dataset by developing code using computer vision.

4. Phase-IV Human detection from the fused images. [7]

5. Phase-V User interface for checking the output of fusion and human detection from the fused image.




Infrared and coloured image fusion techniques are extensively used in various fields. An adaptive fusion method for infrared and coloured images was developed with the multi-contour transform, as demonstrated in the work of Chang X., Jiao L., Liu F. and Xin F. [3]. This method is efficient in direction selectivity, and considerable information about details can be captured in the fusion process.

Existing infrared and coloured image fusion algorithms are mostly structured at the pixel level. Although fused images can provide complementary information about scenes, they can also contain distortions. These undesired changes limit or degrade the performance of applications built on the corresponding fusion method, as shown by Sadjidi F. [2]. A method with adequate fusion principles and powerful representations that combine pixel-level characteristics is still awaited; it would significantly improve current applications and contribute to the invention of advanced systems in different areas.

Fig 2- Input devices to capture dataset

Fig 3- Requirement of Hardware & Software (items listed in the figure: desktop and keyboard; graphic card; image capture connector; Open CV; Python 3.5 or higher; Python & Conda lib; Visual basic code & Tensor flow framework; image training & labeling link; Dark net code base)

For every source image (Si), the base parts (Bp) and detail content (Dc) are obtained by solving the following optimization problem [7]:

$$B_p = \arg\min_{B_p} \|S_i - B_p\|_F^2 + L\left(\|R_x * B_p\|_F^2 + \|N_y * B_p\|_F^2\right) \qquad (a)$$

where L = lambda is the regularization parameter, * denotes convolution, and $R_x = [-1, 1]$ and $N_y = [-1, 1]^T$ are the horizontal and vertical gradient operators, respectively. The parameter L is set to 5 in this paper. After getting the base parts Bp, the detail content is obtained by

$$D_c = S_i - B_p \qquad (b)$$
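Equations (a) and (b) can be sketched numerically as follows. This is a minimal sketch and not the author's code: it assumes a single-channel image and circular (wrap-around) boundary handling, under which the minimiser of (a) has a closed form in the Fourier domain. The function name `base_detail_split` and the argument `lam` (standing for L) are hypothetical.

```python
import numpy as np

def base_detail_split(src, lam=5.0):
    """Return (base, detail) for a 2-D image by minimising equation (a).

    Assumption: circular boundaries, so the gradient operators act as
    circular convolutions and the problem decouples per frequency.
    """
    src = np.asarray(src, dtype=np.float64)
    h, w = src.shape  # both must be at least 2

    # Gradient kernels Rx = [-1, 1] and Ny = [-1, 1]^T, zero-padded to image size.
    kx = np.zeros((h, w))
    kx[0, 0], kx[0, 1] = -1.0, 1.0
    ky = np.zeros((h, w))
    ky[0, 0], ky[1, 0] = -1.0, 1.0

    # Setting the gradient of (a) to zero gives, per frequency:
    #   B_hat = S_hat / (1 + lam * (|Rx_hat|^2 + |Ny_hat|^2))
    denom = 1.0 + lam * (np.abs(np.fft.fft2(kx)) ** 2 + np.abs(np.fft.fft2(ky)) ** 2)
    base = np.real(np.fft.ifft2(np.fft.fft2(src) / denom))

    detail = src - base  # equation (b)
    return base, detail
```

Because the denominator is never smaller than 1, every frequency of the source is kept or attenuated, so the base part is a smoothed version of the source and the detail content holds what was removed.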

Fig 4- Various techniques of Image Fusion

Fig 5- Formation of Fused Image

Fig 6- Fused Image

Experimentation using Images from Dataset

In image fusion, the author successfully fused similar images to form a new (fused) image, and human detection was then carried out on that fused image using YOLO3, which he had also used for object detection from thermal images.

Consider two images from the dataset collected for this project. The images are read using the imageio library; the images are resized and plotted.

Both images are then passed to a low-pass filter, which returns the low- and high-frequency components: the low-pass filtered image and its difference from the input image.

For each source image the base parts and detail content are obtained and separated by using average filtering. Average (or mean) filtering is a method of 'smoothing' images by reducing the amount of intensity variation between neighbouring pixels. The average filter works by moving through the image pixel by pixel, replacing each value with the average value of the adjoining pixels, including itself [6].
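The average-filter decomposition described above can be sketched with NumPy alone. This is an illustrative sketch under stated assumptions rather than the author's implementation: the function names `mean_filter` and `two_scale_split`, the 3x3 default window and the edge-replicating border handling are all hypothetical choices.

```python
import numpy as np

def mean_filter(img, k=3):
    """Replace each pixel by the mean of its k x k neighbourhood (k odd)."""
    img = np.asarray(img, dtype=np.float64)
    pad = k // 2
    # Assumption: borders are handled by repeating the edge pixels.
    padded = np.pad(img, pad, mode="edge")
    out = np.zeros_like(img)
    # Sum the k*k shifted copies of the image, then divide by the window size.
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out / (k * k)

def two_scale_split(img, k=3):
    """Low-pass output is the base part; the remainder is the detail content."""
    img = np.asarray(img, dtype=np.float64)
    base = mean_filter(img, k)
    detail = img - base
    return base, detail
```

A larger window `k` moves more of the image into the detail content, since the base part becomes smoother.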

Then the base parts are fused by a weighted averaging approach. In the weighted average filter, the author gave more weight to the centre value, so that the centre contributes more than the other values. With weighted average filtering, blurring of the image can be controlled.
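A centre-weighted average filter and a weighted average of two base parts can be sketched as below. The specific 3x3 kernel, the equal fusion weights of 0.5 and the names `weighted_mean_filter` and `fuse_base` are assumptions made for illustration; the paper does not state the values the author used.

```python
import numpy as np

# Assumed kernel: the centre weight (4/16) is larger than its neighbours.
CENTRE_WEIGHTED = np.array([[1, 2, 1],
                            [2, 4, 2],
                            [1, 2, 1]], dtype=np.float64) / 16.0

def weighted_mean_filter(img, kernel=CENTRE_WEIGHTED):
    """Smooth img with a weighted window; the weights should sum to 1."""
    img = np.asarray(img, dtype=np.float64)
    kh, kw = kernel.shape
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)), mode="edge")
    out = np.zeros_like(img)
    for dy in range(kh):
        for dx in range(kw):
            out += kernel[dy, dx] * padded[dy:dy + img.shape[0], dx:dx + img.shape[1]]
    return out

def fuse_base(base_a, base_b, w_a=0.5, w_b=0.5):
    """Pixel-wise weighted average of two base parts of the same shape."""
    base_a = np.asarray(base_a, dtype=np.float64)
    base_b = np.asarray(base_b, dtype=np.float64)
    return w_a * base_a + w_b * base_b
```

Compared with a plain mean filter, the larger centre weight keeps more of each pixel's own value, which is one way the amount of blurring can be limited.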

The base parts extracted from the source images contain the common features and redundant information. The fusion of the detail content is done using the VGG19 model, which is used to extract deep features. The weight maps are then obtained by a multi-layer fusion strategy. Finally, the fused detail content is reconstructed from these weight maps and the detail content. The multi-layer fusion strategy works by extracting various feature maps from the detail content and then applying normalisation and a block-based average operator to these feature maps to obtain an activity level map [5].

Afterwards the activity level maps are combined using a pixel averaging approach to obtain the final fused image. The fused image is then passed to the YOLO3 algorithm, which draws bounding boxes around the objects of interest present in the fused image.
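The activity-level and weight-map step can be sketched for a single layer as follows. This is a simplified sketch under several assumptions: the feature maps are plain arrays of shape (channels, height, width) standing in for VGG19 outputs, they already match the resolution of the detail content (real VGG19 features are smaller and would need upsampling), and the "normalisation" is taken to be an l1-norm across channels. The names `activity_map` and `fuse_detail` are hypothetical.

```python
import numpy as np

def activity_map(features, radius=1):
    """l1-norm across channels, then a (2*radius+1)^2 block average."""
    features = np.asarray(features, dtype=np.float64)
    act = np.abs(features).sum(axis=0)          # shape (H, W)
    padded = np.pad(act, radius, mode="edge")   # assumed border handling
    k = 2 * radius + 1
    out = np.zeros_like(act)
    for dy in range(k):
        for dx in range(k):
            out += padded[dy:dy + act.shape[0], dx:dx + act.shape[1]]
    return out / (k * k)

def fuse_detail(detail_a, detail_b, feat_a, feat_b, radius=1):
    """Weight each detail image by its share of the total activity per pixel."""
    act_a = activity_map(feat_a, radius)
    act_b = activity_map(feat_b, radius)
    total = act_a + act_b + 1e-12               # guard against division by zero
    w_a = act_a / total
    w_b = act_b / total                          # w_a + w_b is ~1 at every pixel
    fused = w_a * np.asarray(detail_a, dtype=np.float64) \
          + w_b * np.asarray(detail_b, dtype=np.float64)
    return fused, w_a, w_b
```

In a multi-layer setting this step would be repeated for the features of several network layers, giving one candidate fused detail image per layer to be combined afterwards.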

Visible-infrared databases for research, developed by various private groups, are available online. The infrared and visible sequences are synchronized and registered. The databases are freely available for research and development purposes, for example:

Benchmark Dataset Collection – a publicly available benchmark dataset for testing and evaluating new computer vision algorithms that make use of thermal images.


Fig 7- Fusion of base layer & detailed layer

Fig 8- Human Detection from fused Image

CONCLUSION

In this paper the author has suggested a new and improved method of integrating data received from multiple sensors and using them to reconstruct an enhanced image representation of the scene, beyond what would have been possible using any one sensor. He proposes a straightforward and effective fusion method based on a VGG network (deep learning framework) for the IR and coloured image fusion task. Initially, the source images are decomposed into base parts and detail content. The former contain low-frequency information whereas the latter contains texture information. The base parts are combined by the weighted averaging strategy. For the detail content, a multi-layer fusion strategy based on a pre-trained VGG-19 network is used. The deep features of the detail content are obtained by this fixed VGG-19 network. The normalization and block-averaging operators are used to get the initial weights. The final weights are calculated by the soft-max operator.

FUTURE SCOPE

The author believes that the suggested image fusion and object detection using fusion of coloured and IR images, together with the new multi-layer fusion approach, can be applied to other image fusion tasks, such as multi-exposure image fusion, medical image fusion and multi-focus image fusion.


The author would like to express his gratitude to Mr Krishna Kumar KP (Dean, Faculty of Electronics), Mr Kuldeep Yadav (HoD, CS dept) and Mr Abhijeet Kumar Sinha (Thesis Guide) for their instructions, astute guidance, valuable comments and suggestions, which have significantly improved this paper.


1. Yang, S., Wang, M., Jiao, L., Wu, R. and Wang, Z., 2010. Image fusion based on a new contourlet packet. Information Fusion, 11(2), pp.78-84.

2. Li, S., Kang, X. and Hu, J., 2013. Image fusion with guided filtering. IEEE Transactions on Image Processing, 22(7), pp.2864-2875.

3. Lu, X., Zhang, B., Zhao, Y., Liu, H. and Pei, H., 2014. The infrared and visible image fusion algorithm based on target separation and sparse representation. Infrared Physics & Technology, 67, pp.397-407.

4. Liu, Y., Chen, X., Ward, R.K. and Wang, Z.J., 2016. Image fusion with convolutional sparse representation. IEEE Signal Processing Letters, 23(12), pp.1882-1886.

5. Bavirisetti, D.P. and Dhuli, R., 2016. Two-scale image fusion of visible and infrared images using saliency detection. Infrared Physics & Technology, 76, pp.52-64.

6. Li, H., Wu, X.J. and Kittler, J., 2018, August. Infrared and visible image fusion using a deep learning framework. In 2018 24th International Conference on Pattern Recognition (ICPR) (pp. 2705-2710). IEEE.

7. Chen, J., Li, X., Luo, L., Mei, X. and Ma, J., 2020. Infrared and visible image fusion based on target-enhanced multiscale transform decomposition. Information Sciences, 508, pp.64-78.
