- Open Access
- Authors : B. Hela Saraswathi, Praveen C, V P S Naidu
- Paper ID : IJERTCONV6IS13016
- Volume & Issue : NCESC – 2018 (Volume 6 – Issue 13)
- Published (First Online): 24-04-2018
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Real-Time Implementation of Multi-Imaging Sensor Data Fusion Techniques
B. Hela Saraswathi
Dept. of Avionics Institute of Science & Technology,
Kakinada, AP, India
Multi Sensor Data Fusion Lab, CSIR-National Aerospace Laboratories
V P S Naidu
Multi Sensor Data Fusion Lab, CSIR-National Aerospace Laboratories
Abstract:- Enhanced Vision System (EVS) is an advanced technology that gives the pilot good situational awareness, which is essential for flying safely in poor visibility conditions. EVS uses Electro-Optical (EO) and Infra-Red (IR) imaging sensors. The individual images obtained from these sensors provide different information about the terrain and surroundings; when they are fused, the result carries more information and improves visual perception. Images from such multi-sensor systems can be fused using different techniques. For fusing the EO and IR images of EVS, four fusion methods, viz., Alpha Blending, Principal Component Analysis (PCA), Laplacian Pyramid, and Discrete Wavelet Transform (DWT), have been implemented and tested. The Laplacian pyramid based image fusion technique was found to provide better fusion than the other techniques.
General Terms:- Electro-Optical; Infra-Red; multi-sensors
Keywords:- Image fusion; image processing; Laplacian Pyramid; wavelets
INTRODUCTION
Image Fusion is a process of combining the features of two or more images of a scene into a single composite image. It results in an image which is more informative and more suitable for visual perception or computer processing. In image processing, there are many image fusion methods capable of doing this. When the images to be fused are of the same scene but have different Fields of View (FOV), image registration is required. While fusing EO (RGB) and IR (gray level) images, the fusion has to take place at the intensity level in order to retain the color information of the EO image. Therefore, the EO image has to be converted into an HSI (Hue Saturation Intensity) image prior to fusion. After the Intensity component (I) of the EO image is fused with the IR image, the H and S components of the EO image have to be combined with the fused intensity to get back the color information.
Image Registration
Image Registration is a process of aligning two or more images of the same scene. In this application, off-line image registration is done for two sample images from the EO and IR cameras using the Control Point Image Registration toolbox of MATLAB. Both EO and IR imaging sensors are on the same platform and there is no relative movement between them, hence online registration is not required. The transformation matrix (Affine Transform) obtained from MATLAB is used in the OpenCV project for the real-time image fusion. In control point image registration, the user has to manually select the same feature points on both images, as shown in Fig. 1.
Fig.1 Control Point Image Registration
Then, a geometric transformation (MATLAB function fitgeotrans) has to be fitted to the feature points, which gives the transformation matrix (Affine matrix) that can be used for real-time image registration.
The Affine matrix for the given scenario in Fig. 1 is as follows,
Affine Matrix= 0.9718
RGB to HSI Conversion
RGB and HSI are color models of an image. In the RGB model, colors are encoded by the amounts of red, green, and blue light emitted, and they are represented numerically by a set of three numbers, each of which ranges from 0 to 255. White has the highest RGB value of (255, 255, 255) and black has the lowest value of (0, 0, 0). The HSI color model encodes colors according to their Hue, Saturation, and Intensity. Here, the hue of a color is its angle on a color wheel: pure red is at 0°, pure green at 120°, and pure blue at 240°. Intensity is the overall lightness or brightness of the color, defined numerically as the average of the RGB values. The equations for the conversion of RGB values to HSI values are as follows,
$$I = \frac{R+G+B}{3} \qquad (1)$$
where I is the Intensity value. If $(R+G+B) = 0$,
$$S = 1 - 3\,\frac{\mathrm{Min}}{\epsilon} \qquad (2)$$
where Min is the smallest of R, G and B, and $\epsilon = 2.2204\times10^{-16}$ is epsilon. Otherwise,
$$S = 1 - 3\,\frac{\mathrm{Min}}{R+G+B} \qquad (3)$$
If $B \le G$,
$$H = \cos^{-1}\left[\frac{R - 0.5G - 0.5B}{\sqrt{R^2 + G^2 + B^2 - RG - RB - GB}}\right] \qquad (4)$$
otherwise,
$$H = \frac{360^\circ\,\pi}{180} - \cos^{-1}\left[\frac{R - 0.5G - 0.5B}{\sqrt{R^2 + G^2 + B^2 - RG - RB - GB}}\right] \qquad (5)$$
Where, the inverse cosine output is in radians.
1.3 HSI to RGB Conversion
The R, G, and B values are given by,
If $0 \le H < 120^\circ\pi/180$
$$R = I\left(1 + \frac{S\cos(H)}{\cos(60^\circ\pi/180 - H)}\right) \qquad (6)$$
$$G = 3I - (R+B) \qquad (7)$$
$$B = I(1 - S) \qquad (8)$$
If $120^\circ\pi/180 \le H < 240^\circ\pi/180$
$$R = I(1 - S) \qquad (9)$$
$$G = I\left(1 + \frac{S\cos(H - 120^\circ\pi/180)}{\cos(\pi - H)}\right) \qquad (10)$$
$$B = 3I - (R+G) \qquad (11)$$
If $240^\circ\pi/180 \le H < 360^\circ\pi/180$
$$R = 3I - (B+G) \qquad (12)$$
$$G = I(1 - S) \qquad (13)$$
$$B = I\left(1 + \frac{S\cos(H - 240^\circ\pi/180)}{\cos(300^\circ\pi/180 - H)}\right) \qquad (14)$$
RGB to HSI and HSI to RGB conversions are validated by comparing the original and reconstructed images, as shown in Fig. 2.
Fig.2 Validation of RGB to HSI conversion (a) Original RGB Image, (b) Reconstructed RGB Image, (c) Error Image
IMAGE FUSION TECHNIQUES
The steps involved in real-time image fusion of EO and IR camera images are shown in Fig. 3.
Fig.3 Steps involved in real-time image fusion
2.1 Alpha Blending
Alpha blending is a process of combining an image with a background image, in which the transparency of the background and foreground images is controlled by the Alpha value and (1 − Alpha) respectively.
$$I_f(x,y) = \mathrm{Alpha}\cdot I_1(x,y) + (1-\mathrm{Alpha})\cdot I_2(x,y), \qquad 0 \le \mathrm{Alpha} \le 1$$
where
$I_f$: fused image,
$I_1$ and $I_2$: input images to be fused,
$(x, y)$: pixel index.
In EVS, the EO image is taken as the background image because it has the larger field of view. The transparency of the background image is decided by the Alpha value provided by the user, which is limited to the range 0 to 1. The transparency of the foreground image is then equal to (1 − Alpha). Alpha blending can be done using the OpenCV function addWeighted.
Principal Component Analysis
The information flow diagram of the PCA-based image fusion algorithm is shown in Fig. 4. The input images (images to be fused) $I_1(x,y)$ and $I_2(x,y)$ are arranged in two column vectors and their empirical means are subtracted. The resulting vector has a dimension of $N \times 2$, where N is the length of each image vector. The eigenvectors and eigenvalues for the resulting vector are computed, and the eigenvector corresponding to the larger eigenvalue is obtained. The normalized components $P_1$ and $P_2$ (i.e., $P_1 + P_2 = 1$) are computed from the obtained eigenvector.
The fused image is:
$$I_f(x,y) = P_1 I_1(x,y) + P_2 I_2(x,y) \qquad (16)$$
Fig.4 PCA based image fusion
Laplacian Pyramid
An image pyramid is a multi-scale representation of an image in which the image is subjected to repeated smoothing and subsampling.
There are two types of image pyramids: low pass and band pass. A low pass pyramid is constructed by smoothing the image with an appropriate filter and then subsampling it by a factor of 2, repeatedly. The Gaussian pyramid is an example of a low pass pyramid.
$$I_{dk} = [\,g * I_k\,]_{\downarrow 2}$$
Here, g is the Gaussian convolution kernel, $I_k$ is the input image of the kth level of the Gaussian pyramid and $I_{dk}$ is the down-sampled image of the kth level. A band pass pyramid, in contrast, is constructed by forming the difference between adjacent levels of the image pyramid, with image interpolation performed between adjacent levels of resolution. The Laplacian is computed as the difference between the original image and the low pass filtered image, so that the Laplacian pyramid is a set of band pass filtered images.
$$I_{Lk} = I_k - g * I_{uk}$$
$I_{Lk}$ is the Laplacian image of the kth level. Here, $I_{uk}$ is the up-sampled image of the kth level, which can be achieved by interlacing zeros with the down-sampled image $I_{dk}$ of the kth level:
$$I_{uk}(2x, 2y) = I_{dk}(x, y), \quad \text{and zero elsewhere}$$
Laplacian based image fusion has two stages: pyramid construction and image fusion. In the first stage, the Laplacian pyramid is constructed for both EO and IR images from the respective Gaussian pyramids, as shown in Fig. 5.
Fig.5 Laplacian pyramid construction
In the second stage, fusion is performed on the final level images of the Gaussian pyramids of both EO and IR images by taking the average of these two images. The resultant image is up-sampled and added to the maximum image of the next level, and this is repeated in reverse order until the first level of the Laplacian pyramids is reached. The final resultant image is considered as the fused image.
Discrete Wavelet Transform
The wavelet transform can be regarded as an extension of the Fourier transform. In Fourier theory, the signal is decomposed into sines and cosines, whereas in the wavelet transform the signal is projected on a set of wavelet functions. While the Fourier transform provides good resolution in the frequency domain, the wavelet transform provides good resolution in both frequency and time domains. The Discrete Wavelet Transform uses a discrete set of wavelet scales and translations, and it decomposes the signal into a mutually orthogonal set of wavelets.
In the Discrete Wavelet Transform, the dilation factor is $a = 2^m$ and the translation factor is $b = n2^m$, where m and n are integers. The wavelet transform of a 1-D signal f(x) is defined as
$$W_{a,b}(f(x)) = \int_{-\infty}^{\infty} f(x)\,\psi_{a,b}(x)\,dx$$
Scaling properties are described by the scaling function, which is used to construct the wavelets. The translation and dilation of the mother wavelet is defined as,
$$\psi_{a,b}(x) = \frac{1}{\sqrt{a}}\,\psi\!\left(\frac{x-b}{a}\right)$$
The wavelet transform separately filters and down-samples the 2-D data (image) in the horizontal and vertical directions. The input image $I(x,y)$ is filtered by a low pass filter (L) and a high pass filter (H) in the horizontal direction and then down-sampled by a factor of two to create the coefficient matrices $I_L(x,y)$ and $I_H(x,y)$. Then the coefficient matrices are both low pass filtered and high pass filtered in the vertical direction and down-sampled by a factor of two, to create the sub-bands $I_{LL}(x,y)$, $I_{HL}(x,y)$, $I_{LH}(x,y)$ and $I_{HH}(x,y)$. $I_{LL}(x,y)$ represents the approximation of the input image $I(x,y)$. Then, $I_{HL}(x,y)$, $I_{LH}(x,y)$ and $I_{HH}(x,y)$ represent the horizontal, vertical and diagonal information of the input image $I(x,y)$ respectively. The inverse 2-D wavelet transform is used to reconstruct the original input image $I(x,y)$ from the sub-bands $I_{LL}(x,y)$, $I_{HL}(x,y)$, $I_{LH}(x,y)$ and $I_{HH}(x,y)$. This involves column up-sampling and filtering using low pass and high pass filters for each sub-image. Row up-sampling and filtering with low pass and high pass filters, followed by summation of all matrices, reconstructs the original image $I(x,y)$.
The wavelet coefficient matrices are obtained for both input images by applying the Discrete Wavelet Transform (DWT) on each input image. The wavelet coefficient matrix of each image contains approximation, horizontal, vertical and diagonal components. The fused wavelet coefficient matrix is obtained by applying fusion rules on both wavelet coefficient matrices. Finally, the Inverse Discrete Wavelet Transform (IDWT) is applied on the fused wavelet coefficient matrix to obtain the fused image, as shown in Fig. 6.
Fig.6 Wavelet based image fusion
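The wavelet fusion flow described above can be sketched in a few lines of Python. This is only an illustrative sketch, not the implementation used in this work: it assumes the Haar wavelet, one decomposition level, images with even dimensions stored as lists of rows, and a hypothetical fusion rule (average of the approximation sub-bands, larger absolute value for the detail sub-bands), since the fusion rules are not specified here. All function names are hypothetical.

```python
def haar_1d(v):
    """One Haar analysis step: (low pass, high pass), each down-sampled by two."""
    s = 2 ** 0.5
    low = [(v[i] + v[i + 1]) / s for i in range(0, len(v), 2)]
    high = [(v[i] - v[i + 1]) / s for i in range(0, len(v), 2)]
    return low, high


def ihaar_1d(low, high):
    """Inverse of haar_1d: up-sample and recombine the two halves."""
    s = 2 ** 0.5
    out = []
    for l, h in zip(low, high):
        out += [(l + h) / s, (l - h) / s]
    return out


def transpose(m):
    return [list(col) for col in zip(*m)]


def dwt2(img):
    """Single-level 2-D DWT. First letter: horizontal filter, second: vertical."""
    rows = [haar_1d(r) for r in img]              # horizontal filtering
    low_h = [r[0] for r in rows]
    high_h = [r[1] for r in rows]

    def vertical(m):                               # vertical filtering
        cols = [haar_1d(c) for c in transpose(m)]
        return transpose([c[0] for c in cols]), transpose([c[1] for c in cols])

    ll, lh = vertical(low_h)
    hl, hh = vertical(high_h)
    return ll, lh, hl, hh


def idwt2(ll, lh, hl, hh):
    """Inverse single-level 2-D DWT: columns first, then rows."""
    def ivertical(lo, hi):
        cols = [ihaar_1d(l, h) for l, h in zip(transpose(lo), transpose(hi))]
        return transpose(cols)

    low_h = ivertical(ll, lh)
    high_h = ivertical(hl, hh)
    return [ihaar_1d(l, h) for l, h in zip(low_h, high_h)]


def fuse(img1, img2):
    """Fuse two registered gray-level images in the wavelet domain."""
    def average(x, y):
        return (x + y) / 2.0

    def max_abs(x, y):
        return x if abs(x) >= abs(y) else y

    rules = [average, max_abs, max_abs, max_abs]   # assumed fusion rules
    fused = [
        [[rule(x, y) for x, y in zip(ra, rb)] for ra, rb in zip(ma, mb)]
        for rule, ma, mb in zip(rules, dwt2(img1), dwt2(img2))
    ]
    return idwt2(*fused)
```

Because the Haar analysis and synthesis steps are exact inverses, fusing an image with itself returns the same image, which is a convenient sanity check for the transform pair.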
PERFORMANCE EVALUATION METRICS
Usually, performance evaluation of fusion techniques is done using a reference image, but here a reference image is not available as this is a real-time application. So the performance of the fusion algorithms is evaluated using no-reference metrics.
3.1 Standard Deviation
Standard deviation is a measure used to quantify the amount of variation of a set of data values. A low standard deviation indicates that the data values tend to be close to the mean of the set, and a high standard deviation indicates that the data points are spread out over a wide range of values. In image processing, standard deviation is the variation of pixel values with respect to the mean of all pixel values of an image.
$$\sigma = \sqrt{\frac{1}{MN}\sum_{x=0}^{M-1}\sum_{y=0}^{N-1}\big(I_f(x,y) - \mu\big)^2}$$
Where, M and N represent the number of rows and columns, and $\mu$ is the mean of all pixel values in the image.
3.5 Fusion Mutual Information
Fusion Mutual Information indicates the degree of dependence of the fused image on the source images. A larger value of fusion mutual information implies better quality.
The joint histogram of source image $I_1(x,y)$ and $I_f(x,y)$ is $h_{I_1 I_f}(i,j)$, and that of source image $I_2(x,y)$ and $I_f(x,y)$ is $h_{I_2 I_f}(i,j)$. The mutual information is defined as
$$MI_{I_1 I_f} = \sum_{i}\sum_{j} h_{I_1 I_f}(i,j)\,\log_2\!\left(\frac{h_{I_1 I_f}(i,j)}{h_{I_1}(i)\,h_{I_f}(j)}\right)$$
$$MI_{I_2 I_f} = \sum_{i}\sum_{j} h_{I_2 I_f}(i,j)\,\log_2\!\left(\frac{h_{I_2 I_f}(i,j)}{h_{I_2}(i)\,h_{I_f}(j)}\right) \qquad (32)$$
The fusion mutual information is the sum of the two terms, $MI_{I_1 I_f} + MI_{I_2 I_f}$.
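As an illustration of the mutual information terms defined above, the following Python sketch estimates them from normalized joint and marginal histograms. It is a sketch under stated assumptions rather than the code used in this work: images are lists of rows of integer gray levels, and the function names are hypothetical.

```python
import math
from collections import Counter


def mutual_information(src, fused):
    """Mutual information (in bits) between a source image and the fused image."""
    a = [p for row in src for p in row]
    b = [p for row in fused for p in row]
    n = len(a)
    joint = Counter(zip(a, b))   # joint histogram of (source, fused) gray levels
    h_a = Counter(a)             # marginal histogram of the source image
    h_b = Counter(b)             # marginal histogram of the fused image
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        mi += p_ij * math.log2(p_ij / ((h_a[i] / n) * (h_b[j] / n)))
    return mi


def fusion_mutual_information(src1, src2, fused):
    """Sum of the mutual information of each source image with the fused image."""
    return mutual_information(src1, fused) + mutual_information(src2, fused)
```

Only gray-level pairs that actually occur are visited, so no logarithm of zero is ever evaluated.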
Entropy
Entropy is a measure of the information content of an image. Entropy is sensitive to noise and other unwanted rapid fluctuations. An image with high information content has high entropy.
$$He = -\sum_{i} h_{I_f}(i)\,\log_2 h_{I_f}(i)$$
where $h_{I_f}(i)$ is the normalized histogram of the fused image $I_f$. The unit of entropy is bits/pixel.
3.3 Cross Entropy
Cross entropy is used to verify the similarity in information content between the input and fused images. A low cross entropy indicates that the input and fused images are almost similar.
Overall cross entropy of the input images $I_1$, $I_2$ and the fused image $I_f$ is
$$CE(I_1, I_2 : I_f) = \frac{CE(I_1; I_f) + CE(I_2; I_f)}{2} \qquad (24)$$
$$CE(I_1; I_f) = \sum_{i} h_{I_1}(i)\,\log\!\left(\frac{h_{I_1}(i)}{h_{I_f}(i)}\right) \qquad (25)$$
$$CE(I_2; I_f) = \sum_{i} h_{I_2}(i)\,\log\!\left(\frac{h_{I_2}(i)}{h_{I_f}(i)}\right) \qquad (26)$$
3.6 Fusion Quality Index
The range of this metric is 0 to 1, and one indicates that the fused image contains all the information from the source images.
$$FQI = \sum_{w \in W} c(w)\Big(\lambda(w)\,QI(I_1, I_f \mid w) + \big(1-\lambda(w)\big)\,QI(I_2, I_f \mid w)\Big)$$
where $\lambda(w) = \dfrac{\sigma^2_{I_1}}{\sigma^2_{I_1} + \sigma^2_{I_2}}$ is computed over a window, $C(w) = \max(\sigma^2_{I_1}, \sigma^2_{I_2})$ over a window, $c(w)$ is a normalized version of $C(w)$, and $QI(I, I_f \mid w)$ is the quality index over a window for a given source image and fused image.
Spectral Angle Mapper
Spectral Angle Mapper calculates the angle in spectral space between the pixels and a set of reference spectra, for image classification based on spectral similarity.
$$SAM = \cos^{-1}\!\left(\frac{\sum t\,r}{\sqrt{\sum t^2}\,\sqrt{\sum r^2}}\right)$$
Where r is the reference spectra and t is the spectra found in each pixel.
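The spectral angle can be illustrated with a short Python sketch. It assumes that a pixel spectrum and a reference spectrum are given as equal-length sequences of numbers (for example, the R, G and B values of a pixel); the function name is hypothetical and the angle is returned in radians.

```python
import math


def spectral_angle(t, r):
    """Angle (radians) between pixel spectrum t and reference spectrum r."""
    dot = sum(a * b for a, b in zip(t, r))
    norm_t = math.sqrt(sum(a * a for a in t))
    norm_r = math.sqrt(sum(b * b for b in r))
    if norm_t == 0 or norm_r == 0:
        raise ValueError("spectral angle is undefined for an all-zero spectrum")
    # Clamp to [-1, 1] so rounding errors cannot push acos out of its domain.
    cosine = max(-1.0, min(1.0, dot / (norm_t * norm_r)))
    return math.acos(cosine)
```

A small angle means the two spectra have nearly the same direction regardless of their overall brightness, because the angle does not change when either spectrum is multiplied by a positive constant.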
3.4 Spatial Frequency
Spatial frequency refers to the level of detail present in a stimulus per degree of visual angle. A scene with small details and sharp edges contains more high spatial frequency information than one composed of large coarse stimuli. This metric indicates the overall activity level in the fused image.
$$SF = \sqrt{RF^2 + CF^2}$$
Where the row frequency of the image is
$$RF = \sqrt{\frac{1}{MN}\sum_{x=1}^{M}\sum_{y=2}^{N}\big[I_f(x,y) - I_f(x,y-1)\big]^2}$$
and the column frequency of the image is
$$CF = \sqrt{\frac{1}{MN}\sum_{y=1}^{N}\sum_{x=2}^{M}\big[I_f(x,y) - I_f(x-1,y)\big]^2} \qquad (29)$$
Average Contrast
Contrast is a visual characteristic that makes an object, or its representation in an image, distinguishable from other objects and the background. In visual perception, contrast is determined by the difference in color and brightness between the object and other objects within the same field of view, and a higher contrast value is preferable.
$$\bar{C} = \frac{1}{(M-1)(N-1)}\sum_{x=1}^{M-1}\sum_{y=1}^{N-1} \big|C(x,y)\big| \qquad (35)$$
Where, M and N represent the number of rows and columns of an image.
For an IR image, the contrast is the gradient calculated for the image as a single component:
$$C(x,y) = \nabla I(x,y), \qquad \nabla I(x,y) = \frac{\partial I(x,y)}{\partial x}\,\mathbf{i} + \frac{\partial I(x,y)}{\partial y}\,\mathbf{j} \qquad (36)$$
Where, $\nabla$ = gradient operator
$I(x,y)$ = image pixel value at (x, y)
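The spatial frequency and the gradient-based average contrast defined above can be sketched in Python as follows. This is a minimal sketch, assuming a gray-level image stored as a list of rows and forward differences as the discrete gradient (the discretization is an assumption, it is not specified here); the function names are hypothetical.

```python
import math


def spatial_frequency(img):
    """Spatial frequency from the row and column differences of the image."""
    m, n = len(img), len(img[0])
    rf2 = sum((img[x][y] - img[x][y - 1]) ** 2
              for x in range(m) for y in range(1, n)) / (m * n)
    cf2 = sum((img[x][y] - img[x - 1][y]) ** 2
              for x in range(1, m) for y in range(n)) / (m * n)
    return math.sqrt(rf2 + cf2)


def average_contrast(img):
    """Mean gradient magnitude over the (M-1) x (N-1) interior grid."""
    m, n = len(img), len(img[0])
    total = 0.0
    for x in range(m - 1):
        for y in range(n - 1):
            gx = img[x + 1][y] - img[x][y]   # forward difference along rows
            gy = img[x][y + 1] - img[x][y]   # forward difference along columns
            total += math.hypot(gx, gy)
    return total / ((m - 1) * (n - 1))
```

Both functions return zero for a perfectly flat image and grow as the image contains more edges and fine detail.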
Average gradient reflects the clarity of an image. It measures the spatial resolution in an image, i.e., a larger average gradient indicates a higher resolution. So, a higher value of Average Contrast is an indication of better image quality.
For a color image, the color contrast is given by the average of the gradients of Red, Green and Blue considered individually as follows:
$$C(x,y) = \frac{\big|\nabla R(x,y)\big| + \big|\nabla G(x,y)\big| + \big|\nabla B(x,y)\big|}{3} \qquad (37)$$
3.9 Average Luminance
Luminance describes the amount of light that passes through, or is emitted from, a particular area and falls within a given solid angle. It indicates how much luminous power will be perceived by an eye looking at the surface from a particular angle of view. Luminance is thus an indicator of how bright the surface will appear.
$$\bar{L} = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N} I(x,y)$$
For a color image,
$$\bar{L} = \frac{1}{MN}\sum_{x=1}^{M}\sum_{y=1}^{N} \frac{R(x,y) + G(x,y) + B(x,y)}{3}$$
A higher luminance value represents a higher brightness of the image.
Energy
Energy returns the sum of squared elements in the Gray Level Co-occurrence Matrix (GLCM). It is also known as uniformity, uniformity of energy or angular second moment. The energy lies between zero and one.
$$E = \sum_{i}\sum_{j} g(i,j)^2$$
Homogeneity
Homogeneity is a condition in which all the constituents are of the same nature. In image processing, homogeneity returns a value that measures the closeness of the distribution of elements in the Gray Level Co-occurrence Matrix (GLCM) to the GLCM diagonal, i.e., whether all the pixels in a block are within a specific dynamic range. The range of homogeneity is from zero to one. Homogeneity is 1 for a diagonal GLCM.
$$\mathrm{Homogeneity} = \sum_{i}\sum_{j} \frac{g(i,j)}{1 + |i - j|}$$
HARDWARE SETUP
4.1 LWIR Specifications
The LWIR camera incorporates an uncooled 324×256 pixel microbolometer. It has an internal heater to defrost its protective window. The LWIR camera has the technical specifications given in Table 1.
4.2 EO Color Camera
The EO camera with CMOS sensor has the technical specifications given in Table 2.
Table 2 Technical Specifications of EO camera
Field of View
38° (H) x 25° (V) with 6.8 mm lens
NTSC/PAL Analog, Raw RGB, 1.0 Vpp / 75 Ω
Composite Video Signal
Digital Core 5 V DC ~ 24 V DC
32 mm x 32 mm (without lens)
400 TV Lines
6 µm x 6 µm
4.752 mm x 3.036 mm
1 Lux (F1.2)
Sensoray Frame Grabber
A four-channel Sensoray Frame Grabber (2255) is used for capturing image frames from both cameras at a desired frame rate. The maximum frame rate that can be achieved with this frame grabber is 60 frames/sec when a single channel is used. When two channels are used, the frame rate is reduced to 30 frames/sec, and when all four channels are used the frame rate is further reduced to 15 frames/sec. Here, two channels are used, so the maximum frame rate achieved is 30 frames/sec. The digitized output from the frame grabber is given to the computer through a USB cable.
Both imaging sensors are operated using a +12 V wall adaptor or battery. The outputs of the cameras are connected to the frame grabber through two RS-170 ports respectively. The hardware setup used for developing the image fusion techniques is shown in Fig. 7.
Fig.7 Hardware Setup for EVS Image Fusion
A computer with the specification mentioned in Table 3 is needed to run this application.
Table 3 System Requirements
3.4 GHz processor or more
4GB or more
Hard Disk space
10GB or more
2.0 or higher version
Frame Grabber driver
DirectShow Windows driver version 1.1.10 should be installed
2008 or higher version
RESULTS AND DISCUSSIONS
Fusion methods are implemented using the Open Source Computer Vision (OpenCV) image processing library on the Visual Studio platform as a Win32 Console application, with C++ as the programming language. This application is capable of capturing real-time video data from two cameras and simultaneously fusing both camera outputs into a single video stream. Once the application is started, the user has to select the fusion method from the given options and also the mode of display, 0 for normal mode and 1 for full screen mode, as shown in Fig. 8.
Fig.8 User-interface for selecting the mode of display
The advantage of the fused output over the input images is verified under different environmental conditions such as day time, night time and fog. As the IR camera is temperature sensitive, hot areas, including human bodies, are highlighted in the IR images. So, in low light conditions when the EO camera cannot capture the scene properly, the IR camera helps to capture the actual scene. The intention of the fusion techniques is to merge features from both camera output images and give a single image which contains all the necessary information.
5.1 Image Fusion for Still Images
The initial development of the image fusion techniques started with still EO and IR images having the same Field of View (FOV), so image registration is not required. The performance of the fusion techniques for these still images is evaluated using fusion evaluation metrics. The reference still input images taken to perform image fusion are shown in Fig. 9.
Fig. 9 (a) EO Image, (b) IR Image
Alpha Blending
Due to the fog, the mountain is not visible in the EO image, as shown in Fig. 9 (a). At the same time, because of the weather penetrating nature of the IR camera, the mountain is visible in the IR image, but without color information, as shown in Fig. 9 (b). The fused images for various alpha values are shown in Fig. 10.
Fig. 10 (a) Fused image when Alpha is 0.2, (b) Fused image when Alpha is 0.5, (c) Fused image when Alpha is 0.7
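How the Alpha value shapes a single fused pixel can be sketched in Python as follows. This is a per-pixel illustration only, not the real-time code of this work (which uses OpenCV in C++): the EO pixel is converted to HSI, its intensity is blended with the IR gray level, and the result is converted back to RGB. The function names are hypothetical, the sector formulas follow the standard HSI model, and it is assumed that Alpha weights the EO image and (1 − Alpha) weights the IR image.

```python
import math

EPS = 2.2204e-16  # epsilon that replaces a zero denominator in the saturation


def rgb_to_hsi(r, g, b):
    """Convert one RGB pixel to (H, S, I); H is in radians."""
    total = r + g + b
    i = total / 3.0
    s = 1.0 - 3.0 * min(r, g, b) / (total if total != 0 else EPS)
    num = r - 0.5 * g - 0.5 * b
    den = math.sqrt(max(r * r + g * g + b * b - r * g - r * b - g * b, 0.0))
    # Gray pixels have no defined hue; 0 is an arbitrary choice for them.
    theta = math.acos(max(-1.0, min(1.0, num / den))) if den > 0 else 0.0
    h = theta if b <= g else 2.0 * math.pi - theta
    return h, s, i


def hsi_to_rgb(h, s, i):
    """Convert (H, S, I) back to RGB, one 120-degree sector at a time."""
    third = 2.0 * math.pi / 3.0  # 120 degrees in radians
    h = h % (2.0 * math.pi)
    if h < third:
        b = i * (1.0 - s)
        r = i * (1.0 + s * math.cos(h) / math.cos(math.pi / 3.0 - h))
        g = 3.0 * i - (r + b)
    elif h < 2.0 * third:
        r = i * (1.0 - s)
        g = i * (1.0 + s * math.cos(h - third) / math.cos(math.pi - h))
        b = 3.0 * i - (r + g)
    else:
        g = i * (1.0 - s)
        b = i * (1.0 + s * math.cos(h - 2.0 * third) / math.cos(5.0 * math.pi / 3.0 - h))
        r = 3.0 * i - (g + b)
    return r, g, b


def fuse_pixel(eo_rgb, ir_gray, alpha):
    """Blend the EO intensity with the IR gray level, keeping EO hue and saturation."""
    h, s, i = rgb_to_hsi(*eo_rgb)
    fused_i = alpha * i + (1.0 - alpha) * ir_gray
    return hsi_to_rgb(h, s, fused_i)
```

With Alpha equal to 1 the EO pixel is returned unchanged, and lowering Alpha moves the fused intensity towards the IR gray level while the hue and saturation of the EO pixel are kept. The returned values are not clipped, so a caller would still need to limit them to the 0 to 255 range before display.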
Principal Component Analysis
The difference between Alpha blending and the PCA technique is that PCA determines the weightage to be given to each input image dynamically, based on the eigenvectors of the input images.
The fused image for the PCA technique is shown in Fig. 11.
Fig. 11 Fused image using PCA method
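The way PCA derives the weightage from the input images themselves can be sketched in Python as follows. This is a minimal sketch, not the implementation used in this work: it assumes two registered gray-level images stored as lists of rows, uses the closed-form eigen-decomposition of the 2×2 covariance matrix, and the function names are hypothetical.

```python
import math


def pca_weights(img1, img2):
    """Return (P1, P2) from the principal eigenvector of the 2x2 covariance."""
    x = [p for row in img1 for p in row]
    y = [p for row in img2 for p in row]
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cxx = sum((a - mx) ** 2 for a in x) / (n - 1)
    cyy = sum((b - my) ** 2 for b in y) / (n - 1)
    cxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (n - 1)
    # Larger eigenvalue of the symmetric matrix [[cxx, cxy], [cxy, cyy]].
    lam = 0.5 * (cxx + cyy) + math.sqrt(0.25 * (cxx - cyy) ** 2 + cxy ** 2)
    if cxy != 0:
        v1, v2 = cxy, lam - cxx          # solves (cxx - lam) * v1 + cxy * v2 = 0
    else:
        v1, v2 = (1.0, 0.0) if cxx >= cyy else (0.0, 1.0)
    total = v1 + v2
    if abs(total) < 1e-12:               # degenerate case: fall back to equal weights
        return 0.5, 0.5
    return v1 / total, v2 / total


def pca_fuse(img1, img2):
    """Weighted sum of the two images using the PCA weights."""
    p1, p2 = pca_weights(img1, img2)
    return [[p1 * a + p2 * b for a, b in zip(r1, r2)] for r1, r2 in zip(img1, img2)]
```

The image with the larger variance receives the larger weight, which is consistent with the dynamic weighting described above. For negatively correlated inputs the normalized components can fall outside the 0 to 1 range, which is a property of this normalization rather than of the sketch.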
Laplacian Pyramid
The Laplacian pyramid technique is edge sensitive, so high weightage is given to the edges of all the objects in the input images. The fused images for various numbers of pyramid levels are shown in Fig. 12.
Fig. 12 (a) Fused image when 1 level is achieved, (b) Fused image when 2 levels are achieved, (c) Fused image when 3 levels are achieved, (d) Fused image when 4 levels are achieved
Discrete Wavelet Transform
The fused image for one level of wavelet transform is shown in Fig. 13.
Fig. 13 Fused Image when one level of wavelet transform is achieved
5.2 Image Fusion for Real Time Images
Fusion techniques are implemented for still images obtained from the real-time cameras, and the performance of the fusion techniques for the real-time images is evaluated using fusion performance metrics.
Alpha Blending
The frame rate achieved for the Alpha blending fusion method is 29 frames per second. It was tested and found that the frame rate is consistent for all Alpha values. For the outputs shown in Figs. 14 and 15, the alpha value is taken as 0.5, so equal weightage is given to both input images in the fused image. Here, the weightage is determined by the user and it cannot be modified once the application has started running. During day time, the EO camera's performance is better than the IR camera's, even though the high temperature portions are highlighted in the IR image as shown in Fig. 14 (b). So, the fused image contains most of the information from EO and temperature highlighted information from IR, as shown in Fig. 14 (c).
Fig. 14 (a) EO, (b) IR, (c) Fused image in day effect
During night time, the IR camera's performance is better than the EO camera's, as the EO camera does not work without a light source. So, the fused image contains most of the information from IR and some of the color information from EO, as shown in Fig. 15 (c).
Fig. 15 (a) EO, (b) IR, (c) Fused image in night effect
Principal Component Analysis
The frame rate achieved for the PCA based fusion method is 20 frames per second. During day time, the PCA technique gives high weightage to the EO image, as shown in Fig. 16 (c). So the fused image contains most of the information from EO and temperature highlighted information from IR.
Fig. 16 (a) EO, (b) IR, (c) Fused image in day effect
During night time, the PCA technique gives high weightage to the IR image, as shown in Fig. 17 (c). So the fused image contains most of the information from IR and color sensitive information from EO.
Fig. 17 (a) EO, (b) IR, (c) Fused image in night effect
Laplacian Pyramid
During day time, the Laplacian pyramid technique gives high weightage to the edges of the EO image, as shown in Fig. 18 (c). So the fused image contains most of the edge information from EO and temperature highlighted edge information from IR.
Fig. 18 (a) EO, (b) IR, (c) Fused image in day effect
During night time, the Laplacian pyramid technique gives high weightage to the edges of the IR image, as shown in Fig. 19 (c). So the fused image contains most of the edge information from IR and color sensitive edge information from EO.
Fig. 19 (a) EO, (b) IR, (c) Fused image in night effect
The number of levels of the image pyramid is limited by the size of the input image. The frame rate variation based on the number of levels is shown in Table 4.
Table 4 Frame Rate variation
Number of Levels
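The role of the number of levels can be illustrated with a small Python sketch of Laplacian pyramid fusion. For brevity it works on a 1-D signal (for example, one image row) rather than a full image, and it is only a sketch under stated assumptions: a [1, 2, 1]/4 smoothing kernel stands in for the Gaussian kernel, the top levels are averaged, the detail with the larger absolute value is kept at every other level, and all function names are hypothetical.

```python
def smooth(v):
    """Low pass filter with a [1, 2, 1] / 4 kernel and replicated borders."""
    n = len(v)
    return [(v[max(i - 1, 0)] + 2 * v[i] + v[min(i + 1, n - 1)]) / 4.0
            for i in range(n)]


def reduce_(v):
    """Smooth, then keep every second sample (one pyramid level down)."""
    return smooth(v)[::2]


def expand(v):
    """Interlace zeros, then smooth; the factor 2 restores the signal level."""
    up = [0.0] * (2 * len(v))
    up[::2] = v
    return [2.0 * s for s in smooth(up)]


def max_levels(n):
    """Number of times a signal of length n can be halved exactly."""
    levels = 0
    while n > 1 and n % 2 == 0:
        n //= 2
        levels += 1
    return levels


def laplacian_pyramid(v, levels):
    """Return (list of Laplacian levels, coarsest low pass level)."""
    lap, g = [], list(v)
    for _ in range(levels):
        down = reduce_(g)
        lap.append([a - b for a, b in zip(g, expand(down))])
        g = down
    return lap, g


def fuse(a, b, levels):
    """Fuse two signals: average the top level, keep the stronger detail below."""
    lap_a, top_a = laplacian_pyramid(a, levels)
    lap_b, top_b = laplacian_pyramid(b, levels)
    out = [(x + y) / 2.0 for x, y in zip(top_a, top_b)]
    for da, db in zip(reversed(lap_a), reversed(lap_b)):
        detail = [x if abs(x) >= abs(y) else y for x, y in zip(da, db)]
        out = [u + d for u, d in zip(expand(out), detail)]
    return out
```

Each additional level halves the signal length, so `max_levels` bounds the pyramid depth for a given size, and every extra level adds another smoothing, up-sampling and addition pass to the per-frame work.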
Discrete Wavelet Transform
The wavelet based image fusion method is a time consuming process because it is entirely based on pixel level operations. Because of this speed limitation, only one level of wavelet transform is implemented. The frame rate achieved for the wavelet based fusion method is 12 frames per second. During day time, the wavelet technique gives high weightage to the horizontal, vertical and diagonal information of the EO image, as shown in Fig. 20 (c). So the fused image contains most of the information from EO and temperature highlighted information from IR.
Fig. 20 (a) EO, (b) IR, (c) Fused image in day effect
During night time, the wavelet technique gives high weightage to the horizontal, vertical and diagonal information of the IR edges, as shown in Fig. 21 (c). So, the fused image contains most of the information from IR and color sensitive information from EO.
Fig. 21 (a) EO, (b) IR, (c) Fused image in night effect
CONCLUSION
Four real-time image fusion techniques, viz., Alpha Blending, Laplacian Pyramid, PCA and DWT, are implemented and tested. The performance of these methods is evaluated using fusion performance metrics. As per the results of the fusion performance metrics, it is concluded that the Laplacian pyramid based fusion method provides better results than the other fusion methods.
REFERENCES
V.P.S. Naidu, J.R. Raol, Pixel-level Image Fusion using Wavelets and Principal Component Analysis, Defence Science Journal, Vol. 58, No. 3, May 2008.
V.P.S. Naidu, L. Garlin Delphina, Assessment of Color and Infrared images using No-reference Image Quality Metrics, Proceedings of NCATC 2011.
Table 1 Technical Specifications of LWIR camera
Focal Plane Array (FPA), uncooled microbolometer 324 x 256 pixels
8 to 14 µm
Field of View
36° (H) x 27° (V) with 19 mm lens
100 mK at +25°C
8.3 Hz PAL
Automatic (25 m to infinity)
Analog, CCIR/PAL composite video, 75 Ω
When window temperature is below +4°C
6 – 16 V DC
2 W quiescent, 6 W max (with window heater on)
-40°C to +80°C
Storage temperature range
-57°C to +105°C (Extended storage time above +40°C is not recommended due to reduction in service life)
Hermetically sealed enclosure
530g shocks in two directions on 3 axes (30 total) 11 msec duration per IEC 60068-2-27-Ea
57.4 mm x 56.1 mm x 71.4 mm excluding connector which protrudes an additional 28.7 mm