 Open Access
 Authors : Ashvini Kulkarni, Manasi Vargantwar
 Paper ID : IJERTV3IS051491
 Volume & Issue : Volume 03, Issue 05 (May 2014)
Published (First Online): 27-05-2014
ISSN (Online) : 2278-0181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Video Based Tracking with MeanShift and Kalman Filter
Ashvini Kulkarni
Department of Electronics and Telecommunication Marathwada Institute of Technology
Aurangabad, India
Manasi Vargantwar
Department of Electronics and Telecommunication Marathwada Institute of Technology
Aurangabad, India
Abstract: Tracking objects in video sequences is receiving enormous interest in computer vision research. In this paper we contrast the performance of the MeanShift algorithm's gradient descent based search strategy with a Kalman filter based tracking algorithm, which models the dynamic motion of the target object to guide the estimate of the object's position through time. Experimental results of tracking a car demonstrate that the proposed Kalman filter tracker is efficient in dynamic environments and robust to occlusion, at the cost of a higher computational requirement, and that it helps to separate object pixels from background pixels for fast moving objects.
Keywords: Object Tracking, MeanShift, PDF, Kalman Filter

I. INTRODUCTION
Object tracking in video sequences has many applications, such as surveillance systems, public security and visual monitoring. In many surveillance applications, the region under video surveillance is simply too large for continuous observation of objects in the video streams. Some sort of automation is therefore required to alert the observer to the presence of objects within the surveillance region and to maintain track of these objects. Continual tracking is the most important objective in such applications. For continual tracking in a video sequence, the tracking algorithm must be able to track the object under difficult operating conditions such as low contrast and occlusion of the object being tracked.
The most popular algorithm for object tracking is the Mean shift algorithm [1]. This algorithm uses gradient-based optimization to locate the target, and it can track a moving object in a video sequence. It has a good ability to track articulated objects such as humans. However, it has a few weak points: as a gradient descent search strategy, MeanShift is susceptible to converging to regions of similar appearance surrounding the object being tracked.
To overcome this limitation of the MeanShift tracker, a Kalman filter is used. The Kalman filter has extensive applications in fields such as real-time graphics, robotics and computer vision. It is an optimal estimator that uses the information from measurements and previous states. In tracking, the Kalman filter has mainly been used for smoothing the object trajectory. In this paper we present the performance of the MeanShift and Kalman filter algorithms for tracking a fast moving vehicle in a video sequence.
The remainder of the paper is organized as follows. Section II describes the basic formulation of an object tracking system. Section III describes the MeanShift algorithm and its gradient descent search. Section IV covers the Kalman filter implementation. Section V describes the video sequences, and the test results are given in Section VI. The conclusion and future work are discussed in Section VII.

II. VIDEO-BASED TRACKING
Video-based tracking is the process of locating a fast or slow moving object over time. The main difficulty in video tracking is to associate target locations in consecutive video frames, especially when the objects are moving fast relative to the frame rate. For this reason, video tracking systems usually employ a motion model which describes how the image of the target might change for different possible motions of the object to track. A video tracking system has two important elements: (1) target representation and localization, and (2) filtering and data association. The first, target representation and localization, has low computational complexity. In this initial stage, target representation and localization is done with blob tracking, kernel based tracking or contour tracking algorithms.
The second, filtering and data association, involves incorporating prior information about the scene or object, dealing with object dynamics, and evaluating different hypotheses. The computational complexity of these algorithms is usually much higher. Kalman filtering and particle filtering are the most popular algorithms for filtering and data association.
For target representation and localization in the given video sequence we have selected kernel based tracking [2]; a detailed description is provided in Section III.

III. MEANSHIFT BASED OBJECT TRACKING
A. Probability Density Function
The convergence of the kernel based mean shift tracker is mainly determined by the kernel density estimate of the target feature and by the similarity between the target and the candidate inside the tracking (kernel) window, measured for example by the Bhattacharyya coefficient [1]. However, because the colour histogram is invariant to shifts of the target, the coefficient surface tends to be flat, which can make the mean shift algorithm converge to local extrema and result in imprecise target localization. The situation deteriorates further when the target moves out of the kernel window due to target or background motion. To make the colour feature more robust and discriminative for the mean shift based tracker, spatial information has to be incorporated.
The mean shift algorithm is an efficient, nonparametric method for mode seeking based on probability density estimation (PDE). Technically, the MeanShift algorithm estimates the colour PDF of an image patch from the kernel-weighted contribution of each pixel, and moves the kernel window towards the probable object location as follows [2]:
$$m(x) = \frac{\sum_{i=1}^{n} K(x_i - x)\, x_i}{\sum_{i=1}^{n} K(x_i - x)} \tag{1}$$
where $K(x)$ is the kernel function, and $x$ and $m(x)$ are the current and the next estimated position. Usually the kernel $K(x)$ is a function of $\|x\|^2$ through a kernel profile $k(x)$, that is $K(x) = k(\|x\|^2)$. The flat profile is given by [1]:
$$k(x) = \begin{cases} 1, & x \le 1 \\ 0, & \text{otherwise} \end{cases} \tag{2}$$
There are different kernel types, such as Gaussian (normal), uniform (flat) and Epanechnikov. In contrast to the flat profile, the Gaussian kernel is given by:
$$K(x) = \exp\left(-\|x\|^2\right) \tag{3}$$
The Parzen window is a very popular method for estimating the probability density function of the data within the kernel. Parzen window density estimation is essentially a data-interpolation technique [2]. Given a random sample of $n$ data points $x_i$, $i = 1 \ldots n$, in $d$-dimensional space, Parzen windowing estimates the PDF $\hat{f}(x)$ from which the sample was derived. It essentially superposes kernel functions placed at each observation, so that each observation $x_i$ contributes to the PDF estimate, defined as:
$$\hat{f}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} K\left(\frac{x - x_i}{h}\right) \tag{4}$$
where $h$ is the width of the window. The PDF estimate depends on the type of kernel window used. The Gaussian kernel profile is selected here, which gives, up to a normalizing constant:
$$\hat{f}(x) = \frac{1}{n h^d} \sum_{i=1}^{n} \exp\left(-\left\|\frac{x - x_i}{h}\right\|^2\right) \tag{5}$$
By using this PDF estimation technique, the MeanShift algorithm models the colour PDF of the target object and converges towards a probable target location.
B. Target Localization
The reference target model is represented by its PDF in the selected region space. For example, the reference model can be chosen to be the colour PDF of the target window. In the subsequent frame a target candidate is defined at location $y$ and is characterized by the PDF $p(y)$. Both PDFs are estimated from $m$-bin colour-weighted histograms. Thus we have,
Target model:
$$\hat{q} = \{\hat{q}_u\}_{u=1 \ldots m}, \qquad \sum_{u=1}^{m} \hat{q}_u = 1 \tag{6}$$
Target candidate:
$$\hat{p}(y) = \{\hat{p}_u(y)\}_{u=1 \ldots m}, \qquad \sum_{u=1}^{m} \hat{p}_u = 1 \tag{7}$$
The measure of the distance between the two densities is based on the Bhattacharyya coefficient, whose general form is given by [2]:
$$\rho(y) \equiv \rho[p(y), q] = \int \sqrt{p_z(y)\, q_z}\; dz \tag{8}$$
Properties of the Bhattacharyya coefficient, such as its relation to the Fisher measure of information, the quality of the sample estimate, and explicit forms for various distributions, are discussed in [2]. In practice the coefficient is estimated from sample estimates of the two densities $p$ and $q$, obtained from colour histograms. The density $\hat{q} = \{\hat{q}_u\}_{u=1 \ldots m}$ (with $\sum_{u=1}^{m} \hat{q}_u = 1$) is computed from the $m$-bin histogram of the target model, and $\hat{p}(y) = \{\hat{p}_u(y)\}_{u=1 \ldots m}$ (with $\sum_{u=1}^{m} \hat{p}_u = 1$) is estimated at a given location $y$ from the $m$-bin histogram of the target candidate. Therefore the sample estimate of the Bhattacharyya coefficient is given by:
$$\hat{\rho}(y) \equiv \rho[\hat{p}(y), \hat{q}] = \sum_{u=1}^{m} \sqrt{\hat{p}_u(y)\, \hat{q}_u} \tag{9}$$
Based on equation (9) we can define the distance between the two estimates as
$$d(y) = \sqrt{1 - \rho[\hat{p}(y), \hat{q}]} \tag{10}$$
C. Weighted Histogram Computation
Target Feature: The pixel locations of the target feature are denoted by $\{x_i^*\}_{i=1 \ldots n}$. Let $b: R^2 \to \{1 \ldots m\}$ be the function which associates to the pixel at location $x_i^*$ the index $b(x_i^*)$ of the histogram bin corresponding to the colour of that pixel. The probability of colour $u$ in the target feature is computed with a kernel profile $k$ that assigns smaller weights to locations farther from the centre of the target. Assuming that the generic coordinates $x$ and $y$ are normalized with $h_x$ and $h_y$ respectively, the weighted histogram of the target feature is given by:
$$\hat{q}_u = C \sum_{i=1}^{n} k\left(\|x_i^*\|^2\right) \delta[b(x_i^*) - u] \tag{11}$$
where $\delta$ is the Kronecker delta function, defined as
$$\delta[a] = \begin{cases} 0, & a \ne 0 \\ 1, & a = 0 \end{cases} \tag{12}$$
The normalizing constant $C$ is derived by imposing the condition in equation (6), and is given by:
$$C = \frac{1}{\sum_{i=1}^{n} k\left(\|x_i^*\|^2\right)} \tag{13}$$
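The kernel-weighted histogram of the target feature and the Bhattacharyya-based distance described in this section can be sketched as follows. This is a minimal illustration rather than the authors' MATLAB implementation: it assumes grey-level pixels, a Gaussian profile k(r) = exp(-r), and the hypothetical helper names `weighted_histogram`, `bhattacharyya` and `distance`.

```python
import math

def gaussian_profile(r2):
    """Kernel profile k(r2): the weight decays with the squared distance r2."""
    return math.exp(-r2)

def weighted_histogram(pixels, centre, h, m, bin_of):
    """Kernel-weighted m-bin histogram of a patch.

    pixels : list of ((x, y), colour) tuples
    centre : (cx, cy) centre of the patch
    h      : scale used to normalise the coordinates (assumed isotropic)
    bin_of : function b(.) mapping a colour to a bin index in 0..m-1
    """
    hist = [0.0] * m
    for (x, y), colour in pixels:
        r2 = ((x - centre[0]) / h) ** 2 + ((y - centre[1]) / h) ** 2
        hist[bin_of(colour)] += gaussian_profile(r2)
    total = sum(hist)  # equals 1/C, so dividing enforces sum(hist) == 1
    return [v / total for v in hist] if total > 0 else hist

def bhattacharyya(p, q):
    """Sample estimate of the Bhattacharyya coefficient of two histograms."""
    return sum(math.sqrt(pu * qu) for pu, qu in zip(p, q))

def distance(p, q):
    """Distance between two histograms: sqrt(1 - coefficient)."""
    return math.sqrt(max(0.0, 1.0 - bhattacharyya(p, q)))

# Tiny grey-level patch: two dark pixels near the centre, one bright pixel.
patch = [((0, 0), 10), ((1, 0), 10), ((0, 1), 200)]
q_model = weighted_histogram(patch, (0, 0), 2.0, 4, lambda c: c * 4 // 256)
```

Identical histograms give a distance close to 0 and histograms with no common bin give a distance of 1, which is why minimizing the distance is the same as maximizing the coefficient.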
Candidate Feature: The candidate feature is centred at location $y$ in the current frame, and its pixel locations are denoted by $\{x_i\}_{i=1 \ldots n_h}$. Using the same weighting profile $k$, the probability of colour $u$ in the candidate feature is given by
$$\hat{p}_u(y) = C_h \sum_{i=1}^{n_h} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \delta[b(x_i) - u] \tag{14}$$
The scale of the candidate feature is determined by the constant $h$, which plays an important role in kernel density estimation. With this scale we obtain the normalization constant as
$$C_h = \frac{1}{\sum_{i=1}^{n_h} k\left(\left\|\frac{y - x_i}{h}\right\|^2\right)} \tag{15}$$
Here $C_h$ does not depend on $y$; it can differ for a given kernel with different values of $h$.
D. Distance Minimization
The search for the new target location in the current frame starts at the location $\hat{y}_0$ of the target computed in the previous frame. Thus, the probabilities $\{\hat{p}_u(\hat{y}_0)\}_{u=1 \ldots m}$ of the candidate feature at location $\hat{y}_0$ in the current frame must be computed first.
The minimization of the distance in equation (10) is equivalent to the maximization of the Bhattacharyya coefficient (9). Around these values the coefficient can be approximated with a Taylor series expansion:
$$\rho[\hat{p}(y), \hat{q}] \approx \frac{1}{2} \sum_{u=1}^{m} \sqrt{\hat{p}_u(\hat{y}_0)\, \hat{q}_u} + \frac{1}{2} \sum_{u=1}^{m} \hat{p}_u(y) \sqrt{\frac{\hat{q}_u}{\hat{p}_u(\hat{y}_0)}} \tag{16}$$
Substituting the kernel density estimate of the candidate feature from equation (14) for all values of $u = 1 \ldots m$, this can be written as
$$\rho[\hat{p}(y), \hat{q}] \approx \frac{1}{2} \sum_{u=1}^{m} \sqrt{\hat{p}_u(\hat{y}_0)\, \hat{q}_u} + \frac{C_h}{2} \sum_{i=1}^{n_h} w_i\, k\left(\left\|\frac{y - x_i}{h}\right\|^2\right) \tag{17}$$
where
$$w_i = \sum_{u=1}^{m} \sqrt{\frac{\hat{q}_u}{\hat{p}_u(\hat{y}_0)}}\; \delta[b(x_i) - u] \tag{18}$$
The second term in equation (17) represents the density estimate computed with kernel profile $k(x)$ at $y$ in the current frame, with the weights given in equation (18). In this procedure the kernel is recursively moved from the current location $\hat{y}_t$ to the new location $\hat{y}_{t+1}$, starting from $\hat{y}_0$, according to
$$\hat{y}_{t+1} = \frac{\sum_{i=1}^{n_h} x_i\, w_i\, g\left(\left\|\frac{\hat{y}_t - x_i}{h}\right\|^2\right)}{\sum_{i=1}^{n_h} w_i\, g\left(\left\|\frac{\hat{y}_t - x_i}{h}\right\|^2\right)} \tag{19}$$
where $g(x) = -k'(x)$, the negative derivative of the kernel profile $k(x)$ with respect to $x$.

IV. KALMAN FILTER OBJECT TRACKING
In this section, we propose a dynamic scheme for the Kalman filter in which the elements of its state matrix are updated depending on an evaluation of the quality of the observation. By this means the tracking procedure may be significantly accelerated.
The Kalman filter is a state space estimation method in which a dynamic linear model of the target feature is used to predict and correct its position through time. This dynamic system can be disturbed by noise. The Kalman filter always has a prediction stage and a correction stage, both in the presence of Gaussian noise, given by the following equations:
$$K_k = P_k^{+} H^T \left[ H P_k^{+} H^T + R \right]^{-1} \tag{20}$$
$$\hat{x}_k = x_k^{+} + K_k \left[ z_k - H x_k^{+} \right] \tag{21}$$
$$\hat{P}_k = \left[ I - K_k H \right] P_k^{+} \tag{22}$$
$$x_{k+1}^{+} = \Phi\, \hat{x}_k \tag{23}$$
$$P_{k+1}^{+} = \Phi\, \hat{P}_k\, \Phi^T + Q \tag{24}$$
where $K$ is the Kalman gain, $\hat{x}$ is the state estimate, $\hat{P}$ is the state covariance estimate, and $\Phi$ is the state transition matrix governing the propagation of the state forward in time from discrete time $k$ to $k+1$. $H$ is the measurement matrix that relates measurements to the state, $R$ is the a priori measurement noise covariance matrix, $Q$ is the a priori process noise covariance matrix, and $z$ is the measurement vector. The $+$ sign indicates the prediction of the state and covariance estimates one step forward in time.
Our object in the video sequence is described by its centre coordinates $(x, y)$, and its position varies over time according to equation (23). We assume zero-mean Gaussian noise with variance 0.5 and covariance matrix $Q$.
The state model matrix is
$$H = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \end{bmatrix}$$
and the state transform matrix is
$$\Phi = \begin{bmatrix} 1 & 0 & 0 & 0 \\ 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
With these parameters the object in the video sequence is tracked with the Kalman filter.

V. TEST DESCRIPTION
The test cases are used to evaluate the detected object in video sequences. One video sequence is used to track a car. This video sequence has 300 frames at a frame rate of approximately 29 Hz, and each frame is 640 by 480 pixels. In this video sequence, each detected car is represented by its centroid. Figure 1 shows a few sample frames of the video sequence with a view of the cars.
Fig. 1. Sample video sequence
We first test the MeanShift gradient descent search on the car video sequence for a selected car. To select the car, a patch is drawn on the first frame of the video, and that patch is then updated in the subsequent frames by the MeanShift algorithm.
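The patch update can be illustrated with the basic mean shift iteration of equation (1). The sketch below works on a small set of 2-D sample points rather than on image pixels, and assumes a Gaussian kernel of width h; the function names `mean_shift_step` and `mean_shift`, the tolerance and the iteration limit are all illustrative choices, not values from the experiments.

```python
import math

def mean_shift_step(x, points, h):
    """One update m(x): the kernel-weighted mean of the sample points."""
    num_x = num_y = den = 0.0
    for px, py in points:
        d2 = ((px - x[0]) ** 2 + (py - x[1]) ** 2) / (h * h)
        w = math.exp(-d2)  # Gaussian kernel weight K(x_i - x)
        num_x += w * px
        num_y += w * py
        den += w
    return (num_x / den, num_y / den)

def mean_shift(start, points, h, tol=1e-3, max_iter=50):
    """Repeat the update until the window centre moves by less than tol."""
    x = start
    for _ in range(max_iter):
        nxt = mean_shift_step(x, points, h)
        if math.hypot(nxt[0] - x[0], nxt[1] - x[1]) < tol:
            return nxt
        x = nxt
    return x

# A dense cluster around (10, 10) and a distant distractor cluster.
samples = [(9, 10), (10, 9), (11, 10), (10, 11), (10, 10), (50, 50), (51, 50)]
mode = mean_shift((8.0, 9.0), samples, h=3.0)
```

Started near the first cluster, the iteration settles on that cluster's centre and ignores the distant one. This local behaviour is also why the tracker can lock onto nearby regions of similar appearance.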
In the next part, the same video sequence is tested with the Kalman filter algorithm: we detect the first car in the video sequence and keep updating the detected car position with the estimated position of the car.
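One correction and prediction cycle of the Kalman filter of Section IV, applied to a measured car centroid, can be sketched as below. The sketch assumes a four-element state whose first two components are the centre coordinates, the identity state transform matrix and the 2-by-4 state model matrix given in Section IV, and process noise variance 0.5. The measurement noise covariance R = I is a hypothetical value, and the helper names are illustrative rather than taken from the authors' MATLAB code.

```python
def matmul(a, b):
    return [[sum(a[i][k] * b[k][j] for k in range(len(b)))
             for j in range(len(b[0]))] for i in range(len(a))]

def transpose(a):
    return [list(row) for row in zip(*a)]

def add(a, b):
    return [[u + v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def sub(a, b):
    return [[u - v for u, v in zip(ra, rb)] for ra, rb in zip(a, b)]

def identity(n):
    return [[1.0 if i == j else 0.0 for j in range(n)] for i in range(n)]

def inv2(m):
    """Inverse of a 2x2 matrix (the innovation covariance is 2x2 here)."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

PHI = identity(4)                                    # state transform matrix
H = [[1.0, 0.0, 0.0, 0.0], [0.0, 1.0, 0.0, 0.0]]     # state model matrix
Q = [[0.5 * v for v in row] for row in identity(4)]  # process noise, var 0.5
R = identity(2)                                      # assumed measurement noise

def kalman_step(x_pred, p_pred, z, phi=PHI, h=H, q=Q, r=R):
    """Correct the predicted state with measurement z, then predict ahead."""
    ht = transpose(h)
    s = add(matmul(matmul(h, p_pred), ht), r)
    gain = matmul(matmul(p_pred, ht), inv2(s))                 # Kalman gain
    x_est = add(x_pred, matmul(gain, sub(z, matmul(h, x_pred))))
    p_est = matmul(sub(identity(len(p_pred)), matmul(gain, h)), p_pred)
    x_next = matmul(phi, x_est)                                # predicted state
    p_next = add(matmul(matmul(phi, p_est), transpose(phi)), q)
    return x_est, p_est, x_next, p_next

# Feed measured centroids frame by frame; x_next is the predicted position.
x_pred, p_pred = [[0.0], [0.0], [0.0], [0.0]], identity(4)
for zx, zy in [(2.0, 4.0), (3.0, 5.0), (4.0, 6.0)]:
    x_est, p_est, x_pred, p_pred = kalman_step(x_pred, p_pred, [[zx], [zy]])
```

With an identity state transform matrix the predicted position for the next frame equals the current corrected estimate. A constant-velocity transition would extrapolate the motion instead, but that is a different model from the one stated in Section IV.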

VI. RESULT
To test our implementation we tried many videos; one common video is presented here to give a convincing comparison of the MeanShift and Kalman filter algorithms for tracking cars successfully.
In our implementation the car is selected with a 3×2 patch for the MeanShift algorithm, while for the Kalman filter the whole car region in the video is taken as the reference, including its centroid. All experiments were carried out on a Pentium IV 3.2 GHz PC with 512 MB of memory under MATLAB.
Fig. 2. MeanShift tracker
Fig. 3. Kalman filter tracker
Figure 2 illustrates the target car selected through a patch. The selected patch region is modelled with the density estimation function, and in each following frame the patch is updated by the MeanShift algorithm according to the similarity between the estimated values of the current and the next frame.
Figure 3 illustrates the target car selected as the maximum-area region in the initial frame together with its centroid, which is the current position in the first frame and is highlighted by the blue rectangle. The Kalman filter predicts the position of the car in the next frame through the gain update given by equation (20). This predicted position is shown by the red rectangle in Figure 3.
TABLE I. PERFORMANCE OF MEANSHIFT AND KALMAN FILTER ALGORITHM
Algorithm        No. of frames tracked    Elapsed time (seconds)
MeanShift        352                      34.082
Kalman filter    352                      35.130
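The elapsed times in Table I can be turned into per-frame figures with simple arithmetic. The snippet below uses only the frame counts and times reported in the table; the variable names are illustrative.

```python
# (frames tracked, elapsed seconds) as reported in Table I
results = {"MeanShift": (352, 34.082), "Kalman filter": (352, 35.130)}

# Average processing time per frame and the equivalent frames per second
per_frame = {name: t / n for name, (n, t) in results.items()}
fps = {name: n / t for name, (n, t) in results.items()}

# Relative extra time of the Kalman filter tracker, in percent
overhead = (results["Kalman filter"][1] / results["MeanShift"][1] - 1.0) * 100.0
```

This works out to roughly 0.097 s per frame for MeanShift and 0.100 s per frame for the Kalman filter, so the Kalman filter tracker is about 3% slower on this sequence.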

VII. CONCLUSION
In this paper, we have described and evaluated the use of the Kalman filter approach to improve the tracking performance for a detected object in a video sequence. We have shown that MeanShift lacks a segmentation step to separate object pixels from background pixels. As a result, its track box is not tight around the object, and the track is lost when the object moves fast or when the background changes from light to dark. Furthermore, this framework can be naturally extended to multi-object tracking and to optimizing the tracking time per object.
REFERENCES

[1] D. Comaniciu and V. Ramesh, "Mean Shift and Optimal Prediction for Efficient Object Tracking," Proceedings of the IEEE Conference on Image Processing, vol. 3, pp. 70-73, Vancouver, Canada, 2000.

[2] D. Comaniciu, V. Ramesh, and P. Meer, "Kernel-Based Object Tracking," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 25, no. 5, pp. 564-577, May 2003.

[3] A. Mohan, C. Papageorgiou, and T. Poggio, "Example-Based Object Detection in Images by Components," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 23, no. 4, pp. 349-361, April 2001.

[4] C. Tomasi and T. Kanade, "Detection and Tracking of Point Features," Technical Report CMU-CS-91-132, April 1991.

[5] M. S. Grewal and A. P. Andrews, Kalman Filtering: Theory and Practice Using MATLAB, Second Edition, John Wiley & Sons, 2001.