Adaptive and Automatic Target Detection Without Post-Processing for Indoor Video Sequence

DOI : 10.17577/IJERTV2IS50725

Charan K1, Shreedevi B A2

Dept. of Information Science Engineering RV College of Engineering

Abstract

This work develops a background subtraction methodology for indoor color video sequences. Indoor scenes pose problems because of lighting changes, and the intensity of focused light can affect target detection. The proposed method first applies a homomorphic filtering algorithm directly to the color images to nullify the effect of illumination changes, then builds a background model using an extended Gaussian Mixture Model, and finally detects moving foreground objects in indoor sequences using Euclidean distance as the measure. Background subtraction is the first step any automatic target detection system needs in order to work effectively. The proposed algorithm works directly on color images and, compared with existing algorithms, achieves some improvement. Results illustrate that the proposed work performs well under varying illumination in indoor sequences.

  1. Introduction

    Background subtraction is one of the most widely used techniques for detecting moving objects in a static scene [1]-[3]. Since background subtraction is a low-level task, it should consider two aspects: accuracy and computational resources (time and memory). First, its accuracy is critical because the output of background subtraction is used by higher-level tasks such as tracking and recognition. Second, the computational resources it uses are critical because whatever remains after this low-level task must serve the high-level tasks, and a lightweight method is preferable for implementing the task in real-time embedded systems such as smart cameras [4], [5]. Therefore, it is important for a background subtraction method to achieve high accuracy and low resource requirements at the same time.

    Background subtraction is a typical approach to foreground detection in indoor sequences: it detects foreground regions by evaluating the difference between the input image and a reference background image. Objects are tracked through one of the available background models. Many background subtraction methods exist, but many of them fail to give correct results for indoor sequences because of illumination changes.

    The Gaussian Mixture Model has been used for this task, but it cannot adapt to the illumination changes that can occur in each scene. To overcome this, the adaptive Gaussian Mixture Model was proposed, which adapts frequently to the changes occurring in each scene. Even with this improved version, considerable distortion remains and false results are produced. To overcome this, a homomorphic filter is used, which can extract the intensity, lightness and brightness of each scene; this helps to nullify the illumination component and produces good results for each scene. The adaptive Gaussian Mixture Model is then applied to the homomorphic filter output to obtain the foreground object.

    The paper is organized as follows. The next section lists some related work. Section 3 reviews the Improved GMM approach from [17]. Section 4 presents the proposed method for improving the algorithm. Section 5 presents some experiments.

  2. Related work

    It is well known that the appearance of an image can be severely affected by illumination, which causes large variations between images. A very promising technique for compensating illumination changes is homomorphic filtering, based on the illumination-reflectance model [6], [7]. The procedure is well known in image analysis and processing but has rarely been applied to this problem.
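For reference, the illumination-reflectance model behind homomorphic filtering is usually written as below. This is the standard textbook formulation (see, e.g., [6]); the symbols $f$, $i$, $r$, $H$ and $g$ are notation introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
  f(x,y) &= i(x,y)\, r(x,y)
    && \text{image = illumination} \times \text{reflectance} \\
  \ln f(x,y) &= \ln i(x,y) + \ln r(x,y)
    && \text{the logarithm turns the product into a sum} \\
  g(x,y) &= \exp\!\Big( \mathcal{F}^{-1}\big\{ H(u,v)\, \mathcal{F}\{ \ln f(x,y) \} \big\} \Big)
    && \text{filter in the frequency domain, then invert the logarithm}
\end{align}
```

Illumination $i$ typically varies slowly across the scene and so occupies the low frequencies, while reflectance $r$ carries the high-frequency detail. A transfer function $H$ that attenuates low frequencies and preserves or boosts high ones therefore suppresses the illumination component.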

    2.1 Background Modelling

    This step uses the new video frames to calculate and update the background model. The main aim in developing a background model is that it should be robust against environmental changes in the background, yet sensitive enough to identify all moving objects. Frame differencing is considered the simplest form of background subtraction. The method subtracts each pixel of the previous frame from the corresponding pixel of the current frame; if the absolute difference is greater than a set threshold, the pixel is considered foreground. Its advantages are low complexity, quick adaptation to changes, and effective separation of background from foreground. Its disadvantages are that it is not robust and is applicable only with a fixed camera.
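The frame differencing step described above can be sketched in a few lines of NumPy. This is a minimal illustration, not the authors' implementation: the function name `frame_difference_mask` and the threshold value of 25 are assumptions chosen for the example, and the frames are assumed to be grayscale arrays of equal size.

```python
import numpy as np


def frame_difference_mask(prev_frame, curr_frame, threshold=25):
    """Return a boolean foreground mask from two grayscale frames.

    `threshold` is a hypothetical value; the paper does not state one.
    """
    # Cast to a signed type so the subtraction cannot wrap around for uint8 input.
    diff = np.abs(curr_frame.astype(np.int32) - prev_frame.astype(np.int32))
    return diff > threshold


# Toy example: an empty scene, then a bright 2x2 "object" appears.
prev_frame = np.zeros((4, 4), dtype=np.uint8)
curr_frame = prev_frame.copy()
curr_frame[1:3, 1:3] = 200
mask = frame_difference_mask(prev_frame, curr_frame)
```

Because only two consecutive frames are compared, an object that stops moving disappears from the mask, which is one reason the method is described as not robust.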

  3. Improved GMM

    A Gaussian Mixture Model (GMM) [17] is a parametric probability density function represented as a weighted sum of Gaussian component densities. GMMs are commonly used as a parametric model of the probability distribution of continuous measurements or features in a biometric system, such as vocal-tract related spectral features in a speaker recognition system. GMM parameters are estimated from training data using the iterative Expectation-Maximization (EM) algorithm or Maximum A Posteriori (MAP) estimation from a well-trained prior model.

    The proposed system builds on the Gaussian Mixture Model. According to the literature, the basic GMM fails to adapt to illumination changes and to identify the foreground object reliably; to resolve these problems, the improved adaptive Gaussian Mixture Model is used for background subtraction [17].

    The illumination in the scene could change suddenly (for example, lights being switched on or off). A new object could be brought into the scene, or a present object removed from it. In order to adapt to such changes, the training set is updated by adding new samples and discarding the old ones [17]. Choose a reasonable time period $T$; at time $t$ the training set is

    $$\mathcal{X}_T = \{x^{(t)}, \ldots, x^{(t-T)}\}$$ (1)

    For each new sample, update the training data set $\mathcal{X}_T$ and re-estimate

    $$\hat{p}(x \mid \mathcal{X}_T, BG)$$ (2)

    Among the samples there could be some values that belong to the foreground objects, so this estimate should be denoted as

    $$\hat{p}(x^{(t)} \mid \mathcal{X}_T, BG+FG)$$ (3)

    A GMM with $M$ components is used:

    $$\hat{p}(x \mid \mathcal{X}_T, BG+FG) = \sum_{m=1}^{M} \hat{\pi}_m \, \mathcal{N}(x; \hat{\mu}_m, \hat{\sigma}_m^2 I)$$ (4)

    where $\hat{\mu}_1, \ldots, \hat{\mu}_M$ are the estimates of the means and $\hat{\sigma}_1, \ldots, \hat{\sigma}_M$ are the estimates of the variances that describe the Gaussian components. The covariance matrices are assumed to be diagonal, and the identity matrix $I$ has proper dimensions. The mixing weights, denoted by $\hat{\pi}_m$, are non-negative and add up to one. Given a new data sample $x^{(t)}$ at time $t$, the recursive update equations are

    $$\hat{\pi}_m \leftarrow \hat{\pi}_m + \alpha \left( o_m^{(t)} - \hat{\pi}_m \right)$$ (5)

    $$\hat{\mu}_m \leftarrow \hat{\mu}_m + o_m^{(t)} \left( \alpha / \hat{\pi}_m \right) \delta_m$$ (6)

    $$\hat{\sigma}_m^2 \leftarrow \hat{\sigma}_m^2 + o_m^{(t)} \left( \alpha / \hat{\pi}_m \right) \left( \delta_m^T \delta_m - \hat{\sigma}_m^2 \right)$$ (7)

    where $\delta_m = x^{(t)} - \hat{\mu}_m$. Instead of the time interval $T$ mentioned above, here the constant $\alpha$ describes an exponentially decaying envelope that is used to limit the influence of the old data. The same notation is kept, bearing in mind that approximately $\alpha = 1/T$. For a new sample, the ownership $o_m^{(t)}$ is set to 1 for the "close" component with the largest $\hat{\pi}_m$, and the others are set to zero. The squared distance from the $m$-th component is calculated as

    $$D_m^2(x^{(t)}) = \frac{\delta_m^T \delta_m}{\hat{\sigma}_m^2}$$ (8)

    If there are no "close" components, a new component is generated with

    $$\hat{\pi}_{M+1} = \alpha, \quad \hat{\mu}_{M+1} = x^{(t)}$$ (9)

    and

    $$\hat{\sigma}_{M+1} = \sigma_0$$ (10)

    where $\sigma_0$ is some appropriate initial variance. If the maximum number of components is reached, discard the component with the smallest $\hat{\pi}_m$.

    The foreground objects will be represented by some additional clusters with small weights $\hat{\pi}_m$. Therefore, approximate the background model by the first $B$ largest clusters:

    $$p(x \mid \mathcal{X}_T, BG) \sim \sum_{m=1}^{B} \hat{\pi}_m \, \mathcal{N}(x; \hat{\mu}_m, \hat{\sigma}_m^2 I)$$ (11)

    If the components are sorted to have descending weights, then

    $$B = \arg\min_b \left( \sum_{m=1}^{b} \hat{\pi}_m > (1 - c_f) \right)$$ (12)

    where $c_f$ is a measure of the maximum portion of the data that can belong to foreground objects without influencing the background model. For example, if a new object comes into a scene and remains static for some time, it will probably generate an additional stable cluster. Since the old background is occluded, the weight $\hat{\pi}_{B+1}$ of the new cluster will be constantly increasing. If the object remains static long enough, its weight becomes larger than $c_f$ and it can be considered to be part of the background. Looking at equation (5), one can conclude that the object should be static for approximately $\log(1 - c_f) / \log(1 - \alpha)$ frames.

    The weight $\hat{\pi}_m$ describes how much of the data belongs to the $m$-th component of the GMM. It can be regarded as the probability that a sample comes from the $m$-th component, and in this way the $\hat{\pi}_m$ define an underlying multinomial distribution. Let us assume that there are $t$ data samples and each of them belongs to one of the components of the GMM. Let us also assume that the number of samples that belong to the $m$-th component is

    $$n_m = \sum_{i=1}^{t} o_m^{(i)}$$ (13)

    where the $o_m^{(i)}$ are defined in the previous section. The mixing weights are constrained to sum up to one. Take this into account by introducing the Lagrange multiplier $\lambda$. With $\mathcal{L} = \prod_{m=1}^{M} \hat{\pi}_m^{n_m}$ denoting the likelihood of the multinomial distribution, the maximum likelihood (ML) estimate follows from

    $$\frac{\partial}{\partial \hat{\pi}_m} \left( \log \mathcal{L} + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0$$ (14)

    After getting rid of $\lambda$:

    $$\hat{\pi}_m^{(t)} = \frac{n_m}{t} = \frac{1}{t} \sum_{i=1}^{t} o_m^{(i)}$$ (15)

    The estimate from $t$ samples is denoted as $\hat{\pi}_m^{(t)}$, and it can be rewritten in recursive form as a function of the estimate $\hat{\pi}_m^{(t-1)}$ for $t-1$ samples and the ownership $o_m^{(t)}$ of the last sample:

    $$\hat{\pi}_m^{(t)} = \hat{\pi}_m^{(t-1)} + \frac{1}{t} \left( o_m^{(t)} - \hat{\pi}_m^{(t-1)} \right)$$ (16)

    If the influence of the new samples is fixed by setting $1/t$ to $\alpha = 1/T$, the update equation (5) is obtained. This fixed influence of the new samples means that the estimate relies more on the new samples, and the contribution from the old samples is downweighted in an exponentially decaying manner, as mentioned before.

    Prior knowledge for the multinomial distribution can be introduced by using its conjugate prior, the Dirichlet prior

    $$\mathcal{P} = \prod_{m=1}^{M} \hat{\pi}_m^{c_m}$$ (17)

    The coefficients $c_m$ have a meaningful interpretation. For the multinomial distribution, $c_m$ represents the prior evidence for the class $m$, that is, the number of samples that belong to that class a priori. Negative prior evidence means that the class $m$ is accepted as existing only if there is enough evidence from the data for the existence of this class. This type of prior is also related to the minimum message length criterion that is used for selecting proper models for given data. The MAP solution that includes the mentioned prior follows from

    $$\frac{\partial}{\partial \hat{\pi}_m} \left( \log \mathcal{L} + \log \mathcal{P} + \lambda \left( \sum_{m=1}^{M} \hat{\pi}_m - 1 \right) \right) = 0$$ (18)

    where

    $$\mathcal{P} = \prod_{m=1}^{M} \hat{\pi}_m^{-c}$$ (19)

    This gives

    $$\hat{\pi}_m^{(t)} = \frac{1}{K} \left( \sum_{i=1}^{t} o_m^{(i)} - c \right)$$ (20)

    where $K = \sum_{m=1}^{M} \left( \sum_{i=1}^{t} o_m^{(i)} - c \right) = t - Mc$. Rewrite as

    $$\hat{\pi}_m^{(t)} = \frac{\hat{\Pi}_m - c/t}{1 - Mc/t}$$ (21)

    where $\hat{\Pi}_m = \frac{1}{t} \sum_{i=1}^{t} o_m^{(i)}$ is the ML estimate and the bias from the prior is introduced through $c/t$. The bias decreases for larger data sets. However, if a small bias is acceptable, it can be kept constant by fixing $c/t$ to $c_T = c/T$ with some large $T$. This means that the bias will always be the same as it would have been for a data set with $T$ samples. It is easy to show that the recursive version with fixed $c/t = c_T$ is given by

    $$\hat{\pi}_m^{(t)} = \hat{\pi}_m^{(t-1)} + \frac{1}{t} \left( \frac{o_m^{(t)}}{1 - M c_T} - \hat{\pi}_m^{(t-1)} \right) - \frac{1}{t} \, \frac{c_T}{1 - M c_T}$$ (22)

    Since usually only a few components $M$ are expected and $c_T$ is small, assume $1 - M c_T \approx 1$. As mentioned, set $1/t$ to $\alpha$ to get the final modified adaptive update equation

    $$\hat{\pi}_m \leftarrow \hat{\pi}_m + \alpha \left( o_m^{(t)} - \hat{\pi}_m \right) - \alpha c_T$$ (23)

    This equation is used instead of equation (5). After each update, the $\hat{\pi}_m$ need to be normalized so that they add up to one. Start with a GMM with one component centered on the first sample; new components are added as described in the previous section. The Dirichlet prior with negative weights will suppress the components that are not supported by the data, and the component $m$ is discarded when its weight $\hat{\pi}_m$ becomes negative. This also ensures that the mixing weights stay non-negative. For a chosen $\alpha = 1/T$, one could require that at least $c = 0.01 \cdot T$ samples support a component, which gives $c_T = 0.01$.
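To make the recursive updates concrete, the following is a minimal single-pixel sketch of the scheme in equations (5)-(12) and (23). It is an illustration under stated assumptions rather than the authors' code: the class name `PixelGMM`, the initial variance `sigma0_sq`, the variance floor `min_var`, the component limit, and the "close" test of three standard deviations (`close_sq = 9`) are all hypothetical choices that the text above does not specify.

```python
import math

import numpy as np


class PixelGMM:
    """Adaptive GMM for one pixel (sketch of Eqs. (5)-(12) and (23))."""

    def __init__(self, dim=3, alpha=0.001, c_T=0.01, c_f=0.1,
                 sigma0_sq=225.0, min_var=4.0, max_components=4, close_sq=9.0):
        self.alpha, self.c_T, self.c_f = alpha, c_T, c_f
        self.sigma0_sq, self.min_var = sigma0_sq, min_var          # assumed values
        self.max_components, self.close_sq = max_components, close_sq
        self.w = np.zeros(0)            # mixing weights, kept in descending order
        self.mu = np.zeros((0, dim))    # component means
        self.var = np.zeros(0)          # isotropic variances sigma_m^2

    def update(self, x):
        """Absorb sample x; return True if x is explained by the background."""
        x = np.asarray(x, dtype=float)
        # Eq. (12): B = smallest number of components whose weights exceed 1 - c_f.
        total, B = 0.0, 0
        for w in self.w:
            total, B = total + w, B + 1
            if total > 1.0 - self.c_f:
                break
        # Eq. (8): squared distance of x from every component.
        delta = x - self.mu
        d2 = (delta ** 2).sum(axis=1) / self.var
        close = np.flatnonzero(d2 < self.close_sq)
        # Weights are sorted, so the first close component has the largest weight.
        owner = int(close[0]) if close.size else None
        is_background = owner is not None and owner < B
        # Eq. (23): weight update including the Dirichlet prior term alpha * c_T.
        o = np.zeros_like(self.w)
        if owner is not None:
            o[owner] = 1.0
        self.w = self.w + self.alpha * (o - self.w) - self.alpha * self.c_T
        if owner is not None:
            k = self.alpha / self.w[owner]
            self.mu[owner] += k * delta[owner]                                       # Eq. (6)
            self.var[owner] += k * (delta[owner] @ delta[owner] - self.var[owner])   # Eq. (7)
            self.var[owner] = max(self.var[owner], self.min_var)  # floor (assumption)
        else:
            if self.w.size >= self.max_components:
                # Drop the last component, which has the smallest weight.
                self.w, self.mu, self.var = self.w[:-1], self.mu[:-1], self.var[:-1]
            # Eqs. (9)-(10): start a new component at the sample.
            self.w = np.append(self.w, self.alpha)
            self.mu = np.vstack([self.mu, x])
            self.var = np.append(self.var, self.sigma0_sq)
        # Discard components whose weight is no longer positive, re-sort, normalize.
        keep = self.w > 0
        self.w, self.mu, self.var = self.w[keep], self.mu[keep], self.var[keep]
        order = np.argsort(-self.w, kind="stable")
        self.w, self.mu, self.var = self.w[order], self.mu[order], self.var[order]
        self.w = self.w / self.w.sum()
        return is_background


def frames_until_background(c_f, alpha):
    """Approximate number of frames a static object needs to join the background."""
    return math.log(1.0 - c_f) / math.log(1.0 - alpha)


gmm = PixelGMM(alpha=0.05)
for _ in range(200):
    gmm.update([100, 100, 100])          # learn a steady background colour
seen_background = gmm.update([100, 100, 100])
seen_foreground = gmm.update([250, 0, 0])
```

In a full system one such model would be kept per pixel. With $c_f = 0.1$ and $\alpha = 0.001$, `frames_until_background` evaluates to roughly 105 frames, which illustrates how slowly a static object is absorbed at a small learning rate.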

      1. Foreground detection

        Foreground detection is highly dependent on the background model. It compares each video frame with the background frame and identifies the foreground object in the frame.

      2. Data Validation

    Finally, this step eliminates any isolated pixels that are not connected to a foreground region. It improves the foreground mask using information obtained from outside the background model.

    Most background models have three main shortcomings: they ignore any correlation between neighboring pixels; their rate of adaptation may not match the moving speed of the foreground object; and non-stationary pixels, from moving leaves or shadows cast by moving objects, are at times mistaken for true foreground objects.
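The foreground detection and data validation steps can be illustrated together. The sketch below thresholds the Euclidean distance between a frame and a background image in RGB space, then drops foreground pixels that have no foreground neighbour. The function names, the threshold of 30 and the 8-neighbour rule are assumptions made for this example; the paper does not give these details.

```python
import numpy as np


def foreground_mask(frame, background, threshold=30.0):
    """Mark pixels whose RGB Euclidean distance to the background exceeds threshold."""
    diff = frame.astype(np.float64) - background.astype(np.float64)
    return np.sqrt((diff ** 2).sum(axis=-1)) > threshold


def remove_isolated_pixels(mask, min_neighbours=1):
    """Drop foreground pixels with fewer than min_neighbours foreground 8-neighbours."""
    h, w = mask.shape
    padded = np.pad(mask.astype(np.int32), 1)  # zero border so edges have 8 neighbours
    neighbours = np.zeros((h, w), dtype=np.int32)
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            if dy or dx:  # skip the centre pixel itself
                neighbours += padded[1 + dy:1 + dy + h, 1 + dx:1 + dx + w]
    return mask & (neighbours >= min_neighbours)


# Toy scene: a 2x2 object plus one isolated noisy pixel in the corner.
background = np.full((5, 5, 3), 100, dtype=np.uint8)
frame = background.copy()
frame[1:3, 1:3] = 200
frame[4, 4] = 200
raw_mask = foreground_mask(frame, background)
clean_mask = remove_isolated_pixels(raw_mask)
```

The neighbour test is one simple way of using the correlation between neighbouring pixels that per-pixel background models ignore; morphological opening or connected-component filtering would be common alternatives.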

  4. Proposed System

    In background subtraction, objects are tracked through one of the available background models. Many background subtraction methods exist, but many of them fail to give correct results for indoor sequences because of illumination changes. To overcome this, the adaptive Gaussian Mixture Model was proposed, which adapts frequently to the changes occurring in each scene. Even with this improved version, considerable distortion remains and false results are produced. To overcome this, the proposed system first applies a homomorphic filter, which can extract the intensity, lightness and brightness of each scene; this helps to nullify the illumination component and produces good results for each scene. Second, the adaptive Gaussian Mixture Model is applied to the homomorphic filter output to obtain the foreground object.
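The homomorphic filtering stage can be sketched as follows. This is a generic textbook-style implementation, not necessarily the filter the authors used: the Gaussian high-emphasis transfer function and the parameters `gamma_low`, `gamma_high`, `cutoff` and `c` are assumptions, and the function operates on a single channel, so applying it to a color image (per channel or on an intensity channel) is left as a further assumption.

```python
import numpy as np


def homomorphic_filter(channel, gamma_low=0.5, gamma_high=2.0, cutoff=30.0, c=1.0):
    """Suppress slowly varying illumination in one image channel.

    All parameter values are hypothetical defaults chosen for illustration.
    """
    img = channel.astype(np.float64)
    # The logarithm turns illumination * reflectance into a sum; log1p avoids log(0).
    log_img = np.log1p(img)
    # Centre the spectrum so the zero frequency sits at (rows // 2, cols // 2).
    spectrum = np.fft.fftshift(np.fft.fft2(log_img))
    rows, cols = img.shape
    u = np.arange(rows) - rows // 2
    v = np.arange(cols) - cols // 2
    d2 = u[:, None] ** 2 + v[None, :] ** 2  # squared distance from the zero frequency
    # High-emphasis filter: gain gamma_low at low frequencies, rising to gamma_high.
    h = (gamma_high - gamma_low) * (1.0 - np.exp(-c * d2 / cutoff ** 2)) + gamma_low
    filtered = np.fft.ifft2(np.fft.ifftshift(spectrum * h)).real
    # Undo the logarithm.
    return np.expm1(filtered)


# A perfectly flat image contains only the zero-frequency term,
# so in the log domain it is simply scaled by gamma_low.
flat = np.full((8, 8), 99.0)
flat_out = homomorphic_filter(flat)
```

The output is not confined to the original intensity range, so in practice it would be rescaled (for example to 0-255) before being passed to the background model.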

  5. Experiments

    The homomorphic filter enhances the original color image by filtering out excess lightness or darkness in the image. The filtered image is transformed back to RGB, but the resulting RGB image has identical matrices for the red, green and blue planes, so the image displays as shades of gray.

      1. Background Subtraction

        Fig 1: Foreground segmentation results

        In figure 1(a), the input frame is 320*256 and shows several people walking in a mall, with some lighting changes.

        In figure 1(b), the input frame is 160*128 and shows one person teaching in a classroom, with strong light focused on him.

        In figure 1(c), the input frame is 160*128 and shows one person moving from dark towards light, so the frame is highly illuminated.

        In figure 1(d), the input frame is 160*130 and shows several people moving in an elevator, with some lighting changes.

        In figure 1(e), the input frame is 176*144 and shows several people moving in a mall, with strong light focused on them.

        The frames described above, each containing some lighting changes, are taken as input for background subtraction. The homomorphic filter is applied to eliminate the lighting changes, and the enhanced frames are then given as input for background subtraction.

        Image: Sample images to perform background subtraction.

        Ground truth: Estimated ground truth for the given images.

        GMM: Foreground results for the Gaussian Mixture Model.

        Improved GMM: Foreground results for the extended GMM.

        Proposed: Foreground results for the proposed approach.

        The proposed method was evaluated and compared with GMM and Improved GMM using images with different types of illumination. The experimental results show that the proposed method can achieve promising performance under illumination changes in background subtraction and foreground object extraction. The proposed system detects and tracks the moving objects. The background scene is modelled using a set of consecutive frames. The object pixels are segmented from the background, followed by a morphological process that eliminates noisy pixels to produce better results.

      2. Performance Evaluation

        Three criteria are used for performance evaluation:

        • False Positive Ratio

        • False Negative Ratio

        • True Positive Ratio

          Table 1: Resulting values for performance evaluation. In the above table the results are tabulated for the performance of the resulting images, where FP is the number of false positives, FN is the number of false negatives and TP is the number of true positives.

          Recall is defined by:

          $$\text{Recall} = \frac{TP}{TP + FN}$$ (24)

    Table 2: Tabulated results for precision and recall

    In the above table the results are tabulated for recall and precision. Recall gives the percentage of detected true positives as compared to the total number of true positives in the ground truth, while precision gives the percentage of detected true positives as compared to the total number of items detected.
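The definitions above translate directly into code. The sketch below counts TP, FP and FN from a predicted foreground mask and a ground-truth mask and derives recall and precision from them; the function name `detection_scores` and the tiny example masks are made up for illustration and are not the paper's data.

```python
import numpy as np


def detection_scores(predicted, ground_truth):
    """Return pixel-level TP, FP, FN, recall and precision for two binary masks."""
    predicted = np.asarray(predicted, dtype=bool)
    ground_truth = np.asarray(ground_truth, dtype=bool)
    tp = int(np.sum(predicted & ground_truth))    # foreground found and correct
    fp = int(np.sum(predicted & ~ground_truth))   # background reported as foreground
    fn = int(np.sum(~predicted & ground_truth))   # foreground that was missed
    # Guard against empty masks to avoid division by zero.
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    return {"TP": tp, "FP": fp, "FN": fn, "recall": recall, "precision": precision}


# Hypothetical 2x3 masks: 1 marks a foreground pixel.
ground_truth = np.array([[1, 1, 0],
                         [0, 1, 0]])
predicted = np.array([[1, 0, 0],
                      [0, 1, 1]])
scores = detection_scores(predicted, ground_truth)
```

In this toy case two of the three true foreground pixels are found and two of the three reported pixels are correct, so recall and precision are both 2/3.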

    Fig 2: Graph for Recall

    The above graph plots the resulting recall values: series 1 is GMM, series 2 is IGMM (Improved GMM) and series 3 is the proposed method.

    Fig 3: Graph for Precision

    The above graph plots the resulting precision values: series 1 is GMM, series 2 is IGMM (Improved GMM) and series 3 is the proposed method.

  6. Conclusion

An approach capable of detecting motion and extracting object information, with humans as the objects, has been described. The strength of the proposed technique is its robustness against illumination changes in indoor sequences. The main contribution to successful detection is that the effect of illumination is nullified first, and the enhanced images are then used to extract the foreground object through the proposed background model.

In future, the outdoor environment can also be considered, nullifying the effect of distortion in the scene using background subtraction. Many more features will need to be added to the proposed system for outdoor environments, which are more complicated than indoor ones, so the best way of handling outdoor scenes still has to be found.

References

[1] M. Piccardi, "Background subtraction techniques: A review," in Proc. IEEE Int. Conf. Syst., Man Cybern., Oct. 2004, pp. 3099-3104.

[2] R. J. Radke, S. Andra, O. Al-Kofahi, and B. Roysam, "Image change detection algorithms: A systematic survey," IEEE Trans. Image Process., vol. 14, no. 3, pp. 294-307, Mar. 2005.

[3] Y. Benezeth, P. M. Jodoin, B. Emile, H. Laurent, and C. Rosenberger, "Review and evaluation of commonly-implemented background subtraction algorithms," in Proc. IEEE Int. Conf. Patt. Recog., Dec. 2008, pp. 1-4.

[4] B. Rinner and W. Wolf, "An introduction to distributed smart cameras," Proc. IEEE, vol. 96, no. 10, pp. 1565-1575, Oct. 2008.

[5] A. N. Belbachir, Smart Cameras. New York: Springer, 2009.

[6] R. C. Gonzalez and R. E. Woods, Digital Image Processing. Upper Saddle River, NJ: Prentice Hall, 2002.

[7] S. E. Umbaugh, Computer Imaging: Digital Image Analysis and Processing. Florida: CRC Press, 2005.

[8] C. Stauffer and W. E. L. Grimson, "Adaptive background mixture models for real-time tracking," in IEEE Computer Society Conference on Computer Vision and Pattern Recognition, 1999, vol. 2, p. 252. URL: http://doi.ieeecomputersociety.org/10.1109/CVPR.1999.784637

[9] A. M. Elgammal, D. Harwood, and L. S. Davis, "Non-Parametric Model for Background Subtraction," in ECCV, 2000, pp. 751-767.

[10] P. Noriega and O. Bernier, "Real Time Illumination Invariant Background Subtraction Using Local Kernel Histograms," in BMVC, 2006, vol. 3, pp. 979-988.

[11] D. J. Ketcham, R. Lowe, and W. Weber, "Seminar on image processing," in Real-Time Enhancement Techniques, Hughes Aircraft, 1976, pp. 1-6.

[12] R. Hummel, "Image enhancement by histogram transformation," Comp. Graph. Image Process., vol. 6, pp. 184-195, 1977.

[13] V. T. Tom and G. J. Wolfe, "Adaptive histogram equalization and its applications," SPIE Applicat. Dig. Image Process. IV, vol. 359, pp. 204-209, 1982.

[14] S. M. Pizer, E. P. Amburn, J. D. Austin, R. Cromartie, A. Geselowitz, T. Greer, B. H. Romeny, J. B. Zimmerman, and K. Zuiderveld, "Adaptive histogram equalization and its variations," Comp. Vis. Graph. Image Process., vol. 39, no. 3, pp. 355-368, 1987.

[15] J. Russ, The Image Processing Handbook, 4th ed. CRC Press, 2002.

[17] Z. Zivkovic, "Improved Adaptive Gaussian Mixture Model for Background Subtraction," in Proc. ICPR, 2004.
