Object detecting using PCA image reconstruction and Optical flow

DOI : 10.17577/IJERTV1IS5268


Sajjad Einy

M.Tech, Spatial Information Technology, JNTU University

Anoop M. Namboodiri

Assistant Professor, IIIT Hyderabad


This paper addresses the problem of tracking moving objects in video. Processing has two steps: PCA reconstruction, which classifies image regions, and optical flow-based tracking of feature points. The optical flow-based tracking algorithm predicts and restores the feature points of a region during real-time object tracking. The proposed system is computationally efficient both for learning dynamic objects and for tracking objects using shape information. Compared with other methods, the PCA-based method is a powerful algorithm for object detection, and the optical flow-based tracking algorithm can be used to reduce falsely tracked points and to remove motion tracking errors in a real-time tracking system. The proposed algorithm tracks a set of feature points; during tracking, features are restored inside the predicted region. One important contribution of this work is a restoration process for missing feature points, which occur at almost every frame in realistic, noisy environments.


Principal Component Analysis (PCA) is a popular technique for data compression and has been used successfully as an initial step in many computer vision tasks, including face recognition, object recognition and feature extraction. In this paper we use PCA image reconstruction to extract and classify objects. We present an object tracking system that detects pedestrians in gray-level images and includes a correction and restoration stage. The system works in two steps: first, feature extraction using PCA reconstruction; second, tracking of the objects using an optical flow-based tracking algorithm. When a new pattern needs to be classified, we compare the reconstructions made with the sets of principal components (PCs) obtained from the training sets. To improve the performance of the classifier, we can use the training set of the PCA classifier together with the feature points to correct and restore the image information. Additionally, we show that the performance of the system can be improved by combining the classifier based on PCA reconstruction with feature points using a Support Vector Machine.

By definition, PCA looks for the set of PCs that best describes the distribution of the data being analyzed. These PCs therefore best preserve the information in the images on which PCA was performed, or in images similar to them. Thus, a set of PCs obtained only from pedestrian images must reconstruct images of other pedestrians better than any other type of image; conversely, a set of PCs obtained from images of anything except pedestrians will reconstruct pedestrian images less well [1].

Fig. 1 shows the proposed object tracking algorithm. After the image is classified using PCA and the objects are extracted, feature points are used to correct and update the tracked features. After detecting moving objects from the background, we extract a set of feature points inside the object and predict the corresponding feature points in the next frame. We keep checking for and restoring missing feature points during the tracking process. If more than 60% of the feature points are restored, we decide that the set of feature points is not suitable for tracking and define a new set of points [7].



Fig. 1. Proposed algorithm (flowchart blocks: object detection, feature extraction, feature tracker, feature correction and prediction, update).

  1. Image reconstruction with PCA:

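    The formulation of standard PCA is as follows. Consider a set of $m$ images, each of size $r \times c$. Each image $I_i$ is represented by a column vector of length $rc$. The mean object of the set is defined by

    $$\mu = \frac{1}{m} \sum_{i=1}^{m} I_i$$

    C, the covariance matrix, is given by

    $$C = \frac{1}{m} \sum_{i=1}^{m} (I_i - \mu)(I_i - \mu)^T$$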

    The principal components are then the eigenvectors of C. These eigenvectors can be computed in several ways. Perhaps the easiest is to solve the generalized eigenvector problem using the QZ algorithm or one of its variants [1,2]. It is also common to formulate the problem as that of finding the basis vectors that minimize the reconstruction error and then to solve it using standard least-squares techniques. In our system we compute these eigenvectors using the implementation provided by Matlab, which is based on the QZ algorithm [1,2].
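The PCA step and the reconstruction of an image from a set of PCs can be sketched as follows. This is a minimal illustration, not the authors' Matlab implementation: it assumes NumPy, it obtains the eigenvectors of the covariance matrix through a singular value decomposition rather than the QZ algorithm, and the function names `fit_pca` and `reconstruct` are hypothetical.

```python
import numpy as np

def fit_pca(images, k):
    """Return (P, mu): a (r*c) x k projection matrix and the mean vector.

    images: array of shape (m, r, c); each image is flattened to length r*c.
    Assumption: SVD is used here instead of the QZ algorithm named in the text.
    """
    m = images.shape[0]
    X = images.reshape(m, -1).astype(float)   # one image vector per row
    mu = X.mean(axis=0)                       # mean object of the set
    A = X - mu
    # The right singular vectors of the centered data are the eigenvectors
    # of the covariance matrix, ordered by decreasing variance.
    _, _, Vt = np.linalg.svd(A, full_matrices=False)
    P = Vt[:k].T
    return P, mu

def reconstruct(image, P, mu):
    """Project an image onto the PCs in P and map it back to image space."""
    v = image.reshape(-1).astype(float)
    coeffs = P.T @ (v - mu)
    return (mu + P @ coeffs).reshape(image.shape)
```

With fewer PCs than the rank of the training data the reconstruction is only approximate, and the size of that approximation error is what the classifier described next relies on.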

    Fig. 2. Image reconstruction with different sets of PCs

    1. Classification using reconstruction

      Based on this observation, we can create a classifier based on image reconstruction with PCA that decides whether or not an image belongs to the pedestrian class. The algorithm for this classification is the following:

      Before doing any classification:

      1. Perform PCA on the set of pedestrian gray-level images to obtain the projection matrix $P_{gp}$ and the mean $\mu_{gp}$.

      2. Perform PCA on the set of pedestrian edge images to obtain the projection matrix $P_{ep}$ and the mean $\mu_{ep}$.

      3. Perform PCA on the set of non-pedestrian gray-level images to obtain the projection matrix $P_{gn}$ and the mean $\mu_{gn}$.

      4. Perform PCA on the set of non-pedestrian edge images to obtain the projection matrix $P_{en}$ and the mean $\mu_{en}$ [1].
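Once the four PCA models (pedestrian and non-pedestrian, gray-level and edge) are available, a window could be classified by comparing reconstruction errors. The sketch below is an assumption about how such a comparison might look, since the text lists only the preparation steps: it assumes NumPy, takes the edge image as an input rather than computing it, sums the gray-level and edge errors for each class, and uses a hypothetical dictionary layout with keys `gp`, `ep`, `gn` and `en`.

```python
import numpy as np

def reconstruction_error(v, P, mu):
    """Euclidean distance between v and its reconstruction from the PCs in P."""
    centered = v - mu
    rec = P @ (P.T @ centered)
    return float(np.linalg.norm(centered - rec))

def is_pedestrian(gray, edge, models):
    """Classify a window from its gray-level and edge images.

    models maps 'gp', 'ep', 'gn', 'en' to (P, mu) pairs (hypothetical layout).
    Assumed decision rule: the class whose PCs reconstruct the window with
    the smaller total error wins.
    """
    g = gray.reshape(-1).astype(float)
    e = edge.reshape(-1).astype(float)
    err_ped = (reconstruction_error(g, *models["gp"])
               + reconstruction_error(e, *models["ep"]))
    err_non = (reconstruction_error(g, *models["gn"])
               + reconstruction_error(e, *models["en"]))
    return err_ped < err_non
```

Summing the two errors with equal weight is only one possible choice; the errors could also be passed to a further classifier such as the Support Vector Machine mentioned earlier.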

    2. Feature extraction using clustered detections

      If a window is correctly identified as a pedestrian, it is very likely that there are no pedestrians directly above or below it, and any pedestrians beside it cannot overlap it too much. This heuristic allows us to eliminate nearby detections. For this purpose, we define a region around a detection and eliminate any other detection whose centroid lies inside this region. The size of the region was defined empirically as 1.4 times the detection height upwards and downwards from the centroid, and between 0.5 and 0.75 times the detection width towards each side of the centroid. When we have multiple detections we must choose only one to keep, but how do we make this decision? A reasonable choice is to keep the grouped detection composed of the most original detections. Nevertheless, we observed that the biggest detections were usually the correct ones, because arms, legs and heads are often confused with pedestrians. So, when deciding among a set of detections in the same region, we must consider both the number of times they were originally detected and the size of the detected regions. To achieve this, the detections that compose a grouped detection are weighted by their height, and the grouped detection with the greatest Preference, according to the following formula, is chosen [1].

      $$\text{Preference} = \text{Detections} \cdot \text{Weight}(\text{height})$$

      where Detections is the number of original detections that compose the grouped detection being evaluated, and Weight is a function that determines the value of each detection according to the height of the grouped detection. It is given by the formula

      $$\text{Weight}(\text{height}) = (\text{height} - 50)(\text{height} - 50)$$

      There are very few cases where this heuristic does not work, and it therefore eliminates many false detections in which the classifier confuses arms, legs, or some other object with a pedestrian [1].
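The selection among grouped detections can be illustrated with a few lines of Python. This is a sketch only: the Weight function is taken literally from the formula as printed, the height is assumed to be in pixels, and the dictionary keys `detections` and `height` are hypothetical names.

```python
def weight(height):
    # Weight(height) = (height - 50)(height - 50), as printed in the text.
    return (height - 50) * (height - 50)

def preference(num_detections, height):
    # Preference = Detections * Weight(height)
    return num_detections * weight(height)

def choose_grouped_detection(groups):
    """Return the grouped detection with the greatest Preference.

    groups: list of dicts with 'detections' (number of original detections)
    and 'height' (height of the grouped detection, assumed to be in pixels).
    """
    return max(groups, key=lambda g: preference(g["detections"], g["height"]))
```

For example, a group of 3 detections with height 60 has Preference 300, while a group of 2 detections with height 80 has Preference 1800, so the taller group is kept even though it has fewer original detections.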

      Fig. 3. ROC curves comparing the performance of our classifiers versus the best reported in the literature. The detection rate is plotted against the false detection rate, measured on a logarithmic scale.

      Fig. 4. Object classification algorithm

  2. Optical flow-based tracking of feature points

Optical flow is approximated by the displacement of features between two consecutive frames. This approximation works well for motion generated by hand-held cameras, but in other cases it can lead to serious errors.

    1. Motion-based object detection

      To detect an object against the background, we use optical flow as an initial cue. The fundamental condition of optical flow is that the intensity of a point on the object does not change over a sufficiently small duration. Let $I(x, y, t)$ represent the distribution of intensity in a continuous frame; the optical flow condition can then be defined as [8,9,7]

      $$\frac{dI(x, y, t)}{dt} = 0 \qquad (1)$$

      By applying the chain rule to (1), we have [2,1]

      $$\frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} = 0 \qquad (2)$$

      where $u = dx/dt$ and $v = dy/dt$.

      From (2), we can evaluate optical flow as

      $$\langle \nabla I, \mathbf{v} \rangle + I_t = 0 \qquad (3)$$

      where $\nabla I = [\partial I/\partial x, \partial I/\partial y]^T$, $\mathbf{v} = [u, v]^T$, $I_t = \partial I/\partial t$, and $\langle \cdot, \cdot \rangle$ denotes the vector inner product [13]. Optical flow that satisfies the constraint in (3) is prone to noise because derivatives are evaluated with a difference approximation between adjacent pixels. To reduce noise amplification, the Horn-Schunck and Lucas-Kanade methods are widely used in the literature [10,11,7]. The error from the optical flow constraint can be measured as

      $$E = \sum_{(x, y) \in R} \left( \frac{\partial I}{\partial x} u + \frac{\partial I}{\partial y} v + \frac{\partial I}{\partial t} \right)^2 \qquad (4)$$

      where R represents a neighboring region [8]. In minimizing (4), we ignore trivial motion vectors so as to reduce the error caused by noise [7]. To separate moving objects from noise in low-gradient conditions, we use a measure called the normalized difference, which is normalized by the total number of pixels in R. If the normalized difference is smaller than a pre-specified threshold, the corresponding region is considered to be noise. As a result, the region of moving objects can be extracted, and the corresponding region is then labeled based on motion direction [8].
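A least-squares estimate of a single motion vector over a region, in the spirit of the Lucas-Kanade method, can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: it assumes NumPy, treats the whole input array as the region R, approximates the spatial derivatives with `np.gradient` and the temporal derivative with a frame difference, and the function name `flow_in_region` is hypothetical.

```python
import numpy as np

def flow_in_region(frame0, frame1):
    """Estimate one motion vector (u, v) for a region and the residual error.

    Minimizes the sum over the region of (I_x u + I_y v + I_t)^2 by linear
    least squares. Both frames are 2-D arrays covering the same region.
    """
    f0 = frame0.astype(float)
    f1 = frame1.astype(float)
    Iy, Ix = np.gradient(f0)      # derivatives along rows (y) and columns (x)
    It = f1 - f0                  # temporal derivative as a frame difference
    A = np.stack([Ix.ravel(), Iy.ravel()], axis=1)
    b = -It.ravel()
    (u, v), *_ = np.linalg.lstsq(A, b, rcond=None)
    residual = A @ np.array([u, v]) - b
    error = float(np.sum(residual ** 2))
    return float(u), float(v), error
```

If the gradients in the region all point in the same direction, the system is rank deficient and `lstsq` returns the minimum-norm solution, which is one reason low-gradient regions need separate handling.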

    2. Feature point extraction

      After detecting an object against the background, we extract a set of feature points inside the object by using the Bouguet tracking algorithm [12,13]. Because of the nature of motion estimation, motion-based object detection algorithms usually extract a region slightly larger than the real size of the object, which results in the false extraction of feature points outside the object. Let the position of a feature point at frame $t$ be $[x_i^t, y_i^t]^T$, where $i$ is the index of feature points. These outside feature points are removed by considering the distance $d_i$ that each feature point moves between frames, where $t$ represents the index of frames and $i$ the index of feature points. Here $d_i$ represents the sum of distance between the $t$-th frame and the $(t+1)$-st frame with respect to the $i$-th feature point. In general, a feature point in the background (outside the object) moves much less than a feature point on the tracked object [7]. Fig. 1 presents the effect of the value of t in the proposed algorithm.
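The removal of feature points that fall outside the object can be illustrated as below. This is a sketch under assumptions that go beyond the text: it assumes NumPy, measures the movement of each point as the Euclidean distance between its positions in two consecutive frames, and removes points that move less than a user-chosen `min_distance`; the function and parameter names are hypothetical.

```python
import numpy as np

def remove_static_points(points_t, points_t1, min_distance):
    """Keep feature points that moved at least min_distance between frames.

    points_t, points_t1: arrays of shape (n, 2) holding the [x, y] positions
    of the same n feature points at frames t and t+1.
    Returns the kept positions at frame t+1 and the boolean keep mask.
    """
    p0 = np.asarray(points_t, dtype=float)
    p1 = np.asarray(points_t1, dtype=float)
    d = np.linalg.norm(p1 - p0, axis=1)   # assumed Euclidean distance per point
    keep = d >= min_distance
    return p1[keep], keep
```

Points on the background barely move between frames, so they fall below the threshold and are dropped, while points on the moving object are kept.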

    3. Feature point prediction and correction

Sometimes a tracking algorithm fails to track a proper feature point in the next frame. A feature point is defined as untracked when an error value within a small window exceeds a pre-defined threshold. Specifically, the threshold value is determined by the distance between average vectors predicted by a spatio-temporal prediction. After the spatio-temporal prediction, reinvestigation is performed. Then both tracked and untracked feature points are updated in a list. In many real-time, continuous video tracking applications, a feature-based tracking algorithm fails for the following reasons: (i) self or partial occlusion of an object and (ii) feature points on or outside the boundary of the object, which are affected by the changing background. To deal with tracking failure, we should correct the erroneously predicted feature points by using the locations of the previous feature points and the inter-pixel relationship between the predicted points. A temporal prediction is suitable for deformable objects, while a spatial prediction is good for non-deformable objects. Temporal and spatial prediction results can also be combined with proper weights. Although users can control the number of frames, K, a value of 7 was used for all test sequences. The larger the value of K, the better the performance of the algorithm. Because of the trade-off between processing time and accuracy, a value around 7 was found to be reasonable for temporal prediction [7].
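One way to combine a temporal and a spatial prediction with weights is sketched below. The exact prediction formulas are not given in the text, so everything here is an assumption made for illustration: the temporal prediction adds the mean motion of the point over the last K frames to its last known position, the spatial prediction adds the mean motion of the successfully tracked points in the current frame, and the names `predict_point` and `w_temporal` are hypothetical. NumPy is assumed.

```python
import numpy as np

def predict_point(history, tracked_prev, tracked_curr, K=7, w_temporal=0.5):
    """Predict the position of an untracked feature point in the current frame.

    history: (n_frames, 2) past positions of the point, oldest first.
    tracked_prev, tracked_curr: (n, 2) positions of the points that were
    tracked successfully, in the previous and current frames.
    """
    history = np.asarray(history, dtype=float)
    last = history[-1]
    # Temporal prediction: mean motion of this point over the last K frames.
    steps = np.diff(history[-(K + 1):], axis=0)
    temporal = last + (steps.mean(axis=0) if len(steps) else 0.0)
    # Spatial prediction: mean motion of the tracked points in this frame.
    motion = (np.asarray(tracked_curr, dtype=float)
              - np.asarray(tracked_prev, dtype=float))
    spatial = last + (motion.mean(axis=0) if len(motion) else 0.0)
    # Weighted combination of the two predictions.
    return w_temporal * temporal + (1.0 - w_temporal) * spatial
```

A weight close to 1 favors the temporal prediction, which the text recommends for deformable objects, and a weight close to 0 favors the spatial prediction, recommended for non-deformable objects.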

Fig. 5. (a) Feature model; (b) relocation; (c) result of feature reconstruction


This method is suitable for dynamic backgrounds. The experimental results show that the proposed method successfully tracked moving objects in most cases. Applying the PCA reconstruction algorithm to both intensity and edge images gave a better representation of the information related to the area of interest.


  1. Luis Malagón-Borja, Olac Fuentes, Object detection using image reconstruction with PCA, Image and Vision Computing (2007).

  2. Edgar Osuna, Robert Freund, Federico Girosi, Training support vector machines: an application to face detection, in: IEEE Conference on Computer Vision and Pattern Recognition, 1997, pp. 130.

  3. Constantine Papageorgiou, Object and pattern detection in video sequences, Master's thesis, M.I.T., Cambridge, MA, 1997.

  4. John C. Platt, Sequential minimal optimization: a fast algorithm for training support vector machines, in: Bernhard Scholkopf, Christopher J.C. Burges, Alex J. Smola (Eds.), Advances in Kernel Methods: Support Vector Learning, MIT Press, Cambridge, MA, USA, 1999, pp. 185-208.

  5. C.B. Moler, G.W. Stewart, An algorithm for generalized matrix eigenvalue problems, SIAM Journal on Numerical Analysis 10 (2) (1973) 241-256.

  6. Massimiliano Pontil, Alessandro Verri, Support vector machines for 3D object recognition, IEEE Transactions on Pattern Analysis and Machine Intelligence 20 (6) (1998) 637-646.

  7. Jeongho Shin, Sangjin Kim, Sangkyu Kang, Seong-Won Lee, Joonki Paik, Besma Abidi, Mongi Abidi, Optical flow-based real-time object tracking using non-prior training active feature model.

  8. Haritaoglu I, Harwood D, Davis L. W-4: Real-time surveillance of people and their activities. IEEE Transactions on Pattern Analysis and Machine Intelligence 2000;22(8):809-30.

  9. Amer A. Voting-based simultaneous tracking of multiple video objects. Proceedings of the SPIE Visual Communication Image Processing 2003;5022:500-11.

  10. Wren C, Azarbayejani A, Darrell T, Pentland A. Pfinder: Real-time tracking of the human body. IEEE Transactions on Pattern Analysis and Machine Intelligence 1997;19(7):780-5.

  11. Comaniciu D, Ramesh V, Meer P. Real-time tracking of nonrigid objects using mean shift. Proceedings of the IEEE International Conference on Computer Vision, Pattern Recognition 2000;2:142

  12. Comaniciu D, Ramesh V, Meer P. Kernel-based object tracking. IEEE Transactions on Pattern Analysis and Machine Intelligence 2003;25(4):114.

  13. Baumberg AM. Learning Deformable Models for Tracking Human Motion. PhD dissertation, School of Computer Studies, University of Leeds, UK, October 1995.
