- Open Access
- Authors : Vidya R, Rajitha Nair
- Paper ID : IJERTCONV3IS19096
- Volume & Issue : ICESMART – 2015 (Volume 3 – Issue 19)
- Published (First Online): 24-04-2018
- ISSN (Online) : 2278-0181
- Publisher Name : IJERT
- License: This work is licensed under a Creative Commons Attribution 4.0 International License
Disclosure and Sniffout of Various Moving Entity in Real World
T.John Institute of Technology, Bangalore, India.
Asst. Prof., Dept. of CSE, T.John Institute of Technology
Abstract: Object detection and tracking is an important task within the field of computer vision due to its promising applications in many areas such as video surveillance, traffic monitoring, vehicle navigation, robotics, 3D reconstruction, and content-based indexing and retrieval. Motion detection consists of separating the foreground and background objects in the scene, and tracking is the process of following image objects in their movement through an image sequence. Many methods for object detection and tracking have been proposed, each having its own strengths and weaknesses. In the case of motion detection without using any models, the most popular region-based approaches are background subtraction and optical flow. Background subtraction detects moving objects by subtracting estimated background models from images. This method is sensitive to illumination changes and small movements in the background. Optical flow also has a problem caused by illumination changes, since its approximate constraint equation basically ignores temporal illumination changes.
The proposed system is robust in various environments, including indoor and outdoor scenes and different types of background scenes. It is robust because it uses edge-based features together with clustering, which makes it insensitive to illumination changes. The method is also fast because the area covered by edge-based features is smaller than that covered by region-based features, so it is not computationally expensive.
The proposed algorithm for detection and tracking of multiple moving objects in real time, for both indoor and outdoor environments, was extensively tested on complex, real-world, non-plain and changing backgrounds. To evaluate the performance of the proposed tracking system quantitatively, the pixel-wise distance between the centroid of the tracking window and the manually obtained ground truth is compared; the system is found to achieve very high accuracy and precision compared with previous work.
Keywords: Detection, Tracking, K-means, Clustering
Motion detection is a well-studied problem in computer vision. There are two types of approaches: the region-based approach and the boundary-based approach. In the case of motion detection without using any models, the most popular region-based approaches are background subtraction and optical flow. Background subtraction detects moving objects by subtracting estimated background models from images. This method is sensitive to illumination changes and small movements in the background. Many techniques have been proposed to overcome this problem. The mixture of Gaussians is a popular and promising technique for estimating illumination changes and small movements in the background. However, a common problem of background subtraction is that it requires a long time to estimate the background models. It usually takes several seconds because illumination changes and small movements in the background are very slow. Optical flow also has a problem caused by illumination changes, since its approximate constraint equation basically ignores temporal illumination changes. Among boundary-based approaches, many use edge-based optical flow, level sets, and active contours. In detecting moving edges, the zero-crossings at each pixel are calculated from the convolution of the intensity history with the second-order temporal derivative of the Gaussian function. However, the results of this method are inaccurate when the image is not smoothed sufficiently, because it computes velocity without using spatial information from neighboring pixels.
Most of the existing edge-pixel-based approaches suffer from random noise. Pixel-by-pixel matching of edge segments is not suitable for matching or for tracking because of its high computational cost. Edges that are visible in the current frame might not be found in later frames. In the existing edge-pixel-based methods it is not possible to apply different amounts of transformation to different parts of the edge pixels at the same time. As a result, they cannot achieve accurate matching of all parts of the object model in subsequent frames. This creates problems in object tracking and makes it difficult to deal with complex motion and shape changes.
Although object detection and tracking has been studied for decades, it remains an open research problem. A robust, rigorous and high-performance approach is still a great challenge today. The difficulty of this problem depends highly on how the object to be detected and tracked is defined. If only a few image appearance cues, such as a specific color, are used to represent an object, it is fairly easy to identify all pixels with the same color as the object. At the other extreme, the face of a specific person, which is full of perceptual details and interfering information such as different poses and illumination, is very hard to detect, recognize and track accurately. Most challenges arise from the image variability of video, because video objects are generally moving objects. As an object moves through the field of view of a camera, the images of the object may change dramatically. This variability comes from three principal sources: variation in target pose or target deformations, variation in illumination, and partial or full occlusion of the target.
Understanding the activities of objects in any environment through video is both a challenging scientific problem and a very fertile domain with many promising applications. It therefore draws the attention of many researchers, institutions and commercial companies. The motivation for studying this problem is to create a visual surveillance system with real-time moving object detection and tracking capabilities.
The proposed method uses an algorithm with three main stages: generation of reference edge lists, detection of moving objects, and updating of the edge lists. The system maintains two background reference edge lists. The first is the initial reference edge list, obtained by accumulating the training set of background edge images; the other is the temporary reference edge list, which is constructed by the conditional addition of moving edge segments. The initial reference edge list is static and requires no update, but the temporary reference edge list is updated at every frame. Edges are extracted from video frames using the Canny edge detector. To detect moving objects in the current frame, the input edge segments are extracted from the current frame to form the current edge list. Distance maps for the reference edge lists are obtained by a distance transformation for fast computation. Segments are clustered based on edge segment connectivity, with emphasis on relative edge distance and motion direction. Initially, clusters are grouped using a distance-based iterative k-means clustering. To cluster segments, each edge segment is enclosed within a rectangular boundary, and the midpoint of this boundary is used as the edge representative in the computation of the K centroids for the K clusters.
System architecture is the conceptual model that defines the structure, behavior, and more views of a system. An architecture description is a formal description and representation of a system, organized in a way that supports reasoning about the structures of the system.
The system maintains two background reference edge lists. The first is the initial reference edge list, obtained by accumulating the training set of background edge images; the other is the temporary reference edge list, which is constructed by the conditional addition of moving edge segments. The initial reference edge list is static and requires no update, but the temporary reference edge list is updated at every frame. Edge images are extracted from video frames using the Canny edge detector. During the edge extraction process, some of the prominent background edges might not be extracted under a particular illumination. To detect moving objects in the current image, the input edge segments are extracted from the current image to form the current edge list. Distance maps for the reference edge lists are obtained by a distance transformation for fast computation. Updating the reference lists allows moving objects to be detected in a dynamic environment. Figure 1 shows the system architecture for moving edge detection.
Figure 1: Architecture for Moving Edge Detection
The tracking problem is to reliably and accurately recognize the moving regions within frames that correspond to the same moving object over time. To do so, moving edge segments are first detected in the current frame. The methods are briefly described below:
Moving Edge Segment
The proposed method uses an algorithm with three main stages: generation of reference edge lists, detection of moving objects, and updating of the edge lists.
Generation of reference edge lists
The system maintains two background reference edge lists. The first is the initial reference edge list, obtained by accumulating the training set of background edge images; the other is the temporary reference edge list, which is constructed by the conditional addition of moving edge segments.
Detection of Moving Object
To detect moving objects in the current image, the input edge segments are extracted from the current image to form the current edge list. Distance maps for the reference edge lists are obtained by a distance transformation for fast computation. Each edge segment in the current edge list is then searched for a match in the reference edge lists using these distance maps.
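The matching step can be illustrated with a small sketch. It assumes a binary reference edge mask, a two-pass city-block distance transform, and a mean-distance test; the names `distance_map`, `is_moving` and the threshold `max_mean_dist` are hypothetical and are not taken from the paper.

```python
INF = 10**9  # stands in for "no reference edge reachable yet"


def distance_map(edge_mask):
    """Two-pass city-block distance transform of a binary edge mask.

    edge_mask[y][x] is truthy on reference edge pixels. The result holds,
    for every pixel, the L1 distance to the nearest reference edge pixel.
    """
    h, w = len(edge_mask), len(edge_mask[0])
    dist = [[0 if edge_mask[y][x] else INF for x in range(w)] for y in range(h)]
    # Forward pass: propagate distances from the top-left.
    for y in range(h):
        for x in range(w):
            if y > 0:
                dist[y][x] = min(dist[y][x], dist[y - 1][x] + 1)
            if x > 0:
                dist[y][x] = min(dist[y][x], dist[y][x - 1] + 1)
    # Backward pass: propagate distances from the bottom-right.
    for y in range(h - 1, -1, -1):
        for x in range(w - 1, -1, -1):
            if y < h - 1:
                dist[y][x] = min(dist[y][x], dist[y + 1][x] + 1)
            if x < w - 1:
                dist[y][x] = min(dist[y][x], dist[y][x + 1] + 1)
    return dist


def is_moving(segment, ref_dist, max_mean_dist=1.5):
    """Assumed rule: a segment (list of (x, y) pixels) is 'moving' when its
    mean distance to the nearest reference edge exceeds max_mean_dist."""
    mean = sum(ref_dist[y][x] for x, y in segment) / len(segment)
    return mean > max_mean_dist


# Reference background: a single vertical edge in column 1 of a 5x5 image.
mask = [[1 if x == 1 else 0 for x in range(5)] for _ in range(5)]
ref = distance_map(mask)
print(is_moving([(1, 0), (1, 1), (1, 2)], ref))  # lies on the background edge
print(is_moving([(4, 0), (4, 1), (4, 2)], ref))  # three pixels away from it
```

Because the distance map is computed once per reference list, each segment lookup costs only one table read per edge pixel, which is the reason a distance transformation speeds up matching.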
Updating edge lists
Updating the reference lists allows moving objects to be detected in a dynamic environment. If a detected moving edge segment is already registered in the moving edge list at the same position, its associated weight value is increased. Conversely, a moving edge segment already registered in the moving edge list loses weight if it is not redetected at the same position. In this process, segments whose weight value exceeds the threshold $T_M$ are moved from the moving edge list to the temporary reference edge list. An edge segment is dropped from the temporary reference edge list if its weight reaches zero.
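A minimal sketch of this weighting scheme follows. The text does not say how segments are identified, how large each weight step is, or when a temporary reference segment loses weight, so the sketch assumes position-based keys, unit steps, and decay of a temporary reference segment whenever it is absent from the current edge list. The function name `update_edge_lists` and its parameters are hypothetical.

```python
def update_edge_lists(detected, observed, moving, temp_ref, t_m=3):
    """Update the moving edge list and temporary reference edge list in place.

    detected : set of keys for segments classified as moving in this frame
    observed : set of keys for all segments present in the current edge list
    moving   : dict key -> weight (moving edge list)
    temp_ref : dict key -> weight (temporary reference edge list)
    t_m      : promotion threshold (T_M in the text)
    """
    # Assumed rule: temporary reference segments decay when not observed,
    # and are dropped once their weight reaches zero.
    for key in list(temp_ref):
        if key not in observed:
            temp_ref[key] -= 1
            if temp_ref[key] <= 0:
                del temp_ref[key]

    # Registered moving segments that were not redetected lose weight.
    # Removing them at zero is an assumption of this sketch.
    for key in list(moving):
        if key not in detected:
            moving[key] -= 1
            if moving[key] <= 0:
                del moving[key]

    # Redetected (or newly detected) moving segments gain weight.
    for key in detected:
        moving[key] = moving.get(key, 0) + 1

    # Segments whose weight exceeds T_M become temporary background.
    for key in [k for k, w in moving.items() if w > t_m]:
        temp_ref[key] = moving.pop(key)


moving, temp_ref = {}, {}
for _ in range(4):  # a parked object seen as "moving" for 4 frames
    update_edge_lists({"seg@(10,20)"}, {"seg@(10,20)"}, moving, temp_ref)
print(moving, temp_ref)
```

With `t_m=3`, a segment redetected in four consecutive frames is promoted to the temporary reference list, which is how a stopped object gradually becomes part of the background.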
Selection of Feature points
A common problem in computer vision is the registration of feature point sets, especially when a cluster of point samples from one frame of an object is matched with another cluster in another frame. The task of registration is to place the data into an optimum location by estimating the transformation of the object between the two frames. The parameter vector of the object's motion is estimated by minimizing the sum of squared differences between the reference feature points in the reference frame and the observed feature points in the tracking sequence frame.
This becomes challenging because correspondences between the point sets are not known beforehand. The selection of invariant features that can be reliably computed is a key component of such registration approaches. Curvatures are local features that are invariant to all affine transformations. Past research on range data has shown that surfaces may be classified by observing the signs of their mean and Gaussian curvatures. Thus, if curvatures could be reliably and consistently calculated, they would be ideal for feature-based registration. The computation of the curvature of a surface requires the estimation of the second derivative of the surface. For a curve given parametrically as c(u) = (x(u), y(u)), the curvature is
$$\kappa(u) = \frac{x'(u)\,y''(u) - y'(u)\,x''(u)}{\left(x'(u)^2 + y'(u)^2\right)^{3/2}} \tag{1}$$
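As an illustration, the curvature of a sampled edge can be approximated with the standard parametric-curve curvature formula and central finite differences. This is a sketch under the assumption that the edge is available as an ordered list of (x, y) samples; the function name `curvature` is hypothetical, and the paper does not state which discretization it uses.

```python
import math


def curvature(points):
    """Discrete signed curvature at the interior samples of an ordered curve.

    Uses kappa = (x' y'' - y' x'') / (x'^2 + y'^2)^(3/2), with first and
    second derivatives estimated by central differences in the sample index.
    """
    kappas = []
    for i in range(1, len(points) - 1):
        (x0, y0), (x1, y1), (x2, y2) = points[i - 1], points[i], points[i + 1]
        dx, dy = (x2 - x0) / 2.0, (y2 - y0) / 2.0          # first derivatives
        ddx, ddy = x2 - 2 * x1 + x0, y2 - 2 * y1 + y0      # second derivatives
        denom = (dx * dx + dy * dy) ** 1.5
        kappas.append((dx * ddy - dy * ddx) / denom if denom else 0.0)
    return kappas


# A circle of radius 5 should give a curvature close to 1/5 everywhere.
circle = [(5 * math.cos(math.radians(a)), 5 * math.sin(math.radians(a)))
          for a in range(360)]
print(curvature(circle)[:3])
```

Points where the absolute curvature is locally large (corners) are the natural candidates for the curvature feature points used in the matching steps.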
Determining edge Transformation parameters
Affine transformations are well known in computer vision for recognizing objects. It has been shown that a 2-D affine transformation is equivalent to a 3-D rigid motion of the object followed by orthographic projection and scaling. Given the point correspondences between the two views, the affine transformation that relates them can be computed by solving a system of linear equations using a least-squares approach. To compute the edge segment orientation, assume that each edge is represented by a list of curvature points $(k_1, k_2, \ldots, k_n)$. If a point $k = (x, y)$ moves to the point $k' = (x', y')$, then the coordinates of $k'$ can be expressed in terms of the coordinates of $k$ through an affine transformation, as follows:
$$k' = A\,k + b \tag{2}$$
An affine transformation that handles translation, rotation, scale and shear in 2-D space can be described by 6 parameters: the four entries of the $2 \times 2$ matrix $A$ and the two components of the translation vector $b$. Rewriting Equation (2) in terms of the image point coordinates, we have:
$$x' = a_{11}\,x + a_{12}\,y + b_1, \qquad y' = a_{21}\,x + a_{22}\,y + b_2$$
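The least-squares estimation of the six affine parameters can be sketched as follows. The sketch assumes that point correspondences are already known and solves the normal equations separately for the x' and y' rows; the parameter names `a11, a12, b1, a21, a22, b2` and the helper functions are hypothetical, chosen only for this illustration.

```python
def solve3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda r: abs(m[r][col]))
        if abs(m[pivot][col]) < 1e-12:
            raise ValueError("degenerate point configuration (collinear points?)")
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, 3):
            f = m[r][col] / m[col][col]
            for c in range(col, 4):
                m[r][c] -= f * m[col][c]
    x = [0.0, 0.0, 0.0]
    for r in range(2, -1, -1):
        s = m[r][3] - sum(m[r][c] * x[c] for c in range(r + 1, 3))
        x[r] = s / m[r][r]
    return x


def fit_affine(src, dst):
    """Least-squares fit of x' = a11*x + a12*y + b1 and y' = a21*x + a22*y + b2.

    src, dst: equally long lists of (x, y) points, at least 3 non-collinear.
    Returns ((a11, a12, b1), (a21, a22, b2)).
    """
    rows = [(x, y, 1.0) for x, y in src]
    # Normal equations: (M^T M) p = M^T t, with M built from rows [x, y, 1].
    mtm = [[sum(r[i] * r[j] for r in rows) for j in range(3)] for i in range(3)]
    mtx = [sum(r[i] * d[0] for r, d in zip(rows, dst)) for i in range(3)]
    mty = [sum(r[i] * d[1] for r, d in zip(rows, dst)) for i in range(3)]
    return tuple(solve3(mtm, mtx)), tuple(solve3(mtm, mty))


src = [(0, 0), (1, 0), (0, 1), (1, 1)]
dst = [(2 * x + 0.5 * y + 3, -x + 1.5 * y - 2) for x, y in src]
print(fit_affine(src, dst))
```

With more than three correspondences the system is overdetermined, so mismatched or noisy curvature points are averaged out rather than fitted exactly.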
Prediction filter for edge segment tracking
The edge orientation parameters and the location of the tracked edge segment are obtained by determining the edge transformation parameters. Using these parameters as an observation of the edge position, we apply Kalman filtering to predict every edge segment's location in the next frame. Since a Kalman filter only needs information from the previous state, we can update the filter for each frame and predict for the next frame. The idea is to feed the parameters obtained by the least-squares method to the Kalman predictor and to take its output as the predicted edge location in the next frame. The advantage is that feature point mismatches in some frames, as well as noise effects, are effectively suppressed. The main reason that motivated us to use a Kalman predictor is that it is interesting to think of this problem as a learning problem; this has also been exploited in approaches that treat similar problems as learning problems. For every frame, a tracking list is maintained for every segment. If a corresponding segment match is not found, we keep the old segment to be tracked in future frames but lower its weight value. If this weight falls below some threshold, we delete the segment from the tracked edge list. This helps us to track segments even if they are missing due to noise or occlusion of the object. In the last stage, the measurement update is used to refine the already predicted parameters so that newly arriving edge segments (from the next frames) can be tracked more effectively. Let,
$s_i(t)$ = state vector for the edge segment $i$ from frame $I(t)$
The motion information estimated by the Kalman predictor helps to limit the search space, which increases accuracy as well as speed.
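The predict-then-update cycle can be sketched with a constant-velocity Kalman filter for a single coordinate. The state model, the noise values `q` and `r`, and the class name `AxisKalman` are assumptions made for this illustration; the paper's actual state vector and filter matrices are not recoverable from the text.

```python
class AxisKalman:
    """Constant-velocity Kalman filter for one coordinate.

    State is (position, velocity); transition F = [[1, 1], [0, 1]];
    measurement H = [1, 0]; process noise Q = q * I; measurement noise r.
    """

    def __init__(self, pos, q=1e-2, r=1.0):
        self.p, self.v = float(pos), 0.0
        self.P = [[1.0, 0.0], [0.0, 1.0]]  # state covariance
        self.q, self.r = q, r

    def predict(self):
        """Advance one frame and return the predicted position."""
        self.p += self.v
        P = self.P
        # P <- F P F^T + Q, written out for the 2x2 case.
        self.P = [[P[0][0] + P[0][1] + P[1][0] + P[1][1] + self.q,
                   P[0][1] + P[1][1]],
                  [P[1][0] + P[1][1],
                   P[1][1] + self.q]]
        return self.p

    def update(self, z):
        """Refine the prediction with the measured position z."""
        P = self.P
        s = P[0][0] + self.r                  # innovation variance
        k0, k1 = P[0][0] / s, P[1][0] / s     # Kalman gain
        y = z - self.p                        # innovation
        self.p += k0 * y
        self.v += k1 * y
        # P <- (I - K H) P
        self.P = [[(1 - k0) * P[0][0], (1 - k0) * P[0][1]],
                  [P[1][0] - k1 * P[0][0], P[1][1] - k1 * P[0][1]]]


# An edge point moving 2 pixels per frame along x.
kf = AxisKalman(0)
for t in range(1, 21):
    kf.predict()
    kf.update(2 * t)
print(kf.predict())  # predicted x position for frame 21
```

One such filter per coordinate (or per transformation parameter) gives the predicted location around which the search window for the next frame can be centred.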
Edge matching by prediction and curvature
Given a sequence of $N_T$ image frames $I_t$, $t = 1, 2, \ldots, N_T$, where $t$ is a time index, we estimate the motion of all curvature points from one edge list $EL(t-1)$ to another list $EL(t)$. Here $EL(t)$ is the list of all edge segments extracted from image frame $I_t$. Among the $n$ candidate curvature points, we select the one that is closest to the predicted corresponding curvature location.
Edge Segment Clustering
Segments are clustered based on edge segment connectivity, with emphasis on relative edge distance and motion direction. Initially, we group clusters using a distance-based iterative k-means clustering algorithm for the first two frames. To cluster segments, each edge segment is enclosed within a rectangular boundary, and the midpoint of this boundary is used as the edge representative in the computation of the K centroids for the K clusters. The algorithm aims to minimize the objective function in Equation (8), where $\lVert p_i - C_k \rVert^2$ is the distance between the midpoint $p_i$ of the $i$-th rectangular edge segment boundary and the $k$-th cluster center ($C_k$), and $N(t)$ is the total number of moving edge segments present in frame $t$.
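The clustering step can be sketched as plain Lloyd-style k-means on the bounding-box midpoints. The sketch assumes the usual k-means objective (sum of squared distances from each midpoint to its assigned cluster center) and caller-supplied initial centers; the function names and the initialization are hypothetical, and the motion-direction criterion mentioned in the text is not modelled here.

```python
def midpoint(segment):
    """Midpoint of the axis-aligned rectangle enclosing a segment's pixels."""
    xs = [x for x, _ in segment]
    ys = [y for _, y in segment]
    return ((min(xs) + max(xs)) / 2.0, (min(ys) + max(ys)) / 2.0)


def kmeans(points, centers, max_iter=100):
    """Iterative k-means. Returns (labels, centers).

    points  : list of (x, y) midpoints
    centers : list of K initial (x, y) centers (assumed to be supplied)
    """
    centers = list(centers)
    labels = []
    for _ in range(max_iter):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [min(range(len(centers)),
                      key=lambda k: (p[0] - centers[k][0]) ** 2
                                    + (p[1] - centers[k][1]) ** 2)
                  for p in points]
        # Update step: each center moves to the mean of its members.
        new_centers = []
        for k in range(len(centers)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                new_centers.append((sum(p[0] for p in members) / len(members),
                                    sum(p[1] for p in members) / len(members)))
            else:
                new_centers.append(centers[k])  # keep an empty cluster in place
        if new_centers == centers:
            break
        centers = new_centers
    return labels, centers


segments = [[(0, 0), (2, 2)], [(1, 0), (3, 2)],
            [(20, 20), (22, 22)], [(21, 20), (23, 22)]]
mids = [midpoint(s) for s in segments]
labels, centers = kmeans(mids, [mids[0], mids[2]])
print(labels, centers)
```

Representing each segment by a single midpoint keeps the cost of each iteration proportional to the number of segments rather than the number of edge pixels.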
Algorithm 2 : Moving object cluster tracking
EXPERIMENTAL ANALYSIS & RESULTS
Datasets of different sizes were used for experimentation. The proposed method has been tested with several video sequences, including both indoor and outdoor environments, and with several other sequences from the PETS database. All videos were of size 320×240 with changes in illumination, and the system was able to track almost all of the multiple moving objects.
The first feature is the area of each moving object, which is determined from the result of the morphological filtering process. It is based on the maximum and minimum coordinates of the moving object along the X-axis and Y-axis. These produce the bounding box, which is drawn by joining the corners given by the maximum and minimum row and column coordinates in the frame. A limit is then placed on the area of the bounding box: if the area is less than the threshold of 800 pixels, the region is not considered a moving object. The second feature is the centroid, the center of the moving object with reference to the bounding box, which is used for tracking. The center of the moving object is characterized by its rows (x-axis) and columns (y-axis). The third feature is the difference of the average RGB pixel values between the current image frame and the previous image frame, which is the key to identifying and differentiating each of the moving objects inside the image frame. Averaging over 5×5 RGB pixels yields better results than taking a single pixel value at the center of the bounding box. Hence, each moving object can be differentiated, which gives the detection and tracking of moving objects in different environments a better accuracy, which can be calculated by using equation (8).
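The three features can be sketched as below. The 800-pixel area threshold and the 5×5 averaging window come from the text; the function names, the dictionary layout, the representation of a frame as nested lists of (R, G, B) tuples, and the Euclidean centroid error are assumptions made for illustration.

```python
import math

MIN_AREA = 800  # bounding-box area threshold in pixels, as stated in the text


def object_features(pixels, min_area=MIN_AREA):
    """Bounding box, area and centroid of a foreground blob.

    pixels: list of (x, y) coordinates belonging to the blob.
    Returns None when the bounding-box area is below min_area.
    """
    xs = [x for x, _ in pixels]
    ys = [y for _, y in pixels]
    x0, x1, y0, y1 = min(xs), max(xs), min(ys), max(ys)
    area = (x1 - x0 + 1) * (y1 - y0 + 1)
    if area < min_area:
        return None  # too small to be treated as a moving object
    return {"box": (x0, y0, x1, y1), "area": area,
            "centroid": ((x0 + x1) / 2.0, (y0 + y1) / 2.0)}


def mean_rgb(frame, cx, cy, half=2):
    """Average (R, G, B) over a (2*half+1)^2 patch centred on (cx, cy),
    clipped at the frame border. half=2 gives the 5x5 window."""
    h, w = len(frame), len(frame[0])
    x0, x1 = max(0, int(cx) - half), min(w - 1, int(cx) + half)
    y0, y1 = max(0, int(cy) - half), min(h - 1, int(cy) + half)
    patch = [frame[y][x] for y in range(y0, y1 + 1) for x in range(x0, x1 + 1)]
    return tuple(sum(p[c] for p in patch) / len(patch) for c in range(3))


def centroid_error(centroid, ground_truth):
    """Pixel-wise Euclidean distance between a tracked centroid and the
    manually labelled ground-truth position."""
    return math.hypot(centroid[0] - ground_truth[0], centroid[1] - ground_truth[1])


blob = [(10, 10), (49, 39)]           # corners of a 40x30 region
feat = object_features(blob)
print(feat)
print(centroid_error(feat["centroid"], (32.5, 28.5)))
```

Comparing `mean_rgb` at the same centroid in consecutive frames gives the colour difference used to tell neighbouring objects apart.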
The proposed algorithm for detection and tracking of multiple moving objects in real time, for both indoor and outdoor environments, was extensively tested on complex, real-world, non-plain and changing backgrounds and was found to achieve an accuracy and precision of 99%.
CONCLUSION & FUTURE WORK
A new algorithm has been proposed that allows robust and accurate detection of multiple moving objects at a small cost in memory consumption and computational complexity. The strength of the proposed approach lies in its ability to track segments in successive frames. The shape and size of an edge segment change slowly from frame to frame. In some special cases, a segment might be hidden for one or more frames and then reappear. The proposed system successfully handles this edge instability problem by using weighted edge segments. Introducing a partial edge-segment-based matching approach into object tracking helps to build inter-frame associations between edge segments more accurately. The proposed prediction model and inter-frame association accumulate knowledge of the shape changes of the edges inside every cluster, which helps to provide accurate cluster locations in successive frames. Finally, while clustering edge segments, the proposed method merges clusters whose boundaries overlap and which have similar motion direction and magnitude.
The clustering algorithm which has been used in the proposed system is especially suitable for very large databases; the algorithm can be used to produce a large number of clusters; and the algorithm has very low time complexity and space complexity.
The proposed method has some limitations. The application will not detect objects that move suddenly at high speed or change their motion direction abruptly. The video frame rate considered is 24 frames/second. It cannot handle video in formats other than .avi.
The proposed method may fail to build correspondence between segments. One way to solve this problem is to choose the size of the search window based on the confidence of the prediction. Since the tracking algorithm cannot handle full occlusion of moving objects, future work could use a model to refine the clusters, extract important movements from the scene, and count the number of objects in the scene. Indexing of the objects can also be incorporated in future work.