Analysis the Performance of Three Step Search Algorithm for Motion Estimation

DOI : 10.17577/IJERTCONV2IS03011

Download Full-Text PDF Cite this Publication

Text Only Version

Analysis the Performance of Three Step Search Algorithm for Motion Estimation

Analysis the Performance of Three Step Search Algorithm for Motion Estimation

Rahul Bhandari

M. Tech. Scholar

Department of Computer Science and Engineering JIET, Jodhpur, Rajasthan, India

Ashutosh Vyas

Associate Professor

Department of Computer Science and Engineering JIET, Jodhpur, Rajasthan, India

Abstract- Motion estimation is a key technique in most algorithms for video compression and particularly in various MPEG/H.261/H.263 standards. This paper is a review of the block matching algorithm i.e. three step search used for motion estimation in video compression. This paper discusses analysis report of this algorithm which is generated by instrumentation profiling tool.

KeywordsMotion Estimation, Block matching, Three Step Search Algorithm.


    A video sequence can be considered to be a discretized three-dimensional projection of the real four-dimensional continuous space-time. The objects in the real world may move, rotate, or deform. The movements cannot be observed directly, but instead the light reflected from the object surfaces and projected onto an image. The light source can be moving, and the reflected light varies depending on the angle between a surface and a light source. There may be objects occluding the light rays and casting shadows. The objects may be transparent (so that several independent motions could be observed at the same location of an image) or there might be fog, rain or snow blurring the observed image. The discretization causes noise into the video sequence, from which the video encoder makes its motion estimations. There may also be noises in the image capture device (such as a video camera) or in the electrical transmission lines. A perfect motion model would take all the factors into account and _nd the motion that has the maximum likelihood from the observed video sequence. [12]


    There exists two basic approaches to motion estimation:

    • Pixel based motion estimation

    • Block-based motion estimation

    The pixel based motion estimation approach seeks to determine motion vectors for every pixel in the image. This is also referred to as the optical ow method, which works on the fundamental assumption of brightness constancy, that is the intensity of a pixel remains constant, when it is displaced. However, no unique match for a pixel in the reference frame is found in the direction normal to the intensity gradient. It is for this reason that an additional constraint is also introduced in terms of the smoothness of velocity (or displacement) vectors in the neighborhood. The smoothness constraint makes the algorithm interactive and requires excessively large computation time, making it unsuitable for practical and real time implementation.


    After motion estimation, a picture residue and a set of motion vectors are produced. The following procedure is executed for each block (16×16, 8×8 or4x4) in the current frame.

    1. For the reference frame, a search area is defined for each block in the current frame. The search area is typically sized at 2 to 3 times the macro block size (16×16). Using the fact that the motion between consecutive frames is statistically small, the search range is confined to this area. After the search process, a best match will be found within the area. The best matching usually means having lowest energy in the sum of residual formed by subtracting the candidate block in search region from the current block located in current frame. The

      process of finding best match block by block is called block-based motion estimation.

    2. When the best match is found, the motion vectors and residues between the current block and reference block are computed. The process of getting the residues and motion vectors is known as motion compensation.

    3. The residues and motion vectors of best match are encoded by the transform unit and entropy unit and transmitted to the decoder side.

    4. At decoder side, the process is reversed to reconstruct the original picture.

    Figure 1: Motion estimation and motion vector [12]

    Figure 1 shows an illustration of the above procedure. In modern video coding standards, the reference frame can be a previous frame, a future frame or a combination of two or more previously coded frames. The number of reference frames needed depends on the required accuracy. The more reference frames referenced by current block, the more accurate the prediction is. There are two types of prediction: Intraframe i.e. I-frame and forward predicted frame i.e. P-frame.

    P-pictures are interframe encoded using motion prediction from the previous I or P-picture in the

    sequence. The luminance component of each macroblock is matched with a similar 16×16 sample region in the previous I or P-picture. The difference macroblock, together with the motion vector, is encoded and transmitted. Macroblocks in P-pictures may be optionally intra-coded (without prediction).

    Figure 2: Frames prediction method [12][11]


    Block based matching method is the most widely used motion estimation method for video coding since pictures are normally rectangular in shape and block-division can be easily done. Usually, each standard have different block size for motion estimation. This can be 16×16, 8×8, etc, depending on the target application of the video codec. The goal of motion estimation is to predict the next frame from the current frame by associating the motion vector to picture macroblocks as accurately as possible. The block size determines the quality of prediction and thus the accuracy.


    Block matching techniques match blocks from the current frame with blocks from a reference frame. The displacement in block location from the current frame to the location in the reference frame is the motion vector. Block matching techniques can be divided into three main components as shown in Figure 3: block determination, search method, and matching criteria.

    Figure 3: Block Matching Flowchart

    The first component, block determination, specifies the position and size of blocks in the current frame, the start location of the search in the reference

    frame, and the scale of the blocks. We focus on fixed size, disjoint blocks spanning the frame, with initial start location at the corresponding location of the block in the reference frame. In tracking, a predictive method may be used to improve the start location of the search.

    The search method is the second component, specifying where to look for candidate blocks in the reference frame. A fully exhaustive search consists of searching every possible candidate block in the reference frame. This search is computationally expensive and other search methods have been proposed to reduce the number of candidate blocks and/or reduce the processing for all candidate blocks.

    The third component is the matching criteria. The matching criteria are a similarity metric to determine the best match among the candidate blocks. In faster search methods, the best match so far will also determine the direction of the search (choice of next candidate blocks).

    The motion vectors are fed to the block determination to implement multi-resolution blocks. A coarse t fine resolution of the blocks is generated. The start location of the search at each resolution is the location of the best match (motion vector) from the previous coarser resolution.

    The implementation of block matching using components allows for flexibility, inter- changing components produces a large variety of block matching techniques. Based on the application, components which provide the best results can be chosen with ease.


    Block-based motion estimation obtains the best match by minimizing a cost function. Various cost functions have been proposed and analyzed in the literatures, varying in complexity and efficiency.

    Sum of absolute difference (SAD)

    The SAD cost function is defined as [18]:

    where In and In-1 represent the macroblock in current and reference frame respectively. m and n are the search location motion vector and n is the block size. k and l represent the index of macroblocks.


    To represent the motion of each block, a motion vector is defined as the relative displacement between the current candidate block and the best matching block within the search window in the reference frame. It is a directional pair representing the displacement in horizontal (x-axis) direction and vertical (y-axis) direction. The maximum value of motion vector is determined by the search range. The larger the search range, the more bits needed to code the motion vector. Designers need to make tradeoffs between these two conflicting parameters.


    The quality of a video scene can be determined using both objective and subjective approaches. The higher the PSNR, the higher the quality of the encoding. The PSNR and bit-rate are usually conflicting, the most appropriate point being determined by the application. Although PSNR can objectively represent the quality of coding, it does not equal the subjective quality. Subjective quality is determined by a number of human testers and a conclusion is drawn based on their opinions. There exist cases for which high PSNR results in low subjective quality. However, in most cases, PSNR provides a good approximation to the subjective measure.


    Block based matching algorithms are processes for finding minimum motion vectors in the motion estimation process. Different kinds of algorithm could give a different motion vector. Among all algorithms proposed, only full search gives optimal result within the search range. Other algorithms will give near-to-optimal results but significant lower complexity by reducing the number of search points. There are several block-matching algorithms which are commonly used in various coding standards.


    The Three Step Search begins with eight candidate neighbor blocks a given step size away. The search is a recursive process with each iteration centered at the best match from the previous iteration and the step size halved, until a step size of one is reached. The three step search reduces the number of candidate blocks and covers a large area, making it

    a fast search technique. The approach concentrates on directing the search based on the best match from the previous step.

    The Flowchart of Three Step Search Algorithm are as follow:


    Instrumentation profiling of three step search algorithm is used to measure performance with the help of a visual studio profiling tool, which is based on which type of processor used in the computer because instrumentation profiling time can directly calculated from processor of your computer and plot a graph between wall clock time (in seconds) and CPU usage for whole encoding process (in percentage %).

    Take input as Stefan_cif.yuv then profiler plots a graph which is shown as below:

    This result depends on internal or external resources

    i.e. resources could be anything from I/O.


[1]. Bhagavanreddy Thokala, K.Haripriya, "VLSI Motion Estimation Architecture for Full Search Block Matching Algorithm", International Journal of Scientific Research (IJSR), ISSN No 2277 – 8179, Volume 2,

Issue 2, Feb 2013.

[2]. Ms. Bhavina Patel, Dr. R.V. Kshirsagar, Dr. Vilas Nitnaware,"Review and comparative study of motion estimation techniques to reduce complexity in video compression", International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, Vol. 2, Issue 8, August 2013.

[3]. Ibrahim Nahhas, Martin Drahansky, "Analysis of Block Matching Algorithms with Fast Computational and Winner- update Strategies", International Journal of Signal Processing, Image Processing and Pattern Recognition, Vol. 6, No. 3, June, 2013.

[4]. Sriram B, Eswar Reddy M, Subha Varier G,"Study of various motion estimation algorithms for video compression/coding standards implementation of an optimal algorithm in LabVIEW ", International Journal of Emerging Technology and Advanced Engineering, Volume 3, Issue 4,

April 2013.

[5]. Rutika Joshi, Rajesh Kumar Rai, Jigar Ratnottar,"Review of Di_erent Standards for Digital Video Compression Technique", International Journal of advancement in electronics and computer engineering (IJAECE), ISSN 2278 -1412, Volume 1,

Issue1, pp.32-38, April 2012.

[6]. Darshna D.Jagiwala, Prof. Mrs. S. N. Shah, "Analysis of Block Matching Algorithms for Motion Estimation in H.264 Video CODEC" International Journal of Engineering Research and Applications (IJERA), ISSN 2248-9622, Vol. 2, Issue 6,

pp.1396-1401, November- December 2012. [7]. S. Immanuel Alex Pandian, Dr.G. Josemin Bala, Becky Alma George, A Study on Block Matching Algorithms

for Motion Estimation", International Journal on Computer Science and Engineering (IJCSE), ISSN : 0975- 3397, Vol.3 No.1, pp. 34-44, Jan 2011.

[8]. I. E. G. Richardson,"H.264 and MPEG-4 video compression", Wiley, Chichester, England, 2003.

[9]. Richardson I_E_G_ H_264/MPEG-4 Part 10: Transform Quantization [J]. A white paper. [Online]. Available: http://www_vcodex_com, 2003.

[10]. Feng Xiao,"DCT-based Video Quality Evaluation", Final Project for EE392J, Winter 2000.

[11]. Osama Al- Shaykh, Ralph Ne_, David Taubman, Avideh Zakhor,"Video Sequence Compression", University of california, Berkeley,CRC Press LLC.,2000.

[12]. Hsueh-Ming Hang, Yung-Ming Chou, and Sheu-Chih CHeng," Motion Estimation for video coding standards ", Journal of VLSI signal processing 17, pp.113-136, 1997.

[13]. Borko Furht, Joshua Greenberg, Raymond Westwater, "Motion Estimation Algorithms

For Video Compression", Massachusetts: Kluwer Academic Publishers, 1997. Chapter 2 & 3.

[14]. Ralf Schafer, Thomas Sikora,"Digital Video Coding Standards and their role in video communication", proceeding of IEEE, vol. 83, no.6, June, 1995.

[15]. R. Li, B. Zeng, and M. L. Liou,"A new three-step search algorithm for block motion estimation", IEEE TRANSACTIONS Circuits System Video Technology, vol. 4, pp. 438442, Aug 1994.

[16]. A. Zaccarin and B. Liu," Fast algorithms for block motion estimation", In IEEE ICASSP, volume 3, pages 449-452, 1992.

[17]. M. Ghanbari,"The cross-search algorithm for motion estimation", IEEE TRANSACTIONS Communication, vol. 38, no. 7pp. 950-953, July 1990.

[18]. GHARAVI, H. MILLIS,"Block matching motion estimation algorithms new results", IEEE Transactions on Circuits and Systems, pp. 649-651,1990.

[19]. CCITT Recommendation H_261,"Video Codec for audio visual services at p_Kbits/sec", 1990.

[20]. T. Koga, K. Iinuma, A. Hirano, Y. Iijima, and T. Ishiguro,"Motion compensated interframe coding for video conferencing", Proc.NTC81, pp. C9.6.1-9.6.5, New

Orleans, LA. Nov. 1981.

Leave a Reply