Residue Coding by Mode Dependent Fuzzy Vector Quantization in HEVC

  1. Jency Leena, Dr. V. Mohan, Dr. P. Shanmugapriya

    Dept of ECE, Saranathan College of Engineering, Trichy, India

    Abstract: The HEVC standard supports thirty-five intra prediction modes. Intra prediction in HEVC predicts pixel values from other pixels within the same frame. Edges in the frame produce high-energy prediction residues. This paper proposes a new method for encoding these high-energy residues after intra prediction using mode dependent fuzzy vector quantization (MDFVQ). A fuzzy clustering method generates the codebook from the training sequences, and the MDFVQ codebook is optimized using a rate-distortion (RD) criterion.

    Keywords: MDFVQ; Quad tree Decomposition; Intra Prediction; Residue coding


      HEVC (High Efficiency Video Coding) is the video coding standard developed by the JCT-VC (Joint Collaborative Team on Video Coding). Intra prediction in HEVC is still limited in parts of an intra frame that contain a lot of detail, texture and edges.

      In the H.264 and HEVC video coding schemes, intra prediction is performed by extrapolating the reference pixel intensities from previously reconstructed blocks of the frame. When a block of pixel intensities is predicted using the planar or DC mode, the residues have homogeneous patterns. When the block is predicted using an angular prediction mode, high-energy prediction residues with directional structures are generated.

      In [1], a directional transform followed by an adaptive coefficient scanning procedure is used to encode the high-energy prediction residues. In [3], a vector quantization based codebook is developed through an iterative training procedure to model the intra prediction residues.

      The proposed technique follows the same mode dependent procedure as [3], but with fuzzy vector quantization. The codebooks for encoding the residues are developed using a stochastic optimization technique, the PSO (particle swarm optimization) algorithm. The PSO model [2] is used to design the codebook for vector quantization: the particles (codebooks) alter their values based on their previous experience to generate the codebook.


      Vector quantization is an efficient data compression technique that exploits the correlation between vectors. Its main difficulty is finding the best-matching codeword in the codebook. The mode dependent vector quantization (MDVQ) scheme is illustrated in Fig 1.

      Fig 1. Illustration of MDVQ based residue coding

      In the mode dependent vector quantization process, the image is first divided into N×N blocks, which are used in intra prediction. After intra prediction of an image block, the MDVQ method uses codebooks to encode the residual signal. The codebooks are learned from training vectors. Only the index of the matching codeword is sent to the receiver. The quantization error is then processed by the DCT and entropy coding.
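The codeword search described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: the names `mdvq_encode` and `mdvq_decode`, the dictionary layout of `codebooks`, and the example codewords are all assumptions made for the example. The quantization error is returned as-is, whereas the scheme in the text would pass it on to the DCT and entropy coder.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def mdvq_encode(residual, mode, codebooks):
    """Pick the best-matching codeword for one residual vector.

    codebooks is a hypothetical dict: intra mode index -> list of codewords.
    Returns the codeword index and the remaining quantization error.
    """
    codebook = codebooks[mode]
    index = min(range(len(codebook)), key=lambda i: sq_dist(residual, codebook[i]))
    error = [r - c for r, c in zip(residual, codebook[index])]
    return index, error


def mdvq_decode(index, mode, codebooks, error):
    """Rebuild the residual from the codeword index plus the decoded error."""
    return [c + e for c, e in zip(codebooks[mode][index], error)]


# Illustrative codebook for a single mode (values are made up).
codebooks = {26: [[0, 0, 0, 0], [4, 4, -4, -4], [8, 0, -8, 0]]}
index, error = mdvq_encode([5, 3, -4, -5], 26, codebooks)
```

Keeping one codebook per intra mode means the decoder can select the right codebook from the mode it has already decoded, so only the index needs to be signalled.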


      The encoder and decoder of the proposed work are shown in Fig 2(a) and Fig 2(b). The key frames are extracted from the input video, and the HEVC standard is used to predict the intra modes in the initial frame of the video. The HEVC standard uses block sizes of 4×4, 8×8, 16×16 and up to 64×64 pixels. The first I frame is chosen as the key frame.

      Fig 2(a). Block Diagram of Proposed Encoder

      There are 33 angular modes, a planar mode and a DC mode for predicting a PB (prediction block). Table 1 lists each mode name with its corresponding intra prediction mode index.

      Table 1. Intra mode Prediction

      Mode name        Intra prediction mode index
      Planar mode      MODE 0
      DC mode          MODE 1
      Angular modes    MODE 2 to 34
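The mode numbering in Table 1 can be expressed as a small lookup. The function name `intra_mode_name` is an assumption for illustration only.

```python
def intra_mode_name(mode):
    """Map an HEVC intra prediction mode index (0..34) to its mode family."""
    if mode == 0:
        return "planar"
    if mode == 1:
        return "DC"
    if 2 <= mode <= 34:
        return "angular"
    raise ValueError("HEVC intra mode index must be in 0..34")
```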

      The angular modes extrapolate the predicted samples from the reference samples, as shown in Fig 3, in order to achieve lower complexity.

      Fig 3. Reference Samples Rx,y used to predict the pixels Px,y


      The predicted sample is denoted $P_{x,y}$ and the reference sample $R_{x,y}$. The angular modes use the following formulas to predict the pixel values.

      $$P_{x,y} = \left( (32 - w_y)\, R_{i,0} + w_y\, R_{i+1,0} + 16 \right) \gg 5 \tag{1}$$

      $$c_y = (y \cdot d) \gg 5 \tag{2}$$

      $$w_y = (y \cdot d) \,\&\, 31 \tag{3}$$

      $$i = x + c_y \tag{4}$$




      Fig 2(b). Block Diagram of Proposed Decoder

      1. Intra mode Prediction Process

        The prediction process removes spatial redundancy. Intra prediction predicts the pixels of a picture block from neighboring pixels within the same frame, so the frame must first be divided into blocks. Intra-picture prediction uses the previously decoded boundary samples of spatially neighboring blocks to predict a new prediction block (PB). HEVC [6] employs 35 different intra modes to predict a PB.

        where $x$ and $y$ are the spatial coordinates; the predicted sample is projected to a sub-pixel location between $R_{i,0}$ and $R_{i+1,0}$; $c_x$ and $c_y$ are pixel parameters corresponding to the x and y coordinates; $w_y$ is the weighting parameter of the weighted prediction; $d$ is the projection displacement associated with the selected angular direction; $i$ is the reference sample index; >> denotes a right bit shift and & denotes a bitwise AND operation.
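The angular interpolation can be sketched in a few lines. This is a simplified illustration under stated assumptions: it uses only the top reference row $R_{i,0}$, handles only non-negative displacements `d` (0 to 32), and rounds with the usual +16 offset before the 5-bit shift. The function name `angular_predict` and the layout of `ref` are hypothetical.

```python
def angular_predict(ref, n, d):
    """Predict an n x n block from the top reference row.

    ref[i] holds R_{i,0} for i = 0 .. 2n+1 (assumed layout).
    d is the per-row displacement in 1/32-sample units, 0 <= d <= 32.
    """
    pred = [[0] * n for _ in range(n)]
    for y in range(1, n + 1):
        c_y = (y * d) >> 5          # integer part of the displacement for row y
        w_y = (y * d) & 31          # fractional part, used as interpolation weight
        for x in range(1, n + 1):
            i = x + c_y             # index of the left reference sample
            pred[y - 1][x - 1] = ((32 - w_y) * ref[i] + w_y * ref[i + 1] + 16) >> 5
    return pred
```

With `d = 0` every row simply copies the reference sample directly above it, which corresponds to pure vertical prediction; non-zero `d` shifts the projection by `d/32` samples per row and blends the two nearest reference samples.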

        The flowchart for compression is shown in Fig 4. It describes the compression process at the encoder side.

        In planar mode, the prediction is an amplitude surface with smooth horizontal and vertical gradients derived from the boundary samples of the neighboring blocks. The planar mode equations (5), (6) and (7) are used to predict the sample values.

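        $$P_{x,y} = \left( P^{V}_{x,y} + P^{H}_{x,y} + N \right) \gg \left( \log_2 N + 1 \right) \tag{5}$$

        $$P^{V}_{x,y} = (N - y)\, R_{x,0} + y\, R_{0,N+1} \tag{6}$$

        $$P^{H}_{x,y} = (N - x)\, R_{0,y} + x\, R_{N+1,0} \tag{7}$$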


        where $x$ and $y$ denote the location of the sample; $N$ is the number of samples along each side of the block; V stands for vertical and H for horizontal; $R$ denotes the reference samples. The DC mode is efficient at predicting plane areas of smoothly varying content in the image.
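A minimal sketch of the planar prediction follows, assuming 1-based sample coordinates, a power-of-two block size `n`, and reference arrays laid out as `top[x] = R_{x,0}` and `left[y] = R_{0,y}` for indices 0 to n+1. The function name `planar_predict` and this array layout are assumptions made for the example.

```python
def planar_predict(top, left, n):
    """Planar intra prediction for an n x n block (n a power of two).

    top[x]  = R_{x,0} for x = 0 .. n+1 (top[n+1] is the top-right sample)
    left[y] = R_{0,y} for y = 0 .. n+1 (left[n+1] is the bottom-left sample)
    """
    shift = n.bit_length()  # equals log2(n) + 1 when n is a power of two
    pred = [[0] * n for _ in range(n)]
    for y in range(1, n + 1):
        for x in range(1, n + 1):
            p_v = (n - y) * top[x] + y * left[n + 1]    # vertical interpolation
            p_h = (n - x) * left[y] + x * top[n + 1]    # horizontal interpolation
            pred[y - 1][x - 1] = (p_v + p_h + n) >> shift
    return pred
```

When all reference samples share one value the predicted block is flat at that value, which is a quick sanity check on the weights and the final shift.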

      2. Computation of Residues

        The sum of absolute differences (SAD) is computed for each mode, and the mode with the minimum SAD is selected. The prediction residual is the difference between the original pixel values and the predicted pixel values.

        $$\mathrm{SAD} = \sum_{i=0}^{N} \sum_{j=0}^{N} \left| X_{i,j} - Y_{i,j} \right| \tag{8}$$


        Fig 4. Flowchart for compression

        The flowchart for decompression is shown in Fig 5. It describes the decompression process at the decoder side.

        $$d_{i,j} = X_{i,j} - Y_{i,j} \tag{9}$$

        where $X_{i,j}$ are the original pixel values and $Y_{i,j}$ are the predicted pixel values. The residual signal is computed using formula (9).
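The SAD-based mode decision and the residual computation can be sketched as below. The helper names and the `candidates` dictionary (mode index mapped to its predicted block) are assumptions for illustration; the sums run over every sample of the block.

```python
def sad(orig, pred):
    """Sum of absolute differences between two equal-sized blocks."""
    return sum(abs(x - y) for ro, rp in zip(orig, pred) for x, y in zip(ro, rp))


def residual(orig, pred):
    """Per-sample difference d[i][j] = X[i][j] - Y[i][j]."""
    return [[x - y for x, y in zip(ro, rp)] for ro, rp in zip(orig, pred)]


def select_mode(orig, candidates):
    """Return the mode whose predicted block gives the minimum SAD.

    candidates is a hypothetical dict: mode index -> predicted block.
    """
    return min(candidates, key=lambda mode: sad(orig, candidates[mode]))


block = [[10, 12], [11, 13]]
candidates = {0: [[10, 10], [10, 10]], 1: [[11, 12], [11, 12]]}
best = select_mode(block, candidates)
res = residual(block, candidates[best])
```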

      3. Quad Tree Decomposition

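      Quad tree decomposition divides the input image into four equal-sized square blocks. The HEVC standard uses a quad tree partitioning method. The quad tree for a set of points $P$ in a square $Q = [X_{1q} : X_{2q}] \times [Y_{1q} : Y_{2q}]$ is defined as follows. If $|P| \le 1$, the quad tree consists of a single leaf. Otherwise, $Q_{NE}$, $Q_{NW}$, $Q_{SW}$ and $Q_{SE}$ are the four quadrants [5], with

      $$X_{mid} = (X_{1q} + X_{2q})/2, \qquad Y_{mid} = (Y_{1q} + Y_{2q})/2 \tag{10}$$

      $$P_{NE} = \{ p \in P : p_x > X_{mid} \wedge p_y > Y_{mid} \} \tag{11}$$

      $$P_{NW} = \{ p \in P : p_x \le X_{mid} \wedge p_y > Y_{mid} \} \tag{12}$$

      $$P_{SW} = \{ p \in P : p_x \le X_{mid} \wedge p_y \le Y_{mid} \} \tag{13}$$

      $$P_{SE} = \{ p \in P : p_x > X_{mid} \wedge p_y \le Y_{mid} \} \tag{14}$$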
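The split of a point set into four quadrants can be sketched as follows. The tie-breaking convention is an assumption: points lying exactly on a midline are assigned to the west or south side, as in the common textbook definition. The function name `split_points` is hypothetical.

```python
def split_points(points, x1, x2, y1, y2):
    """Partition (x, y) points of the square [x1, x2] x [y1, y2] into quadrants."""
    x_mid = (x1 + x2) / 2
    y_mid = (y1 + y2) / 2
    quads = {"NE": [], "NW": [], "SW": [], "SE": []}
    for px, py in points:
        # Points on a midline fall to the south / west side (assumed convention).
        key = ("N" if py > y_mid else "S") + ("E" if px > x_mid else "W")
        quads[key].append((px, py))
    return quads
```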



      Fig 5. Flowchart for decompression

      The planar mode is designed to preserve continuity along the block edges. The value of each sample of the PB, as given in Fig 4, is calculated assuming such an amplitude surface.

      The quad tree has a root node v, and Q is stored in v. Each internal node has exactly four children. In quad tree decomposition, the x-y plane is recursively subdivided into four quadrants. The four smaller quadrants may be of any shape, such as square or rectangular. This partitioning is known as a Q-tree. Initially, the threshold value is set together with the minimum block size. For each block, the mean value is computed and compared with the threshold value. If the mean of the block exceeds the threshold, the block is split further into quadrants; otherwise the block is not split.
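A minimal recursive sketch of this block decomposition is given below. It assumes a square image whose side is a power of two, pixel values normalised to the range 0 to 1, and the splitting rule stated in the text (block mean compared against the threshold); other homogeneity tests are also used in practice. The function name `quadtree` and its argument layout are hypothetical.

```python
def quadtree(img, x, y, size, threshold, min_size, leaves=None):
    """Recursively split the block at (x, y) and collect leaves as (x, y, size)."""
    if leaves is None:
        leaves = []
    total = sum(sum(row[x:x + size]) for row in img[y:y + size])
    mean = total / (size * size)
    if size > min_size and mean > threshold:
        half = size // 2
        for dy in (0, half):
            for dx in (0, half):
                quadtree(img, x + dx, y + dy, half, threshold, min_size, leaves)
    else:
        leaves.append((x, y, size))
    return leaves


# 4x4 example: only the top-left quadrant contains non-zero samples.
image = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 0, 0],
    [0, 0, 0, 0],
]
leaves = quadtree(image, 0, 0, 4, threshold=0.1, min_size=1)
```

In this example the root and the top-left quadrant are split, giving four 1×1 leaves and three 2×2 leaves. Raising the threshold or the minimum size produces fewer, larger leaves.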










      The pixel intensity values of each of these blocks form a multi-dimensional vector known as a training vector. The basic objective of vector quantization is to classify these training vectors into clusters. The centers of these clusters are known as codebook vectors. Only the codebook vectors, together with the index of the closest codebook vector for each block, need to be stored or transmitted, since the original image can be approximately reconstructed from this information, which results in compression. To reconstruct the image, the training vectors are replaced by their closest codebook vectors. Clustering is an important data processing step: it attempts to subdivide a data set into subsets, or clusters, such that the intra-cluster similarity is maximized while the inter-cluster similarity is minimized [4]. The fuzzy c-means (FCM) algorithm is used to assign the pixels to each cluster. The flowchart of the fuzzy c-means algorithm is shown in Fig 6.
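The step from image blocks to training vectors, and back from codebook indices to a reconstructed image, can be sketched as follows. The sketch assumes the image height and width are multiples of the block size `n`; all function names are hypothetical, and the codebook is taken as given rather than trained here.

```python
def blocks_to_vectors(img, n):
    """Flatten each n x n block (raster order) into one training vector."""
    h, w = len(img), len(img[0])
    return [[img[y + dy][x + dx] for dy in range(n) for dx in range(n)]
            for y in range(0, h, n) for x in range(0, w, n)]


def nearest_index(vec, codebook):
    """Index of the codebook vector closest to vec (squared Euclidean distance)."""
    def dist(i):
        return sum((a - b) ** 2 for a, b in zip(vec, codebook[i]))
    return min(range(len(codebook)), key=dist)


def reconstruct(indices, codebook, h, w, n):
    """Rebuild an h x w image by pasting the indexed codebook vectors."""
    img = [[0] * w for _ in range(h)]
    blocks_per_row = w // n
    for b, idx in enumerate(indices):
        y0 = (b // blocks_per_row) * n
        x0 = (b % blocks_per_row) * n
        for k, val in enumerate(codebook[idx]):
            img[y0 + k // n][x0 + k % n] = val
    return img
```

Only the list of indices (plus the shared codebook) is needed to rebuild the approximation, which is where the compression comes from.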

      Vector quantization (VQ) is one of the most powerful quantization techniques used for image compression. The intra prediction residuals can be represented in the form of vectors. The quantization of the residual signal vectors is performed according to the intra prediction mode. The MDFVQ (Mode Dependent Fuzzy Vector Quantization) based residue coding is performed after intra prediction.

      Mode-dependent codebooks are learned from a training set of residue vectors. In the encoding procedure, each K-dimensional input vector is written as

      $$X = (X_1, X_2, \ldots, X_K) \tag{15}$$

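      The distortion between the input vector $X$ and the $i$th codeword $V_i$ is the squared Euclidean distance given in (16).

      $$d(X, V_i) = \sum_{j=1}^{K} \left( X_j - V_{i,j} \right)^2 \tag{16}$$

      The codeword $V_i$ with the minimum distortion represents the input vector $X$, and only its index $i$ is sent to the receiver. The residual signals are encoded using the MDFVQ method.

      Let $X = (x_1, x_2, \ldots, x_N)$ denote an image with $N$ pixels to be divided into $c$ clusters, where $x_j$ represents the data. FCM is an iterative optimization algorithm that minimizes the cost function

      $$J = \sum_{j=1}^{N} \sum_{i=1}^{c} u_{ij}^{m}\, \lVert x_j - v_i \rVert^2$$

      where $u_{ij}$ represents the membership of pixel $x_j$ in the $i$th cluster, $v_i$ is the $i$th cluster center, and $m$ is a constant that controls the fuzziness of the partition. The membership is given by

      $$u_{ij} = \frac{1}{\sum_{k=1}^{c} \left( \dfrac{\lVert x_j - v_i \rVert}{\lVert x_j - v_k \rVert} \right)^{2/(m-1)}} \tag{17}$$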

      Fig 6. Flowchart of FCM
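A compact sketch of fuzzy c-means codebook training is shown below. It alternates the membership update with the standard FCM center update (a membership-weighted mean of the training vectors). The initialisation from evenly spaced training vectors, the fixed iteration count, and the name `fcm` are assumptions made for this example, and no rate-distortion optimization is included.

```python
def sq_dist(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((p - q) ** 2 for p, q in zip(a, b))


def fcm(vectors, c, m=2.0, iters=50):
    """Fuzzy c-means: returns (centers, u) with u[i][j] = membership of vector j in cluster i."""
    n, dim = len(vectors), len(vectors[0])
    step = max(1, n // c)
    centers = [list(vectors[i * step]) for i in range(c)]  # assumed initialisation
    u = [[0.0] * n for _ in range(c)]
    for _ in range(iters):
        # Membership update; squared distances are used, so the exponent is 1/(m-1).
        for j, x in enumerate(vectors):
            d = [sq_dist(x, v) for v in centers]
            if min(d) == 0.0:
                hit = d.index(0.0)  # vector coincides with a center: full membership
                for i in range(c):
                    u[i][j] = 1.0 if i == hit else 0.0
            else:
                for i in range(c):
                    u[i][j] = 1.0 / sum((d[i] / d[k]) ** (1.0 / (m - 1.0)) for k in range(c))
        # Center update: membership-weighted mean of all training vectors.
        for i in range(c):
            w = [u[i][j] ** m for j in range(n)]
            total = sum(w)
            if total > 0.0:
                centers[i] = [sum(w[j] * vectors[j][k] for j in range(n)) / total
                              for k in range(dim)]
    return centers, u


training = [[0, 0], [0, 1], [1, 0], [10, 10], [10, 11], [11, 10]]
codebook, memberships = fcm(training, c=2)
```

The returned centers play the role of codebook vectors. Unlike hard clustering, every training vector contributes to every center, weighted by its membership.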


      In this section, we present the experimental results of the intra prediction, quad tree decomposition, quantization and encoding processes. The key frames of the video are initially decomposed into 4×4, 8×8 and 16×16 blocks. A complete macroblock is 16×16.

      The HEVC standard is used to predict the intra modes in the I frame of Akiyo newsreader.avi, as shown in Fig 7. The I frame is encoded using the vector quantization method. The compressed frame is shown in Fig 8.

      Fig 7. Intra Prediction modes in Akiyo I frame

      Fig 8. Compressed Frame

      The quad tree decomposition was performed on the Akiyo newsreader frame with minimum dimension 1 and threshold value 0.50, as shown in Fig 11.

      Fig 11. Quad tree decomposition of Akiyo frame threshold 0.50

      The HEVC standard is used to predict the intra modes in the I frame of Foreman.avi, as shown in Fig 12. The I frame is encoded using the vector quantization method. The compressed frame is shown in Fig 13.

      The Akiyo newsreader I frame gives a PSNR value of 19.4990 dB. The quad tree decomposition was performed on the Akiyo newsreader frame with minimum dimension 1 and threshold value 0.10, as shown in Fig 9.
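The text does not state how PSNR was computed. A common definition, assumed here, is $10 \log_{10}(\text{peak}^2 / \text{MSE})$ with a peak value of 255 for 8-bit samples. The sketch below follows that assumption; the function name `psnr` is hypothetical.

```python
import math


def psnr(orig, recon, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-sized frames."""
    count = len(orig) * len(orig[0])
    mse = sum((a - b) ** 2
              for ro, rr in zip(orig, recon)
              for a, b in zip(ro, rr)) / count
    if mse == 0:
        return math.inf  # identical frames
    return 10 * math.log10(peak ** 2 / mse)
```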









      [Figure panels: "Encoding I Frame" and "Compressed I Frame"]

      Fig 12. Intra Prediction modes in I frame

      Fig 13. Compressed Frame

      Fig 9. Quad tree decomposition of Akiyo frame threshold 0.10

      The quad tree decomposition was performed on the Akiyo newsreader frame with minimum dimension 1 and threshold value 0.30, as shown in Fig 10.

      Fig 10. Quad tree decomposition of Akiyo frame threshold 0.30

      The Foreman video I frame gives a PSNR value of 27.9410 dB. The quad tree decomposition was performed on the Foreman frame with minimum dimension 1 and threshold value 0.10, as shown in Fig 14.

      Fig 14. Quad tree Decomposition of foreman threshold 0.10

      The quad tree decomposition was performed on the Foreman frame with minimum dimension 4 and threshold value 0.30, as shown in Fig 15.

      Fig 15. Quad tree decomposition of Foreman image threshold 0.30


In our experiment, the foreman.avi and Akiyo newsreader video sequences are used to demonstrate intra prediction of frames using HEVC. Quad tree partitioning is performed on the Akiyo newsreader video frame and the Foreman video frame with different threshold values. Mode dependent fuzzy vector quantization is used for coding the high-energy residues after intra prediction of the frames. This method provides better PSNR values and coding efficiency than the other methods, along with improved video quality.


      1. Sairam Y N, Nan Ma, Neelu Sinha (2008), 'A Novel Partial Prediction Algorithm for Fast 4×4 Intra Prediction Mode Decision in H.264/AVC', ATC Labs, NJ, USA, and Dept. of Computer Science & Math, Fairleigh Dickinson University, NJ, USA, IEEE, DOI 10.1109/DCC.2008.93.

      2. Chiranjeevi Karri, Uma Ranjan Jena (2016), 'Fast vector quantization using a Bat algorithm for image compression', Department of Electronics and Telecommunication Engineering, Veer Surendra Sai University of Technology (VSSUT), Burla 768018, Odisha, India, Engineering Science and Technology, an International Journal 19 (2016) 769-781.

      3. Bihong Huang, Felix Henry, Christine Guillemot, Philippe Salembier (2015), 'Mode dependent Vector Quantization with a Rate-Distortion Optimized Codebook for Residue Coding in Video Compression', IEEE.

      4. Firas A. Jassim (2012), 'Hybrid Image Segmentation using Discerner Cluster in FCM and Histogram Thresholding', Management Information System Department, Irbid National University, 2600 Irbid, Jordan, International Journal of Graphics & Image Processing, Vol 2, Issue 4.

      5. Pinki, Rajesh Mehra (2016), 'Quad-tree Decomposition based image analysis using intensity difference threshold', International Journal of Computer Engineering and Application, Volume X, Issue VII.

      6. Gary J. Sullivan, Jens-Rainer Ohm, Woo-Jin Han, Thomas Wiegand (2012), 'Overview of the High Efficiency Video Coding (HEVC) Standard', IEEE Transactions on Circuits and Systems for Video Technology, Vol. 22, No. 12.
