A Novel Interpolation Based Super-Resolution Of The Cropped Scene From A Video

DOI : 10.17577/IJERTV2IS3239


A. John, Sakthivel Punniakodi, Ramesh Murukesan

Dept. of Information Technology, Christ College of Engineering and Technology, Puducherry, India.

Abstract

Super-resolution (SR) image reconstruction is the process of combining several low-resolution images into a single high-resolution image. In a video, the image content changes from frame to frame. This paper presents an interpolation-based super-resolution method. The algorithm enhances the resolution of a scene by segmenting the video into frames, cropping the required part of the scene, and applying super-resolution through interpolation, regression, and post-processing to produce the super-resolved output image. The results of this work can be used for further object tracking and identification. We worked with traffic surveillance videos.

Keywords: Super-Resolution, Sequence of Images, Interpolation Method

  1. Introduction

    Super-resolution of images first appeared in the early 1980s; one of the first papers in the signal processing community was that of Tsai and Huang [1]. The topic has been active ever since, with some of the early good results appearing in the 1990s. The last five or so years, however, have witnessed an enormous resurgence in SR activity. Any given set of source low-resolution (LR) images captures only a finite amount of information from a scene; the goal of SR is to extract the independent information from each image in that set and combine it into a single high-resolution (HR) image. The LR images can come from a variety of sources: different frames of a video sequence, different still images taken by a single camera that has undergone translation or rotation, or multiple cameras capturing a single scene. The only requirement is that each LR image must contain some information that is unique to that image. This means that when the LR images are mapped onto a common reference plane, their samples must be subpixel-shifted relative to the samples of the other images; otherwise the images would contain only redundant information and SR reconstruction would not be possible. Various super-resolution methods are available [2]; in this paper we apply an interpolation-based method and propose a novel super-resolution of the cropped scene from a video.

    1. Converting video into frames

      The first step of our work is converting the video into frames, that is, reading every frame in the image sequence and saving it. Let $A$ denote a sample video in .avi format. The image sequence is expressed as

      $$A = \{F_i\}, \quad i = 0, 1, 2, \dots, n$$

      where $F_i$ is the set of frames present in second $i$ of the video. The frames within one second are expressed as

      $$F_i = \{f_j\}, \quad j = 0, 1, 2, \dots, n$$

      where $f_j$ is a frame of the image sequence and $j$ denotes the position of the frame in the sequence. (Common frame rates are 24 fps, 25 fps, and 30 fps.)

      In this paper we consider an image sequence at 24 fps (frames per second). Since the paper deals with traffic surveillance, we take a traffic video as input. Reading each frame is expressed as

      $$A = \{f_0, f_1, f_2, \dots, f_n\}$$

      Before converting, we find the number of frames in the image sequence (video), denoted $n$. We then read each frame until the end of the sequence and save each frame in a directory. Some of the frames are displayed below.

      Figure 1.1 Video to Frame (Frame-1, Frame-38, Frame-64)
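The frame-extraction step can be sketched as follows. This is a minimal sketch, assuming OpenCV (`cv2`) is installed; the function names `extract_frames`, `frame_filename`, and `split_frame_index`, the file naming scheme, and the PNG output format are illustrative assumptions, not details taken from the paper.

```python
from pathlib import Path


def frame_filename(index: int) -> str:
    """Hypothetical naming scheme: frame 38 is saved as frame_00038.png."""
    return f"frame_{index:05d}.png"


def split_frame_index(index: int, fps: int = 24) -> tuple[int, int]:
    """Map a global frame index to (second i, position j within that second)."""
    return divmod(index, fps)


def extract_frames(video_path: str, out_dir: str) -> int:
    """Read every frame of the video and save it; return the number saved."""
    import cv2  # imported lazily so the helpers above work without OpenCV

    Path(out_dir).mkdir(parents=True, exist_ok=True)
    capture = cv2.VideoCapture(video_path)
    count = 0
    while True:
        ok, frame = capture.read()  # ok becomes False at the end of the video
        if not ok:
            break
        cv2.imwrite(str(Path(out_dir) / frame_filename(count)), frame)
        count += 1
    capture.release()
    return count
```

With a 24 fps video, `split_frame_index(38)` places frame 38 at position 14 of second 1.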

    2. Cropping the number plate

      Cropping is the removal of unwanted parts of an image, or equivalently the cutting out of the required part. In this paper we crop the required part of a frame (scene) obtained from the video (image sequence). The cropping process is expressed as

      $$A = I(X_1, Y_1, X_2, Y_2)$$

      where $I$ is the input frame from which the region for super-resolution is cropped, $X_1$, $Y_1$, $X_2$, and $Y_2$ are the coordinates of the rectangle to be selected and cropped, and $A$ is the cropped image.

      Since we concentrate on traffic surveillance (that is, number plate detection), we crop the number plate present in a frame, as shown below.

      Figure 1.2.2 Cropped image of Selected area
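The cropping expression can be illustrated with a short sketch. The image is represented here as a plain list of pixel rows, and the corner $(X_2, Y_2)$ is treated as exclusive; both choices, and the function name `crop`, are assumptions made only for illustration.

```python
def crop(image, x1, y1, x2, y2):
    """Return the rectangle with corners (x1, y1) and (x2, y2).

    `image` is a list of rows; x indexes columns and y indexes rows.
    The corner (x2, y2) is treated as exclusive (an assumption).
    """
    if not (0 <= x1 < x2 <= len(image[0]) and 0 <= y1 < y2 <= len(image)):
        raise ValueError("crop rectangle lies outside the frame")
    return [row[x1:x2] for row in image[y1:y2]]


# Toy 4x6 "frame" whose pixel value encodes its row and column.
frame = [[10 * r + c for c in range(6)] for r in range(4)]
plate = crop(frame, 1, 1, 4, 3)
```

In practice the rectangle would be the number-plate region selected in the frame.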

    3. Super-Resolution

      Super-resolution is an active research technique, defined as generating a high-resolution image from a low-resolution image or from a set of low-resolution images [3,13]. In this paper we use single-image super-resolution: the task of constructing an HR (high-resolution) enlargement of a single LR (low-resolution) image, here the image cropped from a frame. Our proposed method performs four main sub-steps:

      1. Interpolation.

      2. Generation of a set of candidate images. (Regression)

      3. Combining candidate images to produce a single image.

      4. Post-processing.

      1. Interpolation

        Interpolation is the process of estimating values at specified points from known samples. Four common interpolation methods are:

        1. Nearest Neighbor Interpolation.

        2. Linear Interpolation.

        3. Cubic Interpolation.

        4. Spline Interpolation.

          In this paper we use spline interpolation, because it is more sophisticated and produces smoother edges. It is used to convert the cropped image to the desired scale.
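Spline-based upscaling of a single image row might look like the following. This is a minimal pure-Python sketch, assuming unit pixel spacing and natural end conditions (zero second derivative at the first and last sample); the function names are hypothetical, and a real implementation would apply the same procedure along rows and then columns of the cropped image.

```python
def solve_tridiagonal(lower, diag, upper, rhs):
    """Thomas algorithm for a tridiagonal system (lower[0], upper[-1] unused)."""
    n = len(diag)
    c = [0.0] * n
    d = [0.0] * n
    c[0] = upper[0] / diag[0]
    d[0] = rhs[0] / diag[0]
    for i in range(1, n):
        denom = diag[i] - lower[i] * c[i - 1]
        c[i] = upper[i] / denom
        d[i] = (rhs[i] - lower[i] * d[i - 1]) / denom
    x = [0.0] * n
    x[-1] = d[-1]
    for i in range(n - 2, -1, -1):
        x[i] = d[i] - c[i] * x[i + 1]
    return x


def natural_spline_slopes(ys):
    """Slopes k_i at each sample for a natural cubic spline, unit spacing."""
    n = len(ys)
    lower = [0.0] + [1.0] * (n - 1)
    diag = [2.0] + [4.0] * (n - 2) + [2.0]
    upper = [1.0] * (n - 1) + [0.0]
    rhs = [3.0 * (ys[1] - ys[0])]
    rhs += [3.0 * (ys[i + 1] - ys[i - 1]) for i in range(1, n - 1)]
    rhs += [3.0 * (ys[-1] - ys[-2])]
    return solve_tridiagonal(lower, diag, upper, rhs)


def upsample_row(ys, factor):
    """Evaluate the spline at `factor` steps per original pixel interval."""
    ks = natural_spline_slopes(ys)
    last = len(ys) - 1
    out = []
    for j in range(last * factor + 1):
        i = min(j // factor, last - 1)      # segment between samples i, i+1
        t = j / factor - i                  # local coordinate in [0, 1]
        dy = ys[i + 1] - ys[i]
        a = ks[i] - dy
        b = -ks[i + 1] + dy
        out.append((1 - t) * ys[i] + t * ys[i + 1]
                   + t * (1 - t) * (a * (1 - t) + b * t))
    return out
```

The interpolated row passes through every original sample, and linearly varying data is reproduced exactly.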

          Figure 1.2.1 Input Frame, Selecting desired area

          Algorithm to find the spline interpolation [2]

          Consider a third-order polynomial $P(x)$ for which

          $$P(x_1) = y_1, \quad P(x_2) = y_2, \quad P'(x_1) = k_1, \quad P'(x_2) = k_2$$

          We can write it in symmetrical form as

          $$P = (1 - t)\,y_1 + t\,y_2 + t(1 - t)\bigl(a(1 - t) + b\,t\bigr)$$

          where

          $$t = \frac{x - x_1}{x_2 - x_1}$$

          and

          $$a = k_1 (x_2 - x_1) - (y_2 - y_1), \qquad b = -k_2 (x_2 - x_1) + (y_2 - y_1)$$

          Differentiating $P$ twice gives

          $$P'' = 2\,\frac{b - 2a + (a - b)\,3t}{(x_2 - x_1)^2}$$

          Evaluating the second derivative at $x_1$ and $x_2$ gives

          $$P''(x_1) = 2\,\frac{b - 2a}{(x_2 - x_1)^2}, \qquad P''(x_2) = 2\,\frac{a - 2b}{(x_2 - x_1)^2}$$

          In spline interpolation, the curve to the left of the leftmost "knot" and to the right of the rightmost "knot" takes the form of a straight line with $P'' = 0$, because the ruler can move freely there. At the rightmost knot this gives

          $$P''(x_n) = -2\,\frac{3(y_n - y_{n-1}) - (2k_n + k_{n-1})(x_n - x_{n-1})}{(x_n - x_{n-1})^2} = 0$$

          The following graph shows an example of spline interpolation.

          Figure 1.3.1 Graphical representation of interpolation

          Applying spline interpolation to our cropped image gives the result shown below.

          Figure 1.3.2 Cropped image

          Figure 1.3.3 Interpolation image

    4. Generation of set of candidate images.

      A set of candidate images is generated by patch-wise regression. To reduce the time complexity we use kernel ridge regression, and we find a sparse basis by combining gradient descent and kernel matching pursuit [4,6,7].

      1. Gradient descent

        We use gradient descent because it is an efficient optimization algorithm when plenty of data is available, and it extracts information from that data efficiently. The parameter column vector is denoted as

        $$\theta_t = (\theta_t(1), \theta_t(2), \dots, \theta_t(n))^T$$

        where $T$ denotes transpose. The general mathematical expression for gradient descent is

        $$\theta_{t+1} = \theta_t - \frac{1}{2}\,\alpha\,\nabla_{\theta_t}\bigl[V^{\pi}(s_t) - V_t(s_t)\bigr]^2$$

        where $\alpha$ is the step size and $V_t(s)$ is a smooth differentiable function of $\theta_t$ for all $s \in S$.

        Let us generate candidate images using the interpolation image as shown below,

        Figure 1.4.1. Interpolation Resultant Image

        Figure 1.4.2. Regression results
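The regression step relies on kernel ridge regression. The sketch below shows plain kernel ridge regression on toy data, assuming a Gaussian kernel; the kernel choice, the parameter names `lam` and `sigma`, and the function names are assumptions for illustration. The sparse basis selection by gradient descent and kernel matching pursuit is not reproduced here, and in the paper's setting each input vector would be an image patch rather than a single number.

```python
import math


def gaussian_kernel(u, v, sigma=1.0):
    """Gaussian (RBF) kernel between two feature vectors."""
    sq = sum((a - b) ** 2 for a, b in zip(u, v))
    return math.exp(-sq / (2.0 * sigma ** 2))


def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in range(n - 1, -1, -1):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def krr_fit(X, y, lam=1e-3, sigma=1.0):
    """Return the dual weights alpha solving (K + lam * I) alpha = y."""
    n = len(X)
    K = [[gaussian_kernel(X[i], X[j], sigma) + (lam if i == j else 0.0)
          for j in range(n)] for i in range(n)]
    return solve_linear(K, y)


def krr_predict(X_train, alpha, x, sigma=1.0):
    """Predict the target for a new feature vector x."""
    return sum(a * gaussian_kernel(xt, x, sigma)
               for a, xt in zip(alpha, X_train))
```

A larger `lam` smooths the fit at the cost of reproducing the training targets less exactly.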

    5. Combining candidate images to produce a single image

      We produce a single image as a convex combination of the candidate images. The sparse KRR (kernel ridge regression) solution corresponds to the MAP estimate with a sparse GP (Gaussian process) prior [8]. A GP is a learning technique that seeks to predict the value of an unknown function at any valid input, given a set of input/output examples. Using a GP, the combiner and the regressors are produced simultaneously so that the error measure is minimized [9]. We produce a set of linear regressors (candidate images) trained such that, for each location $(x, y)$, we receive a patch of the output image $Z(N_l(x, y), :)$ and also a set of estimated differences

      $$\{d_1(x, y), \dots, d_n(x, y)\}$$

      where $d_i(x, y)$ is the difference estimated at location $(x, y)$ for the $i$-th candidate image. The final estimate of each pixel is obtained as a convex combination of the candidates:

      $$Y(x, y) = \sum_{i} W_i(x, y)\, Z(x, y, i)$$

      where

      $$W_i(x, y) = \frac{\exp(M_i)}{\sum_{j} \exp(M_j)}, \qquad M_i = -\frac{|d_i(x, y)|}{\sigma_C}$$

      The hyper-parameters are chosen based on the error rate of SR for a set of images. To combine two image values, the distance is taken as

      $$D([x_i, y_i], [x_j, y_j]) = \|x_i - x_j\|^2 + \frac{\sigma_x}{\sigma_y}\,\|y_i - y_j\|^2$$

      where $\sigma_x$ and $\sigma_y$ are the variances of the distances between the pairs of training data points in $x$ and $y$ respectively [10].

      Figure 1.5.1. Combining candidate images
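The convex combination of candidates at a single pixel can be sketched as below. The sketch assumes that each weight decays exponentially with the magnitude of the estimated difference and that the weights are normalized to sum to one; the bandwidth name `sigma_c` and the function name `combine_candidates` are hypothetical.

```python
import math


def combine_candidates(candidates, differences, sigma_c=1.0):
    """Combine candidate pixel values into one estimate.

    candidates  -- candidate values Z(x, y, i) at one pixel location
    differences -- estimated differences d_i(x, y) for the same candidates
    sigma_c     -- assumed bandwidth hyper-parameter controlling the decay
    """
    scores = [math.exp(-abs(d) / sigma_c) for d in differences]
    total = sum(scores)
    weights = [s / total for s in scores]  # non-negative and summing to one
    value = sum(w * z for w, z in zip(weights, candidates))
    return value, weights


value, weights = combine_candidates([10.0, 20.0], [0.1, 2.0])
```

Because the weights are non-negative and sum to one, the combined value always lies between the smallest and largest candidate, and candidates with a smaller estimated difference contribute more.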

    6. Post-processing

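Post-processing is based on an image prior, here a flexible high-order MRF. We use a modification of the NIP (natural image prior) framework proposed by Tappen et al. [12]:

$$P(\{x\} \mid \{y\}) = \frac{1}{C} \prod_{(j,\, i \in N_S(j))} \exp\!\left[-\left(\frac{|x_j - x_i|}{\sigma_N}\right)^{\alpha}\right] \cdot \prod_{j} \exp\!\left[-\left(\frac{x_j - y_j}{\sigma_R}\right)^{2}\right]$$

where $\{y\}$ are the observed variables corresponding to the pixel values of $y$, $\{x\}$ are the latent variables, $N_S(j)$ is the set of 8-connected neighbours of pixel location $j$, $C$ is a normalization constant, and $\sigma_N$, $\sigma_R$, and $\alpha$ are parameters of the prior. The second product term is a deviation penalty that prevents the output image from flowing far away from the input $y$, the regression-based SR result. The factor graph representation is given below.

Figure 1.6.1. a. NIP term, and b. Deviation penalty term.

The input is blurred to remove its very high spatial-frequency components. The major edges are then found by thresholding each pixel based on the Laplacian and the range of pixel values in its local patch [13]. Applying the post-processing to our sample gives the following output.

Figure 1.6.2. Final output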
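The edge-selection step described above can be sketched as follows. This is only one plausible reading: a pixel is marked as a major edge when both the magnitude of its 4-neighbour Laplacian and the range of values in its 3x3 patch exceed a threshold. The patch size, the two threshold parameters, and the function name `major_edges` are assumptions, not values from the paper.

```python
def major_edges(img, lap_thresh, range_thresh):
    """Mark interior pixels with a strong Laplacian and a wide local range.

    `img` is a list of rows of grey values; border pixels are never marked.
    """
    h, w = len(img), len(img[0])
    edges = [[False] * w for _ in range(h)]
    for y in range(1, h - 1):
        for x in range(1, w - 1):
            # Discrete 4-neighbour Laplacian at (x, y).
            lap = (img[y - 1][x] + img[y + 1][x]
                   + img[y][x - 1] + img[y][x + 1] - 4 * img[y][x])
            # Range of pixel values in the 3x3 patch around (x, y).
            patch = [img[yy][xx]
                     for yy in (y - 1, y, y + 1)
                     for xx in (x - 1, x, x + 1)]
            local_range = max(patch) - min(patch)
            if abs(lap) > lap_thresh and local_range > range_thresh:
                edges[y][x] = True
    return edges


# Toy image with a vertical step edge between columns 1 and 2.
step = [[0, 0, 100, 100, 100] for _ in range(5)]
edge_map = major_edges(step, lap_thresh=50, range_thresh=50)
```

On the toy step image, the pixels on either side of the step are marked, while pixels in the flat region are not.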

Conclusion

The super-resolution of a cropped scene from a video has been performed. Super-resolution is carried out using interpolation, then regression (where we generate a set of candidate images), then combination of the candidate images produced by regression, and finally post-processing to obtain a super-resolved image as output. Future work may extend this paper to object detection and traffic surveillance in multiple directions and from multiple angles.

References

[1] R. Y. Tsai and T. S. Huang, "Multiframe image restoration and registration," in Advances in Computer Vision and Image Processing. Greenwich, CT: JAI Press, 1984, vol. 1, pp. 317-339.

[2] Jing Tian and Kai-Kuang Ma, "A survey on super-resolution imaging," Signal, Image and Video Processing, Springer, vol. 5, no. 3, pp. 329-342, September 2011.

[3] E. C. Pasztor, W. T. Freeman, and O. T. Carmichael, "Learning low-level vision," International Journal of Computer Vision, 2000.

[4] Mei Wang, Yanhua Che, Yongling Chu, and ShaoChun Li, "Research on an image magnifying algorithm based on cubic spline interpolation," EMEIT, 2011 International Conference.

[5] L. Carin, G. J. Dobeck, J. R. Stack, and Xuejun Liao, "Kernel matching," IEEE Transactions on Neural Networks.

    Wang Xiaoqin, Wu Bo, and Huang Bo, "Application of adaptive-kernel matching pursuit to estimate mixture-pixel proportion," Image & Graphics, 2007.

[6] Vlad Popovici, Samy Bengio, and Jean-Philippe Thiran, "Kernel matching pursuit for large datasets," EPFL, Signal Processing Institute.

[7] Y. Kwon and K. I. Kim, "Single image super-resolution using sparse regression and natural image prior," IEEE Transactions on Pattern Analysis and Machine Intelligence, 2010.

[8] M. F. Tappen, W. T. Freeman, and B. C. Russell, "Exploiting the sparse derivative prior for super-resolution and image demosaicing," IEEE Workshop on Statistical and Computational Theories of Vision, 2003.

[9] Z. Ghahramani and E. Snelson, "Sparse Gaussian processes using pseudo-inputs," in Advances in Neural Information Processing Systems, Cambridge, MA, 2006.

[10] R. A. Jacobs and I. Jordan, "Hierarchical mixtures of experts and the EM algorithm," Neural Computation, 1994.

[11] B. C. Russell, W. T. Freeman, and M. F. Tappen, "Exploiting the sparse derivative prior for super-resolution and image demosaicing," IEEE, 2003.

[12] Kwang In Kim and Younghee Kwon, "Example-based learning for single-image super-resolution."

[13] Deqing Sun and Wai-Kuen Cham, "Postprocessing of low bit-rate block DCT coded images based on a fields of experts prior," IEEE Transactions on Image Processing, Nov. 2007.

[14] Y. Gong, S. Dai, Y. Wu, and M. Han, "Bilateral back-projection for single image super-resolution," IEEE International Conference on Multimedia and Expo, 2007.
