A High Speed 2-D DWT Architecture using 9/7 Lifting Scheme for Image Compression

DOI : 10.17577/IJERTV4IS040952


Prof. Chandrashekhara

Dept. of Electronics and Communication Dayananda Sagar College of Engineering Bangalore, India

Raghuraja S Bhat

Dept. of Electronics and Communication Dayananda Sagar College of Engineering Bangalore, India

Abstract: Real-time implementation of compression techniques plays an important role in optimizing performance parameters such as speed and area. In this paper, we propose an FPGA implementation of a high-speed 2-D DWT using the 9/7 lifting scheme for image compression. The 1-D DWT core architecture is implemented using signed multipliers, which are required to represent the floating-point coefficient values of the 9/7 lifting scheme. The proposed 2-D DWT architecture is built from two 1-D DWT cores, a memory unit and a control unit. It is applied to 2-D image compression and synthesized for the Virtex-5 xc5vlx110t-3ff1136 device. The design achieves an operating frequency of 232.823 MHz, which is higher than that of the existing architectures it is compared with.

Keywords: Discrete Wavelet Transform (DWT); Lifting Schemes; FPGA; LUTs; Image Compression.

  1. INTRODUCTION

    Image compression plays a vital role in reducing the bandwidth required for real-time data transmission. The Discrete Wavelet Transform (DWT) is regarded as a major tool for image compression because it has many useful properties, such as symmetric transforms, integer-to-integer transforms, in-place computation, and progressive image transmission by resolution [1]. The DWT matches the human visual system well, which is why it has been accepted in the JPEG 2000 standard and adopted as the transform coder in MPEG-4 still texture coding. The conventional filter-bank implementation of the 2-D DWT demands far more computation than the Discrete Cosine Transform (DCT), and therefore more silicon area and power. Hence Daubechies and Sweldens [2] suggested the lifting-based scheme, which reduces the computation and increases the speed compared to classical convolution.

    The DWT has been widely implemented in very-large-scale integration (VLSI) to meet real-time specifications, and many VLSI architectures based on the lifting scheme have been proposed. Sugreev Kaur et al. [3] proposed a pipelined, partially serial architecture that enhances speed while making optimal use of the resources available on the target FPGA. This design operates at a maximum frequency of 231 MHz on a Spartan 3 FPGA and consumes 117 mW at a junction temperature of 28 °C. Naseer et al. [4] proposed an architecture based on the lifting scheme using the 5/3 wavelet filter, which reduces the hardware complexity and the size of the on-chip memory. This architecture consists of a control unit, a processor unit, two on-chip internal memories to speed up system operations, and an on-board off-chip external memory (Intel Strata parallel NOR flash PROM). It operates at a maximum speed of 62.767 MHz on a Spartan 3E FPGA. Eswara Reddy and Venkata Narayana [5] proposed a technique to compress test images using the Set Partitioning In Hierarchical Trees (SPIHT) algorithm together with lifting concepts. These algorithms offer practical advantages such as superior low-bit-rate performance, bit-level compression, and progressive transmission by pixel, accuracy and resolution. Hasna et al. [6] proposed a highly pipelined and distributed VLSI architecture for the lifting-based 2-D DWT with lifting coefficients represented in fixed-point [2:14] format. Compared to conventional architectures, it achieves high speed at the expense of more hardware area. Chenchu Krishnaiah et al. [7] discussed the performance of the 9/7 and 5/3 wavelets on photographic images (monochrome and color). Li Bao-Feng et al. [8] proposed a parallel architecture for the 2-D DWT with two row processors and two column processors. It needs 3N+2 buffers to store intermediate data and operates at a maximum of 145.54 MHz on an Altera Stratix II FPGA. Sivachandra Mahalingam et al. [9] presented performance results for orthogonal and bi-orthogonal wavelets using both periodic and symmetric extension techniques, and demonstrated the importance of linear-phase filters for image compression performance. Anand Darji et al. [10] proposed an architecture with two pipelined 1-D architectures and a transpose unit; the design consumes little power and area. Sanjay et al. [11] proposed a design that locally adapts the filtering direction to the image content, based on directional lifting with SPIHT. Yamini et al. [12] proposed Distributed Arithmetic (DA) for the DWT. Overall, these architectures range from highly memory-efficient to low-latency to highly parallel.

    Contribution:

    In this paper, a VLSI architecture for a high-speed 2-D DWT using the 9/7 lifting scheme for image compression is proposed. A 1-D DWT core, a control unit and two memory modules are designed and combined to obtain the 2-D DWT.

    The paper is organized as follows. Section 2 discusses the concept of the 9/7 lifting DWT. Section 3 presents the proposed architecture for the 9/7 lifting 2-D DWT. Results and discussion are given in Section 4. Finally, a brief conclusion is drawn in Section 5.

  2. 9/7 LIFTING SCHEME

    Computing the DWT by convolution is slow and requires a large memory area, so most recent architectures use the lifting-based DWT for the same computation. The 9/7 lifting scheme [3] has three steps: Split, Predict-Update and Scaling.

    1. Split

      The input data samples are divided into even and odd samples.

      (1)

      (2)

    2. Predict-Update

      Each odd sample is predicted from its two neighbouring even samples, which yields a detail coefficient. Each average coefficient is then updated using two neighbouring detail coefficients. D(i) and S(i) denote the detail and average coefficients respectively. The 9/7 scheme applies this predict-update pair twice.

      Predictor1:

      (3)

      Updater1:

      (4)

      Predictor2:

      (5)

      Updater2:

      (6)

      Where,

    3. Scaling

    The low-pass and high-pass coefficients must be normalized before being passed to the next stage. Scaling performs this normalization, which reduces the hardware requirements, as given in equations (7) and (8) respectively.

    (7)

    (8)
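For orientation, the 9/7 lifting steps are commonly written in the form below, which follows the factorization in [2]. The notation is an assumption made for illustration: x is the input sequence, the superscript on D and S counts the lifting stage, and α, β, γ, δ and ζ are the constants usually quoted for this factorization. The scaling convention in particular varies between sources, so equations (1) to (8) of this paper may use different symbols or a different normalization.

```latex
\begin{aligned}
S^{0}(i) &= x(2i), \qquad D^{0}(i) = x(2i+1) \\
D^{1}(i) &= D^{0}(i) + \alpha\,\bigl[S^{0}(i) + S^{0}(i+1)\bigr] \\
S^{1}(i) &= S^{0}(i) + \beta\,\bigl[D^{1}(i-1) + D^{1}(i)\bigr] \\
D^{2}(i) &= D^{1}(i) + \gamma\,\bigl[S^{1}(i) + S^{1}(i+1)\bigr] \\
S^{2}(i) &= S^{1}(i) + \delta\,\bigl[D^{2}(i-1) + D^{2}(i)\bigr] \\
S(i) &= \zeta\,S^{2}(i), \qquad D(i) = D^{2}(i)/\zeta \\
\alpha &\approx -1.586134342,\quad \beta \approx -0.05298011854,\quad
\gamma \approx 0.8829110762,\quad \delta \approx 0.4435068522,\quad
\zeta \approx 1.149604398
\end{aligned}
```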

    The general data flow diagram for 9/7 DWT is shown in Fig. 1. Filter coefficients are multiplied with input data samples in a pre-determined manner to get the high pass and low pass coefficients.

    Fig. 1. Data Flow Graph of 9/7 DWT
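The split, predict-update and scaling steps can be modelled in software with a minimal floating-point sketch such as the one below. It is not the authors' hardware design: the function names, the constant values (those commonly quoted for the 9/7 factorization in [2]), the scaling convention and the symmetric boundary handling are all assumptions made for illustration.

```python
# Lifting constants commonly quoted for the 9/7 wavelet (assumed values).
ALPHA = -1.586134342
BETA = -0.05298011854
GAMMA = 0.8829110762
DELTA = 0.4435068522
ZETA = 1.149604398  # scaling constant; other sources normalize differently


def lift_97_forward(x):
    """One level of 1-D 9/7 lifting. Returns (low_pass, high_pass)."""
    if len(x) < 2 or len(x) % 2:
        raise ValueError("expected an even number of samples (at least 2)")
    s = [float(v) for v in x[0::2]]  # split: even samples
    d = [float(v) for v in x[1::2]]  # split: odd samples
    m = len(s)
    for predict_c, update_c in ((ALPHA, BETA), (GAMMA, DELTA)):
        # Predict: each odd sample from its two even neighbours.
        # The index clamp models symmetric extension at the right edge.
        for i in range(m):
            d[i] += predict_c * (s[i] + s[min(i + 1, m - 1)])
        # Update: each even sample from its two detail neighbours.
        # The index clamp models symmetric extension at the left edge.
        for i in range(m):
            s[i] += update_c * (d[max(i - 1, 0)] + d[i])
    return [ZETA * v for v in s], [v / ZETA for v in d]


def lift_97_inverse(low, high):
    """Undo lift_97_forward by running the steps in reverse order."""
    s = [v / ZETA for v in low]
    d = [v * ZETA for v in high]
    m = len(s)
    for predict_c, update_c in ((GAMMA, DELTA), (ALPHA, BETA)):
        for i in range(m):
            s[i] -= update_c * (d[max(i - 1, 0)] + d[i])
        for i in range(m):
            d[i] -= predict_c * (s[i] + s[min(i + 1, m - 1)])
    x = [0.0] * (2 * m)
    x[0::2] = s
    x[1::2] = d
    return x
```

Because every lifting step only adds a function of the other half of the samples, the inverse simply subtracts the same terms in reverse order, which is why the scheme reconstructs the input regardless of the boundary rule chosen.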

  3. PROPOSED ARCHITECTURE

    1. Proposed 1-D DWT Processor Core

      The proposed 1-D DWT core architecture is shown in Fig. 2. The design is simple, using only four adders and six multipliers. Signed multipliers are used because some of the 9/7 filter coefficients are negative. After each multiplication, only the required part of the product is extracted and passed to the next stage. Like the multiplications, the additions are signed operations.

      Fig. 2. 1-D DWT core Block Diagram
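The idea of a signed multiplication followed by extracting only part of the product can be illustrated with a small fixed-point model. This is a sketch under stated assumptions: the paper does not give its coefficient word length, so a 14-bit fractional format (as in the [2:14] format mentioned for [6]) is assumed, and the function names are hypothetical.

```python
FRAC_BITS = 14  # assumed number of fractional bits; not stated by the authors


def to_fixed(value, frac_bits=FRAC_BITS):
    """Quantize a real coefficient to a signed integer with frac_bits fractional bits."""
    return int(round(value * (1 << frac_bits)))


def signed_mul_truncate(sample, coeff_fixed, frac_bits=FRAC_BITS):
    """Multiply an integer sample by a fixed-point coefficient and keep the integer part.

    The arithmetic right shift drops the fractional bits of the full-width
    product, so negative results are rounded toward negative infinity.
    """
    return (sample * coeff_fixed) >> frac_bits


# Example: one negative and one positive 9/7 lifting constant (assumed values).
alpha_fixed = to_fixed(-1.586134342)
gamma_fixed = to_fixed(0.8829110762)
print(signed_mul_truncate(100, alpha_fixed), signed_mul_truncate(100, gamma_fixed))
```

Keeping only the upper bits of the product bounds the word length passed to the next stage, at the cost of a small truncation error per multiplication.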

    2. Proposed 2-D DWT architecture

      The proposed 2-D DWT architecture is shown in Fig. 3. It uses two 1-D DWT cores, described in the previous section, to implement the 2-D DWT.

      Fig. 3. Proposed Architecture for 2-D DWT

      The input data INP from the image is fed to the 1-D DWT core through a multiplexer (Mux) with DATA_CONTROL low. The low-pass (L_OP) and high-pass (H_OP) coefficient outputs of the 1-D DWT are passed to the memory units through a de-multiplexer (Demux) and stored separately. Each memory unit has size N²/2. The 1-D DWT data stored in both memory units are then read column-wise and passed to the two 1-D DWT cores to calculate the 2-D DWT. The speed of the design is increased since all four sub-bands are calculated simultaneously.
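The row pass, the two coefficient memories and the column pass described above can be modelled in software as follows. This is a behavioural sketch only, not the hardware: the function names, the lifting constants, the boundary handling and the sub-band naming (first letter for the row filter, second for the column filter) are assumptions, and the two column passes run one after the other here rather than in parallel.

```python
ALPHA, BETA, GAMMA, DELTA, ZETA = (
    -1.586134342, -0.05298011854, 0.8829110762, 0.4435068522, 1.149604398)


def dwt97_1d(x):
    """One level of 1-D 9/7 lifting on an even-length sequence."""
    s, d = [float(v) for v in x[0::2]], [float(v) for v in x[1::2]]
    m = len(s)
    for p, u in ((ALPHA, BETA), (GAMMA, DELTA)):
        for i in range(m):  # predict: odd samples from neighbouring evens
            d[i] += p * (s[i] + s[min(i + 1, m - 1)])
        for i in range(m):  # update: even samples from neighbouring details
            s[i] += u * (d[max(i - 1, 0)] + d[i])
    return [ZETA * v for v in s], [v / ZETA for v in d]


def column_pass(mem):
    """Read a memory column by column and transform each column."""
    rows, cols = len(mem), len(mem[0])
    lo = [[0.0] * cols for _ in range(rows // 2)]
    hi = [[0.0] * cols for _ in range(rows // 2)]
    for c in range(cols):
        low, high = dwt97_1d([mem[r][c] for r in range(rows)])
        for r in range(rows // 2):
            lo[r][c], hi[r][c] = low[r], high[r]
    return lo, hi


def dwt97_2d(image):
    """One level of 2-D DWT on an N x N image (N even). Returns LL, LH, HL, HH."""
    mem_l, mem_h = [], []  # the two memories, each N x N/2 (N*N/2 words)
    for row in image:      # row pass: one 1-D transform per image row
        low, high = dwt97_1d(row)
        mem_l.append(low)
        mem_h.append(high)
    ll, lh = column_pass(mem_l)  # in hardware, handled by one 1-D core
    hl, hh = column_pass(mem_h)  # in hardware, handled by the other core
    return ll, lh, hl, hh
```

For a constant image the three detail sub-bands come out close to zero and the LL band holds the scaled average, which is a quick sanity check on the data flow.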

    3. Control Unit

    This unit schedules the operation of every module in the architecture. The detailed structure of the control unit is shown in Fig. 4. It consists of only two counters, two multiplexers and a clock divider circuit. The clock divider divides the main clock frequency by two, and its output is the input of the first counter, Counter1. Counter1 counts until it reaches N² (where N is the image size for an N×N image), and its count value is used directly as the address ADDR at which the memory module stores the 1-D DWT coefficients. The signals RST_ON, DATA_CONTROL and Z are driven high when the Counter1 value reaches N². RST_ON triggers Counter2, which starts counting; its output value is used as the address to read the data from memory column-wise in order to compute the 2-D DWT coefficients. In this way the control unit uses control signals to synchronize all the blocks in the proposed 2-D 9/7 lifting DWT architecture.

    Fig. 4. Control Unit
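The difference between the write order and the column-wise read order can be sketched as two address sequences. This models the addressing idea only, for a single memory unit of N²/2 words, and assumes that coefficients are written in arrival order with N/2 coefficients per image row; the function names are hypothetical and the counter widths and clock division of the actual control unit are not modelled.

```python
def write_addresses(n):
    """Sequential addresses used while the row pass stores its coefficients."""
    return list(range(n * n // 2))


def column_read_addresses(n):
    """Addresses that walk the stored N x N/2 block column by column."""
    width = n // 2  # coefficients stored per image row in one memory unit
    return [row * width + col for col in range(width) for row in range(n)]


# Example for a 4 x 4 image: 8 words per memory, read back in column order.
print(write_addresses(4))
print(column_read_addresses(4))
```

Both sequences visit every address exactly once; only the order differs, which is what lets the same memory serve as the transpose stage between the row and column passes.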

  4. RESULTS AND DISCUSSIONS

The proposed 9/7 lifting-based architecture is synthesized for a Xilinx Virtex-5 xc5vlx110t-3ff1136 FPGA with speed grade -3. The device utilization summary is shown in Table 1.

It is observed that the proposed architecture uses 632 slice registers (about 1% of those available) and 18% of the available slice LUTs. It requires a total memory of N². The simulation is performed in Xilinx ISE. The original uncompressed image and the compressed LL band are shown in Fig. 5 and Fig. 6 respectively. Table 2 compares the proposed architecture with existing architectures in terms of adders, multipliers and registers. Table 3 compares it with an existing architecture in terms of slice registers, LUT-FF pairs and operating frequency on Virtex-5.

Fig. 5. Original Image

Fig. 6. LL Band after 9/7 DWT

TABLE 1. HARDWARE UTILIZATION

| Logic utilization | Used | Available |
|---|---|---|
| Number of Slice registers | 632 | 69120 |
| Number of Slice LUTs | 12585 | 69120 |
| Number of LUT-FF pairs | 452 | 6435 |
| Number of Bonded IOBs | 53 | 640 |
| Number of BUFG/BUFGCTRLs | 4 | 32 |
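The utilization percentages quoted in the text follow directly from the Used and Available columns of Table 1. The short calculation below reproduces them; the function and variable names are illustrative.

```python
def utilization_percent(used, available):
    """Share of a resource that is used, in percent."""
    return 100.0 * used / available


# (used, available) pairs taken from Table 1.
table1 = {
    "Slice registers": (632, 69120),
    "Slice LUTs": (12585, 69120),
    "LUT-FF pairs": (452, 6435),
    "Bonded IOBs": (53, 640),
    "BUFG/BUFGCTRLs": (4, 32),
}

for name, (used, available) in table1.items():
    print(f"{name}: {utilization_percent(used, available):.1f}%")
```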

TABLE 2. COMPARISON OF VARIOUS ARCHITECTURES WITH PROPOSED FOR 9/7 2-D DWT

| Architecture | Adders | Multipliers | Registers |
|---|---|---|---|
| Vidyadhar Gupta et al. [14] | 8 | 6 | 0 |
| Yeong-Kang Lai et al. [13] | 16 | 10 | 4 |
| Anand Darji et al. [15] | 16 | 10 | 20 |
| Bing-Fei Wu et al. [17] | 8 | 6 | NA |
| Wei Zhang et al. [16] | 16 | 10 | 34 |
| Proposed | 8 | 12 | 2 |

TABLE 3. COMPARISON OF PROPOSED ARCHITECTURE WITH EXISTING ARCHITECTURE IN TERMS OF SLICES, LUT AND OPERATING FREQUENCY

| Architecture | LUT-FF pairs | Bonded IOB | BUFGs | Slice registers | Frequency |
|---|---|---|---|---|---|
| Nagabhushanam et al. [18] | 789 | 259 | 6 | 1152 | 180 MHz |
| Proposed | 452 | 53 | 4 | 632 | 232.823 MHz |
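The relative differences between the two rows of Table 3 can be worked out with the short calculation below. It uses only the figures in the table; the helper names are illustrative.

```python
def percent_change(new, old):
    """Relative change of new with respect to old, in percent."""
    return 100.0 * (new - old) / old


# Values from Table 3: the design of [18] versus the proposed design.
reference = {"lut_ff_pairs": 789, "slice_registers": 1152, "frequency_mhz": 180.0}
proposed = {"lut_ff_pairs": 452, "slice_registers": 632, "frequency_mhz": 232.823}

for key in reference:
    print(f"{key}: {percent_change(proposed[key], reference[key]):+.1f}%")
```

On these figures the proposed design runs at roughly 29% higher frequency while using about 43% fewer LUT-FF pairs and 45% fewer slice registers than the design of [18].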

  5. CONCLUSION

In this paper, we proposed an efficient architecture for 2-D DWT computation based on the 9/7 lifting scheme. The architecture uses two 1-D DWT cores and achieves reduced hardware resource utilization. Performance is high since all four sub-bands are calculated simultaneously. The architecture is synthesized using Xilinx ISE and targeted at a Virtex-5 xc5vlx110t FPGA, a device with about 110,000 logic cells. The results show that the proposed design operates at a maximum frequency of 232.823 MHz. The design occupies about 1% of the slice registers and 18% of the slice LUTs on the FPGA.

REFERENCES

  1. S. G. Mallat, A Theory for Multiresolution Signal Decomposition, IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 11, no. 7, pp. 674-693, July 1989.

  2. I. Daubechies and W. Sweldens, Factoring Wavelet Transforms into Lifting Schemes, J. Fourier Analysis and Applications, vol. 4, pp. 247-269, 1998.

  3. Sugreev Kaur and Rajesh Mehra, High Speed and Area Efficient 2D DWT Processor based Image Compression, Signal & Image Processing: An International Journal (SIPIJ), vol. 1, no. 2, December 2010.

  4. Naseer M. Basheer and Mustafa Mushtak Mohammed, Design and FPGA Implementation of a Lifting Scheme 2D DWT Architecture, International Journal of Recent Technology and Engineering (IJRTE), vol. 2, no. 1, March 2013.

  5. B. Eswara Reddy and K. Venkata Narayana, A Lossless Image Compression Using Traditional and Lifting Based Wavelets, Signal & Image Processing: An International Journal (SIPIJ), vol. 3, no. 2, April 2012.

  6. A. Hasna Jayaraj and U. Kidavu, Design and Implementation of Lifting Based 2-D Discrete Wavelet Transforming FPGA, International Journal of Advanced Information and Communication Technology (IJAICT), vol. 1, no. 3, July 2014.

  7. G. Chenchu Krishnaiah et al., Efficient Image Compression Algorithms Using Evolved Wavelets, International Journal of Systems and Technologies, vol. 4, no. 2, pp. 127-146, 2011.

  8. Li Bao Feng, Dou Yong and Shao Qiang, Deeply Parallel Architecture for Lifting based 2D DWT in JPEG 2000, The Sixth IEEE International Conference on Computer and Information Technology, pp. 178, September 2006.

  9. B. Sivachandra Mahalingam, Pranav Priyadarshi Prince, and Ganga Shankar Kumar, Novel Bi-Orthogonal Filter Coefficient Wavelet Transform for Image Compression, International Journal of Advanced Research in Electrical, Electronics and Instrumentation Engineering, vol. 3, no. 3, April 2014.

  10. Anand Darji, S. N. Merchant and A. N. Chandorkar, Efficient Pipelined VLSI Architecture with Dual Scanning Method for 2D Lifting Based Discrete Wavelet Transform, IEEE International Symposium on Integrated Circuits, pp. 329-331, 2011.

  11. Sanjay H. Dhabole, Virajit A., and Johan, Efficient Lossy Image Compression using Adaptive Directional Lifting based 9/7 Wavelet, International Journal of Emerging Technologies in Computational and Applied Science (IJCSN), pp. 5-11, 2012.

  12. Yamini S. Bute and R. W. Jasutkar, Implementation of Discrete Wavelet Transform Processor for Image Compression, International Journal of Computer Science and Network (IJCSN), vol. 1, no. 3, June 2012.

  13. Yeong-Kang Lai, Lien-Fei Chen, and Yui-Chih Shih, A High Performance and Memory Efficient VLSI Architecture with Parallel Scanning Method for 2-D Lifting Based Discrete Wavelet Transform, IEEE Transactions on Consumer Electronics, vol. 55, no. 2, pp. 400-407, 2009.

  14. Vidyadhar Gupta and Krishna Raj, An Efficient Modified Lifting Based 2-D Discrete Wavelet Transform Architecture, First International Conference on Recent Advances in Information Technology, pp. 832-837, March 2012.

  15. Anand Darji, Shubham Agarwal, Ankit Oza, Vipul Sinha, Aditya Verma, S. N. Merchant and A. N. Chandorkar, Dual Scan Parallel Flipping Architecture for Lifting-Based 2-D DWT, IEEE Transactions on Circuits and Systems-II, vol. 61, no. 6, pp. 433-437, June 2014.

  16. Wei Zhang, Zhe Jiang, Zhiyu Gao, and Yanyan Liu, An Efficient VLSI Architecture for Lifting Based Discrete Wavelet Transform, IEEE Transactions on Circuits and Systems-II, vol. 59, no. 3, March 2012.

  17. Bing-Fei Wu and Chung-Fu Lin, A High Performance and Memory Efficient Pipeline Architecture for the 5/3 and 9/7 Discrete Wavelet Transform of JPEG 2000 Codec, IEEE Transactions on Circuits and Systems for Video Technology, vol. 15, no. 12, pp. 1615-1628, December 2005.

  18. M. Nagabhushanam, P. Kumar, and S. Ramachandran, FPGA Implementation of 1D and 2D DWT Architecture using Modified Lifting Scheme, WSEAS Transactions on Signal Processing, vol. 9, issue 4, October 2013.
