Pipeline Architecture of 2D DCT for High Efficiency Video Coding

DOI : 10.17577/IJERTV6IS050522


Dhanya R

Department of Electronics and Communication Engineering

Musaliar College of Engineering and Technology Pathanamthitta, Kerala, India

Nishi G Nampoothiri

Assoc. Prof.

Department of Electronics and Communication Engineering

Musaliar College of Engineering and Technology Pathanamthitta, Kerala, India

Abstract: In this paper, a novel computation and energy reduction technique for the High Efficiency Video Coding (HEVC) Discrete Cosine Transform (DCT) for all Transform Unit (TU) sizes is proposed. The existing system reduces the computational complexity of HEVC DCT significantly, at the expense of a slight decrease in PSNR and a slight increase in bit rate, by calculating only several pre-determined low-frequency coefficients of each TU and assuming that the remaining coefficients are zero. It reduced the execution time of the HEVC HM software encoder by up to 12.74%, and the execution time of the DCT operations in the HEVC HM software encoder by up to 37.27%. Video codecs currently use different transform techniques to achieve data compression during video frame transmission. Among them, the DCT is supported by most modern video standards. The integer DCT is an approximation of the DCT that can be implemented exclusively with integer arithmetic, which makes it highly advantageous in cost and speed for hardware implementation. The aim is an efficient DCT implementation with reduced complexity and fewer multiplications. Pipelining is introduced to reduce the processing time, and a fully pipelined, variable-block-size transform engine with efficient hardware utilization is proposed to handle the DCT/IDCT. The 2D DCT is computed by combining two 1D DCTs connected by a transpose stage. The proposed system therefore uses a pipeline architecture to reduce the computational complexity of HEVC relative to the existing system. A low-energy HEVC 2D DCT hardware for all TU sizes is also designed and implemented in Verilog HDL. In the worst case, the proposed hardware can process 53 Ultra HD (7680×4320) video frames per second. The proposed technique reduced the energy consumption of this hardware by up to 18.9%. Therefore, it can be used in portable consumer electronics products that require a real-time HEVC encoder.

Index Terms: HEVC, Discrete Cosine Transform, Hardware Implementation, FPGA, Energy Reduction.

  1. INTRODUCTION

    A new international video compression standard called High Efficiency Video Coding (HEVC) has recently been developed [1]-[6]. It achieves 50% better video compression efficiency than the H.264 standard. Like H.264, it uses the Discrete Cosine Transform (DCT) / Inverse Discrete Cosine Transform (IDCT). However, H.264 uses only 4×4 and 8×8 Transform Unit (TU) sizes for DCT/IDCT, whereas HEVC uses 4×4, 8×8, 16×16, and 32×32 TU sizes. Larger TU sizes achieve better energy compaction, but they increase the computational complexity significantly. In addition, HEVC uses the Discrete Sine Transform (DST) / Inverse Discrete Sine Transform (IDST) for 4×4 intra prediction in certain cases.

    Transform operations (DCT/IDCT and DST/IDST) are heavily used in an HEVC encoder [7] and have high computational complexity. DCT and DST operations account for 11% of the computational complexity of an HEVC video encoder, and for 25% of the computational complexity of an all-intra HEVC video encoder.

    In this paper, a low-energy HEVC 2D DCT hardware for all TU sizes is designed and implemented using Verilog HDL. The proposed hardware calculates 4, 8, 16 and 32 DCT coefficients per clock cycle for 4×4, 8×8, 16×16 and 32×32 TU sizes, respectively. In the worst case, it can process 48 Quad Full HD (3840×2160) video frames per second. A second low-energy HEVC 2D DCT hardware for all TU sizes, with higher hardware utilization, is also designed and implemented using Verilog HDL.

    Clock gating is used to reduce the energy consumption of both hardware designs. The Hcub Multiplierless Constant Multiplication (MCM) algorithm [9] is used to reduce the number and size of the adders in both designs. The Hcub MCM algorithm reduced the energy consumption of the lower utilization (LU) hardware and the higher utilization (HU) hardware by up to 5.9% and 13.1%, respectively. Finally, the proposed technique is applied to both designs. It further reduced the energy consumption of the LU hardware and the HU hardware by up to 17.9% and 18.9%, respectively.

  2. PROPOSED COMPUTATION AND ENERGY REDUCTION TECHNIQUE

    After forward transform and quantization, most of the forward transformed and quantized high frequency coefficients in a TU become zero. In addition, if the values of non-zero forward transformed and quantized low frequency coefficients in a TU are small, they have small impact on the inverse quantized and inverse transformed TU. Therefore, the proposed technique only calculates several pre-determined low frequency coefficients of TUs, and it assumes that the remaining coefficients are zero.
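The idea of computing only a low-frequency subset can be illustrated with a minimal Python sketch. It uses the standard HEVC 4-point integer transform matrix; the function name `dct2d_pruned`, the `keep` parameter, and the choice of a square top-left corner are assumptions made for illustration, since the text does not list the actual coefficient sets. Intermediate scaling is omitted.

```python
# HEVC 4-point core transform matrix (integer approximation of the DCT).
C4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]

def dct2d_pruned(block, keep):
    """Compute C4 * block * C4^T for the top-left keep x keep coefficients only.

    Every coefficient outside that corner is assumed to be zero and is
    never calculated. `keep` is a hypothetical stand-in for a coefficient set.
    """
    n = len(block)
    out = [[0] * n for _ in range(n)]
    for u in range(keep):          # vertical frequency index
        for v in range(keep):      # horizontal frequency index
            acc = 0
            for i in range(n):
                for j in range(n):
                    acc += C4[u][i] * block[i][j] * C4[v][j]
            out[u][v] = acc
    return out

block = [
    [10, 12, 11, 13],
    [11, 12, 12, 14],
    [10, 13, 12, 13],
    [12, 12, 11, 14],
]
full = dct2d_pruned(block, keep=4)    # all 16 coefficients
pruned = dct2d_pruned(block, keep=2)  # only the 2x2 low-frequency corner
```

With `keep=2`, only 4 of the 16 coefficients are computed, and those 4 are identical to the corresponding entries of the full transform.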

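    TABLE I Addition and Shift Reductions for All TU Sizes

    | TU size | Operation | Org.   | C. Set #1 | Red. (%) | C. Set #2 | Red. (%) | C. Set #3 | Red. (%) |
    |---------|-----------|--------|-----------|----------|-----------|----------|-----------|----------|
    | 4×4     | Add       | 224    | 84        | 62.5     | 147       | 34.4     |           |          |
    | 4×4     | Shift     | 224    | 84        | 62.5     | 147       | 34.4     |           |          |
    | 8×8     | Add       | 2560   | 960       | 62.5     | 600       | 74.2     | 600       | 74.2     |
    | 8×8     | Shift     | 2304   | 864       | 62.5     | 594       | 74.2     | 594       | 74.2     |
    | 16×16   | Add       | 20992  | 7872      | 62.5     | 13776     | 34.4     | 5412      | 74.2     |
    | 16×16   | Shift     | 16896  | 6336      | 62.5     | 11088     | 34.4     | 4356      | 74.2     |
    | 32×32   | Add       | 182272 | 68352     | 62.5     | 46992     | 74.2     | 46992     | 74.2     |
    | 32×32   | Shift     | 153600 | 57600     | 62.5     | 39600     | 74.2     | 39600     | 74.2     |
    | Average |           |        |           | 62.5     |           | 54.3     |           | 74.2     |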

    Table I shows the number of addition and shift operations required for calculating all DCT coefficients in a TU (Original) and for calculating only the pre-determined DCT coefficients in a TU for three different DCT coefficient sets. Calculating only the pre-determined DCT coefficients in a TU significantly reduces the number of addition and shift operations.
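The reduction percentages in the table follow directly from the operation counts. The short Python sketch below recomputes a few of them; the function name `reduction_percent` is an illustrative choice, and the counts are copied from the 4×4 and 16×16 rows of the table.

```python
def reduction_percent(original_ops, reduced_ops):
    """Percentage of operations saved relative to the full transform."""
    return 100.0 * (original_ops - reduced_ops) / original_ops

# (TU size, operation, coefficient set) -> (original count, reduced count)
rows = {
    ("4x4", "add", "set 1"): (224, 84),
    ("4x4", "add", "set 2"): (224, 147),
    ("16x16", "add", "set 3"): (20992, 5412),
    ("16x16", "shift", "set 2"): (16896, 11088),
}

for key, (original, reduced) in rows.items():
    print(key, round(reduction_percent(original, reduced), 1))
```

The four printed values are 62.5, 34.4, 74.2 and 34.4, matching the corresponding table entries.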

    The proposed technique is integrated into the DCT operations performed by the HEVC HM software encoder [8]. The pre-determined DCT coefficients are determined experimentally, using the HEVC HM software encoder, to achieve a large computation reduction with a slight decrease in PSNR and a slight increase in bit rate. The impact of the proposed technique on execution time and rate-distortion performance is determined for three different DCT coefficient sets on a workstation with a 3.33 GHz dual-core processor and 64 GB DRAM for the People on Street, Traffic (2560×1600), Tennis, Kimono, Basketball Drive, Park Scene (1920×1080), Vidyo1, Vidyo4, Kristen and Sara, Four People (1280×720), Keiba, Party Scene, Race Horses, and Basketball Drill (832×480) videos [19].

    The first 10 frames of all video sequences are coded with the all intra (AI), low delay P (LP) (IPPPP) and random access (RA) (IBBBB) test configurations and with quantization parameters (QP) 22, 27, 32 and 37 using the HEVC HM software encoder [8], with and without the proposed technique, and BD-Rate and BD-PSNR values [20] are calculated. The computation reduction comes at the expense of a slight decrease in PSNR and a slight increase in bit rate. It is used in the mode decision stage of an HEVC encoder.

    EXISTING SYSTEM

    The existing system reduces the computational complexity of HEVC DCT significantly, at the expense of a slight decrease in PSNR and a slight increase in bit rate, by calculating only several pre-determined low-frequency coefficients of each TU and assuming that the remaining coefficients are zero.

    Fig. 2. Existing HEVC 2D DCT lower utilization hardware

    It reduced the execution time of the HEVC HM software encoder by up to 12.74%, and the execution time of the DCT operations in the HEVC HM software encoder by up to 37.27%. Video codecs currently use different transform techniques to achieve data compression during video frame transmission. Among them, the DCT is supported by most modern video standards. The integer DCT is an approximation of the DCT that can be implemented exclusively with integer arithmetic, which makes it highly advantageous in cost and speed for hardware implementation. The aim is an efficient DCT implementation with reduced complexity and fewer multiplications. A low-energy HEVC 2D DCT hardware for all TU sizes is also designed and implemented in Verilog HDL. In the worst case, this hardware can process 53 Ultra HD (7680×4320) video frames per second. The technique reduced the energy consumption of this hardware by up to 18.9%. Therefore, it can be used in portable consumer electronics products that require a real-time HEVC encoder.

    PROPOSED SYSTEM

    Fig. 3. Proposed HEVC 2D DCT lower utilization hardware

    Video codecs currently use different transform techniques to achieve data compression during video frame transmission. Among them, the DCT is supported by most modern video standards. The integer DCT is an approximation of the DCT that can be implemented exclusively with integer arithmetic, which makes it highly advantageous in cost and speed for hardware implementation. The aim is an efficient DCT implementation with reduced complexity and fewer multiplications. Pipelining is introduced to reduce the processing time, and a fully pipelined, variable-block-size transform engine with efficient hardware utilization is proposed to handle the DCT/IDCT. The 2D DCT is computed by combining two 1D DCTs connected by a transpose stage. The proposed system therefore uses a pipeline architecture to reduce the computational complexity of HEVC relative to the existing system.
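The benefit of pipelining the two 1D DCT stages can be estimated with a simple cycle-count model. This is only a sketch under stated assumptions: the function names are illustrative, each stage is assumed to take a fixed number of cycles per TU, and the transpose stage is assumed to let the column pass of one TU overlap with the row pass of the previous one. The text does not give these cycle counts.

```python
def cycles_sequential(num_blocks, stage_cycles):
    """Each block passes through every stage before the next block starts."""
    return num_blocks * sum(stage_cycles)

def cycles_pipelined(num_blocks, stage_cycles):
    """Stages overlap on consecutive blocks; the slowest stage sets the pace."""
    slowest = max(stage_cycles)
    return sum(stage_cycles) + (num_blocks - 1) * slowest

# Hypothetical example: 100 TUs of size 32x32, one column (or row) per cycle,
# so each 1D pass is assumed to take 32 cycles.
stages = [32, 32]  # [column DCT, row DCT]
print(cycles_sequential(100, stages))  # 6400
print(cycles_pipelined(100, stages))   # 3232
```

Under these assumptions the pipelined engine approaches twice the throughput of the sequential one as the number of TUs grows.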

  3. PROPOSED HEVC 2D DCT HARDWARE

    1. Proposed HEVC 2D DCT Lower Utilization Hardware

      The proposed HEVC 2D DCT lower utilization (LU) hardware for all TU sizes, including clock gating, the Hcub MCM algorithm, and the proposed technique with coefficient set 3, is shown in Fig. 3. An input splitter is used to select the proper DCT inputs for each TU size. Output multiplexers are used to select the proper DCT outputs for each TU size. Column and row clip modules are used to scale the outputs of the 1D column DCT and the 1D row DCT to 16 bits, respectively. Column clip shifts the 1D column DCT outputs right by 1, 2, 3 and 4 for 4×4, 8×8, 16×16 and 32×32 TU sizes, respectively. Row clip shifts the 1D row DCT outputs right by 8, 9, 10 and 11 for 4×4, 8×8, 16×16 and 32×32 TU sizes, respectively.
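The shift amounts listed above follow a simple pattern in the TU size N: log2(N) − 1 for the column stage and log2(N) + 6 for the row stage. The Python sketch below encodes that pattern. The function names are illustrative, and the plain arithmetic shift followed by saturation to 16 bits is an assumption; the text does not say whether a rounding offset is added before the shift.

```python
INT16_MIN, INT16_MAX = -(1 << 15), (1 << 15) - 1

def log2_size(n):
    """log2 of a power-of-two TU size (4, 8, 16 or 32)."""
    return n.bit_length() - 1

def column_shift(n):
    return log2_size(n) - 1   # 1, 2, 3, 4 for N = 4, 8, 16, 32

def row_shift(n):
    return log2_size(n) + 6   # 8, 9, 10, 11 for N = 4, 8, 16, 32

def clip16(value, shift):
    """Shift right, then saturate to the signed 16-bit range."""
    shifted = value >> shift  # arithmetic shift; rounding offset not modeled
    return max(INT16_MIN, min(INT16_MAX, shifted))

for n in (4, 8, 16, 32):
    print(n, column_shift(n), row_shift(n))
```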

      Since the HEVC DCT algorithm allows an N-point 1D DCT to be performed as two N/2-point 1D DCTs with some preprocessing, the proposed hardware performs N-point 1D DCT transforms by performing two N/2-point 1D DCT transforms with an efficient butterfly structure. It performs the 2D DCT by first performing a 1D DCT on the columns of a TU, and then performing a 1D DCT on the rows of the TU. After the 1D column DCT, the resulting coefficients are stored in a transpose memory, and they are used as input for the 1D row DCT.
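The decomposition described above can be checked on the smallest case. The following Python sketch, offered as an illustration rather than the authors' datapath, computes the HEVC 4-point transform both directly and with an even/odd butterfly, then builds the 2D transform from a column pass, a transpose, and a row pass. The function names are assumptions, and the clip and scaling stages are omitted.

```python
# HEVC 4-point core transform matrix.
C4 = [
    [64, 64, 64, 64],
    [83, 36, -36, -83],
    [64, -64, -64, 64],
    [36, -83, 83, -36],
]

def dct4_direct(x):
    """Plain matrix-vector product: 16 multiplications."""
    return [sum(c * v for c, v in zip(row, x)) for row in C4]

def dct4_butterfly(x):
    """Even/odd split: sums feed the even outputs, differences the odd ones."""
    e0, e1 = x[0] + x[3], x[1] + x[2]   # preprocessing (butterfly)
    o0, o1 = x[0] - x[3], x[1] - x[2]
    return [
        64 * (e0 + e1),        # output 0 (even part, 2-point transform)
        83 * o0 + 36 * o1,     # output 1 (odd part)
        64 * (e0 - e1),        # output 2 (even part)
        36 * o0 - 83 * o1,     # output 3 (odd part)
    ]

def transpose(m):
    return [list(r) for r in zip(*m)]

def dct2d(block):
    """1D DCT on every column, transpose, then 1D DCT on every row."""
    column_pass = [dct4_butterfly(col) for col in transpose(block)]
    intermediate = transpose(column_pass)   # role of the transpose memory
    return [dct4_butterfly(row) for row in intermediate]

print(dct4_direct([1, 2, 3, 4]))     # [640, -285, 0, -25]
print(dct4_butterfly([1, 2, 3, 4]))  # [640, -285, 0, -25]
```

The butterfly version needs 6 multiplications instead of 16 for the same result, which is the saving the larger 8-, 16- and 32-point stages exploit recursively.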

      The butterfly structure used for column transforms is shown in Fig. 4. For 4×4 TUs, only the 4×4 butterfly operation is used. For 8×8 TUs, the 8×8 and 4×4 butterfly operations are used. For 16×16 TUs, the 16×16, 8×8 and 4×4 butterfly operations are used. For 32×32 TUs, all butterfly operations (32×32, 16×16, 8×8, 4×4) are used.

      One 4×4 datapath is used for the 4×4 TU size. Two 4×4 datapaths are used for the 8×8 TU size. Two 4×4 datapaths and one 8×8 datapath are used for the 16×16 TU size. All datapaths (two 4×4, one 8×8 and one 16×16) are used for the 32×32 TU size. To reduce the power consumption of the proposed hardware, data gating is used for the inputs of the 4×4, 8×8 and 16×16 column and row datapaths. The inputs of these datapaths are stored in registers. If a datapath is not used for a TU, its input registers are not updated. This prevents unnecessary switching activity in that datapath.
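The data gating described above can be modeled roughly as follows. This is a behavioral Python sketch, not the authors' Verilog; the class name `GatedRegister`, the datapath labels, and the toggle counter used as a stand-in for switching activity are all illustrative assumptions.

```python
class GatedRegister:
    """Input register that captures new data only when its datapath is enabled."""

    def __init__(self, width=16):
        self.mask = (1 << width) - 1
        self.value = 0
        self.toggles = 0  # bit flips seen by the downstream logic

    def clock(self, data, enable):
        if enable:
            new_value = data & self.mask
            self.toggles += bin(self.value ^ new_value).count("1")
            self.value = new_value
        return self.value  # unchanged when gated, so no downstream switching

def datapath_enables(tu_size):
    """Datapaths used by the LU hardware for each TU size, as listed above."""
    return {
        "4x4_first": True,
        "4x4_second": tu_size >= 8,
        "8x8": tu_size >= 16,
        "16x16": tu_size == 32,
    }

enables = datapath_enables(8)
reg_16x16 = GatedRegister()
for sample in (0xFFFF, 0x0000, 0xAAAA):
    reg_16x16.clock(sample, enables["16x16"])
print(reg_16x16.toggles)  # 0: the unused datapath sees no input changes
```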

      DCT multiplications are performed in the datapaths using only adders and shifters. In order to reduce number and size of the adders in the proposed hardware, Hcub MCM algorithm [9] is used for implementing multiplications with constants. Hcub algorithm tries to minimize number and size of the adders in a multiplier block which multiplies a single input with multiple constants using shift and addition operations. Hcub algorithm determines necessary shift and addition operations in a multiplier block.
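As an illustration of multiplierless constant multiplication, the Python sketch below multiplies one input by the constants 36 and 83 (the odd-part constants of the HEVC 4-point transform) using only shifts and additions. The two decompositions are hand-derived for this example and are not the output of the Hcub algorithm; they only show how sharing an intermediate term reduces the adder count.

```python
def multiply_naive(x):
    """Each constant is built independently: 4 adders in total."""
    x36 = (x << 5) + (x << 2)                  # 32x + 4x            (1 adder)
    x83 = (x << 6) + (x << 4) + (x << 1) + x   # 64x + 16x + 2x + x  (3 adders)
    return x36, x83

def multiply_shared(x):
    """The intermediate term 9x is shared by both constants: 3 adders."""
    x9 = (x << 3) + x        # 9x   (1 adder)
    x36 = x9 << 2            # 36x  (shift only)
    x81 = (x9 << 3) + x9     # 81x  (1 adder)
    x83 = x81 + (x << 1)     # 83x  (1 adder)
    return x36, x83

print(multiply_naive(5))   # (180, 415)
print(multiply_shared(5))  # (180, 415)
```

A multiplier block in this sense takes one input and produces all required constant multiples at once, so any shared intermediate term saves hardware for every constant that reuses it.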

      The transpose memory is implemented using 32 Block RAMs (BRAM). 4, 8, 16 and 32 BRAMs are used for 4×4, 8×8, 16×16 and 32×32 TU sizes, respectively. In the figure, the number in each box shows the BRAM in which that coefficient is stored. The results of the 1D column DCT are generated column by column. For the 32×32 TU size, the coefficients in column 0 (C0) are first generated in one clock cycle and stored in 32 different BRAMs. Then, the coefficients in column 1 (C1) are generated in the next clock cycle and stored in 32 different BRAMs using a rotating addressing scheme. This continues until the coefficients in column 31 (C31) are generated and stored in 32 different BRAMs using the rotating addressing scheme.
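One way such a rotating scheme can work is sketched below in Python. The exact mapping used by the hardware is not given in the text, so the diagonal rule `(row + col) % n` and the use of the column index as the address are assumptions; the point is only that this kind of rotation lets a whole column be written, and later a whole row be read, with every access going to a different BRAM.

```python
def bram_index(row, col, n):
    """BRAM holding coefficient (row, col) under a diagonal rotation (assumed)."""
    return (row + col) % n

def write_column(mem, col, coeffs):
    """Store one column of 1D DCT results; each value lands in a different BRAM."""
    n = len(coeffs)
    for row, value in enumerate(coeffs):
        mem[bram_index(row, col, n)][col] = value  # address = column index

def read_row(mem, row):
    """Fetch one row for the row DCT; again one access per BRAM."""
    n = len(mem)
    return [mem[bram_index(row, col, n)][col] for col in range(n)]

n = 4
data = [[10 * r + c for c in range(n)] for r in range(n)]  # data[r][c]
mem = [[None] * n for _ in range(n)]                       # n BRAMs, n words each
for c in range(n):
    write_column(mem, c, [data[r][c] for r in range(n)])
print(read_row(mem, 2))  # [20, 21, 22, 23]
```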

    2. Proposed HEVC 2D DCT Higher Utilization Hardware

      The proposed HEVC 2D DCT higher utilization (HU) hardware processes four 4×4 TUs or two 8×8 TUs in parallel. Like the LU hardware, it uses two 4×4 datapaths and one 8×8 datapath for the 16×16 TU size, and it uses all datapaths (two 4×4, one 8×8 and one 16×16) for the 32×32 TU size. However, the HU hardware also uses two 4×4 datapaths and one 8×8 datapath for the 4×4 and 8×8 TU sizes. Since the 4×4 and 8×8 column and row datapaths are used for all TU sizes, data gating is used only for the inputs of the 16×16 column and row datapaths.

      As in the LU hardware, multiplier blocks in the first 4×4 datapath and the 16×16 datapath multiply a single input with 3 and 16 different constants, respectively. However, in the HU hardware, multiplier blocks in the second 4×4 datapath and the 8×8 datapath multiply a single input with 7 and 15 different constants, respectively, because in the HU hardware the second 4×4 datapath and the 8×8 datapath are used for all TU sizes.

      In order to calculate each output of 1D DCT for 4×4, 8×8 and 16×16 TU sizes, an output from each multiplier block in both 4×4 datapaths and 8×8 datapath is selected, and these outputs are added or subtracted. Similarly, in order to calculate each output of 1D DCT for 32×32 TU size, 32 outputs from 32 multiplier blocks in all datapaths (two 4×4, one 8×8 and one 16×16) are added or subtracted.

      Same as the LU hardware, transpose memory is implemented using 32 BRAMs. However, in the HU hardware, 8, 8, 16 and 32 BRAMs are used for 4×4, 8×8, 16×16 and 32×32 TU sizes, respectively.

  4. IMPLEMENTATION RESULTS

        The proposed low-energy HEVC 2D DCT LU and HU hardware for all TU sizes are implemented in Verilog HDL in three versions: with clock gating only (original hardware); with clock gating and the Hcub MCM algorithm (MCM hardware); and with clock gating, the Hcub MCM algorithm and the proposed technique with coefficient set 3 (proposed hardware). The Verilog RTL implementations are verified with RTL simulations, whose results matched the results of the 2D DCT implementation in the HEVC HM software encoder [8]. The Verilog RTL codes are synthesized and mapped to an FPGA implemented in 40nm CMOS technology. The FPGA implementations are verified with post place & route simulations, whose results also matched the results of the 2D DCT implementation in the HEVC HM software encoder [8]. The FPGA implementation results given in Table VI show that the Hcub MCM algorithm considerably decreased area, and the proposed technique slightly increased area.

        Power consumptions of the FPGA implementations are estimated using a gate-level power estimation tool. Post place & route timing simulations are performed for the Tennis, Kimono and ParkScene (1920×1080) videos at 100 MHz [19], and signal activities are stored in VCD files. These VCD files are used for estimating the power consumptions of the FPGA implementations. The energy consumption results for the LU hardware and the HU hardware for one frame of each video are shown in Fig. 8 and Fig. 9, respectively. The Hcub MCM algorithm reduced the energy consumption of the LU hardware and the HU hardware by up to 5.9% and 13.1%, respectively. The proposed energy reduction technique further reduced the energy consumption of the LU hardware and the HU hardware by up to 17.9% and 18.9%, respectively.

        In order to compare the LU hardware and the HU hardware with the HEVC DCT hardware in the literature, their Verilog RTL codes are also synthesized to a 90nm standard cell library and the resulting netlists are placed and routed. The resulting ASIC implementations of the LU hardware and the HU hardware work at 140 MHz and 130 MHz, respectively. Gate counts of the LU hardware and the HU hardware are calculated as 175K and 197K, respectively, according to NAND (3×1) gate area excluding on-chip memory. The comparison of the LU hardware and the HU hardware with the HEVC DCT hardware in the literature is shown in Table VII.

        The proposed 2D DCT hardware has smaller area and power consumption than the 2D DCT hardware proposed in [14]-[17]. The DCT hardware proposed in [18] only performs 1D DCT, and its performance is not given. Since the 2D DCT hardware proposed in [14] and [17] use multipliers, they have larger area than the proposed 2D DCT hardware. Since the 2D DCT hardware proposed in [16] performs DCT operations for several TUs in parallel for smaller TU sizes, it achieves higher performance than the proposed 2D DCT LU hardware at the expense of much larger area and power consumption. It has the same performance as the proposed 2D DCT HU hardware, with larger area.

  5. CONCLUSIONS

In this paper, a novel computation and energy reduction technique for HEVC DCT for all TU sizes is proposed. The proposed technique reduced the computational complexity of HEVC DCT significantly at the expense of a slight decrease in PSNR and a slight increase in bit rate. A low-energy HEVC 2D DCT hardware for all TU sizes is also designed and implemented using Verilog HDL. In the worst case, the proposed hardware can process 53 Ultra HD (7680×4320) video frames per second. The proposed technique reduced the energy consumption of this hardware by up to 18.9%. Therefore, it can be used in portable consumer electronics products that require a real-time HEVC encoder.

Video codecs currently use different transform techniques to achieve data compression during video frame transmission. Among them, the DCT is supported by most modern video standards. The integer DCT is an approximation of the DCT that can be implemented exclusively with integer arithmetic, which makes it highly advantageous in cost and speed for hardware implementation. The aim is an efficient DCT implementation with reduced complexity and fewer multiplications. Pipelining is introduced to reduce the processing time, and a fully pipelined, variable-block-size transform engine with efficient hardware utilization is proposed to handle the DCT/IDCT. The 2D DCT is computed by combining two 1D DCTs connected by a transpose stage. The proposed system therefore uses a pipeline architecture to reduce the computational complexity of HEVC relative to the existing system.

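TABLE III Timing Summary of Proposed and Existing Systems

| System          | Minimum period | Maximum frequency | Minimum input arrival time before clock | Maximum output required time after clock | Total memory usage |
|-----------------|----------------|-------------------|-----------------------------------------|------------------------------------------|--------------------|
| Proposed system | 2.333 ns       | 428.604 MHz       | 1.536 ns                                | 16.536 ns                                | 427172 kilobytes   |
| Existing system | 5.044 ns       | 198.238 MHz       | 5.254 ns                                | 18.769 ns                                | 507620 kilobytes   |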
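The maximum frequency in the timing summary is the reciprocal of the minimum clock period. The short Python sketch below recomputes it from the reported periods; the function name is illustrative, and the small difference from the reported frequencies is presumably due to the periods being rounded to three decimals.

```python
def max_frequency_mhz(min_period_ns):
    """Maximum clock frequency in MHz for a given minimum period in ns."""
    return 1000.0 / min_period_ns

proposed_mhz = max_frequency_mhz(2.333)   # reported: 428.604 MHz
existing_mhz = max_frequency_mhz(5.044)   # reported: 198.238 MHz
speedup = 5.044 / 2.333                   # ratio of the two clock periods

print(round(proposed_mhz, 1), round(existing_mhz, 1), round(speedup, 2))
```

By this measure the proposed system can be clocked a little more than twice as fast as the existing one.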

ADVANTAGES

    • The structure can be reused for any of the prescribed transform lengths.

    • The proposed structure is reusable for DCTs of lengths 4, 8, 16, and 32, with a throughput of 32 DCT coefficients per cycle irrespective of the transform size.

    • Lower area-delay product due to parallel implementation.

    • The proposed architecture can be pruned to reduce the implementation complexity substantially with only a marginal effect on coding performance.

APPLICATIONS

    • Mobile multimedia devices.

    • The proposed architecture is found to support ultra-high-definition 7680 × 4320 video at 60 frames/s, which is one of the applications of HEVC.

    • Signal and image processing, digital cameras, and HDTV.

FUTURE SCOPE

    • The proposed system can be modified in future to reduce the area and delay of the design.

    • A fast algorithm for the 8-point 2D DCT architecture can be designed by applying the 1D DCT structure in folded and fully parallel 2D DCT architectures.

REFERENCES

[1] E. Kalali, A. C. Mert, I. Hamzaoglu, A Computation and Energy Reduction Technique for HEVC Discrete Cosine Transform, IEEE Trans. on Consumer Electronics, vol. 62, no. 2, May 2016.

[3] F. Pescador, M. Chavarrias, M. J. Garrido, E. Juarez, C. Sanz, Complexity Analysis of an HEVC Decoder Based on a Digital Signal Processor, IEEE Trans. on Consumer Electronics, vol. 59, no. 2, pp. 391-399, May 2013.

[4] E. Ozcan, Y. Adibelli, I. Hamzaoglu, A High Performance Deblocking Filter Hardware for High Efficiency Video Coding, IEEE Trans. on Consumer Electronics, vol. 59, no. 3, pp. 714-720, Aug. 2013.

[5] E. Ozcan, E. Kalali, Y. Adibelli, I. Hamzaoglu, A Computation and Energy Reduction Technique for HEVC Intra Mode Decision, IEEE Trans. on Consumer Electronics, vol. 60, no. 4, pp. 745-753, Nov. 2014.

[6] E. Kalali, E. Ozcan, O. M. Yalcinkaya, I. Hamzaoglu, A Low Energy HEVC Inverse Transform Hardware, IEEE Trans. on Consumer Electronics, vol. 60, no. 4, pp. 754-761, Nov. 2014.

[7] J. Vanne, M. Viitanen, T. D. Hämäläinen, A. Hallapuro, Comparative Rate-Distortion-Complexity Analysis of HEVC and AVC Video Codecs, IEEE Trans. on Circuits and Systems for Video Technology, vol. 22, no. 12, pp. 1885-1898, Dec. 2012.

[8] K. McCann, B. Bross, W. J. Han, I. K. Kim, K. Sugimoto, G. J. Sullivan, High Efficiency Video Coding (HEVC) Test Model 15 (HM 15) Encoder Description, JCTVC-Q1002, June 2014.

[9] Y. Voronenko, M. Püschel, Multiplierless Multiple Constant Multiplication, ACM Trans. on Algorithms, vol. 3, no. 2, May 2007.

[10] Y. H. Moon, G. Y. Kim, J. H. Kim, An Improved Early Detection Algorithm for All-Zero Blocks in H.264 Video Encoding, IEEE Trans. on Circuits and Systems for Video Technology, vol. 15, no. 8, pp. 1053-1057, Aug. 2005.

[11] M. Zhang, T. Zhou, W. Wang, Adaptive Method for Early Detecting Zero Quantized DCT Coefficients in H.264/AVC Video Encoding, IEEE Trans. on Circuits and Systems for Video Technology, vol. 19, no. 1, pp. 103-107, Jan. 2009.

[12] K. Lee, H. J. Lee, J. Kim, Y. Choi, A Novel Algorithm for Zero Block Detection in High Efficiency Video Coding, IEEE Journal of Selected Topics in Signal Processing, vol. 7, no. 6, pp. 1124-1134, Dec. 2013.

[13] J. Li, J. Takala, M. Gabbouj, H. Chen, A Detection Algorithm for Zero-Quantized DCT Coefficients in JPEG, IEEE Int. Conf. on Acoustics, Speech and Signal Processing (ICASSP), pp. 1189-1192, Apr. 2008.

[14] J. Zhu, Z. Liu, D. Wang, Fully Pipelined DCT/IDCT/Hadamard Unified Transform Architecture for HEVC Codecs, IEEE Int. Symp. on Circuits and Systems (ISCAS), pp. 677-680, May 2013.

[15] W. Zhao, T. Onoye, T. Song, High-Performance Multiplierless Transform Architecture for HEVC, IEEE Int. Symp. on Circuits and Systems (ISCAS), pp. 1668-1671, May 2013.

[16] P. K. Meher, S. Y. Park, B. K. Mohanty, K. S. Lim, C. Yeo, Efficient Integer DCT Architectures for HEVC, IEEE Trans. on Circuits and Systems for Video Technology, vol. 24, no. 1, pp. 168-178, Jan. 2014.

[17] G. Pastuszak, Hardware Architecture for the H.265/HEVC Discrete Cosine Transform, IET Image Processing, vol. 9, no. 6, pp. 468-477, June 2015.

[18] A. D. Darji, R. P. Makwana, High-Performance Multiplierless DCT Architecture for HEVC, Int. Symp. on VLSI Design and Test, pp. 1-5, June 2015.

[19] F. Bossen, Common test conditions and software reference configurations, JCTVC-I1100, May 2012.

[20] G. Bjontegaard, Calculation of average PSNR differences between RD-curves, 13th Video Coding Experts Group Meeting, 2001.

