Memory Efficientlow Dendity Parity Check Codes On FPGA

Shrugal Varde; Dr. Nisha Sarwade

doi:10.17577/IJERTV2IS70227

Volume 02, Issue 07 (July 2013)

Memory Efficientlow Dendity Parity Check Codes On FPGA

DOI : 10.17577/IJERTV2IS70227

Download Full-Text PDF Cite this Publication

Open Access
Article Download / Views: 81
Total Downloads : 467
Authors : Shrugal Varde, Dr. Nisha Sarwade
Paper ID : IJERTV2IS70227
Volume & Issue : Volume 02, Issue 07 (July 2013)
Published (First Online): 10-07-2013
ISSN (Online) : 2278-0181
Publisher Name : IJERT
License: This work is licensed under a Creative Commons Attribution 4.0 International License

PDF Version

View

Text Only Version

Memory Efficientlow Dendity Parity Check Codes On FPGA

Shrugal Varde 1, Dr. NishaSarwade 2

Electrical Engineering department,VJTI,Mumbai,India Electrical Engineering department,VJTI,Mumbai,India

Abstract

Low-density Parity Check(LDPC) codes have been in focus of intense research in Error-correction Cod- ingin recent years. High throughput decoder design for them has been a big challenge for these codes. In thispaper, we introduce decoder design based on projective geometry (PG) structure of LDPC code sum product algorithm. The corresponding fully- parallel VLSI architecture was implemented on Xil- inx LX110T FPGA. MATLAB was used to simulate the code.

Keywords : LPDC , MATLAB, Projective geometry, Virtex 5,Xilinx.

Introduction

LDPC codes are an emerging class of codes which exhibit superior bit error rate(BER) performance. Relative ease of decoder design, coupled with better performance, has made LDPC codes useful in recent digital transmission and storage systems. All LDPC decoding algorithms need large number of parallel working memories. Hence designing for memory efficiencyis one of the signicant task in its decoder design. In this work, we introduce an LDPC decoder design that uses on chip memory for storing data. In general, different code structures result in different architectures, and hence different memory manage- ment schemes. Our choice of structures is derived out of geometry of projective planes. We report prototype implementation results on FPGA, to dem- onstrate simplicity of design, and high efficiency of the hardware, for PG based LDPC codes, apart from throughput improvement. The design has been im- plemented on Xilinx Virtex 5 FPGA board.
Low density parity check codes

LDPC codes are generally decoded using probabilis- ticsoft decision decoding process. The knowledge

of channel noise statistics is used to generate prob- abilistic information for received bits, and given to the decoder. The reliability of this bit information

is then successively improved over iterations, and hard decisions on bit values made. Such decoders

make use of graphs known as Tanner graphs to rep- resent codes, passing probabilistic messages along the graphs edges iteratively. The matrix of

this graph is called parity-check matrix H. A Tanner graph has two sets of nodes: n bit nodes, and m parity-check nodes, for a m*n sized parity-check matrix. Each parity-check node is connected by an edgeto bit nodes corresponding to the code bits in- cludedin that parity-check equation. The sequence of stepsinvolved in iterative decoding using log-sum- product

Algorithm is as follows
1. The initial message is first sent by bit nodes to check nodes. This message is based on the calculated log-likelihood ratio(LLR) of received signal.
2. Next, check nodes calculate and send up- dated LLRs to the bit nodes using the re- ceived messages. The formulae is as given
  
  Ej,i data sent from jth check node to ith
  
  bit
  
  node.
  
  Mj,i – data sent from ith bit node to jth check
  
  node.
3. Bit node add all the data received form the check nodes connected to that bit node
4. Hard decision is done based on sign of Li. If in some iteration syndrome vector is zero then code word Li belongs to code block, else process is repeated.
5. Updates messages are sent back to check node. Formula for updating the check data is as given
  
  ri intrinsic data or apriori data received by bit node at the start of decoding process.
Design

In this design fixed point data format is used to represent binary data. The advantage of using the fixed point data representation is that arithmetic operation is much simpler compared to floating point arithmetic operation. In the design 16 bits are used to represent the data .MSB indicated the sign bit. Integer data is represent by 5 bits and re- maining 10 bits are used to represent fractional data . The data format is as shown in figure 1

Figure 1: Data representation format

The choice of message precision is dictated by the biterror rate and the resources of storage and compu- tation available on the FPGA.
Bit node design

The bit nodes are relatively simpler to implement as they do only addition of several check node mes- sages. As the number of check node messages is relatively more in our case, we employ a partly se- rial adder for our design. The check node messages are first stored in a register bank. Addition process of 9 check nodes along with apriori data is com- pleted in 4 clock .Architecture is as shown in figure 2Then,each check node message is subtracted from this total sum to get the individual bit-to-check mes- sages.

Figure 2 : Bit addition architecture
Check node design

The check nodes are comparatively much more complex.Check node computation includes calucu- lating hyperbolic tan , multiplication and natural logarithm of nine set of 16 bit data. These arithmetic operations are calculated by simple shift and add algorithm called CORDIC algorithm first proposed by Volder in 1961.Hyperbolic tan structure is as shown in figure 3

Figure 3 : Hyperbolic tan structure

The absolute block separates the sign and the magni- tude part of the angle entered. The magnitude part of the values is given to the decision block

The decision block checks the range of the angle . If the value is out of range(greater than 3.5) then deci- sion block generates a done_all control signal. If not, then decision block initializes the value of X, Y and Z and generates a control signal done. These values are given to the computation block.

The computation block implements the hyperbolic CORDIC equation given by walther. This block at the end of 13 iterations generates the values of hy- perbolic sin and hyperbolic cos. These values are given to division block. To increase the speed barrel shifters are used.[5]
The division process used in this code is also implemented using CORDIC algorithm. It computes the value of tanh (tanh = sinh/cosh). Af- ter calculating the value of hyperbolic tan sign value is taken into consideration and final result is gener- ated

Multiplier block structure using CORDIC algorithm is as shown in figure 4

Figure 4 : Multiplier block structure

The absolute block separates the sign and the magnitude part of the angle entered. The magni- tude part of the values is given to the decision block

The algorithm block performs the most impor- tant task of multiplying two 16 bit binary data using CORDIC algorithm .Matlab simulation results have shown that the multiplication block using CORDIC algorithm provides accurate results with error less than 0.01% when the number of iteration used to calculate the multi- plication result is more than 12. After the com- putation process the decision block decides the sign of the result. For positive number the sign bit is 0 and for negative sign bit is 1.

Natural logarithm block structure is as shown in fig. 5

Figure 5 : Natural logarithm block structure

Buffer data receive the data from multiplication block. The main task of the buffer block is to generate set of data which is (1 + input) and (1 input). Depending on the condition (control sig- nals) buffer data either gives (1+ input) as output or (1- input). The decision block decides the input and adjustment value depending on the input data of decision block. The compute block performs the actual operation of calculating the natural

logarithmof the input data. At the end compute block performs the task of log(1+input ) + log(1- input).

Each check node is connected to nine bit nodes . Hence check node at start should calculate hyper- bolic tan of nine input data. After the calculation of hyperbolic tan eight 16 data should be multiplied . If multiplication is done serially then 192 clock cycles would be required to generate the result which will act as input to natural logarithm block. To increase the speed of the system semi parallel structure is adopted for multiplication for 8 16 bit data. Structure is a shown in figure 6

Figure 6 : semi parallel multiplication structure
Control logic

Due to the simplified logic of the decoder, the con- trol unit design becomes extremely simple. This is implemented in terms of a simple FSM as shown in Figure 7. The strobe input is given a strobe after a data block has been input. The output signal goes to select the intrinsic information. Whenever all the parity check equations have been satisfied, they are output one and the unit becomes ready to accept new input blocks.

Also, a one cycle wide pulse is generated on the end output to notify the external device of the comple- tion.

Figure 7 : Control unit schematic
Results and simulations

The vhdlcodes of sum product algorithm was im- plemented on VIRTEX 5 FPGA board. The device used is xc5vlx110t-1ff1136. For synthesis of vhdl code we used XILINX ISE DESIGN SUITE 14.1 .

The code were verified on the board with the help of Xilinx IMPACT tool . MATLAB 2012a was used to generate the simulation graphs.

The synthesis report is as shown

Number of Slice registers	16909 of 69120 (24%)
Number of slice LUTs	69100 of 69120(86%)
Number of fully used LUTs	11200 of 1208 (13%)
Number of bonder IOBs	147of 640 (22 %)
Number of BUFG/BUFGCTRLs	2 of 32 (6%)

Maximum clock frequency at which system can work is 120 MHz. As block RAM is used in the sys- tem the number of interconnects have reduces.

For simulation purpose a matlab code was written that generates 73 random values in the range of -5 to

5. These values were then converted in 16 bit binary format used in the design. The values are stored in on chip RAM. The decoder was simulated using Xilinx ISIM simulator. After simulation it was ob- served that the system takes 2 to 3 iteration to gen- erates the final result.The utilization of the resources is almost 86 %.As block RAM was used to transfer data from bit node to check node and vice versa the interconnection have reduced thus reducing the complexity of the system.Since the cross over prob- ability of the channel is almost 5 % the number of bits in which error is allowed is 73*0.05 =3.6

,approximately 4. Decoder was able to decode upto 4 bit errors with number of iterations required for decoding to be less than 3.

References
1. R.G.Gallager,_Low Density Parity Check codes M.I.T.Press,1963
2. D.J.C.MacKay and R.M.Neal,_Near Shannon limit performance of low density parity checkcodes, Elec-
  
  tronLett.,vol.32,no.18,pp.1645- 1646,1996.
3. Y. Kou, S Lin, and M. P.C. Fossorier, "Low density parity check codes based on _nite ge- ometries: a rediscovery", Proc. IEEE ISIT, Sorrento, Italy, p. 200, Jun 2000.
4. Sarah J. Johnson, Introducing Low-Density Parity-Check Codes.
5. N.Karmarkar,_A New Parallel Architecture for Sparse Matrix Computation Based on Fi niteProjective Geometries_,AT&T Bell Labs,1994.
6. William E Ryan, University of Arizona, INTRODUCTION TO LDPC CODES,2001
7. Hrishikesh Sharma, Subhasis Das, HIGH THROUGHPUT MEMORY-EFFICIENT VLSI DESIGNS FOR STRUCTURED LDPC DECODING,IIT Bombay.,May 2010.
8. Subhasis Das, Hrishikesh Sharma, Sachin Pat kar, A Min-Sum based 800 Mbps LDPC de coder on FPGA,IIT Bombay,March 2010

Memory Efficientlow Dendity Parity Check Codes On FPGA

Leave a Reply