 Open Access
 Total Downloads : 618
 Authors : T.V.A. Bhanuprakash, K. Kameswar Reddy, I.V.Rameswar Reddy
 Paper ID : IJERTV2IS100794
 Volume & Issue : Volume 02, Issue 10 (October 2013)
 Published (First Online): 25102013
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
CRC Algorithm Implementation in FPGA by Xmodem protocol
T.V.A. Bhanuprakasp , K. Kameswar reddy2 S. Mahaboob Basha 3 1 M.Tech student, AVR &SVR Engg College, Kurnool
2 Assistant Professor, Dept of ECE, AVR &SVR Engg College, Kurnool
3 Associate Professor and HOD, Dept of ECE, AVR &SVR Engg College, Kurnool
Abstract: – Xmodem protocol is a widely used asynchronous file transfer protocol. By using Xmodem protocol we can implement the CRC algorithm with 16bit data. Here we studied the implementation of CRC algorithm by LFSR using FPGA. This paper studies of implementation of CRC algorithm of 16bit and 32bit by single byte and multi byte parallel circuit. We can achieve the synthesis design using Verilog language.
Keywords: Xmodem , CRC , Checksum , packet, LFSR
I. INTRODUCTION
XMODEM, like most file transfer protocols, breaks up the original data into a series of "packets" that are sent to the receiver, along with additional information allowing the receiver to determine whether that packet was correctly received. Xmodem is divided as Xmodem and 1kXmodem, Xmodem transmits the data as 128bytes block form and 1k Xmodem transmits as1024 bytes block form. These Xmodem supports the CRC and Checksum verification methods. The Xmodem CRC checksum requires the 128 bytes for the data packets. When the packet is received by the receiver, it sends conformation character if checksum is correct. It send the negative conformation character if error occurs. The Xmodem protocol of data is depends on efficiency of transmitted data check time. The Cyclic redundancy Check code (abbrivated as "CRC") is used to check the errors in the data. It is applicable to verify the data in Xmodem protocol, RFID protocol, USB communication protocol etc.
XMODEM1K was an expanded version of XMODEMCRC, which indicated the longer block size in the sender by starting a packet with the
<STX> character instead of <SOH>. Like other backwardcompatible XMODEM extensions, it was intended that a 1K transfer could be started with any implementation of XMODEM on the other end, backing off features as required.
XMODEM1K was originally one of the many improvements to XMODEM introduced by
Chuck Forsberg in his YMODEM protocol. Forsberg suggested that the various improvements were optional, expecting software authors to implement as
many of them as possible. Instead they generally implemented the bare minimum, leading to a profusion of semicompatible implementations, and eventually, the splitting out of the name "YMODEM" into "XMODEM1K" and a variety of YMODEMs. Thus XMODEM1K actually postdates YMODEM, but remained fairly common anyway.
A backwards compatible extensions of XMODEM with 32k and 64k block lengths was created by Adontec for better performance on high speed error free connections like ISDN or TCP/IP networks.
CRC Xmodem is very similar to Checksum Xmodem. The protocol initiation has changed and the 8 bit checksum has been replaced by a 16 bit CRC. Only theses changes are presented.
One of the earliest and most persistent problems with Xmodem was transmission errors which were not caught by the checksum algorithm. Assuming that there is no bias in asynchronous communications errors, we would expect that 1 out of every 256 erroneous complete or oversized Xmodem packets to have a valid checksum. With the same assumption, if the checksum were 16 bits, we would expect 1 out of every 65,536 erroneous complete or oversized packets would have a valid checksum.
Considerable theoretical research has shown that a 16 bit cyclical redundancy check character (CRC/16) will detect a much higher percent of errors such that it would only allow 1 undetected bit in error for every 1014 bits transmitted. That's 1 undetected error per 30 years of constant transmission at 1 megabit per second. However, my personal experience indicates that something around 109 to 1010 is more realistic. Why such a vast improvement over the checksum algorithm? It is caused by the
unique properties that prime numbers have when being divided into integers. Simply stated, if an integer is divided by a prime number, the remainder is unique. The CRC/16 algorithm treats all 1024 data bits in an Xmodem packet as an integer, multiples that integer by 2^16 and then divides that 1040 bit number by a 17 bit prime number. The low order 16 bits of the remainder becomes the 16 bit CRC.
The 17 bit prime number in CRC Xmodem is 216 + 212 + 25 + 1 or 65536 + 4096 + 32 + 1 =
69665. So calculating the CRC is simple, just multiply the 128 byte data number by 65536, divide by 69665 and the low order 16 bits of the remainder are the CRC. The only problem is, I've never seen a computer which has instructions to support 130 byte integer arithmetic! Fortunately for us, Seephan Satchell, Satchell Evaluations, published a specification a very efficient algorithm to calculate the CRC without either 130 byte arithmetic or bit manipulation. Appendix A contains the source code, in IBM/PC BASIC, for the calculation of a CRC. The initiation of CRC Xmodem was designed to provide for automatic fall back to Checksum Xmodem if the transmitter does not support the CRC version. The receiver requests CRC Xmodem by sending the letter C (decimal 67) instead of a NAK. If the transmitter supports CRC Xmodem, it will begin transmission of the first Xmodem packet upon receipt of the C. If the transmitter does not support CRC Xmodem, it will ignore the C. The receiver should timeout after 3 seconds and repeat sending the C. After 3 timeouts, the receiver should fall back to the checksum Xmodem protocol and send a NAK.
II ALGORITHM PRINCIPLES OF CRC:
The following is the specify CRC implementation principle:
The kbit binary number is about using
M(X) is
M (X) = Ck1 Xk 1 + Ck2 Xk 2 + .+ Ci Xi +
+ C1 X + C0 (1)
Let G (X) = Generated polynomial, r = highest power of sequence data then
M (X) * xr = Ck1 X k+r 1 + Ck2 Xk+r 2 + + C1
X k1 + C0 Xk (2)
The result is divided by generated polynomial G(x) (M(X)..Xr) / G(x) = Q(X) +R( X) ) / G(X) (3)
Here R(X) is remainder and it is CRC. R(X) is expressed as
R (X) = dr1. x r1 ++ d1 x1 + r0 .(.4) The final data is transmitted as
M1 = (CK1, C0, d r1, .d1,d0 ) (5)
Figure 1 is a typical circuit, if the generator polynomial G (x) is 1, the output of the D flipflop connects to the output of the XOR gate; if it is 0, and the output of the D flipflop connects to the output of the superior flipflop. Therefore, this figure can be greatly simplified in the case of a fixed generator polynomial.
There are some standard generator polynomials in the practical applications, as follows:
CRC 8 : X8 + X5 + X4 + 1 with number 0x131;
CRC 12: X12 + X11 + X3 + X2 +1
with number 0x180D;
CRC 16: X16 + X12 + X5 + 1 with number 0x11021;
ANSI CRC 16: X16 + X15 + X2 + 1
with number 0x18005;
CRC 32: X 32+ X26 + X23 + X22 + X16 + X 12 +X11 + X10 +X8 + X7+ X5 + X4 +X2 +X1+1,
with number 0x104C11DB7.
To develop a hardware circuit for computing the CRC checksum, we reduce the polynomial division process to its essentials. The process employs a shift register, which we denote by CRC. This is of length r (the degree of G) bits, not as you might expect. When the subtractions (exclusive ors) are done, it is not necessary to represent the highorder bit, because the highorder bits of G and the quantity it is being subtracted from are oth 1. The division process might be described informally as follows:
Initialize the CRC register to all 0bits. Otherwise, just shift CRC and m left 1 position. If there are more message bits, go back to get the next one. It might seem that the subtraction should be done first, and then the shift. It would be done that way if the CRC register held the entire generator polynomials, which in bit form are bits. Instead, the CRC register holds only the loworder r bits of G, so the shift is done first, to align things properly.
Get first/next message bit m. If the highorder bit of CRC is 1, Shift CRC and m together left 1 position, and XOR the result with the loworder r bits of G.
Below is shown the contents of the CRC register for the generator G = X3 + X + 1 and the message
M =X7 + X6 + X5 + X2 + X
Expressed in binary, G = 1011 and M = 11100110.

Initial CRC contents. Highorder bit is 0, so just shift in first message bit.

Highorder bit is 0, so just shift in second message bit, giving:
011 Highorder bit is 0 again, so just shift in third message bit, giving:
111 Highorder bit is 1, so shift and then XOR with 011, giving:
101 Highorder bit is 1, so shift and then XOR with 011, giving:
001 Highorder bit is 0, so just shift in fifth message bit, giving:
011 Highorder bit is 0, so just shift in sixth message bit, giving:
111 Highorder bit is 1, so shift and then XOR with 011, giving:
101 There are no more message bits, so this is the remainder.
In the Xmodem protocol, we use the standard CCITT (Consultative Committee for International Telephony and Telegraphy) method, CRC16: X 16+ X 12+ X 5+ 1
Implementation of the CRC hardware circuit:

We first clear the flipflop by CR, and move the upper 16 bits (2 bytes) which need to be verified into
16 of the trigger. The upper 16 bits of the data stream will not be changed as the trigger has been cleared;

Then we continue flow the data into the trigger, it just need right 1bit if the output of the r15 trigger is 0; and it need right 1bit after the modulo 2 operation if the output is 1;

At last we need continuous move into 16bit 0 after the data stream M (x) all into the trigger, and end the calculation of this group data CRC
Figure 2 the process of the circuit is summarized as follows:
The above circuit using the general process of the modulo 2 division operation, its biggest drawback is need continuous input 16 "0" after the data stream M
(x) and the CRC of the trigger need to more 16 times calculation.

THE IMPLEMENTATION OF CRC PARALLEL COMPUTING
In the Xmodem protocol, each packet is 128 bytes (1024 bits). We need 1040 (1024 + 6) cycles to figure out the CRC using the Figure 2. This design uses a parallel computing and hardware implementations in order to improve the realtime.
We main narrative the 8bit CRC parallel computing. The state of the flipflop is the remainder of the CRC as shown in Figure 1. The remainder of the CRC is just concerned with the former input and the remainder of the previous state when the serial operation. The calculation of 8bit parallel operation as follows:
Supposed rji as the value of the trigger, i = 1, 2… n, as the input code sequence, j = 0, 1, …, k1, as the trigger coding,
………
The input data is 8bit, so the maximum of i is 8. We can transitive launch by CCITT CRC16 (the polynomial is G(X) = X16 + X12
+X2 + 1, that is, k = 16) and the equation (6).
We need 24clock to calculate the CRC of the 8bit data. In the first 16clock, we move the 8bit data into the high trigger and the low 8bit is zeros. And this is
the initial moment, we can get the CRC of the 8bit data after 8 clocks and the input data is zeros. Then the initial moment of the trigger values are:
If is represent the value after 8 clocks, we
substituted the into equation (6) and get the final expression of
We can derive from other similar items:
Fig3 8bit parallel CCITT CRC16 hardware circuit
IV PACKETS
A method of transferring data by breaking it up into small chunks called packets. Packet data is how most data travels over the Internet, and, in recent years, over all cell phone networks as well. With packetswitched data, each user only consumes network resources when they are actually transferring data. This is often superior to circuitswitched data, where an open data connection must be maintained, which uses network resources even when idle. Packetswitched is the more modern type, and usually faster. .In a mobile phone, data is used for functions involving the Internet, as well as most kinds of streaming video and audio.
There are many different types of packet data for mobile phones, with different maximum speeds. A packet is the unit of data that is routed between an origin and a destination on the Internet or any other packetswitched network.
When any file is sent from one place to another on the Internet, the Transmission Control Protocol (TCP) layer of TCP/IP divides the file into "chunks" of an efficient size for routing. Each of these packets is separately numbered and includes the Internet address of the destination. The individual packets for a given file may travel different routes through the Internet. When they have all arrived, they are reassembled into the original file
The packet was prefixed by a simple 3byte header containing a <SOH> character, a "block number" from 0255, and the "inverse" blocks number255 minus the block number. Block numbering starts with 1 for the first block sent, not 0.
The packet was also suffixed with a singlebyte checksum of the data bytes. The checksum was the sum of all bytes in the packet modulo 256. The modulo operation was easily computed by discarding all but the eight least significant bits of the result, or alternatively on an eight bit machine, ignoring arithmetic overflow which would produce the same effect automatically. In this way the checksum was restricted to an eight bit quantity which was able to be expressed using a single byte. For example, if this checksum method was used on a tiny data packet containing only two bytes carrying the values 130 and 130, the total of these codes is 260 and the resulting checksum is 4. The complete packet was thus 132 bytes long, containing 128 bytes of data, for a total channel efficiency of about 97%.
The file was marked "complete" with a <EOT> character sent after the last block. This character was not in a packet, but sent alone as a single byte. Since the file length was not sent as part of the protocol, the last packet was padded out with a "known character" that could be dropped. In the original specification this defaulted to
<SUB> or 26 decimal, which CP/M used as the end offile marker inside its own disk format. The standard suggested any character could be used for padding, but there was no way for it to be changed within the protocol itself if an implementation changed the padding character, only clients using the same implementation would correctly interpret the new padding character.

IMPLEMENTATION AND RESULTS
The proposed system is designed using Verilog hardware description language and structural form of coding. The basic blocks of scan based test structure are completely synthesized using Xilinx XST and implemented on device family Spartan 3e, device XC3S500E, and package FG320 with speed grade 4.
Fig 4 Simulation Results for CRC16
Fig 5 Simulation Results for CRC32

CONCLUSION

This paper analysis the principle of the CRC calculation, redesign the perbyte parallel computing to the checksum of CCITT CRC16 and CRC 32, present a general method of parallel computing of CRC and the CRC algorithm solution of the data packet. We mplement the Xmodem protocol with CRC check use FPGA based on the above method.
REFERENCES

YU Xun The 32bit cyclic redundancy check parallel algorithm and hardware implementation[J].Information Technology, 2007.

LI Youzhong The generic parallel CRC calculation principle and its hardware implementation[J]. Northwest Minorities University (NaturalScience) 2002.

LI Youmou, FANG Dingyi. CRC coding algorithm research and Realization[J]. Journal of Northwest University(Natural Science Edition),
2006.

JI Shangman, LI Wei, SHEN Kejie, YAO Hui, TAO Zhijie. Improved CRC Arithmetic and Implementation by SCM[J]. Industrial Control Computer. 2009.

ZHU Ronghua. The Principle and Implementation of a Parallel CRC Computing[J]. Acta Electronica Sinica, 1997.

ZHANG Shugang,ZHANG Suinan,HUANG Shitan. CRC Parallel Computation Implementation on FPGA[J]. Computer Technology And Development, 2007.