 Open Access
 Total Downloads : 376
 Authors : M. Jansirani, S. Vidheswari
 Paper ID : IJERTV3IS20240
 Volume & Issue : Volume 03, Issue 02 (February 2014)
 Published (First Online): 01032014
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
VITERBI Decoder for SpaceTime Trellis Codes
M. JansiRani1
Electronics and Communication Engineering Vivekanandha College of Engineering for Women
Tiruchengode, India

Vidheswari2
Electronics and Communication Engineering Vivekanandha College of Engineering for Women Tiruchengode, India
Abstract Spacetime trellis code (STTC) has been widely applied to coded multipleinput multipleoutput (MIMO) systems because of its gains in coding and diversity; however, its great decoding complexity makes it less promising in chip realization compared to the spacetime block code (STBC). The complexity of STTC decoding lies in the branch metric calculation in the Viterbi algorithm and increases significantly along with the number of antennas and the modulation order. Consequently, a lowcomplexity algorithm to mitigate the computational burden is proposed. The results show that 70%, of the computational complexity is reduced for 2Ã—2, to reduce the complexity for 3Ã—3, and 4Ã—4 MIMO configurations, also. Based on the proposed algorithm, a reconfigurable MISO STTC Viterbi decoder is de signed and implemented using 0.18 m 1P6M CMOS technology. The decoder achieves 11.14 Mbps, 8.36 Mbps, and 5.75 Mbps for 4PSK, 8PSK, and 16QAM modulations, respectively.
Keywords – Spacetime trellis code (STTC), Viterbi decoder, CMOS technology, 4PSK, 8PSK, and 16QAM modulations.

INTRODUCTION
A. TRELLIS CODED MODULATION
In a powerlimited environment, the desired system performance should be achieved with the smallest possible transmitted power. The use of errorcorrecting codes can increase power efficiency by adding extra bits to the transmitted symbol sequence. This procedure requires the modulator to operate at a higher data rate, which requires a wider bandwidth. In a bandwidthlimited environment, the use of high order modulation schemes can increase efficiency in frequency utilization. In this case, a large signal power would be required to maintain the same system biterror rate (BER). In order to achieve improved reliability of a digital transmission system without increasing transmitted power or required bandwidth, both coding and modulation are considered in TCM technology; therefore, TCM is a scheme combining errorcorrecting coding with modulation. TCM is used for data communication with the purpose of gaining noise immunity over uncoded transmission without changing the data rate. The use of TCM also improves system reliability without increasing transmitting power and required channel bandwidth. Quadrature Amplitude modulation (QAM) and Quaternary Phase Shift Keying (QPSK) are used in TCM to increase data transmission rate. Since channel bandwidth is a function of the signaltonoise ratio (SNR), larger signal power would be necessary to maintain the same signal separation and the same error probability if more signals are required to be transmitted without enlarging channel bandwidth. It seems that the Trellis code statement violates the
basic power, bandwidth and error probability tradeoff principle. Actually this is not true. Trellis coding introduces dependency between every successive transmitting data symbol. The optimum 2dimensional modulation utilizes the dependency between inphase and quadrature symbols and the 4dimensional modulation employs the dependency between symbols of two successive time intervals. Trellis codes and multidimensional modulation are designed to maximize the Euclidean distance between possible sequences of transmitted symbols. Euclidean distance is a straight line distance between two points in signal constellation.

ERRORCORRECTING CODING
Figure 1 shows a typical communication system incorporating with channel coding. In this system, digital data generated by a source encoder are coded through a channel encoder, and the coded information data is used to modulate a carrier for transmission over a communication channel. The channel always introduces attenuation, distortion, interference and noise, which affect the receivers ability to receive correct information. The demodulator recovers possible values of the transmitted symbols, and the channel decoder recovers the information data before sending to its destination. A widely used channel encoding method is errorcorrecting coding (ECC). In an ECC system, a digital information source sends a data sequence to an encoder; the encoder inserts redundant (or parity) bits, thereby outputting a longer sequence of the code bits. The sequence of the code bits is called a codeword. The codeword is then transmitted to a receiver, which has a suitable decoder to recover the original data sequence. Error correcting coding is related to the method of delivering information from a source to a destination with minimum error. The research on ECC work showed that any communication channel could be characterized by a capacity at which information could be reliably transmitted.
Fig 1: Block Diagram of Digital Coded System
At any rate of information transmission up to the channel capacity, it should be possible to transfer information at any desired level of error rates. Introducing redundancy into transmissions can provide error control. This means more symbols are included in the message than strictly needed just to convey the information, with the result that only certain patterns at the receiver correspond to valid transmissions.
Once an adequate degree of error control has been introduced, the error rates can be made as low as required by extending the length of the code, thus averaging the effects of noise over a longer period of time. The research for error correcting codes was primarily motivated by the problems arising in communications systems, in particular, systems having limitation in their transmitted powers.
Errorcorrecting codes are an excellent means of reducing transmission power requirements because reliable communications can be achieved with the aid of codes even when the information is weakly received at its destination. There are two different types of codes used in errorcorrecting coding: the block codes and the convolution codes. The primary point of both codes is to add redundant bits to achieve error correction objectives. These redundant bits are added into each symbol of information sequence formulating the codeword. They will be used in the decoder at the receiver to correctly restore the original information sequence under specific decoding algorithms. These redundant bits are also called paritycheck bits; the bits provide the code with the capability of combating channel noise. The encoder for the block codes divides information sequences into information blocks of kbit, represented by d = (d0, d1 dk1) information sequence. The encoder transforms each kbit information block d independently into a codeword, c = (c0, c1 cn1). This transformation provides a coding rate R= k/n (R<1). The code is called a (n, k) block code. Decoding algorithm of the block code is based on the method of constructing the codeword in the encoder.
Figure 2 shows how a codeword is generated through a (6, 3) block encoder. Each 6bit output codeword is comprised of the original 3bits message sequence and a 3bits parity sequence.
Fig 2: An illustration of a (6, 3) block code
The encoder for convolutional codes generates codeword for transmission utilizing a sequential finitestate machine driven by the information sequence, or an arbitrarily long sequence. The convolutional encoding process installs the properties of memory and redundancy into the codeword stream, as for block codes.
Fig3: Generic Description of a Convolutional Encoder
The kbit information sequence as the input of the encoder is coded to the nbit codeword through the m stage delay (represented by delay cell D). The coding rate is still R= k/n (R < 1). The code is called (n, k, m) convolutional code. Each of k parallel input lines can have different number of stage delay cells, which is smaller than or equal to m (m 0).

III. CONCEPT OF TCM
Assume there is a model for the transmission of data with discretetime, continuousamplitude over the additive white Gaussian noise (AWGN) channel. In this communication model, messages to be delivered to the user are represented by points, or vectors, in an Ndimensional Euclidean space RN, called a signal space. When a vector x = [x1, x2 xi] is transmitted, the received signal can be represented by the vector Z as:
Z = x + v (2.1)
Where, v is a noise vector [v1, v2vi] whose components v1, v2vi are independent Gaussian random variables with zero mean and the same variance N0/2. N0 is noise power spectral density. The vector x is chosen from a set, the signal constellation which consists the number of signal vectors. In TCM system, signals are dependent. To avoid a reduction of the transmission rate, the constellation should be expanded. The minimum free Euclidean distance dfree between two possible sequences in a large constellation is obtained to be greater than the minimum Euclidean distance dmin between two signals in the original constellation. Hence using maximumlikelihood sequence detection will yield a distance gain d2min/d2free. On the other hand, expanding constellation induces an increase in the average energy expenditure from E to E (i.e., energy loss E/E), where E and E are the average energy spent to transmit with uncoded and coded transmission, respectively. By increasing the number of states (i.e., memory bits) in TCM, the coding gain can be increased. For example, if the fourstate or eightstate trellis coding is used in this TCM scheme instead of twostate, the
coding gain will be 3.01dB and 3.6dB, respectively. The coding gain tendency with different numbers of the trellis states. At each point on the graph, the second number shows the coding gain can be achieved when the number of states (the first number) is used.
Fig 4: Result of coding gain versus the number of states increasing
Increasing the number of states in TCM is a simple way to improve its performance. However, once the number of states grows high enough, the coding gain increases at much slower rate as illustrated in figure 4. In addition, the error coefficient (i.e., the multiplicity of minimumEuclidean distance error events) of the code starts to dominate the code performance. In order to solve this conflict, a multi dimensional constellations model is introduced to the TCM system.

VITERBI ALGORITHM
Andrew Viterbi proposed an algorithm in 1967 to decode convolutional code and this became the Viterbi Algorithm. This algorithm is an application of dynamic programming that finds the shortest paths. (Maximum likelihood sequences) widely used in solving minimization problems. A critical feature of this algorithm is the complexity of the decoding process grows linearly with the number of symbols being transmitted, rather than exponentially with the number of the transmitted symbols. The Viterbi algorithm finds the sequence at a minimum Hamming distance (or Euclidean distance) from the received signal with the minimized equivalent accumulated squared error. The Trellis diagram in shows a relationship between the delay states and the path states in each timing interval. The corresponding relations between delay states and path states construct the basic database used in the Viterbi algorithm cost function calculation. Normally, the calculation of cost functions uses two different types of distance: the Hamming distance and the Euclidean distance.
In this thesis, the Hamming distance is chosen to be the cost function because the use of this distance is suitable in LUT method to simplify hardware implementation. The Viterbi algorithm finds the path with the minimum path metric cost by sequentially moving through the trellis for each time interval. In a time interval T, from kT to (k+1) T, the receiver observes the sample received in that interval and computes the metric associated with all the branches.
This metric stores the cost value of all branches as shown in the trellis diagram. For example, if the memory bits (m bits) is used in the convolutional encoder, the N = 2m is the number of the states in the convolutional encoder. If each delay states have four branches associated with the new delay states, the total 4N branch costs will be calculated; but only N path metrics are retained (i.e., stored in memory) k=0, the forward states will be S0 and S1. The cost for S0S0 branch is 0.0, for S0S1 is 0.8. Because 0.0<0.8, the survival branch is S0S0, and the accumulate cost is 0.0 at k=1. Pattern (3) shows starting state is S0 at k=1 and the forward states will be S0 and S1. The cost for S0S0 branch is 0.3, for S0S1 is 0.0. At k=2, accumulate cost for S0 is 0.3 (i.e., 0.0 plus 0.3), for S1 is 0.0 (i.e., 0.0 plus 0.0). Because 0.0<0.3, the survival branch is S0S1. Same process works on patterns (4) and (5). Finally, at k=4, the minimum accumulate cost is 0.5 for branch S0S0. After tracing back, the survival path is S0S0S1S0S0.
Fig 5: Viterbi algorithm illustration based on twostate trellis

PREVIOUS RESEARCH WORK
The MIMO technique can increase channel capacity and data rate by using more transmitting antennas. In addition, the diversity coding technique has a better capability of
resisting the channel impairment. The most popular diversity coding technique is spacetime coding (STC) which involves space diversity modulation and error correction. The STC can moderately improve the spectral efficiency and provide coding gains for error correction. Alamouti proposed a twotransmit antenna scheme, which is then extended to spacetime block codes (STBC) and has been adopted in wireless standards to enhance the performance of wireless communication systems. However, it only improves the diversity gain rather than the channel capacity of the system. The spacetime trellis code (STTC) was also proposed to improve both the diversity gain and coding gain for wireless communication systems. The STTC encoder generates redundant parity check codes which are transmitted with the original information data streams, and thereby coding gain is obtained. Therefore, the STTC technique possesses a more robust capability than the STBC technique for combating severe MIMO channel impairment. However, the STTC decoder employs the Viterbi algorithm which requires much greater decoding complexity than the STBC decoder thus the STTC is seldom adopted as the diversity coding technique in current wireless communication systems. Many Viterbi decoder chips have been published for convolutional codes (CC) and trelliscoded modulation (TCM) techniques.
A 32state radix4 Viterbi decoder was first proposed and implemented in, and many succeeding works have proposed methods of improving the performance of the CC Viterbi coder. Some approaches enhance the throughput performance or reduce the power consumption using specific techniques, such as lookahead calculation or improved register exchange architectures. Several CC Viterbi decoders aim to support the larger number of states, or different decoding algorithms in certain communication standards.
In an adaptive Viterbi decoder can be reconfigured dynamically in response to the varying channel conditions so as to reduce power consumption. On the other hand, Viterbi decoders for the trelliscoded modulation (TCM) technique, which involves errorcorrecting coding and modulation, have been designed and implemented in recent years. The TCM Viterbi decoder requires a greater decoding complexity than the CC Viterbi decoder thus more efforts must be made to reduce hardware costs.
An ASIC chip has been proposed to support different communication specifications under a variety of channel conditions. Moreover, TCM Viterbi decoders for WiFi and DVB have been proposed. Compared to the traditional Viterbi decoder for convolutional codes and TCM codes, the STTC Viterbi decoder has to perform a great amount of complex value multiplications for branchmetric computation, which is proportional to the modulation order and the number of antennas. Efficient STTC decoders with several antennas for the MIMO/MISO systems were proposed in; however, the STTC decoder architecture is seldom addressed or implemented. To overcome this implementation bottleneck, an efficient treesearch algorithm and constant multiplier architecture is proposed in order to reduce the complexity of branch metric calculation in the STTC Viterbi decoder. The
complexity can be reduced to such a degree that the cost of implementation is reasonable compared with the available STBC decoder. Then, the STTC Viterbi decoder is implemented based on the proposed algorithm using 0.18m 1P6M CMOS technology. To the authors knowledge, this is the first STTC decoder chip to be proposed and implemented.

SIMULATION AND SYNTHESIS RESULT Then implemented both the architectures using
Verilog programming in mixed style modelling (combination of structural, dataflow and behavioral modeling). First, compare the Normal scheme and TR scheme circuits. The functionalities of all the designs can be checked using Isim. In Isim, run designs having long vectors as input. First check the syntax of our designs then will check for functionality of particular designs. Later have to compare the power consumption of both STBC and STTC using Xilinx tool. While verifying power consumption of these designs in Xilinx tool, first load our designs in to tool. Then generate programming file for particular design. Through this file can check syntax of design and check functionality of design by writing test bench. Get synthesis (net list of the system) report of the design. In place and route option can generate RTL schematic for design, to observe placing of components, power analysis etc.

DECODER OUTPUT WITHOUT NOISE
Figure 6 shows the simulation results of Viterbi decoder output without noise. In the simulation results it shows the encoder input and encoded output and decoder output. The input is represented by the X and the encoded output is Y [1:0] and the decoder output is code [1:0]. The number of antennas used in this paper is 2, so the encoder output sequence has the 2 bit wide. When the clock signal is applied and the reset is low the encoder generates the encoded sequence.
This sequence is applied to the BMG unit of the decoder. It generates the Euclidian distance of the given input sequence. This is the one of the input to the ACS block and another value is coming from the ROM, which is theoretically calculated value. These two values are compared by the ACS block and the lowest value i.e. lowest trellis path is obtained. These values are given to the trace back unit to obtain the original input sequence. The output signals of the BMG and ACS blocks for the given input sequence are shown in the simulation results.
Fig 6: Simulation Results of a Decoder without noise

DECODER OUTPUT WITH NOISE
Figure 7 shows the viterbi decoder output with noise. The corrupted sequence is given to the decoder input. Then the decoder out which is same as the original input sequence. If the input sequence is corrupted by the noise then the original values are saved in the ROM, these corrupted values are replaced by the original values, and then get the exact sequence. At the input of the decoder sequence apply the 10 instead of 00 in the original encoded output, but can get the 00 as the output.
Fig 7: Simulation Results of a Decoder with noise
Fig 8: Design Summary


CONCLUSION

The TCM codec chip implemented in this research utilizes the advantages of simplification and speed enhancement through the use of lookup tables. In addition, parallel processing and pipelining are also used to further increase decoding throughputs. The final TCM codec includes a TCM encoder with mapping for 2 dimensional and 4 dimensional modulations and a TCM decoder with a 16state, radix4, rate=2/3 Viterbi decoder. The implementation in 0.18 CMOS technology yields a simulation decoding iteration rate of 289MHz under mapping for 4dimensional modulation condition. This operating frequency converts to a decoding data rate of over 1Gbs. The real speed of the chip is less than 289MHz due to the long delay of the available bonding and I/O pads in the fabrication and packaging processes at TSMC.
The research verifies that the use of lookup table will eliminate the circuits which are used to calculate branch cost and path cost in the Viterbi decoder. Lookup tables not only simplify the circuit but also reduce the core area of the device. The designed decoder employs parallel structure in its architecture. This structure requires higher silicon area than serial structure; however it efficiently increases decoding speed. In addition, the rational use of shift registers in memorizing the delay state conjunction and the realization of pipelining achieve algorithm optimization and increase decoding iteration rate. As mentioned, one of the goals in the design of this codec is to integrate the codec into a single chip system (i.e., SOC). The core of the ASIC can be used as an IP core to incorporate into future development of any highspeed SOC applications using TCM.
REFERENCES

E. Cavus and B. Daneshrad, A very lowcomplexity spacetime block decoder (STBD) ASIC for wireless systems, IEEE Transactions on Circuits and SystemsPart I: Regular Papers, vol. 53, pp. 6069, Jan.2006.

S. Nandula, Y. S. Rao, and S. P. Embanath, High speed area efficient configurable Viterbi decoder for WiFi and WiMAX systems, in Proc. IEEE ICIAS07, Nov. 2007, pp. 13961399.

C. C. Lin, Y. H. Shih, H. C. Chang, and C. Y. Lee, A low power turbo/Viterbi decoder for 3GPP2 applications, IEEE Transactions on Very Large Scale Integration (VLSI) Systems, vol. 14, pp. 426430, Apr. 2006.

R. Manzoor, A. Rafique, and K. B. Bajwa, Hardware implementation of pragmatic trellis coded modulation applied to 8PSK and 16QAM forDVB standard, in Proc. IEEE CNSDSP08, Jul. 2008, pp. 363367.

H. Lee and M. P. Fitz, Systematic expansion of full diversity space time multiple tcm codes for two transmit antennas, IEEE Transactions on Wireless Communications, vol. 7, pp. 20272032, Jun. 2008

M. D. Shieh, T. P. Wang, and D. W. Yang, Lowpower registerex change survivor memory architectures for Viterbi decoders, IET Cir cuits, Devices, & Systems, vol. 3, pp. 8390, Apr. 2009.

P. J. Black and T. H. Meng, A 140Mb/s, 32state, radix4 Viterbi decoder, IEEE Journal of SolidState Circuits, vol. 27, pp. 1877 1885,Dec. 1992.

A. Naguib, V. Tarokh, N. Seshadri, and A. Calderbank, A spacetime coding modem for highdatarate wireless communications, IEEE Journal on Selected Areas in Communications, vol. 16, pp. 14511458, Oct. 1998.

V. Tarokh, A. Naguib, N. Seshadri, and A. R. Calderbank, Combined array processing and spacetime coding, IEEE Transactions on Infor mation Theory.