 Open Access
 Total Downloads : 1370
 Authors : K. Bhagya Lakshmi, Ch. Satish
 Paper ID : IJERTV1IS7147
 Volume & Issue : Volume 01, Issue 07 (September 2012)
 Published (First Online): 25092012
 ISSN (Online) : 22780181
 Publisher Name : IJERT
 License: This work is licensed under a Creative Commons Attribution 4.0 International License
Low Hardware Overhead Accumulator Based 3 Weight Pattern Generation
K. Bhagya Lakshmi #, Ch. Satish*
ECE Department Pragati Engineering College, Surampalem, India.
ABSTRACT
Weighted pseudorandom builtin selftest (BIST) schemes have been utilized in order to drive down the number of vectors to achieve complete fault coverage in BIST applications. A method for accumulator based low hardware 3weight pattern generation is described. The main advantages of the method described over existing methods are: 1) only three easily generated weights0, 1, 0.5 are used.2)As accumulators are commonly used in current VLSI chips, this scheme can efficiently drive down the hardware of BIST pattern generation. An experimental result shows that the proposed BIST scheme can attain 100% fault coverage for all of the benchmark circuits and an average reduction of 73.5% in hardware requirement compared to two previous methodologies.
Index TermsBuiltin selftest (BIST), test per clock, VLSI testing, weighted test pattern generation.

INTRODUCTON
Pseudorandom builtin selftest (BIST) generators have been widely utilized to test integrated circuits and systems. The arsenal of pseudorandom generators includes, among others, linear feedback shift registers (LFSRs) [1], cellular automata [2], and accumulators driven by a constant value [3]. For circuits with hardtodetect faults, a large number of random patterns have to be generated before high fault coverage is achieved. Therefore, weighted pseudorandom techniques have been proposed where inputs are biased by changing the probability of a 0 or a 1 on a given input from 0.5 (for pure pseudorandom tests) to some other value [10], [15].
Weighted random pattern generation methods relying on a single weight assignment usually fail to achieve complete fault coverage using a reasonable number of test patterns since, although the weights are computed to be suitable for most faults, some faults
may require long test sequences to be detected with these weight assignments if they do not match their activation and propagation requirements. Multiple weight assignments have been suggested for the case that different faults require different biases of the input combinations applied to the circuit, to ensure that a relatively small number of patterns can detect all faults [4]. Approaches to derive weight assignments for given deterministic tests are attractive since they have the potential to allow complete coverage with a significantly smaller number of test patterns [10].
In order to minimize the hardware implementation cost, other schemes based on multiple weight assignments utilized weights 0, 1, and 0.5. This approach boils down to keeping some outputs of the generator steady (to either 0 or 1) and letting the remaining outputs change values (pseudo) randomly (weight 0.5). This approach, apart from reducing the hardware overhead has beneficial effect on the consumed power, since some of the circuit under test (CUT) inputs (those having weight 0 or 1) remain steady during the specific test session [30]. Pomeranz and Reddy [5] proposed a 3weight pattern generation scheme relying on weights 0, 1, and 0.5. The choice of weights 0, 1, and 0.5 was done in order to minimize the hardware implementation cost. Wang [8], [13] proposed a 3weight random pattern generator based on scan chains utilizing weights 0, 1, and 0.5, in a way similar to [5]. Recently, Zhang et al. [9] renovated the interest in the 3weight pattern generation schemes, proposing an efficient compaction scheme for the 3 weight patterns 0, 1, and 0.5. From the above we can conclude that 3weight pattern generation based\ on weights 0, 1, and 0.5 has practical interest since it combines low implementation cost with low test time. Current VLSI circuits, e.g., data path architectures, or digital signal processing chips commonly contain arithmetic modules [accumulators or arithmetic logic units (ALUs)]. This has fired the idea of Arithmetic BIST (ABIST) [6]. The basic idea of ABIST is to utilize accumulators for builtin testing (compression of the CUT responses, or generation of test patterns) and
has been shown to result in low hardware overhead and low impact on the circuit normal operating speed [22] [27]. In [22], Munich et al. presented an accumulator based test pattern generation scheme that compares favorably to previously proposed schemes.
In [7], it was proved that the test vectors generated by an accumulator whose inputs are driven by a constant pattern can have acceptable pseudorandom characteristics, if the input pattern is properly selected. However, modules containing hard todetect faults still require extra test hardware either by inserting test points into the mission logic or by storing additional deterministic test patterns [24], [25]. In order to overcome this problem, an accumulator based weighted pattern generation scheme was proposed in [11]. The scheme generates test patterns having one of three weights, namely 0, 1, and 0.5 therefore it can be utilized to drastically reduce the test application time in accumulator based test pattern generation. However, the scheme proposed in [11] possesses three major drawbacks: 1) it can be utilized only in the case that the adder of the accumulator is a ripple carry adder; 2) it requires redesigning the accumulator; this modification, apart from being costly, requires redesign of the core of the data path, a practice that is generally discouraged in current BIST schemes; and 3) it increases delay, since it affects the normal operating speed of the adder.
In this paper, a novel scheme for accumulator based 3weight generation is presented. The proposed scheme copes with the inherent drawbacks of the scheme proposed in [11]. More precisely: 1) it does not impose any requirements about the design of the adder (i.e., it can be implemented using any adder design); 2) it does not require any modification of the adder; and hence, 3) does not affect the operating speed of the adder. Furthermore, the proposed scheme compares favorably to the scheme proposed in [11] and [22] in terms of the required hardware overhead.
This paper is organized as follows. In Section II, the idea underlying the accumulatorbased 3weight generation is presented. In Section III, the design methodology to generate the 3weight patterns utilizing an accumulator is presented. In Section IV, the proposed scheme is compared to the previously proposed ones. Finally, Section V concludes this paper.

ACCUMULATORBASED 3WEIGHT PATTERN GENERATION
We shall illustrate the idea of an accumulator based 3weight pattern generation by means of an
example. Let us consider the test set for the c17 ISCAS benchmark [12], [31] given in Table I.
Fig.1: Accumulator Cell
Starting from this deterministic test set, in order to apply the 3weight pattern generation scheme, one of the schemes proposed in [5], [8], and [9] can be utilized. According to these schemes, a typical weight assignment procedure would involve separating the test set into two subsets, S1 and S2 as follows: S1= {T1, T4} and S2= {T2, T3}. The weight assignments for these subsets is W (S1)= {,,1,,1} and W (S2)= {,
,0,1,0} where a_ denotes a weight assignment of 0.5, a 1 indicates that the input is constantly driven by the logic 1 value, and 0 indicates that the input is
driven by the logic 0 value. In the first assignment, inputs A[2] and A[0] are constantly driven by 1, while inputs A[4], A[3], A[1] are pseudo randomly generated (i.e., have weights 0.5). Similarly, in the second weigh assignment (subset S2), inputs A[2] and A[0] are constantly driven by 0, input A[1] is driven by 1 and inputs A[4] and A[3] are pseudo randomly generated.
The above reasoning calls for a configuration of the accumulator, where the following conditions are met: 1) an accumulator output can be constantly driven by 1 or 0 and 2) an accumulator cell with its output constantly driven to 1 or 0 allows the carry input of the stage to transfer to its carry output unchanged. This latter condition is required in order to effectively generate pseudorandom patterns in the accumulator outputs whose weight assignment is .

DESIGN METHODOLOGY
The implementation of the weightedpattern generation scheme is based on the full adder truth table, presented in Table II. From Table II we can see that in lines #2, #3, #6, and #7 of the truth table, COUT = CIN . Therefore, in order to transfer the carry input to the carry output, it is enough to set A[i]= NOT(B[i]). The proposed scheme is based on this observation.
The implementation of the proposed weighted pattern generation scheme is based on the accumulator cell presented in Fig. 1, which consists of a Full Adder (FA) cell and a Dtype flipflop with asynchronous set and reset inputs whose output is also driven to one of the full adder inputs. In Fig. 1, we assume, without loss of generality, that the set and reset are active high signals. In the same figure the respective cell of the driving register B[i] is also shown. For this accumulator cell, one out of three configurations can be utilized, as shown in Fig. 2.
Fig.2: Configurations of the accumulator cell of Fig.1 In Fig. 2(a) we present the configuration that
drives the CUT inputs when A[i]=1 is required. Set[i]=1 and Reset[i]=0 and hence A[i]=1 and B[i]=0 . Then the output is equal to 1, and CIN is transferred to COUT
In Fig. 2(b), we present the configuration that
drives the CUT inputs when A[i]=0 is required. Set[i]=0 and Reset[i]=1 and hence A[i]=0 and B[i]=1. Then, the output is equal to 0 and CIN is transferred to COUT.
In Fig. 2(c), we present the configuration that
drives the CUT inputs when A[i] = _ is required. Set[i]=0 and Reset[i]=0 . The D input of the flipflop of register B is driven by either 1 or 0, depending on the value that will be added to the accumulator inputs in order to generate satisfactorily random patterns to the inputs of the CUT.
In Fig. 3, the general configuration of the proposed scheme is presented. The Logic module provides the Set[n1:0] and Reset[n1:0] signals that drive the S and R inputs of the Register A and Register B inputs. Note that the signals that drive the S inputs of the flipflops of Register A, also drive the R inputs of the flipflops of Register B and vice versa.
fourth column we present the decrease of the proposed scheme over [11]. In the fifth through the seventh columns, we present the delay of the adder in terms of number of gates that the carry signal has to traverse, from the CIN input of the adder (lower stage full adder cell) to the COUT output (higher stage full adder cell), as well as the respective decrease obtained by the proposed scheme.
Fig.3: Proposed Scheme

COMPARISONS
In this section, we shall perform comparisons in three directions. In Section IVA, we shall compare the proposed scheme with the accumulator based 3weight generation scheme that has been proposed in [11]. In Section IVB, we shall compare the proposed scheme with the 3weight scan schemes that have been proposed in [5] and [8]. In Section IVC, in order to demonstrate the applicability of the proposed scheme we shall compare the proposed scheme with the accumulator based test pattern generation scheme proposed in [22].

Comparisons with [11]
The number of test patterns applied by [11] and the proposed scheme is the same, since the test application algorithms that have been invented and applied by previous researchers, e.g., [5], [8], [9] can be equally well applied with both implementations. Therefore, the comparison will be performed with respect to: 1) the hardware overhead and 2) the impact on the timing characteristics of the adder of the accumulator.
Both schemes require a session counter in order to alter among the different weight sessions; the session counter consists of log2 k bits where k is the number of test sessions (i.e., weight assignments) of the weighted test set. The scheme proposed in [11] requires the redesign of the adder; more precisely, two NAND gates are inserted in each cell of the ripplecarry adder. In order to provide the inputs to the set and reset inputs of the flip flops, decoding logic is implemented, similar to that in [8]. For the proposed scheme, no modification is imposed on the adder of the accumulator. Therefore, there is no impact on the data path timing characteristics. In Table III we present comparison results for some of the ISCAS85 benchmarks. In the first column of Table III, we present the benchmark name; in the second and third columns we present the hardware overhead of the accumulatorbased scheme proposed in [11] and in this work, respectively; in the
TABLE III COMPARISONS WITH [11]
In Table III, the hardware overheads are calculated in gate equivalents, where an ninput NAND or NOR accounts for 0.5 _ gates and an inverter accounts for 0.5 gates, as proposed in [8]. For the calculation of the delay in the adder operation (columns under heading
#gates from CIN to COUT ) we have considered both ripple carry and prefix adder implementations. For the comparisons of the ripple carry adder implementations, the adder cell utilized in [11] is considered; in the cell presented in [11], initially the delay from the CIN to COUT of the adder cell is two NAND gates and one XOR gate; in the modified cell proposed in [11], the delay is increased to three NAND and one XOR gate; we have considered that the delay of a NAND gate is one gate equivalent, while the delay of an XOR gate is two gate equivalents. Since the implementation of the proposed scheme does not rely on a specific adder design, the utilization of a prefix adder can result in impressive results. For the calculation of the delay of prefix adders, the formula obtained by [29] is utilized, where the delay is of the order 4 * log2 n, where _ is the number of the adder stages. From Table III, we can see that the proposed scheme results in 57%90% decrease in hardware overhead, while at the same time achieving a decrease in operational delay overhead that ranges from 84% to 97% for the considered benchmarks.

Comparisons with ScanBased Schemes [5], [8]
Since the test application algorithms that have been invented and applied by [5], [8], and [9] can be equally well applied with the proposed scheme, test application time is similar to that reported there. Therefore, the comparison will be performed with respect to hardware overhead. In the 3weight pattern generation scheme proposed by Pomeranz and Reddy
in [5] the scan chain is driven by the output of a linear feedback shift register (LFSR). Logic is inserted between the scan chain and the CUT inputs to fix the outputs to the required weight (0, 0.5, or 1). In order to implement the scheme [5], a scanstructure is assumed. Furthermore, an LFSR required to feed the pseudorandom inputs to the scan inputs is implemented (the number of LFSR stages is log2 n, where n is the number of scan cells), as well as a scan counter, common to all scan schemes. A number of 3gate modules is required for every required weighted input (in [5, Table V], the hardware overhead is calculated for the ISCAS85 benchmarks).
TABLE IV COMPARISONS WITH THE SCAN SCHEMES PROPOSED IN [5] and [8]:
the LFSR implementation. In order to calculate the numbers in the second column, we utilized the data found in [5, Table V]. For the scheme in [8], the LFSR, the scan counter and the decoding logic are required; the hardware overhead for the decoding logic has been quote from [8, Table I].

Comparisons with [22]
In [22], Munich et al. proposed a methodology to reduce the total test time using an accumulatorbased scheme. The scheme operates in test sessions based on triplets of the form (S, I, L), where S is the starting value of the accumulator, I is the increment, and L is the number of cycles the increment is applied before going to the next session. For the comparisons we have utilized the data from [22, Table I], and have considered that the seeds are stored in a ROM; for the hardware calculation we have considered that a ROM bit is equivalent to Â¼ gates, as has been also considered in [20] and [32]. The comparison data for some of the ISCAS85 and ISCAS89 benchmarks are presented in Table V, where the same fault coverage, i.e., 100% is targeted.
TABLE V
COMPARISONS WITH THE SCHEME PROPOSED IN [22]:
Wang [8] proposed a lowoverhead 3weight random BIST scheme, again based on scan chains. He proposed two schemes, namely serial fixing BIST and parallel fixing BIST. Serial fixing scheme is shown to be more costly [8]; therefore we shall concentrate our comparisons to the parallel fixing BIST scheme. For an ninput CUT and, assuming the availability of the scan chain, the hardware overhead, apart from the LFSR to generate the pseudorandom inputs and the scan counter, includes a decoding logic. The hardware overhead of the decoding logic for some of the ISCAS benchmarks is calculated in [8, Table I]. All schemes require the application of the session counter, required to alter among the different weight sessions. Schemes proposed in [5] and [8] are test per scan schemes, and, of course, assume the existence of scan capability of the latches of the design.
In Table IV, we have presented arithmetic results for some of the ISCAS85 benchmarks. For the calculations in Table IV, we have assumed that schemes [5] and [8] are applied to a circuit with scan capability; therefore, the hardware overhead to transform the latches into scan latches is not taken into account. For the scheme proposed in [5], the total hardware overhead comprises the hardware for the weighting gates (second column), the scan counter and
In Table V, in the first three columns we present the benchmark characteristics (name, hardware overhead in gates and number of inputs). In the two columns to follow, we present the number of tests required for the scheme in [22], and the respective hardware overhead in gate equivalents. Next, we present the number of test patterns and hardware overhead for the proposed scheme. From Table V it is trivial to see that the proposed scheme presents an important decrease in the hardware overhead, while the number of tests is comparable, while in some cases it also outperforms the scheme in [22]. It is interesting to note that the hardware overhead (with respect to the hardware overhead of the benchmarks) is practical, in contrast to [22] which sometimes exceeds the benchmark hardware (c2670, s5378, s9234, s13207,
s15850, s38584). It is also interesting to note the average values presented in the last line of the Table. The average increase in the number of tests is 19%, while the average decrease in hardware overhead is 98%.


CONCLUSION
Weighted pseudorandom BIST schemes have been efficiently utilized in order to drive down the number of vectors to achieve complete fault coverage in Built in Self Test (BIST) applications. Sets of patterns comprising weights 0, 0.5 and 1 have been successfully utilized within the weighted pattern generation paradigm.
We have presented a user controllable, low hardware overhead accumulatorbased 3weight (0, 0.5, and 1) testperclock generation scheme, which can be utilized to efficiently generate weighted patterns without altering the structure of the adder. Comparisons with a previously proposed accumulatorbased 3 weight pattern generation technique [11] indicate that the hardware overhead of the proposed scheme is lower (75%), while at the same time no redesign of the accumulator is imposed, thus resulting in reduction of 20%95% in test application time. Comparisons with scan based schemes [5], [8] show that the proposed schemes results in lower hardware overhead. Finally, comparisons with the accumulatorbased scheme proposed in [22] reveal that the proposed scheme results in significant decrease (98%) in hardware overhead.
REFERENCES:

P. Bar dell, W. McAnney, and J. Savir, BuiltIn Test For VLSI: Pseudorandom Techniques. New York: Wiley, 1987.

P. Hortensius, R. McLeod, W. Pries, M. Miller, and
H. Card, Cellular automatabased pseudorandom generators for builtin self test, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., vol. 8, no. 8, pp. 842 859, Aug. 1989.

A. Stroele, A self test approach using accumulators as test pattern generators, in Proc. Int. Symp. Circuits Syst., 1995, pp. 21202123.

H. J. Wunderlich, Multiple distributions for biased random test patterns, in Proc. IEEE Int. Test Conf., 1988, pp. 236244.

I. Pomeranz and S. M. Reddy, 3 weight pseudo random test generation based on a deterministic test set for combinational and sequential circuits, IEEE Trans.
Comput.Aided Des. Integr. Circuits Syst., vol. 12, no. 7, pp. 10501058, Jul. 1993.

K. Radecka, J. Rajski, and J. Tyszer, Arithmetic builtin selftest for DSP cores, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., vol. 16, no. 11, pp. 13581369, Nov. 1997.

J. Rajski and J. Tyszer, Arithmetic BuiltIn Self Test For Embedded Systems. Upper Saddle River, NJ: Prentice Hall PTR, 1998.

S. Wang, Low hardware overhead scan based 3 weight weighted random BIST, in Proc. IEEE Int. Test Conf., 2001, pp. 868877.

S. Zhang, S. C. Seth, and B. B. Bhattacharya,
Efficient test compaction for pseudorandom testing, in Proc. 14th Asian Test Symp., 2005, pp. 337342.

J. Savir, Distributed generation of weighted
random patterns, IEEE Trans. Comput., vol. 48, no. 12, pp. 13641368, Dec. 1999.

I. Voyiatzis, D. Gizopoulos, and A. Paschalis,
Accumulatorbased weighted pattern generation, presented at the IEEE Int. Line Test Symp., Saint Raphael, French Riviera, France, Jul. 2005.

F. Brglez and H. Fujiwara, A neutral netlist of 10 combinational benchmarks circuits and a target
translator in FORTRAN, presented at the Int. Symp. Circuits Syst., Kyoto, Japan, 1985.

S. Wang, Low hardware overhead scan based 3 weight weighted random BIST architectures, U.S. Patent 6 886 124, Apr. 26, 2005.

C. Hamacher, Z. Vranesic, and S. Zaky, Computer Organization. New York: McGraw Hill, 2002.

F. Brglez, C. Gloster, and G. Kedem, Hardware based weighted random pattern generation for boundary scan, in Proc. IEEE Int. Test Conf. (ITC), 1989, pp. 264274.

H.J. Wunderlich, Self test using unequiprobable random patterns, in Proc. 17th Int. Symp. Fault Tolerant Comput. (FTCS), 1987, pp. 258263.

O. NovÃ¡k, Z. PlÃ¸va, J. Nosek, A. Hlawiczka, T. Garbolino, and K. Gucwa, Testperclock logic BIST with semideterministic test patterns and zeroaliasing compactor, J. Electron. Testing: Theor. Appl., vol. 20, no. 1, pp. 109122, Feb. 2004.

Y. Son, J. Chong, and G. Russell, EBIST: Enhanced testperclock BIST architecture, IEE Proc.Comput. Digit. Techn., vol. 149, pp. 915, Jan 2002.

K. Yamaguchi, M. Inoue, and H. Fujiwara,
Hierarchical BIST: Testper clock BIST with low overhead, Electron. Commun. Japan (Part II: Electron.), vol. 90, no. 6, pp. 4758, Jun. 2007.

E. Kalligeros, X. Kavousianos, D. Bakalis, and D. Nikolos, An efficient seeds selection method FOR
lfsrbased testperclock BIST, in Proc. Int. Symp. Quality Electron. Des., 2002, p. 261.

A. D. Singh, M. Seuring, M. Gossel, and E. S. Sogomonyan, Multimde scan: Test per clock BIST for IP cores, ACM Trans. Design Autom. Electr. Syst., vol. 8, no. 4, pp. 491505, Oct. 2003.

S. Manich, L. GarciaDeiros, and J. Figueras,
Minimizing test time in arithmetic testpattern generators with constrained memory resources,
IEEE Trans. Comput.Aided Des. Integr. Circuits Syst., vol. 26, no. 11, pp. 20462058, Nov. 2007.

S. Manich, L. Garcia, and J. Figueras, Arithmetic test pattern generation: A bit level formulation of the optimization problem, presented at the Des. Circuits Integr. Syst. (DCIS), Lisbon, Portugal, 2005.

S. Manich, L. Garcia, L. Balado, J. Rius, R. RodrÃ¸guez, and J. Figueras, Improving the efficiency of arithmetic bist by combining targeted and general purpose patterns, presented at the Des. Circuits Integr. Syst. (DCIS), Bordeaux, France, 2004.

S. Manich, L. Garcia, L. Balado, E. Lupon, J. Rius, R. Rodriguez, and J. Figueras, On the selection of efficient arithmetic additive test pattern generators, in Proc. Eur. Test Workshop, 2003, pp. 914.

I. Voyiatzis, An ALU based BIST scheme for wordorganized rams, IEEE Trans. Comput., vol. 57, no. 8, pp. 10121022, Aug. 2008.

I. Voyiatzis, An accumulatorbased compaction scheme with reduced aliasing for online BIST of rams, IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 16, no. 9, pp. 12481251, Sep. 2008.

A. Rashid Mohamed, Z. Peng, and P. Eles, A wiringaware approach to minimizing builtin selftest overhead, J. Comput. Sci. Technol., vol. 20, no. 2, pp. 216223.

R. Zimmerman, Binary Adder Architectures for CellBased VLSI and their Synthesis, Ph.D. dissertation, Swiss Federal Inst. Technol. (ETH), Zurich, Switzerland, 1998 [Online]. Available: http://www.iis.ee.ethz.ch/~zimmi/publications/comp_ar ith_notes/s/gz

M. B. Santos, I. C. Teixeira, J. P. Teixeira, S. Munich, R. Rodriguez, and J. Figueras, RTL level preparation of highquality/lowenergy/ lowpower BIST, in Proc. Int. Test Conf., 2002, pp. 814823.

C. Albrecht, IWLS 2005 Benchmarks, Lake Arrowhead,CA,2005[Online].Available:http://www.iwl s.org/ iwls2005/benchmarks.html

L. R. Huang, J. Y. Jou, and S. Y. Kuo, Gauss eliminationbased generation of multiple seed polynomial pairs for LFSR, IEEE Trans. Comput. Aided Des. Integr. Circuits Syst., vol. 16, no. 9, pp. 10151024, Sep. 1997., Sep. 1997.
International Journal of Engineering Research & Technology (IJERT)
ISSN: 22780181
Vol. 1 Issue 7, September – 2012