A Survey of Pulse Triggered Flip Flop

Sijithra P. C

M.E. VLSI Design

B.E. Electronics and Communication Engineering

Abstract:- A cell library includes a number of cells with different functionalities, where each cell may be available in several sizes and with different driving capabilities. Two central categories of cells in cell libraries are flip-flops and latches, which have a direct impact on the power consumption and speed of VLSI systems. The study of low-power, high-performance latches and flip-flops is therefore essential.

Keywords – Flip Flop (FF), Master slave FF, pulse-triggered FF.


    For high-performance VLSI chip design, the choice of back-end methodology has a significant impact on design time and design cost. Building every single gate from scratch is not necessarily the best method. Instead, a sufficient set of predesigned standard cells can be used as building blocks for most of the functional blocks. Semiconductor manufacturers offer standard cell libraries which are also supported by CAD tools in automated design flows, including the final physical auto-placement and routing. Despite their performance limitations, standard cell libraries can be useful even in the design of high-performance VLSI chips. Often, only a small portion of a chip consists of performance-critical units, and the rest of the design can be maximally automated to reduce the design time without degrading the targeted performance. In addition, the concept of a cell library can be extended to support the full-custom part of the chip: custom cell libraries can be made and shared by the designers of the performance-critical units.

    Flip-flops and latches are extremely important circuit elements in any synchronous VLSI chip. They are not only responsible for the correct timing, functionality, and performance of the chip, but their clocked devices also consume a significant portion of the total active power. Comparisons of the power breakdown across the elements of VLSI chips show that latches and flip-flops are a major source of power consumption in synchronous systems.

    1.1 Factors Desirable for Flip Flops

    The properties desirable in latches and flip-flops are high speed, low power consumption, robustness and noise stability, small area and a low transistor count, supply voltage scalability, and low internal activity when data activity is low. A fundamentally different approach to constructing a FF uses pulse signals.


    The master-slave register consists of a negative latch (master stage) cascaded with a positive latch (slave stage). On the low phase of the clock, the master stage is transparent, and the D input is passed to the master stage output, QM. During this period, the slave stage is in hold mode, keeping its previous value using feedback. On the rising edge of the clock, the master stage stops sampling the input, and the slave stage starts sampling. During the high phase of the clock, the slave stage samples the output of the master stage (QM), while the master stage remains in hold mode. Since QM is constant during the high phase of the clock, the output Q makes only one transition per cycle. The value of Q is the value of D right before the rising edge of the clock, which achieves the positive edge-triggered effect. A negative edge-triggered register can be constructed on the same principle by simply switching the order of the positive and negative latches (that is, placing the positive latch first).

    In pulse-triggered flip-flops, by contrast, a pulse generator circuit creates a short pulse around the rising (or falling) edge of the clock. This pulse acts as the clock input to a latch, and the latch samples its input only in the short window created by the pulse generator. Race conditions are thus avoided by keeping the opening time (the transparent period) of the latch very short. The combination of the pulse generation circuitry and the latch results in a positive edge-triggered register. A pulse-triggered flip-flop uses only one latch, whereas a conventional edge-triggered flip-flop uses two. Pulse-triggered flip-flops are the only type of flip-flop with time-borrowing capability and a negative setup time.
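The two register styles described above can be compared with a small behavioral model. This is only an illustrative sketch, not a circuit-level description: time is discretized into steps, the function names `master_slave` and `pulse_triggered` and the `width` parameter are hypothetical, and the pulse window is assumed to open exactly on the step where the clock rises.

```python
def master_slave(clk, d):
    """Positive edge-triggered register: negative latch (master) feeding a positive latch (slave)."""
    qm = q = 0
    out = []
    for c, x in zip(clk, d):
        if c == 0:
            qm = x          # master transparent on the low phase, slave holds
        else:
            q = qm          # slave transparent on the high phase, master holds
        out.append(q)
    return out


def pulse_triggered(clk, d, width=1):
    """Single latch clocked by a short pulse generated at each rising clock edge."""
    q = 0
    prev = 0
    remaining = 0
    out = []
    for c, x in zip(clk, d):
        if prev == 0 and c == 1:
            remaining = width   # pulse generator opens the transparency window
        if remaining > 0:
            q = x               # latch is transparent only inside the window
            remaining -= 1
        prev = c
        out.append(q)
    return out


clk = [0, 0, 1, 1, 0, 0, 1, 1, 0, 0, 1, 1]
d = [1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 1]
print(master_slave(clk, d))     # [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1]
print(pulse_triggered(clk, d))  # [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1]

# Data arriving at the clock edge itself is missed by the master-slave
# register but captured by the pulsed latch (negative setup time).
late_clk = [0, 0, 1, 1]
late_d = [0, 0, 1, 1]
print(master_slave(late_clk, late_d))     # [0, 0, 0, 0]
print(pulse_triggered(late_clk, late_d))  # [0, 0, 1, 1]
```

When the data is stable around each rising edge, both models produce the same output. The second example shows the difference: the pulsed latch still accepts data that changes inside its transparency window.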

    1. Dynamic and Static Flip Flops

      Static flip-flops can preserve their stored value even if the clock is stopped. In dynamic flip-flops, by contrast, the stored value is lost if it is not refreshed for a while. Dynamic flip-flops can generally achieve higher speed and lower power consumption, but this family suffers from serious potential failures. Storage loss caused by leakage currents, power supply noise and similar effects is possible in dynamic flip-flops and must be considered by the designer.

      A storage retention time in the millisecond range is usually not a problem when the chip is operating normally; however, it becomes a serious problem when the chip is in testing mode. In many modern chips, testing modes are unavoidable.

      Fig.1 Static Flip Flop

      Fig.2 Dynamic Flip Flop

      For example, if IDDQ tests (measurements of the quiescent power supply current of the chip) are required, the clocks (and all activity) in the system must be stopped, which is problematic for systems containing dynamic flip-flops. Dynamic charge decay can also cause problems far more serious than the loss of correct logic values. As charge leaks from a dynamic node, the voltage on the CMOS input driven by this node changes gradually. For a considerable time, the input voltage of the gate following the dynamic node can therefore lie in the forbidden region where the NMOS and PMOS transistors are both on. This draws considerable static current, which in some cases can damage the chip. Most dynamic flip-flops can be converted to static flip-flops by adding keepers to the dynamic nodes.

    2. Single Clock Phase and Multi Clock Phase Flip Flops

      Another classification of flip-flops is according to the number of clock phases they need. As discussed previously, master-slave flip-flops use two latches in series which work on different clock phases. Two clock phases are therefore naturally needed for master-slave flip-flops if the master and slave latches have similar structures. In some cases, however, changing the structure of the two latches can reduce the number of needed clocks to only one. True Single Phase Clock (TSPC) flip-flops can usually operate at higher speeds than two-clock-phase flip-flops, because the skew between the two clock phases adds to the delay of two-clock-phase flip-flops and degrades their performance.

    3. Single Edge Triggered and Double Edge Triggered Flip Flops

      Some systems require double-edge-triggered flip-flops. Unlike single-edge-triggered flip-flops, they capture data on both edges of the clock. A positive and a negative edge-triggered flip-flop both sample the D input, and the appropriate flip-flop is selected for the output by a clocked multiplexer. Double-edge-triggered flip-flops can be beneficial for low-power systems. In general they result in a more efficient system because every power-dissipating clock edge is used to advantage. Master-slave flip-flops have been shown to perform slightly better in double-edge-triggered mode than their single-edge-triggered counterparts. However, this strategy requires careful control of the clock's duty cycle to ensure that the combinational logic has adequate time to operate during both the clock-high and the clock-low phases.
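The double-edge-triggered arrangement described above (two flip-flops plus a clocked multiplexer) can be sketched behaviorally as follows. The function name `double_edge_ff` and the discrete-time representation are assumptions made for illustration only.

```python
def double_edge_ff(clk, d):
    """Positive- and negative-edge flip-flops sampling D, selected by a clocked multiplexer."""
    q_pos = q_neg = 0       # outputs of the two single-edge flip-flops
    prev = clk[0]
    out = []
    for c, x in zip(clk, d):
        if prev == 0 and c == 1:
            q_pos = x       # positive-edge flip-flop captures on the rising edge
        elif prev == 1 and c == 0:
            q_neg = x       # negative-edge flip-flop captures on the falling edge
        prev = c
        out.append(q_pos if c == 1 else q_neg)  # multiplexer driven by the clock
    return out


clk = [0, 0, 1, 1, 0, 0, 1, 1]
d = [0, 1, 1, 0, 0, 1, 1, 0]
print(double_edge_ff(clk, d))  # [0, 0, 1, 1, 0, 0, 1, 1]
```

In this model the output is updated twice per clock period, once on each edge, so the same data rate could in principle be reached at half the clock frequency.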

    4. Single Ended and Differential Flip Flops

      Differential flip-flops produce both true and complement outputs. Where true and complement input signals are available and synchronous, a differential structure can show better performance than a single-ended structure. The performance of differential flip-flops degrades if the input signals are not synchronous.


      1. Basic Implicit Pulse Triggered Flip Flops

        In implicit-type flip-flops the clock distribution circuit is built-in logic, and no external circuitry is needed for clock division and distribution. Implicit-type flip-flops consist of two parts: a clock distribution network (or clock tree) and a latch for data storage. Several low-power techniques can be applied to pulse flip-flops, namely conditional enhancement, conditional capture and conditional data mapping. Implicit-type designs, however, face a lengthened discharging path in the latch design, which leads to inferior timing characteristics. The situation deteriorates further when low-power techniques such as conditional capture, conditional precharge, conditional discharge, or conditional data mapping are applied.

        1. Implicit Pulsed Data close to Output

          Some conventional implicit-type P-FF designs, which are used as the reference designs in later performance comparisons, are reviewed first. The ip-DCO design contains an AND-logic-based pulse generator and a semi-dynamic latch. The pulse generator takes complementary and delay-skewed clock signals to generate a transparent window equal in width to the delay of the inverters. Two practical problems exist in this design. First, during the rising edge, nMOS transistors N2 and N3 are turned on. If the data remains high, node X is discharged on every rising edge of the clock, which leads to large switching power. The other problem is that node X controls two larger MOS transistors (P2 and N5). The large capacitive load on node X degrades both speed and power performance.

          Fig.3 Schematic of ip-DCO P-FF
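The AND-based pulse generation described above can be illustrated with a discrete-time sketch: the clock is ANDed with an inverted copy of itself that has passed through a delay, so the output is high only for the duration of that delay after each rising edge. The function name `pulse_generator` and the `delay` parameter (the inverter-chain delay expressed in time steps) are hypothetical.

```python
def pulse_generator(clk, delay):
    """AND of the clock with its inverted copy delayed by `delay` time steps."""
    pulse = []
    for t, c in enumerate(clk):
        delayed = clk[t - delay] if t >= delay else clk[0]
        clkb_d = 1 - delayed        # inverted, delayed clock from the inverter chain
        pulse.append(c & clkb_d)    # transparent window: both inputs high
    return pulse


clk = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0]
print(pulse_generator(clk, delay=2))  # [0, 0, 0, 1, 1, 0, 0, 0, 0, 0, 0, 0]
```

The window width equals the assumed delay (two steps here), and no pulse appears at the falling edge because the clock input of the AND is already low.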

        2. Modified Hybrid Latch Flip Flop

          In the MHLFF, node transitions occur only when the input has different logic values in two successive clock cycles. Its operating principle is as follows. When the clock (CLK) makes a transition from low to high, CLKBD remains high for a period equal to the delay of three inverters, creating a transparency window. During this period, C1 is high, turning on MN3. In this window, if D is low and Q is high (D was high in the previous clock cycle), MP2 turns on, which turns on MN2 and forces the output low. If both D and Q are low, MP1 and MN2 are on before the beginning of the transparency window, making the delay zero, similar to previous flip-flops.

          Fig.4 Schematic of MHLFF

          If D is high and Q is low, node X goes low, turning on MP3 and forcing the output high. Note that, because MP1 is a weak transistor, the fighting problem during the output change is alleviated. If D is high and Q is high, node X does not change and therefore, contrary to the other flip-flops discussed here, redundant transitions are avoided.

        3. Single Ended Conditional Capturing Energy Recovery Flip Flop

          Fig.5 Schematic of SCCER

          In this design, the keeper is replaced by a weak pull-up transistor P1 in conjunction with an inverter to reduce the load capacitance of node X. The discharge path contains nMOS transistors N2 and N1 connected in series. To eliminate superfluous switching at node X, an extra nMOS transistor N3 is employed. Other techniques, such as conditional capture and conditional precharge, are also available. Since N3 is controlled by Q_fdbk, no discharge occurs if the input data remains high. The worst-case timing of this design occurs when the input data is 1 and node X is discharged through four transistors in series.

        4. Conditional Pulse Enhancement Flip Flops

          The upper part of the latch design is similar to the one employed in the SCCER design. Transistor N2, in conjunction with an additional transistor N3, forms a two-input pass transistor logic (PTL)-based AND gate to control the discharge of transistor N1.

          Fig.6 Schematic of CPEFF

          Since the two inputs to the AND logic are mostly complementary (except during the transition edges of the clock), the output node Z is kept at zero most of the time. At the rising edges of the clock, both transistors N2 and N3 are turned on and together pass a weak logic high to node Z, which then turns on transistor N1 for a time span defined by the delay inverter I1. With this design measure, the number of stacked transistors along the discharging path is reduced, and the sizes of transistors N1-N5 can be reduced as well. When a 1 is being captured, which requires the longer discharging path, the voltage at node Z is raised to Vdd through an extra pulse-enhancement transistor. After the rising edge of the clock, the delay inverter I1 drives node Z back to zero through transistor N3 to shut down the discharging path.


        1. Explicit Pulsed Data Close to Output

          The Explicit-Pulse Data-Close-to-Output flip-flop (ep-DCO), whose schematic is shown in Fig.7, is considered one of the fastest flip-flops because of its semi-dynamic nature. It uses the delay of three inverters to generate the pulse at the double edge of the clock. The ep-DCO has two stages: the first stage is dynamic and the second stage is static. The clock pulse drives three transistors, M1, M3 and M5. The input data is connected to M2, and the circuit captures the data through M2.

          Fig.7 Schematic of ep-DCO

          When the flip-flop is transparent, the input data propagates to the output. After the transparent period, M3 and M5 turn off because the pulse voltage is low; at the same time, node X goes high because M1 is on. M4 is therefore off after the transparent period, and any change at the input cannot pass to the output. The ep-DCO has disadvantages, however. While the output is high, the repeated charging and discharging of node X in each clock cycle causes glitches to appear at the output. These glitches propagate to the driven gates, where they not only increase switching power consumption but also cause noise problems that may lead to system malfunction.

        2. Static Explicit Pulse Triggered Flip Flop

          Dynamic CMOS is recommended for high performance gates with large fan-ins. However, for small fan-in circuits in particular, dynamic logic does not provide area or performance advantage over static logic. The internal node X follows the input D during the transparent interval. At the rising edge, N3 and N4 turn on for the short transparency duration, causing the input D to propagate to the output. The keeper maintains the output state. During the transparency period, when input D is stable high, X goes low, causing Q to go high.

          Fig.8 Schematic of SEPFF

        3. Single Transistor Clocked Explicit Pulsed Flip Flop

    The dual-path Single Transistor Clocked Explicit Pulsed Flip Flop (STC-EPFF) consists of two cascaded static latches sharing one clocked transistor. The internal node X is asserted or deasserted according to the input data D during the transparent interval. The internal nodes of the STC-EPFF switch only when the input changes. At the rising edge of the clock, N3 turns on for a short time interval, the transparency period, and the circuit acts like two cascaded inverters, allowing the input data to propagate to the output.

    Fig.9 Schematic of STC-EPFF


    Hiroshi Kawaguchi and Takayasu Sakurai, in A Reduced Clock-Swing Flip-Flop (RCSFF) for 63% Clock Power Reduction, proposed a Reduced Clock-Swing Flip-Flop (RCSFF) which can reduce the clocking system power of a VLSI chip to 1/3 of that of a conventional flip-flop. The area and the delay of the RCSFF can also be reduced by about 20%. The paper describes a new small-swing clocking scheme which requires only one reduced-swing clock line. The RCSFF is composed of a current-latch sense amplifier and cross-coupled NAND gates which act as a slave latch. The salient feature of the RCSFF is that it accepts a reduced-voltage-swing clock. The voltage swing, Vclk, can be as low as 1 V.

    By lowering the clock swing, the power of the clock distribution network is decreased in proportion to either Vclk or Vclk^2. When the clock is to be stopped, it should be stopped at VSS, so that there is no leakage current. The transistor count of the RCSFF is 20, including an inverter for generating the complement of D, while that of the conventional F/F is 24. The area of the RCSFF is 16% smaller than that of the conventional F/F, as reported in the original paper, even when the well for the precharge PMOS is separated. SPICE analysis is carried out assuming typical parameters of a generic 0.5 µm double-metal CMOS process. The delay depends on Wclk. Since the delay improvement saturates at Wclk = 10 µm, this value of Wclk is used in the area and power estimation. The power consumption is reduced to about 1/2 to 1/3 of that of the conventional F/F, depending on the type of clock driver and on Vwell. In the best case studied, a 63% power reduction is observed. For the RCSFF, the D input and its complement can also be small-voltage-swing signals. Using this characteristic, the RCSFF can be used to reduce the RC delay of long buses. By placing the RCSFF at the end of a long bus and sense-amplifying the slowly changing D input, the RC delay can be reduced to 1/3 of that obtained with the conventional F/F.
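The dependence of clock-network power on the clock swing can be illustrated with the standard first-order dynamic power expression, in which a load C is charged through a swing V_swing from a supply V_supply once per cycle. The sketch below is not taken from the RCSFF paper: the function name `clock_power`, the 3.3 V supply, the 1 pF load and the 100 MHz frequency are all assumed values, chosen only to show why the scaling is linear when the driver runs from the full supply and quadratic when it runs from a separate reduced supply.

```python
def clock_power(c_load, freq, v_swing, v_supply):
    """First-order dynamic power: charge c_load * v_swing drawn from v_supply every cycle."""
    return c_load * freq * v_swing * v_supply


C_LOAD = 1e-12   # assumed clock-net capacitance (F)
FREQ = 100e6     # assumed clock frequency (Hz)
VDD = 3.3        # assumed full supply voltage (V)
VCLK = 1.0       # reduced clock swing (V)

full_swing = clock_power(C_LOAD, FREQ, VDD, VDD)
# Driver powered from VDD: power scales linearly with the swing.
linear = clock_power(C_LOAD, FREQ, VCLK, VDD)
# Driver powered from a separate VCLK supply: power scales with the swing squared.
quadratic = clock_power(C_LOAD, FREQ, VCLK, VCLK)

print(f"linear scaling ratio:    {linear / full_swing:.3f}")
print(f"quadratic scaling ratio: {quadratic / full_swing:.3f}")
```

These ratios cover only the clock net itself under the stated assumptions; they are not intended to reproduce the measured savings quoted above, which also include the flip-flop and driver contributions.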

    Massimo Alioto, Elio Consoli, and Gaetano Palumbo, in Flip-Flop Energy/Performance versus Clock Slope and Impact on the Clock Network Design, analyzed the influence of the clock slope on the speed of various classes of flip-flops (FFs) and on the overall energy dissipation of both FFs and clock domain buffers. The analysis shows that an optimum clock slope exists which minimizes the energy spent in a clock domain. The clock slope requirement can be relaxed with respect to traditional assumptions, leading to energy savings of up to 30-40% at a very small speed penalty. The effectiveness of the clock slope optimization is discussed in detail for the existing classes of FFs, together with its impact on additive skew and jitter contributions and the impact of technology scaling. Extensive post-layout simulations in a 65-nm CMOS technology check the validity of the underlying assumptions and approximations.

    Over a wide range of clock slopes, the speed of FFs is approximately unaffected by the clock slope, whereas the FF energy dissipation is more heavily influenced. Analysis of the energy contributions in a clock domain shows that a smoother clock slope increases the FF energy and decreases the energy dissipated by the local clock buffer. Because of this tradeoff, the energy dissipation of the clock network within a clock domain can be significantly reduced by properly choosing the clock slope. The optimal value has been analytically derived and validated by simulations, and is quite different from the usual assumption of a steep slope. Typical values of optimal clock slope range from F04-F05. This optimization saves up to 40% of the energy consumption compared with a typical approach, depending on the FF topology, the number of FFs in the clock domain and the transistor sizing. Beyond clarifying properties of FFs in nanometer technologies, the investigation yields several considerations that help the designer in designing the clock network. A detailed analysis of the FF delay variability and buffer-related skew/jitter sources has also been carried out. By employing a clock slope up to at the clock domain level, the impact of capacitive crosstalk and process variations is very slightly increased, whereas the impact of supply voltage noise on the jitter is even reduced. Delay variability is substantially unaffected by the clock slope. Finally, a study of the effect of technology scaling shows that optimization of the clock slope will become more important in the future and that the optimal clock slope will move towards smoother values.

    Soheil Ziabakhsh and Meysam Zoghi, in Design of a Low-Power High-Speed T-Flip Flop Using the Gate-Diffusion Input Technique, proposed an implementation of a new T flip-flop (TFF) using the GDI technique, targeting low power and high speed in order to achieve a low power-delay product (PDP) while keeping complexity low. Simulation results using ADS 2008 show that the proposed flip-flop has the lowest propagation delay, 169.7 psec, and a power consumption of 188.9 µW at a supply voltage of 1.8 V. The results also show a decrease of more than 45% in the PDP of the proposed circuit. The design is based on the master-slave connection of two GDI latches and some additional gates. Each latch consists of four basic GDI cells, resulting in a simple eight-transistor structure, and the gates associated with the latches consist of six transistors. The components of the latch circuit can be divided into two main categories: GDI gates and inverters. A GDI gate uses two transistors and is controlled by the Clk signal. The Clk signal is fed to the gates of the transistors and creates two alternating states: in one, Clk is low and the signals propagate through the PMOS transistors, creating a transient state; in the other, Clk is high and the prior values are maintained due to conduction of the outputs. In this second state, the GDI gates hold the state of the latch. A novel methodology for asynchronous circuits, based on two-transistor GDI cells, was presented. The proposed circuit has a simple structure, based on two master-slave stages and some gates that implement the T flip-flop behavior. It contains 24 transistors. An optimization procedure was developed for the GDI TFF, based on iterative transistor sizing and targeting a minimal power-delay product. A performance comparison with other TFF design techniques was shown, with respect to gate area, delay and power dissipation.
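The power-delay product (PDP) used as the figure of merit above is simply the product of propagation delay and power. As a worked example, the sketch below evaluates it for the delay and power figures quoted for the GDI T flip-flop, under the assumption that the power figure is in microwatts (the unit prefix appears to have been lost in the text). The helper name `power_delay_product` is hypothetical.

```python
def power_delay_product(delay_s, power_w):
    """Power-delay product in joules: energy figure of merit for a flip-flop."""
    return delay_s * power_w


delay = 169.7e-12   # propagation delay quoted in the text (s)
power = 188.9e-6    # power quoted in the text, ASSUMED to be in microwatts (W)

pdp = power_delay_product(delay, power)
print(f"PDP = {pdp * 1e15:.2f} fJ")  # PDP = 32.06 fJ
```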

    Peiyi Zhao, Tarek K. Darwish and Magdy A. Bayoumi, in High-Performance and Low-Power Conditional Discharge Flip-Flop, analyzed high-performance flip-flops and classified them into two categories: the conditional precharge and the conditional capture technologies. This classification is based on how the redundant internal switching activity is prevented or reduced. A new flip-flop is introduced, the conditional discharge flip-flop (CDFF), based on a new technology known as the conditional discharge technology. The CDFF not only reduces internal switching activity but also generates fewer glitches at the output, while maintaining the negative setup time and small D-to-Q delay characteristics. With a data-switching activity of 37.5%, the proposed flip-flop can save up to 39% of the energy at the same speed as the fastest pulsed flip-flops.

    The general idea of the conditional precharge technique is that the precharging path is controlled so that the internal node is not precharged when the data input stays HIGH. In the absence of the pMOS precharge control, when the input stays HIGH for a long time the discharge path is on during the evaluation periods, causing the internal node to discharge after each precharging phase. To eliminate this charging/discharging activity, a pMOS transistor is inserted in the precharging path, which prevents the precharging of the node when the data input is stable HIGH. The conditional capture technique is based on the clock-gating idea. The clock gating in the conditional capture technique results in redundant power consumed by the gate controlling the delivery of the delayed clock to the flip-flop. As a result, the conditional precharge technique outperforms the conditional capture technique in reducing the flip-flop EDP. However, the conditional precharge technique has been applied only to ip-FFs, and it is difficult to use a double-edge triggering mechanism for these flip-flops, as it would require a large number of transistors.

    The conditional discharge technique is proposed for both implicit and explicit pulse-triggered flip-flops, without the problems associated with the conditional capture technique, and is also employed to present a new flip-flop. In this technique, the extra switching activity is eliminated by controlling the discharge path when the input is stable HIGH, hence the name Conditional Discharge Technique. An nMOS transistor controlled by a feedback signal from the output is inserted in the discharge path of the stage with the high switching activity. When the input undergoes a LOW-to-HIGH transition, the output changes to HIGH and the feedback signal to LOW. This transition at the output switches off the discharge path of the first stage, preventing it from discharging or evaluating in succeeding cycles as long as the input is stable HIGH. While the ep-DCO is suitable for speed-critical paths, the CDFF is suitable for both speed-critical and speed-insensitive paths for energy efficiency. Moreover, in terms of PDP, the CDFF outperforms the conditional capture flip-flops (CCFF, imCCFF) as well as the conditional precharge flip-flops (CPFF, DE-CPFF). The Conditional Discharge Technique could also be applied to implicit pulsed flip-flops such as the ip-DCO and HLFF.
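The saving in internal switching activity described above can be illustrated by counting transitions of the precharged internal node over a data sequence. This is a deliberately simplified model, not the CDFF circuit: the function name `internal_transitions` is hypothetical, each discharge is assumed to be followed by exactly one precharge, and the output is assumed to capture the data in every cycle.

```python
def internal_transitions(data, conditional):
    """Count transitions of the precharged internal node over a data sequence."""
    q = 0
    count = 0
    for d in data:
        if conditional:
            discharges = d == 1 and q == 0   # discharge path gated by output feedback
        else:
            discharges = d == 1              # node discharges on every edge while D is high
        if discharges:
            count += 2                       # one discharge plus the following precharge
        q = d                                # output captures the data
    return count


data = [1, 1, 1, 1, 0, 0, 1, 1]
print(internal_transitions(data, conditional=False))  # 12
print(internal_transitions(data, conditional=True))   # 4
```

With the discharge path gated by the output, the node switches only on LOW-to-HIGH input transitions, so long runs of HIGH data no longer cause repeated charging and discharging.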

    Massimo Alioto, Elio Consoli, and Gaetano Palumbo, in Analysis and Comparison in the Energy-Delay-Area Domain of Nanometer CMOS Flip-Flops: Part I Methodology and Design Strategies, carried out an extensive comparison of existing flip-flop (FF) classes and topologies. In contrast to previous works, the analysis explicitly accounts for effects that arise in nanometer technologies and affect the energy-delay-area tradeoff (e.g., leakage and the impact of layout and interconnects). Compared with previous papers on FF comparison, the analysis involves a significantly wider range of FF classes and topologies. Part I reports the comparison strategy, which includes the simulation setup, the energy-delay estimation methodology, and an overview of an optimum design strategy, together with an introduction to the analyzed FF classes and topologies. The analysis and design methodologies presented for nanometer CMOS FFs are based on the notion of the Energy Efficient Curve and on the evaluation of those of its points that correspond to figures of merit in the energy-delay space with a clear physical meaning. Moreover, the FF input capacitance is treated as a further independent variable to be optimized, and the impact of local wire parasitics is included in the transistor-level design loop through a stick-diagram-based methodology, which also helps to achieve good area estimates.

    Massimo Alioto, Elio Consoli, and Gaetano Palumbo, in Analysis and Comparison in the Energy-Delay-Area Domain of Nanometer CMOS Flip-Flops: Part II Results and Figures of Merit, carried out a comparison of the most representative flip-flop (FF) classes and topologies in a 65-nm CMOS technology. The comparison, performed in the energy-delay-area domain, exploits the strategies and methodologies for FF analysis and design reported in Part I. In particular, the analysis accounts for the impact of leakage and layout parasitics on the optimization of the circuits. The tradeoffs between leakage, area, clock load, delay, and other properties are extensively discussed. The investigation yields several considerations on each FF class and identifies the best topologies for a targeted application. The comparison covers a large number of FFs (19 topologies belonging to four different classes) in nanometer (65-nm) CMOS technology, unlike the other most relevant analyses in the literature, which have so far adopted technologies down to 0.13 µm. It has been performed in the whole energy-delay-area design space. The impact of layout parasitics has been included in the transistor-level design phase. The contribution of leakage has been considered in both standby and active mode, weighted according to the logic depth in the active case. Wide ranges of loading and switching activity conditions have been explored, and other properties (e.g., the clock load) have been analyzed in detail. In contrast to previous papers, figures of merit that designers are familiar with have been considered, to gain insight into the tradeoffs across a wide range of applications.

    The results differ from those of previous papers because the layout parasitics have been explicitly included from the beginning and a much wider range of topologies has been considered. According to the presented results, the fastest topology is the STFF, the best low-energy FFs are the DETTGLM and TGFF, and the most energy-efficient FF throughout a wide region of the energy-delay design space is the TGPL. For the first time, the layout efficiency of FFs has been analyzed. In particular, the HLFF, MSAFF, and TGFF exhibit a very efficient area-delay tradeoff. Moreover, area has been shown to be almost proportional to leakage regardless of the FF topology and the transistor sizing. The differences between the leakage-delay tradeoff and the more general energy-delay tradeoff have been pointed out. Leakage has also been shown to have a significant impact on the optimum transistor sizing, especially for MS FFs. The clock load seen from the clock terminal of a FF, and the related dissipation of the clock distribution network, have also been analyzed. When the impact of local clock distribution buffers is included, whose dissipation is directly related to the FF clock load, the rankings of FFs in the E-D space do not change significantly, except for the MS class, which is somewhat penalized. As a general remark, simpler basic structures are rewarded in nanometer technologies because of the strong impact of layout parasitics. In particular, explicit pulsed topologies, and specifically the TGPL, have been recognized as the most efficient FF topologies in a very wide range of applications from many points of view.

    • Majid Rahimi Nezhad and Mohsen Saneei, in Low-Power Pulsed Triggered Flip-Flop with New Explicit Pulse in 65-nm CMOS Technology, proposed a low-power, pulse-triggered, double-edge-triggered flip-flop with a new explicit pulse generator. The novelty lies in the explicit pulse generator, while the latch structure is taken from the clock branch sharing flip-flop. By reducing short-circuit current and using dual-edge triggering, the design improves both power consumption and power-delay product (PDP). The circuits were optimized for PDP, and across different switching activities the proposed circuit has the minimum PDP with an FO4 load. The supply voltage and operating clock frequency are 1.1 V and 1 GHz respectively, and the flip-flop is implemented in 65 nm CMOS technology. Simulation results show that, for 50% data activity, power consumption is 7% to 32% lower than that of the other flip-flops, and the proposed flip-flop has the smallest power consumption at the other data activities as well. Both the power consumption and the minimum delay of the proposed flip-flop are better than those of the flip-flops it was compared with. Two notable advantages are (1) the size of the transistors in the critical path is decreased, allowing significantly smaller leakage power consumption; and (2) across the four process corners, the proposed flip-flop has significantly lower average power consumption than the flip-flops it was compared with. Moreover, the proposed flip-flop has good setup and hold times and is the most area efficient of the flip-flops compared.
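The power-delay product used as the optimization target in the summary above is average power multiplied by delay. The following is a minimal Python sketch of that arithmetic, not the authors' evaluation flow. The function names and the two candidate entries (`ff_a`, `ff_b`) are hypothetical placeholders chosen only for illustration and are not results from the paper.

```python
def pdp_fj(power_uw: float, delay_ps: float) -> float:
    """Power-delay product in femtojoules.

    1 uW * 1 ps = 1e-6 W * 1e-12 s = 1e-18 J = 1e-3 fJ.
    """
    return power_uw * delay_ps * 1e-3


def percent_lower(reference: float, candidate: float) -> float:
    """How much lower `candidate` is than `reference`, in percent."""
    return 100.0 * (reference - candidate) / reference


# Hypothetical (power in uW, delay in ps) pairs, for illustration only.
candidates = {
    "ff_a": (20.0, 150.0),
    "ff_b": (17.0, 140.0),
}

# Rank the candidates from lowest to highest PDP.
ranked = sorted(candidates, key=lambda name: pdp_fj(*candidates[name]))

# Relative power saving of ff_b with respect to ff_a.
saving = percent_lower(candidates["ff_a"][0], candidates["ff_b"][0])
```

With these placeholder inputs, `ranked` lists `ff_b` first because it has the smaller PDP. A statement such as "7% to 32% lower power" corresponds to `percent_lower` evaluated against each reference flip-flop in turn.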

    • Borivoje Nikolic, Vojin G. Oklobdzija, Vladimir Stojanovic, Wenyan Jia, James Kar-Shing Chiu and Michael Ming-Tak Leung, in Improved Sense-Amplifier-Based Flip-Flop: Design and Measurements, presented the design and experimental evaluation of a new sense-amplifier-based flip-flop (SAFF). Interest in high-speed flip-flop design re-emerged as operating frequencies passed 1 GHz: a good flip-flop design affects the power consumed by the clock as well as the time available in an ever-shrinking pipeline. The authors found that the main speed bottleneck of existing SAFFs is the cross-coupled set-reset (SR) latch in the output stage. The proposed SAFF retains all the advantages of earlier published SAFFs. It allows integration of logic into the flip-flop as well as reduced clock-swing operation, and a single-ended input version with multiplexed data scan and asynchronous reset is possible. The new flip-flop uses a new output-stage latch topology that significantly reduces delay and improves driving capability. Its strong driving capability makes it suitable for GHz designs characterized by short pipelines and high fan-out, and its differential input makes it compatible with logic that uses reduced signal swing. The authors also developed a method for accurate measurement of flip-flop parameters from the test chip, obtaining a measurement accuracy of 10 ps under difficult conditions characterized by high test frequency; the measurement techniques and set-up are discussed in the paper. The performance of the flip-flop was verified by measurements on a test chip implemented in 0.18 µm effective channel length CMOS. The measured speed places it among the fastest flip-flops used in state-of-the-art high-performance processors.

    • Fabian Klass, Chaim Amir, Ashutosh Das, Kathirgamar Aingaran, Cindy Truong, Richard Wang, Anup Mehta, Ray Heald, and Gin Yee, in A New Family of Semidynamic and Dynamic Flip-Flops with Embedded Logic for High-Performance Processors, presented a new family of edge-triggered flip-flops developed in an attempt to reduce pipeline overhead. The flip-flops belong to a class of semidynamic and dynamic circuits that can interface to both static and dynamic logic. The term semidynamic denotes circuits that internally have a precharge and an evaluation phase, similar to dynamic gates. The main features of the basic design are short latency, small clock load, small area, and a single-phase clock scheme. Furthermore, the flip-flop family can easily incorporate complex logic functions with a small delay penalty. This feature greatly reduces the pipeline overhead, since each flip-flop can be viewed as a special logic gate that also serves as a synchronization element, allowing one or more gate delays to be eliminated from a path leading to the flip-flop. Taken together, these features make the family well suited for high-performance microprocessor design, and the authors report that the flip-flops played an integral role in meeting the cycle-time goals of their microprocessor.

    • Albert Ma and Krste Asanović, in A Double-Pulsed Set-Conditional-Reset Flip-Flop, proposed a new flip-flop structure, the double-pulsed set-conditional-reset flip-flop (DPSCRFF), built around a double-pulsed static latch. The flip-flop has only a single stage of logic in the critical path and, as a result, is up to three times faster than the fastest previously known flip-flops while consuming approximately the same energy as the lowest-power flip-flops. The DPSCRFF is a single-ended static design whose single logic stage can include arbitrary logic functionality. It is compatible with static or dynamic logic and, in particular, can directly drive following dynamic logic. The two pulses are generated by a novel local dual-pulse generator, which avoids pulse distortion from additional pulse buffers and wiring and further reduces power requirements; the generator can be shared by a few neighboring flip-flops to reduce its area and energy overheads. The flip-flop has asymmetric timing properties that make it a good match for skewed logic styles. Unlike other pulsed latches, the DPSCRFF does not allow arbitrary time borrowing across the transparency window: time borrowing is only possible for late-arriving high inputs, e.g., from a preceding domino logic stage or a preceding skewed static logic stage. The rising and falling delays are reported separately since they differ significantly. The rising delays are negative because the output precharges before the input is required to arrive. The flip-flops were optimized for the worst-case positive delay, which in some cases increases the negative delays. The negative delay can be used to improve performance, or to lower power if skewed logic circuits are used. In the reported comparison, the fastest DPSCRFF at 54 ps is significantly faster than the next fastest flops (HLFF and SSASPL) at roughly 150 ps. The lowest-power DPSCRFF at 141 fJ is comparable to the lowest-power flop (PPCFF) at 130 fJ, yet its propagation delay is only 167 ps compared to 342 ps. When the data is held low while the clock continues to run, the energy dissipation of the DPSCRFF is reduced. However, if the clock is running and the data is held high, the DPSCRFF dissipates more power than for the full-activity waveforms because of its output glitches. When the clock is held stable, no internal nodes change state and only the single data input gate toggles, so the DPSCRFF has low energy when the local clock is gated, and in that state it has the lowest possible data input loading (a single transistor gate). The asymmetric propagation delay enables the use of highly skewed logic to reduce cycle time and energy. The glitching present at the output may cause additional energy dissipation in downstream logic, depending on signal statistics.
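The energy and delay figures quoted above can be combined into a single number for comparison. The short Python sketch below does this using the energy-delay product (energy multiplied by delay). Choosing that metric is an assumption made here for illustration; the summary itself only quotes energy and delay separately. The dictionary labels and variable names are likewise illustrative.

```python
# Figures quoted in the summary above: (energy in fJ, propagation delay in ps).
quoted = {
    "DPSCRFF (lowest-power sizing)": (141.0, 167.0),
    "PPCFF": (130.0, 342.0),
}


def energy_delay_product(energy_fj: float, delay_ps: float) -> float:
    """Energy-delay product in fJ*ps (an assumed comparison metric)."""
    return energy_fj * delay_ps


edp = {name: energy_delay_product(e, d) for name, (e, d) in quoted.items()}

# Extra energy the lowest-power DPSCRFF spends relative to the PPCFF, in percent.
energy_overhead_pct = 100.0 * (141.0 - 130.0) / 130.0

# Delay ratio between the next-fastest flops (~150 ps) and the fastest DPSCRFF (54 ps).
speedup_fastest = 150.0 / 54.0

for name, value in edp.items():
    print(f"{name}: {value:.0f} fJ*ps")
```

Under this metric the lowest-power DPSCRFF comes out at roughly half the energy-delay product of the PPCFF, for an energy overhead of under 10%. The ratio of about 2.8 between 150 ps and 54 ps is consistent with the "up to three times faster" claim.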


Various pulse-triggered flip-flops were reviewed. A universal flip-flop with the best performance, lowest power consumption, and highest robustness against noise would be an ideal component to include in cell libraries. The combination of pulse-generation circuitry and a latch results in a positive-edge-triggered register. Pulse-triggered FFs reduce the number of latch stages to a single stage. Because both the logic complexity and the number of stages are reduced, pulse-triggered FFs achieve smaller D-to-Q delays. Their main advantage is that they allow time borrowing across clock cycle boundaries and feature a zero or even negative setup time. Owing to these advantages, P-FFs have been considered a popular alternative to the traditional master-slave FF.
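The statement that a pulse generator plus a latch behaves like a positive-edge-triggered register, with a zero or negative setup time, can be illustrated with a small behavioural model. The Python sketch below is an idealized discrete-time simulation, not a circuit from any of the surveyed papers. The function names, the pulse width of two time steps, and the waveforms are all assumptions made for illustration. The pulse is modelled as the clock ANDed with an inverted, delayed copy of itself, and the latch is transparent only while the pulse is high.

```python
def pulse_generator(clk, width):
    """Explicit pulse: clk AND NOT(clk delayed by `width` time steps).

    Samples before t = width see a delayed clock of 0 (assumed start-up state).
    """
    pulse = []
    for t, c in enumerate(clk):
        delayed = clk[t - width] if t >= width else 0
        pulse.append(c & (1 - delayed))
    return pulse


def pulsed_latch(d, pulse, q0=0):
    """Level-sensitive latch that is transparent while `pulse` is 1."""
    q, out = q0, []
    for d_t, p_t in zip(d, pulse):
        if p_t:
            q = d_t  # transparent: output follows the input
        out.append(q)  # otherwise hold the stored value
    return out


# Clock rises at t = 2 and t = 10; data rises at t = 3, one step AFTER the edge.
clk = [0, 0, 1, 1, 1, 1, 0, 0, 0, 0, 1, 1, 1, 1, 0, 0]
d   = [0, 0, 0, 1, 1, 1, 1, 1, 0, 0, 0, 0, 0, 0, 0, 0]

pulse = pulse_generator(clk, width=2)
q = pulsed_latch(d, pulse)

print("pulse:", pulse)
print("q:    ", q)
```

In this model the data transition at t = 3 is still captured, because it falls inside the transparency window that opens at the rising edge; this is the behaviour described as a negative setup time. The data change at t = 8 is ignored until the next pulse at t = 10, so outside the short window the element behaves like an edge-triggered register.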


I believe the success of any work depends on the encouragement and guidance of others, so I am taking this opportunity to express my sincere gratitude to the people who have contributed to my success. I owe a sincere prayer to the LORD ALMIGHTY for his kind blessings and full support, without which this would not have been possible. I also wish to express my gratitude to all who helped me, directly or indirectly, to complete this paper.


  1. H. Kawaguchi and T. Sakurai, "A reduced clock-swing flip-flop (RCSFF) for 63% power reduction," IEEE J. Solid-State Circuits, vol. 33, no. 5, pp. 807–811, May 1998.

  2. K. Chen, "A 77% energy saving 22-transistor single phase clocking D-flip-flop with adaptive-coupling configuration in 40 nm CMOS," in Proc. IEEE Int. Solid-State Circuits Conf., Nov. 2011, pp. 338–339.

  3. E. Consoli, M. Alioto, G. Palumbo, and J. Rabaey, "Conditional push-pull pulsed latch with 726 fJ·ps energy delay product in 65 nm CMOS," in Proc. IEEE Int. Solid-State Circuits Conf., Feb. 2012, pp. 482–483.

  4. H. Partovi, R. Burd, U. Salim, F. Weber, L. DiGregorio, and D. Draper, "Flow-through latch and edge-triggered flip-flop hybrid elements," in Proc. IEEE Int. Solid-State Circuits Conf., Feb. 1996, pp. 138–139.

  5. F. Klass, C. Amir, A. Das, K. Aingaran, C. Truong, R. Wang, A. Mehta, R. Heald, and G. Yee, "A new family of semi-dynamic and dynamic flip-flops with embedded logic for high-performance processors," IEEE J. Solid-State Circuits, vol. 34, no. 5, pp. 712–716, May 1999.

  6. V. Stojanovic and V. Oklobdzija, "Comparative analysis of master-slave latches and flip-flops for high-performance and low-power systems," IEEE J. Solid-State Circuits, vol. 34, no. 4, pp. 536–548, Apr. 1999.

  7. J. Tschanz, S. Narendra, Z. Chen, S. Borkar, M. Sachdev, and V. De, "Comparative delay and energy of single edge-triggered and dual edge triggered pulsed flip-flops for high-performance microprocessors," in Proc. ISLPED, 2001, pp. 207–212.

  8. S. D. Naffziger, G. Colon-Bonet, T. Fischer, R. Riedlinger, T. J. Sullivan, and T. Grutkowski, "The implementation of the Itanium 2 microprocessor," IEEE J. Solid-State Circuits, vol. 37, no. 11, pp. 1448–1460, Nov. 2002.

  9. S. Sadrossadat, H. Mostafa, and M. Anis, "Statistical design framework of sub-micron flip-flop circuits considering die-to-die and within-die variations," IEEE Trans. Semicond. Manuf., vol. 24, no. 2, pp. 69–79, Feb. 2011.

  10. M. Alioto, E. Consoli, and G. Palumbo, "General strategies to design nanometer flip-flops in the energy-delay space," IEEE Trans. Circuits Syst., vol. 57, no. 7, pp. 1583–1596, Jul. 2010.

  11. M. Alioto, E. Consoli, and G. Palumbo, "Flip-flop energy/performance versus clock slope and impact on the clock network design," IEEE Trans. Circuits Syst., vol. 57, no. 6, pp. 1273–1286, Jun. 2010.

  12. M. Alioto, E. Consoli, and G. Palumbo, "Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops: Part I – methodology and design strategies," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 5, pp. 725–736, May 2011.

  13. M. Alioto, E. Consoli, and G. Palumbo, "Analysis and comparison in the energy-delay-area domain of nanometer CMOS flip-flops: Part II – results and figures of merit," IEEE Trans. Very Large Scale Integr. (VLSI) Syst., vol. 19, no. 5, pp. 737–750, May 2011.

  14. B. Kong, S. Kim, and Y. Jun, "Conditional-capture flip-flop for statistical power reduction," IEEE J. Solid-State Circuits, vol. 36, no. 8, pp. 1263–1271, Aug. 2001.
